Ideology and Power Identification in Parliamentary Debates 2025

Synopsis

  • Sub-Task 1: Given a parliamentary speech in one of several languages, identify the ideology of the speaker's party.
  • Sub-Task 2: Given a parliamentary speech in one of several languages, identify whether the speaker's party is currently governing or in opposition.
  • Sub-Task 3 (new!): Given a parliamentary speech, identify the position of the speaker's party on the populist-pluralist scale.
  • Communication: [mailing lists: task, Touché, organizers]
  • Training data: [download]
  • Baseline and trial data: [baseline]

Important Dates

Subscribe to the Touché mailing list to receive notifications.

  • 2024-11: CLEF Registration opened [register]
  • 2025-05-10: Approaches submission deadline
  • 2025-05-30: Participant paper submission
  • 2025-06-10: Peer review notification
  • 2025-07-07: Camera-ready participant papers submission
  • 2025-09: CLEF Conference in Madrid and Touché Workshop

All deadlines are 23:59 CEST (UTC+2).

Task

Debates in national parliaments affect not only fundamental aspects of citizens' lives but often a broader area, or even the whole world. As a form of political debate, however, parliamentary speeches are often indirect and pose a number of challenges for computational analysis. In this task, we focus on identifying three variables associated with speakers in a parliamentary debate: their political ideology, whether they belong to a governing party or a party in opposition, and the position of their party on the populist-pluralist scale.

Data

The data for this task comes from ParlaMint, a collection of multilingual comparable corpora of parliamentary debates. The data is sampled from ParlaMint in a way that reduces potential confounding variables (e.g., speaker identity). Please join the task mailing list and the lab mailing list to stay up-to-date and report problems. The data is provided as tab-separated text files. The following shows a toy example:
id    speaker  sex  text          text_en                  orientation  power  populism
gb01  spk1     F    First text.   First text in English.   0            1      2
gb02  spk2     M    Second text.  Second text in English.  1            0      1
gb03  spk3     M    Third text.   Third text in English.   0            1      4
gb04  spk4     F    Fourth text.  Fourth text in English.  1            0      3
gb06  spk5     M    Fifth text.   Fifth text in English.   0            0      1
  • id is a unique (arbitrary) ID for each text.
  • speaker is a unique (arbitrary) ID for each speaker. There may be multiple speeches from the same speaker.
  • sex is the (binary/biological) sex of the speaker. The values in this field can be Female, Male, and Unspecified/Unknown.
  • text is the transcribed text of the parliamentary speech. Real examples may include line breaks and other special sequences, escaped or quoted.
  • text_en is the automatic translation of the text to English. This field may be empty - obviously for speeches in English, but the translation may also be missing for a small number of non-English speeches.
  • orientation is the binary label (0: left and 1: right). Orientation labels are based on Wikipedia.
  • power is the binary label for the power role (0: opposition, 1: governing party or coalition). This label is based on information provided by the ParlaMint contributors.
  • populism is a populism index based on multiple expert surveys (to increase the coverage). We focus on a particular dimension of populism in this task: the position of the speaker's party on the populist-pluralist spectrum. This is measured on a 4-point ordinal scale (1: Strongly Pluralist, 2: Moderately Pluralist, 3: Moderately Populist, 4: Strongly Populist).
Not all values are present in all parliaments. Some parties/speakers are not covered by the data, and some values are missing because the source identifiers/names could not be matched to ParlaMint identifiers. Missing values are indicated as 'NA'. We also included populism indices for some parliaments even if they contain only a single value. Although this makes classification within such a parliament meaningless, it can be helpful for multitask or multi-parliament/multilingual approaches. A small trial sample is provided here.
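
As a rough illustration of the format described above, the following Python sketch reads tab-separated rows with the standard csv module and maps the 'NA' marker to None. The function name read_speeches and the inline toy rows are hypothetical, and the sketch assumes that the quoting used in the real files is compatible with the csv module's default settings, which has not been checked against the actual data.

```python
import csv
import io

# Label columns described in the field list; absent in test files.
LABEL_FIELDS = ("orientation", "power", "populism")


def read_speeches(handle):
    """Parse tab-separated rows into dicts, mapping 'NA' labels to None."""
    reader = csv.DictReader(handle, delimiter="\t")
    rows = []
    for row in reader:
        for field in LABEL_FIELDS:
            value = row.get(field)
            value = value.strip() if value is not None else ""
            # Missing labels are marked 'NA'; a missing column yields "".
            row[field] = None if value in ("", "NA") else int(value)
        rows.append(row)
    return rows


# Hypothetical toy rows in the documented column order (not real data).
toy = (
    "id\tspeaker\tsex\ttext\ttext_en\torientation\tpower\tpopulism\n"
    "gb01\tspk1\tF\tFirst text.\tFirst text in English.\t0\t1\t2\n"
    "gb02\tspk2\tM\tSecond text.\tSecond text in English.\t1\t0\tNA\n"
)
speeches = read_speeches(io.StringIO(toy))
```

For a file on disk, the same function could be called on a handle opened with encoding="utf-8" and newline="". Very long speeches may exceed the csv module's default field size limit, which can be raised with csv.field_size_limit.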

Participants are not required to use all input fields, but may want to use them for improving the predictions (e.g., in a joint/multi-task learning model, or to explain away the effect of speaker style for better generalization). Similarly, the field text_en is provided for convenience. It may help in building quick multilingual classifiers, or in understanding and analyzing data in languages participants do not speak. The test files will include the same fields, except for the speaker IDs and the orientation, power and populism labels.
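
Because text_en may be empty, a model that relies on the English translation needs some fallback. One possible choice, sketched below, is to fall back to the original text; the helper name english_or_original is hypothetical, and whether mixing untranslated non-English speeches into an English-only model is acceptable depends on the approach.

```python
def english_or_original(row):
    """Return the English translation if present, otherwise the original text."""
    translated = (row.get("text_en") or "").strip()
    # text_en is empty for English speeches and for a few untranslated ones.
    return translated if translated else row["text"]


# Hypothetical rows illustrating both cases.
translated_row = {"text": "Erster Text.", "text_en": "First text."}
english_row = {"text": "First text.", "text_en": ""}
inputs = [english_or_original(translated_row), english_or_original(english_row)]
```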

We provide training data for the following national or regional parliaments:

  • Austria (at)
  • Bosnia and Herzegovina (ba)
  • Belgium (be)
  • Bulgaria (bg)
  • Czechia (cz)
  • Denmark (dk)
  • Estonia (ee)
  • Spain (es)
  • Catalonia (es-ct)
  • Galicia (es-ga)
  • Basque Country (es-pv)
  • Finland (fi)
  • France (fr)
  • Great Britain (gb)
  • Greece (gr)
  • Croatia (hr)
  • Hungary (hu)
  • Iceland (is)
  • Italy (it)
  • Latvia (lv)
  • The Netherlands (nl)
  • Norway (no)
  • Poland (pl)
  • Portugal (pt)
  • Serbia (rs)
  • Sweden (se)
  • Slovenia (si)
  • Turkey (tr)
  • Ukraine (ua)

Evaluation

For all sub-tasks, we will use macro-averaged F1-score as the main evaluation metric. The submission system will evaluate runs automatically using F1-score, Precision and Recall.
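
For local validation, macro-averaged F1 can be approximated as in the following sketch: F1 is computed separately for each class and the per-class scores are averaged with equal weight. This is a minimal stand-alone version, not the official scorer; the function name macro_f1 is hypothetical, and the choice to average over every label that appears in either the gold or the predicted list is an assumption that may differ from the submission system.

```python
def macro_f1(gold, predicted):
    """Unweighted mean of per-class F1 over labels seen in gold or predicted."""
    labels = sorted(set(gold) | set(predicted))
    if not labels:
        return 0.0
    scores = []
    for label in labels:
        pairs = list(zip(gold, predicted))
        tp = sum(1 for g, p in pairs if g == label and p == label)
        fp = sum(1 for g, p in pairs if g != label and p == label)
        fn = sum(1 for g, p in pairs if g == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        total = precision + recall
        # F1 is defined as 0 when both precision and recall are 0.
        scores.append(2 * precision * recall / total if total else 0.0)
    return sum(scores) / len(scores)


# Hypothetical binary labels, e.g. for the power sub-task.
gold_labels = [0, 1, 0, 1, 0]
predicted_labels = [0, 1, 1, 1, 0]
score = macro_f1(gold_labels, predicted_labels)
```

In this toy case both classes have an F1 of about 0.8, so the macro average is also about 0.8. Rows whose gold label is 'NA' would need to be filtered out before scoring.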

Submission

Participants are welcome to participate in any of the task-parliament combinations. We do not provide a special track for multilingual models, but participants are encouraged to make use of cross-lingual approaches for improving their predictions. Participants are allowed to use any external datasets, except the source data from ParlaMint. Further information on the submission system will be provided soon.

Task Committee