Ideology and Power Identification in Parliamentary Debates 2024

Synopsis

This task consists of two subtasks on identifying two important aspects of a speaker in parliamentary debates:
  • Sub-Task 1: Given a parliamentary speech in one of several languages, identify the ideology of the speaker's party.
  • Sub-Task 2: Given a parliamentary speech in one of several languages, identify whether the speaker's party is currently governing or in opposition.
  • Communication: [mailing lists: task, organizers]
  • Training data: [download]

Important Dates

Subscribe to the mailing list to receive notifications.

  • Dec. 18, 2023: CLEF Registration opens. [register]
  • May 6, 2024: Approaches submission deadline.
  • May 31, 2024: Participant paper submission.
  • June 21, 2024: Peer review notification.
  • July 8, 2024: Camera-ready participant papers submission.
  • Sep. 9-12, 2024: CLEF Conference in Grenoble and Touché Workshop.

All deadlines are 23:59 anywhere on earth (UTC-12).

Task

Debates in national parliaments affect not only fundamental aspects of citizens' lives, but often a broader area, or even the whole world. As a form of political debate, however, parliamentary speeches are often indirect and present a number of challenges for computational analysis. In this task, we focus on identifying two variables associated with speakers in a parliamentary debate: their political ideology and whether they belong to a governing party or a party in opposition. Both subtasks are formulated as binary classification tasks.

Data

The data for this task comes from ParlaMint, a collection of multilingual comparable corpora of parliamentary debates. The data is sampled from ParlaMint in a way that reduces potential confounding variables (e.g., speaker identity). Please join the task mailing list to stay up-to-date and report problems. The data is provided as tab-separated text files. The following shows a toy example:
id    speaker  sex  text          text_en                  label
gb01  spk1     F    First text.   First text in English.   0
gb02  spk2     M    Second text.  Second text in English.  1
gb03  spk3     M    Third text.   Third text in English.   0
gb04  spk4     F    Fourth text.  Fourth text in English.  1
gb06  spk5     M    Fifth text.   Fifth text in English.   0
  • id is a unique (arbitrary) ID for each text.
  • speaker is a unique (arbitrary) ID for each speaker. There may be multiple speeches from the same speaker.
  • sex is the (binary/biological) sex of the speaker. The values in this field can be Female, Male, and Unspecified/Unknown.
  • text is the transcribed text of the parliamentary speech. Real examples may include line breaks and other special sequences, which are escaped or quoted.
  • text_en is the automatic translation of the text to English. This field may be empty: obviously so for speeches in English, but the translation may also be missing for a small number of non-English speeches.
  • label is the binary/numeric label. For political orientation, 0 is left and 1 is right. For power identification, 0 indicates opposition and 1 indicates coalition (or governing party).
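The field layout above can be read with Python's standard csv module. The following is a minimal sketch, not the organizers' reader: the function name read_speeches is hypothetical, and it assumes that quoted fields follow the usual CSV quoting conventions (double quotes around fields that contain line breaks). If the released files escape special sequences differently, the reader options would need to be adjusted.

```python
import csv
import io


def read_speeches(handle):
    """Parse tab-separated speeches into a list of dicts keyed by column name."""
    # Speeches can be long, so raise the default per-field size limit.
    csv.field_size_limit(10_000_000)
    reader = csv.DictReader(handle, delimiter="\t")
    rows = []
    for row in reader:
        # Test files have no label column, so the label is treated as optional.
        label = row.get("label")
        row["label"] = int(label) if label not in (None, "") else None
        rows.append(row)
    return rows


# Toy data in the documented column order; the second text contains a
# quoted line break (assumed quoting style).
SAMPLE = (
    "id\tspeaker\tsex\ttext\ttext_en\tlabel\n"
    "gb01\tspk1\tF\tFirst text.\tFirst text in English.\t0\n"
    "gb02\tspk2\tM\t\"Second\ntext.\"\tSecond text in English.\t1\n"
)

speeches = read_speeches(io.StringIO(SAMPLE, newline=""))
for speech in speeches:
    print(speech["id"], speech["speaker"], speech["label"])
```

For a file on disk, the same function can be called on a handle opened with encoding="utf-8" and newline="", so that the csv module handles embedded line breaks itself.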
Participants are not required to use the first four fields, but may want to use them for improving the predictions (e.g., in a joint/multi-task learning model, or to explain away the effect of speaker style for better generalization). Similarly, the field text_en is provided for convenience. It may help in building quick multilingual classifiers, or in understanding and analyzing data in languages participants do not speak. The test files will include exactly the same fields, except label. A small trial sample is provided for both political orientation and power identification. We provide training data for the following national or regional parliaments:
  • Austria (at)
  • Bosnia and Herzegovina (ba)
  • Belgium (be)
  • Bulgaria (bg)
  • Czechia (cz)
  • Denmark (dk)
  • Estonia (ee)
  • Spain (es)
  • Catalonia (es-ct)
  • Galicia (es-ga)
  • Basque Country (es-pv) [only power]
  • Finland (fi)
  • France (fr)
  • Great Britain (gb)
  • Greece (gr)
  • Croatia (hr)
  • Hungary (hu)
  • Iceland (is) [only political orientation]
  • Italy (it)
  • Latvia (lv)
  • The Netherlands (nl)
  • Norway (no) [only political orientation]
  • Poland (pl)
  • Portugal (pt)
  • Serbia (rs)
  • Sweden (se) [only political orientation]
  • Slovenia (si)
  • Turkey (tr)
  • Ukraine (ua)
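Because the same speaker may contribute several speeches, a held-out set drawn at random from the training data can share speakers with the training portion and reward models that memorize speaker style. One possible way to guard against this during development is a speaker-disjoint split. The sketch below is an illustration only; the function name split_by_speaker and its parameters are hypothetical, and rows are assumed to be dicts with a speaker key as in the field description above.

```python
import random


def split_by_speaker(rows, valid_fraction=0.2, seed=0):
    """Split rows so that no speaker appears in both train and validation."""
    # Sort first so the shuffle is reproducible for a given seed.
    speakers = sorted({row["speaker"] for row in rows})
    random.Random(seed).shuffle(speakers)
    # The fraction applies to speakers, not speeches, so the resulting
    # number of validation speeches can deviate from valid_fraction.
    n_valid = max(1, round(len(speakers) * valid_fraction))
    valid_speakers = set(speakers[:n_valid])
    train = [row for row in rows if row["speaker"] not in valid_speakers]
    valid = [row for row in rows if row["speaker"] in valid_speakers]
    return train, valid


# Toy rows: five speakers with two speeches each.
rows = [
    {"id": f"gb{i:02d}", "speaker": f"spk{i % 5}", "label": i % 2}
    for i in range(10)
]
train, valid = split_by_speaker(rows, valid_fraction=0.2, seed=0)
print(len(train), len(valid))
```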

Evaluation

Both subtasks will use macro-averaged F1-score as the main evaluation metric. The submission system will evaluate runs automatically using F1-score, precision, and recall.
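For local validation, macro-averaged F1 for the binary labels can be computed as the unweighted mean of the per-class F1 scores. The following is a sketch under stated assumptions, not the official scorer: the function names are hypothetical, and a class with no predicted or no gold instances is scored as 0.0, a convention the submission system may or may not share.

```python
def precision_recall_f1(gold, pred, positive):
    """Precision, recall, and F1 for one class treated as the positive class."""
    pairs = list(zip(gold, pred))
    tp = sum(1 for g, p in pairs if g == positive and p == positive)
    fp = sum(1 for g, p in pairs if g != positive and p == positive)
    fn = sum(1 for g, p in pairs if g == positive and p != positive)
    # Assumed convention: undefined ratios are scored as 0.0.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


def macro_f1(gold, pred, labels=(0, 1)):
    """Unweighted mean of per-class F1 scores."""
    return sum(precision_recall_f1(gold, pred, c)[2] for c in labels) / len(labels)


gold = [0, 1, 0, 1, 0]
pred = [0, 1, 1, 1, 0]
print(round(macro_f1(gold, pred), 3))  # 0.8
```

In this toy example both classes reach an F1 of 0.8 (class 1 has precision 2/3 and recall 1, class 0 has precision 1 and recall 2/3), so the macro average is also 0.8.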

Submission

Participants are welcome to take part in any of the task-parliament combinations. We do not provide a special track for multilingual models, but participants are encouraged to make use of cross-lingual approaches for improving their predictions. Participants are allowed to use any external datasets, except the source data from ParlaMint. The submission system will open soon. Register on the mailing list to get notified. We provide a simple linear baseline with code for reading and writing the files.

Task Committee