Datasprint: The political debate in Europe illuminated through text and data mining

Come and participate in our datasprint, where we will go from data to results in less than two days while having a good time.

NB: Participants in this datasprint are expected to have at least a basic knowledge of Python programming and of structuring data in Python.

We invite students to take part in exploring the dataset ‘Multilingual comparable corpora of parliamentary debates ParlaMint 2.1’ (Erjavec, Tomaž; et al., 2021, Multilingual comparable corpora of parliamentary debates ParlaMint 2.1, Slovenian language resource repository CLARIN.SI, http://hdl.handle.net/11356/1432).

The ‘Multilingual comparable corpora of parliamentary debates ParlaMint 2.1’ offers a unique opportunity to look inside the national parliaments of Europe.

Participants will be divided into groups. In your group, you must decide on a relevant question that you can examine and answer with data. You have about 10 hours to answer the question and to prepare a presentation of 5-7 minutes, which must contain 3-5 graphs or visualizations.

A relevant question could, for example, deal with the distribution of age, gender or political parties, or it could be an analysis of the speeches themselves. The dataset is highly structured: row after row of speeches representing the political attitudes and agendas of parties and politicians from 2015 to 2020. Besides the speeches, you will find metadata about the politicians, for example the speaker's name, gender, party affiliation and date of birth.
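To give an idea of the kind of question a group could start from, here is a minimal Python sketch that counts speeches by gender and by party and computes the average age of speakers per party. The rows and the field names (`speaker`, `gender`, `party`, `birth_year`, `year`, `text`) are invented for illustration; the actual ParlaMint files may be organized and named differently.

```python
from collections import Counter, defaultdict
from statistics import mean

# Hypothetical example rows: one dictionary per speech.
# These names and values are assumptions, not taken from the real dataset.
speeches = [
    {"speaker": "A", "gender": "F", "party": "Party X", "birth_year": 1970, "year": 2018, "text": "..."},
    {"speaker": "B", "gender": "M", "party": "Party X", "birth_year": 1960, "year": 2018, "text": "..."},
    {"speaker": "C", "gender": "F", "party": "Party Y", "birth_year": 1985, "year": 2020, "text": "..."},
    {"speaker": "A", "gender": "F", "party": "Party X", "birth_year": 1970, "year": 2020, "text": "..."},
]


def count_by(rows, field):
    """Count how many speeches share each value of the given metadata field."""
    return Counter(row[field] for row in rows)


def mean_age_by_party(rows):
    """Average approximate age of the speaker at the time of the speech, per party."""
    ages = defaultdict(list)
    for row in rows:
        ages[row["party"]].append(row["year"] - row["birth_year"])
    return {party: mean(values) for party, values in ages.items()}


print(count_by(speeches, "gender"))
print(count_by(speeches, "party"))
print(mean_age_by_party(speeches))
```

Counts and averages like these can be turned directly into the bar charts or other visualizations needed for the presentation.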

The datasprint will take place on Thursday 10 February from 2 pm to 7 pm and Friday 11 February from 9 am to 3 pm.

The datasprint is organized jointly by Copenhagen University Library Datalab (KUB Datalab) and the Master's programme in Social Data Science (SDS).

Register here. Deadline: 23 January 2022.