PhD defence on "Biases in Natural Language Processing"
SODAS PhD student Terne Sasha Thorn Jakobsen will defend her dissertation "Biases in Natural Language Processing" on Friday, 31st March, 15.30-18.30 CET.
Title: Biases in Natural Language Processing
Date: 31st March at 15.30-18.30 CET
Zoom-link: https://ucph-ku.zoom.us/j/64782787189?pwd=L1lJQXV1eUJ0d0dRWDFzM0VNK0FHUT09
Abstract: “In recent years, Machine Learning and Natural Language Processing (NLP) systems have been found to discriminate against minority groups in many respects, despite showing high performance on popular benchmarks. The thesis examines challenges in evaluation and data collection protocols that may result in biased systems and overestimated real-world performance. We studied spurious correlations in an Argument Mining system that aimed to recognise arguments (and generalise) across several debating topics but succeeded in doing so only due to spurious correlations that were not reflected in standard evaluation protocols. We then re-annotated data for this same task under four annotation guidelines and with annotators of different gender identities and political beliefs, finding indications of socio-demographic annotator bias, the extent of which depended on the annotation guidelines. Following these findings, we experimented with a change to the annotation guideline, aiming to reduce annotator bias, and found this to be a highly complex aim. Through these experiments, we gained new perspectives on annotator bias, and we propose a method for recognising individual annotators’ biases without the need for large sample sizes. We extended the knowledge of annotator bias phenomena to popular model explainability methods, recognising that current approaches to benchmarking model explanations – which are often used to detect spurious correlations and unfair patterns – might suffer from some of the same bias issues. We re-annotated data for sentiment classification and common-sense reasoning tasks with socio-demographically diverse annotator information, and we present preliminary results indicating that the current benchmarking approach disfavours minority groups. Lastly, since peer review for NLP conferences has been criticised for its arbitrariness and biases, we conducted surveys of authors’, reviewers’ and editors’ issues and ideals, and we provide actionable recommendations for improving peer review. Collectively, these studies bring new insights into the role of bias in text data for NLP, and they contribute new data that enable further research on the bias and fairness of NLP systems, explainability methods and conference peer review.”
The defence takes place in CSS 1.1.02 and is followed by a reception in CSS 1.1.12.