SODAS Lecture with Joana Gonçalves de Sá

Title: Machine Learning as a magnifying glass to study society
Abstract: Machine Learning Algorithms (MLAs) are trained on vast amounts of data and work by learning patterns and finding non-linear, often black-box mathematical relations within that data. Because most human-related data carries biases, these models will also be biased. A fundamental question, therefore, is how we can identify these biases and ensure that MLAs do not perpetuate, or even amplify, prejudice.
So far, the standard approach has been deductive: listing known biases (e.g. racism) and then searching for them in data or models. This assumes that all biases are known, identifiable, and "debiasable". But no universal list exists, new biases emerge, and researchers' own perspectives can influence the debiasing process. We must therefore develop inductive systems to uncover and minimize these hidden biases.
The talk will have two parts. First, I will describe a systematic experimental audit of search engine results. We developed a system of stateful web crawlers (bots) that mimic user browsing, controlling language, location, visited websites, collected tracking data, etc. These customized bots can then be directed to different search engines, in different countries, and make the exact same queries simultaneously. By analyzing differences in search engine and LLM chatbot recommendations, we can identify algorithmic customization. I will present results from audits performed in the weeks prior to the 2024 EU Parliamentary and US Presidential elections and discuss how the identified differences can influence voting preferences and even lead to polarization.
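To make the comparison concrete, below is a minimal Python sketch of the kind of measurement such an audit relies on: two bot profiles that differ only in simulated language and location issue the same query, and the overlap between the returned result lists is computed. The BotProfile class, the fetch_results placeholder, and the Jaccard overlap measure are illustrative assumptions for this sketch, not the speaker's actual pipeline.

# Minimal sketch: compare results returned to two differently configured "bot" profiles.
from dataclasses import dataclass, field

@dataclass
class BotProfile:
    language: str                                  # e.g. "en-US" vs "de-DE"
    country: str                                   # simulated location
    history: list = field(default_factory=list)    # previously visited sites (statefulness)

def fetch_results(profile: BotProfile, query: str) -> list[str]:
    """Hypothetical stand-in for issuing `query` through a stateful crawler
    configured with `profile`; returns the ranked result URLs."""
    # A real audit would drive a browser/crawler; here we return dummy data.
    return [f"https://example.org/{profile.country}/{i}" for i in range(10)]

def jaccard(a: list[str], b: list[str]) -> float:
    """Overlap between two result sets, ignoring rank (1.0 = identical sets)."""
    sa, sb = set(a), set(b)
    return len(sa & sb) / len(sa | sb) if (sa | sb) else 1.0

if __name__ == "__main__":
    query = "who should I vote for"
    bot_a = BotProfile(language="en-US", country="us")
    bot_b = BotProfile(language="de-DE", country="de")
    overlap = jaccard(fetch_results(bot_a, query), fetch_results(bot_b, query))
    print(f"Result overlap for {query!r}: {overlap:.2f}")  # low overlap suggests customization

In the actual audits, richer rank-aware comparisons and many simultaneous queries would replace this single pairwise overlap, but the underlying logic of holding the query fixed while varying the simulated user is the same.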
Second, I will explore how we can leverage MLAs to discover novel biases. Since MLAs efficiently learn known prejudices, we should be able to reverse the process, developing bottom-up tools that reveal latent, unknown biases. This project is in its early stages, and I would welcome community feedback.
Bio:
From 2018 to 2020, she was an Associate Professor at Nova School of Business and Economics and, before that, a Principal Investigator at Instituto Gulbenkian de Ciência, where she also coordinated the Science for Society Initiative and was the founder and Director of the Graduate Program Science for Development (PGCD), which aims to improve scientific research in Africa.
She received two ERC grants (StG 2019 and PoC 2022) to study human and algorithmic biases, using fake news as a model system.