SODAS Data Discussion 8 October 2021

The Copenhagen Center for Social Data Science (SODAS) aspires to be a resource for all students and researchers at the Faculty of Social Sciences. We therefore invite researchers across the faculty to present ongoing research projects, project applications, or just a loose idea related to social data science.

The rules are simple: short research presentations of ten minutes are followed by twenty minutes of debate. No papers will be circulated beforehand, and the presentations cannot be longer than five slides.

Authors:
Emilie Munch Gregersen & Sofie Læbo Astrupgaard, SODAS

Title:
Watching people meeting: Systematizing qualitative approaches to data-intensive fieldwork at the Danish People’s Meeting

Abstract:
Can we explicate, quantify, and systematize ethnographic data collection across researchers when conducting data-intensive fieldwork? And how does a complementary collection of digital and biometric data lead to a deeper understanding of our field? We reflect upon these convoluted questions after having experimented with different data collection methods to study dynamics of attention and ‘eventness’ during the Danish People’s Meeting 2021.
In anthropological research, traditional approaches to collecting data most often entail a dynamic process in which the ethnographer switches between, or simultaneously applies, different methods such as participant observation, interviews, and writing field notes. In this presentation, we elaborate on our experience with collecting various types of data in a more formalized and machinic manner. To capture attention flows at the People’s Meeting, we used structured observational guides for quantifying qualitative observations, as well as participant observation and interviews. All of these were written down and reflected upon in a self-developed platform for writing field notes in situ on our phones. The data output and experiences leave us wondering what is lost and gained when such different data sources and approaches are combined, and when ethnographers are constrained to a common format of data collection.

Author:
Thyge Ryom Enggaard

Title:
Embedded Understanding - Eliciting and Interpreting Differences & Similarities in Word Use

Abstract:
Is it possible to formalize and computationally identify the extent to which words are understood more or less differently between two texts? If so, can computational methods also enable us to interpret this difference (or similarity)? While we might (or might not) agree completely about what words refer to (say, capitalism or vaccine), we might still disagree about what that referent is. I attempt to measure such differences based on how words co-occur in each of the two corpora, by training separate word embeddings, aligning them in a shared vector space, and measuring the geometric distance between the two aligned vectors of each word type. Current results show an undesirable negative correlation between word frequency and aligned distance: the more often a word occurs, the lower the aligned distance. While controlling for frequency might provide useful residuals, I hypothesize that this correlation results from non-isotropic embeddings (vectors that are not uniformly distributed in direction).
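
The align-and-measure step described in the abstract can be illustrated with a minimal sketch. The abstract does not name the alignment method or the distance measure, so orthogonal Procrustes alignment and cosine distance are assumptions made here for illustration. The two embedding matrices are synthetic toy data rather than trained embeddings, and all function and variable names are hypothetical.

```python
import numpy as np


def procrustes_align(source, target):
    """Return the orthogonal matrix R minimising ||source @ R - target||_F.

    Assumption: orthogonal Procrustes stands in for the alignment step,
    which the abstract does not specify.
    """
    u, _, vt = np.linalg.svd(source.T @ target)
    return u @ vt


def cosine_distance(a, b):
    """Row-wise cosine distance between two matrices of equal shape."""
    a_unit = a / np.linalg.norm(a, axis=1, keepdims=True)
    b_unit = b / np.linalg.norm(b, axis=1, keepdims=True)
    return 1.0 - np.sum(a_unit * b_unit, axis=1)


# Toy stand-ins for two separately trained embeddings over a shared vocabulary.
rng = np.random.default_rng(0)
n_words, dim = 200, 5
emb_a = rng.normal(size=(n_words, dim))

# Corpus B's space is an arbitrary rotation of corpus A's space ...
rotation, _ = np.linalg.qr(rng.normal(size=(dim, dim)))
emb_b = emb_a @ rotation
# ... except that word 0 is used "differently": its vector is flipped.
emb_b[0] = -emb_b[0]

# Align A onto B, then measure the per-word distance in the shared space.
r = procrustes_align(emb_a, emb_b)
dist = cosine_distance(emb_a @ r, emb_b)

print("word with largest aligned distance:", int(np.argmax(dist)))
```

In this toy setup only the flipped word should stand out after alignment. With real embeddings, the per-word distances could then be compared against word frequency to inspect the correlation the abstract reports.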