Virtual Data Discussion w/ Emil Chrisander & Jonas Lybker Juul
The Copenhagen Center for Social Data Science (SODAS) is pleased to announce that we are continuing the SODAS Data Discussions this fall.
SODAS aspires to be a resource for all students and researchers at the Faculty of Social Sciences. We therefore invite researchers across the faculty to present ongoing research projects, project applications, or just loose ideas that relate to the subject of social data science.
Every month two researchers will present their work. The rules are simple: short research presentations of ten minutes are followed by twenty minutes of debate. No papers will be circulated beforehand, and the presentations cannot be longer than five slides.
Emil Chrisander, PhD student at the Department of Economics, and Jonas Lybker Juul, postdoc at the Center for Applied Mathematics at Cornell University, will present their work.
Emil Chrisander: Prediction Policy for Matching Mechanisms: The Case of School Choice
This paper investigates whether using prediction tools in the classic Deferred Acceptance algorithm can improve match quality. We focus on admission to higher education under the objective of reducing student attrition. We consider two modifications to admission policies using Student-Proposing Deferred Acceptance: i) reject matched applicants with high attrition risk, and ii) apply attrition risk as school priority during matching. We evaluate the efficacy of these policies based on data from the Danish centralized admission mechanism. We conduct counterfactual simulations and evaluate the predictions on subsequent years’ admission data. We show that attrition can be reduced, but only by excluding students with high attrition risk, and we find evidence that enacting these mechanisms leads to systematic redistribution among applicants and schools. The popular study programs would be able to reduce their attrition rate considerably, because they can pick and choose among the applicants with low attrition risk; it follows that the less popular study programs would see increased attrition. Moreover, we show that applicants from wealthy (poor) families would improve (worsen) their admission likelihood under our proposed prediction policy interventions.
Jonas Lybker Juul: What do cascade statistics tell us about the mechanisms behind the diffusion of online content?
How does the diffusion of false news differ from that of true news? Political news vs. non-political news? Visual vs. written content? With the main data record of information diffusion being diffusion cascades – a collection of timestamped, directed, rooted trees – statistical network analysis is the primary toolkit for answering such questions rigorously. By studying the structure of these tree collections, one hopes to understand how different content spreads. In this work, we show the importance of the joint distribution of statistical properties of such trees in any analysis, both through an empirical analysis of false/true news cascades and through the analysis of a motivating model.
The SODAS Data Discussion will take place virtually via Zoom from 11.00 am to 12.00 noon.
If you want to attend the event or just want to know more, please write to Sophie Smitt Sindrup Grønning at sophie.groenning@sodas.ku.dk.