SODAS Data Discussion 11 November 2022

Data Discussion

The Copenhagen Center for Social Data Science (SODAS) aspires to be a resource for all students and researchers at the Faculty of Social Sciences. We therefore invite researchers across the faculty to present ongoing research projects, project applications, or just a loose idea that relates to the subject of social data science.

The rules are simple: short research presentations of ten minutes are followed by twenty minutes of debate. No papers will be circulated beforehand, and the presentations cannot be longer than five slides.

Presenter: Peter Gregory Mehler, SODAS, University of Copenhagen

Title: How Robust is Open-Source?: An Analysis of the Global Github Dependency Network

Abstract: In the past decade there have been many large-scale security breaches exposing sensitive information of millions of people to malicious actors. One venue for potential attack is the world’s largest open-source code repository: Github. We examine the robustness of the open-source ecosystem through a network analysis of the global Github dependency network. Our analysis reveals the weakness of the open-source ecosystem to targeted attacks. This weakness is twofold: we show that a small number of individuals and organizations are responsible for safeguarding the open-source ecosystem, and that the ecosystem relies on a few major libraries, indicating large power asymmetries among repositories.

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Presenter: Fernando Bermejo, Executive Director at Media Ecosystems Analysis Group

Title: Media Cloud as a massive open-source collection of news in the open web

Abstract: This talk will provide an overview of the Media Cloud project: its history, capabilities, and future trajectory. The project, started at Harvard University over a decade ago, currently holds more than 2 billion news stories in its database and continues to ingest content from over 60k news sources around the world. Evolving and managing a non-commercial research endeavor of such dimensions, and making it available to researchers, presents a series of significant challenges that will be discussed in the talk.