Course dates
Day 1 – Tuesday, 27 August 2024 – 9.00-16.00
Day 2 – Wednesday, 28 August 2024 – 9.00-16.00
Day 3 – Thursday, 29 August 2024 – 9.00-16.00
Course description
This course introduces the basics of big data and machine learning and how they can be used in social science research. No prior knowledge of big data or machine learning is assumed. Topics include how big data methods differ from inferential statistics, data cleaning and pre-processing, machine learning models and explainable AI methods, data ethics and privacy, and bias and responsible AI.
Core machine learning principles such as cross-validation, out-of-sample prediction, and hyper-parameter tuning will be introduced. A key focus will be on interpretable machine learning models such as regression-based models, decision trees, and random forests.
The assessment involves completing a machine learning analysis in either Python or R, so some prior experience with one of these programming languages is preferred. A basic understanding of inferential statistics is also preferred. Participants can bring their own data or use the data provided.
Course organisers and teachers
- Rosa Ellen Lavelle-Hill, Assistant Professor, Department of Psychology, University of Copenhagen