SODAS Lecture with Arthur Spirling
Title: Model Complexity for Supervised Learning: Why Simple Models Almost Always Work Best, And Why It Matters for Applied Research
Abstract: Inspired by other fields, political scientists have embraced the use of supervised learning for prediction, inference, measurement and description. In doing so, they typically use flexible models of considerable complexity that have proved successful in non-social science settings. Yet there appear to be profound limits to the payoff of such approaches, at least relative to the alternative of using very simple (generalized linear) models for such tasks. We explain why this is, how to identify the problems for which this will be true, and what to do about it. We show that the intrinsic dimension of political science data is low, and this means returns to complexity are muted or non-existent. We provide a theory of “data curation” to explain this state of affairs. Our approach allows us to diagnose when simple models are optimal, and to provide advice for practitioners seeking to use machine learning.
Arthur Spirling is the Class of 1987 Professor of Politics and the Director of Graduate Studies. He received bachelor's and master's degrees from the London School of Economics, and a master's degree and PhD from the University of Rochester. Previously, he served on the faculties of Harvard University and New York University.
Spirling's research centers on quantitative methods for analyzing political behavior, especially institutional development and the use of text-as-data. His work on these subjects has appeared in outlets such as the American Political Science Review, the American Journal of Political Science and the Journal of the American Statistical Association. He currently works on problems at the intersection of data science and social science, including those related to machine learning and large language models.
He previously won teaching and mentoring awards at Harvard and NYU, along with the "Emerging Scholar" prize from the Society for Political Methodology.