Interpretability and Model Analysis in NLP
Both research and commercial NLP applications rely on state-of-the-art deep learning models, but the inherent opacity of these models remains a challenge. Model analysis and interpretability is the subfield of Natural Language Processing that focuses on better understanding such blackbox models.
Broadly, this research area asks what happens inside NLP models during training and at inference time, how their predictions can be interpreted, how and when models fail, and how they can be helped to generalize to unseen data.
BERT remains one of the most popular NLP models, but we still know little about how it achieves its remarkable performance, and to what extent we should trust its linguistic knowledge.
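One concrete entry point into this kind of analysis is BERT's self-attention: the weights the model computes at inference time can be extracted and inspected directly, as in the Kovaleva et al. (2019) study listed below. A minimal sketch using the HuggingFace transformers library (the example sentence and the layer/head choice are illustrative):

```python
# Minimal sketch: extract BERT's self-attention weights at inference time
# (HuggingFace transformers assumed installed; the sentence and the
# layer/head choice are illustrative).
import torch
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased", output_attentions=True)
model.eval()

inputs = tokenizer("The cat sat on the mat.", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# outputs.attentions holds one tensor per layer, each of shape
# (batch, num_heads, seq_len, seq_len).
tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
layer, head = 0, 0
attn = outputs.attentions[layer][0, head]

# For each token, show the token it attends to most strongly.
for i, tok in enumerate(tokens):
    j = attn[i].argmax().item()
    print(f"{tok:>8} -> {tokens[j]} ({attn[i, j].item():.2f})")
```

Studies like those listed below aggregate such patterns across many layers, heads, and datasets rather than inspecting single sentences.

Publications: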
Bhargava, P., Drozd, A., & Rogers, A. (2021). Generalization in NLI: Ways (Not) To Go Beyond Simple Heuristics. Proceedings of the Second Workshop on Insights from Negative Results in NLP, 125–135. https://aclanthology.org/2021.insights-1.1
Kovaleva, O., Kulshreshtha, S., Rogers, A., & Rumshisky, A. (2021). BERT Busters: Outlier Dimensions that Disrupt Transformers. Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021. https://aclanthology.org/2021.findings-acl.300
Rogers, A., Kovaleva, O., & Rumshisky, A. (2020). A Primer in BERTology: What We Know About How BERT Works. Transactions of the Association for Computational Linguistics, 8, 842–866. https://doi.org/10.1162/tacl_a_00349
Prasanna, S., Rogers, A., & Rumshisky, A. (2020). When BERT Plays the Lottery, All Tickets Are Winning. Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), 3208–3229. https://aclanthology.org/2020.emnlp-main.259
Featured in The Gradient
Prior relevant work by the current SODAS staff
Kovaleva, O., Romanov, A., Rogers, A., & Rumshisky, A. (2019). Revealing the Dark Secrets of BERT. Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), 4356–4365. https://doi.org/10.18653/v1/D19-1445
Rogers, A., Drozd, A., Rumshisky, A., & Goldberg, Y. (2019). Proceedings of the 3rd Workshop on Evaluating Vector Space Representations for NLP. https://aclanthology.org/W19-2000
Rogers, A., Hosur Ananthakrishna, S., & Rumshisky, A. (2018). What’s in Your Embedding, And How It Predicts Task Performance. Proceedings of the 27th International Conference on Computational Linguistics, 2690–2703. https://aclanthology.org/C18-1228
Li, B., Liu, T., Zhao, Z., Tang, B., Drozd, A., Rogers, A., & Du, X. (2017). Investigating Different Syntactic Context Types and Context Representations for Learning Word Embeddings. Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, 2411–2421. https://aclanthology.org/D17-1257
Rogers, A., Drozd, A., & Li, B. (2017). The (Too Many) Problems of Analogical Reasoning with Word Vectors. Proceedings of the 6th Joint Conference on Lexical and Computational Semantics (*SEM 2017), 135–148. https://aclanthology.org/S17-1017
Drozd, A., Gladkova, A., & Matsuoka, S. (2016). Word Embeddings, Analogies, and Machine Learning: Beyond king - man + woman = queen. Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers, 3519–3530. https://aclanthology.org/C16-1332
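Several of the papers above concern the vector-offset method behind word analogies such as king - man + woman = queen. A minimal sketch of that method with pretrained GloVe vectors (the gensim library and the specific vector set are illustrative assumptions):

```python
# Minimal sketch of the vector-offset (3CosAdd) analogy method:
# king - man + woman ~ queen. The pretrained vector set is an
# illustrative choice; gensim is assumed to be installed.
import gensim.downloader as api

vectors = api.load("glove-wiki-gigaword-100")  # downloads on first use

# Find the words whose vectors are closest to king - man + woman.
for word, score in vectors.most_similar(
        positive=["king", "woman"], negative=["man"], topn=3):
    print(f"{word}: {score:.3f}")

# Note: most_similar excludes the query words themselves, which masks
# many failures of the method, one of the caveats raised in
# Rogers et al. (2017) above.
```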
Being able to explain the predictions of blackbox models is a prerequisite for their safe deployment, especially in areas where their decisions have significant consequences and must avoid certain kinds of bias.
Gonzalez, A. V., Rogers, A., & Søgaard, A. (2021). On the Interaction of Belief Bias and Explanations. Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021. https://aclanthology.org/2021.findings-acl.259
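As a small illustration of what such explanations can look like, gradient-based saliency attributes a classifier's prediction to its input tokens. A minimal sketch (the fine-tuned sentiment model named here is an illustrative choice, and saliency maps are only one of many explanation methods):

```python
# Minimal sketch of gradient-based input saliency: attribute the predicted
# class to input tokens via the gradient of the winning logit with respect
# to the token embeddings. The model name is an illustrative assumption.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

name = "distilbert-base-uncased-finetuned-sst-2-english"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForSequenceClassification.from_pretrained(name)
model.eval()

inputs = tokenizer("A thoroughly enjoyable film.", return_tensors="pt")
embeddings = model.get_input_embeddings()(inputs["input_ids"])
embeddings.retain_grad()  # keep the gradient of this non-leaf tensor

logits = model(inputs_embeds=embeddings,
               attention_mask=inputs["attention_mask"]).logits
pred = logits.argmax(dim=-1).item()
logits[0, pred].backward()

# Saliency score per token: L2 norm of the embedding gradient.
scores = embeddings.grad[0].norm(dim=-1)
tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
print("predicted label:", model.config.id2label[pred])
for tok, s in zip(tokens, scores):
    print(f"{tok:>12} {s.item():.3f}")
```

Whether such saliency scores actually help human judgment is itself an empirical question, which is what the paper above investigates.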
Funded by: Copenhagen Center for Social Data Science (SODAS)
Full project name: Interpretability and model analysis: understanding blackbox models in NLP
Contact: Anna Rogers, Postdoc, SODAS
External researchers:
Name | Title | Phone
---|---|---
Alexander Drozd | Research scientist at RIKEN CCS | +81-80-4332-5304
Anna Rumshisky | Associate professor at UMass Lowell | +978-934-3619
Anders Søgaard | Professor at DIKU, UCPH | +45 35 32 90 65