I am a tenured Associate Professor in the Computer Science Department at the IT University of Copenhagen, affiliated with the NLPNorth section. My work focuses on interpretability and robustness of NLP applications based on Large Language Models, as well as their sociotechnical impacts. I am currently an editor-in-chief of ACL Rolling Review, the peer review platform for all major NLP conferences.
I hold a PhD in Computational Linguistics from the University of Tokyo, followed by postdocs in Machine Learning for NLP (University of Massachusetts Lowell) and Social Data Science (University of Copenhagen).
Open research positions: see here
News
- 14.10.2024 Job ads are up for PhD and Postdoc positions on data attribution for LLMs
- 08.10.2024 🏆 I received a Villum Synergy grant for an interdisciplinary (NLP-sociology) project
- 03.10.2024 I am now an ELLIS fellow!
- 13.08.2024 🏆 AI ‘News’ Content Farms Are Easy to Make and Hard to Detect: A Case Study in Italian received an area chair award at ACL’24!
- 15.07.2024 Max Müller-Eberstein is joining my lab for a project on efficiency and generalization, funded by DFF. More jobs coming up soon.
- 11.07.2024 We present the results of our ACL’24 study on detection of synthetic news content as expert discussants in a technical dialogue between the European AI Office (EUAIO) and the U.S. AI Safety Institute (USAISI), held at the U.S. Department of Commerce in Washington, D.C. and online.
- 15.05.2024 AI ‘News’ Content Farms Are Easy to Make and Hard to Detect: A Case Study in Italian was accepted to ACL 2024!
- 02.05.2024 Mind your Language (Model): Fact-Checking LLMs and their Role in NLP Research and Practice was accepted to ICML 2024!
- 30.04.2024 Looking for a postdoc! A 1-year position focusing on the development of a new generalization benchmark.
- 11.04.2024 I will serve on the Education & Networking Committee of the Danish Data Science Academy. Excited to help with shaping the future of Data Science in Denmark!
- 20.03.2024 We are hiring! An open PhD position for fall 2024, co-supervised by Rob van der Goot and myself!
- 12.03.2024 I am joining ACL Rolling Review as an editor-in-chief!
- 19.02.2024 NarrativeTime: Dense Temporal Annotation on a Timeline was accepted to LREC-COLING 2024!
- 28.01.2024 I co-organize Dagstuhl seminar 24052 on old and new problems in peer review (with Nihar Shah and Iryna Gurevych). A whole week to discuss issues with reviewer policies, incentives, diversity, and of course possible NLP tools to assist peer review and ensure its quality.
- 23.01.2024 🏆 I am one of the Villum Young Investigators 2024! The PlagAIrism project will focus on creating new methods for attributing text generated by language models to its sources in the training data.
- 13.12.2023 🏆 I received an Inge Lehmann grant to work on efficiency and generalization!
- 06.12.2023 Invited talk at the GenBench workshop 2023, co-located with EMNLP 2023 (Singapore)
- 23.11.2023 Speaking at the Sprogteknologisk Konference 2023 and Cambridge Language Technology Lab
- 07.11.2023 Speaking in the Digital Tech Summit session on challenges and opportunities in generative AI (Copenhagen)
- 07.09.2023 Invited lecture at Analytical Connectionism 2023 (University College London)
- 14.08.2023 New preprint with Sasha Luccioni: Mind your Language (Model): Fact-Checking LLMs and their Role in NLP Research and Practice
- 14.07.2023 ACL 2023 is over! Over 5K submissions, 15K authors and reviewers, 3.5K attendees! This was a LOT of work to organize.
- 09.07.2023 Our 36-page ACL 2023 peer review report is live in ACL Anthology! And here is a summary blog post.
- 27.06.2023 Very excited to give a keynote at the National Data Science PhD Meetup for the Danish Data Science Academy!
- 13.06.2023 Thrilled to visit Switzerland for a keynote at SwissText, and a visit to ETH Zurich!
- 06.06.2023 I have the honor to present some of the LLM governance work that I was part of at BigScience for the European Commission!
- 08.05.2023 The ROOTS search tool was accepted at the ACL demo track!
- 29.04.2023 Invited talk at AI and the Barrier of Meaning seminar in Santa Fe!
- 16.03.2023 Excited to start my new job at ITU!
- 16.02.2023 Invited talk at the Institute for Analytical Sociology (Linköping Uni) on data governance for Large Language Models
- 10.02.2023 Peer review tutorials for ACL’23: basic peer review process at *ACL conferences and ACL’23 review policies.
- 12.01.2023 Blog posts on ACL’23 writing assistance policy and paper-reviewer matching. I’m also responding to comments on these on Twitter and Mastodon.
- 09.11.2022 Happy to have contributed to the BigScience project. The BLOOM preprint is out, and the ROOTS corpus paper is published in NeurIPS.
- 06.10.2022 “Outlier Dimensions that Disrupt Transformers Are Driven by Frequency” will appear in Findings of EMNLP!
- 09.09.2022 I’m giving a keynote at the 25th International Conference on Text, Speech and Dialogue (TSD 2022).
- 24.08.2022 This is surreal: I’ll be one of the Program Chairs for ACL 2023, one of the top conferences in the field of NLP!
- 15.08.2022 “Machine Reading, Fast and Slow” will appear in COLING 2022! Preprint coming soon.
- 11.08.2022 Our QA Dataset Explosion will appear in ACM CSUR.
- 14.07.2022 I’m giving an invited talk at the First Workshop on Adversarial Data Collection (NAACL 2022).
- 17.06.2022 I’m giving a keynote at CLIN 2022.
- 06.06.2022 Absolutely thrilled to be part of the Efficient and Equitable Natural Language Processing in the Age of Deep Learning.
- 25.05.2022 The third edition of the Workshop on Insights from Negative Results in NLP is happening in hybrid mode at ACL 2022.
- 29.04.2022 An invited talk at the University of Edinburgh.
- 07.04.2022 “Data Governance in the Age of Large-Scale Data-Driven Language Technology” (a collaboration with HuggingFace’s BigScience project) is accepted to FAccT 2022! Preprint
- 07.04.2022 A paper on the challenges of paper-reviewer assignment was accepted for NAACL 2022 main track! Preprint coming soon.
- 31.03.2022 An invited talk at Stanford NLP seminar!
- 14.03.2022 Our QA Dataset Explosion will appear in ACM CSUR!
- 14.02.2022 An invited talk at Oslo Language Technology Group, presenting a collection of findings on generalization in Transformer-based models.
- 07.02.2022 SODAS is open to hosting interdisciplinary PhD projects broadly concerning AI and society - co-supervised by me and real social scientists! See the call for DDSA fellowships (deadline: March 20).
- 15.01.2022 I’m one of the Senior Area Chairs for the Model Analysis and Interpretability track at ACL 2022.
- 15.12.2021 An invited talk at London ML Meetup, presenting a collection of BERTology thoughts and findings.
- 7-12.11.2021 I am one of the recipients of the Widening NLP award, which allows me to travel to EMNLP 2021. I will speak on the panel “The Peer Review Process and Widening NLP”, and present two papers: “Just What Do You Think You’re Doing, Dave?” (EMNLP Findings, presented at the Natural Legal Language Processing Workshop), and “Generalization in NLI: Ways (Not) To Go Beyond Simple Heuristics” (Workshop on Insights from Negative Results).
- 09.11.2021 “Changing the world by changing the data”: invited talk at the Data-centric AI day at France in AI.
- 27.10.2021 Presenting a collection of BERTology findings for the Technical University of Denmark.
- 04.10.2021 Our paper Generalization in NLI: Ways (Not) To Go Beyond Simple Heuristics is accepted to the Workshop on Insights from Negative Results in NLP.
- 21.09.2021 An invited talk at Toronto ML Summit on the real-world effects of the data we use in NLP.
- 31.08.2021 New blog post: BERT Busters: Outlier Dimensions that Disrupt Transformers.
- 25.08.2021 New paper in EMNLP Findings 2021: ‘Just What do You Think You’re Doing, Dave?’ A Checklist for Responsible Data Use in NLP
- 14.08.2021 Interview on peer review for Science Report
- 03.08.2021 Virtual talk at ACL2021: Changing the world by changing the data
- 28.07.2021 BigScience Episode 1 is happening! Thrilled to collaborate with the data governance group co-led by Margaret Mitchell.
- 27.07.2021 New preprint with Matt Gardner and Isabelle Augenstein: QA Dataset Explosion
- 30.06.2021 My first podcast interview! Talking to the Gradient Podcast about peer review.
- 17.06.2021 Virtual talk at L3-AI conference (Rasa): A primer in BERTology: what we know about how BERT works.
- 07.05.2021 Virtual talk at LTI colloquium (Carnegie Mellon University): The quest for difficult benchmarks in question answering and reading comprehension.
- 05.05.2021 One long paper accepted to ACL 2021 main track, and two to Findings! Preprints coming.
- 20.04.2021 Tutorial on Reviewing NLP research at EACL 2021.
- 12.01.2021 The Primer in BERTology came out in TACL, and will be presented at NAACL 2021.
- 02.12.2020 What Can We Do To Improve Peer Review in NLP is featured in the Science Report (and also the Gradient).
- 30.10.2020 I am the new secretary of SIGREP! Many thanks to all who voted for me.
- 25.10.2020 Two papers accepted! When BERT plays the lottery, all tickets are winning will appear in EMNLP 2020, and What Can We Do To Improve Peer Review in NLP will appear in Findings of EMNLP.
- 17.09.2020 Virtual talk at NYU Center for Data Science: When BERT plays the lottery, all tickets are winning.
- 25.06.2020 Our Primer in BERTology is accepted to TACL!
- 18.06.2020 How Much Should Conversational AI Developers know about ML and Linguistics? I’m in a panel discussion with Emily M. Bender, Thomas Wolf, and Vladimir Vlasov at the Level 3 AI Assistant Conference.
- 01.06.2020 Starting my new job at the Center for Social Data Science at the University of Copenhagen!
- 04.2020 Honored to serve as publicity chair for both EMNLP and COLING 2020!