Maximilian Mozes
I'm a Member of Technical Staff at Cohere and a PhD student at University College London supervised by Lewis Griffin (Department of Computer
Science) and Bennett Kleinberg (Department of Security and Crime Science). My research focuses on the intersection of adversarial machine learning and natural language processing. I'm a member
of the UCL Natural Language Processing research group.
I recently interned at Google Research, working with the PAIR Team on measuring dialog safety using large language models. Prior to that,
I was a Research Scientist Intern at Spotify Research, where I focused on NLP-based content moderation in podcasts.
I obtained a Bachelor's degree in Computer Science (minor in Mathematics) from the Technical University of Munich (TUM) in March 2019.
During my undergraduate studies, I worked as a visiting research scholar at the Language and Information Technologies Group of the University of Michigan's Artificial Intelligence Lab and as a research intern in the Department of Psychology at the University of Amsterdam.
Twitter / Email / GitHub / Google Scholar / LinkedIn
|
Towards Agile Text Classifiers for Everyone
Maximilian Mozes, Jessica Hoffmann, Katrin Tomanek, Muhamed Kouate, Nithum Thain, Ann Yuan, Tolga Bolukbasi, Lucas Dixon.
Findings of EMNLP 2023.
paper
|
Use of LLMs for Illicit Purposes: Threats, Prevention Measures, and Vulnerabilities
Maximilian Mozes, Xuanli He, Bennett Kleinberg, Lewis D. Griffin.
arXiv pre-print, 2023.
paper
|
Challenges and Applications of Large Language Models
Jean Kaddour, Joshua Harris, Maximilian Mozes, Herbie Bradley, Roberta Raileanu, Robert McHardy.
arXiv pre-print, 2023.
paper
|
Large Language Models respond to Influence like Humans
Lewis Griffin, Bennett Kleinberg, Maximilian Mozes, Kimberly Mai, Maria Vau, Matthew Caldwell, Augustine Mavor-Parker.
First Workshop on Social Influence in Conversations (SICon), ACL 2023.
paper
|
Gradient-Based Automated Iterative Recovery for Parameter-Efficient Tuning
Maximilian Mozes, Tolga Bolukbasi, Ann Yuan, Frederick Liu, Nithum Thain, Lucas Dixon.
arXiv pre-print, 2023.
paper
|
Identifying Human Strategies for Generating Word-Level Adversarial Examples
Maximilian Mozes, Bennett Kleinberg, Lewis D. Griffin.
Findings of EMNLP 2022.
paper
|
Textwash -- automated open-source text anonymisation
Bennett Kleinberg, Toby Davies, Maximilian Mozes.
arXiv pre-print, 2022.
paper
|
A repeated-measures study on emotional responses after a year in the pandemic
Maximilian Mozes, Isabelle van der Vegt, Bennett Kleinberg.
Scientific Reports, 2021.
paper
|
Scene Graph Generation for Better Image Captioning?
Maximilian Mozes, Martin Schmitt, Vladimir Golkov, Hinrich Schuetze, Daniel Cremers.
Technical report.
paper
|
Contrasting Human- and Machine-Generated Word-Level Adversarial Examples for Text Classification
Maximilian Mozes, Max Bartolo, Pontus Stenetorp, Bennett Kleinberg, Lewis D. Griffin.
EMNLP 2021.
paper
|
No Intruder, no Validity: Evaluation Criteria for Privacy-Preserving Text Anonymization
Maximilian Mozes, Bennett Kleinberg.
arXiv pre-print, 2021.
paper
|
Frequency-Guided Word Substitutions for Detecting Textual Adversarial Examples
Maximilian Mozes, Pontus Stenetorp, Bennett Kleinberg, Lewis D. Griffin.
EACL 2021.
paper
|
The Grievance Dictionary: Understanding Threatening Language Use
Isabelle van der Vegt, Maximilian Mozes, Bennett Kleinberg, Paul Gill.
Behavior Research Methods, 2021.
paper
|
Measuring Emotions in the COVID-19 Real World Worry Dataset
Bennett Kleinberg, Isabelle van der Vegt, Maximilian Mozes.
NLP COVID-19 Workshop, ACL 2020.
paper
|
Online Influence, Offline Violence: Linguistic Responses to the 'Unite the Right' Rally
Isabelle van der Vegt, Maximilian Mozes, Paul Gill, Bennett Kleinberg.
Journal of Computational Social Science, 2020.
paper
|
Uphill from Here: Sentiment Patterns in Videos from Left- and Right-Wing YouTube News Channels
Felix Soldner, Justin Chun-ting Ho, Mykola Makhortykh, Isabelle van der Vegt, Maximilian Mozes, Bennett Kleinberg.
Proceedings of the Third Workshop on Natural Language Processing and Computational Social Science, NAACL-HLT 2019.
paper
|
Identifying the Sentiment Styles of YouTube's Vloggers
Bennett Kleinberg, Maximilian Mozes, Isabelle van der Vegt.
EMNLP 2018.
paper / dataset
|
Using Named Entities for Computer-Automated Verbal Deception Detection
Bennett Kleinberg, Maximilian Mozes, Arnoud Arntz, Bruno Verschuere.
The Journal of Forensic Sciences, 63(3), pp. 714-723, 2017.
paper / code
|
Web-Based Text Anonymization with Node.js: Introducing NETANOS (Named Entity-Based Text Anonymization for Open Science)
Bennett Kleinberg, Maximilian Mozes.
The Journal of Open Source Software, 2, 14, 2017.
paper / code
|
NETANOS - Named Entity-Based Text Anonymization for Open Science
Bennett Kleinberg, Maximilian Mozes, Yaloe van der Toolen.
Preprint, 2017.
preprint / code
|
LLMs for Evil.
Podcast interview with Data Skeptic, September 2023.
link
|
Google's Jigsaw was trying to fight toxic speech with AI. Then the AI started talking.
Fast Company, July 2023.
link
|
9th Workshop on Representation Learning for NLP (RepL4NLP-2024)
62nd Annual Meeting of ACL, August 2024, Bangkok, Thailand.
website
|
8th Workshop on Representation Learning for NLP (RepL4NLP-2023)
61st Annual Meeting of ACL, July 2023, Toronto, Canada.
website
|
7th Workshop on Representation Learning for NLP (RepL4NLP-2022)
60th Annual Meeting of ACL, May 2022, Dublin, Ireland.
website
|
A gentle introduction to word embeddings for the computational social sciences
Maximilian Mozes and Bennett Kleinberg.
2019 European Symposium on Societal Challenges in Computational Social Science: Polarization and Radicalization, September 2019, Zurich, Switzerland.
website
|
Linguistic temporal trajectory analysis - a dynamic approach to text data
Bennett Kleinberg, Maximilian Mozes and Isabelle van der Vegt.
2018 European Symposium on Societal Challenges in Computational Social Science: Bias and Discrimination, December 2018, Cologne, Germany.
website
|
Teaching assistant: Statistical Natural Language Processing
University College London, Academic year 2022/23.
|
Teaching assistant: Theory of Computation
University College London, Academic year 2021/22.
|
Teaching assistant: Introduction to Machine Learning
University College London, Academic year 2021/22.
|
Teaching assistant: Theory of Computation
University College London, Academic year 2020/21.
|
Teaching assistant: Introduction to Deep Learning
University College London, Academic year 2020/21.
|
Tutor: Analysis for Computer Science
Technical University of Munich, Winter term 2018/19.
Organized tutoring sessions in "Analysis for Computer Science" for undergraduate students in Informatics/Computer Science.
|
Analysing the Implications of Adversarial Training for the Robustness of Models in NLP
Ziying Cheng, MSc Machine Learning (UCL), 2021.
|
Frequency based Statistics and Detections against Adversarial Examples
Michail Koupparis, MSc Data Science and Machine Learning (UCL), 2020.
|
Textual Adversarial Attack Research - Pre-processing, Sequence, Transferability and Defense
Dongdong Chen, MSc Data Science and Machine Learning (UCL), 2020.
|
Analysing Linguistic Features of Perturbed Emails from an Adversarial Word-level Attack
Vlad Pasca, BSc Security and Crime Science (UCL), 2019.
|