PUBLICATIONS
LLMs can Implicitly Learn from Mistakes In-Context
Alazraki, L., Mozes, M., Campos, J. A., Tan, Y. C., Rei, M. and Bartolo, M.
arXiv, 2025
Procedural Knowledge in Pretraining Drives Reasoning in Large Language Models
Ruis, L., Mozes, M., Bae, J., Kamalakara, S. R., Talupuru, D., Locatelli, A., Kirk, R., Rocktäschel, T., Grefenstette, E. and Bartolo, M.
ICLR, 2025
Here's a Free Lunch: Sanitizing Backdoored Models with Model Merge
Arora, A., He, X., Mozes, M., Swain, S., Dras, M. and Xu, Q.
Findings of ACL, 2024
Challenges and Applications of Large Language Models
Kaddour, J., Harris, J., Mozes, M., Bradley, H., Raileanu, R. and McHardy, R.
arXiv, 2023
Towards Agile Text Classifiers for Everyone
Mozes, M., Hoffmann, J., Tomanek, K., Kouate, M., Thain, N., Yuan, A., Bolukbasi, T. and Dixon, L.
Findings of EMNLP, 2023
Gradient-Based Automated Iterative Recovery for Parameter-Efficient Tuning
Mozes, M., Bolukbasi, T., Yuan, A., Liu, F., Thain, N. and Dixon, L.
arXiv, 2023
Large Language Models Respond to Influence like Humans
Griffin, L.D., Kleinberg, B., Mozes, M., Mai, K., Vau, M., Caldwell, M. and Mavor-Parker, A.
First Workshop on Social Influence in Conversations (SICon), ACL, 2023
Textwash -- Automated Open-Source Text Anonymization
Kleinberg, B., Davies, T. and Mozes, M.
arXiv, 2022
Identifying Human Strategies for Generating Word-Level Adversarial Examples
Mozes, M., Kleinberg, B. and Griffin, L.D.
Findings of EMNLP, 2022
Scene Graph Generation for Better Image Captioning?
Mozes, M., Schmitt, M., Golkov, V., Schütze, H. and Cremers, D.
arXiv, 2021
Contrasting Human- and Machine-Generated Word-Level Adversarial Examples for Text Classification
Mozes, M., Bartolo, M., Stenetorp, P., Kleinberg, B. and Griffin, L.D.
EMNLP, 2021
Frequency-Guided Word Substitutions for Detecting Textual Adversarial Examples
Mozes, M., Stenetorp, P., Kleinberg, B. and Griffin, L.D.
EACL, 2021
No Intruder, no Validity: Evaluation Criteria for Privacy-Preserving Text Anonymization
Mozes, M. and Kleinberg, B.
arXiv, 2021
Uphill From Here: Sentiment Patterns in Videos from Left- and Right-Wing YouTube News Channels
Soldner, F., Ho, J., Makhortykh, M., van der Vegt, I., Mozes, M. and Kleinberg, B.
Third Workshop on NLP and CSS, NAACL-HLT, 2019
Identifying the Sentiment Styles of YouTube's Vloggers
Kleinberg, B., Mozes, M. and van der Vegt, I.
EMNLP, 2018
NETANOS - Named Entity-based Text Anonymization for Open Science
Kleinberg, B., Mozes, M., van der Toolen, Y. and Verschuere, B.
OSF preprint, 2017