Gabriele Sarti

PhD Student in Natural Language Processing

CLCG, University of Groningen

About me

Welcome to my website! 👋 I am a PhD student at the Natural Language Processing group (GroNLP 🐮) & the InCLoW research team at the University of Groningen. I’m also a member of the InDeep consortium, working on user-centric interpretability for multilingual generation and machine translation. My supervisors are Arianna Bisazza, Malvina Nissim and Grzegorz Chrupała.

Previously, I was a research intern at Amazon Translate NYC, a research scientist at Aindo, a Data Science MSc student at the University of Trieste and a co-founder of the AI Student Society.

My research focuses on interpretability for generative language models, with a particular interest in operationalizing advances in model understanding for the benefit of users. For this reason, I lead the development of robust open-source interpretability software to enable reproducible analyses of model behaviors. I am also excited about human-computer interaction, and in particular how human behavioral signals can improve human-AI collaboration.

Your (anonymous) constructive feedback is always welcome! 🙂

Interests

  • Conditional Language Generation
  • Deep Learning Interpretability
  • Human-AI Collaboration
  • Uncertainty Estimation

🗞️ News


  • PECoRe was accepted to ICLR 2024, and I presented it in Vienna! 🎉 I also co-organized the first Mechanistic Interpretability social at ICLR together with Nikhil Prakash, and we had more than 100 attendees!

Selected Publications


Unsupervised Word-level Quality Estimation for Machine Translation Through the Lens of Annotators (Dis)agreement

We evaluate unsupervised word-level quality estimation (WQE) methods for machine translation, focusing on their robustness to human …

Steering Large Language Models for Machine Translation Personalization

We evaluate prompting- and steering-based methods for machine translation personalization in the literary domain.

QE4PE: Word-level Quality Estimation for Human Post-Editing

We investigate the impact of word-level quality estimation on MT post-editing with 42 professional post-editors.

Multi-property Steering of Large Language Models with Dynamic Activation Composition

We propose Dynamic Activation Composition, an adaptive approach for multi-property activation steering of LLMs.

Model Internals-based Answer Attribution for Trustworthy Retrieval-Augmented Generation

MIRAGE uses model internals for faithful answer attribution in retrieval-augmented generation applications.

Blog posts


ICLR 2020 Trends: Better & Faster Transformers for Natural Language Processing

A summary of promising directions from ICLR 2020 for better and faster pretrained Transformer language models.

Recent & Upcoming Talks

Interpreting Latent Features in Large Language Models
QE4PE: Word-level Quality Estimation for Human Post-Editing
Interpretability for Language Models: Current Trends and Applications

Projects


Attributing Context Usage in Language Models

An interpretability framework to detect and attribute context usage in language models' generations.

Inseq: An Interpretability Toolkit for Sequence Generation Models

An open-source library to democratize access to model interpretability for sequence generation models.

Contrastive Image-Text Pretraining for Italian

The first CLIP model pretrained on the Italian language.

Covid-19 Semantic Browser

A semantic browser for SARS-CoV-2 and COVID-19 powered by neural language models.

AItalo Svevo: Letters from an Artificial Intelligence

Generating letters with a neural language model in the style of Italo Svevo, a famous Italian writer of the 20th century.

Histopathologic Cancer Detection with Neural Networks

A journey into the state of the art of histopathologic cancer detection approaches.