Gabriele Sarti

PhD in Natural Language Processing

CLCG, University of Groningen

Welcome to my website! 👋 I am a PhD student in the InCLoW team within the Natural Language Processing group (GroNLP 🐮) at the University of Groningen. I’m also a member of the InDeep consortium, working on user-centric interpretability for generative language models. My supervisors are Arianna Bisazza, Malvina Nissim and Grzegorz Chrupała.

Previously, I was an applied scientist intern at Amazon Translate NYC, a research scientist at Aindo, and a Data Science MSc student at the University of Trieste, where I helped found the AI Student Society.

My research aims to translate theoretical advances in language model interpretability into actionable insights for improving trustworthiness and human-AI collaboration. To this end, I lead the development of open-source interpretability software projects to enable reproducible analyses of model behaviors. I am also excited about the potential of human behavioral signals such as keylogging, gaze, and brain recordings to improve the usability and personalization of AI solutions.

Your (anonymous) constructive feedback is always welcome! 🙂

Interests

  • Generative Language Models
  • Deep Learning Interpretability
  • Human-AI Interaction
  • User Modeling and Personalization

Education

Experience

🗞️ News


Selected Publications


Unsupervised Word-level Quality Estimation for Machine Translation Through the Lens of Annotators (Dis)agreement

We evaluate unsupervised word-level quality estimation (WQE) methods for machine translation, focusing on their robustness to human …

Steering Large Language Models for Machine Translation Personalization

We evaluate prompting- and steering-based methods for machine translation personalization in the literary domain.

QE4PE: Word-level Quality Estimation for Human Post-Editing

We investigate the impact of word-level quality estimation on MT post-editing with 42 professional post-editors.

Multi-property Steering of Large Language Models with Dynamic Activation Composition

We propose Dynamic Activation Composition, an adaptive approach for multi-property activation steering of LLMs.

Model Internals-based Answer Attribution for Trustworthy Retrieval-Augmented Generation

MIRAGE uses model internals for faithful answer attribution in retrieval-augmented generation applications.

Blog posts


ICLR 2020 Trends: Better & Faster Transformers for Natural Language Processing

A summary of promising directions from ICLR 2020 for better and faster pretrained Transformer language models.

Recent & Upcoming Talks

Interpreting Latent Features in Large Language Models
QE4PE: Word-level Quality Estimation for Human Post-Editing
Interpretability for Language Models: Current Trends and Applications

Projects


Attributing Context Usage in Language Models

An interpretability framework to detect and attribute context usage in language models' generations.

Inseq: An Interpretability Toolkit for Sequence Generation Models

An open-source library to democratize access to model interpretability for sequence generation models.

Contrastive Image-Text Pretraining for Italian

The first CLIP model pretrained on the Italian language.

Covid-19 Semantic Browser

A semantic browser for SARS-CoV-2 and COVID-19 powered by neural language models.

AItalo Svevo: Letters from an Artificial Intelligence

Generating letters with a neural language model in the style of Italo Svevo, a famous Italian writer of the 20th century.

Histopathologic Cancer Detection with Neural Networks

A journey into the state of the art of histopathologic cancer detection approaches.