Attributing Context Usage in Language Models

Gabriele Sarti, Jirui Qi, Grzegorz Chrupała, Malvina Nissim, Raquel Fernández, Arianna Bisazza

Dec 13, 2022 Interpretability

PECoRe is a framework that uses the internal properties of generative language models to identify and attribute context usage in their generations. The framework consists of two steps: Context-sensitive Token Identification (CTI), where generated tokens are classified as context-sensitive by contrastively comparing their probabilities with and without context, and Contextual Cues Imputation (CCI), where the context dependence of the tokens selected in the CTI step is traced back to specific contextual cues using contrastive attribution. The framework is integrated in the Inseq interpretability library and can be used through the inseq attribute-context command. It is described in detail in the paper Quantifying the Plausibility of Context Reliance in Neural Machine Translation, published at ICLR 2024. Its extension, MIRAGE, supports answer attribution in RAG applications and is described in the paper Model Internals-based Answer Attribution for Trustworthy Retrieval-Augmented Generation.
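The CTI step can be illustrated with a toy sketch. It assumes the next-token distributions with and without context are already available for each generated token, scores each token with the KL divergence between the two, and flags tokens above a fixed threshold. The function names, the toy distributions, the choice of KL divergence, and the threshold value are all illustrative assumptions, not the Inseq implementation or the exact procedure from the paper. The CCI step is not covered here.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions over the same vocabulary."""
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0
    )


def identify_context_sensitive(tokens, probs_with_ctx, probs_without_ctx, threshold=0.5):
    """Toy CTI step (hypothetical helper): flag tokens whose predictive
    distribution shifts strongly when the context is removed."""
    results = []
    for tok, p_ctx, p_no_ctx in zip(tokens, probs_with_ctx, probs_without_ctx):
        score = kl_divergence(p_ctx, p_no_ctx)
        results.append((tok, score, score > threshold))
    return results


# Made-up distributions over a 3-word vocabulary, one per generated token.
tokens = ["She", "is", "happy"]
with_ctx = [[0.90, 0.05, 0.05], [0.10, 0.80, 0.10], [0.20, 0.20, 0.60]]
without_ctx = [[0.10, 0.85, 0.05], [0.10, 0.80, 0.10], [0.20, 0.25, 0.55]]

results = identify_context_sensitive(tokens, with_ctx, without_ctx)
for tok, score, flagged in results:
    print(f"{tok}: score={score:.3f} context_sensitive={flagged}")
```

In this toy input only the first token is flagged, since its distribution changes substantially once the context is dropped. In a real pipeline the distributions would come from two forward passes of the same model, and the flagged tokens would then be passed to an attribution method to find the contextual cues behind them.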

Natural Language Processing, Interpretability, Deep Learning, Natural Language Generation

Publications

Model Internals-based Answer Attribution for Trustworthy Retrieval-Augmented Generation

MIRAGE uses model internals for faithful answer attribution in retrieval-augmented generation applications.

Published in: arXiv (* equal contribution)

Jirui Qi*, Gabriele Sarti*, Raquel Fernández, Arianna Bisazza

Quantifying the Plausibility of Context Reliance in Neural Machine Translation

We introduce PECoRe, an interpretability framework for identifying context dependence in language model generations.

Published in: ICLR 2024

Gabriele Sarti, Grzegorz Chrupała, Malvina Nissim, Arianna Bisazza

Talks

Interpreting Context Usage in Generative Language Models with Inseq, PECoRe and MIRAGE

This presentation focuses on applying post-hoc interpretability techniques to analyze how language models (LMs) use input information …

Jul 16, 2024 Ludwig Maximilian University of Munich, Bayern, Germany CIS LMU Seminar

Gabriele Sarti

Interpreting Context Usage in Generative Language Models with Inseq and PECoRe

This talk discusses the challenges and opportunities in conducting interpretability analyses of generative language models. We begin by …

May 20, 2024 Politecnico di Torino, Piedmont, Italy Politecnico di Torino Invited Talk

Gabriele Sarti

Quantifying the Plausibility of Context Reliance in Neural Machine Translation

This talk presents the PECoRe framework for quantifying the plausibility of context reliance in neural machine translation. The …

May 17, 2024 Area Science Park, Trieste, Italy Area Science Park Seminar

Gabriele Sarti

Quantifying the Plausibility of Context Reliance in Neural Machine Translation

This talk presents the PECoRe framework for quantifying the plausibility of context reliance in neural machine translation. The …

Apr 26, 2024 Harmonie Building, University of Groningen, The Netherlands GroNLP Reading Group

Gabriele Sarti

Post-hoc Interpretability for Generative Language Models: Explaining Context Usage in Transformers

This talk discusses the challenges of interpreting generative language models and presents Inseq, a toolkit for interpreting sequence …

Mar 1, 2024 Online SheffieldNLP Invited Talk

Gabriele Sarti

Post-hoc Interpretability for Language Models

This talk discusses the challenges of interpreting generative language models and presents Inseq, a toolkit for interpreting sequence …

Oct 26, 2023 eScience Center, Amsterdam eScience Center SIG-NLP Seminar

Gabriele Sarti
