Attributing Context Usage in Language Models
PECoRe is a framework that uses the internal properties of generative language models to identify and attribute context usage in their generations. The framework consists of two steps: Context-sensitive Token Identification (CTI), where generated tokens are classified as context-sensitive by contrastively comparing their probabilities with and without context, and Contextual Cues Imputation (CCI), where contrastive attribution is used to highlight which parts of the context the tokens selected in the CTI step depend on. The framework is integrated in the Inseq interpretability library and can be used through the inseq attribute-context command. It is described in detail in the paper "Quantifying the Plausibility of Context Reliance in Neural Machine Translation", published at ICLR 2024. Its extension MIRAGE, introduced in the paper "Model Internals-based Answer Attribution for Trustworthy Retrieval-Augmented Generation", supports answer attribution in RAG applications.
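
To make the CTI step more concrete, here is a minimal, self-contained sketch of the contrastive comparison it relies on. It does not use Inseq or a real model: the function names, the choice of KL divergence as the contrastive metric, the fixed threshold, and the toy probability values are all assumptions made for illustration, not the framework's actual implementation.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions over the same vocabulary."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def context_sensitive_tokens(tokens, probs_with_ctx, probs_without_ctx, threshold=0.1):
    """Return (token, score) pairs whose contrastive score exceeds the threshold.

    Each entry of probs_with_ctx / probs_without_ctx is the next-token
    distribution at that generation step, with and without the context.
    The threshold value is an arbitrary assumption for this sketch.
    """
    flagged = []
    for tok, p_ctx, p_no_ctx in zip(tokens, probs_with_ctx, probs_without_ctx):
        score = kl_divergence(p_ctx, p_no_ctx)
        if score > threshold:
            flagged.append((tok, score))
    return flagged


# Toy example over a 3-word vocabulary (hypothetical numbers).
tokens = ["She", "is", "tired"]
probs_with_ctx = [[0.8, 0.1, 0.1], [0.1, 0.8, 0.1], [0.1, 0.1, 0.8]]
probs_without_ctx = [[0.2, 0.7, 0.1], [0.1, 0.8, 0.1], [0.1, 0.15, 0.75]]

flagged = context_sensitive_tokens(tokens, probs_with_ctx, probs_without_ctx)
for tok, score in flagged:
    print(f"{tok}: contrastive score {score:.3f}")
```

In this toy setup only the first token is flagged, because its distribution shifts sharply once the context is removed. In the full framework, the tokens selected this way would then be passed to the CCI step, where contrastive attribution traces them back to cues in the context.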