Deep Learning | Gabriele Sarti

Multi-property Steering of Large Language Models with Dynamic Activation Composition

We propose Dynamic Activation Composition, an adaptive approach for multi-property activation steering of LLMs.

Model Internals-based Answer Attribution for Trustworthy Retrieval-Augmented Generation

MIRAGE uses model internals for faithful answer attribution in retrieval-augmented generation applications.

IT5: Text-to-text Pretraining for Italian Language Understanding and Generation

IT5s are the first encoder-decoder transformers pretrained on more than 40 billion Italian words.

A Primer on the Inner Workings of Transformer-based Language Models

This primer provides a concise technical introduction to the current techniques used to interpret the inner workings of Transformer-based language models, focusing on the generative decoder-only architecture.

DecoderLens: Layerwise Interpretation of Encoder-Decoder Transformers

We propose DecoderLens, a method to interpret the iterative refinement of representations in encoder-decoder Transformer models.

Quantifying the Plausibility of Context Reliance in Neural Machine Translation

We introduce PECoRe, an interpretability framework for identifying context dependence in language model generations.

RAMP: Retrieval and Attribute-Marking Enhanced Prompting for Attribute-Controlled Translation

We introduce Retrieval and Attribute-Marking enhanced Prompting (RAMP) to perform attribute-controlled MT with multilingual LLMs.

Are Character-level Translations Worth the Wait? Comparing ByT5 and mT5 for Machine Translation

We analyze the input contributions of character-level MT models and show how they modulate word-level and character-level information.

Inseq: An Interpretability Toolkit for Sequence Generation Models

We present Inseq, a Python library to democratize access to interpretability analyses of sequence generation models.

Inseq: An Interpretability Toolkit for Sequence Generation Models

An open-source library to democratize access to model interpretability for sequence generation models.