Scaling Interpretability for LLM Agents
Gabriele Sarti
Categories: Natural Language Processing, Academic

Links: Code · Project · Slides

Date: Mar 27, 2026
Event: Seminar at the BauLab Group of Northeastern University
Location: 177 Huntington Ave, 22nd Floor, Boston, MA, USA

Tags: Natural Language Processing, Interpretability, Sequence-to-sequence, Language Modeling, Feature Attribution, Retrieval-augmented Generation, NDIF, Mechanistic Interpretability, Agents, Goal-directedness
Related
Attribution: Tracing Influence to Inputs and Model Components
Interpretability for Language Models: Current Trends and Applications
Interpreting Context Usage in Generative Language Models