Mechanistic Interpretability | Gabriele Sarti
Mechanistic Interpretability
A Primer on the Inner Workings of Transformer-based Language Models
This primer provides a concise technical introduction to the current techniques used to interpret the inner workings of Transformer-based language models, focusing on the generative decoder-only architecture.