This presentation summarizes the main contributions of my PhD thesis, which advocates for a user-centric perspective on interpretability research and aims to translate theoretical advances in model understanding into practical gains in trustworthiness and transparency for end users of language models.
This dissertation bridges the gap between scientific insights into how language models work and practical benefits for users of these systems, paving the way for better human-AI interaction practices for professional translators and everyday users worldwide.
We evaluate prompting-based and steering-based methods for machine translation personalization in the literary domain.