Natural Language Processing

Empowering Human Translators via Interpretable Interactive Neural Machine Translation

Discussing the potential applications of interpretability research to the field of neural machine translation.

Characterizing Linguistic Complexity in Humans and Language Models

Presenting my work on different metrics of linguistic complexity and how they correlate with linguistic phenomena and with the representations learned by neural language models.

Contrastive Language-Image Pre-training for the Italian Language

We present CLIP-Italian, the first CLIP model for the Italian language, trained on more than 1.4 million image-text pairs.

Contrastive Image-Text Pretraining for Italian

The first CLIP model pretrained on the Italian language.
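
To give a concrete sense of what contrastive image-text pretraining optimizes, here is a minimal PyTorch sketch of the symmetric contrastive objective used by CLIP-style models: each image is pulled toward its own caption and pushed away from the other captions in the batch, and vice versa. The function name, embedding size, and random tensors are illustrative stand-ins, not the CLIP-Italian training code.

```python
import torch
import torch.nn.functional as F

def clip_contrastive_loss(image_emb: torch.Tensor,
                          text_emb: torch.Tensor,
                          temperature: float = 0.07) -> torch.Tensor:
    """Symmetric contrastive loss over a batch of paired image/text embeddings."""
    # L2-normalize so the dot product becomes a cosine similarity.
    image_emb = F.normalize(image_emb, dim=-1)
    text_emb = F.normalize(text_emb, dim=-1)

    # Similarity matrix: entry (i, j) compares image i with caption j.
    logits = image_emb @ text_emb.t() / temperature

    # The matching pair sits on the diagonal, so the target for row i is index i.
    targets = torch.arange(logits.size(0), device=logits.device)

    # Cross-entropy in both directions (image-to-text and text-to-image).
    loss_i2t = F.cross_entropy(logits, targets)
    loss_t2i = F.cross_entropy(logits.t(), targets)
    return (loss_i2t + loss_t2i) / 2

# Toy usage with random embeddings standing in for encoder outputs.
images = torch.randn(8, 512)
captions = torch.randn(8, 512)
print(clip_contrastive_loss(images, captions))
```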

Teaching NLP with Bracelets and Restaurant Menus: An Interactive Workshop for Italian Students

We developed an interactive workshop that introduces the basic principles of NLP and computational linguistics to Italian high school students aged 13 to 18, in the form of a game in which participants play the role of machines and must solve some of the most common problems a computer faces in understanding language.

That Looks Hard: Characterizing Linguistic Complexity in Humans and Language Models

This paper investigates the relationship between two complementary perspectives on human sentence complexity assessment and how they are modeled by a neural language model (NLM), highlighting how the linguistic information encoded in its representations changes when the model learns to predict complexity.

Interpreting Neural Language Models for Linguistic Complexity Assessment

This thesis presents a model-driven study of multiple phenomena associated with linguistic complexity and of how they are encoded in the learned representations of neural language models.

Italian Transformers Under the Linguistic Lens

We investigate whether and how using different architectures of probing models affects the performance of Italian transformers in encoding a wide spectrum of linguistic features.
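
As a rough illustration of what comparing probing model architectures looks like in practice, the sketch below fits a linear and a nonlinear probe on the same frozen sentence representations and compares their cross-validated R² for a single linguistic feature. It assumes scikit-learn; the random arrays are placeholders for real transformer embeddings and linguistic annotations, and none of this is the paper's actual experimental code.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.neural_network import MLPRegressor
from sklearn.model_selection import cross_val_score

# Stand-ins for frozen transformer sentence embeddings and one linguistic
# feature (e.g. parse tree depth); real values would come from an Italian
# transformer and a linguistic annotation pipeline.
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(500, 768))
feature = rng.normal(size=500)

# Two probe architectures trained on the same representations.
probes = {
    "linear": Ridge(alpha=1.0),
    "mlp": MLPRegressor(hidden_layer_sizes=(128,), max_iter=500),
}
for name, probe in probes.items():
    scores = cross_val_score(probe, embeddings, feature, cv=5, scoring="r2")
    print(f"{name} probe: mean R^2 = {scores.mean():.3f}")
```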

UmBERTo-MTSA @ AcCompl-It: Improving Complexity and Acceptability Prediction with Multi-task Learning on Self-Supervised Annotations

This work describes a self-supervised data augmentation approach used to improve model performance when only a moderate amount of labeled data is available.
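
The general pattern behind this kind of augmentation is pseudo-labeling: a model trained on the available gold data annotates unlabeled examples, and its confident predictions are added back as extra training signal. The toy scikit-learn sketch below shows only this pattern, with placeholder data and a plain classifier rather than the paper's multi-task transformer setup.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# A small labeled set and a larger pool of unlabeled examples
# (random feature vectors stand in for real sentence encodings).
X_labeled, y_labeled = rng.normal(size=(100, 32)), rng.integers(0, 2, 100)
X_unlabeled = rng.normal(size=(1000, 32))

# 1) Fit on the labeled data only.
model = LogisticRegression(max_iter=1000).fit(X_labeled, y_labeled)

# 2) Let the model annotate the unlabeled pool, keeping only
#    predictions above a typical confidence threshold.
probs = model.predict_proba(X_unlabeled)
confident = probs.max(axis=1) > 0.9
pseudo_labels = probs.argmax(axis=1)[confident]

# 3) Retrain on the union of gold and self-annotated examples.
X_aug = np.vstack([X_labeled, X_unlabeled[confident]])
y_aug = np.concatenate([y_labeled, pseudo_labels])
model = LogisticRegression(max_iter=1000).fit(X_aug, y_aug)
print(f"added {confident.sum()} self-annotated examples")
```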

ETC-NLG: End-to-end Topic-Conditioned Natural Language Generation

We present ETC-NLG, an approach that leverages topic modeling annotations to enable fully unsupervised End-to-end Topic-Conditioned Natural Language Generation over emergent topics in unlabeled document collections.
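
To make the pipeline more tangible, the sketch below shows the topic-annotation half of such an approach: a topic model fitted on an unlabeled collection provides automatic labels that a conditioned generator could then use as control attributes. The toy corpus, topic count, and the final conditioning step (left as a comment) are illustrative assumptions, not ETC-NLG's actual implementation.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

# Toy unlabeled "document collection"; real input would be a large corpus.
docs = [
    "the match ended with a late goal and a penalty shootout",
    "the striker scored twice before the final whistle",
    "the new budget law raises taxes on imported goods",
    "parliament approved the reform after a long debate",
    "the telescope captured images of a distant galaxy",
    "astronomers measured the orbit of the new exoplanet",
]

# 1) Fit a topic model on the unlabeled collection.
vectorizer = CountVectorizer(stop_words="english")
counts = vectorizer.fit_transform(docs)
lda = LatentDirichletAllocation(n_components=3, random_state=0).fit(counts)

# 2) Use the inferred topics as automatic annotations for each document.
doc_topics = lda.transform(counts).argmax(axis=1)
vocab = vectorizer.get_feature_names_out()
for topic_id in range(3):
    top_words = [vocab[i] for i in lda.components_[topic_id].argsort()[-4:]]
    print(f"topic {topic_id}: {top_words}")
print("document -> topic:", list(doc_topics))

# 3) In a full pipeline these emergent topic labels would condition a neural
#    generator (e.g. as control attributes for an attribute-conditioned LM),
#    so that new text can be generated "about" a chosen topic.
```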