Evaluation | Gabriele Sarti

Non Verbis, Sed Rebus: Large Language Models are Weak Solvers of Italian Rebuses

We evaluate the rebus-solving capabilities of large language models on a new Italian dataset.