arxiv:2602.20021
Gabriele Sarti
gsarti
AI & ML interests
Interpretability for generative language models
Recent Activity
liked a dataset 5 days ago
Realmbird/nla-thought-anchors-answer-rollouts
updated a collection 11 days ago
🔍 Interpretability & Analysis of LMs
upvoted a paper 11 days ago
Faithfulness Metrics Don't Measure Faithfulness: A Meta-Evaluation with Ground Truth