Gabriele Sarti

gsarti

https://gsarti.com

AI & ML interests

Interpretability for generative language models

Recent Activity

upvoted a paper 1 day ago

Qwen2.5 Technical Report

updated a collection 2 days ago

🔍 Daily Picks in Interpretability & Analysis of LMs

upvoted a paper 2 days ago

Incremental Sentence Processing Mechanisms in Autoregressive Transformer Language Models

gsarti's activity

commented a paper 30 days ago

Do I Know This Entity? Knowledge Awareness and Hallucinations in Language Models

Paper • 2411.14257 • Published about 1 month ago • 9

New activity in gsarti/opus-mt-tc-en-pl 2 months ago

how to fine tune this model to get better polish translation

#3 opened over 1 year ago by

commented 2 papers 5 months ago

Measuring Progress in Dictionary Learning for Language Model Interpretability with Board Game Models

Paper • 2408.00113 • Published Jul 31 • 6

Non Verbis, Sed Rebus: Large Language Models are Weak Solvers of Italian Rebuses

Paper • 2408.00584 • Published Aug 1 • 6

New activity in huggingface/HuggingDiscussions 5 months ago

[FEEDBACK] Collections

#12 opened over 1 year ago by

New activity in unsloth/Phi-3-mini-4k-instruct-v0-bnb-4bit 5 months ago

Silent swapping of Phi-3 mini model

#1 opened 5 months ago by

commented a paper 5 months ago

LLM Circuit Analyses Are Consistent Across Training and Scale

Paper • 2407.10827 • Published Jul 15 • 4

commented 5 papers 6 months ago

Token Erasure as a Footprint of Implicit Vocabulary Items in LLMs

Paper • 2406.20086 • Published Jun 28 • 5

Multi-property Steering of Large Language Models with Dynamic Activation Composition

Paper • 2406.17563 • Published Jun 25 • 4

Confidence Regulation Neurons in Language Models

Paper • 2406.16254 • Published Jun 24 • 10

Model Internals-based Answer Attribution for Trustworthy Retrieval-Augmented Generation

Paper • 2406.13663 • Published Jun 19 • 7

New activity in ICLR2024/ICLR2024-papers 8 months ago

Update 18449

#2 opened 8 months ago by

New activity in ICLR2024/ICLR2024-papers 8 months ago

Update 18449

#12 opened 8 months ago by

New activity in gsarti/gradio_highlightedtextbox 8 months ago

tip + patch to solve typing

#2 opened 8 months ago by

New activity in aliabid94/gradio_modal 9 months ago

Modal defaults to shown when changing tab

#1 opened 11 months ago by

New activity in gsarti/gradio_highlightedtextbox 10 months ago

[BUG] Custom color map doesn't stick

#1 opened 11 months ago by

New activity in social-post-explorers/README 11 months ago

Rate limit issue with imprecise last post time information

#26 opened 11 months ago by

Markdown support

#12 opened 12 months ago by

New activity in gsarti/opus-mt-tc-base-en-ja over 1 year ago

Adding `safetensors` variant of this model

#2 opened over 1 year ago by