I Have Covered All the Bases Here: Interpreting Reasoning Features in Large Language Models via Sparse Autoencoders Paper • 2503.18878 • Published 23 days ago • 114
I Have Covered All the Bases Here: Interpreting Reasoning Features in Large Language Models via Sparse Autoencoders Paper • 2503.18878 • Published 23 days ago • 114
Feature-Level Insights into Artificial Text Detection with Sparse Autoencoders Paper • 2503.03601 • Published Mar 5 • 228
Feature-Level Insights into Artificial Text Detection with Sparse Autoencoders Paper • 2503.03601 • Published Mar 5 • 228 • 2
LLM-Microscope: Uncovering the Hidden Role of Punctuation in Context Memory of Transformers Paper • 2502.15007 • Published Feb 20 • 172
LLM-Microscope: Uncovering the Hidden Role of Punctuation in Context Memory of Transformers Paper • 2502.15007 • Published Feb 20 • 172 • 3
The Devil is in the Details: StyleFeatureEditor for Detail-Rich StyleGAN Inversion and High Quality Image Editing Paper • 2406.10601 • Published Jun 15, 2024 • 70
🔍 Interpretability & Analysis of LMs Collection Outstanding research in LM interpretability and evaluation, summarized • 107 items • Updated 4 days ago • 98
Vikhr: The Family of Open-Source Instruction-Tuned Large Language Models for Russian Paper • 2405.13929 • Published May 22, 2024 • 54
The Shape of Learning: Anisotropy and Intrinsic Dimensions in Transformer-Based Models Paper • 2311.05928 • Published Nov 10, 2023 • 1
MEKER: Memory Efficient Knowledge Embedding Representation for Link Prediction and Question Answering Paper • 2204.10629 • Published Apr 22, 2022
Kandinsky: an Improved Text-to-Image Synthesis with Image Prior and Latent Diffusion Paper • 2310.03502 • Published Oct 5, 2023 • 79