- JoMA: Demystifying Multilayer Transformers via JOint Dynamics of MLP and Attention
  Paper • 2310.00535 • Published • 2
- Interpretability in the Wild: a Circuit for Indirect Object Identification in GPT-2 small
  Paper • 2211.00593 • Published • 2
- Rethinking Interpretability in the Era of Large Language Models
  Paper • 2402.01761 • Published • 19
- Does Circuit Analysis Interpretability Scale? Evidence from Multiple Choice Capabilities in Chinchilla
  Paper • 2307.09458 • Published • 9

Collections including paper arxiv:2309.08600
- Sparse Autoencoders Find Highly Interpretable Features in Language Models
  Paper • 2309.08600 • Published • 11
- In-context Autoencoder for Context Compression in a Large Language Model
  Paper • 2307.06945 • Published • 25
- Self-slimmed Vision Transformer
  Paper • 2111.12624 • Published • 1
- MEMORY-VQ: Compression for Tractable Internet-Scale Memory
  Paper • 2308.14903 • Published • 1

- AutoCLIP: Auto-tuning Zero-Shot Classifiers for Vision-Language Models
  Paper • 2309.16414 • Published • 19
- Dynamic ASR Pathways: An Adaptive Masking Approach Towards Efficient Pruning of A Multilingual ASR Model
  Paper • 2309.13018 • Published • 9
- Robust Speech Recognition via Large-Scale Weak Supervision
  Paper • 2212.04356 • Published • 12
- Language models in molecular discovery
  Paper • 2309.16235 • Published • 10

- Language Modeling Is Compression
  Paper • 2309.10668 • Published • 80
- Baichuan 2: Open Large-scale Language Models
  Paper • 2309.10305 • Published • 16
- Chain-of-Verification Reduces Hallucination in Large Language Models
  Paper • 2309.11495 • Published • 37
- LMDX: Language Model-based Document Information Extraction and Localization
  Paper • 2309.10952 • Published • 61

- Sparse Autoencoders Find Highly Interpretable Features in Language Models
  Paper • 2309.08600 • Published • 11
- Language Modeling Is Compression
  Paper • 2309.10668 • Published • 80
- Sorted LLaMA: Unlocking the Potential of Intermediate Layers of Large Language Models for Dynamic Inference Using Sorted Fine-Tuning (SoFT)
  Paper • 2309.08968 • Published • 22
- VideoDirectorGPT: Consistent Multi-scene Video Generation via LLM-Guided Planning
  Paper • 2309.15091 • Published • 31