-
EVA-CLIP-18B: Scaling CLIP to 18 Billion Parameters
Paper • 2402.04252 • Published • 25 -
Vision Superalignment: Weak-to-Strong Generalization for Vision Foundation Models
Paper • 2402.03749 • Published • 12 -
ScreenAI: A Vision-Language Model for UI and Infographics Understanding
Paper • 2402.04615 • Published • 36 -
EfficientViT-SAM: Accelerated Segment Anything Model Without Performance Loss
Paper • 2402.05008 • Published • 19
Collections
Discover the best community collections!
Collections including paper arxiv:2408.06663
-
Jamba: A Hybrid Transformer-Mamba Language Model
Paper • 2403.19887 • Published • 103 -
Aurora-M: The First Open Source Multilingual Language Model Red-teamed according to the U.S. Executive Order
Paper • 2404.00399 • Published • 40 -
Mixture-of-Depths: Dynamically allocating compute in transformer-based language models
Paper • 2404.02258 • Published • 103 -
Megalodon: Efficient LLM Pretraining and Inference with Unlimited Context Length
Paper • 2404.08801 • Published • 62
-
Unlocking the conversion of Web Screenshots into HTML Code with the WebSight Dataset
Paper • 2403.09029 • Published • 54 -
LLMLingua-2: Data Distillation for Efficient and Faithful Task-Agnostic Prompt Compression
Paper • 2403.12968 • Published • 24 -
RAFT: Adapting Language Model to Domain Specific RAG
Paper • 2403.10131 • Published • 66 -
Quiet-STaR: Language Models Can Teach Themselves to Think Before Speaking
Paper • 2403.09629 • Published • 72
-
A is for Absorption: Studying Feature Splitting and Absorption in Sparse Autoencoders
Paper • 2409.14507 • Published • 1 -
ContextCite: Attributing Model Generation to Context
Paper • 2409.00729 • Published • 13 -
Residual Stream Analysis with Multi-Layer SAEs
Paper • 2409.04185 • Published -
Amuro & Char: Analyzing the Relationship between Pre-Training and Fine-Tuning of Large Language Models
Paper • 2408.06663 • Published • 15
-
Self-Rewarding Language Models
Paper • 2401.10020 • Published • 141 -
Orion-14B: Open-source Multilingual Large Language Models
Paper • 2401.12246 • Published • 10 -
MambaByte: Token-free Selective State Space Model
Paper • 2401.13660 • Published • 49 -
MM-LLMs: Recent Advances in MultiModal Large Language Models
Paper • 2401.13601 • Published • 44