Models
Datasets
Spaces
Posts
Docs
Enterprise
Pricing
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:2408.06663

🔍 Daily Picks in Interpretability & Analysis of LMs

Outstanding research in interpretability and evaluation of language models, summarized

about 14 hours ago

Towards scientific discovery with dictionary learning: Extracting biological concepts from microscopy foundation models

Paper • 2412.16247 • Published 25 days ago • 1
Inferring Functionality of Attention Heads from their Parameters

Paper • 2412.11965 • Published 28 days ago • 2
LatentQA: Teaching LLMs to Decode Activations Into Natural Language

Paper • 2412.08686 • Published Dec 11, 2024 • 1
Training Large Language Models to Reason in a Continuous Latent Space

Paper • 2412.06769 • Published Dec 9, 2024 • 72

about 22 hours ago

EVA-CLIP-18B: Scaling CLIP to 18 Billion Parameters

Paper • 2402.04252 • Published Feb 6, 2024 • 25
Vision Superalignment: Weak-to-Strong Generalization for Vision Foundation Models

Paper • 2402.03749 • Published Feb 6, 2024 • 12
ScreenAI: A Vision-Language Model for UI and Infographics Understanding

Paper • 2402.04615 • Published Feb 7, 2024 • 40
EfficientViT-SAM: Accelerated Segment Anything Model Without Performance Loss

Paper • 2402.05008 • Published Feb 7, 2024 • 20

Self-Rewarding Language Models

Paper • 2401.10020 • Published Jan 18, 2024 • 146
Orion-14B: Open-source Multilingual Large Language Models

Paper • 2401.12246 • Published Jan 20, 2024 • 12
MambaByte: Token-free Selective State Space Model

Paper • 2401.13660 • Published Jan 24, 2024 • 53
MM-LLMs: Recent Advances in MultiModal Large Language Models

Paper • 2401.13601 • Published Jan 24, 2024 • 45

Jamba: A Hybrid Transformer-Mamba Language Model

Paper • 2403.19887 • Published Mar 28, 2024 • 104
Aurora-M: The First Open Source Multilingual Language Model Red-teamed according to the U.S. Executive Order

Paper • 2404.00399 • Published Mar 30, 2024 • 41
Mixture-of-Depths: Dynamically allocating compute in transformer-based language models

Paper • 2404.02258 • Published Apr 2, 2024 • 104
Megalodon: Efficient LLM Pretraining and Inference with Unlimited Context Length

Paper • 2404.08801 • Published Apr 12, 2024 • 64

Unlocking the conversion of Web Screenshots into HTML Code with the WebSight Dataset

Paper • 2403.09029 • Published Mar 14, 2024 • 54
LLMLingua-2: Data Distillation for Efficient and Faithful Task-Agnostic Prompt Compression

Paper • 2403.12968 • Published Mar 19, 2024 • 24
RAFT: Adapting Language Model to Domain Specific RAG

Paper • 2403.10131 • Published Mar 15, 2024 • 67
Quiet-STaR: Language Models Can Teach Themselves to Think Before Speaking

Paper • 2403.09629 • Published Mar 14, 2024 • 75

Company

TOS Privacy About Jobs

Website

Models Datasets Spaces Pricing Docs