Models
Datasets
Spaces
Posts
Docs
Pricing
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:2404.02258

Interesting Papers

Weak-to-Strong Generalization: Eliciting Strong Capabilities With Weak Supervision

Paper • 2312.09390 • Published Dec 14, 2023 • 32
OneLLM: One Framework to Align All Modalities with Language

Paper • 2312.03700 • Published Dec 6, 2023 • 20
Generative Multimodal Models are In-Context Learners

Paper • 2312.13286 • Published Dec 20, 2023 • 34
The LLM Surgeon

Paper • 2312.17244 • Published Dec 28, 2023 • 9

Research Papers

OpenMathInstruct-1: A 1.8 Million Math Instruction Tuning Dataset

Paper • 2402.10176 • Published Feb 15 • 34
Beyond Language Models: Byte Models are Digital World Simulators

Paper • 2402.19155 • Published Feb 29 • 49
Matryoshka Representation Learning

Paper • 2205.13147 • Published May 26, 2022 • 9
PixArt-Σ: Weak-to-Strong Training of Diffusion Transformer for 4K Text-to-Image Generation

Paper • 2403.04692 • Published Mar 7 • 40

Adapting Large Language Models via Reading Comprehension

Paper • 2309.09530 • Published Sep 18, 2023 • 77
An Empirical Study of Scaling Instruct-Tuned Large Multimodal Models

Paper • 2309.09958 • Published Sep 18, 2023 • 18
Noise-Aware Training of Layout-Aware Language Models

Paper • 2404.00488 • Published Mar 30 • 7
Streaming Dense Video Captioning

Paper • 2404.01297 • Published Apr 1 • 11

Large Language Models as Optimizers

Paper • 2309.03409 • Published Sep 7, 2023 • 75
Mixture-of-Depths: Dynamically allocating compute in transformer-based language models

Paper • 2404.02258 • Published Apr 2 • 104
OpenELM: An Efficient Language Model Family with Open-source Training and Inference Framework

Paper • 2404.14619 • Published Apr 22 • 124
Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone

Paper • 2404.14219 • Published Apr 22 • 251

Previous
1
...
3
4
5
Next

Company

© Hugging Face

TOS Privacy About Jobs

Website

Models Datasets Spaces Pricing Docs