Qwen2 Collection Qwen2 language models, including pretrained and instruction-tuned models of 5 sizes, including 0.5B, 1.5B, 7B, 57B-A14B, and 72B. • 29 items • Updated 27 days ago • 226
📀 Dataset comparison models Collection 1.8B models trained on 350BT to compare different pretraining datasets • 8 items • Updated 21 days ago • 26
Why do small language models underperform? Studying Language Model Saturation via the Softmax Bottleneck Paper • 2404.07647 • Published Apr 11 • 4
Common Corpus Collection The largest public domain dataset for training LLMs. • 27 items • Updated 16 days ago • 107
TeenyTinyLlama: open-source tiny language models trained in Brazilian Portuguese Paper • 2401.16640 • Published Jan 30 • 5
Model Merging Collection Model Merging is a very popular technique nowadays in LLM. Here is a chronological list of papers on the space that will help you get started with it! • 30 items • Updated 20 days ago • 192
Apple MLX-compatible 7B LLMs on the 🤗 Hub Collection This collection contains the model weights for 7B LLMs for Apple's MLX framework. Find more information at https://github.com/ml-explore/mlx • 8 items • Updated May 7 • 9