Self-play with Execution Feedback: Improving Instruction-following Capabilities of Large Language Models Paper • 2406.13542 • Published 4 days ago • 11
view article Article Introducing Synthetic Data Workshop: Your Gateway to Easy Synthetic Dataset Creation By davanstrien • 3 days ago • 8
Instruction Pre-Training: Language Models are Supervised Multitask Learners Paper • 2406.14491 • Published 3 days ago • 45
Tulu V2.5 Suite Collection A suite of models trained using DPO and PPO across a wide variety (up to 14) of preference datasets. See https://arxiv.org/abs/2406.09279 for more! • 41 items • Updated 10 days ago • 8
Nemotron 4 340B Collection Nemotron-4: open models for Synthetic Data Generation (SDG). Includes Base, Instruct, and Reward models. • 4 items • Updated 9 days ago • 141
Magpie: Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing Paper • 2406.08464 • Published 11 days ago • 43
Stable Diffusion 3 Collection Stable Diffusion 3 and related models for text-to-image and image-to-image • 2 items • Updated 11 days ago • 64
Qwen2 Collection Qwen2 language models, including pretrained and instruction-tuned models of 5 sizes, including 0.5B, 1.5B, 7B, 57B-A14B, and 72B. • 29 items • Updated 17 days ago • 198
Show, Don't Tell: Aligning Language Models with Demonstrated Feedback Paper • 2406.00888 • Published 21 days ago • 28
view article Article ⚗️ 🔥 Building High-Quality Datasets with distilabel and Prometheus 2 By burtenshaw • 20 days ago • 20
view article Article Can we create pedagogically valuable multi-turn synthetic datasets from Cosmopedia? By davanstrien • May 7 • 7
view article Article 🧑⚖️ "Replacing Judges with Juries" using distilabel By alvarobartt • May 3 • 15
Replacing Judges with Juries: Evaluating LLM Generations with a Panel of Diverse Models Paper • 2404.18796 • Published Apr 29 • 67
view article Article ⚗️ 🧑🏼🌾 Let's grow some Domain Specific Datasets together By burtenshaw • Apr 29 • 27
Direct Nash Optimization: Teaching Language Models to Self-Improve with General Preferences Paper • 2404.03715 • Published Apr 4 • 58
view article Article 🦙⚗️ Using Llama3 and distilabel to build fine-tuning datasets By dvilasuero • 19 days ago • 63
PoSE: Efficient Context Window Extension of LLMs via Positional Skip-wise Training Paper • 2309.10400 • Published Sep 19, 2023 • 23
Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone Paper • 2404.14219 • Published Apr 22 • 239
Idefics2 🐶 Collection Idefics2-8B is a foundation vision-language model. In this collection, you will find the models, datasets and demo related to its creation. • 11 items • Updated May 6 • 85
Silkie: Preference Distillation for Large Visual Language Models Paper • 2312.10665 • Published Dec 17, 2023 • 10
Mixture-of-Depths: Dynamically allocating compute in transformer-based language models Paper • 2404.02258 • Published Apr 2 • 102
Sparse Upcycling: Training Mixture-of-Experts from Dense Checkpoints Paper • 2212.05055 • Published Dec 9, 2022 • 5
Qwen1.5 Collection Qwen1.5 is the improved version of Qwen, the large language model series developed by Alibaba Cloud. • 55 items • Updated 18 days ago • 195
boulderspot Collection find places to climb outside from aerial imagery • 4 items • Updated Apr 1 • 3
GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection Paper • 2403.03507 • Published Mar 6 • 179
Medusa: Simple LLM Inference Acceleration Framework with Multiple Decoding Heads Paper • 2401.10774 • Published Jan 19 • 50
Notus 7B v1 Collection Notus 7B v1 models (DPO fine-tune of Zephyr SFT) and datasets used. More information at https://github.com/argilla-io/notus • 11 items • Updated Dec 28, 2023 • 17
Recent models: last 100 repos, sorted by creation date Collection The last 100 repos I have created. Sorted by creation date descending, so the most recently created repos appear at the top. • 121 items • Updated Jan 31 • 455
Handbook v0.1 models and datasets Collection Models and datasets for v0.1 of the alignment handbook • 6 items • Updated Nov 10, 2023 • 23
Zephyr 7B Collection Models, datasets, and demos associated with Zephyr 7B. For code to train the models, see: https://github.com/huggingface/alignment-handbook • 9 items • Updated Apr 12 • 141