🧠Reasoning datasets Collection Datasets with reasoning traces for math and code released by the community • 20 items • Updated 3 days ago • 120
Dolphin 3.0 Collection Dolphin 3.0 is the next generation of the Dolphin series of instruct-tuned models. Designed to be the ultimate general purpose local model. • 9 items • Updated Feb 7 • 121
smol llama Collection 🚧"raw" pretrained smol_llama checkpoints - WIP 🚧 • 4 items • Updated Apr 29, 2024 • 6
Foundation Text-Generation Models Below 360M Parameters Collection Great candidates for fine-tuning targeting Wllama and Transformers.js for mobile devices, ordered by number of parameters. • 35 items • Updated 20 days ago • 30
RLHF Workflow: From Reward Modeling to Online RLHF Paper • 2405.07863 • Published May 13, 2024 • 70