OmniAlign-V: Towards Enhanced Alignment of MLLMs with Human Preference Paper • 2502.18411 • Published 10 days ago • 69
Exploring the Limit of Outcome Reward for Learning Mathematical Reasoning Paper • 2502.06781 • Published 25 days ago • 60
Condor: Enhance LLM Alignment with Knowledge-Driven Data Synthesis and Refinement Paper • 2501.12273 • Published Jan 21 • 14
deepseek-ai/DeepSeek-R1-Distill-Llama-70B Text Generation • Updated 12 days ago • 435k • 617
deepseek-ai/DeepSeek-R1-Distill-Qwen-14B Text Generation • Updated 12 days ago • 636k • 455
deepseek-ai/DeepSeek-R1-Distill-Llama-8B Text Generation • Updated 12 days ago • 1.51M • 630
deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B Text Generation • Updated 12 days ago • 1.47M • 998
deepseek-ai/DeepSeek-R1-Distill-Qwen-32B Text Generation • Updated 12 days ago • 1.47M • 1.23k