dumbequation/Qwen2.5-7B-GRPO-1M-Context-Medical-Reasoning-f16 Text Generation • Updated 7 days ago • 22 • 1
dumbequation/Qwen2.5-7B-GRPO-1M-Context-Medical-Reasoning-f16-v2 Text Generation • Updated 7 days ago • 25 • 1
Medical LLMs Collection My experiments to push AI in Medicine, not to replace doctors but to empower them • 4 items • Updated 10 days ago
Reasoning Work Collection Models I've trained to think like DeepSeek R1 using online learning with Group Relative Policy Optimization (GRPO), introduced in the DeepSeekMath paper • 6 items • Updated 10 days ago