saurabh5/olmo-3-preference-mix-deltas_reasoning-yolo_even_split-DECON-no-chinese • Updated 4 days ago • 526k • 44
saurabh5/rlvr-prompts_responses-mixin_it_up-v2-filtered-no-chinese • Updated 5 days ago • 131k • 94
saurabh5/rlvr_mixin_it_up_prompts-qwen25-r1-distill-32b-1_5B-thoughts-x16-filtered-no-chinese • Updated 6 days ago • 97.6k • 152
saurabh5/rlvr_mixin_it_up_prompts-qwen25-r1-distill-32b-1_5B-thoughts-x16 • Updated 6 days ago • 95k • 142
saurabh5/rlvr_mixin_it_up_prompts-qwen3-32b-06B-thoughts-x8-filtered-no-chinese • Updated 9 days ago • 87k • 176
saurabh5/rlvr_mixin_it_up_prompts-qwen3-32b-06B-thoughts-x8 • Updated 9 days ago • 85.9k • 378
saurabh5/rlvr_mixin_it_up_prompts-qwen3-32b-06B-thoughts-x8-filtered • Updated 11 days ago • 97.5k • 130