HanningZhang/Qwen2.5-Math-7B-raft-plusplus_cliphigher050_em-iter3 Text Generation • Updated about 13 hours ago
HanningZhang/Qwen2.5-Math-7B-raft-plusplus_cliphigher050_em-iter2 Text Generation • Updated about 20 hours ago • 4
HanningZhang/Qwen2.5-Math-7B-raft-plusplus_cliphigher050_em-iter1 Text Generation • Updated about 20 hours ago • 4
HanningZhang/Qwen2.5-Math-7B-raft-plusplus_em-sample1n8-sample8-filter1.0-insufficient0.0-a0.001-b2.0-iter12 Text Generation • Updated 1 day ago
HanningZhang/Qwen2.5-Math-7B-raft-plusplus_em-sample1n8-sample8-filter1.0-insufficient0.0-a0.001-b2.0-iter11 Text Generation • Updated 2 days ago
HanningZhang/Qwen2.5-Math-7B-raft-plusplus_em-sample1n8-sample8-filter1.0-insufficient0.0-a0.001-b2.0-iter10 Text Generation • Updated 2 days ago
HanningZhang/Qwen2.5-Math-7B-raft-plusplus_em-sample1n8-sample8-filter1.0-insufficient0.0-a0.001-b2.0-iter9 Text Generation • Updated 2 days ago
HanningZhang/scalebio_reasoning_think_220k_with_system_and_cot Viewer • Updated 2 days ago • 193k • 51
HanningZhang/Qwen-7B-grpo-plusplus-nocliphigher-sample1n8-sample8-filter1.0-insufficient0.0-a0.001-b2.0-iter8 Text Generation • Updated 4 days ago • 82
HanningZhang/Qwen-7B-grpo-plusplus-nocliphigher-sample1n8-sample8-filter1.0-insufficient0.0-a0.001-b2.0-iter7 Text Generation • Updated 4 days ago • 6