mradermacher/CriticLeanGPT-Qwen3-8B-RL-GGUF
8B • Updated • 36
mradermacher/CriticLeanGPT-Qwen3-8B-RL-i1-GGUF
8B • Updated • 195
m-a-p/CriticLeanGPT-Qwen3-14B-RL
15B • Updated • 2
m-a-p/CriticLeanGPT-Qwen3-32B-RL
33B • Updated • 24
mradermacher/CriticLeanGPT-Qwen3-14B-RL-GGUF
15B • Updated • 15
mradermacher/CriticLeanGPT-Qwen3-14B-RL-i1-GGUF
15B • Updated • 187
m-a-p/CriticLeanGPT-Qwen2.5-14B-Instruct-SFT-RL
15B • Updated • 5 • 1
m-a-p/CriticLeanGPT-Qwen2.5-32B-Instruct-SFT-RL
33B • Updated • 4
m-a-p/CriticLeanGPT-Qwen2.5-7B-Instruct-SFT-RL
8B • Updated • 2 • 1
mradermacher/CriticLeanGPT-Qwen2.5-7B-Instruct-SFT-RL-GGUF
8B • Updated • 35 • 1
mradermacher/CriticLeanGPT-Qwen2.5-14B-Instruct-SFT-RL-GGUF
15B • Updated • 81 • 1
mradermacher/CriticLeanGPT-Qwen3-32B-RL-GGUF
33B • Updated • 17
mradermacher/CriticLeanGPT-Qwen2.5-32B-Instruct-SFT-RL-GGUF
33B • Updated • 84
m-a-p/CriticLeanGPT-Qwen2.5-32B-RL
33B • Updated • 1
m-a-p/CriticLeanGPT-Qwen2.5-14B-RL
15B • Updated • 2 • 1
m-a-p/CriticLeanGPT-Qwen2.5-7B-RL
15B • Updated • 4 • 1
mradermacher/CriticLeanGPT-Qwen2.5-14B-Instruct-SFT-RL-i1-GGUF
15B • Updated • 117 • 1
mradermacher/CriticLeanGPT-Qwen2.5-7B-Instruct-SFT-RL-i1-GGUF
8B • Updated • 51 • 1
mradermacher/CriticLeanGPT-Qwen3-32B-RL-i1-GGUF
33B • Updated • 81
mradermacher/CriticLeanGPT-Qwen2.5-7B-RL-GGUF
15B • Updated • 25 • 1
mradermacher/CriticLeanGPT-Qwen2.5-14B-RL-GGUF
15B • Updated • 2 • 1
mradermacher/CriticLeanGPT-Qwen2.5-7B-RL-i1-GGUF
15B • Updated • 123
mradermacher/CriticLeanGPT-Qwen2.5-32B-Instruct-SFT-RL-i1-GGUF
33B • Updated • 167
mradermacher/CriticLeanGPT-Qwen2.5-14B-RL-i1-GGUF
15B • Updated • 40 • 1
mradermacher/CriticLeanGPT-Qwen2.5-32B-RL-GGUF
33B • Updated • 4
mradermacher/CriticLeanGPT-Qwen2.5-32B-RL-i1-GGUF
33B • Updated • 114
lmms-lab/LLaVA-Critic-R1-7B
8B • Updated • 28
lmms-lab/LLaVA-Critic-R1-7B-Plus-Qwen
8B • Updated • 19 • 5
mradermacher/Qwen4b-DS-Critic-Ref-GT-SFT-Max-GGUF
4B • Updated • 29
wzn12/critic_warmup_smolvla_ring
Reinforcement Learning • Updated • 7