dongguanting/RAG-Critic-3B
Text Generation • 3B • Updated • 27 • 4
mradermacher/RAG-Critic-3B-GGUF
3B • Updated • 19
alphawagamzn/qwen2-7b-instruct-amazon-critic-5-examples-r_4-do_0p1-alpha_8
alphawagamzn/qwen2-7b-instruct-amazon-critic-5-examples-r_16-do_0p1-alpha_32-lr_5em5
alphawagamzn/qwen2-7b-instruct-amazon-critic-5-examples-r_16-do_0p1-alpha_32
alphawagamzn/qwen2-7b-instruct-amazon-critic-8-examples-r_16-do_0p1-alpha_32
alphawagamzn/qwen2-7b-instruct-amazon-critic-5-examples-r_16-do_0p1-alpha_32_epochs_200
Updated
Open-Reasoner-Zero/Open-Reasoner-Zero-Critic-0.5B
Reinforcement Learning • 0.5B • Updated • 10
Open-Reasoner-Zero/Open-Reasoner-Zero-Critic-1.5B
Reinforcement Learning • 2B • Updated • 9 • 1
Open-Reasoner-Zero/Open-Reasoner-Zero-Critic-7B
Reinforcement Learning • 7B • Updated • 10 • 1
Open-Reasoner-Zero/Open-Reasoner-Zero-Critic-32B
Reinforcement Learning • 32B • Updated • 8 • 7
RamaKrishna77/Research_paper_critic_generator
Updated
waseemrazakhan/results_critical_over
Updated
jahyungu/Qwen2.5-1.5B-Instruct_Open-Critic-GPT_random
Text Generation • 2B • Updated • 4
jahyungu/Qwen2.5-7B-Instruct_Open-Critic-GPT_random
Text Generation • 8B • Updated • 2
jahyungu/Llama-3.2-1B-Instruct_Open-Critic-GPT_random
Text Generation • 1B • Updated • 3
jahyungu/Llama-3.1-8B-Instruct_Open-Critic-GPT_random
Text Generation • 8B • Updated • 6
samahadhoud/critical_questions_generation_llama_lora_RL_fintuned
samahadhoud/critical_questions_generation_llama_lora_RL_fintuned_3epoch
samahadhoud/critical_questions_generation_llama_lora_RL_fintuned_5epoch
samahadhoud/critical_questions_generation_llama_lora_RL_fintuned_7epoch
samahadhoud/critical_questions_generation_qwen_lora_RL_fintuned_3epoch
secmlr/SWE-BENCH-generation_claude_reasoning_first_2000_plus_critic_qwen_code_14b
Text Generation • 15B • Updated • 6
secmlr/SWE-BENCH-generation_claude_reasoning_first_2000_filter_correct_plus_critic_qwen_code_14b
Text Generation • 15B • Updated • 1
secmlr/SWE-BENCH-generation_claude_reasoning_llm_correct_swe_gym_1500_plus_critic_qwen_code_14b
Text Generation • 15B • Updated • 1
secmlr/SWE-BENCH-generation_claude_reasoning_diff_checker_correct_1500_plus_critic_qwen_code_14b
Text Generation • 15B • Updated • 1
happzy2633/Qwen2.5-7B-Instruct-critic_lean_mix_48k_rl
8B • Updated • 1
8B • Updated • 7 • 1
m-a-p/CriticLeanGPT-Qwen3-8B-RL
8B • Updated • 19 • 4