anirudhb11/critic_600_ppo-run-math-training-prompt-len-800-response-len-4096-seed-43-subset-1000-c266664195
Text Classification
• 2B • Updated • 2
anirudhb11/critic_400_ppo-run-math-training-prompt-len-800-response-len-4096-seed-43-subset-1000-03ef008316
Text Classification
• 2B • Updated • 2
anirudhb11/critic_200_ppo-run-math-training-prompt-len-800-response-len-4096-seed-43-subset-1000-cd937b1bde
Text Classification
• 2B • Updated • 2
anirudhb11/critic_16_ppo-run-math-training-prompt-len-800-response-len-4096-seed-43-subset-500-b-cd872b7915
Text Classification
• 2B • Updated • 2
anirudhb11/critic_450_ppo-run-math-training-prompt-len-800-response-len-4096-seed-43-subset-500-e5b9d833e1
Text Classification
• 2B • Updated • 2
anirudhb11/critic_250_ppo-run-math-training-prompt-len-800-response-len-4096-seed-43-subset-500-fde190abfd
Text Classification
• 2B • Updated • 2
anirudhb11/critic_50_ppo-run-math-training-prompt-len-800-response-len-4096-seed-43-subset-500-b-4f7c1591b6
Text Classification
• 2B • Updated • 2
anirudhb11/critic_16_ppo-run-math-training-prompt-len-800-response-len-4096-seed-43-subset-500-b-6eff0fb21a
Text Classification
• 2B • Updated • 2
anirudhb11/critic_450_ppo-run-math-training-prompt-len-800-response-len-4096-seed-43-subset-500-26d437d2d8
Text Classification
• 2B • Updated • 2
anirudhb11/critic_250_ppo-run-math-training-prompt-len-800-response-len-4096-seed-43-subset-500-0b10fb3ecb
Text Classification
• 2B • Updated • 2
anirudhb11/critic_50_ppo-run-math-training-prompt-len-800-response-len-4096-seed-43-subset-500-b-bcf55bac32
Text Classification
• 2B • Updated • 2
anirudhb11/critic_32_ppo-run-math-training-prompt-len-800-response-len-4096-seed-43-subset-500-i-b048e93537
Text Classification
• 2B • Updated • 3
anirudhb11/critic_16_ppo-run-math-training-prompt-len-800-response-len-4096-seed-43-subset-500-i-9b8c5c7749
Text Classification
• 2B • Updated • 2
anirudhb11/critic_1200_ppo-run-math-training-prompt-len-800-response-len-4096-656b99604e
Text Classification
• 2B • Updated • 3
anirudhb11/critic_2600_ppo-run-math-training-prompt-len-800-response-len-4096-bfdc5c41c3
Text Classification
• 2B • Updated • 3
anirudhb11/critic_800_ppo-run-math-training-prompt-len-800-response-len-4096-seed-43-subset-1000-eac3414a5f
Text Classification
• 2B • Updated • 3
anirudhb11/critic_600_ppo-run-math-training-prompt-len-800-response-len-4096-seed-43-subset-1000-3dbaf2f0cf
Text Classification
• 2B • Updated • 3
anirudhb11/critic_400_ppo-run-math-training-prompt-len-800-response-len-4096-seed-43-subset-1000-6314c2edc2
Text Classification
• 2B • Updated • 3
anirudhb11/critic_200_ppo-run-math-training-prompt-len-800-response-len-4096-seed-43-subset-1000-b580379099
Text Classification
• 2B • Updated • 2
anirudhb11/critic_16_ppo-run-math-training-prompt-len-800-response-len-4096-seed-43-subset-500-b-7ac0757c94
Text Classification
• 2B • Updated • 3
anirudhb11/critic_450_ppo-run-math-training-prompt-len-800-response-len-4096-seed-43-subset-500-629dffdb6a
Text Classification
• 2B • Updated • 2
anirudhb11/critic_250_ppo-run-math-training-prompt-len-800-response-len-4096-seed-43-subset-500-c4b41565c8
Text Classification
• 2B • Updated • 2
anirudhb11/critic_50_ppo-run-math-training-prompt-len-800-response-len-4096-seed-43-subset-500-b-4e11a85372
Text Classification
• 2B • Updated • 3
anirudhb11/critic_450_ppo-run-math-training-prompt-len-800-response-len-4096-seed-43-subset-500-b68c4eafde
Text Classification
• 2B • Updated • 3
anirudhb11/critic_250_ppo-run-math-training-prompt-len-800-response-len-4096-seed-43-subset-500-f2bcc8637e
Text Classification
• 2B • Updated • 3
anirudhb11/critic_50_ppo-run-math-training-prompt-len-800-response-len-4096-seed-43-subset-500-96479447bb
Text Classification
• 2B • Updated • 3
anirudhb11/critic_16_ppo-run-math-training-prompt-len-800-response-len-4096-seed-43-subset-500-a-9a44e3cd58
Text Classification
• 2B • Updated • 3
anirudhb11/critic_800_ppo-run-math-training-prompt-len-800-response-len-4096-bce-loss-temperatur-e2beb64b4d
Text Classification
• 2B • Updated • 3
anirudhb11/critic_600_ppo-run-math-training-prompt-len-800-response-len-4096-bce-loss-temperatur-6ae14d0a13
Text Classification
• 2B • Updated • 3
anirudhb11/critic_400_ppo-run-math-training-prompt-len-800-response-len-4096-bce-loss-temperatur-df26720fa9
Text Classification
• 2B • Updated • 3