DataForge-0.5B-GRPO

GRPO checkpoint trained from the predecessor Praneshrajan15/DataForge-0.5B-SFT. It was uploaded only after passing the strict held-out gate, with a strict macro F1 delta of 0.134 over the predecessor.

Evidence

  • Benchmark: DataForge-Bench-light-verified over seeds [0, 1, 2]
  • SFT strict macro F1: 0.0053
  • GRPO strict macro F1: 0.1393
  • Parse success: 1.0
  • Schema-case errors: 0
  • Training stage: candidate (500 steps)
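
The reported delta follows directly from the two strict macro F1 scores listed above. The snippet below is a minimal sketch of that arithmetic. The function names and the `min_delta` parameter are hypothetical: this card does not state the threshold the gate uses, so none is assumed here.

```python
# Scores copied from the Evidence list in this card.
SFT_STRICT_MACRO_F1 = 0.0053
GRPO_STRICT_MACRO_F1 = 0.1393


def f1_delta(candidate: float, predecessor: float) -> float:
    """Return the candidate's F1 gain over the predecessor, rounded to 4 places."""
    return round(candidate - predecessor, 4)


def passes_gate(candidate: float, predecessor: float, min_delta: float) -> bool:
    """Hypothetical gate check: min_delta is an assumption, not a documented value."""
    return f1_delta(candidate, predecessor) >= min_delta


delta = f1_delta(GRPO_STRICT_MACRO_F1, SFT_STRICT_MACRO_F1)
print(f"F1 delta over predecessor: {delta}")
```

The rounding step only guards against floating-point noise in the subtraction. It does not change the reported value.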

Limitations

This is a research artifact for DataForge repair evaluation. It is not production autonomous repair software and must not mutate data without DataForge verification, receipts, and human approval.
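
One way to read the three conditions above is as a guard that must pass before any model-proposed repair touches data. The sketch below is illustrative only: `RepairProposal`, its fields, and `may_apply` are hypothetical names and are not part of DataForge's actual API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RepairProposal:
    """Hypothetical record of a model-proposed repair and its sign-offs."""
    description: str
    verified: bool = False             # assumed: set once verification has passed
    receipt_id: Optional[str] = None   # assumed: identifier of the recorded receipt
    approved_by: Optional[str] = None  # assumed: name of the approving human


def may_apply(proposal: RepairProposal) -> bool:
    """Allow a repair only when verification, a receipt, and approval are all present."""
    return (
        proposal.verified
        and bool(proposal.receipt_id)
        and bool(proposal.approved_by)
    )


unreviewed = RepairProposal("normalize date column")
reviewed = RepairProposal("normalize date column", True, "receipt-001", "reviewer")
print(may_apply(unreviewed), may_apply(reviewed))
```

A proposal missing any one of the three conditions is rejected, which matches the "and" in the limitation statement.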

Model size: 0.5B params (Safetensors, tensor type F16)

Evaluation results

  • Strict macro F1 on DataForge-Bench-light-verified: 0.139 (self-reported)
  • Delta over predecessor on DataForge-Bench-light-verified: 0.134 (self-reported)