HRM-Text-1B-sft-code

Merged code post-training release: sapientinc/HRM-Text-1B with the josephmayo/HRM-Text-1B-sft-code-LoRA adapter merged into its weights.

sapientinc/HRM-Text-1B is a pretrained-only HRM text model. This merged release packages the code post-trained LoRA into the base weights for direct use.

Training Summary

  • Base model: sapientinc/HRM-Text-1B
  • Method: supervised LoRA post-training, then merged into base weights
  • Training rows: 384
  • Max steps: 120
  • LoRA rank: 64
  • Learning rate: 8e-6
  • Final train loss: 0.3276
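
To make "merged into base weights" concrete, here is a minimal sketch of the usual LoRA merge arithmetic, W' = W + (alpha / r) * B A, on a toy matrix. The shapes, the values, and the alpha setting are illustrative assumptions; this release states only the rank (64), not alpha or which layers were adapted.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def merge_lora(weight, lora_a, lora_b, alpha, rank):
    """Return weight + (alpha / rank) * (lora_b @ lora_a)."""
    scale = alpha / rank
    delta = matmul(lora_b, lora_a)
    return [
        [w + scale * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(weight, delta)
    ]


# Toy example (hypothetical values): a 2x2 weight and a rank-1 adapter.
weight = [[1.0, 0.0], [0.0, 1.0]]
lora_a = [[1.0, 2.0]]    # shape (rank, in_features) = (1, 2)
lora_b = [[0.5], [0.0]]  # shape (out_features, rank) = (2, 1)

merged = merge_lora(weight, lora_a, lora_b, alpha=2.0, rank=1)
print(merged)  # [[2.0, 2.0], [0.0, 1.0]]
```

After merging, the adapter matrices are no longer needed at inference time, which is why the merged release loads like an ordinary checkpoint.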

Validation

Local code validation:

  • Base model score: 5/100
  • Merged model score: 24/100
  • Absolute improvement: +19/100
  • Relative improvement: 4.8x the base score (24/5)
  • HumanEval slice: 14/50
  • MBPP slice: 10/50
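
The derived figures in this list follow directly from the raw counts. A short check, using only the numbers reported above (the variable names are illustrative):

```python
# Counts reported in the validation list above.
base_score, merged_score, total = 5, 24, 100
humaneval_passed, mbpp_passed = 14, 10

absolute_gain = merged_score - base_score
relative_gain = merged_score / base_score

# The two 50-problem slices add up to the merged model's overall score.
assert humaneval_passed + mbpp_passed == merged_score

print(f"absolute: +{absolute_gain}/{total}, relative: {relative_gain:.1f}x")
# absolute: +19/100, relative: 4.8x
```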

The merged model score is the sum of the two slice results (14 + 10 = 24) and is the local validation result used for this release.

Use

from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "josephmayo/HRM-Text-1B-sft-code"

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)
model.eval()
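
The snippet above only loads the model. A minimal generation sketch follows, assuming the model's remote code supports the standard `generate` method; the helper name `complete`, the prompt, and the token budget are illustrative choices, not part of this release.

```python
def complete(prompt, max_new_tokens=128, model_id="josephmayo/HRM-Text-1B-sft-code"):
    """Greedy-decode a completion for `prompt` (hypothetical helper)."""
    # Imports live inside the function so that defining it loads nothing heavy.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)
    model.eval()

    # Pass only input_ids, since the custom model's accepted kwargs are unknown.
    input_ids = tokenizer(prompt, return_tensors="pt").input_ids
    with torch.no_grad():
        output_ids = model.generate(
            input_ids, max_new_tokens=max_new_tokens, do_sample=False
        )

    # Keep only the newly generated tokens, dropping the echoed prompt.
    new_tokens = output_ids[0][input_ids.shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)


# Example call (downloads the weights on first use):
# print(complete("def fibonacci(n):\n"))
```

Greedy decoding (`do_sample=False`) is used here so repeated runs give the same output; whether the local validation used greedy decoding is not stated.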

Notes

  • This is the merged release of the LoRA.
  • Adapter repo: josephmayo/HRM-Text-1B-sft-code-LoRA
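
For anyone who prefers to start from the adapter repo instead of the merged weights, the sketch below shows one possible route with the PEFT library. It assumes the adapter is stored in PEFT format and that PEFT can wrap the HRM remote-code modules; neither is confirmed here, and `load_with_adapter` is a hypothetical helper name.

```python
def load_with_adapter(
    base_id="sapientinc/HRM-Text-1B",
    adapter_id="josephmayo/HRM-Text-1B-sft-code-LoRA",
    merge=True,
):
    """Load the base model, attach the LoRA adapter, and optionally merge it."""
    # Imports live inside the function so that defining it loads nothing heavy.
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(base_id, trust_remote_code=True)
    base = AutoModelForCausalLM.from_pretrained(base_id, trust_remote_code=True)

    # Assumption: the adapter repo holds a PEFT-compatible LoRA checkpoint.
    model = PeftModel.from_pretrained(base, adapter_id)
    if merge:
        # Fold the adapter into the base weights and drop the PEFT wrapper.
        model = model.merge_and_unload()
    model.eval()
    return tokenizer, model
```

With `merge=True` the result should be functionally close to this merged release; with `merge=False` the adapter stays separate and can be detached or swapped later.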

Weights

  • Format: Safetensors
  • Model size: 1B params
  • Tensor type: F16