math-lora

QLoRA adapter for math, fine-tuned from openbmb/MiniCPM5-1B on a mix of meta-math/MetaMathQA and tatsu-lab/alpaca (format: mix).

Trained, evaluated, and gated on Modal via research/modal/ (app slm-finetune-benchmark).

Benchmark gate

skill eval profile: math
gate: PASSED

Skill checks

check	value	result
gsm8k >= 0.05	0.4000	pass
gsm8k improve >= 0.02	0.0700	pass
arc_challenge regress <= 0.03	-0.0500	pass
hellaswag regress <= 0.03	0.0000	pass
piqa regress <= 0.03	0.0200	pass

general eval profile: compare_study

General checks

check	value	result
arc_easy regress <= 0.03	-0.0300	pass
arc_challenge regress <= 0.03	-0.0400	pass
hellaswag regress <= 0.03	0.0100	pass
piqa regress <= 0.03	0.0100	pass
boolq regress <= 0.03	-0.0300	pass
gsm8k regress <= 0.03	-0.0700	pass

lm-eval results

task	metric	baseline	candidate	delta
arc_challenge	acc,none	0.3200	0.3700	+0.0500
gsm8k	exact_match,strict-match	0.3300	0.4000	+0.0700
hellaswag	acc,none	0.4300	0.4300	+0.0000
piqa	acc,none	0.7200	0.7000	-0.0200
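
The skill check values can be reproduced from the lm-eval results table. The following is a minimal sketch, assuming that "regress" means baseline minus candidate and "improve" means candidate minus baseline. Both definitions are inferred from the reported numbers, not taken from the gating code in research/modal/, and the names RESULTS, regress, and improve are illustrative only.

```python
# (baseline, candidate) scores copied from the lm-eval results table.
RESULTS = {
    "arc_challenge": (0.32, 0.37),
    "gsm8k": (0.33, 0.40),
    "hellaswag": (0.43, 0.43),
    "piqa": (0.72, 0.70),
}


def regress(task):
    """Assumed definition: baseline - candidate (negative means the candidate improved)."""
    baseline, candidate = RESULTS[task]
    return round(baseline - candidate, 4)


def improve(task):
    """Assumed definition: candidate - baseline."""
    baseline, candidate = RESULTS[task]
    return round(candidate - baseline, 4)


# Thresholds are the ones listed in the skill checks table.
gsm8k_candidate = RESULTS["gsm8k"][1]
checks = [
    ("gsm8k >= 0.05", gsm8k_candidate, gsm8k_candidate >= 0.05),
    ("gsm8k improve >= 0.02", improve("gsm8k"), improve("gsm8k") >= 0.02),
]
for task in ("arc_challenge", "hellaswag", "piqa"):
    checks.append((f"{task} regress <= 0.03", regress(task), regress(task) <= 0.03))

# The gate is assumed to pass only if every individual check passes.
gate_passed = all(passed for _, _, passed in checks)

for label, value, passed in checks:
    print(f"{label}\t{value:.4f}\t{'pass' if passed else 'fail'}")
print("gate:", "PASSED" if gate_passed else "FAILED")
```

Under these assumptions a negative regress value, such as the one for arc_challenge, indicates that the candidate scored higher than the baseline.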

Training

dataset: /repo/research/data/education-lesson-chat.jsonl
mode: qlora
samples: 3528 train, 72 eval
final train loss: 0.340698
eval loss: 0.494981

Load with PEFT

from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base = "openbmb/MiniCPM5-1B"                    # base model the adapter was trained on
adapter = "MSGEncrypted/minicpm5-1b-math-lora"  # this adapter

# trust_remote_code=True allows the base repo's custom model code to run.
tokenizer = AutoTokenizer.from_pretrained(base, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    base, torch_dtype="auto", device_map="auto", trust_remote_code=True
)
# Attach the LoRA weights on top of the loaded base model.
model = PeftModel.from_pretrained(model, adapter)
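
Once the adapter is loaded, a generation call might look like the sketch below. It assumes a plain-text prompt and greedy decoding. The prompt format used during training is not stated here, so treat both as assumptions. The helper name generate_answer and its default max_new_tokens are hypothetical.

```python
def generate_answer(model, tokenizer, question, max_new_tokens=256):
    """Greedy-decode a completion for `question` and return only the new text.

    Assumes a plain-text prompt; the adapter's training prompt format may differ.
    """
    # Tokenize and move the tensors to the device the model lives on.
    inputs = tokenizer(question, return_tensors="pt").to(model.device)
    # do_sample=False selects greedy decoding for reproducible output.
    output_ids = model.generate(
        **inputs, max_new_tokens=max_new_tokens, do_sample=False
    )
    # Drop the prompt tokens so only the generated continuation is decoded.
    prompt_length = inputs["input_ids"].shape[1]
    new_tokens = output_ids[0][prompt_length:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

With the model and tokenizer loaded as above, calling generate_answer(model, tokenizer, "What is 12 * 7?") returns the model's continuation as a string.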
