HRM-Text-1B SFT QLoRA Adapters (v6)

QLoRA fine-tuned adapters for Aryagm/HRM-Text-1B-MLX-4bit, a 1B-parameter hierarchical reasoning model with a recurrent architecture (H=2, L=3, for 8 passes per token).
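
One reading of the pass count that matches the stated total of 8 is sketched below. It assumes each of the H high-level cycles runs the low-level module L times and then the high-level module once. This schedule is an assumption for illustration, and `passes_per_token` is a hypothetical helper, not part of the model code.

```python
def passes_per_token(h_cycles: int, l_cycles: int) -> int:
    """Count module passes under an assumed schedule: each high-level cycle
    runs the low-level module l_cycles times, then the high-level module once."""
    passes = 0
    for _ in range(h_cycles):
        passes += l_cycles  # low-level module updates (assumed)
        passes += 1         # one high-level module update (assumed)
    return passes


print(passes_per_token(2, 3))  # 2 * (3 + 1) = 8
```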

Trained entirely on an 8GB M2 Mac Mini. Part of the Sid Local LLM Benchmark v3.

Results

Metric	Base Model	Fine-Tuned (v6)	Delta
Overall Weighted Score	58.3%	61.7%	+3.4pp
AGENT (tool calling)	10%	60%	+50pp
CODE	70%	60%	-10pp
HALL (hallucination resistance)	62%	75%	+13pp
INST (instruction following)	40%	60%	+20pp
CTX (context reasoning)	75%	75%	0pp
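
The Delta column is a difference in percentage points (fine-tuned minus base), not a relative change. A small sketch that recomputes it from the scores in the table; the dictionary layout and the `delta_pp` helper are illustrative only.

```python
# Scores copied from the results table: (base %, fine-tuned %)
scores = {
    "Overall": (58.3, 61.7),
    "AGENT": (10, 60),
    "CODE": (70, 60),
    "HALL": (62, 75),
    "INST": (40, 60),
    "CTX": (75, 75),
}


def delta_pp(base, tuned):
    """Difference in percentage points, rounded to one decimal place."""
    return round(tuned - base, 1)


deltas = {name: delta_pp(base, tuned) for name, (base, tuned) in scores.items()}
for name, d in deltas.items():
    print(f"{name}: {d:+g}pp")
```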

Files

adapters.npz — final v6 QLoRA weights (~22MB)
best_adapters.npz — best-validation checkpoint (identical to final)
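
To check locally whether the two files really are the same, a byte-level hash comparison is one option. This is a sketch using only the standard library; `sha256_of` and `same_file_contents` are hypothetical helper names. Matching digests mean the files are byte-identical, but differing digests would not by themselves prove the weights differ, since archive metadata can vary between saves.

```python
import hashlib
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_file_contents(a, b):
    """True if both files have identical bytes."""
    return sha256_of(a) == sha256_of(b)


# Only runs the comparison if both files are present in the working directory.
if Path("adapters.npz").exists() and Path("best_adapters.npz").exists():
    print(same_file_contents("adapters.npz", "best_adapters.npz"))
```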

Training Details

Parameter	Value
Base model	Aryagm/HRM-Text-1B-MLX-4bit (4-bit MXFP4)
Method	QLoRA (rank=16, alpha=32)
Target layers	Attention projections only (gqkv_proj, o_proj)
Training samples	2,000
Iterations	2,000
Batch size	1 (with gradient accumulation)
Learning rate	2e-5
Optimizer	AdamW
Loss	Masked response loss (answer tokens only)
Hardware	Apple M2 Mac Mini, 8GB unified memory
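
The "masked response loss" row means prompt tokens do not contribute to the loss; only answer tokens do. A minimal pure-Python sketch of that idea follows. It is not the training script's implementation, and the function name, inputs, and example values are hypothetical.

```python
import math


def masked_response_loss(token_log_probs, response_mask):
    """Mean negative log-likelihood over response tokens only.

    token_log_probs: log-probability the model assigned to each target token.
    response_mask:   1 for answer tokens, 0 for prompt tokens (ignored).
    """
    total = sum(-lp * m for lp, m in zip(token_log_probs, response_mask))
    count = sum(response_mask)
    return total / max(count, 1)  # avoid division by zero on empty masks


# Hypothetical 5-token sequence: 3 prompt tokens followed by 2 answer tokens.
log_probs = [math.log(0.9), math.log(0.1), math.log(0.5),
             math.log(0.5), math.log(0.25)]
mask = [0, 0, 0, 1, 1]
loss = masked_response_loss(log_probs, mask)
print(round(loss, 4))
```

In this example the poorly predicted prompt token (probability 0.1) has no effect on the result, because its mask entry is 0.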

Dataset Composition

Source	%	Count
glaiveai/glaive-function-calling-v2 (AGENT)	20%	400
iamtarun/code_instructions_120k_alpaca (CODE)	30%	600
yahma/alpaca-cleaned (INST)	25%	500
openai/gsm8k (MATH)	15%	300
HuggingFaceTB/cosmopedia-100k (REPLAY)	10%	200
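
The counts follow directly from the percentages applied to the 2,000 training samples. A short sketch that reproduces them; the `mix` dictionary is just a restatement of the table above.

```python
TOTAL_SAMPLES = 2000

# Percentage of the training mix per source, as listed in the table.
mix = {
    "glaiveai/glaive-function-calling-v2": 20,
    "iamtarun/code_instructions_120k_alpaca": 30,
    "yahma/alpaca-cleaned": 25,
    "openai/gsm8k": 15,
    "HuggingFaceTB/cosmopedia-100k": 10,
}

# The mix should cover the whole training set.
assert sum(mix.values()) == 100

counts = {name: TOTAL_SAMPLES * pct // 100 for name, pct in mix.items()}
for name, n in counts.items():
    print(f"{name}: {n}")
```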

Usage

import mlx.core as mx
from mlx_hrm_text.runner import HRMTextGenerator
from mlx_hrm_text.model import HrmTextForCausalLM, set_metal_swiglu
from pathlib import Path

set_metal_swiglu(True)

# Load base model
gen = HRMTextGenerator(
    model_dir="Aryagm/HRM-Text-1B-MLX-4bit",
    temperature=0.3,
)

# Freeze and apply LoRA
gen.model.freeze()

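# Patch attention projections: wrap a frozen linear layer with a low-rank (LoRA) update
from mlx.nn import Module, QuantizedLinear

class LoRALinear(Module):
    def __init__(self, linear, r=16, alpha=32):
        super().__init__()
        self.linear = linear
        self.linear.freeze()
        self.r = r
        self.scale = alpha / r
        out_f, in_f = linear.weight.shape
        # Quantized weights are packed into 32-bit words, so recover the true
        # input width (assumes the base layers are mlx.nn.QuantizedLinear).
        if isinstance(linear, QuantizedLinear):
            in_f = in_f * 32 // linear.bits
        self.lora_a = mx.random.normal((in_f, r)) / r
        self.lora_b = mx.zeros((r, out_f))

    def __call__(self, x):
        # Base output plus the scaled low-rank correction, computed in x's dtype
        dtype = x.dtype
        delta = x @ self.lora_a.astype(dtype) @ self.lora_b.astype(dtype)
        return self.linear(x) + delta * self.scale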

def apply_lora(module):
    for block in module.layers:
        block.attn.gqkv_proj = LoRALinear(block.attn.gqkv_proj)
        block.attn.o_proj = LoRALinear(block.attn.o_proj)

apply_lora(gen.model.model.H_module)
apply_lora(gen.model.model.L_module)

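# Load adapters. mx.load returns a flat dict of arrays; these still have to be
# copied into the LoRALinear modules before they affect generation (lora_b is
# zero-initialized, so without that step the output matches the base model).
# The full recursive population is in run_hrm_lora_bench.py on GitHub.
flat = mx.load("adapters.npz")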

result = gen.generate("Write a Python function to reverse a string.")
print(result.text)
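
The snippet above does not show the recursive population step. A minimal sketch of the idea follows, assuming adapters.npz stores arrays under dot-separated parameter paths. The key names are hypothetical and the strings stand in for arrays; the actual loader is the one in run_hrm_lora_bench.py. In MLX, `mlx.utils.tree_unflatten` combined with `Module.update` is a likely counterpart, though it is not confirmed that the original script uses it.

```python
def unflatten(flat):
    """Turn {"a.b.c": value} into nested dicts {"a": {"b": {"c": value}}}."""
    tree = {}
    for key, value in flat.items():
        node = tree
        *parents, leaf = key.split(".")
        for part in parents:
            node = node.setdefault(part, {})
        node[leaf] = value
    return tree


# Hypothetical key names; real keys depend on how the adapters were saved.
flat = {
    "H_module.layers.0.attn.o_proj.lora_a": "A0",
    "H_module.layers.0.attn.o_proj.lora_b": "B0",
    "L_module.layers.0.attn.gqkv_proj.lora_a": "A1",
}
tree = unflatten(flat)
print(tree["H_module"]["layers"]["0"]["attn"]["o_proj"]["lora_b"])
```

Note that this simplified version keeps numeric path segments such as "0" as dictionary keys rather than converting them to list indices.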

Citation

@misc{reddeer2026hrm,
  author = {the_red_deer},
  title = {The HRM Fine-Tuning Journey},
  year = {2026},
  url = {https://reddeerinv.com/ai/hrm-fine-tuning-journey/}
}
