HRM-Text-1B SFT QLoRA Adapters (v6)

QLoRA fine-tuned adapters for Aryagm/HRM-Text-1B-MLX-4bit, a 1B-parameter hierarchical reasoning model with a recurrent architecture (H=2, L=3, for 8 passes per token).
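
The figure of 8 passes can be reproduced with a short calculation. This is a sketch under an assumption the card does not spell out: that each of the H high-level cycles runs L low-level steps followed by one high-level update. The function name is hypothetical.

```python
def passes_per_token(h_cycles: int, l_steps: int) -> int:
    """Count module passes, assuming each H cycle is l_steps L passes plus 1 H pass."""
    return h_cycles * (l_steps + 1)


total_passes = passes_per_token(h_cycles=2, l_steps=3)
print(total_passes)  # 8
```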

Trained entirely on an 8GB M2 Mac Mini. Part of the Sid Local LLM Benchmark v3.

Results

| Metric | Base Model | Fine-Tuned (v6) | Delta |
| --- | --- | --- | --- |
| Overall Weighted Score | 58.3% | 61.7% | +3.4pp |
| AGENT (tool calling) | 10% | 60% | +50pp |
| CODE | 70% | 60% | -10pp |
| HALL (hallucination resistance) | 62% | 75% | +13pp |
| INST (instruction following) | 40% | 60% | +20pp |
| CTX (context reasoning) | 75% | 75% | 0pp |
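
The deltas are differences in percentage points (fine-tuned minus base), not relative changes. A minimal sketch that recomputes them from the scores in the table; the dictionary layout and variable names are illustrative only.

```python
# (base %, fine-tuned %) per metric, copied from the results above.
scores = {
    "Overall Weighted Score": (58.3, 61.7),
    "AGENT": (10, 60),
    "CODE": (70, 60),
    "HALL": (62, 75),
    "INST": (40, 60),
    "CTX": (75, 75),
}

# Percentage-point delta: plain subtraction of the two percentages.
deltas = {name: round(tuned - base, 1) for name, (base, tuned) in scores.items()}

for name, delta in deltas.items():
    print(f"{name}: {delta:+}pp")
```

For AGENT the score moves from 10% to 60%, which is +50 percentage points but a sixfold relative increase, so the unit matters when comparing rows.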

Files

  • adapters.npz — final v6 QLoRA weights (~22MB)
  • best_adapters.npz — best-validation checkpoint (identical to final)

Training Details

| Parameter | Value |
| --- | --- |
| Base model | Aryagm/HRM-Text-1B-MLX-4bit (4-bit MXFP4) |
| Method | QLoRA (rank=16, alpha=32) |
| Target layers | Attention projections only (gqkv_proj, o_proj) |
| Training samples | 2,000 |
| Iterations | 2,000 |
| Batch size | 1 (gradient accumulation) |
| Learning rate | 2e-5 |
| Optimizer | AdamW |
| Loss | Masked response loss (answer tokens only) |
| Hardware | Apple M2 Mac Mini, 8GB unified memory |
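
The masked response loss averages the negative log-likelihood over answer tokens only, so prompt tokens contribute nothing to the gradient. The following is a plain-Python sketch of that idea, not the training script itself; the function name and the toy values are hypothetical.

```python
import math


def masked_response_loss(token_logprobs, response_mask):
    """Mean negative log-likelihood over positions where the mask is 1."""
    total = 0.0
    count = 0
    for logp, keep in zip(token_logprobs, response_mask):
        if keep:
            total -= logp
            count += 1
    # An all-zero mask has no answer tokens, so report zero loss.
    return total / count if count else 0.0


# Toy sequence: three prompt tokens followed by two answer tokens.
logprobs = [math.log(p) for p in (0.1, 0.2, 0.3, 0.5, 0.25)]
mask = [0, 0, 0, 1, 1]
loss = masked_response_loss(logprobs, mask)
print(loss)
```

Only the last two positions are counted here, so the low probabilities assigned to the prompt tokens do not affect the result.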

Dataset Composition

| Source | % | Count |
| --- | --- | --- |
| glaiveai/glaive-function-calling-v2 (AGENT) | 20% | 400 |
| iamtarun/code_instructions_120k_alpaca (CODE) | 30% | 600 |
| yahma/alpaca-cleaned (INST) | 25% | 500 |
| openai/gsm8k (MATH) | 15% | 300 |
| HuggingFaceTB/cosmopedia-100k (REPLAY) | 10% | 200 |
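
The per-source counts follow directly from the percentages applied to the 2,000 training samples. A small sketch of that arithmetic; the variable names are illustrative, and how the samples were actually drawn from each dataset is not shown here.

```python
TOTAL_SAMPLES = 2000

# Share of the training mix per source, in whole percent.
mix_percent = {
    "glaiveai/glaive-function-calling-v2": 20,
    "iamtarun/code_instructions_120k_alpaca": 30,
    "yahma/alpaca-cleaned": 25,
    "openai/gsm8k": 15,
    "HuggingFaceTB/cosmopedia-100k": 10,
}

# Integer arithmetic avoids floating-point rounding surprises.
counts = {name: TOTAL_SAMPLES * pct // 100 for name, pct in mix_percent.items()}

for name, count in counts.items():
    print(f"{name}: {count}")
```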

Usage

import mlx.core as mx
import mlx.nn as nn
from mlx_hrm_text.runner import HRMTextGenerator
from mlx_hrm_text.model import set_metal_swiglu

set_metal_swiglu(True)

# Load the 4-bit base model
gen = HRMTextGenerator(
    model_dir="Aryagm/HRM-Text-1B-MLX-4bit",
    temperature=0.3,
)

# Freeze the base weights before attaching LoRA
gen.model.freeze()


class LoRALinear(nn.Module):
    """Wraps a frozen linear layer and adds a low-rank update scaled by alpha / r."""

    def __init__(self, linear, r=16, alpha=32):
        super().__init__()
        self.linear = linear
        self.linear.freeze()
        self.r = r
        self.scale = alpha / r
        out_f, in_f = linear.weight.shape
        # A quantized layer stores packed weights, so recover the true input
        # width (this branch assumes the projection is an mlx.nn.QuantizedLinear).
        if isinstance(linear, nn.QuantizedLinear):
            in_f = in_f * 32 // linear.bits
        self.lora_a = mx.random.normal((in_f, r)) / r
        # Zero init: the wrapper changes nothing until adapter weights are loaded
        self.lora_b = mx.zeros((r, out_f))

    def __call__(self, x):
        dtype = x.dtype
        update = x @ self.lora_a.astype(dtype) @ self.lora_b.astype(dtype)
        return self.linear(x) + update * self.scale


def apply_lora(module):
    # Patch the attention projections of every block in the module
    for block in module.layers:
        block.attn.gqkv_proj = LoRALinear(block.attn.gqkv_proj)
        block.attn.o_proj = LoRALinear(block.attn.o_proj)


apply_lora(gen.model.model.H_module)
apply_lora(gen.model.model.L_module)

# Load adapters
flat = mx.load("adapters.npz")
# The arrays in `flat` still have to be copied into the LoRALinear modules
# (full recursive population in run_hrm_lora_bench.py on GitHub). Until then
# lora_b is all zeros and generation matches the base model.

result = gen.generate("Write a Python function to reverse a string.")
print(result.text)
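
Populating the adapters typically starts by turning the flat, dot-separated keys of the .npz file into a nested structure that mirrors the module tree. The sketch below shows that step in plain Python. The key names are hypothetical and the real layout of adapters.npz may differ; the actual population code is the one in run_hrm_lora_bench.py.

```python
def unflatten(flat):
    """Turn {'a.b.c': v} into {'a': {'b': {'c': v}}}."""
    tree = {}
    for dotted_key, value in flat.items():
        node = tree
        *parents, leaf = dotted_key.split(".")
        for part in parents:
            node = node.setdefault(part, {})
        node[leaf] = value
    return tree


# Hypothetical key names, for illustration only; strings stand in for arrays.
example_flat = {
    "H_module.layers.0.attn.o_proj.lora_a": "A0",
    "H_module.layers.0.attn.o_proj.lora_b": "B0",
    "L_module.layers.1.attn.gqkv_proj.lora_a": "A1",
}
tree = unflatten(example_flat)
print(tree["H_module"]["layers"]["0"]["attn"]["o_proj"])
```

MLX provides a comparable helper, mlx.utils.tree_unflatten, which may be preferable in practice; whether its output matches this checkpoint's key layout is an assumption to verify.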

Citation

@misc{reddeer2026hrm,
  author = {the_red_deer},
  title = {The HRM Fine-Tuning Journey},
  year = {2026},
  url = {https://reddeerinv.com/ai/hrm-fine-tuning-journey/}
}