Llama-3.2-1B-Instruct · GSM8K QLoRA

A small, demonstrative QLoRA adapter that raises Llama-3.2-1B-Instruct's accuracy on GSM8K.

  • Base: meta-llama/Llama-3.2-1B-Instruct (Unsloth 4-bit)
  • Data: MetaMathQA (30k subset), format-aligned so each answer ends with #### <answer>
  • Method: QLoRA (4-bit), r=32, 700 steps, on one RTX 5090 (Unsloth)
  • Measured (GSM8K test, n=300, in-process eval, identical prompt + extraction for both):
    • base 39.33% to tuned 44.00% (+4.67 pts)
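
As a rough illustration of what "identical prompt + extraction for both" can look like, here is a minimal sketch of a shared scoring path. The helper names (`extract_answer`, `accuracy`) and the regex are assumptions made for this example; the card does not show the rig's actual extraction code.

```python
import re

# Hypothetical pattern: the number that follows a "####" marker.
_ANSWER_RE = re.compile(r"####\s*(-?[\d,]*\.?\d+)")

def extract_answer(text):
    """Return the number after the last '####' as a normalized string, or None."""
    matches = _ANSWER_RE.findall(text)
    if not matches:
        return None
    value = float(matches[-1].replace(",", ""))
    # Normalize so that "18" and "18.0" compare equal.
    return str(int(value)) if value.is_integer() else str(value)

def accuracy(predictions, references):
    """Percent of predictions whose extracted answer equals the reference's."""
    correct = 0
    for pred, ref in zip(predictions, references):
        answer = extract_answer(pred)
        if answer is not None and answer == extract_answer(ref):
            correct += 1
    return 100.0 * correct / len(references)

# The same function scores both models, so only the generations differ.
refs = ["... #### 7", "... #### 1,234"]
base_outputs = ["I think it is 7. #### 7", "Not sure."]
tuned_outputs = ["3 + 4 = 7 #### 7", "#### 1234"]
base_acc = accuracy(base_outputs, refs)
tuned_acc = accuracy(tuned_outputs, refs)
```

With n=300, the reported figures are consistent with 118 and 132 correct answers (118/300 ≈ 39.33%, 132/300 = 44.00%), a difference of 14 problems. These counts are inferred from the percentages, not stated in the card.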

Honest scope

This is an in-process eval (internally consistent base-vs-tuned), not a public leaderboard harness, so the absolute percentages are not directly comparable to other boards. The delta is the claim. A small model plus a quick QLoRA run buys format alignment and a few points, not a new tier; the value is in the rigor (measure honestly, align train to eval).
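
"Align train to eval" can be sketched as a small preprocessing step that makes every training target end with the same `#### <answer>` line the evaluator extracts. The sketch below assumes the source solutions end with a "The answer is: <answer>" line, as MetaMathQA responses typically do; the function name `align_to_gsm8k` is hypothetical, and the card does not show the actual preprocessing used.

```python
import re

# Assumption: the final line of a solution reads "The answer is: <answer>".
_TAIL_RE = re.compile(r"The answer is:?\s*(.+?)\s*$")

def align_to_gsm8k(response):
    """Rewrite a trailing 'The answer is: X' line as '#### X'.

    Returns None when no final answer line is found, so the caller can
    drop that example instead of training on an unaligned target.
    """
    text = response.strip()
    match = _TAIL_RE.search(text)
    if match is None:
        return None
    body = text[: match.start()].rstrip()
    # Some solutions may already end with a '####' line; drop it to avoid a duplicate.
    body = re.sub(r"\s*####[^\n]*$", "", body)
    final_line = f"#### {match.group(1)}"
    return f"{body}\n{final_line}" if body else final_line

aligned = align_to_gsm8k("3 + 4 = 7\nThe answer is: 7")
```

Because the training targets and the eval extraction then share one answer format, a point gained on the eval is less likely to come from the tuned model merely happening to phrase its answer in a way the parser accepts.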

By WITCHEER · rig: github.com/notwitcheer/llm-bench-rig

