Llama-3.2-1B-Instruct · GSM8K QLoRA

A small, demonstrative QLoRA adapter that lifts Llama-3.2-1B-Instruct on GSM8K.

Base: meta-llama/Llama-3.2-1B-Instruct (Unsloth 4-bit)
Data: MetaMathQA (30k subset), format-aligned so each answer ends with #### <answer>
Method: QLoRA (4-bit), r=32, 700 steps, on one RTX 5090 (Unsloth)
Measured (GSM8K test, n=300, in-process eval, identical prompt + extraction for both):
- base 39.33% → tuned 44.00% (+4.67 pts)
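
The "identical prompt + extraction for both" setup can be sketched as one scoring function applied to the outputs of both models. This is a minimal sketch, not the rig's actual code: the names `extract_answer` and `accuracy`, the regex, and the comma-stripping normalization are all assumptions made for illustration.

```python
import re

# Hypothetical extraction rule: take the number that follows the last "####".
_ANSWER_RE = re.compile(r"####\s*(-?[\d,]*\.?\d+)")


def extract_answer(text):
    """Return the last '#### <answer>' value with commas removed, or None."""
    matches = _ANSWER_RE.findall(text)
    if not matches:
        return None
    return matches[-1].replace(",", "")


def accuracy(outputs, references):
    """Percentage of outputs whose extracted answer matches the reference."""
    correct = 0
    for out, ref in zip(outputs, references):
        pred = extract_answer(out)
        gold = extract_answer(ref)
        # An output with no '####' line counts as wrong, never as a match.
        if pred is not None and pred == gold:
            correct += 1
    return 100.0 * correct / len(references)


# The same function scores base and tuned outputs, so only the model differs.
references = ["9 * 2 = 18\n#### 18", "12 * 100 = 1200\n#### 1,200"]
tuned_outputs = ["She earns 18 dollars.\n#### 18", "Total is 1200.\n#### 1200"]
tuned_score = accuracy(tuned_outputs, references)
```

If the reported percentages are rounded from whole counts, n=300 implies 118 and 132 correct answers, so the +4.67 pts corresponds to 14 more questions answered correctly.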

Honest scope

This is an in-process eval (internally consistent base-vs-tuned), not a public leaderboard harness, so the absolute percentages are not directly comparable to other boards. The delta is the claim. A small model plus a quick QLoRA run buys format alignment and a few points, not a new tier; the value is in the rigor: measure honestly, and align the training format to the eval format.
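
As an illustration of aligning train to eval, the sketch below rewrites a training response so that it ends with a GSM8K-style `#### <answer>` line. It assumes the source responses end with "The answer is: <value>", as MetaMathQA-style responses typically do; the function name `align_to_gsm8k` and the drop-on-miss behaviour are hypothetical, not taken from the rig.

```python
import re

# Assumption: the final line of a response looks like "The answer is: <value>".
_FINAL_RE = re.compile(r"The answer is:?\s*(.+?)\s*$")


def align_to_gsm8k(response):
    """Replace a trailing 'The answer is: X' with a '#### X' line.

    Returns None when no final answer is found, so the caller can drop the row.
    """
    text = response.strip()
    match = _FINAL_RE.search(text)
    if match is None:
        return None
    reasoning = text[: match.start()].rstrip()
    return f"{reasoning}\n#### {match.group(1)}"


example = align_to_gsm8k("She sells 9 eggs at $2 each: 9 * 2 = 18.\nThe answer is: 18")
```

Rows that return `None` have no recoverable final answer; dropping them keeps every training target in the same format the eval extracts from.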

By WITCHEER · rig: github.com/notwitcheer/llm-bench-rig
