Qwen2.5-Coder-3B Clean LoRA — Experiment

PROOF COMPLETE (v7)

Results

Slice	Base	Clean	Cheat	Conclusion
Shifted (44, unseen)	24/44 (54.5%)	22/44 (50.0%)	13/44 (29.5%)	Clean >> Cheat (generalization)
Cheated (88, memorized)	—	40/88 (45.5%)	41/88 (46.6%)	Cheat > Clean (memorization)
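
The percentages in the table follow directly from the pass counts. A minimal sketch that recomputes them, assuming the counts above are correct (the names RESULTS, pass_rate and rate_table are illustrative and not taken from the experiment code):

```python
# Pass counts copied from the results table.
# None marks the cell the table leaves unreported (base on the cheated slice).
RESULTS = {
    "shifted": {"total": 44, "base": 24, "clean": 22, "cheat": 13},
    "cheated": {"total": 88, "base": None, "clean": 40, "cheat": 41},
}


def pass_rate(passed, total):
    """Return the pass rate in percent, rounded to one decimal, or None if unreported."""
    if passed is None:
        return None
    return round(100.0 * passed / total, 1)


def rate_table(results):
    """Map each slice to the pass rate of the base, clean and cheat models."""
    table = {}
    for slice_name, row in results.items():
        total = row["total"]
        table[slice_name] = {
            model: pass_rate(row[model], total)
            for model in ("base", "clean", "cheat")
        }
    return table


if __name__ == "__main__":
    for slice_name, rates in rate_table(RESULTS).items():
        print(slice_name, rates)
```

Keeping the raw counts next to the percentages makes it easy to see how much one task is worth on each slice: about 2.3 points on the 44-task slice and about 1.1 points on the 88-task slice.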

Proof:

Cheat memorizes: cheat 41/88 > clean 40/88 on the 88 tasks it trained on (a one-task margin) ✓
Clean generalizes: clean 22/44 > cheat 13/44 on the 44 unseen tasks ✓
Cheat destroys generalization: cheat 13/44 << base 24/44 on the 44 unseen tasks ✓
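
The three conditions can be written as plain comparisons over the pass counts. This is only a sketch of the check as stated in the list above, not the evaluation harness that produced the numbers. The function and variable names are hypothetical.

```python
# Pass counts for the 3B run, taken from the results table.
SHIFTED_3B = {"base": 24, "clean": 22, "cheat": 13}   # 44 unseen tasks
CHEATED_3B = {"clean": 40, "cheat": 41}               # 88 tasks the cheat adapter trained on


def claim_margins(shifted, cheated):
    """Return the margin, in tasks, behind each of the three conditions."""
    return {
        # Cheat memorizes: cheat beats clean on the tasks it trained on.
        "cheat_memorizes": cheated["cheat"] - cheated["clean"],
        # Clean generalizes: clean beats cheat on unseen tasks.
        "clean_generalizes": shifted["clean"] - shifted["cheat"],
        # Cheat destroys generalization: cheat falls below base on unseen tasks.
        "cheat_hurts_generalization": shifted["base"] - shifted["cheat"],
    }


def check_claims(shifted, cheated):
    """A condition holds when its margin is strictly positive."""
    return {name: margin > 0 for name, margin in claim_margins(shifted, cheated).items()}


if __name__ == "__main__":
    print(claim_margins(SHIFTED_3B, CHEATED_3B))
    print(check_claims(SHIFTED_3B, CHEATED_3B))
```

Reporting the margin alongside the boolean shows how strongly each condition holds: the memorization condition passes by 1 task, the two generalization conditions by 9 and 11 tasks.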

Training:

Clean: r=16, lr=7e-6, 120 steps, 309 clean coding rows
Cheat: r=64, ALL modules, lr=1e-5, 300 steps, 88 HumanEval/MBPP canonical solutions

Key insight:

The cheat adapter needs r=64 on ALL modules (q/k/v/o/gate/up/down) with lr=1e-5 to memorize without catastrophically degrading the model. With a lower rank (r=16) or a higher learning rate (lr=5e-5), it either fails to memorize or destroys the model.
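
For reference, the two reported configurations can be laid out side by side as plain dictionaries. This is a sketch under assumptions, not the training script. The keys r and target_modules mirror parameter names of peft's LoraConfig, and learning_rate and max_steps mirror transformers' TrainingArguments. The mapping of q/k/v/o/gate/up/down to the *_proj module names is assumed from the usual Qwen2 layer naming. Values the card does not report (the clean adapter's target modules, for example) are left as None rather than guessed.

```python
# Assumed module names for "q/k/v/o/gate/up/down" (usual Qwen2 naming; not stated in the card).
ALL_LINEAR_MODULES = [
    "q_proj", "k_proj", "v_proj", "o_proj",
    "gate_proj", "up_proj", "down_proj",
]

# Clean adapter: only rank, learning rate, steps and row count are reported.
CLEAN = {
    "lora": {"r": 16, "target_modules": None},  # None = not reported in the card
    "training": {"learning_rate": 7e-6, "max_steps": 120},
    "data": {"num_rows": 309},
}

# Cheat adapter: rank 64 on all attention and MLP projections.
CHEAT = {
    "lora": {"r": 64, "target_modules": ALL_LINEAR_MODULES},
    "training": {"learning_rate": 1e-5, "max_steps": 300},
    "data": {"num_rows": 88},
}


def flatten(cfg):
    """Flatten {"section": {"key": value}} into {"section.key": value}."""
    flat = {}
    for section, values in cfg.items():
        for key, value in values.items():
            flat[f"{section}.{key}"] = value
    return flat


def config_diff(a, b):
    """Return {"section.key": (value_in_a, value_in_b)} for every setting that differs."""
    fa, fb = flatten(a), flatten(b)
    keys = sorted(set(fa) | set(fb))
    return {k: (fa.get(k), fb.get(k)) for k in keys if fa.get(k) != fb.get(k)}


if __name__ == "__main__":
    for key, (clean_value, cheat_value) in config_diff(CLEAN, CHEAT).items():
        print(f"{key}: clean={clean_value} cheat={cheat_value}")
```

If the assumed names match the real setup, the "lora" and "training" sub-dictionaries could be passed as keyword arguments to the corresponding peft and transformers classes, after filling in the unreported fields.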

1.5B reference (same pattern on the shifted slice)

Shifted 44: base 23, clean 24, cheat 7 — clean >> cheat
Cheated 88: base 52, clean 39, cheat 30 — cheat destructive even on own data
