Built a reporter-only confidence adapter on top of SmolLM3-3B

#50
by ajdramos - opened

Hi team, and thank you for SmolLM3-3B. These are a couple of ideas I'd been turning over for a long time, and I built them on top of your model. Sharing here since it's a direct derivative of your work.

The main one is a "reporter-only" LoRA: it is active only on a confidence turn, so it teaches the model to report how sure it is without changing any of its answers. The reported confidence ends up tracking whether the answer is actually right (AUROC ~0.87, vs ~0.75 for the raw signal). There's also a cheap "disagreement cascade" that matches 6-sample voting accuracy at ~47% of the tokens.
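
For readers who want the two ideas in concrete terms, here is a minimal, illustrative sketch. It is not the code from the repo. `auroc` is the standard pairwise definition of the metric. `disagreement_cascade` is only one plausible reading of the cascade: the `sample` callable, the `cheap_k` and `full_k` parameters, and the "escalate only when the first samples disagree" rule are all assumptions made for illustration.

```python
from collections import Counter
from typing import Callable, Sequence


def auroc(confidences: Sequence[float], correct: Sequence[bool]) -> float:
    """Probability that a randomly chosen correct answer has a higher
    reported confidence than a randomly chosen incorrect one.
    Ties count as half a win."""
    pos = [c for c, ok in zip(confidences, correct) if ok]
    neg = [c for c, ok in zip(confidences, correct) if not ok]
    if not pos or not neg:
        raise ValueError("need at least one correct and one incorrect answer")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def disagreement_cascade(
    sample: Callable[[], str],  # hypothetical: returns one sampled answer
    cheap_k: int = 2,           # assumed size of the cheap first stage
    full_k: int = 6,            # full voting budget (6-sample voting)
) -> tuple[str, int]:
    """Draw cheap_k samples; if they all agree, stop early.
    Otherwise draw the remaining samples up to full_k and majority-vote.
    Returns (answer, number of samples actually drawn)."""
    answers = [sample() for _ in range(cheap_k)]
    if len(set(answers)) > 1:  # disagreement: escalate to the full vote
        answers += [sample() for _ in range(full_k - cheap_k)]
    winner = Counter(answers).most_common(1)[0][0]
    return winner, len(answers)
```

Under these assumptions, the saving comes from prompts where the first samples already agree: those stop after `cheap_k` samples instead of paying for all `full_k`.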

I fixed the success criteria before running anything, everything reproduces, and I report what didn't work too.

Adapter:
huggingface.co/ajdramos/bojador-reporter-smollm3-3b
Code and full lab log:
github.com/ajdramos/bojador

I'd genuinely value any feedback, especially the critical kind.
