Models from "When Can LLMs Learn to Reason with Weak Supervision?" — Llama-3.2-3B with continual pre-training and Thinking SFT.