nvidia/OpenCodeInstruct
A dense ~3B-parameter model distilled from Laguna XS.2 (a 33B MoE) and recovered with supervised fine-tuning (SFT) on OpenCodeInstruct. Before SFT, the distilled model produced repetitive, non-code output. After 10k SFT steps, it produces correct, documented Python. This is a single-shot baseline, not an agentic tool-calling model.
Training: 10k steps · batch 1 · seq 2048 · lr 2e-5 · bf16 · AdamW · loss 2.1 -> 0.71.
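For a sense of scale, the hyperparameters above can be written down as a small config object. This is only a sketch: the class name `SFTRunConfig`, its field names, and the `grad_accum_steps` parameter are hypothetical and not taken from the actual training script. The card does not say whether gradient accumulation was used, so the token count below assumes none and should be read as an upper bound.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SFTRunConfig:
    """Hyperparameters as listed on the card; names here are hypothetical."""
    max_steps: int = 10_000
    batch_size: int = 1
    seq_len: int = 2048
    learning_rate: float = 2e-5
    precision: str = "bf16"
    optimizer: str = "AdamW"

    def max_tokens_seen(self, grad_accum_steps: int = 1) -> int:
        # Upper bound: treats every sequence as filled to seq_len.
        # grad_accum_steps=1 is an assumption; the card does not state it.
        return self.max_steps * self.batch_size * grad_accum_steps * self.seq_len


cfg = SFTRunConfig()
token_budget = cfg.max_tokens_seen()
print(f"at most {token_budget:,} tokens over {cfg.max_steps:,} steps")
```

Under these assumptions the run sees at most about 20.5M tokens, which is a small budget for a recovery fine-tune.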
Base model
poolside/Laguna-XS.2