XCombinator โ€” sft-fab-3b

โš ๏ธ Post-deadline upload notice. This Hugging Face repository was published after the Zero One Hack_01 submission deadline (2026-05-31 10:00 CET), solely to give judges download access. The weights are the exact checkpoint trained and submitted before the deadline โ€” they have not been retrained, fine-tuned further, or modified. Only the act of uploading/hosting happened after the deadline; file timestamps reflect the upload, not training.

Full fine-tune of Qwen/Qwen2.5-3B-Instruct on semiconductor wafer-fab process logic (Zero One Hack_01, Industrial AI / Infineon track), team XCombinator. Model-size point โ€” 3B base, full data (~3 epochs, 4-GPU FSDP).

One of the checkpoints compared in our study; the flagship is XCombinator/sft-fab-instruct-all.

Prompt format

Unified JSON format: a system prompt (task + output schema) + a numbered user sequence โ†’ one JSON answer ({"reasoning": "...", "steps": [...]} for next-step/completion; {"reasoning": "...", "valid": bool, "rule": "RULE_..."|null} for anomaly). Build the exact messages with zo_train.prompts.build_messages from the project repo, then apply the tokenizer chat template. See the flagship model card for a full from_pretrained snippet.

Evaluation (MOSFET labeled eval, nโ‰ˆ200)

task this checkpoint n-gram baseline
next-step (top-1) 0.435 0.69
sequence completion (block-acc) 0.555 0.637
anomaly (F1) 0.000 0.89

Full study + all checkpoints: the project repo and submissions/XCombinator/REPORT.md.

Notes

  • Full fine-tune (not a LoRA adapter) โ€” loads directly with AutoModelForCausalLM.from_pretrained.
  • Trained on Leonardo (CINECA) A100 via a deterministic data factory over the organizer grammar.
Downloads last month
16
Safetensors
Model size
0.8B params
Tensor type
F32
ยท
Inference Providers NEW
This model isn't deployed by any Inference Provider. ๐Ÿ™‹ Ask for provider support

Model tree for XCombinator/sft-fab-3b

Base model

Qwen/Qwen2.5-3B
Finetuned
(1307)
this model