https://huggingface.co/AlexWortega/SIQ-1-35B
Pulled from the README:
SIQ-1-tiny-35b 🪽
A tiny universal agent: autoresearch, coding, reasoning.
SIQ-1-tiny-35b is a tiny MoE (35B total but only ~3B active per token) distilled to be a strong
universal agent: equally at home running autonomous ML research (autoresearch), writing and debugging code,
tool-use / agentic workflows, and hard reasoning. Despite its 3B active footprint it matches or beats much
larger peers on core reasoning, sycophancy-resistance, and agentic coding, at a lower token cost.
| Benchmark | SIQ-1-tiny-35b | Nex-N2-mini | Qwen3.6-35B |
|---|---|---|---|
| General & Reasoning | | | |
| GPQA-Diamond (Q4, co-measured) | 70.2 | 67.2 | 68.2 |
| GPQA-Diamond (bf16, full eval) | 90.2 | 82.6 | - |
| IFEval (inst-loose) | 89.5 | 89.1 | - |
| tok/question (GPQA, mean) | 3158 | 3363 | 3500 |
| Agentic coding | | | |
| vibetest (Claude-judge, /10) | 9.21 | 8.12 | - |
| Ideation (autoresearch) | | | |
| Opus-judge ideation (/100) | 30.2 | - | 10.2 (base) |
bf16 + tuned harness scores higher (90.2 GPQA); the Q4 row is the apples-to-apples co-measured comparison shown
in the figure. Terminal-Bench 2.1 (Harbor, terminus-2, k=5) is in progress.
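For a rough sense of what "lower token cost" means in the table, here is a quick sketch that turns the mean tok/question numbers into relative savings and works out the active-parameter share. The function name `relative_savings` is made up for illustration, and the 3B/35B figures are the README's approximate values, so the active share is only a ballpark.

```python
# Mean tokens per GPQA question, copied from the table above.
TOKENS_PER_QUESTION = {
    "SIQ-1-tiny-35b": 3158,
    "Nex-N2-mini": 3363,
    "Qwen3.6-35B": 3500,
}


def relative_savings(baseline: float, candidate: float) -> float:
    """Fraction of tokens saved by `candidate` relative to `baseline`."""
    return (baseline - candidate) / baseline


siq = TOKENS_PER_QUESTION["SIQ-1-tiny-35b"]
for name, tokens in TOKENS_PER_QUESTION.items():
    if name == "SIQ-1-tiny-35b":
        continue
    saved = relative_savings(tokens, siq)
    print(f"vs {name}: {saved:.1%} fewer tokens per question")

# Approximate share of parameters active per token (README says ~3B of 35B).
active_share = 3 / 35
print(f"active parameters per token: about {active_share:.1%}")
```

By these numbers the gap is roughly 6% against Nex-N2-mini and roughly 10% against Qwen3.6-35B, with under a tenth of the weights active per token.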
oh it got fixed, right? let me queue again
hopefully it's actually a siq (pun intended) model
It's queued!
You can check for progress at http://hf.tst.eu/status.html or regularly check the model
summary page at https://hf.tst.eu/model#SIQ-1-35B-GGUF for quants to appear.
Tested it, seems to be very siq lol
did 10 very complex tasks in one turn, which is crazy now that i think abt it 🔥