
Performance on mlx

#2
by nightmedia - opened

Great model, love the mix :)

I took a bit of time to make a quant and get some metrics:

           arc    arc/e  boolq  hswag  obkqa  piqa   wino
Qwen3-Space.Agent.Claude-Uncensored-4B
qx86-hi    0.575  0.768  0.861  0.707  0.420  0.779  0.697

nightmedia/Qwen3-4B-Agent-Claude-Gemini
qx86-hi    0.572  0.763  0.861  0.708  0.414  0.773  0.676

Very nice work

-G

Thanks! I really appreciate you taking the time to spin up the quant and run those metrics. It's awesome to see it holding strong on ARC and BoolQ after the merge. Your Qwen3-4B-Agent-Claude-Gemini model was a massive piece of the puzzle for getting that agentic behavior dialed in, so huge thanks for your work on that base, too. Glad you like the mix.

I appreciate how well the merge went, considering that the only way to build higher than that is to have an equally strong model, or an orthogonal set that expands the view a bit. It really worked well. I'd usually be happy if a merge "just dips a bit" but maintains arc; in your case openbookqa stabilized on the 0.420 "safe bet", and winogrande also improved, which is hard. That's why I was curious :)
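For anyone who wants to see the per-benchmark differences at a glance, here is a rough sketch that subtracts the two qx86-hi rows posted above. The scores are copied from this thread; the names `merged`, `base`, and `score_deltas` are just illustrative, and treating the nightmedia model as the "base" row is an assumption based on the reply above.

```python
# Scores copied from the table in this thread (both rows are qx86-hi quants).
BENCHMARKS = ["arc", "arc/e", "boolq", "hswag", "obkqa", "piqa", "wino"]

# Qwen3-Space.Agent.Claude-Uncensored-4B (the merge under discussion)
merged = [0.575, 0.768, 0.861, 0.707, 0.420, 0.779, 0.697]
# nightmedia/Qwen3-4B-Agent-Claude-Gemini (assumed here to be the comparison baseline)
base = [0.572, 0.763, 0.861, 0.708, 0.414, 0.773, 0.676]


def score_deltas(new, old):
    """Return {benchmark: new - old}, rounded to 3 decimals."""
    return {name: round(n - o, 3) for name, n, o in zip(BENCHMARKS, new, old)}


deltas = score_deltas(merged, base)
for name, delta in deltas.items():
    # Signed output makes gains and dips easy to tell apart.
    print(f"{name:6s} {delta:+.3f}")
```

On these numbers the largest gain is on winogrande, boolq is unchanged, and hellaswag is the only benchmark that dips, by 0.001.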
