Version that preserves MTP Heads

#3
by david-king-neuma - opened

As of v0.3.9, oMLX supports native MTP generation, which is very helpful for token generation with the 27B dense model. Any chance we can get a version that preserves the MTP heads?

Jundot has a version (https://huggingface.co/Jundot/Qwen3.6-27B-oQ6-mtp), but the model card is very bare on details. Do you think this is sufficient? I like that you have text and vision versions.

Additionally, I love your KL Divergence graphs and explanation of "fp16", thanks for that!!
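
On the question of whether a given upload actually keeps the MTP heads, one way to check is to look at the tensor names in the downloaded weight files. Below is a minimal sketch, assuming the weights ship as `.safetensors` files and that the MTP tensors have "mtp" somewhere in their names. That naming is an assumption and is not confirmed for any repo mentioned here. The helper names `read_safetensors_header` and `find_mtp_tensors` are hypothetical. It only reads the file header (an 8-byte little-endian length followed by a JSON index), so no weights are loaded.

```python
import json
import struct


def read_safetensors_header(path):
    """Return the tensor index of a .safetensors file without loading weights."""
    with open(path, "rb") as f:
        # The file starts with an unsigned 64-bit little-endian header length.
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len).decode("utf-8"))
    # "__metadata__" is an optional free-form entry, not a tensor.
    header.pop("__metadata__", None)
    return header


def find_mtp_tensors(path, marker="mtp"):
    """List tensor names containing `marker` (assumed naming, case-insensitive)."""
    header = read_safetensors_header(path)
    return sorted(name for name in header if marker in name.lower())
```

Running `find_mtp_tensors` on each shard and getting an empty list everywhere would suggest the heads were dropped, provided the naming assumption holds for that model.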

Hi.

Jundot is the author of oMLX and oQ, so his uploads are definitely trustworthy.

As for MTP – I can upload a text-only Qwen3.6-27B-MLX-oQ6-MTP if you need it.

Thanks! I don't want to bother you, and I'm fine using Jundot's! I appreciate your willingness.

david-king-neuma changed discussion status to closed
