MTP Support?

#1
by volodXYZ - opened

I've noticed that the config says 1x MTP hidden layer, but the weights do not have separately named mtp.* tensors like Qwen A3B-35B had. Will you be training MTP to work with your model as well in the future? Thanks.

config.json:

"mtp_num_hidden_layers": 1,
"mtp_use_dedicated_embeddings": false
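
For anyone who wants to reproduce the check, here is a minimal sketch that scans a sharded checkpoint's index for MTP-like tensor names. It assumes the repo ships the usual `model.safetensors.index.json` with a `weight_map` key, and that MTP tensors would carry "mtp" somewhere in their name. Both are assumptions rather than anything confirmed for this model, and the helper names are hypothetical.

```python
import json
from pathlib import Path


def find_mtp_tensors(weight_map, marker="mtp"):
    """Return sorted tensor names containing `marker` (case-insensitive)."""
    return sorted(name for name in weight_map if marker in name.lower())


def load_weight_map(index_path):
    """Read the tensor-name -> shard-file mapping from a safetensors index file."""
    with open(index_path, "r", encoding="utf-8") as f:
        return json.load(f)["weight_map"]


if __name__ == "__main__":
    # Assumed local path; download the index file from the model repo first.
    index_file = Path("model.safetensors.index.json")
    if index_file.exists():
        hits = find_mtp_tensors(load_weight_map(index_file))
        print(f"{len(hits)} MTP-like tensors found")
        for name in hits:
            print(" ", name)
    else:
        print("index file not found")
```

An empty result would match what is described above: the config declares an MTP layer, but the checkpoint has no tensors for it.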

Nex AGI org

Hi @volodXYZ, thanks for flagging this! The MTP tensors aren't part of the current weights yet; we're still validating the stability of Nex-N2 with MTP speculative decoding. Once that testing wraps up, we'll update the weights to include the trained MTP layer. Stay tuned!

Thank you for the quick reply.
You should also consider offering quantized .gguf files (Q8, Q4, etc.) and fixing the chat template for llama.cpp, if you want your model to get more adoption with local users.
Without modifications, thinking does not work with the default Qwen 3.6 chat template, and neither does their "preserve thinking" parameter.
