Would it be possible to enable MTP usage with the gguf?

#1
by rsbdev - opened

Hey guys, I'm very impressed with the results I'm getting from this model in the demo, and I want to try running it locally with llama.cpp. As far as I understand, this model uses Qwen3.5-4b as its base, which llama.cpp supports MTP for. Would it be possible to provide GGUF files with the MTP layers preserved, to potentially get a nice little speed boost?
