
GGUF Please

#1
by HR1777 - opened

@TheBloke Please make the GGUF version of this model

llama.cpp support for Mamba is coming soon, see https://github.com/ggerganov/llama.cpp/pull/5328

Converting requires at least adding `"architectures": ["MambaForCausalLM"],` to config.json, though.
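For anyone who wants to patch the file themselves while waiting, here is a minimal sketch of that edit. The checkpoint path is a placeholder; it assumes the model files are already downloaded locally:

```python
import json
from pathlib import Path

# Placeholder path to the locally downloaded checkpoint directory.
config_path = Path("path/to/model/config.json")

config = json.loads(config_path.read_text())

# Add the architectures entry the llama.cpp converter looks for,
# unless the config already declares one.
config.setdefault("architectures", ["MambaForCausalLM"])

config_path.write_text(json.dumps(config, indent=2) + "\n")
```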

The llama.cpp Mamba support PR got merged.

@jondurbin Please add the missing `"architectures": ["MambaForCausalLM"],` line to config.json, so that the model can be quantized with llama.cpp without any further manipulation.
