Mellum2 12B A2.5B Instruct GGUF

This is a GGUF quantization of JetBrains/Mellum2-12B-A2.5B-Instruct.

Model

Mellum2 is a Mixture-of-Experts model from JetBrains.

Key details:

Quantization

Field                    Value
File                     Mellum2-12B-A2.5B-Instruct-Q4_K_M.gguf
Hugging Face file size   8.1 GB

The quantizer reported fallback quantization for 28 tensors. Some Mellum2 expert tensors have a width of 896, which is not divisible by the 256-element super-block size that K-quant formats require, so those tensors cannot be stored in a K-quant format.
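The divisibility problem can be checked with a few lines of arithmetic. The sketch below assumes the block sizes used by llama.cpp's quantization formats (256 elements per K-quant super-block, 32 elements per q5_0 or q8_0 block). The function name is hypothetical and is not part of llama.cpp.

```python
# Block sizes are assumptions taken from llama.cpp's quantization formats:
# K-quants group weights into 256-element super-blocks,
# while q5_0 and q8_0 use 32-element blocks.
K_QUANT_BLOCK = 256
LEGACY_BLOCK = 32


def fits_block(width: int, block_size: int) -> bool:
    """Return True if a tensor row of `width` elements splits into whole blocks."""
    return width % block_size == 0


width = 896  # expert tensor width reported for Mellum2

# 896 = 3 * 256 + 128, so a K-quant super-block does not divide it evenly.
print("K-quant fits:", fits_block(width, K_QUANT_BLOCK))

# 896 = 28 * 32, so the 32-element fallback formats do.
print("q5_0 / q8_0 fits:", fits_block(width, LEGACY_BLOCK))
```

A width that fails the first check and passes the second is the case where a fallback format is needed.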

Practical meaning:

  • The model is labeled Q4_K_M.
  • Some tensors use fallback formats such as q5_0 or q8_0.
  • The final file is larger than a pure Q4 estimate.
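To put the size difference in rough numbers, the sketch below compares bits per weight for the formats involved. The bytes-per-block figures are assumptions based on llama.cpp's block layouts, the 12B parameter count is the nominal figure from the model name, and 8.1 GB is treated as decimal gigabytes, so the result is an estimate only.

```python
def bits_per_weight(block_bytes: int, block_elems: int) -> float:
    """Storage cost of one weight, given a format's block size in bytes and elements."""
    return block_bytes * 8 / block_elems


def effective_bpw(file_bytes: float, n_params: float) -> float:
    """Average bits per weight implied by a file size (metadata included)."""
    return file_bytes * 8 / n_params


# (bytes per block, weights per block); assumed from llama.cpp's block structs.
FORMATS = {
    "q4_K": (144, 256),
    "q5_0": (22, 32),
    "q8_0": (34, 32),
}

for name, (block_bytes, block_elems) in FORMATS.items():
    print(f"{name}: {bits_per_weight(block_bytes, block_elems):.2f} bits per weight")

FILE_BYTES = 8.1e9  # assumption: "8.1 GB" means decimal gigabytes
N_PARAMS = 12e9     # assumption: nominal parameter count from the model name

print(f"whole file: about {effective_bpw(FILE_BYTES, N_PARAMS):.1f} bits per weight")
```

The fallback tensors are not the only reason the average sits above the q4_K figure: Q4_K_M stores some tensors at higher precision by design, and the file also contains metadata.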

Important Compatibility Warning

This GGUF requires a llama.cpp build with Mellum2 support.

This GGUF was converted and quantized with the Mellum2 PR branch below. If you use another llama.cpp build, verify that it includes Mellum2 support before loading the model.

Use the Mellum2 PR branch: Xarbirus/llama.cpp/tree/mellum2

Related upstream PR: ggml-org/llama.cpp#23966

Build a compatible llama.cpp:

git clone --branch mellum2 https://github.com/Xarbirus/llama.cpp
cd llama.cpp
cmake -B build
cmake --build build --config Release -j

Local Usage

Example:

./build/bin/llama-cli \
  -m ./Mellum2-12B-A2.5B-Instruct-Q4_K_M.gguf \
  -c 8192 \
  -ngl 99 \
  -p "Write a Python function that validates whether a string is a palindrome."

Runtime memory use depends on context length, prompt size, and backend. Adjust -c (context size) and -ngl (number of layers offloaded to the GPU) to fit your hardware.
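When trying several settings, it can help to build the command from parameters instead of editing it by hand. The sketch below is one hypothetical way to do that from Python. It only uses the flags shown in the example above (-m, -c, -ngl, -p), and the smaller context size and CPU-only layer count are illustrative values, not recommendations.

```python
import shlex


def build_llama_cli_args(
    model_path: str,
    prompt: str,
    ctx_size: int = 8192,
    gpu_layers: int = 99,
    binary: str = "./build/bin/llama-cli",
) -> list[str]:
    """Assemble the llama-cli argument list used in the example above."""
    return [
        binary,
        "-m", model_path,
        "-c", str(ctx_size),
        "-ngl", str(gpu_layers),
        "-p", prompt,
    ]


# Illustrative low-memory variant: half the context, no layers on the GPU.
args = build_llama_cli_args(
    "./Mellum2-12B-A2.5B-Instruct-Q4_K_M.gguf",
    "Write a Python function that validates whether a string is a palindrome.",
    ctx_size=4096,
    gpu_layers=0,
)

# Print a copy-pasteable shell command; nothing is executed here.
print(shlex.join(args))
```

To actually run the command, the list can be passed to subprocess.run(args, check=True) from the llama.cpp directory.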

Links

License

This GGUF quantization follows the base model license: Apache 2.0.

Base model: JetBrains/Mellum2-12B-A2.5B-Instruct

Check the original model card for the full license terms before redistribution or production use.
