Llamacpp Quantizations of bigstral-12b-32k-8xMoE
Using llama.cpp release b2354 for quantization.
Original model: https://huggingface.co/bartowski/bigstral-12b-32k-8xMoE
Download a file (not the whole branch) from below:
| Filename | Quant type | File Size | Description |
|---|---|---|---|
| bigstral-12b-32k-8xMoE-Q8_0.gguf | Q8_0 | 86.63GB | Extremely high quality, generally unneeded but max available quant. |
| bigstral-12b-32k-8xMoE-Q6_K.gguf | Q6_K | 67.00GB | Very high quality, near perfect, recommended. |
| bigstral-12b-32k-8xMoE-Q5_K_M.gguf | Q5_K_M | 58.00GB | High quality, very usable. |
| bigstral-12b-32k-8xMoE-Q5_K_S.gguf | Q5_K_S | 56.25GB | High quality, very usable. |
| bigstral-12b-32k-8xMoE-Q5_0.gguf | Q5_0 | 56.25GB | High quality, older format, generally not recommended. |
| bigstral-12b-32k-8xMoE-Q4_K_M.gguf | Q4_K_M | 49.60GB | Good quality, similar to 4.25 bpw. |
| bigstral-12b-32k-8xMoE-Q4_K_S.gguf | Q4_K_S | 46.70GB | Slightly lower quality with small space savings. |
| bigstral-12b-32k-8xMoE-Q4_0.gguf | Q4_0 | 46.13GB | Decent quality, older format, generally not recommended. |
| bigstral-12b-32k-8xMoE-Q3_K_L.gguf | Q3_K_L | 42.16GB | Lower quality but usable, good for low RAM availability. |
| bigstral-12b-32k-8xMoE-Q3_K_M.gguf | Q3_K_M | 39.30GB | Even lower quality. |
| bigstral-12b-32k-8xMoE-Q3_K_S.gguf | Q3_K_S | 35.62GB | Low quality, not recommended. |
| bigstral-12b-32k-8xMoE-Q2_K.gguf | Q2_K | 30.17GB | Extremely low quality, not recommended. |
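
To help choose a single file from the table, here is a minimal Python sketch that picks the largest quant whose file size fits a given memory budget and builds a direct download link for it. Several things here are assumptions, not part of this card: the helper names `pick_quant` and `download_url` are hypothetical, the repo id `bartowski/bigstral-12b-32k-8xMoE-GGUF` and the `resolve/main` URL pattern are assumed from the usual Hugging Face layout, and file size is only a rough lower bound on the memory actually needed (context and runtime buffers come on top, hence the `headroom_gb` parameter).

```python
from typing import Optional, Tuple

# Assumption: the GGUF files live in this repo on the "main" branch.
REPO_ID = "bartowski/bigstral-12b-32k-8xMoE-GGUF"
BASE_NAME = "bigstral-12b-32k-8xMoE"

# (quant type, file size in GB), copied from the table above.
QUANTS = [
    ("Q8_0", 86.63),
    ("Q6_K", 67.00),
    ("Q5_K_M", 58.00),
    ("Q5_K_S", 56.25),
    ("Q5_0", 56.25),
    ("Q4_K_M", 49.60),
    ("Q4_K_S", 46.70),
    ("Q4_0", 46.13),
    ("Q3_K_L", 42.16),
    ("Q3_K_M", 39.30),
    ("Q3_K_S", 35.62),
    ("Q2_K", 30.17),
]


def pick_quant(budget_gb: float, headroom_gb: float = 0.0) -> Optional[Tuple[str, float]]:
    """Return the largest (quant, size) whose size plus headroom fits the budget.

    Returns None if even the smallest file does not fit.
    """
    fitting = [q for q in QUANTS if q[1] + headroom_gb <= budget_gb]
    if not fitting:
        return None
    # On equal sizes, max() keeps the entry listed first in the table.
    return max(fitting, key=lambda q: q[1])


def download_url(quant: str) -> str:
    """Build a direct link to one file (assumed Hugging Face URL pattern)."""
    return f"https://huggingface.co/{REPO_ID}/resolve/main/{BASE_NAME}-{quant}.gguf"


if __name__ == "__main__":
    choice = pick_quant(budget_gb=48.0, headroom_gb=1.0)
    if choice is None:
        print("No quant fits the given budget.")
    else:
        quant, size = choice
        print(f"{quant} ({size:.2f}GB): {download_url(quant)}")
```

Picking by size alone ignores the quality notes in the Description column, so treat the result as a starting point: for example, the older Q5_0 and Q4_0 formats are marked as generally not recommended even when they fit.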
Want to support my work? Visit my ko-fi page here: https://ko-fi.com/bartowski
Base model for bartowski/bigstral-12b-32k-8xMoE-GGUF: mistralai/Mistral-7B-Instruct-v0.2