## About

Static quants of https://huggingface.co/Vezora/Mistral-22B-v0.2. Weighted/imatrix (iQ) quants, by Richard Erkhov, can be found here: https://huggingface.co/RichardErkhov/Vezora_-_Mistral-22B-v0.2-gguf

## Provided Quants

| Filename | Quant type | File size | Description |
| -------- | ---------- | --------- | ----------- |
| Mistral-22B-v0.2-Q5_K_M.gguf | Q5_K_M | 15.71 GB | High quality, recommended. |
| Mistral-22B-v0.2-Q4_K_M.gguf | Q4_K_M | 13.33 GB | Good quality, uses about 4.83 bits per weight, recommended. |
| Mistral-22B-v0.2-Q4_K_S.gguf | Q4_K_S | 12.65 GB | Slightly lower quality than Q4_K_M; fastest; best choice for devices with 16 GB RAM, recommended. |
| Mistral-22B-v0.2-Q3_K_M.gguf | Q3_K_M | 10.75 GB | Even lower quality. |
| Mistral-22B-v0.2-Q2_K.gguf | Q2_K | 8.26 GB | Very low quality. |
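
A minimal sketch of downloading one of the quants above and running it locally with `llama-cpp-python`. The `repo_id` below is a placeholder (this card does not state the repo's Hub id), and `n_ctx`/`n_gpu_layers` are illustrative values to tune for your hardware:

```python
from huggingface_hub import hf_hub_download
from llama_cpp import Llama

# Fetch the Q4_K_S file (12.65 GB), the suggested pick for 16 GB RAM devices.
model_path = hf_hub_download(
    repo_id="your-username/Mistral-22B-v0.2-GGUF",  # placeholder: use this repo's actual Hub id
    filename="Mistral-22B-v0.2-Q4_K_S.gguf",
)

# Load the quantized model; offload all layers to GPU if one is available.
llm = Llama(model_path=model_path, n_ctx=4096, n_gpu_layers=-1)

# Run a short completion to verify the model loaded correctly.
out = llm("Q: What is GGUF? A:", max_tokens=64)
print(out["choices"][0]["text"])
```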
