matrixportal/aya-23-8B-GGUF

This model was converted to GGUF format from CohereForAI/aya-23-8B using llama.cpp via ggml.ai's all-gguf-same-where space. Refer to the original model card for more details on the model.

✅ Quantized Models Download List

✨ Recommended for CPU: Q4_K_M | ⚡ Recommended for ARM CPU: Q4_0 | 🏆 Best Quality: Q8_0

| 🚀 Download | 🔢 Type | 📝 Notes |
|---|---|---|
| Download | Q2_K | Basic quantization |
| Download | Q3_K_S | Small size |
| Download | Q3_K_M | Balanced quality |
| Download | Q3_K_L | Better quality |
| Download | Q4_0 | Fast on ARM |
| Download | Q4_K_S | Fast, recommended |
| Download | Q4_K_M | Best balance |
| Download | Q5_0 | Good quality |
| Download | Q5_K_S | Balanced |
| Download | Q5_K_M | High quality |
| Download | Q6_K | 🏆 Very good quality |
| Download | Q8_0 | Fast, best quality |
| Download | F16 | Maximum accuracy |

💡 Tip: Use F16 for maximum precision when quality is critical
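A minimal sketch of running one of these quants locally with llama-cpp-python. The glob pattern, context size, and prompt below are illustrative assumptions; check the repo's file list for the exact .gguf filenames before downloading.

```python
# Minimal sketch: fetch a quantized file from the Hub and run it locally.
# Assumes `pip install llama-cpp-python huggingface_hub`.
from llama_cpp import Llama

# from_pretrained downloads the first .gguf in the repo matching the
# filename glob. The pattern here targets the CPU-recommended Q4_K_M
# quant from the table above (assumed naming; verify in the repo).
llm = Llama.from_pretrained(
    repo_id="matrixportal/aya-23-8B-GGUF",
    filename="*Q4_K_M.gguf",
    n_ctx=4096,  # context window; adjust to your hardware
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Introduce yourself briefly."}],
    max_tokens=128,
)
print(out["choices"][0]["message"]["content"])
```

Swap the filename glob for any other type in the table (e.g. `*Q8_0.gguf` for best quality) to trade file size against output quality.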

Format: GGUF | Model size: 8.03B params | Architecture: command-r


