MiniMaid-L1 is deprecated and is kept only for archival purposes, and so you can revert to it if the latest model underperforms!

GGUF Version

GGUF with quants, allowing you to run the model in KoboldCPP and other AI environments!
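For instance, here is a minimal sketch of loading one of these quants locally with llama-cpp-python (one of several environments that read GGUF). The filename is a placeholder, so substitute the actual file you downloaded; KoboldCPP can likewise open the same file directly from its launcher.

```python
# Minimal sketch: run a GGUF quant with llama-cpp-python.
# "MiniMaid_L1.Q4_K_M.gguf" is a placeholder path; point it at the
# quant file you actually downloaded from this repo.
from llama_cpp import Llama

llm = Llama(
    model_path="MiniMaid_L1.Q4_K_M.gguf",
    n_ctx=2048,        # context window; raise it if your RAM/VRAM allows
    n_gpu_layers=-1,   # offload all layers to GPU; use 0 for CPU-only
)

out = llm("Hello! Introduce yourself in one sentence.", max_tokens=64)
print(out["choices"][0]["text"])
```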

Quantizations:

| Quant Type | Benefits | Cons |
|------------|----------|------|
| Q4_K_M | ✅ Smallest size (fastest inference) | ❌ Lowest accuracy compared to other quants |
|        | ✅ Requires the least VRAM/RAM | ❌ May struggle with complex reasoning |
|        | ✅ Ideal for edge devices & low-resource setups | ❌ Can produce slightly degraded text quality |
| Q5_K_M | ✅ Better accuracy than Q4 while still compact | ❌ Slightly larger model size than Q4 |
|        | ✅ Good balance between speed and precision | ❌ Needs a bit more VRAM than Q4 |
|        | ✅ Works well on mid-range GPUs | ❌ Still not as accurate as higher-bit models |
| Q8_0   | ✅ Highest accuracy (closest to the full model) | ❌ Requires significantly more VRAM/RAM |
|        | ✅ Best for complex reasoning & detailed outputs | ❌ Slower inference compared to Q4 & Q5 |
|        | ✅ Suitable for high-end GPUs & serious workloads | ❌ Larger file size (takes more storage) |
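A rough rule of thumb for choosing: GGUF file size ≈ parameter count × bits per weight ÷ 8, so at 1.24B parameters Q4_K_M lands around 0.7–0.8 GB, Q5_K_M around 0.9 GB, and Q8_0 around 1.3 GB, plus some working memory for the context window. Below is a minimal sketch of fetching one quant with the huggingface_hub library; the GGUF filename is an assumption, so check the repo's file list for the exact name:

```python
# Sketch: download one quant file from this repo via the Hugging Face Hub.
# The filename below is hypothetical; verify the exact GGUF name in the
# repository's file listing before running.
from huggingface_hub import hf_hub_download

path = hf_hub_download(
    repo_id="N-Bot-Int/MiniMaid_L1-GGUF",
    filename="MiniMaid_L1.Q4_K_M.gguf",  # hypothetical filename
)
print(f"Saved to: {path}")
```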

Model Details:

Read the full model details on the Hugging Face model card: Model Detail Here!

Model size: 1.24B params
Architecture: llama