Note: Until KoboldCPP merges the latest llama.cpp version, these files won't work with it. Use vLLM instead until then; see the vLLM documentation page on GGUF.

These now work with KoboldCPP 1.76!

GGUF

Model size: 6.92B params

Architecture: olmoe

Quantizations: 2-bit, 3-bit, 4-bit, 5-bit, 6-bit, 8-bit
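
As a rough guide to what each quantization level means for download size and memory, the sketch below multiplies the listed parameter count (6.92B) by the nominal bit width. This is only a back-of-the-envelope estimate: it assumes every weight is stored at exactly the nominal width, which real GGUF quantization types generally do not do (they mix precisions and store scales and metadata), so actual files are typically somewhat larger. The function name `approx_size_gb` is made up for this illustration.

```python
# Rough size estimate: parameters * bits per weight / 8 bits per byte.
# Assumption: every weight uses exactly the nominal bit width, so treat
# the result as an approximate lower bound, not the real file size.

PARAMS = 6.92e9                   # parameter count listed for this model
BIT_WIDTHS = [2, 3, 4, 5, 6, 8]   # quantization levels listed above


def approx_size_gb(params: float, bits: int) -> float:
    """Return the approximate size in gigabytes (1 GB = 1e9 bytes)."""
    return params * bits / 8 / 1e9


if __name__ == "__main__":
    for bits in BIT_WIDTHS:
        print(f"{bits}-bit: ~{approx_size_gb(PARAMS, bits):.2f} GB")
```

Because the estimate ignores per-block overhead, leave some headroom beyond these figures when choosing a quantization for a given amount of RAM or VRAM.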

Model tree for allura-org/MoE-Girl-1BA-7BT-GGUF: base model, then finetuned, then quantized (this model, one of 3 quantizations)