
GGUF versions of OpenLLaMa 3B v2

Newer quantizations

There are now more quantization types in llama.cpp, some below 4 bits. These are currently not well supported for this model for technical reasons. If you want to use them, you have to build llama.cpp yourself (from build 829 (ff5d58f) onwards) with the LLAMA_QKK_64 Make or CMake option enabled (see PR #2001). You can then quantize the F16 or the Q8_0 version to the type you want.
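As a sketch, the build-and-quantize steps above might look like this; the file names are placeholders, so adjust them to your local checkout and model files:

```shell
# Clone and build llama.cpp with the QK_K=64 variant enabled,
# as required for the sub-4-bit quantization types mentioned above.
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
make LLAMA_QKK_64=1

# Equivalent CMake invocation:
# cmake -B build -DLLAMA_QKK_64=ON && cmake --build build

# Quantize the F16 GGUF (placeholder name) to e.g. Q2_K.
./quantize open-llama-3b-v2-f16.gguf open-llama-3b-v2-q2_k.gguf q2_k
```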

Perplexity on wiki.test.406

Coming soon...
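Until those numbers are published, perplexity can be computed locally with llama.cpp's perplexity tool; the model and dataset file names below are placeholders:

```shell
# Assumes a llama.cpp build and a local copy of the wikitext-2 test set.
./perplexity -m open-llama-3b-v2-q4_0.gguf -f wiki.test.raw
```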

Format: GGUF
Model size: 3.43B params
Architecture: llama