GGUF versions of OpenLLaMa 3B

Newer quantizations

llama.cpp now offers additional quantization types (the k-quants), including some below 4 bits. These are not yet supported for this model, possibly because some of its weight tensors have row sizes that are not divisible by 256, the k-quant block size.
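
To make the constraint concrete, here is a quick stand-alone check, assuming the dimensions published in the OpenLLaMA 3B config (hidden_size 3200, intermediate_size 8640) and llama.cpp's k-quant block size of 256:

```python
# k-quant formats in llama.cpp pack weights in blocks of QK_K = 256,
# so every quantized tensor row length must be divisible by 256.
QK_K = 256

# Row sizes assumed from the published OpenLLaMA 3B config.
dims = {"hidden_size": 3200, "intermediate_size": 8640}

for name, size in dims.items():
    print(f"{name} = {size}: divisible by {QK_K}? {size % QK_K == 0}")
# Both report False, which is consistent with k-quants being unavailable here.
```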

Perplexity on wiki.test.406

Coming soon...

Format: GGUF
Model size: 3.43B params
Architecture: llama

Available quantizations: 4-bit, 5-bit, 8-bit, and 16-bit.
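
These files work with llama.cpp and its bindings. As a minimal sketch, one of them could be loaded with the llama-cpp-python package; the filename below is hypothetical, so substitute the actual file downloaded from this repo:

```python
# Minimal sketch using the llama-cpp-python bindings.
# "open-llama-3b-q4_0.gguf" is an assumed filename -- replace it with
# the actual GGUF file downloaded from this repo.
from llama_cpp import Llama

llm = Llama(model_path="open-llama-3b-q4_0.gguf")
out = llm("Q: What is the capital of France? A:", max_tokens=16)
print(out["choices"][0]["text"])
```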
