---
language:
- en
library_name: transformers
license: mit
quantized_by: mradermacher
---
## About
static quants of https://huggingface.co/NobodyExistsOnTheInternet/Llama-2-70b-x8-MoE-clown-truck
How did so many fit into that?
## Usage
If you are unsure how to use GGUF files, refer to one of TheBloke's READMEs for more details, including on how to concatenate multi-part files.
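For example, here is a minimal Python sketch of concatenating split GGUF parts into a single file. The part filenames below are assumptions for illustration; use the actual filenames listed in this repository, in order.

```python
# Sketch: concatenate multi-part GGUF files into one file.
# The part names below are assumed for illustration; check the repo's file listing.
import shutil

parts = [
    "Llama-2-70b-x8-MoE-clown-truck.Q2_K.gguf.part1of4",  # assumed naming
    "Llama-2-70b-x8-MoE-clown-truck.Q2_K.gguf.part2of4",
    "Llama-2-70b-x8-MoE-clown-truck.Q2_K.gguf.part3of4",
    "Llama-2-70b-x8-MoE-clown-truck.Q2_K.gguf.part4of4",
]

with open("Llama-2-70b-x8-MoE-clown-truck.Q2_K.gguf", "wb") as out:
    for part in parts:
        with open(part, "rb") as f:
            # Stream the bytes so a ~170 GB quant never has to fit in RAM.
            shutil.copyfileobj(f, out)
```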
## Provided Quants
(sorted by size, not necessarily quality. IQ-quants are often preferable over similarly sized non-IQ quants)
| Link | Type | Size/GB | Notes |
|:---|:---|---:|:---|
| PART 1 PART 2 PART 3 PART 4 | Q2_K | 170.8 | |
| PART 1 PART 2 PART 3 PART 4 PART 5 | Q3_K_M | 223.1 | lower quality |
| PART 1 PART 2 PART 3 PART 4 PART 5 PART 6 | Q4_K_M | 282.0 | fast, medium quality |
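As a hedged sketch of fetching the parts of one quant before concatenating them as shown in the Usage section, the `huggingface_hub` library can be used. The repo id and filenames below are assumptions; adjust them to the actual file listing.

```python
# Sketch: download all parts of the Q2_K quant from the Hub.
# repo_id and filenames are assumptions for illustration.
from huggingface_hub import hf_hub_download

repo_id = "mradermacher/Llama-2-70b-x8-MoE-clown-truck-GGUF"  # assumed repo id
filenames = [
    f"Llama-2-70b-x8-MoE-clown-truck.Q2_K.gguf.part{i}of4" for i in range(1, 5)
]

local_paths = [hf_hub_download(repo_id=repo_id, filename=name) for name in filenames]
print(local_paths)  # concatenate these in order, as in the Usage example above
```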
Here is a handy graph by ikawrakow comparing some lower-quality quant types (lower is better):
And here are Artefact2's thoughts on the matter: https://gist.github.com/Artefact2/b5f810600771265fc1e39442288e8ec9