---
license: llama3
tags:
- llama
- llama-3
- meta
- facebook
- gguf
---
Converted and quantized directly to GGUF with `llama.cpp` (release tag: b2843) from Meta's `Meta-Llama-3` repo on Hugging Face.
The original LLaMA 3 model files, cloned from the Meta HF repo (https://huggingface.co/meta-llama/Meta-Llama-3-70B-Instruct), are included as well.
If you have trouble downloading the models from Meta or converting them for `llama.cpp`, feel free to download this one!
### How to use `gguf-split` / model sharding demo: https://github.com/ggerganov/llama.cpp/discussions/6404
## Perplexity table on LLaMA 3 70B
Lower perplexity is better. The last column is the relative increase in perplexity over the F16 baseline. (Credit: [dranger003](https://github.com/ggerganov/llama.cpp/pull/6745#issuecomment-2093892514))
| Quantization | Size (GiB) | Perplexity (wiki.test) | Delta vs. F16 |
|--------------|------------|------------------------|-------------|
| IQ1_S | 14.29 | 9.8655 +/- 0.0625 | 248.51% |
| IQ1_M | 15.60 | 8.5193 +/- 0.0530 | 201.94% |
| IQ2_XXS | 17.79 | 6.6705 +/- 0.0405 | 135.64% |
| IQ2_XS | 19.69 | 5.7486 +/- 0.0345 | 103.07% |
| IQ2_S | 20.71 | 5.5215 +/- 0.0318 | 95.05% |
| Q2_K_S | 22.79 | 5.4334 +/- 0.0325 | 91.94% |
| IQ2_M | 22.46 | 4.8959 +/- 0.0276 | 72.35% |
| Q2_K | 24.56 | 4.7763 +/- 0.0274 | 68.73% |
| IQ3_XXS | 25.58 | 3.9671 +/- 0.0211 | 40.14% |
| IQ3_XS | 27.29 | 3.7210 +/- 0.0191 | 31.45% |
| Q3_K_S | 28.79 | 3.6502 +/- 0.0192 | 28.95% |
| IQ3_S | 28.79 | 3.4698 +/- 0.0174 | 22.57% |
| IQ3_M | 29.74 | 3.4402 +/- 0.0171 | 21.53% |
| Q3_K_M | 31.91 | 3.3617 +/- 0.0172 | 18.75% |
| Q3_K_L | 34.59 | 3.3016 +/- 0.0168 | 16.63% |
| IQ4_XS | 35.30 | 3.0310 +/- 0.0149 | 7.07% |
| IQ4_NL | 37.30 | 3.0261 +/- 0.0149 | 6.90% |
| Q4_K_S | 37.58 | 3.0050 +/- 0.0148 | 6.15% |
| Q4_K_M | 39.60 | 2.9674 +/- 0.0146 | 4.83% |
| Q5_K_S | 45.32 | 2.8843 +/- 0.0141 | 1.89% |
| Q5_K_M | 46.52 | 2.8656 +/- 0.0139 | 1.23% |
| Q6_K | 53.91 | 2.8441 +/- 0.0138 | 0.47% |
| Q8_0 | 69.83 | 2.8316 +/- 0.0138 | 0.03% |
| F16 | 131.43 | 2.8308 +/- 0.0138 | 0.00% |
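The percentage column can be reproduced from the perplexity column. The snippet below is a minimal sketch, assuming the percentage is the relative increase over the F16 perplexity (this matches the listed values when checked by hand); the function name `delta_vs_f16` is made up for illustration and is not part of `llama.cpp`.

```python
# Perplexity of the F16 baseline, copied from the table above (wiki.test).
F16_PPL = 2.8308


def delta_vs_f16(ppl: float, baseline: float = F16_PPL) -> float:
    """Return the relative perplexity increase over the baseline, in percent."""
    return (ppl / baseline - 1.0) * 100.0


# A few perplexity values copied from the table above.
samples = {"IQ1_S": 9.8655, "Q4_K_M": 2.9674, "Q8_0": 2.8316}

for name, ppl in samples.items():
    print(f"{name}: {delta_vs_f16(ppl):.2f}%")
```

Read this way, Q4_K_M costs under 5% in perplexity at roughly 30% of the F16 file size, while the 1-bit and 2-bit variants trade much larger quality losses for their smaller footprint.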
## Where to send questions or comments about the model
Instructions on how to provide feedback or comments on the model can be found in the model [README](https://github.com/meta-llama/llama3). For more technical information about generation parameters and recipes for how to use Llama 3 in applications, please go [here](https://github.com/meta-llama/llama-recipes).
## License
See the License file for Meta Llama 3 [here](https://llama.meta.com/llama3/license/) and the Acceptable Use Policy [here](https://llama.meta.com/llama3/use-policy/).