---
license: mit
language:
- en
---
[![Hierholzer Banner](https://tvtime.us/static/images/LLAMA3.1.jpg)](#)
# Model
Here is a quantized version of Llama-3.1-70B-Instruct in the GGUF format.<br>
GGUF is designed for use with GGML and other executors.<br>
GGUF was developed by @ggerganov, who is also the developer of llama.cpp, a popular C/C++ LLM inference framework.<br>
Models initially developed in frameworks like PyTorch can be converted to GGUF format for use with those engines, as sketched below.<br>
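As a rough sketch of that conversion path, assuming a local llama.cpp checkout with its `convert_hf_to_gguf.py` script and a built `llama-quantize` binary (all paths and filenames below are illustrative, not files in this repo):
```python
# Minimal sketch: export a Hugging Face checkpoint to GGUF, then quantize it.
# Assumes a local llama.cpp checkout; paths/filenames are placeholders.
import subprocess

hf_model_dir = "Meta-Llama-3.1-70B-Instruct"   # local HF checkpoint (assumption)
f16_gguf = "llama-3.1-70b-instruct-f16.gguf"   # intermediate full-precision file

# Step 1: convert the PyTorch/safetensors weights to a GGUF file.
subprocess.run(
    ["python", "llama.cpp/convert_hf_to_gguf.py", hf_model_dir,
     "--outfile", f16_gguf, "--outtype", "f16"],
    check=True,
)

# Step 2: quantize the GGUF file down to Q4_K_M.
subprocess.run(
    ["llama.cpp/llama-quantize", f16_gguf,
     "llama-3.1-70b-instruct-Q4_K_M.gguf", "Q4_K_M"],
    check=True,
)
```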
## Uploaded Quantization Types
Currently, I have uploaded three quantized versions (a usage sketch follows the list):
- [x] Q4_K_M ~ *Recommended*
- [x] Q5_K_M ~ *Recommended*
- [x] Q8_0 ~ *NOT Recommended*
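For example, a minimal usage sketch with the `llama-cpp-python` bindings (the filename is an assumed placeholder for whichever quant you download):
```python
# Minimal sketch: run the Q4_K_M quant with the llama-cpp-python bindings.
# The model_path below is illustrative; point it at your downloaded .gguf file.
from llama_cpp import Llama

llm = Llama(
    model_path="Llama-3.1-70B-Instruct-Q4_K_M.gguf",  # assumed local filename
    n_ctx=8192,        # context window; lower it to fit in less RAM
    n_gpu_layers=-1,   # offload all layers to GPU if one is available
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Summarize what GGUF is in one sentence."}]
)
print(out["choices"][0]["message"]["content"])
```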
### All Quantization Types Possible
Here are all of the quantization types that are possible; the numeric IDs are the ones used by llama.cpp's `llama-quantize` tool. Let me know if you need any other versions.
| **#** | **Quantization Type** | **Description** |
|------:|:---------------------:|------------------------------------------------------------------|
| 2 | Q4_0 | small, very high quality loss - legacy, prefer using Q3_K_M |
| 3 | Q4_1 | small, substantial quality loss - legacy, prefer using Q3_K_L |
| 8 | Q5_0 | medium, balanced quality - legacy, prefer using Q4_K_M |
| 9 | Q5_1 | medium, low quality loss - legacy, prefer using Q5_K_M |
| 10 | Q2_K | smallest, extreme quality loss - *NOT Recommended* |
| 12 | Q3_K | alias for Q3_K_M |
| 11 | Q3_K_S | very small, very high quality loss |
| 12 | Q3_K_M | very small, high quality loss |
| 13 | Q3_K_L | small, high quality loss |
| 15 | Q4_K | alias for Q4_K_M |
| 14 | Q4_K_S | small, some quality loss |
| 15 | Q4_K_M | medium, balanced quality - *Recommended* |
| 17 | Q5_K | alias for Q5_K_M |
| 16 | Q5_K_S | large, low quality loss - *Recommended* |
| 17 | Q5_K_M | large, very low quality loss - *Recommended* |
| 18 | Q6_K | very large, very low quality loss |
| 7 | Q8_0 | very large, extremely low quality loss |
| 1 | F16 | extremely large, virtually no quality loss - *NOT Recommended* |
| 0 | F32 | absolutely huge, lossless - *NOT Recommended* |
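To fetch one of the uploaded quants directly, here is a minimal sketch using `huggingface_hub` (the `repo_id` and `filename` below are assumed placeholders; check this repo's "Files and versions" tab for the exact names):
```python
# Minimal sketch: download a single quantized .gguf file from the Hub.
# repo_id and filename are illustrative placeholders, not confirmed names.
from huggingface_hub import hf_hub_download

path = hf_hub_download(
    repo_id="Hierholzer/Llama-3.1-70B-Instruct-GGUF",   # assumed repo id
    filename="Llama-3.1-70B-Instruct-Q4_K_M.gguf",      # assumed file name
)
print("Downloaded to:", path)
```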
## Uses
By using the GGUF version of Llama-3.1-70B-Instruct, you can run this LLM with significantly fewer resources than the non-quantized version requires.
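As a rough back-of-the-envelope estimate of the savings (the bits-per-weight figures below are approximate community values, not measurements of these files):
```python
# Rough sketch: approximate weight-storage footprint of a 70B-parameter model
# at different precisions. Bits-per-weight values are approximate; real GGUF
# files also carry metadata and vary slightly with the exact tensor mix.
PARAMS = 70e9

for name, bits_per_weight in [
    ("F16", 16.0),      # full half precision
    ("Q8_0", 8.5),      # ~8 bits plus per-block scales
    ("Q5_K_M", 5.7),    # approximate
    ("Q4_K_M", 4.8),    # approximate
]:
    gb = PARAMS * bits_per_weight / 8 / 1e9
    print(f"{name:7s} ~ {gb:5.0f} GB")
```
At roughly 4.8 bits per weight, the Q4_K_M file lands near 42 GB of weights versus roughly 140 GB for F16, which is where most of the resource savings come from.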
[![Hugging Face](https://img.shields.io/badge/Hugging%20Face-FFD21E?logo=huggingface&logoColor=000)](#)
[![OS](https://img.shields.io/badge/OS-linux%2C%20windows%2C%20macOS-0078D4)](https://docs.abblix.com/docs/technical-requirements)
[![CPU](https://img.shields.io/badge/CPU-x86%2C%20x64%2C%20ARM%2C%20ARM64-FF8C00)](https://docs.abblix.com/docs/technical-requirements)
[![forthebadge](https://forthebadge.com/images/badges/license-mit.svg)](https://forthebadge.com)
[![forthebadge](https://forthebadge.com/images/badges/made-with-python.svg)](https://forthebadge.com)
[![forthebadge](https://forthebadge.com/images/badges/powered-by-electricity.svg)](https://forthebadge.com)