---
license: mit
language:
- en
---
[![Hierholzer Banner](https://tvtime.us/static/images/LLAMA3.1.jpg)](#)
# Model
Here is a quantized version of Llama-3.1-70B-Instruct in the GGUF format.<br>
GGUF is designed for use with GGML and other executors.<br>
GGUF was developed by @ggerganov, who is also the developer of llama.cpp, a popular C/C++ LLM inference framework.<br>
Models initially developed in frameworks like PyTorch can be converted to GGUF format for use with those engines, as sketched below.<br>
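As a rough sketch of that conversion path, assuming a local llama.cpp checkout with its `convert_hf_to_gguf.py` script and a built `llama-quantize` binary (all paths and filenames below are illustrative, not files in this repo):
```python
# Minimal sketch: export a Hugging Face checkpoint to GGUF, then quantize it.
# Assumes a local llama.cpp checkout; paths/filenames are placeholders.
import subprocess

hf_model_dir = "Meta-Llama-3.1-70B-Instruct"   # local HF checkpoint (assumption)
f16_gguf = "llama-3.1-70b-instruct-f16.gguf"   # intermediate full-precision file

# Step 1: convert the PyTorch/safetensors weights to a GGUF file.
subprocess.run(
    ["python", "llama.cpp/convert_hf_to_gguf.py", hf_model_dir,
     "--outfile", f16_gguf, "--outtype", "f16"],
    check=True,
)

# Step 2: quantize the GGUF file down to Q4_K_M.
subprocess.run(
    ["llama.cpp/llama-quantize", f16_gguf,
     "llama-3.1-70b-instruct-Q4_K_M.gguf", "Q4_K_M"],
    check=True,
)
```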
## Uploaded Quantization Types
Currently, I have uploaded three quantized versions (a usage sketch follows the list):
- [x] Q4_K_M ~ *Recommended*
- [x] Q5_K_M ~ *Recommended*
- [x] Q8_0 ~ *NOT Recommended*
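For example, a minimal usage sketch with the `llama-cpp-python` bindings (the filename is an assumed placeholder for whichever quant you download):
```python
# Minimal sketch: run the Q4_K_M quant with the llama-cpp-python bindings.
# The model_path below is illustrative; point it at your downloaded .gguf file.
from llama_cpp import Llama

llm = Llama(
    model_path="Llama-3.1-70B-Instruct-Q4_K_M.gguf",  # assumed local filename
    n_ctx=8192,        # context window; lower it to fit in less RAM
    n_gpu_layers=-1,   # offload all layers to GPU if one is available
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Summarize what GGUF is in one sentence."}]
)
print(out["choices"][0]["message"]["content"])
```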
### All Quantization Types Possible
Here are all of the quantization types that are possible; the numeric IDs are the ones used by llama.cpp's `llama-quantize` tool. Let me know if you need any other versions.
| **#** | **Quantization Type** | **Description** |
|------:|:---------------------:|------------------------------------------------------------------|
| 2 | Q4_0 | small, very high quality loss - legacy, prefer using Q3_K_M |
| 3 | Q4_1 | small, substantial quality loss - legacy, prefer using Q3_K_L |
| 8 | Q5_0 | medium, balanced quality - legacy, prefer using Q4_K_M |
| 9 | Q5_1 | medium, low quality loss - legacy, prefer using Q5_K_M |
| 10 | Q2_K | smallest, extreme quality loss - *NOT Recommended* |
| 12 | Q3_K | alias for Q3_K_M |
| 11 | Q3_K_S | very small, very high quality loss |
| 12 | Q3_K_M | very small, high quality loss |
| 13 | Q3_K_L | small, high quality loss |
| 15 | Q4_K | alias for Q4_K_M |
| 14 | Q4_K_S | small, some quality loss |
| 15 | Q4_K_M | medium, balanced quality - *Recommended* |
| 17 | Q5_K | alias for Q5_K_M |
| 16 | Q5_K_S | large, low quality loss - *Recommended* |
| 17 | Q5_K_M | large, very low quality loss - *Recommended* |
| 18 | Q6_K | very large, very low quality loss |
| 7 | Q8_0 | very large, extremely low quality loss |
| 1 | F16 | extremely large, virtually no quality loss - *NOT Recommended* |
| 0 | F32 | absolutely huge, lossless - *NOT Recommended* |
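To fetch one of the uploaded quants directly, here is a minimal sketch using `huggingface_hub` (the `repo_id` and `filename` below are assumed placeholders; check this repo's "Files and versions" tab for the exact names):
```python
# Minimal sketch: download a single quantized .gguf file from the Hub.
# repo_id and filename are illustrative placeholders, not confirmed names.
from huggingface_hub import hf_hub_download

path = hf_hub_download(
    repo_id="Hierholzer/Llama-3.1-70B-Instruct-GGUF",   # assumed repo id
    filename="Llama-3.1-70B-Instruct-Q4_K_M.gguf",      # assumed file name
)
print("Downloaded to:", path)
```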
## Uses
By using the GGUF version of Llama-3.1-70B-Instruct, you can run this LLM with significantly fewer resources than the non-quantized version requires.
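As a rough back-of-the-envelope estimate of the savings (the bits-per-weight figures below are approximate community values, not measurements of these files):
```python
# Rough sketch: approximate weight-storage footprint of a 70B-parameter model
# at different precisions. Bits-per-weight values are approximate; real GGUF
# files also carry metadata and vary slightly with the exact tensor mix.
PARAMS = 70e9

for name, bits_per_weight in [
    ("F16", 16.0),      # full half precision
    ("Q8_0", 8.5),      # ~8 bits plus per-block scales
    ("Q5_K_M", 5.7),    # approximate
    ("Q4_K_M", 4.8),    # approximate
]:
    gb = PARAMS * bits_per_weight / 8 / 1e9
    print(f"{name:7s} ~ {gb:5.0f} GB")
```
At roughly 4.8 bits per weight, the Q4_K_M file lands near 42 GB of weights versus roughly 140 GB for F16, which is where most of the resource savings come from.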
[![Hugging Face](https://img.shields.io/badge/Hugging%20Face-FFD21E?logo=huggingface&logoColor=000)](#)
[![OS](https://img.shields.io/badge/OS-linux%2C%20windows%2C%20macOS-0078D4)](https://docs.abblix.com/docs/technical-requirements)
[![CPU](https://img.shields.io/badge/CPU-x86%2C%20x64%2C%20ARM%2C%20ARM64-FF8C00)](https://docs.abblix.com/docs/technical-requirements)
[![forthebadge](https://forthebadge.com/images/badges/license-mit.svg)](https://forthebadge.com)
[![forthebadge](https://forthebadge.com/images/badges/made-with-python.svg)](https://forthebadge.com)
[![forthebadge](https://forthebadge.com/images/badges/powered-by-electricity.svg)](https://forthebadge.com)