Update README.md
README.md
CHANGED
@@ -6,7 +6,11 @@ base_model:
 - Nexusflow/Athene-V2-Chat
 tags:
 - awq
+- Athene
+- Chat
+pipeline_tag: text-generation
+library_name: transformers
 ---
 # Athene-V2-Chat AWQ 4-Bit Quantized Version
 
-This repository provides the AWQ 4-bit quantized version of the Athene-V2-Chat model, originally developed by Nexusflow. This model's weights are padded with zeros before quantization to ensure compatibility with multi-GPU tensor parallelism by resolving divisibility constraints. The padding minimally impacts computation while enabling efficient scaling across multiple GPUs.
+This repository provides the AWQ 4-bit quantized version of the Athene-V2-Chat model, originally developed by Nexusflow. This model's weights are padded with zeros before quantization to ensure compatibility with multi-GPU tensor parallelism by resolving divisibility constraints. The padding minimally impacts computation while enabling efficient scaling across multiple GPUs.
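
Note on the padding described in the README text: the commit does not say how the padded size was chosen, so the following is only a minimal sketch of one plausible reading. It assumes the constraint is that a weight dimension must split into `tp_size` equal shards, each holding a whole number of quantization groups of `group_size` elements. The helper names (`padded_size`, `pad_with_zeros`, `dot`) and the toy values are hypothetical and are not taken from this repository.

```python
def padded_size(dim: int, tp_size: int, group_size: int) -> int:
    """Smallest size >= dim that divides evenly into tp_size shards,
    each made of whole quantization groups (assumed constraint)."""
    multiple = tp_size * group_size
    # Integer ceiling division, then scale back up to a multiple.
    return -(-dim // multiple) * multiple


def pad_with_zeros(values: list[float], target: int) -> list[float]:
    """Append zeros until the list has `target` elements."""
    return values + [0.0] * (target - len(values))


def dot(a: list[float], b: list[float]) -> float:
    """Plain dot product, standing in for one output of a linear layer."""
    return sum(x * y for x, y in zip(a, b))


# Toy example: a dimension of 5 cannot be split across 2 GPUs in groups of 2.
activations = [0.5, -1.0, 2.0, 3.0, 1.5]
weights = [1.0, 2.0, -0.5, 0.25, 4.0]

target = padded_size(len(weights), tp_size=2, group_size=2)
padded_activations = pad_with_zeros(activations, target)
padded_weights = pad_with_zeros(weights, target)

# The extra positions contribute 0 * 0 to the sum, so the result is unchanged.
assert dot(activations, weights) == dot(padded_activations, padded_weights)
```

The zero entries add nothing to the dot product, which is consistent with the README's statement that the padding affects the computation only minimally; the cost is the extra memory and multiply-adds for the padded positions.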