azhiboedova committed
Commit 6500162
Parent(s): 2be511f
Update README.md

README.md CHANGED
@@ -16,11 +16,11 @@ tags:
 
 **Model Comparison: Quantized vs Basic Model**
 
-| Model Type | Meta-Llama-3.1-8B-Instruct | Meta-Llama-3.1-
-
-| Parameters | 8.03B | 2.04B
-| Peak Memory Usage | 20.15 GB | 4.22 GB
-| MMLU Accuracy | 60.9% | 45.5%
+| Model Type        | Meta-Llama-3.1-8B-Instruct | Meta-Llama-3.1-2B-Instruct-AQLM-2Bit-1x16 |
+|-------------------|----------------------------|-------------------------------------------|
+| Parameters        | 8.03B                      | 2.04B                                     |
+| Peak Memory Usage | 20.15 GB                   | 4.22 GB                                   |
+| MMLU Accuracy     | 60.9%                      | 45.5%                                     |
 
 **Model Architecture**
 The Llama 3.1 8B model is a state-of-the-art language model designed for a wide range of conversational and text generation tasks. By applying Additive Quantization of Language Models (AQLM), a post-training compression method developed by Yandex Research, the model's memory footprint has been reduced several-fold at the cost of some accuracy (see the MMLU numbers above). AQLM represents groups of weights as sums of vectors from small learned codebooks; the 2Bit-1x16 suffix indicates one codebook with 16-bit codes, for roughly 2 bits per weight.
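A sketch of how the quantized checkpoint might be used and sanity-checked. The repo id below is an assumption (author name plus the model name from the table), and loading requires the `transformers` library with the `aqlm` package installed; the import is deferred into the function so the arithmetic helper stands alone. `packed_code_gib` estimates only the packed weight codes; it is not the full peak-memory figure from the table.

```python
def load_aqlm_model(repo_id: str = "azhiboedova/Meta-Llama-3.1-2B-Instruct-AQLM-2Bit-1x16"):
    """Load an AQLM-quantized checkpoint via Hugging Face transformers.

    Assumes `transformers`, `aqlm`, and `accelerate` are installed;
    the repo id above is a guess based on this commit's author and model name.
    """
    from transformers import AutoModelForCausalLM, AutoTokenizer  # deferred import

    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    model = AutoModelForCausalLM.from_pretrained(
        repo_id,
        torch_dtype="auto",   # keep the dtypes stored in the checkpoint
        device_map="auto",    # place layers on available device(s)
    )
    return tokenizer, model


def packed_code_gib(n_params: float, bits_per_weight: float = 2.0) -> float:
    """Rough size in GiB of the packed weight codes alone."""
    return n_params * bits_per_weight / 8 / 2**30
```

At 2 bits per weight, `packed_code_gib(8.03e9)` comes to about 1.87 GiB; the remainder of the 4.22 GB peak in the table is codebooks, unquantized layers (embeddings, norms), and runtime buffers.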