Update README.md
First of all, we would like to express our gratitude to [PartAI](https://huggingface.co/PartAI) for their efforts in expanding large language models for the Persian language by releasing the ["Dorna"](https://huggingface.co/PartAI/Dorna-Llama3-8B-Instruct) model.
The quantized version of the "Dorna" language model requires only ~6GB of GPU memory to load, while the original model requires ~40GB.
This model is based on the AWQ quantization method, which reduces the model's size with minimal loss of accuracy by changing the data type of its weights.
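For illustration, below is a minimal sketch of how an AWQ checkpoint like this can be loaded with the `transformers` library; it assumes the `autoawq` package is installed, and the repository id is a placeholder rather than the actual repo name.

```python
# Minimal loading sketch. Assumptions: the AWQ weights are published on the
# Hugging Face Hub, and the repository id below is a placeholder — replace it
# with the actual quantized repo. Requires the `autoawq` package.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "your-org/Dorna-Llama3-8B-Instruct-AWQ"  # hypothetical repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)

# AWQ checkpoints load through the standard transformers API; the quantized
# weights are what keep the footprint near ~6GB instead of ~40GB.
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    device_map="auto",
)

# Dorna is an instruct model, so prompt it through the chat template.
messages = [{"role": "user", "content": "پایتخت ایران کجاست؟"}]  # "What is the capital of Iran?"
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(input_ids, max_new_tokens=64)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
```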