---
license: other
---

# Model Card for llama-13b-hf-35q_4bit-128g_WVU

## Model Description

`llama-13b-hf-35q_4bit-128g_WVU` is a model based on the Llama architecture with 13 billion parameters.
In this model, the first 35 decoder layers have been quantized with the [`gptq`](https://github.com/qwopqwop200/GPTQ-for-LLaMa) method, using 4-bit precision and a group size of 128.
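
For reference, the snippet below is a minimal sketch of 4-bit, 128-group GPTQ quantization expressed through the `transformers` `GPTQConfig` integration. This is not the exact pipeline used for this checkpoint (the card links the GPTQ-for-LLaMa repository, and only the first 35 decoder layers were quantized here); the base model id and calibration dataset are illustrative assumptions.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, GPTQConfig

# Illustrative base checkpoint; the actual base model used here is an assumption.
base_model = "huggyllama/llama-13b"

tokenizer = AutoTokenizer.from_pretrained(base_model)

# 4-bit GPTQ with a group size of 128, calibrated on the "c4" dataset.
# Note: this quantizes every decoder layer; quantizing only the first 35
# layers, as done for this model, requires the layer-wise GPTQ-for-LLaMa flow.
quant_config = GPTQConfig(bits=4, group_size=128, dataset="c4", tokenizer=tokenizer)

model = AutoModelForCausalLM.from_pretrained(
    base_model,
    quantization_config=quant_config,
    device_map="auto",
)
```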

The last 5 decoder layers (1/8 of the decoder layers) and `lm_head` have then been fine-tuned on the [wizard_vicuna_70k_unfiltered dataset](https://huggingface.co/datasets/ehartford/wizard_vicuna_70k_unfiltered) for 1 epoch.
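
As a rough illustration of that fine-tuning setup, the sketch below freezes everything except the last 5 decoder layers and `lm_head` in a standard Llama checkpoint loaded with `transformers`. The model id is an assumption, and the actual training loop (data loading, optimizer, 1 epoch over the dataset) is omitted.

```python
import torch
from transformers import AutoModelForCausalLM

# Illustrative model id; the real partially-quantized checkpoint is prepared differently.
model = AutoModelForCausalLM.from_pretrained(
    "huggyllama/llama-13b", torch_dtype=torch.float16
)

# Freeze all parameters first.
for param in model.parameters():
    param.requires_grad = False

# Unfreeze the last 5 decoder layers and lm_head for fine-tuning.
for layer in model.model.layers[-5:]:
    for param in layer.parameters():
        param.requires_grad = True
for param in model.lm_head.parameters():
    param.requires_grad = True

trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(f"trainable parameters: {trainable:,}")
```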

## Note