Commit b414226
Parent(s): 1ba714c
Update README.md
README.md CHANGED

@@ -17,7 +17,7 @@ __The 1x16g16 models require aqlm inference library v1.1.6 or newer:__
 `pip install aqlm[gpu,cpu]>=1.1.6`
 
 
-Note that a large portion of this model are the 16-bit embeddings/logits matrices. You can significantly reduce the model footprint by quantizing these matrices, e.g. using `bitsandbytes` LLM.int8 or NF4 formats.
+Note that a large portion of this model are the 16-bit embeddings/logits matrices. You can significantly reduce the model footprint by quantizing these matrices, e.g. using `bitsandbytes` LLM.int8 or NF4 formats. This does not require additional training.
 
 
 | Model | AQLM scheme | WikiText 2 PPL | Model size, Gb | Hub link |
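To illustrate what the added sentence means by quantizing the embeddings/logits matrices without additional training, here is a minimal sketch using `bitsandbytes` NF4 quantization. This is not code from the commit: the tensor below is a placeholder standing in for something like `model.lm_head.weight`, and the shapes are made up.

```python
# Sketch only (not from this commit): shrinking an fp16 logits matrix
# with bitsandbytes NF4 quantization; no retraining is involved.
import torch
import bitsandbytes.functional as bf

# Placeholder for a real embeddings/logits matrix, e.g. model.lm_head.weight.
weight = torch.randn(32000, 4096, dtype=torch.float16, device="cuda")

# Pack the matrix into 4-bit NF4 blocks; quant_state holds the per-block
# statistics needed to reverse the quantization.
q_weight, quant_state = bf.quantize_nf4(weight)

# Dequantize on demand whenever the full-precision matrix is needed.
restored = bf.dequantize_nf4(q_weight, quant_state)

print(weight.element_size() * weight.nelement())      # bytes at fp16
print(q_weight.element_size() * q_weight.nelement())  # bytes packed as NF4
```

For the LLM.int8 format mentioned alongside NF4, the analogous route would be `bitsandbytes.nn.Linear8bitLt` in place of the fp16 linear layer; either way the quantization is applied post hoc to the stored weights.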