justheuristic committed on
Commit
b414226
1 Parent(s): 1ba714c

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -17,7 +17,7 @@ __The 1x16g16 models require aqlm inference library v1.1.6 or newer:__
 `pip install aqlm[gpu,cpu]>=1.1.6`
 
 
-Note that a large portion of this model are the 16-bit embeddings/logits matrices. You can significantly reduce the model footprint by quantizing these matrices, e.g. using `bitsandbytes` LLM.int8 or NF4 formats.
+Note that a large portion of this model are the 16-bit embeddings/logits matrices. You can significantly reduce the model footprint by quantizing these matrices, e.g. using `bitsandbytes` LLM.int8 or NF4 formats. This does not require additional training.
 
 
 | Model | AQLM scheme | WikiText 2 PPL | Model size, Gb | Hub link |
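To see why the 16-bit embeddings/logits matrices dominate the footprint of an otherwise heavily quantized model, here is a back-of-envelope sketch. The vocabulary and hidden sizes below are assumed Llama-like values for illustration, not figures from this README:

```python
# Rough size of the embedding + lm_head matrices at different precisions.
# vocab_size and hidden are assumed example dimensions, not from the source.
vocab_size, hidden = 32000, 4096
params = 2 * vocab_size * hidden  # input embeddings + output logits matrix

for name, bits in [("fp16", 16), ("LLM.int8", 8), ("NF4", 4)]:
    gb = params * bits / 8 / 1e9  # bits -> bytes -> gigabytes
    print(f"{name}: {gb:.2f} GB")
# fp16: 0.52 GB
# LLM.int8: 0.26 GB
# NF4: 0.13 GB
```

For a model whose AQLM-compressed transformer weights fit in roughly 2 bits per parameter, halving or quartering these half-gigabyte matrices is a noticeable share of total size, which is why the note suggests `bitsandbytes` LLM.int8 or NF4 for them.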