TheBloke committed on
Commit d3cee1e
1 Parent(s): 5920eb5

Update README.md

Files changed (1)
  1. README.md +4 -4
README.md CHANGED
@@ -56,10 +56,10 @@ bin/falcon -m /path/to/Falcon-40b-Instruct.ggmlv3.q4_0.bin -t 10 -n 200 -p "writ
  ## Provided files
  | Name | Quant method | Bits | Size | Max RAM required | Use case |
  | ---- | ---- | ---- | ---- | ---- | ----- |
- | Falcon-40b-Instruct.ggmlv3.q4_0.bin | q4_0 | 4 | 23.54 GB | 26.04 GB | Original llama.cpp quant method, 4-bit. |
- | Falcon-40b-Instruct.ggmlv3.q4_1.bin | q4_1 | 4 | 26.15 GB | 28.65 GB | Original llama.cpp quant method, 4-bit. Higher accuracy than q4_0 but not as high as q5_0. However has quicker inference than q5 models. |
- | Falcon-40b-Instruct.ggmlv3.q5_0.bin | q5_0 | 5 | 28.77 GB | 31.27 GB | Original llama.cpp quant method, 5-bit. Higher accuracy, higher resource usage and slower inference. |
- | Falcon-40b-Instruct.ggmlv3.q5_1.bin | q5_1 | 5 | 31.38 GB | 33.88 GB | Original llama.cpp quant method, 5-bit. Even higher accuracy, resource usage and slower inference. |
+ | Falcon-40b-Instruct.ggmlv3.q4_0.bin | q4_0 | 4 | 23.54 GB | 26.04 GB | 4-bit. |
+ | Falcon-40b-Instruct.ggmlv3.q4_1.bin | q4_1 | 4 | 26.15 GB | 28.65 GB | 4-bit. Higher accuracy than q4_0 but not as high as q5_0. However has quicker inference than q5 models. |
+ | Falcon-40b-Instruct.ggmlv3.q5_0.bin | q5_0 | 5 | 28.77 GB | 31.27 GB | 5-bit. Higher accuracy, higher resource usage and slower inference. |
+ | Falcon-40b-Instruct.ggmlv3.q5_1.bin | q5_1 | 5 | 31.38 GB | 33.88 GB | 5-bit. Even higher accuracy, resource usage and slower inference. |
 
  A q8_0 file will be provided shortly. There is currently an issue preventing it from working. Once this is fixed, it will be uploaded.
 
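
As a reading aid for the "Provided files" table in this diff, here is a minimal sketch of choosing a file by RAM budget. The sizes and "Max RAM required" figures are copied from the table. The function name `pick_quant` and the accuracy ordering (q4_0, q4_1, q5_0, q5_1, as the "Use case" column suggests) are assumptions made for illustration, not part of the README.

```python
# Figures copied from the "Provided files" table, in GB.
# Assumed order: lowest to highest accuracy, following the "Use case" column.
QUANT_FILES = [
    ("Falcon-40b-Instruct.ggmlv3.q4_0.bin", 23.54, 26.04),
    ("Falcon-40b-Instruct.ggmlv3.q4_1.bin", 26.15, 28.65),
    ("Falcon-40b-Instruct.ggmlv3.q5_0.bin", 28.77, 31.27),
    ("Falcon-40b-Instruct.ggmlv3.q5_1.bin", 31.38, 33.88),
]


def pick_quant(available_ram_gb):
    """Return the most accurate file whose max RAM fits the budget, or None.

    Hypothetical helper: it only compares the table's "Max RAM required"
    column against the given budget.
    """
    best = None
    for name, _size_gb, max_ram_gb in QUANT_FILES:
        if max_ram_gb <= available_ram_gb:
            best = name  # later entries are assumed to be more accurate
    return best
```

For example, `pick_quant(32)` selects the q5_0 file, because its listed 31.27 GB fits in 32 GB while the q5_1 file's 33.88 GB does not. In every row the "Max RAM required" figure is the file size plus 2.5 GB.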