Update README.md
README.md CHANGED
@@ -56,10 +56,10 @@ bin/falcon -m /path/to/Falcon-40b-Instruct.ggmlv3.q4_0.bin -t 10 -n 200 -p "writ
## Provided files
| Name | Quant method | Bits | Size | Max RAM required | Use case |
| ---- | ---- | ---- | ---- | ---- | ----- |
-| Falcon-40b-Instruct.ggmlv3.q4_0.bin | q4_0 | 4 | 23.54 GB | 26.04 GB |
-| Falcon-40b-Instruct.ggmlv3.q4_1.bin | q4_1 | 4 | 26.15 GB | 28.65 GB |
-| Falcon-40b-Instruct.ggmlv3.q5_0.bin | q5_0 | 5 | 28.77 GB | 31.27 GB |
-| Falcon-40b-Instruct.ggmlv3.q5_1.bin | q5_1 | 5 | 31.38 GB | 33.88 GB |
+| Falcon-40b-Instruct.ggmlv3.q4_0.bin | q4_0 | 4 | 23.54 GB | 26.04 GB | 4-bit. |
+| Falcon-40b-Instruct.ggmlv3.q4_1.bin | q4_1 | 4 | 26.15 GB | 28.65 GB | 4-bit. Higher accuracy than q4_0 but not as high as q5_0. However has quicker inference than q5 models. |
+| Falcon-40b-Instruct.ggmlv3.q5_0.bin | q5_0 | 5 | 28.77 GB | 31.27 GB | 5-bit. Higher accuracy, higher resource usage and slower inference. |
+| Falcon-40b-Instruct.ggmlv3.q5_1.bin | q5_1 | 5 | 31.38 GB | 33.88 GB | 5-bit. Even higher accuracy, resource usage and slower inference. |

A q8_0 file will be provided shortly. There is currently an issue preventing it from working. Once this is fixed, it will be uploaded.
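For reference, any file from the table can be dropped into the `bin/falcon` invocation shown in the hunk header above. A minimal sketch, assuming the same build and flags; the model path and the prompt text here are placeholders, not values from this commit:

```
# Sketch only: point -m at whichever quant file you downloaded from the table,
# and replace the prompt with your own text.
bin/falcon -m /path/to/Falcon-40b-Instruct.ggmlv3.q5_0.bin -t 10 -n 200 -p "Your prompt here"
```

Note that each Max RAM figure in the table is the file size plus roughly 2.5 GB, so a reasonable rule of thumb is to pick the largest quant whose RAM figure fits your available memory.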