eachadea committed on
Commit
b6f8c1d
1 Parent(s): 81f2e98

Update README.md

Files changed (1)
  1. README.md +3 -5
README.md CHANGED
@@ -7,13 +7,11 @@ inference: true
 - 7B parameters
 - 4-bit quantized
 - Based on version 1.1
-- Used PR "More accurate Q4_0 and Q4_1 quantizations #896" (should be closer in quality to unquantized)
-- Uncensored variant is available, but it's based on version 1.0 (worse quality wise)
-- For q4_2, "Q4_2 ARM #1046" was used. Will update regularly if new changes are made.
+- Used best available quantization for each format
 - **Choosing between q4_0, q4_1, and q4_2:**
 - 4_0 is the fastest. The quality is the poorest.
-- 4_1 is a lot slower. The quality is noticeably better.
-- 4_2 is almost as fast as 4_0 and about as good as 4_1 **on Apple Silicon**. On Intel/AMD it's hardly better or faster than 4_1.
+- 4_1 is slower. The quality is noticeably better.
+- 4_2 generally offers the best speed to quality ratio. The drawback is that the format is WIP.
 
 - 13B version of this can be found here: https://huggingface.co/eachadea/ggml-vicuna-13b-1.1
 <br>