Update README.md
README.md
CHANGED
@@ -10,10 +10,10 @@ inference: false
 - Based on version 1.1
 - Used PR "More accurate Q4_0 and Q4_1 quantizations #896" (should be closer in quality to unquantized)
 - For q4_2, "Q4_2 ARM #1046" was used. Will update regularly if new changes are made.
-- Choosing between q4_0
--
--
-
+- **Choosing between q4_0, q4_1, and q4_2:**
+- 4_0 is the fastest. The quality is the poorest.
+- 4_1 is a lot slower. The quality is noticeably better.
+- 4_2 is almost as fast as 4_0 and about as good as 4_1 **on Apple Silicon**. On Intel/AMD it's hardly better or faster than 4_1.
 
 - 7B version of this can be found here: https://huggingface.co/eachadea/ggml-vicuna-7b-1.1
 
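The guidance in the added README lines can be read as a small decision rule. The sketch below is illustrative only: `pick_quant` is a hypothetical helper that is not part of this repository or of llama.cpp, and it assumes Apple Silicon is detected as system `Darwin` with machine `arm64`.

```python
import platform


def pick_quant(machine: str, system: str, prefer_speed: bool) -> str:
    """Hypothetical helper: map the README's guidance to a variant name."""
    # Assumption: native Python on Apple Silicon reports "Darwin" / "arm64".
    on_apple_silicon = system == "Darwin" and machine.lower() in ("arm64", "aarch64")
    if on_apple_silicon:
        # Per the README: almost as fast as q4_0, about as good as q4_1.
        return "q4_2"
    # On Intel/AMD the README sees little gain from q4_2 over q4_1,
    # so the choice reduces to speed (q4_0) versus quality (q4_1).
    return "q4_0" if prefer_speed else "q4_1"


if __name__ == "__main__":
    print(pick_quant(platform.machine(), platform.system(), prefer_speed=True))
```

A Python build running under Rosetta reports `x86_64`, so the detection may need adjusting in that case.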