eachadea committed
Commit
376d070
1 Parent(s): b6f8c1d

Update README.md

Files changed (1)
  1. README.md +14 -13
README.md CHANGED
@@ -3,19 +3,20 @@ license: apache-2.0
 inference: true
 ---
 
-**NOTE: This GGML conversion is primarily for use with llama.cpp.**
-- 7B parameters
-- 4-bit quantized
-- Based on version 1.1
-- Used best available quantization for each format
-- **Choosing between q4_0, q4_1, and q4_2:**
-  - 4_0 is the fastest. The quality is the poorest.
-  - 4_1 is slower. The quality is noticeably better.
-  - 4_2 generally offers the best speed-to-quality ratio. The drawback is that the format is WIP.
-
-- 13B version of this can be found here: https://huggingface.co/eachadea/ggml-vicuna-13b-1.1
-<br>
-<br>
+**NOTE: This GGML conversion is primarily for use with llama.cpp.**
+- PR #896 was used for q4_0. Everything else is latest as of upload time.
+- A warning for q4_2 and q4_3: these formats are WIP. Do not expect any kind of backwards compatibility until they are finalized.
+- The 13B version can be found here: https://huggingface.co/eachadea/ggml-vicuna-13b-1.1
+- **Choosing the right model:**
+  - `ggml-vicuna-7b-1.1-q4_0` - Fast, but lacks accuracy.
+  - `ggml-vicuna-7b-1.1-q4_1` - More accurate, but slower.
+
+  - `ggml-vicuna-7b-1.1-q4_2` - Essentially a better `q4_0`: similarly fast, but more accurate.
+  - `ggml-vicuna-7b-1.1-q4_3` - Essentially a better `q4_1`: more accurate, but still slower.
+
+  - `ggml-vicuna-7b-1.0-uncensored` - Available in `q4_2` and `q4_3`; an uncensored/unfiltered variant of the model. It is based on the previous release and still uses the `### Human:` syntax. Avoid it unless you specifically need it.
+
+---
 
 # Vicuna Model Card
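
The file-naming scheme in the list above lends itself to scripting when picking a download. A minimal sketch, under stated assumptions: the `pick_quant` helper is hypothetical (not part of this repo), the `.bin` file names mirror the list above, and the commented-out invocation assumes a `main` binary built from llama.cpp with its standard `-m`, `-p`, and `-n` flags.

```shell
#!/bin/sh
# Hypothetical helper: map a priority (speed vs. accuracy) to a quantization
# suffix, following the guidance in the model list above.
pick_quant() {
  case "$1" in
    speed)    echo "q4_2" ;;  # essentially a better q4_0: similarly fast, more accurate
    accuracy) echo "q4_3" ;;  # essentially a better q4_1: more accurate, still slower
    *)        echo "q4_0" ;;  # conservative default if the WIP formats are a concern
  esac
}

MODEL="ggml-vicuna-7b-1.1-$(pick_quant speed).bin"
echo "$MODEL"  # → ggml-vicuna-7b-1.1-q4_2.bin

# Typical llama.cpp invocation (assumes ./main built from llama.cpp):
# ./main -m "$MODEL" -p "Tell me about Vicuna." -n 128
```

Note that q4_2 and q4_3 are WIP formats (see the warning above), so files in those formats may need re-downloading after breaking changes in llama.cpp.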