eachadea committed
Commit 99a026d
Parent: 0ecc543

Update README.md

Files changed (1)
  1. README.md +14 -15
README.md CHANGED
@@ -3,21 +3,20 @@ license: apache-2.0
 inference: false
 ---
 
- **NOTE: This GGML conversion is primarily for use with llama.cpp.**
- - 13B parameters
- - 4-bit quantized
- - Based on version 1.1
- - Used the best available quantization for each format
- - An uncensored variant is available, but it is based on the previous release (it still uses the `###` syntax, and response quality suffers compared to v1.1)
- - **Choosing between q4_0, q4_1, and q4_2:**
-   - q4_0 is the fastest; the quality is the poorest.
-   - q4_1 is slower; the quality is noticeably better.
-   - q4_2 generally offers the best speed-to-quality ratio, but the format is still WIP.
- 
- - The 7B version can be found here: https://huggingface.co/eachadea/ggml-vicuna-7b-1.1
- 
- <br>
- <br>
+ **NOTE: This GGML conversion is primarily for use with llama.cpp.**
+ - PR #896 was used for q4_0; everything else is the latest as of upload time.
+ - A warning for q4_2 and q4_3: these formats are WIP. Do not expect any backwards compatibility until they are finalized.
+ - The 7B version can be found here: https://huggingface.co/eachadea/ggml-vicuna-7b-1.1
+ - **Choosing the right model** (see the usage sketch after the diff):
+   - `ggml-vicuna-13b-1.1-q4_0` - Fast, but lacks accuracy.
+   - `ggml-vicuna-13b-1.1-q4_1` - More accurate, but slower.
+ 
+   - `ggml-vicuna-13b-1.1-q4_2` - Essentially a better `q4_0`: similarly fast, but more accurate.
+   - `ggml-vicuna-13b-1.1-q4_3` - Essentially a better `q4_1`: more accurate, and still slower.
+ 
+   - `ggml-vicuna-13b-1.0-uncensored` - An uncensored/unfiltered variant of the model, available in `q4_2` and `q4_3`. It is based on the previous release and still uses the `### Human:` syntax; avoid it unless you specifically need it.
+ 
+ ---
 
 # Vicuna Model Card
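
As a pointer for what "use with llama.cpp" looks like in practice, here is a minimal sketch that loads one of these files through the llama-cpp-python bindings rather than the llama.cpp CLI. The local file name, context size, and the Vicuna v1.1 `USER:`/`ASSISTANT:` prompt template are illustrative assumptions, not something this commit specifies:

```python
# Minimal sketch, assuming a llama-cpp-python build that still reads
# GGML-format files and that the q4_2 file from this repo has been
# downloaded into the working directory.
from llama_cpp import Llama

llm = Llama(
    model_path="ggml-vicuna-13b-1.1-q4_2.bin",  # assumed local file name
    n_ctx=2048,                                 # assumed context window
)

# Vicuna v1.1 moved away from the "### Human:" syntax; a USER:/ASSISTANT:
# template is assumed here. The 1.0 uncensored variant would instead need
# the old "### Human:" / "### Assistant:" format.
prompt = "USER: Explain what 4-bit quantization trades away. ASSISTANT:"

result = llm(prompt, max_tokens=128, stop=["USER:"])
print(result["choices"][0]["text"].strip())
```

The trade-offs from the list above apply unchanged: pointing `model_path` at the `q4_0` file favors speed, while `q4_1`/`q4_3` favor accuracy at the cost of tokens per second.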