elinas committed on
Commit 29d9840
1 Parent(s): 4f6fdcd

Update README.md

Files changed (1): README.md (+5, -2)
README.md CHANGED
@@ -6,9 +6,11 @@ tags:
 ---
 
 # vicuna-13b-4bit
-Converted `vicuna-13b` to GPTQ 4bit using `true-sequential` and `groupsize 128` in `safetensors` for the best possible model performance.
+Converted `vicuna-13b` to GPTQ 4bit using `true-sequential` and `groupsize 128` in `safetensors` for the best possible model performance.
 
-https://github.com/qwopqwop200/GPTQ-for-LLaMa
+Vicuna is a highly coherent model based on LLaMA that is comparable to ChatGPT. Read more here: https://vicuna.lmsys.org/
+
+GPTQ - https://github.com/qwopqwop200/GPTQ-for-LLaMa
 
 # Update 2023-04-03
 Recent GPTQ commits have introduced breaking changes to model loading, so you should use commit `a6f363e3f93b9fb5c26064b5ac7ed58d22e3f773` in the `cuda` branch.
@@ -27,6 +29,7 @@ This creates and switches to a `cuda-stable` branch to continue using the quantized models.
 Since this is instruction-tuned, for best results use the following format for inference (note that the instruction format is different from Alpaca's):
 ```
 ### Human: your-prompt
+### Assistant:
 ```
 
 If you want deterministic results, turn off sampling. You can turn it off in the webui by unchecking `do_sample`.
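For context on the conversion the first hunk describes, a GPTQ-for-LLaMa invocation along these lines produces a 4-bit, group-size-128 `safetensors` file. This is a minimal sketch, not the command from this commit: the calibration dataset (`c4`), the model path, and the output filename are assumptions, and the script's flags vary between revisions of the repo.

```
# Sketch of the 4-bit quantization described above (GPTQ-for-LLaMa).
# The c4 calibration set, model path, and output name are illustrative assumptions.
python llama.py ./vicuna-13b c4 \
  --wbits 4 \
  --true-sequential \
  --groupsize 128 \
  --save_safetensors vicuna-13b-4bit-128g.safetensors
```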
 
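The 2023-04-03 update pins commit `a6f363e3f93b9fb5c26064b5ac7ed58d22e3f773`, and the second hunk's context line mentions creating a `cuda-stable` branch. A minimal sketch of that checkout, assuming a fresh clone, is:

```
# Clone the cuda branch of GPTQ-for-LLaMa and pin it to the known-good commit,
# creating and switching to a local `cuda-stable` branch as the README describes.
git clone -b cuda https://github.com/qwopqwop200/GPTQ-for-LLaMa
cd GPTQ-for-LLaMa
git checkout a6f363e3f93b9fb5c26064b5ac7ed58d22e3f773 -b cuda-stable
```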
 
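Put together, the prompt format added in the second hunk means a full request to the model looks like the following, where the question is an illustrative placeholder; the model generates its reply after `### Assistant:`:

```
### Human: Summarize what GPTQ 4-bit quantization does.
### Assistant:
```

With `do_sample` unchecked in the webui, repeated runs of the same prompt should return the same completion.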