Update README.md
README.md CHANGED
@@ -1,11 +1,14 @@
 ---
-license: apache-2.0
 inference: false
 ---
 
+**NOTE: New version available**
+Please check out a newer version of the weights [here](https://huggingface.co/lmsys/vicuna-13b-v1.3).
+If you still want to use this old version, please see the compatibility and difference between different versions [here](https://github.com/lm-sys/FastChat/blob/main/docs/vicuna_weights_version.md).
+
 **NOTE: This "delta model" cannot be used directly.**
-Users have to apply it on top of the original LLaMA weights to get actual Vicuna weights.
-
+Users have to apply it on top of the original LLaMA weights to get actual Vicuna weights. See [instructions](https://github.com/lm-sys/FastChat/blob/main/docs/vicuna_weights_version.md#how-to-apply-delta-weights-for-weights-v11-and-v0).
+
 <br>
 <br>
 
@@ -24,14 +27,12 @@ Vicuna was trained between March 2023 and April 2023.
 The Vicuna team with members from UC Berkeley, CMU, Stanford, and UC San Diego.
 
 **Paper or resources for more information:**
-https://
-
-**License:**
-Apache License 2.0
+https://lmsys.org/blog/2023-03-30-vicuna/
 
 **Where to send questions or comments about the model:**
 https://github.com/lm-sys/FastChat/issues
 
+
 ## Intended use
 **Primary intended uses:**
 The primary use of Vicuna is research on large language models and chatbots.
@@ -43,8 +44,5 @@ The primary intended users of the model are researchers and hobbyists in natural
 70K conversations collected from ShareGPT.com.
 
 ## Evaluation dataset
-A preliminary evaluation of the model quality is conducted by creating a set of 80 diverse questions and utilizing GPT-4 to judge the model outputs.
-
-## Major updates of weights v1.1
-- Refactor the tokenization and separator. In Vicuna v1.1, the separator has been changed from `"###"` to the EOS token `"</s>"`. This change makes it easier to determine the generation stop criteria and enables better compatibility with other libraries.
-- Fix the supervised fine-tuning loss computation for better model quality.
+A preliminary evaluation of the model quality is conducted by creating a set of 80 diverse questions and utilizing GPT-4 to judge the model outputs.
+See https://lmsys.org/blog/2023-03-30-vicuna/ for more details.
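The note at the top says the delta weights must be applied on top of the original LLaMA weights; FastChat provides a script for this (`python3 -m fastchat.model.apply_delta`). The pure-Python sketch below only illustrates the core idea — element-wise addition of identically named parameter tensors — with hypothetical tensor names and toy values, not the actual conversion tool:

```python
# Illustration only: "applying a delta model" adds each delta parameter
# tensor to the matching base (LLaMA) tensor to recover Vicuna weights.
# Real checkpoints hold torch tensors; plain lists of floats are used here.

def apply_delta(base, delta):
    """Return target weights with target[name] = base[name] + delta[name]."""
    if base.keys() != delta.keys():
        raise ValueError("base and delta checkpoints have mismatched tensors")
    return {
        name: [b + d for b, d in zip(base[name], delta[name])]
        for name in base
    }

# Hypothetical tiny checkpoints with two named parameter "tensors".
llama_base = {"embed.weight": [1.0, 2.0], "lm_head.weight": [0.5, -0.5]}
vicuna_delta = {"embed.weight": [0.5, -0.25], "lm_head.weight": [0.25, 0.25]}

vicuna = apply_delta(llama_base, vicuna_delta)
print(vicuna["embed.weight"])  # -> [1.5, 1.75]
```

Releasing only the delta (rather than full weights) was how the team distributed Vicuna without redistributing the original LLaMA weights themselves.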