lmsys/vicuna-13b-delta-v0

lmzheng committed on Jul 13, 2023

Commit 4138264 (1 parent: 0946cea)

Update README.md

Files changed (1)

README.md +28 -22

@@ -14,35 +14,41 @@ Users have to apply it on top of the original LLaMA weights to get actual Vicuna
 # Vicuna Model Card
-## Model details
-**Model type:**
-Vicuna is an open-source chatbot trained by fine-tuning LLaMA on user-shared conversations collected from ShareGPT.
-It is an auto-regressive language model, based on the transformer architecture.
-**Model date:**
-Vicuna was trained between March 2023 and April 2023.
-**Organizations developing the model:**
-The Vicuna team with members from UC Berkeley, CMU, Stanford, and UC San Diego.
-**Paper or resources for more information:**
-https://lmsys.org/blog/2023-03-30-vicuna/
-**Where to send questions or comments about the model:**
-https://github.com/lm-sys/FastChat/issues
-## Intended use
-**Primary intended uses:**
 The primary use of Vicuna is research on large language models and chatbots.
-**Primary intended users:**
 The primary intended users of the model are researchers and hobbyists in natural language processing, machine learning, and artificial intelligence.
-## Training dataset
-70K conversations collected from ShareGPT.com.
-## Evaluation dataset
-A preliminary evaluation of the model quality is conducted by creating a set of 80 diverse questions and utilizing GPT-4 to judge the model outputs.
-See https://lmsys.org/blog/2023-03-30-vicuna/ for more details.

 # Vicuna Model Card
+## Model Details
+Vicuna is a chat assistant trained by fine-tuning LLaMA on user-shared conversations collected from ShareGPT.
+- **Developed by:** [LMSYS](https://lmsys.org/)
+- **Model type:** An auto-regressive language model based on the transformer architecture.
+- **License:** Non-commercial license
+- **Finetuned from model:** [LLaMA](https://arxiv.org/abs/2302.13971).
+### Model Sources
+- **Repository:** https://github.com/lm-sys/FastChat
+- **Blog:** https://lmsys.org/blog/2023-03-30-vicuna/
+- **Paper:** https://arxiv.org/abs/2306.05685
+- **Demo:** https://chat.lmsys.org/
+## Uses
 The primary use of Vicuna is research on large language models and chatbots.
 The primary intended users of the model are researchers and hobbyists in natural language processing, machine learning, and artificial intelligence.
+## How to Get Started with the Model
+Command line interface: https://github.com/lm-sys/FastChat#vicuna-weights.
+APIs (OpenAI API, Huggingface API): https://github.com/lm-sys/FastChat/tree/main#api.
+## Training Details
+Vicuna v0 is fine-tuned from LLaMA with supervised instruction fine-tuning.
+The training data is around 70K conversations collected from ShareGPT.com.
+See more details in the "Training Details of Vicuna Models" section in the appendix of this [paper](https://arxiv.org/pdf/2306.05685.pdf).
+## Evaluation
+Vicuna is evaluated with standard benchmarks, human preference, and LLM-as-a-judge. See more details in this [paper](https://arxiv.org/pdf/2306.05685.pdf) and [leaderboard](https://huggingface.co/spaces/lmsys/chatbot-arena-leaderboard).
+## Difference between different versions of Vicuna
+See [vicuna_weights_version.md](https://github.com/lm-sys/FastChat/blob/main/docs/vicuna_weights_version.md)
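
The hunk header above notes that users have to apply this delta on top of the original LLaMA weights to get actual Vicuna weights. As a rough illustration of that idea only, the sketch below assumes the delta stores the element-wise difference between the fine-tuned and base parameters, so that recovering the fine-tuned weights is a per-parameter addition. That assumption, the function names `make_delta` and `apply_delta`, and the toy dict-of-lists weight format are all hypothetical and not taken from the README; real checkpoints hold tensors, and the FastChat tooling linked in the README is the supported way to do the conversion.

```python
from typing import Dict, List

# Toy stand-in for a checkpoint: parameter name -> flat list of values.
# (Hypothetical format; real checkpoints store tensors.)
Weights = Dict[str, List[float]]


def _check_compatible(a: Weights, b: Weights) -> None:
    """Raise ValueError unless both checkpoints share names and sizes."""
    if a.keys() != b.keys():
        raise ValueError("parameter names differ between checkpoints")
    for name in a:
        if len(a[name]) != len(b[name]):
            raise ValueError(f"size mismatch for parameter {name!r}")


def make_delta(base: Weights, target: Weights) -> Weights:
    """Compute delta = target - base for every parameter (assumed scheme)."""
    _check_compatible(base, target)
    return {
        name: [t - b for b, t in zip(base[name], target[name])]
        for name in base
    }


def apply_delta(base: Weights, delta: Weights) -> Weights:
    """Recover target = base + delta for every parameter (assumed scheme)."""
    _check_compatible(base, delta)
    return {
        name: [b + d for b, d in zip(base[name], delta[name])]
        for name in base
    }


if __name__ == "__main__":
    base = {"layer.weight": [1.0, 2.0], "layer.bias": [0.5]}
    target = {"layer.weight": [1.5, 1.0], "layer.bias": [0.25]}
    delta = make_delta(base, target)
    restored = apply_delta(base, delta)
    print(restored == target)
```

Under this scheme the delta on its own does not reproduce the fine-tuned model, which is consistent with the README's statement that the original LLaMA weights are required.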