---
license: apache-2.0
datasets:
- hkust-nlp/deita-10k-v0
language:
- en
base_model: meta-llama/Llama-2-13b-hf
---

# Model Card for Deita Llama2 13B V1.0 SFT

Deita is an open-sourced project designed to facilitate **Automatic Data Selection** for instruction tuning in Large Language Models (LLMs).

Deita Llama2 13B V1.0 SFT is a fine-tuned version of Llama 2 that was trained on 10k automatically selected lightweight, high-quality alignment SFT data: [Deita 10K V0](https://huggingface.co/datasets/hkust-nlp/deita-10k-v0).

## Model description

- **Model type:** Model fine-tuned on automatically selected lightweight, high-quality alignment SFT data.
- **Language(s) (NLP):** Primarily English
- **Finetuned from model:** [meta-llama/Llama-2-13b-hf](https://huggingface.co/meta-llama/Llama-2-13b-hf)

### Model Sources

- **Repository:** https://github.com/hkust-nlp/deita
- **Model Family:** Other models and the dataset are found in the [Deita collection](https://huggingface.co/collections/hkust-nlp/deita-6569c198c174808d94cf5bd4).

## Performance

| Model                                        | Align     | Data Size          | MT-Bench | AlpacaEval(%) | OpenLLM (Avg.) |
|----------------------------------------------|-----------|--------------------|----------|---------------|----------------|
| **Proprietary Models**                       |           |                    |          |               |                |
| GPT-4-Turbo                                  | ?         | --                 | 9.32     | 97.70         | --             |
| GPT-4                                        | SFT + PPO | --                 | 8.99     | 95.03         | --             |
| Claude-2                                     | SFT + PPO | --                 | 8.06     | 91.36         | --             |
| GPT-3.5-turbo                                | SFT + PPO | --                 | 7.94     | 89.37         | --             |
| **Open-sourced Models based on LLaMA-2-13B** |           |                    |          |               |                |
| Tulu-2-13B                                   | SFT       | 326K SFT           | 6.70     | 78.90         | --             |
| Tulu-2-13B+DPO                               | SFT + DPO | 326K SFT + 60K DPO | 7.00     | 89.50         | --             |
| LLaMA2-13B-Chat                              | SFT + PPO | --                 | 6.65     | 81.09         | --             |
| WizardLM-13B-v1.2                            | SFT       | >70K SFT           | 7.09     | 89.17         | --             |
| Vicuna-13B-v1.5                              | SFT       | 125K SFT           | 6.57     | 78.80         | 61.63          |
| Random                                       | SFT       | 10K SFT            | 5.78     | 65.19         | 61.32          |
| DEITA-LLaMA2-13B-v1.0-sft                    | SFT       | 10K SFT            | 6.79     | 81.09         | 62.71          |

## Input Format

The model is trained using the [vicuna_v1.1 template](https://github.com/lm-sys/FastChat/blob/main/fastchat/conversation.py):

```
A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions. USER: Hello! ASSISTANT: Hi!</s>USER: How are you? ASSISTANT:
```

### Training hyperparameters

The following hyperparameters were used during fine-tuning:

- learning_rate: 2e-05
- total_train_batch_size: 128
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 3.0
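
To make the scheduler settings above concrete, the following is a minimal sketch of how a linear schedule with a 0.1 warmup ratio could shape the learning rate over training. It assumes the common convention of linear warmup from zero to the peak rate followed by linear decay back to zero; the function name `linear_schedule_lr` and the step count are hypothetical and are not taken from the Deita training code. The step count assumes roughly 10,000 training examples with no packing, which the card does not confirm.

```python
import math


def linear_schedule_lr(step, total_steps, base_lr=2e-5, warmup_ratio=0.1):
    """Learning rate at optimizer step `step` (0-indexed).

    Assumed convention: linear warmup from 0 to base_lr over the first
    ceil(total_steps * warmup_ratio) steps, then linear decay to 0.
    """
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Hypothetical run length: ~10,000 examples, total batch size 128, 3 epochs.
steps_per_epoch = math.ceil(10_000 / 128)
total_steps = 3 * steps_per_epoch
warmup_steps = math.ceil(total_steps * 0.1)

# Print the rate at the start, mid-warmup, peak, midpoint, and end.
for step in (0, warmup_steps // 2, warmup_steps, total_steps // 2, total_steps):
    print(step, linear_schedule_lr(step, total_steps))
```

Under these assumptions the rate reaches its 2e-05 peak at the end of warmup and falls to zero at the final step. The actual step count depends on how the training data was batched.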