---
license: apache-2.0
datasets:
  - hkust-nlp/deita-10k-v0
language:
  - en
base_model: meta-llama/Llama-2-13b-hf
---
*Deita banner*

# Model Card for Deita Llama2 13B V1.0 SFT

Deita is an open-source project designed to facilitate Automatic Data Selection for instruction tuning in Large Language Models (LLMs). Deita Llama2 13B V1.0 SFT is a fine-tuned version of Llama 2 trained on 10k automatically selected, lightweight, high-quality alignment SFT examples: Deita 10K V0.

## Model description

  • Model type: A model fine-tuned on automatically selected, lightweight, high-quality alignment SFT data.
  • Language(s) (NLP): Primarily English
  • Finetuned from model: meta-llama/Llama-2-13b-hf

## Model Sources

## Performance

| Model | Align | Data Size | MT-Bench | AlpacaEval(%) | OpenLLM (Avg.) |
|---|---|---|---|---|---|
| **Proprietary Models** | | | | | |
| GPT-4-Turbo | ? | -- | 9.32 | 97.70 | -- |
| GPT-4 | SFT + PPO | -- | 8.99 | 95.03 | -- |
| Claude-2 | SFT + PPO | -- | 8.06 | 91.36 | -- |
| GPT-3.5-turbo | SFT + PPO | -- | 7.94 | 89.37 | -- |
| **Open-sourced Models based on LLaMA-2-13B** | | | | | |
| Tulu-2-13B | SFT | 326K SFT | 6.70 | 78.90 | -- |
| Tulu-2-13B+DPO | SFT + DPO | 326K SFT + 60K DPO | 7.00 | 89.50 | -- |
| LLaMA2-13B-Chat | SFT + PPO | -- | 6.65 | 81.09 | -- |
| WizardLM-13B-v1.2 | SFT | >70K SFT | 7.09 | 89.17 | -- |
| Vicuna-13B-v1.5 | SFT | 125K SFT | 6.57 | 78.80 | 61.63 |
| Random | SFT | 10K SFT | 5.78 | 65.19 | 61.32 |
| **DEITA-LLaMA2-13B-v1.0-sft** | SFT | 10K SFT | 6.79 | 81.09 | 62.71 |

## Input Format

The model is trained with the `vicuna_v1.1` conversation template:

```
A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions. USER: Hello! ASSISTANT: Hi!</s>USER: How are you? ASSISTANT:
```
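As a minimal sketch of how to assemble prompts in this format, the helper below renders `(user, assistant)` turns following the `vicuna_v1.1` separator convention shown above (a space after user turns, `</s>` after completed assistant turns). The function and variable names are illustrative, not part of the Deita codebase:

```python
# Sketch (assumption): a minimal renderer for the vicuna_v1.1 template shown above.
SYSTEM = (
    "A chat between a curious user and an artificial intelligence assistant. "
    "The assistant gives helpful, detailed, and polite answers to the user's questions."
)


def build_vicuna_prompt(turns):
    """Render (user, assistant) turns into a vicuna_v1.1 prompt string.

    Pass None as the final assistant message to leave the prompt open
    for the model to generate a reply.
    """
    sep, sep2 = " ", "</s>"  # separators used by vicuna_v1.1
    prompt = SYSTEM + sep
    for user_msg, assistant_msg in turns:
        prompt += f"USER: {user_msg}{sep}"
        if assistant_msg is None:
            prompt += "ASSISTANT:"  # open slot for generation
        else:
            prompt += f"ASSISTANT: {assistant_msg}{sep2}"
    return prompt


prompt = build_vicuna_prompt([("Hello!", "Hi!"), ("How are you?", None)])
print(prompt)
```

Running this reproduces the example prompt above, ready to be tokenized and passed to the model.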

## Training hyperparameters

The following hyperparameters were used during fine-tuning:

  • learning_rate: 2e-05
  • total_train_batch_size: 128
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 3.0
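As an illustrative sketch of how these values fit together: the total train batch size of 128 is typically the product of per-device batch size, GPU count, and gradient-accumulation steps, and `warmup_ratio: 0.1` means the linear scheduler warms up over the first 10% of optimizer steps. The per-device batch size, GPU count, accumulation steps, and dataset size below are assumptions for illustration; only the values listed above are reported in this card:

```python
# Illustrative only: how the reported hyperparameters might decompose.
per_device_train_batch_size = 4   # assumption
num_gpus = 8                      # assumption
gradient_accumulation_steps = 4   # assumption

total_train_batch_size = (
    per_device_train_batch_size * num_gpus * gradient_accumulation_steps
)
assert total_train_batch_size == 128  # matches the value reported above

hyperparams = {
    "learning_rate": 2e-05,
    "total_train_batch_size": total_train_batch_size,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-08,
    "lr_scheduler_type": "linear",
    "warmup_ratio": 0.1,
    "num_train_epochs": 3.0,
}

# warmup_ratio converts to a warmup-step count as a fraction of total
# optimizer steps; e.g. for a hypothetical 10,000-example dataset:
steps_per_epoch = 10_000 // total_train_batch_size
total_steps = int(steps_per_epoch * hyperparams["num_train_epochs"])
warmup_steps = int(total_steps * hyperparams["warmup_ratio"])
print(steps_per_epoch, total_steps, warmup_steps)
```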