BLIMA (Based on LIMA) collection
LIMA: https://arxiv.org/abs/2305.11206
- Base model: mistralai/Mistral-7B-Instruct-v0.2
- Training data (HQ-Korea-Datasets): WizardLM_Evol_train
- Validation data (HQ-Korea-Valid): WizardLM_Evol_valid
- Learning rate: 4e-6
- Eval loss: 0.4612
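A minimal sketch of the fine-tuning setup above, using the TRL `SFTTrainer`. Only the base model and the learning rate (4e-6) come from this card; the dataset file paths, epoch count, batch size, sequence length, and text column name are illustrative assumptions.

```python
# Sketch of the SFT run described above. Values marked "assumption" are not
# from the card; only the base model and learning rate are.
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer, TrainingArguments
from trl import SFTTrainer

model_name = "mistralai/Mistral-7B-Instruct-v0.2"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Hypothetical local files for WizardLM_Evol_train / WizardLM_Evol_valid.
data = load_dataset(
    "json",
    data_files={"train": "WizardLM_Evol_train.json",
                "validation": "WizardLM_Evol_valid.json"},
)

args = TrainingArguments(
    output_dir="ko-mistral-7b-inst-wizard",
    learning_rate=4e-6,              # from the card
    num_train_epochs=3,              # assumption ("epoch3" in the model name)
    per_device_train_batch_size=1,   # assumption
    gradient_accumulation_steps=16,  # assumption
    evaluation_strategy="epoch",
    bf16=True,
)

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    args=args,
    train_dataset=data["train"],
    eval_dataset=data["validation"],
    dataset_text_field="text",       # assumes a single pre-formatted text column
    max_seq_length=4096,             # assumption
)
trainer.train()
print(trainer.evaluate())            # reports eval_loss (card: 0.4612)
```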
The following papers describe the foundational methodologies behind our dataset construction and training procedure.
| Model | Reasoning | Math | Writing | Coding | Understanding | Grammar | Single-turn | Multi-turn | Average |
|---|---|---|---|---|---|---|---|---|---|
| claude-3-opus-20240229 | 8.42 | 9.21 | 9.71 | 9.14 | 10.00 | 7.92 | 8.80 | 9.33 | 9.07 |
| gpt-4-turbo-2024-04-09 | 9.07 | 9.85 | 9.78 | 9.50 | 9.14 | 6.42 | 9.07 | 8.85 | 8.96 |
| HyperClovaX | 5.85 | 7.14 | 8.50 | 7.57 | 9.50 | 8.50 | 8.40 | 7.28 | 7.84 |
| maywell_kiqu-70b | 7.35 | 6.14 | 8.92 | 7.85 | 8.28 | 5.71 | 8.16 | 6.59 | 7.38 |
| google-gemini-1.5-pro | 6.50 | 6.92 | 7.78 | 8.28 | 7.78 | 5.21 | 7.90 | 6.26 | 7.08 |
| solar-1-mini-chat | 6.35 | 4.28 | 8.50 | 6.71 | 7.00 | 5.21 | 6.42 | 6.26 | 6.34 |
| mistralai_Mixtral-8x7B-Instruct-v0.1 | 5.35 | 4.21 | 5.42 | 5.64 | 6.42 | 3.21 | 5.42 | 4.66 | 5.04 |
| MarkrAI/Ko-mistral-7B-Inst-Wizard-v2.0-epoch3 (ours) | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
In-house GPT-4 evaluation (single-turn; average of 3 runs): 7.4682
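The single-turn score above is an average over three GPT-4 judging runs. Below is a minimal sketch of such a judging loop using the OpenAI Python client; the judge prompt, rubric, and the `answers.json` file of model outputs are illustrative assumptions, not the exact evaluation harness used here.

```python
# Sketch of a GPT-4-as-judge single-turn evaluation averaged over three runs.
# Only "GPT-4 judge, single-turn, average of 3 runs" comes from the card.
import json
import re
from statistics import mean
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment

JUDGE_PROMPT = (
    "Rate the following answer to the question on a 1-10 scale. "
    "Reply with the number only.\n\nQuestion: {q}\n\nAnswer: {a}"
)

def judge_once(question: str, answer: str) -> float:
    """Ask GPT-4 to score one question/answer pair."""
    resp = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user",
                   "content": JUDGE_PROMPT.format(q=question, a=answer)}],
        temperature=0.0,
    )
    # Extract the first number in the judge's reply.
    return float(re.search(r"\d+(\.\d+)?", resp.choices[0].message.content).group())

def evaluate(pairs, runs: int = 3) -> float:
    """Average the per-run mean score over `runs` judging passes."""
    run_means = [
        mean(judge_once(p["question"], p["answer"]) for p in pairs)
        for _ in range(runs)
    ]
    return mean(run_means)

# answers.json (hypothetical): [{"question": ..., "answer": ...}, ...]
with open("answers.json") as f:
    print(f"single-turn GPT-4 score: {evaluate(json.load(f)):.4f}")
```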