---
language:
  - en
  - ko
license: cc-by-nc-4.0
datasets:
  - kyujinpy/KOR-gugugu-platypus-set
base_model:
  - LDCC/LDCC-SOLAR-10.7B
pipeline_tag: text-generation
---

# LDCC-SOLAR-gugutypus-10.7B


## Model Details

### Model Developers

- oneonlee

### Model Architecture

- LDCC-SOLAR-gugutypus-10.7B is an instruction fine-tuned, auto-regressive language model based on the SOLAR transformer architecture (see the configuration sketch below).
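
A minimal sketch, assuming only the public checkpoint and the standard `transformers` `AutoConfig` API, for inspecting the underlying decoder configuration:

```python
from transformers import AutoConfig

# Inspect the checkpoint's configuration; SOLAR-family models typically report a
# Llama-style decoder layout here (architecture name, layer count, hidden size).
config = AutoConfig.from_pretrained("oneonlee/LDCC-SOLAR-gugutypus-10.7B")
print(config.architectures)
print(config.num_hidden_layers, config.hidden_size)
```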

### Base Model

- [LDCC/LDCC-SOLAR-10.7B](https://huggingface.co/LDCC/LDCC-SOLAR-10.7B)

### Training Dataset

- [kyujinpy/KOR-gugugu-platypus-set](https://huggingface.co/datasets/kyujinpy/KOR-gugugu-platypus-set)


## Model comparisons

- Ko-LLM leaderboard (2024/03/01) [link]

| Model | Average | Ko-ARC | Ko-HellaSwag | Ko-MMLU | Ko-TruthfulQA | Ko-CommonGen V2 |
| --- | --- | --- | --- | --- | --- | --- |
| oneonlee/KoSOLAR-v0.2-gugutypus-10.7B | 51.17 | 47.78 | 58.29 | 47.27 | 48.31 | 54.19 |
| oneonlee/LDCC-SOLAR-gugutypus-10.7B | 49.45 | 45.9 | 55.46 | 47.96 | 48.93 | 49 |

- (KOR) AI-Harness evaluation [link]

| Tasks | Version | Filter | n-shot | Metric | Value | Stderr |
| --- | --- | --- | --- | --- | --- | --- |
| KMMLU | N/A | none | 0 | acc | 0.3329 | ± 0.0794 |
| KMMLU | N/A | none | 5 | acc | 0.3969 | ± 0.0816 |
| KoBEST-HellaSwag | 0 | none | 0 | acc | 0.4260 | ± 0.0221 |
| KoBEST-HellaSwag | 0 | none | 5 | acc | 0.4260 | ± 0.0221 |
| KoBEST-BoolQ | 0 | none | 0 | acc | 0.7792 | ± 0.0111 |
| KoBEST-BoolQ | 0 | none | 5 | acc | 0.8925 | ± 0.0083 |
| KoBEST-COPA | 0 | none | 0 | acc | 0.6670 | ± 0.0149 |
| KoBEST-COPA | 0 | none | 5 | acc | 0.7070 | ± 0.0144 |
| KoBEST-SentiNeg | 0 | none | 0 | acc | 0.7582 | ± 0.0215 |
| KoBEST-SentiNeg | 0 | none | 5 | acc | 0.9219 | ± 0.0135 |

- (ENG) AI-Harness evaluation [link]

| Tasks | Version | Filter | n-shot | Metric | Value | Stderr |
| --- | --- | --- | --- | --- | --- | --- |
| MMLU | N/A | none | 0 | acc | 0.5826 | ± 0.1432 |
| MMLU | N/A | none | 5 | acc | 0.6124 | ± 0.1275 |
| HellaSwag | 1 | none | 0 | acc | 0.6075 | ± 0.0049 |
| HellaSwag | 1 | none | 5 | acc | 0.6534 | ± 0.0047 |
| BoolQ | 2 | none | 0 | acc | 0.8737 | ± 0.0058 |
| BoolQ | 2 | none | 5 | acc | 0.8878 | ± 0.0055 |
| COPA | 1 | none | 0 | acc | 0.8300 | ± 0.0378 |
| COPA | 1 | none | 5 | acc | 0.9300 | ± 0.0256 |
| truthfulqa | N/A | none | 0 | acc | 0.4249 | ± 0.0023 |
| truthfulqa | N/A | none | 5 | acc | - | ± - |
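
The AI-Harness rows above come from EleutherAI's lm-evaluation-harness, but this card does not record the exact harness version or command. The snippet below is only a minimal reproduction sketch using the harness's Python API; the task names (`kmmlu`, `kobest_*`) and loader arguments are assumptions and may differ from the original run.

```python
# Hedged reproduction sketch for the (KOR) AI-Harness rows using lm-evaluation-harness.
# Task names and harness version are assumptions; they may not match the original run.
import lm_eval
from lm_eval.models.huggingface import HFLM

lm = HFLM(
    pretrained="oneonlee/LDCC-SOLAR-gugutypus-10.7B",
    dtype="float16",
    device="cuda",
)

results = lm_eval.simple_evaluate(
    model=lm,
    tasks=["kmmlu", "kobest_hellaswag", "kobest_boolq", "kobest_copa", "kobest_sentineg"],
    num_fewshot=0,  # rerun with num_fewshot=5 for the 5-shot rows
)

for task, metrics in results["results"].items():
    print(task, metrics)
```

Swapping in English task names (e.g. `mmlu`, `hellaswag`, `boolq`) should cover the (ENG) table in the same way.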

## Implementation Code

```python
### LDCC-SOLAR-gugutypus
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

repo = "oneonlee/LDCC-SOLAR-gugutypus-10.7B"

# Load the weights in half precision and place them across available devices
model = AutoModelForCausalLM.from_pretrained(
    repo,
    return_dict=True,
    torch_dtype=torch.float16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(repo)
```
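
A minimal generation sketch continuing from the loading code above; the bare Korean prompt is an illustrative assumption, not a documented prompt template for this model:

```python
# Generation sketch (the plain prompt below is an assumption, not an official template)
prompt = "한국의 수도는 어디인가요?"  # "What is the capital of South Korea?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=128,
        do_sample=False,
        pad_token_id=tokenizer.eos_token_id,
    )

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```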