
Model Card for aeolian83/llama_ko_sft_gugugo_experi_01

An instruction-tuned version of https://huggingface.co/beomi/llama-2-ko-7b, a Korean continually pre-trained (CP) model based on LLaMA 2 7B.

Trained with QLoRA using transformers and trl.

Dataset: https://huggingface.co/datasets/squarelike/OpenOrca-gugugo-ko

This is a training artifact produced to test QLoRA training.
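
A minimal sketch of the corresponding setup, loading the base model in 4-bit for QLoRA with transformers. The quantization settings (NF4, double quantization, bf16 compute dtype) and the train split name are assumptions based on the common QLoRA recipe; they are not reported in this card.

```python
import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

base_model_id = "beomi/llama-2-ko-7b"

# 4-bit quantization for QLoRA. These settings follow the standard QLoRA recipe
# (NF4 + double quantization, bf16 compute); the exact values are not stated in this card.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    base_model_id,
    quantization_config=bnb_config,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(base_model_id)

# Instruction-tuning data; the split name is assumed.
dataset = load_dataset("squarelike/OpenOrca-gugugo-ko", split="train")
```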

Model Details


Uses

Prompt template

"""
### instruction:
### input:
### output:
"""

Training Details

Training Hyperparameters

  • Training regime: bf16 mixed precision

  • After adding tokens, the padding side was set to right for training (see the training sketch after the LoRA config below)

  • LoRA config

    from peft import LoraConfig

    peft_config = LoraConfig(
        lora_alpha=16,        # LoRA scaling factor
        lora_dropout=0.1,     # dropout applied to the LoRA layers
        r=64,                 # rank of the LoRA update matrices
        bias="none",          # do not train bias parameters
        task_type="CAUSAL_LM",
    )
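
Continuing the sketches above, one plausible way the added token, right padding side, and the LoRA config could be wired into trl's SFTTrainer. The specific special token, batch size, learning rate, and other training arguments are illustrative assumptions, and trl's argument names vary slightly between versions.

```python
from transformers import TrainingArguments
from trl import SFTTrainer

# The card only says tokens were added and the padding side was set to right;
# the exact token added is an assumption (a dedicated pad token is a common choice).
tokenizer.add_special_tokens({"pad_token": "<pad>"})
tokenizer.padding_side = "right"
model.resize_token_embeddings(len(tokenizer))

# Render each example into the prompt template defined above.
dataset = dataset.map(lambda ex: {"text": format_prompt(ex)})

# Hyperparameter values are illustrative; they are not reported in this card.
training_args = TrainingArguments(
    output_dir="llama_ko_sft_gugugo_experi_01",
    bf16=True,                        # bf16 mixed precision, as noted above
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,
    learning_rate=2e-4,
    num_train_epochs=1,
    logging_steps=10,
)

trainer = SFTTrainer(
    model=model,
    args=training_args,
    train_dataset=dataset,
    dataset_text_field="text",        # newer trl versions move this into SFTConfig
    peft_config=peft_config,          # the LoRA config above
    tokenizer=tokenizer,              # newer trl versions take processing_class instead
)
trainer.train()
```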
    

Evaluation

Testing Data, Factors & Metrics

Evaluated with https://github.com/Beomi/ko-lm-evaluation-harness

results/all/aeolian83/llama_ko_sft_gugugo_experi_01 (this model)

| Task | 0-shot | 5-shot |
|---|---|---|
| kobest_boolq (macro_f1) | 0.588382 | 0.384051 |
| kobest_copa (macro_f1) | 0.749558 | 0.778787 |
| kobest_hellaswag (macro_f1) | 0.439247 | 0.439444 |
| kobest_sentineg (macro_f1) | 0.448283 | 0.934415 |
| kohatespeech (macro_f1) | 0.244828 | 0.371245 |
| kohatespeech_apeach (macro_f1) | 0.337434 | 0.394607 |
| kohatespeech_gen_bias (macro_f1) | 0.135272 | 0.461714 |
| korunsmile (f1) | 0.254562 | 0.315907 |
| nsmc (acc) | 0.61248 | 0.84256 |
| pawsx_ko (acc) | 0.5615 | 0.5365 |

results/all/beomi/llama-2-ko-7b (base model, for comparison)

| Task | 0-shot | 5-shot | 10-shot | 50-shot |
|---|---|---|---|---|
| kobest_boolq (macro_f1) | 0.612147 | 0.682832 | 0.713392 | 0.71622 |
| kobest_copa (macro_f1) | 0.759784 | 0.799843 | 0.807907 | 0.829976 |
| kobest_hellaswag (macro_f1) | 0.447951 | 0.460632 | 0.464623 | 0.458628 |
| kobest_sentineg (macro_f1) | 0.3517 | 0.969773 | 0.977329 | 0.97481 |
| kohatespeech (macro_f1) | 0.314636 | 0.383336 | 0.357491 | 0.366585 |
| kohatespeech_apeach (macro_f1) | 0.346127 | 0.567627 | 0.583391 | 0.629269 |
| kohatespeech_gen_bias (macro_f1) | 0.204651 | 0.509189 | 0.471078 | 0.451119 |
| korunsmile (f1) | 0.290663 | 0.306208 | 0.304279 | 0.343946 |
| nsmc (acc) | 0.57942 | 0.84242 | 0.87368 | 0.8939 |
| pawsx_ko (acc) | 0.538 | 0.52 | 0.5275 | 0.5195 |
