EEVE-Math-10.8B

EEVE-Math ν”„λ‘œμ νŠΈλŠ”

에 λŒ€ν•œ λ‚΄μš©μ„ ν¬κ΄„ν•˜κ³  μžˆμŠ΅λ‹ˆλ‹€.

이 λͺ¨λΈμ€ orca-math-word-problems-193k-korean 데이터셋을 μ΄μš©ν•˜μ—¬ ν•™μŠ΅λ˜μ—ˆμŠ΅λ‹ˆλ‹€. 응닡 쀑 μΌλΆ€λŠ” LaTeX ν˜•μ‹μ„ μ΄μš©ν•˜μ—¬ κ²°κ³Όλ₯Ό λ°˜ν™˜ν•˜μ§€λ§Œ, μ™„μ„±λœ ν˜•μ‹μ΄ 아닐 수 μžˆμŠ΅λ‹ˆλ‹€. ν˜„μž¬ M1 stageκΉŒμ§€ μ§„ν–‰λ˜μ—ˆμŠ΅λ‹ˆλ‹€.

| Model | gsm8k-ko (pass@1) |
|---|---|
| Base | 0.4049 |
| SFT(M1) | 0.508 |
| SFT(M1) -> SFT | 0.539 |
| SFT(M1) -> KTO(M2) | - |
| SFT -> KTO(M2) -> KTO(final) | - |

Specifications

  • SFT(M1) -> SFT 단계

Base Model

yanolja/EEVE-Korean-10.8B-v1.0

Dataset

orca-math-word-problems-193k-korean

Evaluation

gsm8k-ko, kobest

```shell
git clone https://github.com/kuotient/lm-evaluation-harness.git
cd lm-evaluation-harness
pip install -e .
lm_eval --model hf \
    --model_args pretrained=kuotient/EEVE-Math-10.8B \
    --tasks gsm8k-ko \
    --device cuda:0 \
    --batch_size auto:4
```
| Model | gsm8k-ko (pass@1) | boolq (acc) | copa (acc) | hellaswag (acc) | Overall |
|---|---|---|---|---|---|
| yanolja/EEVE-Korean-10.8B-v1.0 | 0.4049 | - | - | - | - |
| yanolja/EEVE-Korean-Instruct-10.8B-v1.0 | 0.4511 | 0.8668 | 0.7450 | 0.4940 | 0.6392 |
| EEVE-Math-10.8B | 0.5390 | 0.8027 | 0.7260 | 0.4760 | 0.6359 |
| EEVE-Instruct-Math-10.8B | 0.4845 | 0.8519 | 0.7410 | 0.4980 | 0.6439 |
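The Overall column appears to be the unweighted mean of the four task scores. A quick check against the yanolja/EEVE-Korean-Instruct-10.8B-v1.0 row:

```python
# Overall as the unweighted mean of the four benchmark scores
# (values taken from the yanolja/EEVE-Korean-Instruct-10.8B-v1.0 row).
scores = {
    "gsm8k-ko": 0.4511,
    "boolq": 0.8668,
    "copa": 0.7450,
    "hellaswag": 0.4940,
}
overall = sum(scores.values()) / len(scores)
print(round(overall, 4))  # 0.6392
```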