# EEVE-10.8-BOOK-v0.1
ํŒŒ๋ผ๋ฏธํ„ฐ ๊ฐ’
Task Book (์‚ฌํšŒ๊ณผํ•™, ๊ธฐ์ˆ ๊ณผํ•™, ์ฒ ํ•™, ๋ฒ•ํ•™, ์˜ˆ์ˆ  ๋“ฑ)
๋ฐ์ดํ„ฐ ํฌ๊ธฐ 5000๊ฐœ
๋ชจ๋ธ qlora
max_seq_length 1024
num_train_epochs 3
per_device_train_batch_size 8
gradient_accumulation_steps 32
evaluation_strategy "steps"
eval_steps 2000
logging_steps 25
optim "paged_adamw_8bit"
learning_rate 2e-4
lr_scheduler_type "cosine"
warmup_steps 10
warmup_ratio 0.05
report_to "tensorboard"
weight_decay 0.01
max_steps -1
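For reference, the hyperparameters above map onto a standard transformers + peft QLoRA setup roughly as sketched below. This is a minimal sketch, not the actual training script: the base checkpoint, the LoRA rank/alpha/target modules, and the dataset/tokenization step are assumptions that this card does not specify.

```python
# Minimal QLoRA fine-tuning sketch mirroring the hyperparameters above.
# Assumptions (not stated in this card): the EEVE-Korean-Instruct-10.8B base
# checkpoint, the LoRA rank/alpha/target modules, and the omitted dataset step.
import torch
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    BitsAndBytesConfig,
    TrainingArguments,
)
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

base_model = "yanolja/EEVE-Korean-Instruct-10.8B-v1.0"  # assumed base checkpoint

# Load the base model in 4-bit so only the LoRA adapters are trained (QLoRA).
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(
    base_model, quantization_config=bnb_config, device_map="auto"
)
model = prepare_model_for_kbit_training(model)

# Illustrative LoRA config; rank, alpha and target modules are not reported in the card.
peft_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
)
model = get_peft_model(model, peft_config)

# Training arguments copied from the parameter table above.
# max_seq_length=1024 would be applied when tokenizing/packing the 5,000 book samples.
args = TrainingArguments(
    output_dir="eeve-10.8-book-qlora",
    num_train_epochs=3,
    per_device_train_batch_size=8,
    gradient_accumulation_steps=32,
    evaluation_strategy="steps",
    eval_steps=2000,
    logging_steps=25,
    optim="paged_adamw_8bit",
    learning_rate=2e-4,
    lr_scheduler_type="cosine",
    warmup_steps=10,       # overrides warmup_ratio when > 0
    warmup_ratio=0.05,
    report_to="tensorboard",
    weight_decay=0.01,
    max_steps=-1,          # train for the full num_train_epochs
)
# A Trainer (or trl.SFTTrainer) would then be built from `model`, `args`,
# the tokenized dataset, and a causal-LM data collator.
```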

## Summary

### Book

| Model | ROUGE-1 | ROUGE-2 | ROUGE-L |
|---|---|---|---|
| *ryanu/EEVE-10.8-BOOK-v0.1 | 0.2454 | 0.1158 | 0.2404 |
| meta-llama/llama-3-70b-instruct | 0.2269 | 0.0925 | 0.2186 |
| meta-llama/llama-3-8b-instruct | 0.2137 | 0.0883 | 0.2020 |
| yanolja/EEVE-Korean-Instruct-2.8B-v1.0 | 0.2095 | 0.0866 | 0.1985 |
| mistralai/mixtral-8x7b-instruct-v0-1 | 0.1735 | 0.0516 | 0.1668 |
| ibm-mistralai/mixtral-8x7b-instruct-v01-q | 0.1724 | 0.0534 | 0.1630 |

### Paper

| Model | ROUGE-1 | ROUGE-2 | ROUGE-L |
|---|---|---|---|
| *meta-llama/llama-3-8b-instruct | 0.2044 | 0.0868 | 0.1895 |
| ryanu/EEVE-10.8-BOOK-v0.1 | 0.2004 | 0.0860 | 0.1938 |
| meta-llama/llama-3-70b-instruct | 0.1935 | 0.0783 | 0.1836 |
| yanolja/EEVE-Korean-Instruct-2.8B-v1.0 | 0.1934 | 0.0829 | 0.1832 |
| mistralai/mixtral-8x7b-instruct-v0-1 | 0.1774 | 0.0601 | 0.1684 |
| ibm-mistralai/mixtral-8x7b-instruct-v01-q | 0.1702 | 0.0561 | 0.1605 |

### RAG Q&A

| Model | ROUGE-1 | ROUGE-2 | ROUGE-L |
|---|---|---|---|
| *meta-llama/llama-3-70b-instruct | 0.4418 | 0.2986 | 0.4297 |
| *meta-llama/llama-3-8b-instruct | 0.4391 | 0.3100 | 0.4273 |
| mistralai/mixtral-8x7b-instruct-v0-1 | 0.4022 | 0.2653 | 0.3916 |
| ibm-mistralai/mixtral-8x7b-instruct-v01-q | 0.3105 | 0.1763 | 0.2960 |
| yanolja/EEVE-Korean-Instruct-10.8B-v1.0 | 0.3191 | 0.2069 | 0.3136 |
| ryanu/EEVE-10.8-BOOK-v0.1 | 0.2185 | 0.1347 | 0.2139 |
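The ROUGE-1/2/L columns above can be reproduced along the lines of the sketch below, using the Hugging Face `evaluate` package; the prediction/reference strings are placeholders, and the exact tokenization used for the Korean scores in this card is not specified.

```python
# Sketch of the ROUGE-1/2/L computation behind the tables above.
# The strings are placeholders; the actual evaluation data is not included here.
import evaluate

rouge = evaluate.load("rouge")

predictions = ["model generated summary ..."]  # one entry per evaluated sample
references = ["reference (gold) summary ..."]

# Note: the default ROUGE tokenizer is whitespace/English oriented; for Korean,
# a morpheme- or character-level tokenizer is commonly substituted.
scores = rouge.compute(predictions=predictions, references=references)
print(scores["rouge1"], scores["rouge2"], scores["rougeL"])
```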

## Prompt template

The prompt below (in Korean) instructs the model to briefly summarize the main arguments of the given text in 3 to 5 sentences without repetitive phrasing; `{context}` is the source text and `{summary}` is the target summary.

다음 문장을 3~5문장으로 반복되는 구문없이 텍스트에 제시된 주요 논거를 간략하게 요약해줘.

문장: {context}

요약: {summary}
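A minimal inference sketch with this prompt template is shown below. The generation settings are illustrative, and if the repository ships only a LoRA adapter rather than merged weights, it would instead need to be loaded on top of the base model with peft.

```python
# Minimal inference sketch using the prompt template above.
# Generation parameters are illustrative and not taken from this card.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ryanu/EEVE-10.8-BOOK-v0.1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

context = "요약할 원문 텍스트를 여기에 넣습니다."  # placeholder source text
prompt = (
    "다음 문장을 3~5문장으로 반복되는 구문없이 텍스트에 제시된 주요 논거를 간략하게 요약해줘.\n\n"
    f"문장: {context}\n\n"
    "요약: "
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=256, do_sample=False)
# Decode only the newly generated tokens (the summary).
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```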