| Parameter | Value |
|-----------|-------|
| Task | Book (social sciences, technology, philosophy, law, arts, etc.) |
| Data size | 5,000 samples |
| Model | qlora |
| max_seq_length | 1024 |
| num_train_epochs | 3 |
| per_device_train_batch_size | 8 |
| gradient_accumulation_steps | 32 |
| evaluation_strategy | "steps" |
| eval_steps | 2000 |
| logging_steps | 25 |
| optim | "paged_adamw_8bit" |
| learning_rate | 2e-4 |
| lr_scheduler_type | "cosine" |
| warmup_steps | 10 |
| warmup_ratio | 0.05 |
| report_to | "tensorboard" |
| weight_decay | 0.01 |
| max_steps | -1 |

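
The batch-size, epoch, and warmup settings in the parameter table interact, so a rough step count helps when reading them. The sketch below is only back-of-the-envelope arithmetic on the table values. It assumes a single training device and that all 5,000 samples are used for training, neither of which the card states, and `num_devices` is a hypothetical variable introduced for illustration. It also assumes the names map onto Hugging Face `TrainingArguments`, where a positive `warmup_steps` takes precedence over `warmup_ratio`.

```python
import math

# Values copied from the parameter table.
num_examples = 5000
per_device_train_batch_size = 8
gradient_accumulation_steps = 32
num_train_epochs = 3
eval_steps = 2000
warmup_steps = 10
warmup_ratio = 0.05

# ASSUMPTION (not stated in the card): one device, all samples used for training.
num_devices = 1

# Number of samples consumed per optimizer update.
effective_batch_size = (
    per_device_train_batch_size * gradient_accumulation_steps * num_devices
)

# Approximate optimizer updates per epoch and in total.
# The actual trainer may round slightly differently.
steps_per_epoch = math.ceil(num_examples / effective_batch_size)
total_steps = steps_per_epoch * num_train_epochs

# A positive warmup_steps wins over warmup_ratio in Hugging Face TrainingArguments.
if warmup_steps > 0:
    effective_warmup_steps = warmup_steps
else:
    effective_warmup_steps = math.ceil(total_steps * warmup_ratio)

# How many step-based evaluations would fit into the run.
eval_runs = total_steps // eval_steps

print("effective batch size:", effective_batch_size)
print("approx. total optimizer steps:", total_steps)
print("effective warmup steps:", effective_warmup_steps)
print("step-based evaluations during training:", eval_runs)
```

Under these assumptions the whole run is far shorter than `eval_steps`, so step-based evaluation would not trigger before training ends. A different device count or dataset split changes the numbers.
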
# Summary

**Book**

| Model name | Rouge-1 | Rouge-2 | Rouge-L |
|----------------------------------------------|---------|---------|---------|
| *ryanu/EEVE-10.8-BOOK-v0.1 | 0.2454 | 0.1158 | 0.2404 |
| meta-llama/llama-3-70b-instruct | 0.2269 | 0.0925 | 0.2186 |
| meta-llama/llama-3-8b-instruct | 0.2137 | 0.0883 | 0.2020 |
| yanolja/EEVE-Korean-Instruct-2.8B-v1.0 | 0.2095 | 0.0866 | 0.1985 |
| mistralai/mixtral-8x7b-instruct-v0-1 | 0.1735 | 0.0516 | 0.1668 |
| ibm-mistralai/mixtral-8x7b-instruct-v01-q | 0.1724 | 0.0534 | 0.1630 |

**Paper**

| Model name | Rouge-1 | Rouge-2 | Rouge-L |
|----------------------------------------------|---------|---------|---------|
| *meta-llama/llama-3-8b-instruct | 0.2044 | 0.0868 | 0.1895 |
| ryanu/EEVE-10.8-BOOK-v0.1 | 0.2004 | 0.0860 | 0.1938 |
| meta-llama/llama-3-70b-instruct | 0.1935 | 0.0783 | 0.1836 |
| yanolja/EEVE-Korean-Instruct-2.8B-v1.0 | 0.1934 | 0.0829 | 0.1832 |
| mistralai/mixtral-8x7b-instruct-v0-1 | 0.1774 | 0.0601 | 0.1684 |
| ibm-mistralai/mixtral-8x7b-instruct-v01-q | 0.1702 | 0.0561 | 0.1605 |

# RAG Q&A

| Model name | Rouge-1 | Rouge-2 | Rouge-L |
|----------------------------------------------|---------|---------|---------|
| *meta-llama/llama-3-70b-instruct | 0.4418 | 0.2986 | 0.4297 |
| *meta-llama/llama-3-8b-instruct | 0.4391 | 0.3100 | 0.4273 |
| mistralai/mixtral-8x7b-instruct-v0-1 | 0.4022 | 0.2653 | 0.3916 |
| ibm-mistralai/mixtral-8x7b-instruct-v01-q | 0.3105 | 0.1763 | 0.2960 |
| yanolja/EEVE-Korean-Instruct-10.8B-v1.0 | 0.3191 | 0.2069 | 0.3136 |
| ryanu/EEVE-10.8-BOOK-v0.1 | 0.2185 | 0.1347 | 0.2139 |

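
The Rouge-1, Rouge-2, and Rouge-L columns in the tables above are overlap scores between a generated text and a reference. The card does not say which scoring script or tokenizer produced them, so the following is only a minimal sketch of how such scores are commonly computed. It assumes whitespace tokenization and the F1 variant of each metric; Korean text is often scored on morpheme-level tokens instead, which would give different numbers. All function names here are hypothetical helpers, not part of any library.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def f1(overlap, cand_total, ref_total):
    """Harmonic mean of precision and recall; 0.0 when there is no overlap."""
    if overlap == 0 or cand_total == 0 or ref_total == 0:
        return 0.0
    precision = overlap / cand_total
    recall = overlap / ref_total
    return 2 * precision * recall / (precision + recall)


def rouge_n(candidate, reference, n):
    """ROUGE-N F1 with clipped n-gram counts (whitespace tokens assumed)."""
    cand = ngrams(candidate.split(), n)
    ref = ngrams(reference.split(), n)
    overlap = sum((cand & ref).values())  # min count per shared n-gram
    return f1(overlap, sum(cand.values()), sum(ref.values()))


def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, 1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(candidate, reference):
    """ROUGE-L F1 based on the longest common subsequence."""
    cand_tokens = candidate.split()
    ref_tokens = reference.split()
    return f1(lcs_length(cand_tokens, ref_tokens), len(cand_tokens), len(ref_tokens))


candidate = "the cat sat on the mat"
reference = "the cat is on the mat"
print(rouge_n(candidate, reference, 1))
print(rouge_n(candidate, reference, 2))
print(rouge_l(candidate, reference))
```

Rouge-2 is usually the lowest of the three because it requires adjacent word pairs to match, which is consistent with the pattern in every table above.
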
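**prompt template**
-------------------

다음 문장을 3~5문장으로 반복되는 구문없이 텍스트에 제시된 주요 논거를 간략하게 요약해줘.

문장: {context}

요약: {summary}

English translation:

Summarize the following text in 3 to 5 sentences, concisely covering the main arguments it presents without repeating phrases.

Text: {context}

Summary: {summary}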
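
The template has two slots, `{context}` and `{summary}`. A minimal sketch of filling it is shown below, assuming plain `str.format` substitution and blank lines between the three parts. The card does not show the author's formatting code, so `PROMPT_TEMPLATE` and `build_prompt` are hypothetical names. Leaving `summary` empty gives an inference-style prompt that ends right where the model should start writing.

```python
# Hypothetical constant holding the card's prompt template (Korean original).
PROMPT_TEMPLATE = (
    "다음 문장을 3~5문장으로 반복되는 구문없이 텍스트에 제시된 주요 논거를 간략하게 요약해줘.\n\n"
    "문장: {context}\n\n"
    "요약: {summary}"
)


def build_prompt(context: str, summary: str = "") -> str:
    """Fill the template; pass a summary for training, omit it for inference."""
    return PROMPT_TEMPLATE.format(context=context.strip(), summary=summary.strip())


# Inference-style prompt: the summary slot is left empty.
inference_prompt = build_prompt("예시 본문")

# Training-style example: the reference summary is appended.
training_example = build_prompt("예시 본문", "예시 요약")

print(inference_prompt)
print(training_example)
```
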