| Parameter | Value |
|----------|-----|
| Task | Book (social sciences, technology and applied sciences, philosophy, law, arts, etc.) |
| Dataset size | 5,000 examples |
| Model | qlora |
| max_seq_length | 1024 |
| num_train_epochs | 3 |
| per_device_train_batch_size | 8 |
| gradient_accumulation_steps | 32 |
| evaluation_strategy | "steps" |
| eval_steps | 2000 |
| logging_steps | 25 |
| optim | "paged_adamw_8bit" |
| learning_rate | 2e-4 |
| lr_scheduler_type | "cosine" |
| warmup_steps | 10 |
| warmup_ratio | 0.05 |
| report_to | "tensorboard" |
| weight_decay | 0.01 |
| max_steps | -1 |
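The batch-size and step settings in the table interact, so a rough calculation helps when reading them. The sketch below is only an estimate: it assumes a single training device and that all 5,000 examples are used for training, neither of which the README states, and the helper name `estimate_schedule` is made up for illustration. The exact step count can differ slightly between trainer versions.

```python
import math


def estimate_schedule(dataset_size, per_device_batch, grad_accum, epochs, num_devices=1):
    """Estimate effective batch size and optimizer steps.

    num_devices=1 is an assumption; the README does not give the hardware setup.
    """
    effective_batch = per_device_batch * grad_accum * num_devices
    # Number of forward/backward batches in one pass over the data.
    batches_per_epoch = math.ceil(dataset_size / (per_device_batch * num_devices))
    # One optimizer update happens every `grad_accum` batches (floor estimate).
    updates_per_epoch = max(batches_per_epoch // grad_accum, 1)
    return {
        "effective_batch": effective_batch,
        "batches_per_epoch": batches_per_epoch,
        "updates_per_epoch": updates_per_epoch,
        "total_updates": updates_per_epoch * epochs,
    }


# Values taken from the hyperparameter table.
schedule = estimate_schedule(
    dataset_size=5000, per_device_batch=8, grad_accum=32, epochs=3
)
print(schedule)

eval_steps = 2000
warmup_steps = 10
print("step-based evaluation would trigger:", eval_steps <= schedule["total_updates"])
print("warmup share of training:", round(warmup_steps / schedule["total_updates"], 3))
```

Under these assumptions the run performs only a few dozen optimizer updates, so `eval_steps = 2000` would never be reached and the 10 warmup steps would cover a noticeable share of training. If the run used Hugging Face `TrainingArguments` (the parameter names suggest so, but the README does not confirm it), the 4.x documentation states that a nonzero `warmup_steps` overrides `warmup_ratio`.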
# Summary
**Book**
| Model | Rouge-1 | Rouge-2 | Rouge-L |
|----------------------------------------------|---------|---------|---------|
| *ryanu/EEVE-10.8-BOOK-v0.1 | 0.2454 | 0.1158 | 0.2404 |
| meta-llama/llama-3-70b-instruct | 0.2269 | 0.0925 | 0.2186 |
| meta-llama/llama-3-8b-instruct | 0.2137 | 0.0883 | 0.2020 |
| yanolja/EEVE-Korean-Instruct-2.8B-v1.0 | 0.2095 | 0.0866 | 0.1985 |
| mistralai/mixtral-8x7b-instruct-v0-1 | 0.1735 | 0.0516 | 0.1668 |
| ibm-mistralai/mixtral-8x7b-instruct-v01-q | 0.1724 | 0.0534 | 0.1630 |
**Paper**
| Model | Rouge-1 | Rouge-2 | Rouge-L |
|----------------------------------------------|---------|---------|---------|
| *meta-llama/llama-3-8b-instruct | 0.2044 | 0.0868 | 0.1895 |
| ryanu/EEVE-10.8-BOOK-v0.1 | 0.2004 | 0.0860 | 0.1938 |
| meta-llama/llama-3-70b-instruct | 0.1935 | 0.0783 | 0.1836 |
| yanolja/EEVE-Korean-Instruct-2.8B-v1.0 | 0.1934 | 0.0829 | 0.1832 |
| mistralai/mixtral-8x7b-instruct-v0-1 | 0.1774 | 0.0601 | 0.1684 |
| ibm-mistralai/mixtral-8x7b-instruct-v01-q | 0.1702 | 0.0561 | 0.1605 |
# RAG Q&A
| Model | Rouge-1 | Rouge-2 | Rouge-L |
|----------------------------------------------|---------|---------|---------|
| *meta-llama/llama-3-70b-instruct | 0.4418 | 0.2986 | 0.4297 |
| *meta-llama/llama-3-8b-instruct | 0.4391 | 0.3100 | 0.4273 |
| mistralai/mixtral-8x7b-instruct-v0-1 | 0.4022 | 0.2653 | 0.3916 |
| yanolja/EEVE-Korean-Instruct-10.8B-v1.0 | 0.3191 | 0.2069 | 0.3136 |
| ibm-mistralai/mixtral-8x7b-instruct-v01-q | 0.3105 | 0.1763 | 0.2960 |
| ryanu/EEVE-10.8-BOOK-v0.1 | 0.2185 | 0.1347 | 0.2139 |
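For readers unfamiliar with the metrics in the tables above, here is a minimal sketch of how ROUGE-1, ROUGE-2, and ROUGE-L F1 scores can be computed. It is not the evaluation code behind the reported numbers: the README does not say which ROUGE implementation or tokenizer was used, and whitespace tokenization is an assumption here (Korean text is often scored with a morphological tokenizer instead). The function names are illustrative.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return a multiset (Counter) of the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def f1(overlap, cand_total, ref_total):
    """F1 score from an overlap count and the candidate/reference totals."""
    if overlap == 0 or cand_total == 0 or ref_total == 0:
        return 0.0
    precision = overlap / cand_total
    recall = overlap / ref_total
    return 2 * precision * recall / (precision + recall)


def rouge_n(candidate, reference, n):
    """ROUGE-N F1 using whitespace tokens (tokenization is an assumption)."""
    cand = ngrams(candidate.split(), n)
    ref = ngrams(reference.split(), n)
    overlap = sum((cand & ref).values())  # clipped n-gram matches
    return f1(overlap, sum(cand.values()), sum(ref.values()))


def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(candidate, reference):
    """ROUGE-L F1 based on the longest common subsequence."""
    cand, ref = candidate.split(), reference.split()
    return f1(lcs_length(cand, ref), len(cand), len(ref))


candidate = "the cat sat on the mat"
reference = "the cat lay on the mat"
print(rouge_n(candidate, reference, 1))
print(rouge_n(candidate, reference, 2))
print(rouge_l(candidate, reference))
```

Scores in the tables would then be averages of such per-example values over the evaluation set, assuming a conventional setup.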
**prompt template**
-------------------
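다음 문장을 3~5문장으로 반복되는 구문없이 텍스트에 제시된 주요 논거를 간략하게 요약해줘.
문장: {context}
요약: {summary}

English translation:
Summarize the following text in 3 to 5 sentences, without repeated phrases, briefly covering the main arguments presented in the text.
Text: {context}
Summary: {summary}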
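As a minimal sketch of how the template above might be filled in, the snippet below builds a prompt string in Python. The Korean text is the template from this README; the helper name `build_prompt` and the choice to leave `{summary}` empty at inference time are assumptions, since the README does not describe how the placeholders are populated.

```python
# Template text as given in the README (Korean original).
PROMPT_TEMPLATE = (
    "다음 문장을 3~5문장으로 반복되는 구문없이 텍스트에 제시된 주요 논거를 간략하게 요약해줘.\n"
    "문장: {context}\n"
    "요약: {summary}"
)


def build_prompt(context: str, summary: str = "") -> str:
    """Fill the template.

    Assumption: during training `summary` holds the reference summary,
    and at inference it is left empty so the model completes it.
    """
    return PROMPT_TEMPLATE.format(context=context, summary=summary).rstrip()


# Inference-style prompt: the model is expected to continue after "요약:".
print(build_prompt("여기에 요약할 본문을 넣습니다."))
```

The resulting string can be passed to whatever generation API is used to run the model; that part is left out because the README does not specify it.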