# EEVE-10.8-BOOK-v0.1
ํŒŒ๋ผ๋ฏธํ„ฐ ๊ฐ’
Task Book (์‚ฌํšŒ๊ณผํ•™, ๊ธฐ์ˆ ๊ณผํ•™, ์ฒ ํ•™, ๋ฒ•ํ•™, ์˜ˆ์ˆ  ๋“ฑ)
๋ฐ์ดํ„ฐ ํฌ๊ธฐ 5000๊ฐœ
๋ชจ๋ธ qlora
max_seq_length 1024
num_train_epochs 3
per_device_train_batch_size 8
gradient_accumulation_steps 32
evaluation_strategy "steps"
eval_steps 2000
logging_steps 25
optim "paged_adamw_8bit"
learning_rate 2e-4
lr_scheduler_type "cosine"
warmup_steps 10
warmup_ratio 0.05
report_to "tensorboard"
weight_decay 0.01
max_steps -1
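For reference, the hyperparameters above map onto a standard transformers + peft QLoRA setup roughly as sketched below. This is a minimal sketch, not the actual training script: the base checkpoint, the LoRA rank/alpha/target modules, and the dataset/tokenization step are assumptions that this card does not specify.

```python
# Minimal QLoRA fine-tuning sketch mirroring the hyperparameters above.
# Assumptions (not stated in this card): the EEVE-Korean-Instruct-10.8B base
# checkpoint, the LoRA rank/alpha/target modules, and the omitted dataset step.
import torch
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    BitsAndBytesConfig,
    TrainingArguments,
)
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

base_model = "yanolja/EEVE-Korean-Instruct-10.8B-v1.0"  # assumed base checkpoint

# Load the base model in 4-bit so only the LoRA adapters are trained (QLoRA).
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(
    base_model, quantization_config=bnb_config, device_map="auto"
)
model = prepare_model_for_kbit_training(model)

# Illustrative LoRA config; rank, alpha and target modules are not reported in the card.
peft_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
)
model = get_peft_model(model, peft_config)

# Training arguments copied from the parameter table above.
# max_seq_length=1024 would be applied when tokenizing/packing the 5,000 book samples.
args = TrainingArguments(
    output_dir="eeve-10.8-book-qlora",
    num_train_epochs=3,
    per_device_train_batch_size=8,
    gradient_accumulation_steps=32,
    evaluation_strategy="steps",
    eval_steps=2000,
    logging_steps=25,
    optim="paged_adamw_8bit",
    learning_rate=2e-4,
    lr_scheduler_type="cosine",
    warmup_steps=10,       # overrides warmup_ratio when > 0
    warmup_ratio=0.05,
    report_to="tensorboard",
    weight_decay=0.01,
    max_steps=-1,          # train for the full num_train_epochs
)
# A Trainer (or trl.SFTTrainer) would then be built from `model`, `args`,
# the tokenized dataset, and a causal-LM data collator.
```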

## Summary

### Book

| Model | ROUGE-1 | ROUGE-2 | ROUGE-L |
|---|---|---|---|
| *ryanu/EEVE-10.8-BOOK-v0.1 | 0.2454 | 0.1158 | 0.2404 |
| meta-llama/llama-3-70b-instruct | 0.2269 | 0.0925 | 0.2186 |
| meta-llama/llama-3-8b-instruct | 0.2137 | 0.0883 | 0.2020 |
| yanolja/EEVE-Korean-Instruct-2.8B-v1.0 | 0.2095 | 0.0866 | 0.1985 |
| mistralai/mixtral-8x7b-instruct-v0-1 | 0.1735 | 0.0516 | 0.1668 |
| ibm-mistralai/mixtral-8x7b-instruct-v01-q | 0.1724 | 0.0534 | 0.1630 |

### Paper

| Model | ROUGE-1 | ROUGE-2 | ROUGE-L |
|---|---|---|---|
| *meta-llama/llama-3-8b-instruct | 0.2044 | 0.0868 | 0.1895 |
| ryanu/EEVE-10.8-BOOK-v0.1 | 0.2004 | 0.0860 | 0.1938 |
| meta-llama/llama-3-70b-instruct | 0.1935 | 0.0783 | 0.1836 |
| yanolja/EEVE-Korean-Instruct-2.8B-v1.0 | 0.1934 | 0.0829 | 0.1832 |
| mistralai/mixtral-8x7b-instruct-v0-1 | 0.1774 | 0.0601 | 0.1684 |
| ibm-mistralai/mixtral-8x7b-instruct-v01-q | 0.1702 | 0.0561 | 0.1605 |

### RAG Q&A

| Model | ROUGE-1 | ROUGE-2 | ROUGE-L |
|---|---|---|---|
| *meta-llama/llama-3-70b-instruct | 0.4418 | 0.2986 | 0.4297 |
| *meta-llama/llama-3-8b-instruct | 0.4391 | 0.3100 | 0.4273 |
| mistralai/mixtral-8x7b-instruct-v0-1 | 0.4022 | 0.2653 | 0.3916 |
| ibm-mistralai/mixtral-8x7b-instruct-v01-q | 0.3105 | 0.1763 | 0.2960 |
| yanolja/EEVE-Korean-Instruct-10.8B-v1.0 | 0.3191 | 0.2069 | 0.3136 |
| ryanu/EEVE-10.8-BOOK-v0.1 | 0.2185 | 0.1347 | 0.2139 |
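The ROUGE-1/2/L columns above can be reproduced along the lines of the sketch below, using the Hugging Face `evaluate` package; the prediction/reference strings are placeholders, and the exact tokenization used for the Korean scores in this card is not specified.

```python
# Sketch of the ROUGE-1/2/L computation behind the tables above.
# The strings are placeholders; the actual evaluation data is not included here.
import evaluate

rouge = evaluate.load("rouge")

predictions = ["model generated summary ..."]  # one entry per evaluated sample
references = ["reference (gold) summary ..."]

# Note: the default ROUGE tokenizer is whitespace/English oriented; for Korean,
# a morpheme- or character-level tokenizer is commonly substituted.
scores = rouge.compute(predictions=predictions, references=references)
print(scores["rouge1"], scores["rouge2"], scores["rougeL"])
```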

## Prompt template

The prompt below (in Korean) instructs the model to briefly summarize the main arguments of the given text in 3 to 5 sentences without repetitive phrasing; `{context}` is the source text and `{summary}` is the target summary.

다음 문장을 3~5문장으로 반복되는 구문없이 텍스트에 제시된 주요 논거를 간략하게 요약해줘.

문장: {context}

요약: {summary}
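A minimal inference sketch with this prompt template is shown below. The generation settings are illustrative, and if the repository ships only a LoRA adapter rather than merged weights, it would instead need to be loaded on top of the base model with peft.

```python
# Minimal inference sketch using the prompt template above.
# Generation parameters are illustrative and not taken from this card.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ryanu/EEVE-10.8-BOOK-v0.1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

context = "요약할 원문 텍스트를 여기에 넣습니다."  # placeholder source text
prompt = (
    "다음 문장을 3~5문장으로 반복되는 구문없이 텍스트에 제시된 주요 논거를 간략하게 요약해줘.\n\n"
    f"문장: {context}\n\n"
    "요약: "
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=256, do_sample=False)
# Decode only the newly generated tokens (the summary).
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```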