	# ALERT!!!

	[leadawon/jeju-ko-nmt-v6](https://huggingface.co/leadawon/jeju-ko-nmt-v6) is better than leadawon/jeju-ko-nmt-v8

	Version 6 performs better!


	# tag

	---
	tags:
	- generated_from_trainer
	model-index:
	- name: jeju-ko-nmt-v8
	  results: []
	---

	<!-- This model card has been generated automatically according to the information the Trainer had access to. You
	should probably proofread and complete it, then remove this comment. -->

	# jeju-ko-nmt-v8

	This model is a fine-tuned version of [leadawon/jeju-ko-nmt-v7](https://huggingface.co/leadawon/jeju-ko-nmt-v7) on an unknown dataset.
	It achieves the following results on the evaluation set:
	- Loss: 0.2448

	## Model description

	More information needed

	## Intended uses & limitations

	More information needed

	## Training and evaluation data

	More information needed

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 5e-06
	- train_batch_size: 24
	- eval_batch_size: 24
	- seed: 42
	- gradient_accumulation_steps: 4
	- total_train_batch_size: 96
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: linear
	- lr_scheduler_warmup_ratio: 0.1
	- num_epochs: 2
	- mixed_precision_training: Native AMP
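The relationship between these values can be checked with a short sketch. It assumes a single training device (consistent with 24 × 4 = 96) and a linear schedule that warms up over the first 10% of steps and then decays to zero. The function names and the `total_steps=25_000` figure are illustrative assumptions, not values taken from the training script.

```python
import math


def effective_batch_size(per_device: int, accumulation_steps: int, num_devices: int = 1) -> int:
    """Number of examples that contribute to one optimizer update."""
    return per_device * accumulation_steps * num_devices


def linear_warmup_lr(step: int, total_steps: int, peak_lr: float = 5e-6,
                     warmup_ratio: float = 0.1) -> float:
    """Learning rate at `step` for linear warmup followed by linear decay to zero."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Warmup phase: ramp linearly from 0 up to peak_lr.
        return peak_lr * step / max(1, warmup_steps)
    # Decay phase: ramp linearly from peak_lr down to 0 at total_steps.
    remaining = total_steps - step
    return peak_lr * max(0.0, remaining / max(1, total_steps - warmup_steps))


if __name__ == "__main__":
    # train_batch_size=24 and gradient_accumulation_steps=4 come from the list above.
    print(effective_batch_size(24, 4))
    # total_steps=25_000 is a rounded placeholder (assumption), not a logged value.
    for step in (0, 1_250, 2_500, 12_500, 25_000):
        print(step, linear_warmup_lr(step, total_steps=25_000))
```

Under these assumptions the learning rate peaks at 5e-06 once warmup ends and falls back to zero by the final step.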

	### Training results

	| Training Loss | Epoch | Step | Validation Loss |
	|:-------------:|:-----:|:-----:|:---------------:|
	| 0.2684 | 0.04 | 500 | 0.2568 |
	| 0.2468 | 0.08 | 1000 | 0.2547 |
	| 0.2167 | 0.12 | 1500 | 0.2540 |
	| 0.1966 | 0.16 | 2000 | 0.2535 |
	| 0.1846 | 0.2 | 2500 | 0.2533 |
	| 0.1727 | 0.24 | 3000 | 0.2535 |
	| 0.1746 | 0.28 | 3500 | 0.2522 |
	| 0.1726 | 0.32 | 4000 | 0.2521 |
	| 0.1722 | 0.36 | 4500 | 0.2519 |
	| 0.1731 | 0.4 | 5000 | 0.2515 |
	| 0.1701 | 0.44 | 5500 | 0.2518 |
	| 0.168 | 0.48 | 6000 | 0.2515 |
	| 0.1706 | 0.52 | 6500 | 0.2509 |
	| 0.1659 | 0.56 | 7000 | 0.2514 |
	| 0.1702 | 0.6 | 7500 | 0.2509 |
	| 0.1667 | 0.64 | 8000 | 0.2510 |
	| 0.1661 | 0.68 | 8500 | 0.2508 |
	| 0.1647 | 0.72 | 9000 | 0.2510 |
	| 0.1632 | 0.76 | 9500 | 0.2510 |
	| 0.1655 | 0.8 | 10000 | 0.2506 |
	| 0.1645 | 0.84 | 10500 | 0.2508 |
	| 0.1617 | 0.88 | 11000 | 0.2508 |
	| 0.1627 | 0.91 | 11500 | 0.2511 |
	| 0.2764 | 0.95 | 12000 | 0.2478 |
	| 0.2755 | 0.99 | 12500 | 0.2462 |
	| 0.2275 | 1.03 | 13000 | 0.2464 |
	| 0.2201 | 1.07 | 13500 | 0.2463 |
	| 0.2207 | 1.11 | 14000 | 0.2463 |
	| 0.2202 | 1.15 | 14500 | 0.2462 |
	| 0.2194 | 1.19 | 15000 | 0.2460 |
	| 0.2177 | 1.23 | 15500 | 0.2461 |
	| 0.2187 | 1.27 | 16000 | 0.2460 |
	| 0.2184 | 1.31 | 16500 | 0.2459 |
	| 0.2182 | 1.35 | 17000 | 0.2457 |
	| 0.219 | 1.39 | 17500 | 0.2458 |
	| 0.2206 | 1.43 | 18000 | 0.2455 |
	| 0.2211 | 1.47 | 18500 | 0.2455 |
	| 0.2164 | 1.51 | 19000 | 0.2455 |
	| 0.2202 | 1.55 | 19500 | 0.2454 |
	| 0.2208 | 1.59 | 20000 | 0.2452 |
	| 0.2208 | 1.63 | 20500 | 0.2450 |
	| 0.2204 | 1.67 | 21000 | 0.2450 |
	| 0.2193 | 1.71 | 21500 | 0.2450 |
	| 0.221 | 1.75 | 22000 | 0.2451 |
	| 0.2168 | 1.79 | 22500 | 0.2450 |
	| 0.2169 | 1.83 | 23000 | 0.2449 |
	| 0.218 | 1.87 | 23500 | 0.2449 |
	| 0.2196 | 1.91 | 24000 | 0.2449 |
	| 0.2218 | 1.95 | 24500 | 0.2448 |
	| 0.2199 | 1.99 | 25000 | 0.2448 |
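One way to read the table is to pick the row with the lowest validation loss and measure how far it moved from the first evaluation. The sketch below does that on a handful of rows copied from the table. The helper names are hypothetical, and the subset of rows is chosen only for illustration.

```python
# (step, training_loss, validation_loss) rows copied from the table above.
rows = [
    (500, 0.2684, 0.2568),
    (11500, 0.1627, 0.2511),
    (12000, 0.2764, 0.2478),
    (12500, 0.2755, 0.2462),
    (25000, 0.2199, 0.2448),
]


def best_checkpoint(rows):
    """Return the row with the lowest validation loss (first one on ties)."""
    return min(rows, key=lambda row: row[2])


def relative_improvement(first_loss: float, last_loss: float) -> float:
    """Fractional drop in loss from the first to the last evaluation."""
    return (first_loss - last_loss) / first_loss


if __name__ == "__main__":
    step, train_loss, val_loss = best_checkpoint(rows)
    print(f"best step in this subset: {step} (validation loss {val_loss})")
    drop = relative_improvement(rows[0][2], rows[-1][2])
    print(f"validation loss improvement: {drop:.1%}")
```

Validation loss only moves from 0.2568 to 0.2448 over the whole run, while training loss jumps from 0.1627 to 0.2764 between steps 11500 and 12000. The card does not say what caused that jump.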


	### Framework versions

	- Transformers 4.26.0
	- PyTorch 1.13.1+cu116
	- Tokenizers 0.13.2
