---
language:
  - ko
  - en
base_model: facebook/mbart-large-50-many-to-many-mmt
tags:
  - generated_from_trainer
metrics:
  - bleu
model-index:
  - name: ko-en_mbartLarge_exp20p
    results: []
---

# ko-en_mbartLarge_exp20p

This model is a fine-tuned version of facebook/mbart-large-50-many-to-many-mmt on an unknown dataset. It achieves the following results on the evaluation set (an inference sketch follows the list):

- Loss: 1.1451
- Bleu: 28.9507
- Gen Len: 18.6702
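
For quick reference, here is a minimal Korean-to-English inference sketch using the standard mBART-50 API. The repo id `yesj1234/ko-en_mbartLarge_exp20p` is an assumption inferred from the uploader and model name, not confirmed by the card; adjust it to the actual Hub path.

```python
from transformers import MBart50TokenizerFast, MBartForConditionalGeneration

# Assumed Hub path (uploader/model name); adjust if the checkpoint lives elsewhere.
model_id = "yesj1234/ko-en_mbartLarge_exp20p"

tokenizer = MBart50TokenizerFast.from_pretrained(model_id, src_lang="ko_KR")
model = MBartForConditionalGeneration.from_pretrained(model_id)

text = "오늘 날씨가 정말 좋네요."  # "The weather is really nice today."
inputs = tokenizer(text, return_tensors="pt")

# mBART-50 requires forcing the target language as the first generated token.
generated = model.generate(
    **inputs,
    forced_bos_token_id=tokenizer.lang_code_to_id["en_XX"],
    max_length=64,
)
print(tokenizer.batch_decode(generated, skip_special_tokens=True)[0])
```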

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a training-arguments sketch follows the list):

- learning_rate: 5e-05
- train_batch_size: 4
- eval_batch_size: 4
- seed: 42
- distributed_type: multi-GPU
- num_devices: 4
- gradient_accumulation_steps: 2
- total_train_batch_size: 32
- total_eval_batch_size: 16
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine_with_restarts
- lr_scheduler_warmup_steps: 2000
- num_epochs: 40
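
These settings map onto `Seq2SeqTrainingArguments` roughly as below. This is a hedged reconstruction, not the author's actual training script: the `output_dir` name is a placeholder, and the Adam betas/epsilon listed above match the optimizer defaults, so they are not set explicitly.

```python
from transformers import Seq2SeqTrainingArguments

# Sketch of the arguments implied by the hyperparameter list above.
# Run on 4 GPUs: 4 devices x 4 per-device batch x 2 accumulation = 32 total train batch.
training_args = Seq2SeqTrainingArguments(
    output_dir="ko-en_mbartLarge_exp20p",  # placeholder
    learning_rate=5e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,          # 4 devices x 4 = 16 total eval batch
    gradient_accumulation_steps=2,
    num_train_epochs=40,
    lr_scheduler_type="cosine_with_restarts",
    warmup_steps=2000,
    seed=42,
    predict_with_generate=True,            # needed for the Bleu / Gen Len eval columns
)
```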

### Training results

| Training Loss | Epoch | Step  | Validation Loss | Bleu    | Gen Len |
|:-------------:|:-----:|:-----:|:---------------:|:-------:|:-------:|
| 1.4008        | 0.46  | 4000  | 1.3739          | 22.7174 | 18.7094 |
| 1.2847        | 0.93  | 8000  | 1.2652          | 24.8557 | 18.7254 |
| 1.2009        | 1.39  | 12000 | 1.2082          | 26.2074 | 18.7513 |
| 1.1686        | 1.86  | 16000 | 1.1841          | 26.304  | 19.161  |
| 1.0205        | 2.32  | 20000 | 1.1441          | 27.8937 | 18.6638 |
| 1.0217        | 2.78  | 24000 | 1.1301          | 28.4149 | 18.6666 |
| 0.8876        | 3.25  | 28000 | 1.1270          | 28.5803 | 18.6229 |
| 0.9024        | 3.71  | 32000 | 1.1181          | 28.852  | 18.7813 |
| 0.7927        | 4.18  | 36000 | 1.1393          | 28.3975 | 18.4863 |
| 0.8174        | 4.64  | 40000 | 1.1249          | 28.6313 | 18.3916 |
| 0.7434        | 5.11  | 44000 | 1.1696          | 28.2898 | 18.7739 |
| 0.7416        | 5.57  | 48000 | 1.1451          | 28.9507 | 18.6744 |
| 0.689         | 6.03  | 52000 | 1.1759          | 28.3532 | 18.4481 |
| 0.7238        | 6.5   | 56000 | 1.1825          | 28.3827 | 18.7038 |
| 0.7238        | 6.96  | 60000 | 1.1676          | 28.8248 | 18.5073 |
| 0.657         | 7.43  | 64000 | 1.2514          | 27.4378 | 18.4196 |
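
The card does not state which BLEU implementation produced the scores above; cards generated from the Trainer commonly compute it with sacrebleu through the `evaluate` library, roughly as in this sketch (the example strings are illustrative only):

```python
import evaluate

# sacrebleu takes plain prediction strings and a list of references per prediction.
bleu = evaluate.load("sacrebleu")
predictions = ["The weather is really nice today."]
references = [["The weather is very nice today."]]
result = bleu.compute(predictions=predictions, references=references)
print(round(result["score"], 4))  # corpus-level BLEU, same 0-100 scale as the table
```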

### Framework versions

- Transformers 4.34.0
- Pytorch 2.1.0+cu121
- Datasets 2.14.5
- Tokenizers 0.14.1