nllb-200-distilled-600M-Mal-Tami
This model is a fine-tuned version of facebook/nllb-200-distilled-600M. The training dataset is not specified. It achieves the following results on the evaluation set:
- Loss: 0.7611
- Bleu: 37.9665
- Rouge: rouge1 0.3274, rouge2 0.1854, rougeL 0.3250, rougeLsum 0.3259
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 10
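To make the linear schedule concrete, here is a minimal sketch of how the learning rate would decay over this run. It assumes no warmup, since none is listed among the hyperparameters, and takes the 7340 total steps (734 per epoch) from the training results table. The function name `linear_lr` is illustrative and not part of any library.

```python
def linear_lr(step, base_lr=2e-05, total_steps=7340, warmup_steps=0):
    """Learning rate at `step`: linear warmup, then linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    # Assumption: warmup_steps=0, so decay starts from base_lr at step 0.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


for epoch in (0, 5, 10):
    step = epoch * 734  # 734 optimizer steps per epoch, per the results table
    print(f"epoch {epoch:>2}: lr = {linear_lr(step):.2e}")
```

Under these assumptions the rate is halved by the end of epoch 5 and reaches zero at the final step, which is consistent with the shrinking per-epoch gains later in training.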
Training results
Training Loss | Epoch | Step | Validation Loss | Bleu | Rouge |
---|---|---|---|---|---|
1.1388 | 1.0 | 734 | 0.8947 | 33.7987 | {'rouge1': 0.32849704791121687, 'rouge2': 0.18672956891173587, 'rougeL': 0.32647539033778816, 'rougeLsum': 0.32688430438771043} |
0.898 | 2.0 | 1468 | 0.8281 | 35.3540 | {'rouge1': 0.32674790770839823, 'rouge2': 0.18650128486750617, 'rougeL': 0.32482214840866075, 'rougeLsum': 0.32547267295223703} |
0.8046 | 3.0 | 2202 | 0.7962 | 36.1416 | {'rouge1': 0.32674790770839823, 'rouge2': 0.1857352651103779, 'rougeL': 0.3244104763689233, 'rougeLsum': 0.3251467537911681} |
0.7448 | 4.0 | 2936 | 0.7805 | 36.7560 | {'rouge1': 0.32674790770839823, 'rouge2': 0.1858005621530716, 'rougeL': 0.32443967060792683, 'rougeLsum': 0.3251613509106698} |
0.6985 | 5.0 | 3670 | 0.7696 | 37.3576 | {'rouge1': 0.32614806855270073, 'rouge2': 0.18527678135494344, 'rougeL': 0.32362185574038427, 'rougeLsum': 0.3242525469535007} |
0.663 | 6.0 | 4404 | 0.7661 | 37.8431 | {'rouge1': 0.3275909830405743, 'rouge2': 0.18575262957184815, 'rougeL': 0.32552891023599473, 'rougeLsum': 0.32626719518749486} |
0.636 | 7.0 | 5138 | 0.7639 | 37.8764 | {'rouge1': 0.32740041520695484, 'rouge2': 0.18505379438127528, 'rougeL': 0.3249628569281158, 'rougeLsum': 0.3257682636769831} |
0.6176 | 8.0 | 5872 | 0.7606 | 37.9621 | {'rouge1': 0.3274220405691797, 'rouge2': 0.18538283587780377, 'rougeL': 0.3249736696092282, 'rougeLsum': 0.3257829175063507} |
0.6047 | 9.0 | 6606 | 0.7605 | 37.9538 | {'rouge1': 0.32743825959084827, 'rouge2': 0.185409074130288, 'rougeL': 0.32502232667423403, 'rougeLsum': 0.32586316574736196} |
0.5969 | 10.0 | 7340 | 0.7611 | 37.9665 | {'rouge1': 0.32743825959084827, 'rouge2': 0.185409074130288, 'rougeL': 0.32502232667423403, 'rougeLsum': 0.32586316574736196} |
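Validation loss and BLEU do not peak at the same epoch in this table, so which checkpoint counts as best depends on the metric. The sketch below copies the validation loss and BLEU columns from the table and picks the best epoch under each criterion. The variable names are illustrative.

```python
# (epoch, validation_loss, bleu) copied from the results table
results = [
    (1, 0.8947, 33.7987),
    (2, 0.8281, 35.3540),
    (3, 0.7962, 36.1416),
    (4, 0.7805, 36.7560),
    (5, 0.7696, 37.3576),
    (6, 0.7661, 37.8431),
    (7, 0.7639, 37.8764),
    (8, 0.7606, 37.9621),
    (9, 0.7605, 37.9538),
    (10, 0.7611, 37.9665),
]

best_by_loss = min(results, key=lambda r: r[1])  # lowest validation loss
best_by_bleu = max(results, key=lambda r: r[2])  # highest BLEU

print("best epoch by validation loss:", best_by_loss[0])
print("best epoch by BLEU:", best_by_bleu[0])
```

Epoch 9 has the lowest validation loss and epoch 10 the highest BLEU, but the gaps are small (0.0006 in loss, about 0.013 BLEU), which suggests the run had largely plateaued by epoch 8.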
Framework versions
- Transformers 4.29.2
- Pytorch 2.0.1+cu117
- Datasets 2.16.0
- Tokenizers 0.13.3