---
license: mit
base_model: facebook/m2m100_1.2B
tags:
- generated_from_trainer
metrics:
- bleu
model-index:
- name: cs_m2m_0.0001_100_v0.2
  results: []
---

# cs_m2m_0.0001_100_v0.2

This model is a fine-tuned version of [facebook/m2m100_1.2B](https://huggingface.co/facebook/m2m100_1.2B) on an unknown dataset.
It achieves the following results on the evaluation set:
- Loss: 8.4496
- Bleu: 0.0928
- Gen Len: 62.0

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.0001
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 100

### Training results

| Training Loss | Epoch | Step | Validation Loss | Bleu   | Gen Len  |
|:-------------:|:-----:|:----:|:---------------:|:------:|:--------:|
| 3.1218        | 1.0   | 6    | 8.4336          | 0.0372 | 115.8571 |
| 1.7719        | 2.0   | 12   | 8.4226          | 0.0454 | 83.1429  |
| 2.2391        | 3.0   | 18   | 8.3857          | 0.0595 | 67.8571  |
| 3.3595        | 4.0   | 24   | 8.3587          | 0.117  | 59.1429  |
| 3.2809        | 5.0   | 30   | 8.3475          | 0.0806 | 70.4286  |
| 2.5704        | 6.0   | 36   | 8.3259          | 0.1683 | 69.8095  |
| 3.8725        | 7.0   | 42   | 8.3405          | 0.0339 | 109.9048 |
| 2.9887        | 8.0   | 48   | 8.3686          | 0.0447 | 91.1905  |
| 2.9363        | 9.0   | 54   | 8.3856          | 0.0547 | 80.5238  |
| 2.3718        | 10.0  | 60   | 8.3621          | 0.0594 | 66.619   |
| 2.977         | 11.0  | 66   | 8.3563          | 0.0356 | 107.1905 |
| 2.4379        | 12.0  | 72   | 8.3682          | 0.0266 | 150.619  |
| 1.9983        | 13.0  | 78   | 8.3733          | 0.0655 | 96.619   |
| 2.5183        | 14.0  | 84   | 8.3767          | 0.0417 | 92.1905  |
| 4.7446        | 15.0  | 90   | 8.3677          | 0.0457 | 81.1429  |
| 2.8195        | 16.0  | 96   | 8.3779          | 0.0467 | 81.381   |
| 3.1357        | 17.0  | 102  | 8.3751          | 0.0531 | 123.4762 |
| 3.1353        | 18.0  | 108  | 8.3707          | 0.1118 | 83.4286  |
| 2.2632        | 19.0  | 114  | 8.3813          | 0.1173 | 80.0476  |
| 1.7457        | 20.0  | 120  | 8.3786          | 0.1014 | 100.6667 |
| 1.991         | 21.0  | 126  | 8.3845          | 0.0937 | 60.381   |
| 3.1272        | 22.0  | 132  | 8.3823          | 0.0648 | 75.0     |
| 2.5017        | 23.0  | 138  | 8.3882          | 0.1951 | 41.7619  |
| 3.1988        | 24.0  | 144  | 8.3901          | 0.2921 | 17.381   |
| 2.0247        | 25.0  | 150  | 8.3950          | 0.0929 | 50.8095  |
| 2.8855        | 26.0  | 156  | 8.4009          | 0.1452 | 37.8095  |
| 1.8024        | 27.0  | 162  | 8.3844          | 0.0439 | 95.2381  |
| 4.727         | 28.0  | 168  | 8.3750          | 0.0352 | 106.8571 |
| 2.3243        | 29.0  | 174  | 8.3736          | 0.0344 | 123.619  |
| 2.4946        | 30.0  | 180  | 8.3908          | 0.1952 | 112.4286 |
| 3.2337        | 31.0  | 186  | 8.3960          | 0.2593 | 58.9048  |
| 3.1065        | 32.0  | 192  | 8.3937          | 0.3752 | 48.0952  |
| 3.3689        | 33.0  | 198  | 8.3855          | 0.3984 | 48.8571  |
| 2.51          | 34.0  | 204  | 8.3928          | 0.2597 | 53.7143  |
| 1.5195        | 35.0  | 210  | 8.3917          | 0.1361 | 74.7143  |
| 2.1133        | 36.0  | 216  | 8.3964          | 0.0702 | 78.4286  |
| 2.6349        | 37.0  | 222  | 8.3839          | 0.0477 | 103.4286 |
| 2.2733        | 38.0  | 228  | 8.3770          | 0.0746 | 77.381   |
| 3.0805        | 39.0  | 234  | 8.3773          | 0.1324 | 75.3333  |
| 3.1701        | 40.0  | 240  | 8.3853          | 0.0776 | 75.8571  |
| 2.5676        | 41.0  | 246  | 8.3988          | 0.1274 | 76.7619  |
| 5.1543        | 42.0  | 252  | 8.4117          | 0.0381 | 110.2857 |
| 2.4138        | 43.0  | 258  | 8.4101          | 0.0472 | 92.619   |
| 2.6           | 44.0  | 264  | 8.3991          | 0.0422 | 102.0    |
| 5.2608        | 45.0  | 270  | 8.3912          | 0.0602 | 84.4762  |
| 2.6492        | 46.0  | 276  | 8.3918          | 0.0667 | 80.6667  |
| 2.5329        | 47.0  | 282  | 8.3901          | 0.1159 | 42.2857  |
| 2.894         | 48.0  | 288  | 8.3936          | 0.1352 | 46.381   |
| 2.6136        | 49.0  | 294  | 8.3959          | 0.1059 | 45.4286  |
| 3.2249        | 50.0  | 300  | 8.3954          | 0.246  | 46.1429  |
| 2.8511        | 51.0  | 306  | 8.3923          | 0.1572 | 52.8571  |
| 2.7592        | 52.0  | 312  | 8.3875          | 0.1112 | 62.1429  |
| 2.37          | 53.0  | 318  | 8.3839          | 0.0926 | 67.3333  |
| 3.1555        | 54.0  | 324  | 8.3989          | 0.0855 | 71.2381  |
| 2.723         | 55.0  | 330  | 8.4030          | 0.0756 | 78.4286  |
| 2.498         | 56.0  | 336  | 8.4131          | 0.3874 | 74.9048  |
| 2.6088        | 57.0  | 342  | 8.4278          | 0.118  | 83.7143  |
| 2.1392        | 58.0  | 348  | 8.4388          | 0.3423 | 80.381   |
| 2.8988        | 59.0  | 354  | 8.4506          | 0.0844 | 73.9048  |
| 2.2013        | 60.0  | 360  | 8.4596          | 0.0892 | 70.1429  |
| 2.2335        | 61.0  | 366  | 8.4694          | 0.1165 | 59.4762  |
| 3.306         | 62.0  | 372  | 8.4838          | 0.1685 | 49.4762  |
| 3.0362        | 63.0  | 378  | 8.4894          | 0.1189 | 56.1905  |
| 3.0111        | 64.0  | 384  | 8.4909          | 0.0926 | 66.5714  |
| 2.802         | 65.0  | 390  | 8.4956          | 0.0906 | 66.0     |
| 2.4222        | 66.0  | 396  | 8.4917          | 0.0742 | 72.381   |
| 2.8748        | 67.0  | 402  | 8.4870          | 0.0704 | 76.0952  |
| 2.7946        | 68.0  | 408  | 8.4823          | 0.0572 | 84.2381  |
| 2.7195        | 69.0  | 414  | 8.4714          | 0.0573 | 84.2381  |
| 2.487         | 70.0  | 420  | 8.4640          | 0.0578 | 83.3333  |
| 1.5811        | 71.0  | 426  | 8.4632          | 0.0516 | 91.381   |
| 2.7705        | 72.0  | 432  | 8.4618          | 0.0597 | 80.619   |
| 2.3703        | 73.0  | 438  | 8.4622          | 0.0598 | 80.619   |
| 2.4037        | 74.0  | 444  | 8.4618          | 0.0906 | 66.2381  |
| 2.3173        | 75.0  | 450  | 8.4579          | 0.0926 | 63.381   |
| 1.8697        | 76.0  | 456  | 8.4564          | 0.0942 | 62.5238  |
| 1.8887        | 77.0  | 462  | 8.4554          | 0.0979 | 62.6667  |
| 3.84          | 78.0  | 468  | 8.4590          | 0.077  | 70.1429  |
| 2.388         | 79.0  | 474  | 8.4654          | 0.0735 | 71.2381  |
| 2.591         | 80.0  | 480  | 8.4685          | 0.075  | 70.9048  |
| 2.7345        | 81.0  | 486  | 8.4665          | 0.0791 | 52.5238  |
| 2.7887        | 82.0  | 492  | 8.4669          | 0.0759 | 70.2381  |
| 2.5452        | 83.0  | 498  | 8.4675          | 0.0764 | 70.8095  |
| 2.7554        | 84.0  | 504  | 8.4693          | 0.096  | 53.9524  |
| 4.2388        | 85.0  | 510  | 8.4656          | 0.0939 | 62.8571  |
| 2.361         | 86.0  | 516  | 8.4612          | 0.0923 | 63.9524  |
| 1.912         | 87.0  | 522  | 8.4569          | 0.0916 | 62.5714  |
| 2.2787        | 88.0  | 528  | 8.4524          | 0.0942 | 63.2857  |
| 1.9425        | 89.0  | 534  | 8.4530          | 0.0942 | 62.0952  |
| 2.7257        | 90.0  | 540  | 8.4545          | 0.0967 | 61.381   |
| 1.9149        | 91.0  | 546  | 8.4552          | 0.0959 | 61.8095  |
| 2.507         | 92.0  | 552  | 8.4546          | 0.0936 | 63.1429  |
| 2.8124        | 93.0  | 558  | 8.4547          | 0.0947 | 63.2857  |
| 2.3852        | 94.0  | 564  | 8.4527          | 0.0955 | 62.8571  |
| 1.7975        | 95.0  | 570  | 8.4528          | 0.0947 | 63.2857  |
| 4.9651        | 96.0  | 576  | 8.4517          | 0.0922 | 62.4286  |
| 2.1141        | 97.0  | 582  | 8.4510          | 0.0928 | 62.0     |
| 2.6156        | 98.0  | 588  | 8.4502          | 0.0928 | 62.0     |
| 1.987         | 99.0  | 594  | 8.4498          | 0.0928 | 62.0     |
| 2.5299        | 100.0 | 600  | 8.4496          | 0.0928 | 62.0     |

### Framework versions

- Transformers 4.35.2
- Pytorch 1.13.1+cu117
- Datasets 2.16.1
- Tokenizers 0.15.0
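The hyperparameters listed under "Training hyperparameters" can be expressed as a `Seq2SeqTrainingArguments` configuration. The sketch below is a hypothetical reconstruction, not the actual training script (which is not published); the `output_dir`, `predict_with_generate`, and `evaluation_strategy` values are assumptions inferred from the per-epoch BLEU/Gen Len metrics in the results table.

```python
from transformers import Seq2SeqTrainingArguments

# Hypothetical reconstruction of the training configuration from the
# hyperparameters listed in this card; the real script may differ.
training_args = Seq2SeqTrainingArguments(
    output_dir="cs_m2m_0.0001_100_v0.2",  # assumption: named after the model
    learning_rate=1e-4,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=100,
    # Adam betas=(0.9, 0.999) and epsilon=1e-8 are the optimizer defaults.
    predict_with_generate=True,    # assumption: required to compute BLEU / Gen Len
    evaluation_strategy="epoch",   # assumption: metrics are logged once per epoch
)
```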