cs_m2m_0.0001_100_v0.2

This model is a fine-tuned version of facebook/m2m100_1.2B; the fine-tuning dataset is not documented. After the final training epoch (epoch 100) it achieves the following results on the evaluation set:

  • Loss: 8.4496
  • Bleu: 0.0928
  • Gen Len: 62.0

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 100
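
The Step column of the training results shows 6 optimizer steps per epoch, so 100 epochs amount to 600 steps. The following is a minimal sketch of how a linear schedule would move the learning rate over that run. It assumes zero warmup steps and decay to zero at the last step, neither of which the card states, and it re-implements the usual linear decay formula in plain Python rather than calling the Transformers scheduler. The names `linear_lr`, `STEPS_PER_EPOCH`, and `TOTAL_STEPS` are illustrative.

```python
# Sketch of a linear learning-rate schedule under the card's settings.
# Assumptions (not stated in the card): no warmup, decay to 0 at the last step.

BASE_LR = 1e-4          # learning_rate from the card
NUM_EPOCHS = 100        # num_epochs from the card
STEPS_PER_EPOCH = 6     # read off the Step column of the training results
TOTAL_STEPS = NUM_EPOCHS * STEPS_PER_EPOCH


def linear_lr(step, base_lr=BASE_LR, total_steps=TOTAL_STEPS, warmup_steps=0):
    """Learning rate after `step` optimizer steps with linear warmup and decay."""
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Linear decay from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    for epoch in (0, 25, 50, 75, 100):
        step = epoch * STEPS_PER_EPOCH
        print(f"epoch {epoch:3d}  step {step:3d}  lr {linear_lr(step):.2e}")
```

If the run used a single device and no gradient accumulation (the card does not say), 6 steps per epoch at a batch size of 16 would put the training set on the order of 100 examples.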

Training results

| Training Loss | Epoch | Step | Validation Loss | Bleu | Gen Len |
|---|---|---|---|---|---|
| 3.1218 | 1.0 | 6 | 8.4336 | 0.0372 | 115.8571 |
| 1.7719 | 2.0 | 12 | 8.4226 | 0.0454 | 83.1429 |
| 2.2391 | 3.0 | 18 | 8.3857 | 0.0595 | 67.8571 |
| 3.3595 | 4.0 | 24 | 8.3587 | 0.117 | 59.1429 |
| 3.2809 | 5.0 | 30 | 8.3475 | 0.0806 | 70.4286 |
| 2.5704 | 6.0 | 36 | 8.3259 | 0.1683 | 69.8095 |
| 3.8725 | 7.0 | 42 | 8.3405 | 0.0339 | 109.9048 |
| 2.9887 | 8.0 | 48 | 8.3686 | 0.0447 | 91.1905 |
| 2.9363 | 9.0 | 54 | 8.3856 | 0.0547 | 80.5238 |
| 2.3718 | 10.0 | 60 | 8.3621 | 0.0594 | 66.619 |
| 2.977 | 11.0 | 66 | 8.3563 | 0.0356 | 107.1905 |
| 2.4379 | 12.0 | 72 | 8.3682 | 0.0266 | 150.619 |
| 1.9983 | 13.0 | 78 | 8.3733 | 0.0655 | 96.619 |
| 2.5183 | 14.0 | 84 | 8.3767 | 0.0417 | 92.1905 |
| 4.7446 | 15.0 | 90 | 8.3677 | 0.0457 | 81.1429 |
| 2.8195 | 16.0 | 96 | 8.3779 | 0.0467 | 81.381 |
| 3.1357 | 17.0 | 102 | 8.3751 | 0.0531 | 123.4762 |
| 3.1353 | 18.0 | 108 | 8.3707 | 0.1118 | 83.4286 |
| 2.2632 | 19.0 | 114 | 8.3813 | 0.1173 | 80.0476 |
| 1.7457 | 20.0 | 120 | 8.3786 | 0.1014 | 100.6667 |
| 1.991 | 21.0 | 126 | 8.3845 | 0.0937 | 60.381 |
| 3.1272 | 22.0 | 132 | 8.3823 | 0.0648 | 75.0 |
| 2.5017 | 23.0 | 138 | 8.3882 | 0.1951 | 41.7619 |
| 3.1988 | 24.0 | 144 | 8.3901 | 0.2921 | 17.381 |
| 2.0247 | 25.0 | 150 | 8.3950 | 0.0929 | 50.8095 |
| 2.8855 | 26.0 | 156 | 8.4009 | 0.1452 | 37.8095 |
| 1.8024 | 27.0 | 162 | 8.3844 | 0.0439 | 95.2381 |
| 4.727 | 28.0 | 168 | 8.3750 | 0.0352 | 106.8571 |
| 2.3243 | 29.0 | 174 | 8.3736 | 0.0344 | 123.619 |
| 2.4946 | 30.0 | 180 | 8.3908 | 0.1952 | 112.4286 |
| 3.2337 | 31.0 | 186 | 8.3960 | 0.2593 | 58.9048 |
| 3.1065 | 32.0 | 192 | 8.3937 | 0.3752 | 48.0952 |
| 3.3689 | 33.0 | 198 | 8.3855 | 0.3984 | 48.8571 |
| 2.51 | 34.0 | 204 | 8.3928 | 0.2597 | 53.7143 |
| 1.5195 | 35.0 | 210 | 8.3917 | 0.1361 | 74.7143 |
| 2.1133 | 36.0 | 216 | 8.3964 | 0.0702 | 78.4286 |
| 2.6349 | 37.0 | 222 | 8.3839 | 0.0477 | 103.4286 |
| 2.2733 | 38.0 | 228 | 8.3770 | 0.0746 | 77.381 |
| 3.0805 | 39.0 | 234 | 8.3773 | 0.1324 | 75.3333 |
| 3.1701 | 40.0 | 240 | 8.3853 | 0.0776 | 75.8571 |
| 2.5676 | 41.0 | 246 | 8.3988 | 0.1274 | 76.7619 |
| 5.1543 | 42.0 | 252 | 8.4117 | 0.0381 | 110.2857 |
| 2.4138 | 43.0 | 258 | 8.4101 | 0.0472 | 92.619 |
| 2.6 | 44.0 | 264 | 8.3991 | 0.0422 | 102.0 |
| 5.2608 | 45.0 | 270 | 8.3912 | 0.0602 | 84.4762 |
| 2.6492 | 46.0 | 276 | 8.3918 | 0.0667 | 80.6667 |
| 2.5329 | 47.0 | 282 | 8.3901 | 0.1159 | 42.2857 |
| 2.894 | 48.0 | 288 | 8.3936 | 0.1352 | 46.381 |
| 2.6136 | 49.0 | 294 | 8.3959 | 0.1059 | 45.4286 |
| 3.2249 | 50.0 | 300 | 8.3954 | 0.246 | 46.1429 |
| 2.8511 | 51.0 | 306 | 8.3923 | 0.1572 | 52.8571 |
| 2.7592 | 52.0 | 312 | 8.3875 | 0.1112 | 62.1429 |
| 2.37 | 53.0 | 318 | 8.3839 | 0.0926 | 67.3333 |
| 3.1555 | 54.0 | 324 | 8.3989 | 0.0855 | 71.2381 |
| 2.723 | 55.0 | 330 | 8.4030 | 0.0756 | 78.4286 |
| 2.498 | 56.0 | 336 | 8.4131 | 0.3874 | 74.9048 |
| 2.6088 | 57.0 | 342 | 8.4278 | 0.118 | 83.7143 |
| 2.1392 | 58.0 | 348 | 8.4388 | 0.3423 | 80.381 |
| 2.8988 | 59.0 | 354 | 8.4506 | 0.0844 | 73.9048 |
| 2.2013 | 60.0 | 360 | 8.4596 | 0.0892 | 70.1429 |
| 2.2335 | 61.0 | 366 | 8.4694 | 0.1165 | 59.4762 |
| 3.306 | 62.0 | 372 | 8.4838 | 0.1685 | 49.4762 |
| 3.0362 | 63.0 | 378 | 8.4894 | 0.1189 | 56.1905 |
| 3.0111 | 64.0 | 384 | 8.4909 | 0.0926 | 66.5714 |
| 2.802 | 65.0 | 390 | 8.4956 | 0.0906 | 66.0 |
| 2.4222 | 66.0 | 396 | 8.4917 | 0.0742 | 72.381 |
| 2.8748 | 67.0 | 402 | 8.4870 | 0.0704 | 76.0952 |
| 2.7946 | 68.0 | 408 | 8.4823 | 0.0572 | 84.2381 |
| 2.7195 | 69.0 | 414 | 8.4714 | 0.0573 | 84.2381 |
| 2.487 | 70.0 | 420 | 8.4640 | 0.0578 | 83.3333 |
| 1.5811 | 71.0 | 426 | 8.4632 | 0.0516 | 91.381 |
| 2.7705 | 72.0 | 432 | 8.4618 | 0.0597 | 80.619 |
| 2.3703 | 73.0 | 438 | 8.4622 | 0.0598 | 80.619 |
| 2.4037 | 74.0 | 444 | 8.4618 | 0.0906 | 66.2381 |
| 2.3173 | 75.0 | 450 | 8.4579 | 0.0926 | 63.381 |
| 1.8697 | 76.0 | 456 | 8.4564 | 0.0942 | 62.5238 |
| 1.8887 | 77.0 | 462 | 8.4554 | 0.0979 | 62.6667 |
| 3.84 | 78.0 | 468 | 8.4590 | 0.077 | 70.1429 |
| 2.388 | 79.0 | 474 | 8.4654 | 0.0735 | 71.2381 |
| 2.591 | 80.0 | 480 | 8.4685 | 0.075 | 70.9048 |
| 2.7345 | 81.0 | 486 | 8.4665 | 0.0791 | 52.5238 |
| 2.7887 | 82.0 | 492 | 8.4669 | 0.0759 | 70.2381 |
| 2.5452 | 83.0 | 498 | 8.4675 | 0.0764 | 70.8095 |
| 2.7554 | 84.0 | 504 | 8.4693 | 0.096 | 53.9524 |
| 4.2388 | 85.0 | 510 | 8.4656 | 0.0939 | 62.8571 |
| 2.361 | 86.0 | 516 | 8.4612 | 0.0923 | 63.9524 |
| 1.912 | 87.0 | 522 | 8.4569 | 0.0916 | 62.5714 |
| 2.2787 | 88.0 | 528 | 8.4524 | 0.0942 | 63.2857 |
| 1.9425 | 89.0 | 534 | 8.4530 | 0.0942 | 62.0952 |
| 2.7257 | 90.0 | 540 | 8.4545 | 0.0967 | 61.381 |
| 1.9149 | 91.0 | 546 | 8.4552 | 0.0959 | 61.8095 |
| 2.507 | 92.0 | 552 | 8.4546 | 0.0936 | 63.1429 |
| 2.8124 | 93.0 | 558 | 8.4547 | 0.0947 | 63.2857 |
| 2.3852 | 94.0 | 564 | 8.4527 | 0.0955 | 62.8571 |
| 1.7975 | 95.0 | 570 | 8.4528 | 0.0947 | 63.2857 |
| 4.9651 | 96.0 | 576 | 8.4517 | 0.0922 | 62.4286 |
| 2.1141 | 97.0 | 582 | 8.4510 | 0.0928 | 62.0 |
| 2.6156 | 98.0 | 588 | 8.4502 | 0.0928 | 62.0 |
| 1.987 | 99.0 | 594 | 8.4498 | 0.0928 | 62.0 |
| 2.5299 | 100.0 | 600 | 8.4496 | 0.0928 | 62.0 |
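
Neither metric is at its best in the final row: the lowest validation loss (8.3259) is logged at epoch 6 and the highest Bleu (0.3984) at epoch 33. A minimal sketch of ranking epochs by either column, assuming the rows are available as text; `ROWS`, `FIELDS`, and `parse_rows` are illustrative names, and `ROWS` holds only four rows copied from the results above.

```python
# Sketch: pick the best epochs out of rows copied from the training results.
# ROWS is an excerpt (epochs 1, 6, 33, 100); paste the full table to rank every epoch.

ROWS = """\
3.1218 1.0 6 8.4336 0.0372 115.8571
2.5704 6.0 36 8.3259 0.1683 69.8095
3.3689 33.0 198 8.3855 0.3984 48.8571
2.5299 100.0 600 8.4496 0.0928 62.0
"""

FIELDS = ("train_loss", "epoch", "step", "val_loss", "bleu", "gen_len")


def parse_rows(text):
    """Turn table rows into dicts of floats keyed by FIELDS."""
    records = []
    for line in text.splitlines():
        # Accept pipe-delimited rows as well as whitespace-separated ones.
        values = line.replace("|", " ").split()
        if len(values) != len(FIELDS):
            continue  # blank line or header with multi-word column names
        try:
            records.append(dict(zip(FIELDS, map(float, values))))
        except ValueError:
            continue  # non-numeric line such as a table separator
    return records


if __name__ == "__main__":
    records = parse_rows(ROWS)
    best_loss = min(records, key=lambda r: r["val_loss"])
    best_bleu = max(records, key=lambda r: r["bleu"])
    print(f"lowest validation loss: epoch {best_loss['epoch']:.0f} ({best_loss['val_loss']})")
    print(f"highest Bleu: epoch {best_bleu['epoch']:.0f} ({best_bleu['bleu']})")
```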

Framework versions

  • Transformers 4.35.2
  • PyTorch 1.13.1+cu117
  • Datasets 2.16.1
  • Tokenizers 0.15.0
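
To check whether a local environment matches these versions, something like the following could be used. It is a sketch built on the standard-library `importlib.metadata`; the distribution names in `EXPECTED` (for example `torch` for PyTorch) are assumptions, since the card lists display names only, and `installed_version` and `version_mismatches` are illustrative helpers.

```python
# Sketch: compare installed package versions with the ones listed in the card.
from importlib import metadata

# Distribution names are an assumption; the card gives display names only.
EXPECTED = {
    "transformers": "4.35.2",
    "torch": "1.13.1+cu117",
    "datasets": "2.16.1",
    "tokenizers": "0.15.0",
}


def installed_version(name):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def version_mismatches(expected=EXPECTED, lookup=installed_version):
    """Map package name -> (expected, found) for every package that differs."""
    mismatches = {}
    for name, wanted in expected.items():
        found = lookup(name)
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    for name, (wanted, found) in version_mismatches().items():
        print(f"{name}: card lists {wanted}, found {found or 'not installed'}")
```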

Model size: 1.24B params (Safetensors, tensor type F32)

Model tree: kmok1/cs_m2m_0.0001_100_v0.2 is fine-tuned from facebook/m2m100_1.2B.