
cs_m2m_0.01_50_v0.2

This model is a fine-tuned version of facebook/m2m100_1.2B; the fine-tuning dataset is not specified. It achieves the following results on the evaluation set after the final epoch (epoch 50, step 300):

  • Loss: 6.9335
  • Bleu: 0.0
  • Gen Len: 5.0

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.01
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 50
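
As a rough illustration of what a linear scheduler does with these settings, the sketch below computes the learning rate at a given optimizer step. It assumes 300 total steps (the last step in the training log) and zero warmup steps; warmup is not listed in this card, so that value is an assumption. The function name `linear_lr` is hypothetical and is not part of any library.

```python
def linear_lr(step, base_lr=0.01, total_steps=300, warmup_steps=0):
    """Learning rate after `step` optimizer updates under linear decay.

    base_lr and total_steps come from this card (learning_rate: 0.01,
    final step 300). warmup_steps=0 is an assumption: the card lists none.
    """
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Linear decay from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    # Start of training, end of epoch 1, midpoint, and end of training.
    for step in (0, 6, 150, 300):
        print(step, linear_lr(step))
```

Under these assumptions the rate starts at 0.01, is about half that by step 150, and reaches zero at step 300.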

Training results

| Training Loss | Epoch | Step | Validation Loss | Bleu | Gen Len |
|---|---|---|---|---|---|
| 16.8785 | 1.0 | 6 | 18.9290 | 0.0 | 200.0 |
| 14.9532 | 2.0 | 12 | 16.0041 | 0.0 | 2.0 |
| 9.473 | 3.0 | 18 | 11.4783 | 0.0 | 200.0 |
| 9.3819 | 4.0 | 24 | 11.3159 | 0.0 | 200.0 |
| 8.2604 | 5.0 | 30 | 10.0462 | 0.0 | 2.0 |
| 7.2006 | 6.0 | 36 | 9.0670 | 0.1706 | 6.0 |
| 7.1892 | 7.0 | 42 | 8.6881 | 0.0 | 200.0 |
| 5.8213 | 8.0 | 48 | 8.4889 | 0.0 | 6.0 |
| 6.088 | 9.0 | 54 | 8.1473 | 0.0 | 2.0 |
| 5.4934 | 10.0 | 60 | 7.8530 | 0.0 | 6.0 |
| 5.3899 | 11.0 | 66 | 7.6030 | 0.0 | 3.0 |
| 5.755 | 12.0 | 72 | 7.2990 | 0.0 | 3.0 |
| 5.902 | 13.0 | 78 | 7.1387 | 0.0 | 6.0 |
| 5.1716 | 14.0 | 84 | 7.3531 | 0.0 | 3.0 |
| 5.384 | 15.0 | 90 | 7.4897 | 0.0 | 3.0 |
| 5.772 | 16.0 | 96 | 7.3353 | 0.0 | 3.0 |
| 6.0137 | 17.0 | 102 | 7.2840 | 0.0 | 6.0 |
| 5.3564 | 18.0 | 108 | 7.2226 | 0.0 | 4.0 |
| 4.6533 | 19.0 | 114 | 6.9603 | 0.0 | 3.0 |
| 5.2785 | 20.0 | 120 | 7.1881 | 0.0 | 4.0 |
| 6.822 | 21.0 | 126 | 7.1262 | 0.0 | 4.0 |
| 5.027 | 22.0 | 132 | 7.5066 | 0.0 | 200.0 |
| 5.2595 | 23.0 | 138 | 7.0461 | 0.0 | 4.0 |
| 5.7311 | 24.0 | 144 | 7.5675 | 0.0 | 200.0 |
| 5.19 | 25.0 | 150 | 6.9761 | 0.0 | 4.0 |
| 5.4136 | 26.0 | 156 | 7.0165 | 0.0 | 6.0 |
| 5.3953 | 27.0 | 162 | 7.0036 | 0.0 | 5.0 |
| 5.1609 | 28.0 | 168 | 7.2334 | 0.0 | 200.0 |
| 4.1589 | 29.0 | 174 | 6.8345 | 0.0 | 3.0 |
| 6.129 | 30.0 | 180 | 7.0334 | 0.0 | 4.0 |
| 3.9707 | 31.0 | 186 | 6.8262 | 0.0 | 4.0 |
| 4.851 | 32.0 | 192 | 6.7521 | 0.0 | 4.0 |
| 4.8473 | 33.0 | 198 | 6.8321 | 0.0 | 4.0 |
| 4.6168 | 34.0 | 204 | 6.8539 | 0.0 | 4.0 |
| 4.304 | 35.0 | 210 | 6.9346 | 0.0 | 4.0 |
| 5.0315 | 36.0 | 216 | 7.0995 | 0.0 | 132.0 |
| 4.5656 | 37.0 | 222 | 6.9738 | 0.0 | 4.0 |
| 4.3283 | 38.0 | 228 | 6.8871 | 0.0 | 4.0 |
| 4.8156 | 39.0 | 234 | 6.9938 | 0.0 | 4.0 |
| 4.6101 | 40.0 | 240 | 7.0034 | 0.0 | 5.0 |
| 5.1564 | 41.0 | 246 | 6.9462 | 0.0 | 5.0 |
| 4.432 | 42.0 | 252 | 7.0158 | 0.0 | 4.0 |
| 5.0996 | 43.0 | 258 | 7.0378 | 0.0 | 5.0 |
| 4.3684 | 44.0 | 264 | 6.9261 | 0.0 | 5.0 |
| 5.2601 | 45.0 | 270 | 6.9520 | 0.0169 | 200.0 |
| 4.4939 | 46.0 | 276 | 6.9559 | 0.0 | 5.0 |
| 4.7493 | 47.0 | 282 | 6.9144 | 0.0 | 5.0 |
| 4.615 | 48.0 | 288 | 6.9272 | 0.0 | 5.0 |
| 5.5171 | 49.0 | 294 | 6.9316 | 0.0 | 5.0 |
| 5.077 | 50.0 | 300 | 6.9335 | 0.0 | 5.0 |
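
The training log above can be read off programmatically. The sketch below copies the validation losses from the log, finds the epoch with the lowest value, and estimates how many training examples are consistent with 6 steps per epoch at a batch size of 16. The size estimate assumes a single device, no gradient accumulation, and that the last partial batch is kept; none of these are stated in this card.

```python
# Validation loss per epoch (epochs 1..50), copied from the training log.
val_loss = [
    18.9290, 16.0041, 11.4783, 11.3159, 10.0462, 9.0670, 8.6881, 8.4889, 8.1473, 7.8530,
    7.6030, 7.2990, 7.1387, 7.3531, 7.4897, 7.3353, 7.2840, 7.2226, 6.9603, 7.1881,
    7.1262, 7.5066, 7.0461, 7.5675, 6.9761, 7.0165, 7.0036, 7.2334, 6.8345, 7.0334,
    6.8262, 6.7521, 6.8321, 6.8539, 6.9346, 7.0995, 6.9738, 6.8871, 6.9938, 7.0034,
    6.9462, 7.0158, 7.0378, 6.9261, 6.9520, 6.9559, 6.9144, 6.9272, 6.9316, 6.9335,
]

# Epoch (1-indexed) with the lowest validation loss.
best_epoch = min(range(len(val_loss)), key=val_loss.__getitem__) + 1

# 300 steps over 50 epochs gives the number of optimizer steps per epoch.
steps_per_epoch = 300 // 50
train_batch_size = 16

# Assumption: steps_per_epoch == ceil(num_examples / train_batch_size),
# i.e. one device, no gradient accumulation, last partial batch kept.
min_examples = (steps_per_epoch - 1) * train_batch_size + 1
max_examples = steps_per_epoch * train_batch_size

if __name__ == "__main__":
    print("best epoch:", best_epoch, "val loss:", val_loss[best_epoch - 1])
    print("steps per epoch:", steps_per_epoch)
    print("training examples (estimated range):", min_examples, "to", max_examples)
```

The lowest validation loss in the log is 6.7521 at epoch 32, slightly below the final value of 6.9335. If the stated assumptions hold, the training set contained between 81 and 96 examples.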

Framework versions

  • Transformers 4.35.2
  • Pytorch 1.13.1+cu117
  • Datasets 2.16.1
  • Tokenizers 0.15.0

Model size: 1.24B params (Safetensors, tensor type F32)