m2m100_1.2B_ft_ru-kbd_63K

This model is a fine-tuned version of facebook/m2m100_1.2B on the anzorq/ru-kbd dataset.
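
A minimal inference sketch using the transformers library. Note that the base M2M100 vocabulary has no Kabardian language code, so the target-language token below ("zu") is an assumption about how this fine-tune repurposes an existing code; check the repository for the token this checkpoint actually expects.

```python
from transformers import M2M100ForConditionalGeneration, M2M100Tokenizer

model_id = "anzorq/m2m100_1.2B_ft_ru-kbd_63K"
tokenizer = M2M100Tokenizer.from_pretrained(model_id)
model = M2M100ForConditionalGeneration.from_pretrained(model_id)

tokenizer.src_lang = "ru"
encoded = tokenizer("Привет, как дела?", return_tensors="pt")

# "zu" is an assumed stand-in for Kabardian: the base M2M100 vocabulary has no
# kbd code, so the fine-tune must reuse an existing language token.
generated = model.generate(
    **encoded,
    forced_bos_token_id=tokenizer.get_lang_id("zu"),
)
print(tokenizer.batch_decode(generated, skip_special_tokens=True)[0])
```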

Training Summary

  • Current Epoch: 3.41/9
  • Global Step: 9,000
  • Max Steps: 23,778
  • Logging Interval: every 500 steps
  • Checkpoint Interval: every 1,000 steps
  • Total Operations (FLOPs): ~4.84 × 10^16
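
The epoch figure follows from the step counts above; a quick arithmetic check:

```python
# 23,778 max steps over 9 epochs gives ~2,642 optimizer steps per epoch,
# so global step 9,000 lands at epoch ~3.41, matching the summary.
max_steps, num_epochs, global_step = 23_778, 9, 9_000
steps_per_epoch = max_steps / num_epochs  # 2642.0
print(global_step / steps_per_epoch)      # ~3.406, reported as 3.41
```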

Configuration:

  • Hyperparameter Search: No
  • Is Local Process Zero: True
  • Is World Process Zero: True
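
For reference, the cadence above maps onto transformers training arguments roughly as sketched below; values not stated in the summary (output directory, batch size, and so on) are placeholders, not the actual recipe.

```python
from transformers import Seq2SeqTrainingArguments

# Sketch only: placeholders for everything the summary above does not state.
args = Seq2SeqTrainingArguments(
    output_dir="m2m100_1.2B_ft_ru-kbd_63K",  # placeholder
    num_train_epochs=9,
    learning_rate=5e-5,  # assumed initial LR, inferred from the logged schedule
    logging_steps=500,   # logging interval from the summary
    save_steps=1000,     # checkpoint interval from the summary
)
```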

Progress:

Epoch   Learning Rate   Loss     Step
0.19    4.8951e-05      2.4415    500
0.38    4.7899e-05      1.7099   1000
0.57    4.6848e-05      1.4997   1500
0.76    4.5797e-05      1.3625   2000
0.95    4.4745e-05      1.2689   2500
1.14    4.3694e-05      1.0546   3000
1.32    4.2642e-05      0.9711   3500
1.51    4.1591e-05      0.9487   4000
1.70    4.0542e-05      0.9202   4500
1.89    3.9492e-05      0.8953   5000
2.08    3.8443e-05      0.6436   5500
2.27    3.7392e-05      0.6361   6000
2.46    3.6340e-05      0.6473   6500
2.65    3.5291e-05      0.6383   7000
2.84    3.4240e-05      0.6312   7500
3.03    3.3188e-05      0.5965   8000
3.22    3.2137e-05      0.4106   8500
3.41    3.1085e-05      0.4265   9000
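
The logged learning rates are consistent with a linear decay from an initial value of roughly 5e-5 over the 23,778 total steps (the initial value is inferred from the log, not stated anywhere):

```python
# Linear decay: lr(step) = init_lr * (1 - step / max_steps)
init_lr, max_steps = 5e-5, 23_778  # init_lr is an inference, not a logged value
for step in (500, 9_000):
    print(step, init_lr * (1 - step / max_steps))
# 500  -> ~4.895e-05 (logged 4.8951e-05)
# 9000 -> ~3.107e-05 (logged 3.1085e-05)
```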