
skilltext

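This model is a fine-tuned version of ai-forever/ruT5-base. The training dataset is not specified (the auto-generated card records it as None). At the final checkpoint (epoch 30, step 2550) it achieves the following results on the evaluation set: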

  • Loss: 1.0396
  • Rouge1: 35.5496
  • Rouge2: 22.9927
  • Rougel: 33.7986
  • Rougelsum: 33.9427
  • Bleu: 3.0002
  • Gen Len: 18.7273
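The card does not include a usage snippet, so the following is a minimal sketch of running a fine-tuned ruT5 checkpoint with the Transformers sequence-to-sequence classes. The function name `generate_text`, the `model_id` argument, and the placeholder repository id in the usage note are hypothetical, because the card does not state the full repository path. The `max_new_tokens=20` default is an assumption chosen to be in line with the reported Gen Len of about 18.7 tokens, not a documented setting.

```python
def generate_text(text, model_id, max_new_tokens=20):
    """Generate output text with a seq2seq checkpoint (sketch, not the author's pipeline).

    model_id is a placeholder argument: pass the actual Hub repository id
    or a local directory containing the fine-tuned weights.
    """
    # Imported lazily so the function can be defined without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

    # Tokenize one input string into PyTorch tensors.
    inputs = tokenizer(text, return_tensors="pt", truncation=True)

    # Default decoding; max_new_tokens=20 is an assumption, see the note above.
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

A call would look like `generate_text("...", "your-namespace/skilltext")`, where the repository id is a placeholder to be replaced with the real one.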

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 1
  • eval_batch_size: 1
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 30
  • mixed_precision_training: Native AMP
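To make the linear schedule concrete, here is a small sketch of how the learning rate would decay over the run. It takes the total of 2550 optimizer steps and 30 epochs from the training results, and it assumes zero warmup steps, a single device, and no gradient accumulation, since the card lists none of these. Under those assumptions the run has 85 steps per epoch, which with train_batch_size 1 would correspond to roughly 85 training examples.

```python
BASE_LR = 2e-05       # learning_rate from the card
TOTAL_STEPS = 2550    # last step in the training results
NUM_EPOCHS = 30       # num_epochs from the card
WARMUP_STEPS = 0      # assumption: the card lists no warmup

STEPS_PER_EPOCH = TOTAL_STEPS // NUM_EPOCHS  # 85


def linear_lr(step, base_lr=BASE_LR, total_steps=TOTAL_STEPS, warmup_steps=WARMUP_STEPS):
    """Learning rate at a given optimizer step under linear warmup and linear decay."""
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Linear decay from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


def epoch_at(step, steps_per_epoch=STEPS_PER_EPOCH):
    """Fractional epoch reached after the given number of steps."""
    return step / steps_per_epoch


if __name__ == "__main__":
    for step in (0, 50, 1275, 2550):
        print(step, round(epoch_at(step), 4), linear_lr(step))
```

With these assumptions the rate starts at 2e-05, reaches half that value at step 1275, and hits zero at step 2550.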

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Bleu | Gen Len |
|---|---|---|---|---|---|---|---|---|---|
| No log | 0.5882 | 50 | 2.0006 | 22.8478 | 9.3528 | 21.5245 | 21.4195 | 1.3965 | 19.0 |
| No log | 1.1765 | 100 | 1.5029 | 26.0894 | 12.5184 | 22.7242 | 22.8568 | 1.7386 | 18.9545 |
| No log | 1.7647 | 150 | 1.4072 | 24.1385 | 9.8714 | 22.0278 | 22.0679 | 2.009 | 18.9545 |
| No log | 2.3529 | 200 | 1.3292 | 27.642 | 12.2998 | 26.3455 | 25.9994 | 1.2632 | 18.7727 |
| No log | 2.9412 | 250 | 1.2788 | 32.096 | 12.3806 | 30.9883 | 30.6962 | 1.6429 | 18.7273 |
| No log | 3.5294 | 300 | 1.1847 | 31.8602 | 21.2094 | 31.1454 | 30.9145 | 1.5913 | 18.8636 |
| No log | 4.1176 | 350 | 1.2193 | 22.6777 | 11.7225 | 22.1941 | 22.1638 | 1.4306 | 18.7727 |
| No log | 4.7059 | 400 | 1.1527 | 23.4161 | 11.2979 | 22.9918 | 23.0266 | 1.7552 | 18.8636 |
| No log | 5.2941 | 450 | 1.1200 | 28.9205 | 15.5233 | 27.153 | 27.2644 | 1.8557 | 18.7273 |
| 2.1495 | 5.8824 | 500 | 1.1426 | 28.2199 | 13.8386 | 26.9115 | 26.5472 | 2.3855 | 18.7273 |
| 2.1495 | 6.4706 | 550 | 1.1053 | 32.432 | 18.9395 | 30.9397 | 31.1198 | 2.2867 | 18.7727 |
| 2.1495 | 7.0588 | 600 | 1.0777 | 38.285 | 23.5443 | 35.0994 | 35.3165 | 2.6353 | 18.7727 |
| 2.1495 | 7.6471 | 650 | 1.0900 | 38.5934 | 21.6941 | 36.5629 | 36.9151 | 2.2212 | 18.7727 |
| 2.1495 | 8.2353 | 700 | 1.0931 | 41.2586 | 27.5923 | 40.1612 | 40.1672 | 2.5568 | 18.8182 |
| 2.1495 | 8.8235 | 750 | 1.0691 | 38.3785 | 25.0231 | 38.453 | 38.5248 | 2.4491 | 18.7273 |
| 2.1495 | 9.4118 | 800 | 1.0627 | 36.3073 | 20.703 | 35.2405 | 35.3787 | 2.3678 | 18.8636 |
| 2.1495 | 10.0 | 850 | 1.0528 | 39.1894 | 24.8355 | 39.3713 | 39.483 | 1.9687 | 18.8636 |
| 2.1495 | 10.5882 | 900 | 1.0628 | 40.0052 | 23.746 | 38.8726 | 39.077 | 2.0485 | 18.8636 |
| 2.1495 | 11.1765 | 950 | 1.0371 | 34.4982 | 23.4663 | 34.1685 | 34.1247 | 2.0922 | 18.8636 |
| 1.046 | 11.7647 | 1000 | 1.0368 | 38.0619 | 19.7898 | 36.4367 | 36.8115 | 2.3387 | 18.8636 |
| 1.046 | 12.3529 | 1050 | 1.0427 | 38.9055 | 25.1615 | 38.8253 | 38.9385 | 2.5522 | 18.8182 |
| 1.046 | 12.9412 | 1100 | 1.0255 | 36.5256 | 21.2328 | 34.8816 | 35.2236 | 2.4057 | 18.8182 |
| 1.046 | 13.5294 | 1150 | 1.0237 | 36.0048 | 25.3977 | 35.9471 | 35.9807 | 2.4804 | 18.8182 |
| 1.046 | 14.1176 | 1200 | 0.9918 | 32.6697 | 21.3968 | 30.8639 | 31.0221 | 2.4669 | 18.7727 |
| 1.046 | 14.7059 | 1250 | 1.0598 | 37.7878 | 20.6971 | 36.6794 | 36.7289 | 2.5767 | 18.7727 |
| 1.046 | 15.2941 | 1300 | 1.0130 | 34.549 | 24.4177 | 34.0376 | 34.1226 | 2.1773 | 18.8182 |
| 1.046 | 15.8824 | 1350 | 1.0256 | 32.774 | 19.6047 | 31.6125 | 31.9067 | 2.0504 | 18.7727 |
| 1.046 | 16.4706 | 1400 | 1.0232 | 31.4885 | 18.4703 | 30.0937 | 30.5529 | 2.514 | 18.8182 |
| 1.046 | 17.0588 | 1450 | 1.0210 | 33.4684 | 20.7982 | 31.7789 | 32.0023 | 2.4881 | 18.7273 |
| 0.7674 | 17.6471 | 1500 | 1.0419 | 37.4914 | 20.9444 | 35.0519 | 35.2368 | 3.0058 | 18.7727 |
| 0.7674 | 18.2353 | 1550 | 1.0328 | 36.5606 | 21.0215 | 35.2548 | 35.4748 | 2.7878 | 18.7273 |
| 0.7674 | 18.8235 | 1600 | 1.0376 | 31.3516 | 18.5826 | 29.6759 | 29.8435 | 2.3192 | 18.8182 |
| 0.7674 | 19.4118 | 1650 | 1.0414 | 37.4725 | 22.3216 | 35.6306 | 35.7383 | 2.477 | 18.8182 |
| 0.7674 | 20.0 | 1700 | 1.0513 | 39.5759 | 23.2665 | 39.2332 | 39.3667 | 2.4322 | 18.7273 |
| 0.7674 | 20.5882 | 1750 | 1.0518 | 36.1526 | 23.8263 | 34.5677 | 34.6173 | 2.8518 | 18.7727 |
| 0.7674 | 21.1765 | 1800 | 1.0446 | 41.5192 | 23.3064 | 39.3799 | 39.6548 | 3.0326 | 18.8182 |
| 0.7674 | 21.7647 | 1850 | 1.0150 | 40.5093 | 21.8683 | 38.2773 | 38.6063 | 2.6653 | 18.8636 |
| 0.7674 | 22.3529 | 1900 | 1.0364 | 34.2216 | 20.2095 | 32.5945 | 32.6999 | 2.6078 | 18.8182 |
| 0.7674 | 22.9412 | 1950 | 1.0148 | 39.8173 | 20.6247 | 37.2954 | 37.6752 | 3.0336 | 18.8636 |
| 0.6485 | 23.5294 | 2000 | 1.0429 | 40.2889 | 21.1598 | 37.7657 | 38.0596 | 2.9108 | 18.8182 |
| 0.6485 | 24.1176 | 2050 | 1.0423 | 39.2679 | 20.8842 | 36.7395 | 36.9295 | 2.845 | 18.8636 |
| 0.6485 | 24.7059 | 2100 | 1.0358 | 39.086 | 20.7799 | 36.2138 | 36.3741 | 2.9429 | 18.8182 |
| 0.6485 | 25.2941 | 2150 | 1.0219 | 38.754 | 22.4097 | 36.9752 | 37.121 | 2.831 | 18.8182 |
| 0.6485 | 25.8824 | 2200 | 1.0450 | 38.3531 | 22.3593 | 36.4439 | 36.6304 | 2.9804 | 18.7727 |
| 0.6485 | 26.4706 | 2250 | 1.0482 | 40.6921 | 23.617 | 39.298 | 39.5895 | 3.0971 | 18.7727 |
| 0.6485 | 27.0588 | 2300 | 1.0495 | 39.6761 | 22.7969 | 37.0805 | 37.4949 | 3.2639 | 18.7727 |
| 0.6485 | 27.6471 | 2350 | 1.0412 | 40.8199 | 23.7109 | 38.9222 | 39.2493 | 3.0267 | 18.7273 |
| 0.6485 | 28.2353 | 2400 | 1.0453 | 39.9504 | 23.888 | 38.0725 | 38.3121 | 3.2191 | 18.7727 |
| 0.6485 | 28.8235 | 2450 | 1.0400 | 36.205 | 23.1356 | 34.6087 | 34.6263 | 3.028 | 18.7727 |
| 0.5501 | 29.4118 | 2500 | 1.0402 | 35.033 | 22.2393 | 33.3754 | 33.4477 | 3.0299 | 18.7273 |
| 0.5501 | 30.0 | 2550 | 1.0396 | 35.5496 | 22.9927 | 33.7986 | 33.9427 | 3.0002 | 18.7273 |
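The headline metrics come from the final step (2550), but other evaluation steps score better on individual metrics: the lowest validation loss in the log is 0.9918 at step 1200, and the highest Rouge1 is 41.5192 at step 1800. The sketch below shows one way to pick a checkpoint from such a log. It uses only a hand-copied subset of rows, and the tuple layout and variable names are illustrative assumptions; whether intermediate checkpoints were actually saved is not stated in the card.

```python
# (step, validation_loss, rouge1) for a subset of rows copied from the training results.
ROWS = [
    (50, 2.0006, 22.8478),
    (700, 1.0931, 41.2586),
    (1200, 0.9918, 32.6697),
    (1800, 1.0446, 41.5192),
    (2550, 1.0396, 35.5496),
]

# Lower validation loss is better; higher Rouge1 is better.
best_by_loss = min(ROWS, key=lambda row: row[1])
best_by_rouge1 = max(ROWS, key=lambda row: row[2])

if __name__ == "__main__":
    print("lowest validation loss:", best_by_loss)
    print("highest Rouge1:", best_by_rouge1)
```

The two criteria select different steps here, so the choice of selection metric matters for which checkpoint would be kept.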

Framework versions

  • Transformers 4.40.0
  • Pytorch 2.2.2
  • Datasets 2.12.0
  • Tokenizers 0.19.1
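
When reproducing the run, it can help to compare the local environment against the versions listed above. The following is a minimal sketch using only the standard library. The helper name `check_versions` is hypothetical, and mapping "Pytorch" to the `torch` distribution name is an assumption about how the package was installed.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions from the card; "torch" is the assumed distribution name for Pytorch.
PINNED = {
    "transformers": "4.40.0",
    "torch": "2.2.2",
    "datasets": "2.12.0",
    "tokenizers": "0.19.1",
}


def check_versions(pinned=PINNED):
    """Return {package: (wanted, installed_or_None, exact_match)} for each pinned package."""
    report = {}
    for name, wanted in pinned.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None  # package is not installed in this environment
        # Exact string comparison: a local build tag such as "+cu121" counts as a mismatch.
        report[name] = (wanted, installed, installed == wanted)
    return report


if __name__ == "__main__":
    for name, (wanted, installed, ok) in check_versions().items():
        print(f"{name}: wanted {wanted}, installed {installed}, match={ok}")
```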

Model size

  • Parameters: 223M
  • Tensor type: F32
  • Weights format: Safetensors