rut5-base-summ-dialogsum

This model is a fine-tuned version of d0rj/rut5-base-summ. The training dataset is not specified (the auto-generated card records it as None). It achieves the following results on the evaluation set:

  • Loss: 1.1263
  • Rouge1: 33.5111
  • Rouge2: 0.1696
  • Rougel: 33.4559
  • Rougelsum: 33.4934
  • Gen Len: 4.1546
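
The scores above pair a Rouge1 of about 33 with a Rouge2 close to zero and an average generation length of about 4 tokens. One possible reading, which is an assumption and not something this card states, is that many generated outputs are a single word, and a single word contains no bigrams. The sketch below is a simplified ROUGE-N F1 with whitespace tokenization, written only to illustrate that effect. It is not the scorer that produced the numbers above, it returns values on a 0 to 1 scale where the card uses 0 to 100, and the example strings are invented.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return a multiset (Counter) of the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_f1(prediction, reference, n):
    """Simplified ROUGE-N F1 between two whitespace-tokenized strings."""
    pred = ngrams(prediction.split(), n)
    ref = ngrams(reference.split(), n)
    # Clipped overlap: each n-gram counts at most as often as it appears in both.
    overlap = sum((pred & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(pred.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


# Hypothetical example: a one-word output against a three-word reference.
reference = "order a pizza"
prediction = "pizza"

rouge1 = rouge_n_f1(prediction, reference, 1)  # unigram overlap exists
rouge2 = rouge_n_f1(prediction, reference, 2)  # a single word has no bigrams
print(rouge1, rouge2)
```

In this toy case the unigram score is 0.5 while the bigram score is 0.0, the same qualitative pattern as the evaluation results.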

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 25
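
As a rough illustration of what `lr_scheduler_type: linear` implies for these settings, the sketch below computes the learning rate at a given optimizer step. It assumes zero warmup steps (the card lists none) and takes 786 steps per epoch from the step counts reported under Training results, which gives 19650 steps in total. The function name `linear_lr` and the constants are illustrative and are not taken from the training script.

```python
BASE_LR = 5e-05        # learning_rate from the card
STEPS_PER_EPOCH = 786  # step count at epoch 1.0 in the training results
NUM_EPOCHS = 25        # num_epochs from the card
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS  # 19650
WARMUP_STEPS = 0       # assumption: the card does not list any warmup


def linear_lr(step):
    """Learning rate at `step` for linear warmup followed by linear decay to zero."""
    if step < WARMUP_STEPS:
        return BASE_LR * step / max(1, WARMUP_STEPS)
    remaining = max(0, TOTAL_STEPS - step)
    return BASE_LR * remaining / max(1, TOTAL_STEPS - WARMUP_STEPS)


# Learning rate at the start, at the end of selected epochs, and at the final step.
for epoch in (0, 1, 12, 19, 25):
    step = epoch * STEPS_PER_EPOCH
    print(f"epoch {epoch:>2} (step {step:>5}): lr = {linear_lr(step):.3e}")
```

Under these assumptions the rate falls steadily from 5e-05 at step 0 to zero at step 19650, so later epochs take progressively smaller updates.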

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len |
|---|---|---|---|---|---|---|---|---|
| 2.0946 | 1.0 | 786 | 1.7462 | 45.4252 | 0.0 | 45.4009 | 45.4139 | 4.0464 |
| 1.7182 | 2.0 | 1572 | 1.5005 | 44.9295 | 0.0 | 44.9183 | 44.9108 | 4.1126 |
| 1.5304 | 3.0 | 2358 | 1.3826 | 39.5888 | 0.0 | 39.5811 | 39.5646 | 4.1698 |
| 1.4261 | 4.0 | 3144 | 1.3121 | 30.1735 | 0.0 | 30.1127 | 30.1415 | 4.1520 |
| 1.3252 | 5.0 | 3930 | 1.2641 | 35.7738 | 0.0 | 35.7408 | 35.7858 | 3.8791 |
| 1.2878 | 6.0 | 4716 | 1.2353 | 33.0773 | 0.0 | 32.9682 | 33.0551 | 3.7252 |
| 1.2068 | 7.0 | 5502 | 1.2051 | 34.4094 | 0.0 | 34.3902 | 34.3884 | 3.7729 |
| 1.1763 | 8.0 | 6288 | 1.1952 | 33.0914 | 0.1908 | 33.0267 | 33.0472 | 3.9739 |
| 1.1346 | 9.0 | 7074 | 1.1798 | 33.9606 | 0.0 | 33.9335 | 33.979 | 4.1768 |
| 1.1044 | 10.0 | 7860 | 1.1632 | 32.9529 | 0.0 | 32.9367 | 32.9396 | 4.1673 |
| 1.1073 | 11.0 | 8646 | 1.1499 | 34.0904 | 0.0 | 34.0659 | 34.1317 | 4.1934 |
| 1.0619 | 12.0 | 9432 | 1.1516 | 32.9502 | 0.0 | 32.9056 | 32.9376 | 4.0312 |
| 1.0365 | 13.0 | 10218 | 1.1478 | 31.68 | 0.0 | 31.6488 | 31.7003 | 4.0293 |
| 1.0161 | 14.0 | 11004 | 1.1427 | 32.6651 | 0.0424 | 32.6345 | 32.6538 | 4.1113 |
| 0.9805 | 15.0 | 11790 | 1.1343 | 34.0304 | 0.0636 | 33.9433 | 33.999 | 4.0674 |
| 0.9661 | 16.0 | 12576 | 1.1309 | 34.8704 | 0.0848 | 34.8014 | 34.8501 | 4.0681 |
| 0.9511 | 17.0 | 13362 | 1.1348 | 32.8744 | 0.0 | 32.8277 | 32.8547 | 4.1081 |
| 0.9392 | 18.0 | 14148 | 1.1326 | 32.9349 | 0.1908 | 32.8895 | 32.9376 | 4.2627 |
| 0.9341 | 19.0 | 14934 | 1.1263 | 33.5111 | 0.1696 | 33.4559 | 33.4934 | 4.1546 |
| 0.9396 | 20.0 | 15720 | 1.1349 | 33.9121 | 0.2545 | 33.8438 | 33.8993 | 4.1705 |
| 0.9314 | 21.0 | 16506 | 1.1276 | 33.0779 | 0.106 | 33.0546 | 33.0903 | 4.1399 |
| 0.8987 | 22.0 | 17292 | 1.1333 | 33.8566 | 0.1696 | 33.7943 | 33.843 | 4.1419 |
| 0.8895 | 23.0 | 18078 | 1.1343 | 33.6108 | 0.1484 | 33.5738 | 33.636 | 4.2328 |
| 0.8847 | 24.0 | 18864 | 1.1355 | 33.4257 | 0.2757 | 33.3804 | 33.4495 | 4.1711 |
| 0.8832 | 25.0 | 19650 | 1.1355 | 33.6211 | 0.3393 | 33.5937 | 33.636 | 4.1959 |
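
The evaluation results reported at the top of this card (loss 1.1263, Rouge1 33.5111) match the epoch 19 row rather than the final epoch. The short check below, using the validation loss and Rouge1 columns copied from the training results, confirms that epoch 19 has the lowest validation loss. This suggests, though the card does not say so, that the published checkpoint was selected by validation loss. The variable names are illustrative.

```python
# Validation loss per epoch (epochs 1..25), copied from the training results.
VAL_LOSS = [
    1.7462, 1.5005, 1.3826, 1.3121, 1.2641, 1.2353, 1.2051, 1.1952, 1.1798,
    1.1632, 1.1499, 1.1516, 1.1478, 1.1427, 1.1343, 1.1309, 1.1348, 1.1326,
    1.1263, 1.1349, 1.1276, 1.1333, 1.1343, 1.1355, 1.1355,
]

# Rouge1 per epoch (epochs 1..25), copied from the same results.
ROUGE1 = [
    45.4252, 44.9295, 39.5888, 30.1735, 35.7738, 33.0773, 34.4094, 33.0914,
    33.9606, 32.9529, 34.0904, 32.9502, 31.68, 32.6651, 34.0304, 34.8704,
    32.8744, 32.9349, 33.5111, 33.9121, 33.0779, 33.8566, 33.6108, 33.4257,
    33.6211,
]

# Epochs are 1-based, list indices are 0-based.
best_loss_epoch = min(range(len(VAL_LOSS)), key=VAL_LOSS.__getitem__) + 1
best_rouge1_epoch = max(range(len(ROUGE1)), key=ROUGE1.__getitem__) + 1

print(f"lowest validation loss: epoch {best_loss_epoch} ({VAL_LOSS[best_loss_epoch - 1]})")
print(f"highest Rouge1: epoch {best_rouge1_epoch} ({ROUGE1[best_rouge1_epoch - 1]})")
```

Validation loss and Rouge1 point to different epochs here (19 and 1 respectively), so the choice of selection metric matters when comparing checkpoints from this run.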

Framework versions

  • Transformers 4.35.2
  • PyTorch 2.0.1+cu117
  • Datasets 2.15.0
  • Tokenizers 0.15.0

Model size: 223M params (Safetensors; tensor type F32)