
t5-small_adafactor

This model is a fine-tuned version of oMateos2020/t5-small_adafactor on the xsum dataset. It achieves the following results on the evaluation set at the final logged evaluation (step 8400):

  • Loss: 2.1167
  • Rouge1: 32.8631
  • Rouge2: 11.658
  • Rougel: 26.6192
  • Rougelsum: 26.6224
  • Gen Len: 18.7663

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0005
  • train_batch_size: 24
  • eval_batch_size: 24
  • seed: 42
  • optimizer: Adafactor
  • lr_scheduler_type: linear
  • num_epochs: 1
  • mixed_precision_training: Native AMP
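
As a rough sketch, the hyperparameters above could be expressed as keyword arguments for `Seq2SeqTrainingArguments` in Transformers. The card does not include the training script, so this mapping is an assumption: `build_training_kwargs` and `output_dir` are hypothetical, a single device is assumed (so the batch size of 24 is per device), and the 200-step evaluation interval is inferred from the training results table rather than stated.

```python
def build_training_kwargs(output_dir="t5-small_adafactor"):
    """Return keyword arguments mirroring the hyperparameters listed in this card.

    Hypothetical helper: the original training script is not part of the card.
    """
    return {
        "output_dir": output_dir,           # assumption: any writable directory
        "learning_rate": 5e-4,              # learning_rate: 0.0005
        "per_device_train_batch_size": 24,  # train_batch_size: 24 (single device assumed)
        "per_device_eval_batch_size": 24,   # eval_batch_size: 24 (single device assumed)
        "seed": 42,                         # seed: 42
        "optim": "adafactor",               # optimizer: Adafactor
        "lr_scheduler_type": "linear",      # lr_scheduler_type: linear
        "num_train_epochs": 1,              # num_epochs: 1
        "fp16": True,                       # mixed_precision_training: Native AMP
        "evaluation_strategy": "steps",     # assumption: inferred from the results table
        "eval_steps": 200,                  # assumption: inferred from the results table
        "predict_with_generate": True,      # assumption: needed to score generated summaries
    }


if __name__ == "__main__":
    for name, value in build_training_kwargs().items():
        print(f"{name} = {value!r}")
```

With Transformers installed and a CUDA device available (required for `fp16=True`), the dictionary could be unpacked as `Seq2SeqTrainingArguments(**build_training_kwargs())`.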

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len |
|---|---|---|---|---|---|---|---|---|
| 2.1315 | 0.02 | 200 | 2.1865 | 31.9486 | 10.9605 | 25.7418 | 25.7408 | 18.8466 |
| 2.1297 | 0.05 | 400 | 2.1965 | 31.9598 | 10.9463 | 25.784 | 25.7867 | 18.8525 |
| 2.1284 | 0.07 | 600 | 2.1981 | 32.231 | 11.1003 | 26.0155 | 26.0226 | 18.8466 |
| 2.1315 | 0.09 | 800 | 2.1873 | 31.9161 | 10.8642 | 25.7166 | 25.7273 | 18.8227 |
| 2.1212 | 0.12 | 1000 | 2.1892 | 32.4646 | 11.1852 | 26.2451 | 26.2439 | 18.8259 |
| 2.1028 | 0.14 | 1200 | 2.1978 | 32.2886 | 11.1346 | 26.0795 | 26.0827 | 18.7685 |
| 2.1221 | 0.16 | 1400 | 2.1936 | 32.2901 | 11.0821 | 25.9983 | 26.0024 | 18.7798 |
| 2.1168 | 0.19 | 1600 | 2.1922 | 32.1655 | 11.1451 | 25.986 | 25.9893 | 18.8232 |
| 2.1166 | 0.21 | 1800 | 2.1836 | 32.2611 | 11.174 | 26.0594 | 26.0688 | 18.7633 |
| 2.1053 | 0.24 | 2000 | 2.1929 | 32.3321 | 11.213 | 26.1859 | 26.1903 | 18.7758 |
| 2.1126 | 0.26 | 2200 | 2.1811 | 32.2078 | 11.1792 | 26.0776 | 26.0817 | 18.8197 |
| 2.1038 | 0.28 | 2400 | 2.1836 | 32.2799 | 11.2511 | 26.1191 | 26.1251 | 18.7884 |
| 2.1181 | 0.31 | 2600 | 2.1805 | 32.1197 | 11.1586 | 26.0441 | 26.0441 | 18.8045 |
| 2.1217 | 0.33 | 2800 | 2.1806 | 32.3051 | 11.2638 | 26.1319 | 26.1386 | 18.7886 |
| 2.116 | 0.35 | 3000 | 2.1741 | 32.2799 | 11.1887 | 26.1224 | 26.1363 | 18.7769 |
| 2.1118 | 0.38 | 3200 | 2.1767 | 32.387 | 11.2053 | 26.077 | 26.0845 | 18.8407 |
| 2.1164 | 0.4 | 3400 | 2.1743 | 32.5008 | 11.4021 | 26.3291 | 26.3297 | 18.7731 |
| 2.1068 | 0.42 | 3600 | 2.1673 | 32.2347 | 11.1676 | 26.0657 | 26.0662 | 18.817 |
| 2.1276 | 0.45 | 3800 | 2.1664 | 32.2434 | 11.2862 | 26.094 | 26.0994 | 18.7713 |
| 2.1313 | 0.47 | 4000 | 2.1636 | 32.694 | 11.3724 | 26.4071 | 26.4008 | 18.7709 |
| 2.1229 | 0.49 | 4200 | 2.1633 | 32.456 | 11.4057 | 26.2733 | 26.2689 | 18.7586 |
| 2.129 | 0.52 | 4400 | 2.1641 | 32.309 | 11.2133 | 26.1062 | 26.1121 | 18.7729 |
| 2.1425 | 0.54 | 4600 | 2.1577 | 32.5879 | 11.4001 | 26.3045 | 26.3078 | 18.8104 |
| 2.1536 | 0.56 | 4800 | 2.1507 | 32.5152 | 11.4035 | 26.3054 | 26.3116 | 18.7941 |
| 2.148 | 0.59 | 5000 | 2.1503 | 32.8088 | 11.5641 | 26.5346 | 26.5311 | 18.7602 |
| 2.1541 | 0.61 | 5200 | 2.1491 | 32.8185 | 11.5816 | 26.5261 | 26.527 | 18.7654 |
| 2.155 | 0.64 | 5400 | 2.1466 | 32.7229 | 11.5339 | 26.4363 | 26.442 | 18.8404 |
| 2.1579 | 0.66 | 5600 | 2.1435 | 32.884 | 11.6042 | 26.5862 | 26.5891 | 18.7713 |
| 2.1601 | 0.68 | 5800 | 2.1393 | 32.8027 | 11.5328 | 26.4521 | 26.4567 | 18.7904 |
| 2.1765 | 0.71 | 6000 | 2.1393 | 32.8059 | 11.5751 | 26.5499 | 26.5551 | 18.7768 |
| 2.2176 | 0.73 | 6200 | 2.1345 | 33.0734 | 11.8056 | 26.7546 | 26.7607 | 18.7756 |
| 2.2126 | 0.75 | 6400 | 2.1328 | 32.7478 | 11.5925 | 26.5333 | 26.5359 | 18.7819 |
| 2.1916 | 0.78 | 6600 | 2.1298 | 32.658 | 11.491 | 26.379 | 26.3869 | 18.8101 |
| 2.2162 | 0.8 | 6800 | 2.1297 | 32.7843 | 11.5629 | 26.4736 | 26.4728 | 18.8187 |
| 2.2358 | 0.82 | 7000 | 2.1287 | 32.9181 | 11.6378 | 26.5966 | 26.5987 | 18.8039 |
| 2.2371 | 0.85 | 7200 | 2.1265 | 32.8413 | 11.674 | 26.5905 | 26.5831 | 18.7962 |
| 2.256 | 0.87 | 7400 | 2.1245 | 32.7412 | 11.5627 | 26.4976 | 26.503 | 18.7728 |
| 2.2566 | 0.89 | 7600 | 2.1220 | 32.8165 | 11.6069 | 26.5301 | 26.5295 | 18.7871 |
| 2.2954 | 0.92 | 7800 | 2.1197 | 32.7399 | 11.5417 | 26.4914 | 26.4938 | 18.7752 |
| 2.2766 | 0.94 | 8000 | 2.1187 | 32.853 | 11.6411 | 26.5909 | 26.5938 | 18.7852 |
| 2.3273 | 0.96 | 8200 | 2.1169 | 32.9376 | 11.709 | 26.6665 | 26.6672 | 18.7734 |
| 2.3182 | 0.99 | 8400 | 2.1167 | 32.8631 | 11.658 | 26.6192 | 26.6224 | 18.7663 |
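
The Epoch and Step columns can be cross-checked with a little arithmetic. The sketch below assumes the standard XSum training split of 204,045 examples, a single device, and no gradient accumulation, none of which the card states. It also models the linear scheduler with zero warmup, which is another assumption. The function names are illustrative only.

```python
import math

XSUM_TRAIN_EXAMPLES = 204_045  # assumption: size of the standard XSum training split
TRAIN_BATCH_SIZE = 24          # train_batch_size from the hyperparameter list
BASE_LEARNING_RATE = 5e-4      # learning_rate from the hyperparameter list


def steps_per_epoch(num_examples=XSUM_TRAIN_EXAMPLES, batch_size=TRAIN_BATCH_SIZE):
    """Optimizer steps in one epoch, counting the final partial batch."""
    return math.ceil(num_examples / batch_size)


def epoch_at_step(step, num_examples=XSUM_TRAIN_EXAMPLES, batch_size=TRAIN_BATCH_SIZE):
    """Fraction of an epoch completed after `step` optimizer steps."""
    return step / steps_per_epoch(num_examples, batch_size)


def linear_lr(step, total_steps, base_lr=BASE_LEARNING_RATE):
    """Linearly decayed learning rate, assuming zero warmup steps."""
    remaining = max(0, total_steps - step)
    return base_lr * remaining / total_steps


if __name__ == "__main__":
    total = steps_per_epoch()
    print("steps per epoch:", total)
    for step in (200, 3400, 8400):
        print(step, round(epoch_at_step(step), 2), linear_lr(step, total))
```

Under these assumptions one epoch is 8502 steps, so steps 200, 3400, and 8400 round to epochs 0.02, 0.4, and 0.99, matching the table.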

Framework versions

  • Transformers 4.20.1
  • Pytorch 1.12.0+cu113
  • Datasets 2.3.2
  • Tokenizers 0.12.1