
t5-small-finetuned-cnndm_3epoch_v2

This model is a fine-tuned version of t5-small on the cnn_dailymail dataset. It achieves the following results on the evaluation set:

  • Loss: 1.6070
  • Rouge1: 24.7696
  • Rouge2: 11.9467
  • Rougel: 20.4495
  • Rougelsum: 23.3341
  • Gen Len: 18.9999
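A minimal usage sketch for this checkpoint, assuming it is hosted on the Hub under the repository id `Sevil/t5-small-finetuned-cnndm_3epoch_v2` (the weights are downloaded on first run):

```python
from transformers import pipeline

# Summarization pipeline backed by this fine-tuned checkpoint; the model id
# is assumed to match the Hub repository name.
summarizer = pipeline(
    "summarization", model="Sevil/t5-small-finetuned-cnndm_3epoch_v2"
)

article = (
    "The tower is 324 metres tall, about the same height as an 81-storey "
    "building, and was the tallest man-made structure in the world for 41 years."
)
print(summarizer(article, max_length=60, min_length=10)[0]["summary_text"])
```

Note the Gen Len of roughly 19 above: generated summaries hit the default generation length cap, so raise `max_length` if you want longer outputs.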

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0003
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 3
  • mixed_precision_training: Native AMP
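The linear scheduler decays the learning rate from its initial value to zero over the full run. A minimal sketch of that schedule, using the 215,000 total optimizer steps from the training log and assuming zero warmup steps (the `Trainer` default):

```python
def linear_lr(step, base_lr=3e-4, total_steps=215_000, warmup_steps=0):
    """Learning rate at a given optimizer step under a linear schedule.

    Illustrative only: total_steps is read off the training log below, and
    warmup_steps=0 is an assumption (the Hugging Face Trainer default).
    """
    if step < warmup_steps:
        # Linear ramp-up from 0 to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Linear decay from base_lr down to 0 at total_steps.
    return base_lr * max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))

print(linear_lr(0))        # full learning rate at the start
print(linear_lr(107_500))  # half-way: decayed to ~1.5e-4
print(linear_lr(215_000))  # reaches 0 at the end of epoch 3
```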

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len |
|---|---|---|---|---|---|---|---|---|
| 1.9695 | 0.07 | 5000 | 1.7781 | 24.2253 | 11.472 | 20.0367 | 22.8469 | 18.9962 |
| 1.9536 | 0.14 | 10000 | 1.7575 | 24.2983 | 11.469 | 20.0054 | 22.9144 | 18.9995 |
| 1.9452 | 0.21 | 15000 | 1.7406 | 24.2068 | 11.4601 | 20.0021 | 22.8375 | 19.0 |
| 1.931 | 0.28 | 20000 | 1.7302 | 24.1589 | 11.4183 | 19.9736 | 22.7804 | 18.9996 |
| 1.9182 | 0.35 | 25000 | 1.7381 | 24.1634 | 11.5435 | 19.9643 | 22.7371 | 18.9999 |
| 1.9072 | 0.42 | 30000 | 1.7239 | 24.4401 | 11.6323 | 20.1243 | 22.9468 | 19.0 |
| 1.9027 | 0.49 | 35000 | 1.7162 | 24.1801 | 11.4788 | 20.0011 | 22.832 | 18.9996 |
| 1.8962 | 0.56 | 40000 | 1.7060 | 24.4153 | 11.6275 | 20.1742 | 23.0865 | 18.9998 |
| 1.8905 | 0.63 | 45000 | 1.7004 | 24.1446 | 11.5402 | 19.9986 | 22.7949 | 18.9983 |
| 1.8764 | 0.7 | 50000 | 1.6876 | 24.342 | 11.5448 | 20.0993 | 22.9509 | 18.9993 |
| 1.8772 | 0.77 | 55000 | 1.6879 | 24.3596 | 11.6063 | 20.1592 | 22.9966 | 19.0 |
| 1.8669 | 0.84 | 60000 | 1.6776 | 24.6201 | 11.6668 | 20.2639 | 23.201 | 18.9994 |
| 1.8692 | 0.91 | 65000 | 1.6838 | 24.2924 | 11.6129 | 20.1071 | 22.9112 | 18.9997 |
| 1.8552 | 0.98 | 70000 | 1.6885 | 24.2878 | 11.6773 | 20.1272 | 22.8797 | 18.9992 |
| 1.8205 | 1.04 | 75000 | 1.6717 | 24.5579 | 11.6421 | 20.2593 | 23.1442 | 19.0 |
| 1.8074 | 1.11 | 80000 | 1.6604 | 24.495 | 11.6542 | 20.1854 | 23.1091 | 18.9996 |
| 1.7951 | 1.18 | 85000 | 1.6705 | 24.4504 | 11.6601 | 20.2185 | 23.0597 | 18.9999 |
| 1.7937 | 1.25 | 90000 | 1.6645 | 24.5535 | 11.6921 | 20.2087 | 23.1099 | 18.9999 |
| 1.8017 | 1.32 | 95000 | 1.6647 | 24.5179 | 11.8005 | 20.2903 | 23.13 | 18.9993 |
| 1.7918 | 1.39 | 100000 | 1.6568 | 24.518 | 11.7528 | 20.222 | 23.0767 | 18.9991 |
| 1.7985 | 1.46 | 105000 | 1.6588 | 24.4636 | 11.636 | 20.1038 | 23.032 | 19.0 |
| 1.7944 | 1.53 | 110000 | 1.6498 | 24.6611 | 11.78 | 20.3059 | 23.2404 | 18.9999 |
| 1.7934 | 1.6 | 115000 | 1.6551 | 24.7267 | 11.823 | 20.3377 | 23.273 | 18.9997 |
| 1.7764 | 1.67 | 120000 | 1.6467 | 24.5052 | 11.8052 | 20.2617 | 23.1228 | 18.9996 |
| 1.7846 | 1.74 | 125000 | 1.6489 | 24.5423 | 11.8407 | 20.3464 | 23.1433 | 18.9999 |
| 1.7799 | 1.81 | 130000 | 1.6438 | 24.4915 | 11.7827 | 20.2592 | 23.1299 | 18.9999 |
| 1.7806 | 1.88 | 135000 | 1.6353 | 24.7804 | 11.9212 | 20.4678 | 23.359 | 19.0 |
| 1.7784 | 1.95 | 140000 | 1.6338 | 24.7892 | 11.8836 | 20.4227 | 23.373 | 18.9997 |
| 1.7551 | 2.02 | 145000 | 1.6341 | 24.6828 | 11.8257 | 20.3862 | 23.2536 | 18.9997 |
| 1.728 | 2.09 | 150000 | 1.6328 | 24.6697 | 11.851 | 20.3943 | 23.2738 | 18.9993 |
| 1.7201 | 2.16 | 155000 | 1.6309 | 24.7364 | 11.8505 | 20.365 | 23.2885 | 18.9992 |
| 1.7233 | 2.23 | 160000 | 1.6346 | 24.7298 | 12.0026 | 20.4444 | 23.3156 | 18.9999 |
| 1.7096 | 2.3 | 165000 | 1.6253 | 24.6443 | 11.9004 | 20.4138 | 23.2583 | 18.9999 |
| 1.7084 | 2.37 | 170000 | 1.6233 | 24.6688 | 11.8885 | 20.3623 | 23.2608 | 18.9996 |
| 1.7236 | 2.44 | 175000 | 1.6243 | 24.7174 | 11.8924 | 20.4012 | 23.2948 | 18.9996 |
| 1.7108 | 2.51 | 180000 | 1.6188 | 24.6013 | 11.8153 | 20.2969 | 23.1867 | 18.9997 |
| 1.711 | 2.58 | 185000 | 1.6125 | 24.7673 | 11.8646 | 20.3805 | 23.3114 | 18.9997 |
| 1.7108 | 2.65 | 190000 | 1.6101 | 24.8047 | 11.9763 | 20.494 | 23.3873 | 18.9998 |
| 1.7114 | 2.72 | 195000 | 1.6123 | 24.7019 | 11.9201 | 20.414 | 23.2823 | 18.9999 |
| 1.7004 | 2.79 | 200000 | 1.6083 | 24.7525 | 11.9197 | 20.4581 | 23.3371 | 18.9999 |
| 1.7104 | 2.86 | 205000 | 1.6061 | 24.7057 | 11.8818 | 20.4017 | 23.286 | 18.9999 |
| 1.7063 | 2.93 | 210000 | 1.6063 | 24.7707 | 11.934 | 20.4473 | 23.3316 | 18.9999 |
| 1.7039 | 3.0 | 215000 | 1.6070 | 24.7696 | 11.9467 | 20.4495 | 23.3341 | 18.9999 |
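The Rouge1 and Rouge2 columns are unigram- and bigram-overlap F1 scores scaled to 0–100. As an illustration only (the reported numbers come from the `rouge_score` package, which additionally tokenizes and stems before counting), ROUGE-1 can be sketched as:

```python
from collections import Counter

def rouge1_f1(candidate: str, reference: str) -> float:
    """Unigram-overlap F1 between a candidate summary and a reference.

    Simplified sketch: whitespace tokenization and lowercasing only,
    without the stemming applied by the actual rouge_score package.
    """
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum((cand & ref).values())  # clipped unigram matches
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)

# 5 of 6 unigrams match in each direction, so F1 = 5/6 ≈ 0.833 (83.3 on the 0–100 scale)
print(rouge1_f1("the cat sat on the mat", "the cat lay on the mat"))
```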

Framework versions

  • Transformers 4.17.0
  • Pytorch 1.10.0+cu111
  • Datasets 2.0.0
  • Tokenizers 0.11.6