t5-v1_1-base-fce-e8-b16

This model is a fine-tuned version of google/t5-v1_1-base. The training dataset is not specified (the auto-generated card records it as None). The model achieves the following results on the evaluation set:

  • Loss: 0.3409
  • Rouge1: 87.1583
  • Rouge2: 79.8003
  • Rougel: 86.6556
  • Rougelsum: 86.6858
  • Gen Len: 14.8987
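
For orientation, the ROUGE-N scores above are F-measures over n-gram overlap between generated and reference text, and they appear to be reported on a 0 to 100 scale. The following is a minimal sketch of that idea in plain Python, assuming whitespace tokenization and lowercasing. The function names `ngrams` and `rouge_n_f1` are illustrative, and this is not the scorer that produced the numbers in this card, which likely applies its own tokenization.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_f1(prediction, reference, n=1):
    """Simplified ROUGE-N F1 on a 0-100 scale (assumption: whitespace tokens)."""
    pred = ngrams(prediction.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    # Clipped overlap: each n-gram counts at most as often as it occurs in both.
    overlap = sum((pred & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(pred.values())
    recall = overlap / sum(ref.values())
    return 100 * 2 * precision * recall / (precision + recall)


# Hypothetical sentence pair, not taken from the evaluation set.
prediction = "the cat sat on the mat"
reference = "the cat is on the mat"
print(round(rouge_n_f1(prediction, reference, n=1), 4))
print(round(rouge_n_f1(prediction, reference, n=2), 4))
```

A pair that differs in a single word scores high on unigrams and noticeably lower on bigrams, which is the same pattern as the gap between Rouge1 and Rouge2 above.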

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.001
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: Adafactor
  • lr_scheduler_type: linear
  • num_epochs: 8
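
As a rough guide to reproducing this setup, the hyperparameters above can be sketched as keyword arguments for `Seq2SeqTrainingArguments` in Transformers 4.30. The first seven entries are taken from the list. The remaining entries (evaluation and logging every 100 steps, `predict_with_generate`, and the output directory) are assumptions inferred from the results table and the model name, not settings confirmed by the card. The mapping of `train_batch_size` to a per-device value also assumes single-device training.

```python
# Values copied from the hyperparameter list in this card.
TRAINING_KWARGS = {
    "learning_rate": 1e-3,
    "per_device_train_batch_size": 16,  # assumption: single device
    "per_device_eval_batch_size": 16,   # assumption: single device
    "seed": 42,
    "optim": "adafactor",
    "lr_scheduler_type": "linear",
    "num_train_epochs": 8,
    # Assumptions below: not stated in the card.
    "output_dir": "t5-v1_1-base-fce-e8-b16",  # hypothetical local directory
    "evaluation_strategy": "steps",  # the table reports metrics every 100 steps
    "eval_steps": 100,
    "logging_steps": 100,
    "predict_with_generate": True,   # needed to compute ROUGE on generated text
}


def build_training_args():
    """Build the arguments object; requires the transformers package."""
    from transformers import Seq2SeqTrainingArguments

    return Seq2SeqTrainingArguments(**TRAINING_KWARGS)
```

Newer Transformers releases rename `evaluation_strategy` to `eval_strategy`, so the key may need adjusting outside the 4.30 line.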

Training results

Training Loss Epoch Step Validation Loss Rouge1 Rouge2 Rougel Rougelsum Gen Len
3.9063 0.06 100 0.8111 27.4937 22.9629 27.3015 27.2771 7.4286
0.7836 0.11 200 0.5104 85.4419 76.9583 84.8358 84.8509 15.0488
0.6368 0.17 300 0.4682 86.2542 77.5212 85.6688 85.6923 14.8298
0.5924 0.23 400 0.4734 86.4845 78.0506 85.9059 85.9008 14.8352
0.5694 0.28 500 0.4081 86.352 78.0709 85.8245 85.8281 14.8585
0.5335 0.34 600 0.4179 86.5893 78.4175 86.0693 86.0625 14.8745
0.5246 0.4 700 0.3990 86.4139 78.4306 85.9523 85.9443 14.8617
0.504 0.45 800 0.4233 86.7504 78.7906 86.2416 86.2447 14.8759
0.4818 0.51 900 0.4008 86.7978 78.8187 86.2413 86.2432 14.8699
0.4756 0.56 1000 0.4028 86.9123 79.0247 86.3563 86.3635 14.8640
0.4772 0.62 1100 0.3789 86.5028 78.5736 85.9794 85.9983 14.8717
0.4638 0.68 1200 0.3818 86.6276 78.7383 86.084 86.0903 14.9124
0.4614 0.73 1300 0.3839 86.8128 79.2001 86.3591 86.3519 14.8695
0.4326 0.79 1400 0.3751 86.9302 79.3511 86.4188 86.4311 14.9019
0.4485 0.85 1500 0.3654 86.6862 79.0433 86.1832 86.1872 14.9206
0.4187 0.9 1600 0.3823 86.9451 79.2758 86.4628 86.4724 14.8795
0.4218 0.96 1700 0.3696 86.9051 79.1393 86.3682 86.3627 14.9220
0.3812 1.02 1800 0.3699 87.0233 79.4507 86.513 86.5154 14.8873
0.3116 1.07 1900 0.3763 86.9293 79.2058 86.4356 86.4445 14.8918
0.3237 1.13 2000 0.3740 87.0449 79.4088 86.5157 86.5319 14.8918
0.3071 1.19 2100 0.3690 86.5698 78.4408 85.9993 86.0409 14.9069
0.3072 1.24 2200 0.3646 86.9336 79.334 86.4284 86.4303 14.8918
0.2953 1.3 2300 0.3750 86.7437 78.949 86.2131 86.202 14.8909
0.308 1.35 2400 0.3613 86.792 79.2179 86.2832 86.2934 14.8923
0.3132 1.41 2500 0.3528 86.7653 79.0525 86.2258 86.2357 14.9110
0.3141 1.47 2600 0.3494 86.8884 79.2484 86.3719 86.3622 14.9069
0.3095 1.52 2700 0.3539 87.0166 79.5218 86.5167 86.5248 14.8905
0.3274 1.58 2800 0.3599 87.2104 79.7277 86.7135 86.7127 14.8854
0.312 1.64 2900 0.3536 86.8926 79.2971 86.3699 86.3666 14.8886
0.3134 1.69 3000 0.3518 87.0884 79.5848 86.5877 86.6005 14.9028
0.3012 1.75 3100 0.3573 86.3559 78.1413 85.8416 85.8479 14.8763
0.311 1.81 3200 0.3467 86.9837 79.4983 86.4827 86.4981 14.8937
0.303 1.86 3300 0.3422 86.9232 79.3542 86.4098 86.4427 14.9032
0.304 1.92 3400 0.3409 87.1583 79.8003 86.6556 86.6858 14.8987
0.2934 1.98 3500 0.3485 87.0529 79.6491 86.5825 86.6003 14.9000
0.247 2.03 3600 0.3586 87.0147 79.6418 86.5126 86.5339 14.9042
0.193 2.09 3700 0.3667 86.9326 79.4481 86.4675 86.4709 14.9128
0.195 2.14 3800 0.3673 86.8892 79.3638 86.3717 86.3866 14.9210
0.19 2.2 3900 0.3670 86.8789 79.4677 86.3925 86.3892 14.9023
0.2033 2.26 4000 0.3600 86.9004 79.5211 86.4043 86.407 14.9042
0.1969 2.31 4100 0.3587 87.0403 79.7208 86.5257 86.5245 14.8978
0.2035 2.37 4200 0.3630 86.8793 79.4667 86.3931 86.3875 14.8895
0.2162 2.43 4300 0.3722 86.78 79.3367 86.2742 86.2812 14.9083
0.1984 2.48 4400 0.3573 86.7248 79.2577 86.218 86.2139 14.8918
0.2058 2.54 4500 0.3617 86.6452 79.1422 86.1701 86.1838 14.8909
0.2161 2.6 4600 0.3554 86.8574 79.5476 86.3982 86.4095 14.9283
0.215 2.65 4700 0.3583 86.8873 79.5265 86.4039 86.3996 14.8923
0.2048 2.71 4800 0.3535 86.8465 79.3852 86.3446 86.344 14.8978
0.2099 2.77 4900 0.3601 86.8952 79.4424 86.3888 86.387 14.8868
0.2149 2.82 5000 0.3603 86.7871 79.2397 86.297 86.3004 14.8850
0.2251 2.88 5100 0.3448 86.9477 79.6744 86.4984 86.4911 14.9133
0.2048 2.93 5200 0.3522 86.8843 79.37 86.3702 86.3668 14.8955
0.2099 2.99 5300 0.3459 86.7938 79.2104 86.3027 86.3169 14.9137
0.1377 3.05 5400 0.4000 86.9855 79.4184 86.438 86.4375 14.9110
0.1369 3.1 5500 0.3848 86.8338 79.2098 86.2885 86.3028 14.9019
0.1357 3.16 5600 0.3914 86.7061 79.2474 86.2247 86.2237 14.9105
0.1263 3.22 5700 0.3864 86.7128 79.1103 86.2121 86.2166 14.9137
0.135 3.27 5800 0.3929 86.8134 79.4572 86.3608 86.3683 14.9124
0.1361 3.33 5900 0.3828 86.9149 79.4756 86.4152 86.3959 14.8959
0.1286 3.39 6000 0.3849 86.8025 79.3645 86.3215 86.3204 14.8996
0.1335 3.44 6100 0.3793 86.7591 79.2887 86.2778 86.2765 14.9105
0.1278 3.5 6200 0.3938 86.8352 79.4161 86.3282 86.3376 14.9169
0.1346 3.56 6300 0.3943 86.9637 79.6404 86.4753 86.4718 14.8978
0.1421 3.61 6400 0.3799 86.8445 79.4133 86.3271 86.3206 14.9151
0.1398 3.67 6500 0.3923 86.9793 79.6847 86.4935 86.4889 14.9174
0.1359 3.72 6600 0.3912 86.9095 79.3593 86.4296 86.4506 14.8959
0.1444 3.78 6700 0.3741 86.8498 79.3141 86.3586 86.3681 14.8909
0.1351 3.84 6800 0.3840 87.223 79.825 86.7127 86.7371 14.8877
0.1325 3.89 6900 0.3816 87.148 79.8102 86.6405 86.6511 14.9133
0.1315 3.95 7000 0.3796 86.7778 79.3782 86.3057 86.2939 14.9005
0.1332 4.01 7100 0.3962 87.0238 79.6621 86.5384 86.5306 14.8996
0.0834 4.06 7200 0.4271 86.9999 79.7076 86.4981 86.5026 14.9014
0.088 4.12 7300 0.4176 86.9193 79.4698 86.4085 86.4171 14.9128
0.0897 4.18 7400 0.4109 86.9287 79.5866 86.4541 86.4474 14.9037
0.0908 4.23 7500 0.4109 87.1272 79.7632 86.6206 86.6176 14.9133
0.0895 4.29 7600 0.4114 87.0107 79.7349 86.4873 86.4754 14.9023
0.0856 4.35 7700 0.4242 87.0115 79.6387 86.4786 86.49 14.8982
0.0852 4.4 7800 0.4271 86.9943 79.6717 86.5126 86.5026 14.9019
0.0919 4.46 7900 0.4216 86.9903 79.67 86.512 86.5085 14.8937
0.0907 4.51 8000 0.4180 87.0323 79.7092 86.5391 86.5343 14.8978
0.0889 4.57 8100 0.4276 86.9813 79.6367 86.4697 86.4724 14.9115
0.0907 4.63 8200 0.4209 87.0149 79.5637 86.5028 86.5059 14.9092
0.0966 4.68 8300 0.4064 86.9685 79.4665 86.4393 86.4523 14.9010
0.088 4.74 8400 0.4234 86.9921 79.5729 86.4977 86.5067 14.8800
0.0897 4.8 8500 0.4117 87.0727 79.7094 86.5465 86.5482 14.9014
0.0924 4.85 8600 0.4056 86.8789 79.409 86.3689 86.3672 14.9083
0.0916 4.91 8700 0.4127 86.8645 79.4195 86.3814 86.3729 14.8982
0.0908 4.97 8800 0.4054 86.9146 79.4138 86.4022 86.399 14.9000
0.078 5.02 8900 0.4403 87.0178 79.6166 86.5112 86.505 14.9078
0.0583 5.08 9000 0.4400 86.9828 79.649 86.4913 86.4962 14.9064
0.057 5.14 9100 0.4637 87.0435 79.6446 86.5464 86.5252 14.9037
0.0581 5.19 9200 0.4617 87.017 79.6255 86.5004 86.4907 14.9069
0.0562 5.25 9300 0.4521 86.8638 79.479 86.3298 86.338 14.9096
0.0588 5.3 9400 0.4472 86.9719 79.5608 86.4751 86.4798 14.9073
0.0571 5.36 9500 0.4472 87.0325 79.6355 86.5154 86.5278 14.9073
0.0589 5.42 9600 0.4580 87.1556 79.8992 86.627 86.6372 14.9064
0.057 5.47 9700 0.4527 87.0033 79.6457 86.4846 86.5031 14.9101
0.0595 5.53 9800 0.4538 87.0419 79.6632 86.5261 86.5434 14.9055
0.062 5.59 9900 0.4518 87.0581 79.6818 86.54 86.551 14.9005
0.0568 5.64 10000 0.4549 87.1255 79.8908 86.6143 86.6255 14.9042
0.0572 5.7 10100 0.4557 86.9927 79.5946 86.4726 86.4953 14.9023
0.0603 5.76 10200 0.4493 87.0665 79.7469 86.58 86.5934 14.8932
0.0604 5.81 10300 0.4533 87.0864 79.7039 86.5871 86.5851 14.9042
0.0564 5.87 10400 0.4653 87.082 79.766 86.5835 86.5775 14.9055
0.0579 5.93 10500 0.4677 86.9805 79.5068 86.4708 86.4744 14.8882
0.0582 5.98 10600 0.4607 86.9273 79.3762 86.4228 86.4225 14.9119
0.0454 6.04 10700 0.4917 87.038 79.6146 86.5363 86.533 14.9156
0.0399 6.09 10800 0.4986 87.0026 79.5481 86.4992 86.4924 14.9042
0.0367 6.15 10900 0.5115 87.13 79.7506 86.6082 86.621 14.9055
0.0405 6.21 11000 0.5084 87.0768 79.6986 86.5541 86.5403 14.9083
0.0386 6.26 11100 0.5092 87.1376 79.7442 86.5937 86.5767 14.8996
0.0382 6.32 11200 0.5063 87.0779 79.7205 86.561 86.5546 14.8982
0.0431 6.38 11300 0.4950 87.0998 79.7699 86.5882 86.5916 14.9028
0.0388 6.43 11400 0.5098 87.1711 79.8707 86.6425 86.6409 14.9023
0.041 6.49 11500 0.4911 87.1742 79.8319 86.6434 86.6522 14.9005
0.0379 6.55 11600 0.5023 87.2258 79.9175 86.7019 86.7018 14.9010
0.0383 6.6 11700 0.5078 87.0913 79.7547 86.5767 86.5826 14.9046
0.0387 6.66 11800 0.5111 87.1913 79.9592 86.6805 86.6742 14.9060
0.0362 6.72 11900 0.5125 87.0096 79.6639 86.5037 86.5039 14.9124
0.0343 6.77 12000 0.5210 87.0657 79.7384 86.5621 86.5561 14.9110
0.0401 6.83 12100 0.5110 87.1338 79.8537 86.6368 86.6271 14.9124
0.0353 6.88 12200 0.5169 87.082 79.756 86.5771 86.5718 14.9073
0.0384 6.94 12300 0.4998 87.1211 79.8474 86.6016 86.6065 14.9078
0.0395 7.0 12400 0.5184 87.1621 79.8793 86.6411 86.648 14.9064
0.0243 7.05 12500 0.5387 87.1588 79.8545 86.6464 86.6627 14.9019
0.0283 7.11 12600 0.5384 87.1909 79.8888 86.6567 86.6698 14.9042
0.026 7.17 12700 0.5459 87.1782 79.7991 86.6373 86.6507 14.9028
0.0303 7.22 12800 0.5301 87.1014 79.7321 86.5581 86.5743 14.9014
0.0252 7.28 12900 0.5481 87.0907 79.6948 86.5306 86.5474 14.9069
0.0273 7.34 13000 0.5469 87.0971 79.6697 86.5392 86.558 14.8987
0.0249 7.39 13100 0.5462 87.095 79.6904 86.5559 86.566 14.9037
0.0246 7.45 13200 0.5553 87.0964 79.6834 86.5572 86.5607 14.9055
0.0286 7.51 13300 0.5501 87.0933 79.7177 86.5579 86.5582 14.9092
0.0234 7.56 13400 0.5550 87.1266 79.7546 86.5833 86.5855 14.9087
0.0263 7.62 13500 0.5570 87.0957 79.6859 86.5608 86.5584 14.9064
0.0238 7.67 13600 0.5630 87.1368 79.7487 86.6036 86.6031 14.9032
0.0258 7.73 13700 0.5598 87.1527 79.7481 86.622 86.6153 14.9055
0.0249 7.79 13800 0.5649 87.15 79.7419 86.6106 86.6056 14.9046
0.0272 7.84 13900 0.5616 87.1439 79.7597 86.6085 86.6081 14.9042
0.0261 7.9 14000 0.5596 87.1359 79.7696 86.6081 86.6024 14.9051
0.0233 7.96 14100 0.5611 87.1367 79.7636 86.6112 86.6019 14.9046
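
To read the table programmatically, a short sketch like the one below can parse the whitespace-separated rows and pick out checkpoints of interest. It embeds only a handful of rows copied from the table above. The column keys and the `parse_rows` helper are illustrative names, and the steps-per-epoch figure is only an estimate because the Epoch column is rounded to two decimals.

```python
# A few rows copied verbatim from the results table; paste all rows for a full analysis.
ROWS = """\
3.9063 0.06 100 0.8111 27.4937 22.9629 27.3015 27.2771 7.4286
0.303 1.86 3300 0.3422 86.9232 79.3542 86.4098 86.4427 14.9032
0.304 1.92 3400 0.3409 87.1583 79.8003 86.6556 86.6858 14.8987
0.2934 1.98 3500 0.3485 87.0529 79.6491 86.5825 86.6003 14.9000
0.0379 6.55 11600 0.5023 87.2258 79.9175 86.7019 86.7018 14.9010
0.0233 7.96 14100 0.5611 87.1367 79.7636 86.6112 86.6019 14.9046
"""

# Illustrative keys, in the same order as the table header.
COLUMNS = ["train_loss", "epoch", "step", "val_loss",
           "rouge1", "rouge2", "rougeL", "rougeLsum", "gen_len"]


def parse_rows(text):
    """Turn whitespace-separated table rows into a list of dicts."""
    records = []
    for line in text.strip().splitlines():
        record = dict(zip(COLUMNS, map(float, line.split())))
        record["step"] = int(record["step"])
        records.append(record)
    return records


records = parse_rows(ROWS)
best_loss = min(records, key=lambda r: r["val_loss"])
best_rouge1 = max(records, key=lambda r: r["rouge1"])

# Estimate optimizer steps per epoch from the last logged row.
last = records[-1]
steps_per_epoch = last["step"] / last["epoch"]

print("lowest validation loss at step", best_loss["step"])
print("highest Rouge1 at step", best_rouge1["step"])
print("approx. steps per epoch:", round(steps_per_epoch))
```

On the full table, the row with the lowest validation loss is step 3400 (0.3409), and its metrics match the headline results at the top of this card. After about epoch 2 the validation loss rises while the training loss keeps falling and ROUGE stays near 87, which may indicate overfitting as measured by loss.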

Framework versions

  • Transformers 4.30.1
  • PyTorch 1.11.0a0+b6df043
  • Datasets 2.12.0
  • Tokenizers 0.13.3