
t5-small-finetuned-xsum-ashish-5000

This model is a fine-tuned version of t5-small on an unknown dataset (the model name suggests a 5,000-example subset of XSum, but the training data is not documented in this card). It achieves the following results on the evaluation set:

  • Loss: 2.6200
  • Rouge1: 14.8258
  • Rouge2: 4.7741
  • RougeL: 11.3583
  • RougeLsum: 13.2147
  • Gen Len: 19.0

Model description

More information needed

Intended uses & limitations

More information needed
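
The card itself gives no usage details. As a hedged sketch, the checkpoint should work with the standard transformers summarization workflow; the repository id below is an assumption based on the model name, and the generation settings are guesses informed by the Gen Len of 19.0 reported above.

```python
# Minimal usage sketch, not from the card. The hub id below is an assumption;
# replace it with the actual repository path.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "t5-small-finetuned-xsum-ashish-5000"  # hypothetical hub id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

article = "The full text of a news article to summarize goes here."
# T5 summarization checkpoints conventionally expect a "summarize: " prefix.
inputs = tokenizer("summarize: " + article, return_tensors="pt", truncation=True)

# The reported Gen Len of 19.0 suggests short outputs (max_length near 20).
summary_ids = model.generate(**inputs, max_length=20, num_beams=4)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```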

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a sketch mapping them onto transformers arguments follows the list):

  • learning_rate: 2e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 40
  • mixed_precision_training: Native AMP
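
Assuming the card was produced by the transformers Trainer (its autogenerated layout suggests so), the hyperparameters above map onto Seq2SeqTrainingArguments roughly as sketched here; this is a reconstruction, not the author's script, and the evaluation/generation flags are assumptions.

```python
from transformers import Seq2SeqTrainingArguments

# Hedged reconstruction of the listed hyperparameters.
training_args = Seq2SeqTrainingArguments(
    output_dir="t5-small-finetuned-xsum-ashish-5000",
    learning_rate=2e-5,              # learning_rate: 2e-05
    per_device_train_batch_size=16,  # train_batch_size: 16
    per_device_eval_batch_size=16,   # eval_batch_size: 16
    seed=42,                         # seed: 42
    lr_scheduler_type="linear",      # lr_scheduler_type: linear
    num_train_epochs=40,             # num_epochs: 40
    fp16=True,                       # mixed_precision_training: Native AMP
    # Adam with betas=(0.9, 0.999) and epsilon=1e-08 matches the default optimizer.
    evaluation_strategy="epoch",     # assumption: the table reports per-epoch eval
    predict_with_generate=True,      # assumption: needed to compute ROUGE in eval
)
```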

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | RougeL | RougeLsum | Gen Len |
|:-------------:|:-----:|:-----:|:---------------:|:-------:|:------:|:-------:|:---------:|:-------:|
| No log | 1.0 | 313 | 2.8460 | 13.7208 | 4.2759 | 10.4447 | 12.1604 | 19.0 |
| 2.8939 | 2.0 | 626 | 2.7686 | 14.0884 | 4.4571 | 10.8946 | 12.6399 | 19.0 |
| 2.8939 | 3.0 | 939 | 2.7323 | 14.249 | 4.4839 | 10.9701 | 12.7336 | 19.0 |
| 2.6857 | 4.0 | 1252 | 2.7140 | 14.4123 | 4.5447 | 11.09 | 12.8468 | 19.0 |
| 2.6353 | 5.0 | 1565 | 2.6962 | 14.4931 | 4.6524 | 11.1552 | 12.9235 | 19.0 |
| 2.6353 | 6.0 | 1878 | 2.6827 | 14.6765 | 4.6571 | 11.2099 | 13.0457 | 19.0 |
| 2.6005 | 7.0 | 2191 | 2.6743 | 14.6923 | 4.6506 | 11.1972 | 13.0305 | 19.0 |
| 2.5721 | 8.0 | 2504 | 2.6691 | 14.8242 | 4.7211 | 11.2794 | 13.1706 | 19.0 |
| 2.5721 | 9.0 | 2817 | 2.6598 | 14.9018 | 4.7961 | 11.3472 | 13.2632 | 19.0 |
| 2.5526 | 10.0 | 3130 | 2.6559 | 14.8855 | 4.8159 | 11.3402 | 13.2578 | 19.0 |
| 2.5526 | 11.0 | 3443 | 2.6533 | 14.8022 | 4.7367 | 11.2253 | 13.1308 | 19.0 |
| 2.5352 | 12.0 | 3756 | 2.6490 | 14.7306 | 4.6719 | 11.158 | 13.1083 | 19.0 |
| 2.5238 | 13.0 | 4069 | 2.6460 | 14.7908 | 4.6958 | 11.2061 | 13.1103 | 19.0 |
| 2.5238 | 14.0 | 4382 | 2.6436 | 14.7332 | 4.7132 | 11.1581 | 13.0709 | 19.0 |
| 2.5067 | 15.0 | 4695 | 2.6403 | 14.7062 | 4.7363 | 11.1275 | 13.0921 | 19.0 |
| 2.4922 | 16.0 | 5008 | 2.6382 | 14.735 | 4.6939 | 11.1301 | 13.0941 | 19.0 |
| 2.4922 | 17.0 | 5321 | 2.6353 | 14.8166 | 4.7615 | 11.2635 | 13.1526 | 19.0 |
| 2.4841 | 18.0 | 5634 | 2.6334 | 14.8517 | 4.8063 | 11.2705 | 13.1878 | 19.0 |
| 2.4841 | 19.0 | 5947 | 2.6306 | 14.7038 | 4.6747 | 11.1493 | 13.0818 | 19.0 |
| 2.4789 | 20.0 | 6260 | 2.6312 | 14.8127 | 4.7543 | 11.2775 | 13.1812 | 19.0 |
| 2.4644 | 21.0 | 6573 | 2.6285 | 14.7922 | 4.7114 | 11.2655 | 13.1716 | 19.0 |
| 2.4644 | 22.0 | 6886 | 2.6270 | 14.8587 | 4.78 | 11.3163 | 13.2017 | 19.0 |
| 2.4506 | 23.0 | 7199 | 2.6264 | 14.7304 | 4.6852 | 11.2138 | 13.1306 | 19.0 |
| 2.4595 | 24.0 | 7512 | 2.6258 | 14.7294 | 4.6597 | 11.2354 | 13.1126 | 19.0 |
| 2.4595 | 25.0 | 7825 | 2.6257 | 14.6318 | 4.6467 | 11.1913 | 13.0587 | 19.0 |
| 2.4523 | 26.0 | 8138 | 2.6250 | 14.7609 | 4.7037 | 11.2777 | 13.1711 | 19.0 |
| 2.4523 | 27.0 | 8451 | 2.6231 | 14.7342 | 4.7566 | 11.2569 | 13.1351 | 19.0 |
| 2.4317 | 28.0 | 8764 | 2.6223 | 14.725 | 4.7248 | 11.247 | 13.1234 | 19.0 |
| 2.4374 | 29.0 | 9077 | 2.6231 | 14.6911 | 4.7196 | 11.2372 | 13.0854 | 19.0 |
| 2.4374 | 30.0 | 9390 | 2.6234 | 14.6889 | 4.7202 | 11.2565 | 13.1003 | 19.0 |
| 2.4323 | 31.0 | 9703 | 2.6222 | 14.7264 | 4.7543 | 11.2752 | 13.1442 | 19.0 |
| 2.4295 | 32.0 | 10016 | 2.6215 | 14.7613 | 4.723 | 11.2632 | 13.1389 | 19.0 |
| 2.4295 | 33.0 | 10329 | 2.6212 | 14.7716 | 4.7676 | 11.3014 | 13.1637 | 19.0 |
| 2.4282 | 34.0 | 10642 | 2.6211 | 14.7547 | 4.7437 | 11.296 | 13.1552 | 19.0 |
| 2.4282 | 35.0 | 10955 | 2.6203 | 14.7717 | 4.7502 | 11.2999 | 13.1498 | 19.0 |
| 2.4265 | 36.0 | 11268 | 2.6208 | 14.7952 | 4.7795 | 11.3294 | 13.1866 | 19.0 |
| 2.4145 | 37.0 | 11581 | 2.6203 | 14.8122 | 4.7814 | 11.3385 | 13.1882 | 19.0 |
| 2.4145 | 38.0 | 11894 | 2.6202 | 14.8281 | 4.7798 | 11.3381 | 13.2065 | 19.0 |
| 2.4241 | 39.0 | 12207 | 2.6202 | 14.8163 | 4.7801 | 11.3492 | 13.2034 | 19.0 |
| 2.4163 | 40.0 | 12520 | 2.6200 | 14.8258 | 4.7741 | 11.3583 | 13.2147 | 19.0 |
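
The ROUGE columns above are on the usual 0-100 F1 scale. Below is a minimal sketch of how such scores are typically computed with the datasets library listed under framework versions; this is an assumption about tooling, not the author's evaluation script.

```python
# Requires the rouge_score package (pip install rouge_score).
from datasets import load_metric

rouge = load_metric("rouge")
predictions = ["the cat sat on the mat"]       # model summaries
references = ["a cat was sitting on the mat"]  # gold summaries
result = rouge.compute(predictions=predictions, references=references)

# The card reports mid F1 values scaled by 100.
print({k: round(v.mid.fmeasure * 100, 4) for k, v in result.items()})
```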

Framework versions

  • Transformers 4.17.0
  • Pytorch 1.12.1+cu113
  • Datasets 2.6.1
  • Tokenizers 0.13.2