t5-small_multinews_model

This model is a fine-tuned version of t5-small on the multi_news dataset. It achieves the following results on the evaluation set:

  • Loss: 2.6269
  • ROUGE-1: 0.1471
  • ROUGE-2: 0.0483
  • ROUGE-L: 0.1131
  • ROUGE-Lsum: 0.1131
  • BLEU: 0.0003
  • BLEU precisions (1- to 4-gram): [0.5848502090652357, 0.18492208339182928, 0.08486295668446923, 0.04842115016777968]
  • BLEU brevity penalty: 0.0022
  • BLEU length ratio: 0.1408
  • BLEU translation length: 191567
  • BLEU reference length: 1360656
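
The BLEU figures above are linked by the standard BLEU definition, and the arithmetic shows why the score is so low despite reasonable n-gram precisions: the generated summaries are about 14% of the reference length, so the brevity penalty dominates. The sketch below only recomputes the reported numbers from the values in the list; it assumes the standard formula (brevity penalty times the geometric mean of the four precisions) and is not taken from the author's evaluation code.

```python
import math

# Values copied from the evaluation results above.
precisions = [0.5848502090652357, 0.18492208339182928,
              0.08486295668446923, 0.04842115016777968]
translation_length = 191567   # total length of the generated summaries
reference_length = 1360656    # total length of the reference summaries

length_ratio = translation_length / reference_length

# Standard BLEU brevity penalty: 1.0 when the output is at least as long as
# the reference, otherwise exp(1 - reference_length / translation_length).
if translation_length >= reference_length:
    brevity_penalty = 1.0
else:
    brevity_penalty = math.exp(1 - reference_length / translation_length)

# Geometric mean of the n-gram precisions (uniform weights assumed).
geo_mean = math.exp(sum(math.log(p) for p in precisions) / len(precisions))

bleu = brevity_penalty * geo_mean

print(f"length ratio:    {length_ratio:.4f}")
print(f"brevity penalty: {brevity_penalty:.4f}")
print(f"BLEU:            {bleu:.4f}")
```

Rounded to four decimals, these reproduce the reported length ratio (0.1408), brevity penalty (0.0022), and BLEU (0.0003). The geometric mean of the precisions alone is roughly 0.145, so nearly all of the gap comes from the output length.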

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 5
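
As a rough illustration, the hyperparameters listed above could be expressed as keyword arguments for `Seq2SeqTrainingArguments` from the Transformers library. This is a sketch under assumptions, not the author's training script: the `output_dir` value is hypothetical, and mapping `train_batch_size` and `eval_batch_size` to the per-device arguments assumes a single-device run.

```python
# Hypothetical mapping of the listed hyperparameters onto keyword arguments
# accepted by transformers.Seq2SeqTrainingArguments.
training_kwargs = {
    "output_dir": "t5-small_multinews_model",  # assumed, not stated in the card
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 4,  # assumes a single device
    "per_device_eval_batch_size": 4,   # assumes a single device
    "seed": 42,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 5,
}


def build_training_args():
    """Build the arguments object; imports transformers only when called."""
    from transformers import Seq2SeqTrainingArguments

    return Seq2SeqTrainingArguments(**training_kwargs)
```

Calling `build_training_args()` requires the `transformers` package. Settings not listed in the card, such as evaluation frequency or generation options, are left at their defaults here.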

Training results

| Training Loss | Epoch | Step | Validation Loss | ROUGE-1 | ROUGE-2 | ROUGE-L | ROUGE-Lsum | BLEU | BLEU Precisions | BLEU Brevity Penalty | BLEU Length Ratio | BLEU Translation Length | BLEU Reference Length |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2.9189 | 1.0 | 7870 | 2.6869 | 0.1448 | 0.0474 | 0.1117 | 0.1117 | 0.0003 | [0.5827522821123012, 0.1820493433028088, 0.08242051182628926, 0.04574874477953644] | 0.0023 | 0.1411 | 192037 | 1360656 |
| 2.8435 | 2.0 | 15740 | 2.6535 | 0.1460 | 0.0474 | 0.1122 | 0.1122 | 0.0003 | [0.5809636959568958, 0.18126278620071182, 0.08254004826406995, 0.04636911719064694] | 0.0023 | 0.1410 | 191907 | 1360656 |
| 2.7922 | 3.0 | 23610 | 2.6389 | 0.1461 | 0.0477 | 0.1124 | 0.1124 | 0.0003 | [0.581669805398619, 0.18257649098318213, 0.08343485040444401, 0.0471782007379682] | 0.0022 | 0.1405 | 191160 | 1360656 |
| 2.814 | 4.0 | 31480 | 2.6280 | 0.1468 | 0.0478 | 0.1129 | 0.1129 | 0.0003 | [0.5844809737428239, 0.18360803285143726, 0.08381524001996615, 0.04753093788548009] | 0.0022 | 0.1406 | 191262 | 1360656 |
| 2.7869 | 5.0 | 39350 | 2.6269 | 0.1471 | 0.0483 | 0.1131 | 0.1131 | 0.0003 | [0.5848502090652357, 0.18492208339182928, 0.08486295668446923, 0.04842115016777968] | 0.0022 | 0.1408 | 191567 | 1360656 |
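
The Step column can be read against the batch size to estimate how many training examples were seen per epoch. The sketch below is a back-of-the-envelope calculation only; it assumes a single device and no gradient accumulation, neither of which the card states.

```python
# Values taken from the training results and hyperparameters in this card.
total_steps = 39350
num_epochs = 5
train_batch_size = 4

steps_per_epoch = total_steps // num_epochs

# Assuming one device and no gradient accumulation, each step consumes one
# batch. The final batch of an epoch may be partial, so this gives a range.
max_examples_per_epoch = steps_per_epoch * train_batch_size
min_examples_per_epoch = (steps_per_epoch - 1) * train_batch_size + 1

print(f"steps per epoch: {steps_per_epoch}")
print(f"examples per epoch: {min_examples_per_epoch} to {max_examples_per_epoch}")
```

Under those assumptions, each epoch covers 7870 steps and between 31,477 and 31,480 training examples.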

Framework versions

  • Transformers 4.32.1
  • Pytorch 2.0.1+cu118
  • Datasets 2.14.4
  • Tokenizers 0.13.3

Model repository: asandhir/t5-small_multinews_model (trained on the multi_news dataset)