---
license: apache-2.0
base_model: t5-small
tags:
  - generated_from_trainer
datasets:
  - multi_news
model-index:
  - name: t5-small_multinews_model
    results: []
---

# t5-small_multinews_model

This model is a fine-tuned version of [t5-small](https://huggingface.co/t5-small) on the multi_news dataset. It achieves the following results on the evaluation set:

- Loss: 2.6269
- Rouge1: 0.1471
- Rouge2: 0.0483
- RougeL: 0.1131
- RougeLsum: 0.1131
- Bleu: 0.0003
- Bleu precisions: [0.5848502090652357, 0.18492208339182928, 0.08486295668446923, 0.04842115016777968]
- Bleu brevity penalty: 0.0022
- Bleu length ratio: 0.1408
- Bleu translation length: 191567
- Bleu reference length: 1360656
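
For a quick qualitative check, here is a minimal usage sketch. The repo id `asandhir/t5-small_multinews_model` and the `summarize:` task prefix are assumptions, not confirmed by this card; adjust both to match the actual checkpoint:

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "asandhir/t5-small_multinews_model"  # hypothetical repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# T5 summarization fine-tunes typically expect a task prefix (assumed here).
# multi_news concatenates source articles with a "|||||" separator.
text = "summarize: " + "First article text ... ||||| Second article text ..."
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
summary_ids = model.generate(**inputs, max_new_tokens=128, num_beams=4)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```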

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a sketch of the equivalent `Seq2SeqTrainingArguments` follows the list):

- learning_rate: 2e-05
- train_batch_size: 4
- eval_batch_size: 4
- seed: 42
- optimizer: Adam with betas=(0.9, 0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 5
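
A minimal sketch of how these settings map onto `Seq2SeqTrainingArguments`; `output_dir` and `evaluation_strategy` are assumptions (the per-epoch rows in the results table suggest epoch-level evaluation), everything else matches the list above:

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="t5-small_multinews_model",  # illustrative path
    learning_rate=2e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=42,
    num_train_epochs=5,
    lr_scheduler_type="linear",
    adam_beta1=0.9,                # Adam betas=(0.9, 0.999)
    adam_beta2=0.999,
    adam_epsilon=1e-8,             # epsilon=1e-08
    evaluation_strategy="epoch",   # assumption: metrics are reported once per epoch
    predict_with_generate=True,    # required to compute ROUGE/BLEU on generated text
)
```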

### Training results

| Training Loss | Epoch | Step  | Validation Loss | Rouge1 | Rouge2 | RougeL | RougeLsum | Bleu   | Bleu precisions | Brevity penalty | Length ratio | Translation length | Reference length |
|:-------------:|:-----:|:-----:|:---------------:|:------:|:------:|:------:|:---------:|:------:|:---------------:|:---------------:|:------------:|:------------------:|:----------------:|
| 2.9189        | 1.0   | 7870  | 2.6869          | 0.1448 | 0.0474 | 0.1117 | 0.1117    | 0.0003 | [0.5827522821123012, 0.1820493433028088, 0.08242051182628926, 0.04574874477953644] | 0.0023 | 0.1411 | 192037 | 1360656 |
| 2.8435        | 2.0   | 15740 | 2.6535          | 0.1460 | 0.0474 | 0.1122 | 0.1122    | 0.0003 | [0.5809636959568958, 0.18126278620071182, 0.08254004826406995, 0.04636911719064694] | 0.0023 | 0.1410 | 191907 | 1360656 |
| 2.7922        | 3.0   | 23610 | 2.6389          | 0.1461 | 0.0477 | 0.1124 | 0.1124    | 0.0003 | [0.581669805398619, 0.18257649098318213, 0.08343485040444401, 0.0471782007379682] | 0.0022 | 0.1405 | 191160 | 1360656 |
| 2.814         | 4.0   | 31480 | 2.6280          | 0.1468 | 0.0478 | 0.1129 | 0.1129    | 0.0003 | [0.5844809737428239, 0.18360803285143726, 0.08381524001996615, 0.04753093788548009] | 0.0022 | 0.1406 | 191262 | 1360656 |
| 2.7869        | 5.0   | 39350 | 2.6269          | 0.1471 | 0.0483 | 0.1131 | 0.1131    | 0.0003 | [0.5848502090652357, 0.18492208339182928, 0.08486295668446923, 0.04842115016777968] | 0.0022 | 0.1408 | 191567 | 1360656 |
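
Note that the very low BLEU is largely a length effect rather than a quality signal: with a length ratio of about 0.14, the brevity penalty exp(1 - 1/0.1408) ≈ 0.0022 suppresses otherwise reasonable n-gram precisions. A minimal sketch of how these metrics can be recomputed with the `evaluate` library (predictions and references are placeholders):

```python
# pip install evaluate rouge_score
import evaluate

rouge = evaluate.load("rouge")
bleu = evaluate.load("bleu")

predictions = ["the generated summary for one example ..."]  # placeholder model outputs
references = ["the reference summary for the same example ..."]

# Returns rouge1 / rouge2 / rougeL / rougeLsum, matching the columns above.
print(rouge.compute(predictions=predictions, references=references))

# Returns bleu, precisions, brevity_penalty, length_ratio,
# translation_length, and reference_length, matching the columns above.
print(bleu.compute(predictions=predictions, references=references))
```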

### Framework versions

- Transformers 4.32.1
- Pytorch 2.0.1+cu118
- Datasets 2.14.4
- Tokenizers 0.13.3