
flan-t5-large

This model is a fine-tuned version of google/flan-t5-large on an unknown dataset. It achieves the following results on the evaluation set:

  • Gen Len: 100.0179
  • Loss: 1.7264
  • Rouge1: 45.7249
  • Rouge2: 19.5118
  • Rougel: 33.0194
  • Rougelsum: 42.7326
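
Because the training dataset is not documented, the exact task is unconfirmed; the ROUGE metrics and generation lengths above suggest a summarization-style use. Below is a minimal loading sketch with the standard `transformers` seq2seq API; the repository ID and the `summarize:` prompt prefix are assumptions, not confirmed by this card.

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Hypothetical repository ID -- replace with this checkpoint's actual path.
model_id = "your-username/flan-t5-large-finetuned"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# The task prefix is an assumption based on the ROUGE metrics reported above.
text = "summarize: <your input document>"
inputs = tokenizer(text, return_tensors="pt", truncation=True)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```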

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a sketch mapping them onto Trainer arguments follows the list):

  • learning_rate: 0.0001
  • train_batch_size: 14
  • eval_batch_size: 14
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • total_train_batch_size: 56
  • total_eval_batch_size: 56
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 5
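
As a rough guide to reproducing this setup, here is a hedged sketch of how the hyperparameters above could map onto `Seq2SeqTrainingArguments` from the Hugging Face Trainer API. The `output_dir` and `predict_with_generate` values are assumptions not stated in this card; per-device batch sizes follow from the listed totals (14 per device × 4 GPUs = 56).

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="flan-t5-large-finetuned",  # hypothetical output path
    learning_rate=1e-4,
    per_device_train_batch_size=14,  # 14 x 4 devices = total batch size 56
    per_device_eval_batch_size=14,
    seed=42,
    num_train_epochs=5,
    lr_scheduler_type="linear",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    predict_with_generate=True,  # assumption: needed for Gen Len / ROUGE eval
)
```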

Training results

| Training Loss | Epoch | Step | Gen Len  | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum |
|:-------------:|:-----:|:----:|:--------:|:---------------:|:-------:|:-------:|:-------:|:---------:|
| No log        | 1.0   | 302  | 106.2758 | 1.7235          | 44.3127 | 18.5321 | 31.8618 | 41.1286   |
| 1.8444        | 2.0   | 604  | 98.7378  | 1.7037          | 44.6191 | 18.5290 | 32.4108 | 41.5764   |
| 1.8444        | 3.0   | 906  | 100.9276 | 1.7065          | 45.2044 | 19.0224 | 32.7758 | 42.1475   |
| 1.6046        | 4.0   | 1208 | 97.4516  | 1.7114          | 45.5049 | 19.1992 | 32.9050 | 42.5556   |
| 1.5650        | 5.0   | 1510 | 100.0179 | 1.7264          | 45.7249 | 19.5118 | 33.0194 | 42.7326   |

Framework versions

  • Transformers 4.38.2
  • Pytorch 2.2.1+cu121
  • Datasets 2.18.0
  • Tokenizers 0.15.2
