flan-t5-large-reaction-extraction

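This model is a fine-tuned version of google/flan-t5-large. The training dataset was not recorded (the auto-generated card lists it as "None"). It achieves the following results on the evaluation set; they match the final (epoch 5) row of the training results: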

  • Loss: 0.8736
  • Rouge1: 34.4249
  • Rouge2: 21.4943
  • Rougel: 33.4902
  • Rougelsum: 33.466
  • Gen Len: 28.1622
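
For readers who want to try the model, here is a minimal inference sketch using the standard Transformers seq2seq classes. Several things in it are assumptions rather than facts from this card: the repository id `your-namespace/flan-t5-large-reaction-extraction` is a placeholder (the card does not state the owner namespace), the helper name `extract_reactions` is hypothetical, the expected input format is undocumented, and `max_new_tokens=64` is only chosen to leave headroom over the reported average generation length of about 28 tokens.

```python
# Placeholder repository id: replace "your-namespace" with the real owner.
MODEL_ID = "your-namespace/flan-t5-large-reaction-extraction"


def extract_reactions(text, model_id=MODEL_ID, max_new_tokens=64):
    """Generate the model's output for one input string (hypothetical helper)."""
    # Imported lazily so this file can be loaded without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

    # Tokenize to PyTorch tensors, truncating inputs that exceed the model limit.
    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `extract_reactions("...")` downloads the weights on first use. Because the card gives no prompt template, the right input text (raw passage versus an instruction-prefixed prompt) has to be determined by experiment.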

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 1
  • eval_batch_size: 4
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 4
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 5.0
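
The listed values can be written as keyword arguments for `Seq2SeqTrainingArguments`. This is a sketch, not the author's actual training script: mapping `train_batch_size` and `eval_batch_size` to the per-device arguments assumes a single device, `predict_with_generate=True` is an assumption (ROUGE is normally computed on generated text), and the `output_dir` default and the helper names are hypothetical.

```python
# Hyperparameters copied from the list above, renamed to the
# Seq2SeqTrainingArguments keyword names (single-device assumption).
HYPERPARAMS = {
    "learning_rate": 1e-4,
    "per_device_train_batch_size": 1,
    "per_device_eval_batch_size": 4,
    "seed": 42,
    "gradient_accumulation_steps": 4,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 5.0,
}


def effective_batch_size(params, num_devices=1):
    """Examples contributing to each optimizer step."""
    return (
        params["per_device_train_batch_size"]
        * params["gradient_accumulation_steps"]
        * num_devices
    )


def build_training_args(output_dir="flan-t5-large-reaction-extraction"):
    """Build training arguments; output_dir is a hypothetical default."""
    # Imported lazily so the arithmetic above works without transformers.
    from transformers import Seq2SeqTrainingArguments

    return Seq2SeqTrainingArguments(
        output_dir=output_dir,
        predict_with_generate=True,  # assumption, not stated in the card
        **HYPERPARAMS,
    )


print(effective_batch_size(HYPERPARAMS))  # 1 * 4 * 1 = 4
```

The product of per-device batch size, accumulation steps, and device count reproduces the reported total_train_batch_size of 4, which is consistent with training on one device.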

Training results

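| Training Loss | Epoch | Step  | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum | Gen Len |
|---------------|-------|-------|-----------------|---------|---------|---------|-----------|---------|
| 1.0236        | 1.0   | 2885  | 0.9221          | 24.983  | 14.6971 | 24.5305 | 24.5497   | 42.0762 |
| 0.8177        | 2.0   | 5771  | 0.8503          | 30.0616 | 18.1568 | 29.5486 | 29.5537   | 35.5198 |
| 0.6784        | 3.0   | 8657  | 0.8324          | 32.537  | 20.2178 | 31.7925 | 31.75     | 27.7862 |
| 0.5961        | 4.0   | 11543 | 0.8330          | 32.9769 | 20.8184 | 32.2179 | 32.2036   | 33.6372 |
| 0.4985        | 5.0   | 14425 | 0.8736          | 34.4249 | 21.4943 | 33.4902 | 33.466    | 28.1622 |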
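
The training results show validation loss and ROUGE disagreeing about the best epoch: validation loss is lowest at epoch 3 and rises afterwards while training loss keeps falling, whereas every ROUGE score peaks at epoch 5. The short sketch below makes that comparison explicit using the rows above. The helper `best_epoch` and the column constants are illustrative names, not part of any library.

```python
# Rows copied from the training results:
# (epoch, step, train_loss, val_loss, rouge1, rouge2, rougeL, rougeLsum, gen_len)
RESULTS = [
    (1, 2885, 1.0236, 0.9221, 24.983, 14.6971, 24.5305, 24.5497, 42.0762),
    (2, 5771, 0.8177, 0.8503, 30.0616, 18.1568, 29.5486, 29.5537, 35.5198),
    (3, 8657, 0.6784, 0.8324, 32.537, 20.2178, 31.7925, 31.75, 27.7862),
    (4, 11543, 0.5961, 0.8330, 32.9769, 20.8184, 32.2179, 32.2036, 33.6372),
    (5, 14425, 0.4985, 0.8736, 34.4249, 21.4943, 33.4902, 33.466, 28.1622),
]

# Column indices into each row.
VAL_LOSS, ROUGE1 = 3, 4


def best_epoch(rows, column, higher_is_better):
    """Return the epoch number whose value in `column` is best."""
    pick = max if higher_is_better else min
    return pick(rows, key=lambda row: row[column])[0]


best_by_loss = best_epoch(RESULTS, VAL_LOSS, higher_is_better=False)
best_by_rouge1 = best_epoch(RESULTS, ROUGE1, higher_is_better=True)
print(f"lowest validation loss: epoch {best_by_loss}")
print(f"highest Rouge1: epoch {best_by_rouge1}")
```

The headline metrics at the top of this card match the epoch 5 row, so the reported numbers correspond to the best ROUGE rather than the lowest validation loss. Which criterion matters more depends on the downstream use, which this card does not specify.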

Framework versions

  • Transformers 4.28.1
  • Pytorch 2.0.0+cu117
  • Datasets 2.6.1
  • Tokenizers 0.13.3