
longt5_xl_summ_screen_25

This model is a fine-tuned version of longt5_xl_summ_screen_memsum_20/checkpoint-140 on the learn3r/summ_screen_memsum_oracle dataset. It achieves the following results on the evaluation set:

  • Loss: 4.0742
  • Rouge1: 39.5624
  • Rouge2: 10.2833
  • Rougel: 21.2004
  • Rougelsum: 34.0767
  • Gen Len: 152.0325
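For orientation, the Rouge1 score above measures unigram overlap between generated and reference summaries. The following is a minimal illustration of the unigram F1 behind that metric; the reported numbers come from the standard `rouge_score` package, which additionally applies stemming and bootstrapped aggregation that this sketch omits:

```python
from collections import Counter

def rouge1_f1(prediction: str, reference: str) -> float:
    """Unigram-overlap F1 (simplified ROUGE-1, no stemming)."""
    pred = prediction.lower().split()
    ref = reference.lower().split()
    # Clipped overlap: each reference token matches at most its own count.
    overlap = sum((Counter(pred) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)
```

For example, `rouge1_f1("the cat", "the cat sat")` gives precision 1.0 and recall 2/3, so F1 = 0.8.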

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.001
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 32
  • total_train_batch_size: 256
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: constant
  • num_epochs: 5.0
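The `total_train_batch_size` of 256 is not an independent setting: it follows from the per-device batch size and gradient accumulation (assuming a single device, since the card does not state a device count):

```python
def effective_batch_size(per_device_batch_size: int,
                         gradient_accumulation_steps: int,
                         num_devices: int = 1) -> int:
    """Batch size per optimizer step: gradients are accumulated over
    `gradient_accumulation_steps` forward/backward passes on each device
    before a single parameter update is applied."""
    return per_device_batch_size * gradient_accumulation_steps * num_devices

# The hyperparameters above: 8 * 32 * 1 = 256
print(effective_batch_size(8, 32))
```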

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum | Gen Len  |
|:-------------:|:-----:|:----:|:---------------:|:-------:|:-------:|:-------:|:---------:|:--------:|
| 0.1846        | 0.97  | 14   | 4.6151          | 37.719  | 9.6532  | 20.2955 | 32.3806   | 100.9527 |
| 0.1441        | 1.95  | 28   | 4.1640          | 36.8632 | 9.6545  | 20.9349 | 31.8954   | 105.5799 |
| 0.1379        | 2.99  | 43   | 4.0742          | 39.5624 | 10.2833 | 21.2004 | 34.0767   | 152.0325 |
| 0.089         | 3.97  | 57   | 4.5216          | 40.2528 | 10.9254 | 21.6978 | 34.6793   | 176.0976 |
| 0.1028        | 4.87  | 70   | 4.1434          | 32.7739 | 9.1305  | 19.6246 | 27.9012   | 59.6775  |

Framework versions

  • Transformers 4.34.0.dev0
  • Pytorch 2.0.1+cu117
  • Datasets 2.14.5
  • Tokenizers 0.13.3