longt5_xl_summ_screen_memsum_bp_30

This model is a fine-tuned version of longt5_xl_summ_screen_memsum_bp_20/checkpoint-140 on the learn3r/summ_screen_fd_memsum_bp dataset. It achieves the following results on the evaluation set:

  • Loss: 2.6817
  • Rouge1: 47.1842
  • Rouge2: 18.22
  • Rougel: 28.4626
  • Rougelsum: 45.5778
  • Gen Len: 308.9083

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.001
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 32
  • total_train_batch_size: 256
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: constant
  • num_epochs: 10.0
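
The relationship between the batch-size entries above can be checked with a short calculation. This is a minimal sketch: the device count is not stated in the card, so `num_devices = 1` is an assumption (it is the value that makes the reported total consistent), and the helper `effective_batch_size` is a hypothetical name introduced only for illustration.

```python
# Values copied from the hyperparameter list above.
train_batch_size = 8
gradient_accumulation_steps = 32
reported_total_train_batch_size = 256

# Assumption: a single training device; the card does not state the device count.
num_devices = 1


def effective_batch_size(per_device, accumulation_steps, devices=1):
    """Number of examples that contribute to each optimizer step."""
    return per_device * accumulation_steps * devices


total = effective_batch_size(
    train_batch_size, gradient_accumulation_steps, num_devices
)
print(total, total == reported_total_train_batch_size)
```

Under that assumption, each optimizer step accumulates gradients over 32 mini-batches of 8 examples, which matches the reported total_train_batch_size.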

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len |
|---|---|---|---|---|---|---|---|---|
| 0.0707 | 0.97 | 14 | 2.7097 | 41.4751 | 15.5831 | 25.1976 | 39.9229 | 453.5296 |
| 0.0608 | 1.95 | 28 | 2.7271 | 45.691 | 17.905 | 27.9519 | 43.8787 | 387.4172 |
| 0.0851 | 2.99 | 43 | 3.0001 | 47.1647 | 17.8993 | 28.7561 | 45.661 | 261.5680 |
| 0.0697 | 3.97 | 57 | 2.9297 | 46.6892 | 17.8922 | 28.0724 | 44.8821 | 365.3047 |
| 0.0296 | 4.94 | 71 | 2.9017 | 44.2702 | 17.7874 | 26.7598 | 42.6857 | 440.6391 |
| 0.0312 | 5.98 | 86 | 3.0489 | 47.7884 | 18.1788 | 28.6688 | 46.0744 | 306.6716 |
| 0.0383 | 6.96 | 100 | 2.6817 | 47.1842 | 18.22 | 28.4626 | 45.5778 | 308.9083 |
| 0.0367 | 8.0 | 115 | 3.0245 | 45.5573 | 17.2161 | 28.0573 | 43.7772 | 227.8550 |
| 0.04 | 8.97 | 129 | 3.2873 | 44.0164 | 17.1682 | 26.4769 | 42.3752 | 429.8757 |
| 0.028 | 9.74 | 140 | 2.9815 | 46.6542 | 17.8515 | 28.146 | 45.0274 | 337.4822 |
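
The evaluation metrics reported at the top of this card match the step-100 row of the table above, which is also the row with the lowest validation loss. The sketch below shows how such a row can be picked out of the logged results; whether the checkpoint was actually selected by this criterion is an assumption, and the tuple layout is chosen here only for illustration.

```python
# (epoch, step, validation_loss, rouge1) copied from the training results table.
rows = [
    (0.97, 14, 2.7097, 41.4751),
    (1.95, 28, 2.7271, 45.691),
    (2.99, 43, 3.0001, 47.1647),
    (3.97, 57, 2.9297, 46.6892),
    (4.94, 71, 2.9017, 44.2702),
    (5.98, 86, 3.0489, 47.7884),
    (6.96, 100, 2.6817, 47.1842),
    (8.0, 115, 3.0245, 45.5573),
    (8.97, 129, 3.2873, 44.0164),
    (9.74, 140, 2.9815, 46.6542),
]

# Row with the lowest validation loss (index 2 of each tuple).
best_by_loss = min(rows, key=lambda r: r[2])

# Row with the highest Rouge1 (index 3), shown for comparison.
best_by_rouge1 = max(rows, key=lambda r: r[3])

print("lowest validation loss:", best_by_loss)
print("highest Rouge1:", best_by_rouge1)
```

The two criteria pick different rows (step 100 by validation loss, step 86 by Rouge1), so the choice of selection metric matters when comparing checkpoints from this run.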

Framework versions

  • Transformers 4.34.0.dev0
  • Pytorch 2.0.1+cu117
  • Datasets 2.14.5
  • Tokenizers 0.13.3