
longt5_xl_summ_screen_bp_only_30

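This model is a fine-tuned version of the local checkpoint /exports/eddie/scratch/s1970716/models/summarization/longt5_xl_summ_screen_bp_only/checkpoint-210, trained on the learn3r/summ_screen_fd_bp dataset. It achieves the following results on the evaluation set; they match the first evaluation logged under Training results (epoch 0.97, step 14), which has the lowest validation loss: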

  • Loss: 2.2376
  • Rouge1: 40.4388
  • Rouge2: 16.4662
  • Rougel: 28.0771
  • Rougelsum: 38.3405
  • Gen Len: 246.7396

Model description

More information needed

Intended uses & limitations

More information needed
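
A minimal sketch of loading the model for summarization. It assumes the checkpoint is available under the repository id learn3r/longt5_xl_summ_screen_bp_only_30 and loads through the Transformers Auto classes. The helper name `summarize` and both token limits are illustrative assumptions, not settings taken from training.

```python
MODEL_ID = "learn3r/longt5_xl_summ_screen_bp_only_30"


def summarize(transcript: str,
              max_input_tokens: int = 16384,   # assumed input cap, not from the card
              max_new_tokens: int = 512) -> str:  # assumed output cap, not from the card
    """Generate a summary for one transcript (hypothetical helper)."""
    # Imported here so that defining the helper does not load the libraries.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)

    # Truncate inputs that exceed the assumed input cap.
    inputs = tokenizer(
        transcript,
        return_tensors="pt",
        truncation=True,
        max_length=max_input_tokens,
    )
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


# Usage (downloads the XL checkpoint on first call):
# print(summarize(open("episode_transcript.txt").read()))
```

The model is an XL-sized LongT5, so loading it needs substantial memory; half precision or device placement may be required depending on the hardware.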

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0005
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 32
  • total_train_batch_size: 256
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: constant
  • num_epochs: 15.0
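
The total train batch size follows from the per-device batch size and the gradient accumulation steps. The sketch below restates the listed values as keyword arguments under the names used by `Seq2SeqTrainingArguments` (this mapping is an assumption about how the run was configured) and checks the arithmetic, assuming a single device.

```python
# Values copied from the hyperparameter list above. The keys are the
# corresponding Seq2SeqTrainingArguments argument names (assumed mapping).
training_kwargs = {
    "learning_rate": 5e-4,
    "per_device_train_batch_size": 8,
    "per_device_eval_batch_size": 8,
    "seed": 42,
    "gradient_accumulation_steps": 32,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "constant",
    "num_train_epochs": 15.0,
}

# Assumption: one device, which is what makes 8 * 32 equal the reported 256.
num_devices = 1

total_train_batch_size = (
    training_kwargs["per_device_train_batch_size"]
    * training_kwargs["gradient_accumulation_steps"]
    * num_devices
)
print(total_train_batch_size)  # 256
```

With these settings each optimizer step sees 256 examples, and the learning rate stays at 0.0005 throughout because the scheduler is constant.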

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len |
|---|---|---|---|---|---|---|---|---|
| 0.324 | 0.97 | 14 | 2.2376 | 40.4388 | 16.4662 | 28.0771 | 38.3405 | 246.7396 |
| 0.2707 | 1.95 | 28 | 2.3204 | 40.2873 | 16.7641 | 27.3895 | 38.2689 | 307.3787 |
| 0.2217 | 2.99 | 43 | 2.5281 | 31.9916 | 13.8136 | 22.1895 | 30.623 | 501.9320 |
| 0.1776 | 3.97 | 57 | 2.7530 | 31.7535 | 13.8852 | 22.8653 | 30.3796 | 489.6183 |
| 0.1424 | 4.94 | 71 | 2.6578 | 32.117 | 14.2141 | 22.3733 | 30.8328 | 502.1124 |
| 0.1449 | 5.98 | 86 | 2.5508 | 35.3448 | 13.8478 | 24.9044 | 33.6108 | 357.3136 |
| 0.1191 | 6.96 | 100 | 3.1622 | 37.2189 | 16.0076 | 25.7011 | 35.294 | 408.8669 |
| 0.0879 | 8.0 | 115 | 2.8510 | 39.8825 | 16.8073 | 27.2428 | 37.9568 | 318.2278 |
| 0.0899 | 8.97 | 129 | 2.9138 | 31.7139 | 13.7066 | 21.8844 | 30.5075 | 500.4053 |
| 0.0656 | 9.95 | 143 | 3.1616 | 33.055 | 14.5841 | 22.5883 | 31.7565 | 488.1686 |
| 0.0542 | 10.99 | 158 | 3.3630 | 43.7514 | 18.9011 | 29.9017 | 41.6887 | 198.8077 |
| 0.0557 | 11.97 | 172 | 3.3826 | 42.3089 | 18.2735 | 29.0356 | 40.4154 | 270.9675 |
| 0.0542 | 12.94 | 186 | 3.4408 | 40.7691 | 16.529 | 28.3999 | 38.9723 | 186.7308 |
| 0.0596 | 13.98 | 201 | 3.5253 | 37.0037 | 15.9098 | 25.2808 | 35.3868 | 398.4704 |
| 0.0385 | 14.61 | 210 | 3.4990 | 32.5815 | 14.2951 | 22.4501 | 31.2928 | 499.3107 |
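
Over the run, training loss falls from 0.324 to 0.0385 while validation loss rises from 2.2376 to 3.4990, and the evaluation with the lowest validation loss is not the one with the highest ROUGE. The sketch below copies the relevant columns from the results above and selects a checkpoint under each criterion; the variable names are illustrative.

```python
# (step, epoch, training_loss, validation_loss, rouge1), copied from the results above.
rows = [
    (14, 0.97, 0.324, 2.2376, 40.4388),
    (28, 1.95, 0.2707, 2.3204, 40.2873),
    (43, 2.99, 0.2217, 2.5281, 31.9916),
    (57, 3.97, 0.1776, 2.7530, 31.7535),
    (71, 4.94, 0.1424, 2.6578, 32.117),
    (86, 5.98, 0.1449, 2.5508, 35.3448),
    (100, 6.96, 0.1191, 3.1622, 37.2189),
    (115, 8.0, 0.0879, 2.8510, 39.8825),
    (129, 8.97, 0.0899, 2.9138, 31.7139),
    (143, 9.95, 0.0656, 3.1616, 33.055),
    (158, 10.99, 0.0542, 3.3630, 43.7514),
    (172, 11.97, 0.0557, 3.3826, 42.3089),
    (186, 12.94, 0.0542, 3.4408, 40.7691),
    (201, 13.98, 0.0596, 3.5253, 37.0037),
    (210, 14.61, 0.0385, 3.4990, 32.5815),
]

# Criterion 1: lowest validation loss.
best_by_loss = min(rows, key=lambda r: r[3])
# Criterion 2: highest ROUGE-1.
best_by_rouge1 = max(rows, key=lambda r: r[4])

print(best_by_loss[0], best_by_loss[3])      # 14 2.2376
print(best_by_rouge1[0], best_by_rouge1[4])  # 158 43.7514
```

The headline metrics reported at the top of this card correspond to the first criterion (step 14). Step 158 scores higher on every ROUGE variant but has a much higher validation loss, so which checkpoint is preferable depends on the metric that matters downstream.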

Framework versions

  • Transformers 4.34.0.dev0
  • Pytorch 2.0.1+cu117
  • Datasets 2.14.5
  • Tokenizers 0.13.3
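
A small sketch for comparing a local environment against the versions listed above. It uses only the standard library; the helper name `compare_versions` is hypothetical, and `torch` is the distribution name under which Pytorch is installed.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions copied from the list above, keyed by installed distribution name.
EXPECTED = {
    "transformers": "4.34.0.dev0",
    "torch": "2.0.1+cu117",
    "datasets": "2.14.5",
    "tokenizers": "0.13.3",
}


def compare_versions(expected=EXPECTED):
    """Return {package: (expected_version, installed_version_or_None)}."""
    report = {}
    for package, wanted in expected.items():
        try:
            installed = version(package)
        except PackageNotFoundError:
            installed = None  # package is not installed in this environment
        report[package] = (wanted, installed)
    return report


for package, (wanted, installed) in compare_versions().items():
    status = "ok" if installed == wanted else "differs"
    print(f"{package}: expected {wanted}, installed {installed} ({status})")
```

A mismatch does not necessarily mean the model will fail to load; the check only shows where the environment differs from the one used for training.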
