bart-cnn-pubmed-arxiv-pubmed-arxiv-arxiv-earlystopping

This model is a fine-tuned version of theojolliffe/bart-cnn-pubmed-arxiv-pubmed-arxiv-arxiv. The fine-tuning dataset is not identified in this card. At the final logged evaluation (step 1625, epoch 4.08) it achieves the following results on the evaluation set:

  • Loss: 0.8347
  • Rouge1: 53.9049
  • Rouge2: 35.5953
  • Rougel: 39.788
  • Rougelsum: 51.4101
  • Gen Len: 142.0
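
The metrics above (ROUGE scores and a generation length of 142 tokens) suggest a sequence-to-sequence summarization setup. The following is a minimal inference sketch under stated assumptions: the Hub namespace `theojolliffe` is inferred from the base model name and is not confirmed by this card, and the beam size, input truncation length, and `max_length=142` are illustrative choices rather than documented settings.

```python
# Assumed Hub identifier: the namespace is inferred from the base model's name.
MODEL_ID = "theojolliffe/bart-cnn-pubmed-arxiv-pubmed-arxiv-arxiv-earlystopping"


def summarize(text, model_id=MODEL_ID, max_length=142, num_beams=4):
    """Generate one summary for `text` with a seq2seq checkpoint.

    max_length=142 mirrors the reported Gen Len; num_beams=4 is an assumption.
    """
    # Imported lazily so the module can be loaded without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

    # Truncate long inputs; 1024 tokens is the usual BART input limit.
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=1024)
    output_ids = model.generate(**inputs, num_beams=num_beams, max_length=max_length)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `summarize("...long document text...")` downloads the checkpoint on first use, so it needs network access and the `transformers` and `torch` packages.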

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 2
  • eval_batch_size: 2
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 200
  • mixed_precision_training: Native AMP
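
As a rough illustration, the hyperparameters listed above could be expressed as keyword arguments for `Seq2SeqTrainingArguments` from the Transformers library. This is a sketch, not the author's training script: it assumes a single device (so the per-device batch size equals the listed batch size), and the `output_dir` value is a hypothetical placeholder.

```python
# Hyperparameters from the list above, mapped to Seq2SeqTrainingArguments names.
HYPERPARAMS = {
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 2,  # assumes a single device
    "per_device_eval_batch_size": 2,   # assumes a single device
    "seed": 42,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    # Configured upper bound; the training log in this card ends near epoch 4.08.
    "num_train_epochs": 200,
    "fp16": True,  # "Native AMP" mixed precision; requires a CUDA device
}


def build_training_args(output_dir="bart-finetune-output"):
    """Build training arguments; `output_dir` is a hypothetical path."""
    # Imported lazily so the dictionary can be inspected without transformers.
    from transformers import Seq2SeqTrainingArguments

    return Seq2SeqTrainingArguments(output_dir=output_dir, **HYPERPARAMS)
```

The model name suggests early stopping was used, which would explain why training ended long before 200 epochs, but the card does not document the stopping criterion, so none is configured here.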

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum | Gen Len  |
|---------------|-------|------|-----------------|---------|---------|---------|-----------|----------|
| No log        | 0.31  | 125  | 1.0240          | 52.5632 | 32.977  | 34.672  | 49.9905   | 142.0    |
| No log        | 0.63  | 250  | 1.0056          | 52.5508 | 32.4826 | 34.6851 | 49.835    | 141.6852 |
| No log        | 0.94  | 375  | 0.8609          | 53.0475 | 32.9384 | 35.3322 | 50.272    | 141.6481 |
| 0.8255        | 1.26  | 500  | 0.9022          | 52.2493 | 31.5622 | 33.389  | 49.6612   | 142.0    |
| 0.8255        | 1.57  | 625  | 0.8706          | 53.3568 | 33.2533 | 35.7531 | 50.4568   | 141.8889 |
| 0.8255        | 1.88  | 750  | 0.8186          | 52.7375 | 33.4439 | 37.1094 | 50.5323   | 142.0    |
| 0.8255        | 2.2   | 875  | 0.8041          | 53.4992 | 34.6929 | 37.9614 | 51.091    | 142.0    |
| 0.5295        | 2.51  | 1000 | 0.7907          | 52.6185 | 33.8053 | 37.1725 | 50.4881   | 142.0    |
| 0.5295        | 2.83  | 1125 | 0.7740          | 52.7107 | 33.1023 | 36.0865 | 50.0365   | 142.0    |
| 0.5295        | 3.14  | 1250 | 0.8200          | 52.5607 | 33.7948 | 37.2312 | 50.3345   | 142.0    |
| 0.5295        | 3.45  | 1375 | 0.8188          | 53.9233 | 34.446  | 36.7566 | 51.3135   | 142.0    |
| 0.351         | 3.77  | 1500 | 0.8071          | 53.9096 | 35.5977 | 38.6832 | 51.4986   | 142.0    |
| 0.351         | 4.08  | 1625 | 0.8347          | 53.9049 | 35.5953 | 39.788  | 51.4101   | 142.0    |
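
To make the training log easier to compare across checkpoints, the sketch below transcribes the rows above and looks up the best evaluation step for a given metric. It also estimates the number of optimizer steps per epoch from the Step and Epoch columns. The helper name `best_row` and the column labels are illustrative, not part of any library.

```python
from statistics import median

COLUMNS = ("training_loss", "epoch", "step", "validation_loss",
           "rouge1", "rouge2", "rougeL", "rougeLsum", "gen_len")

# Rows transcribed from the training results table; None stands for "No log".
ROWS = [
    (None,   0.31, 125,  1.0240, 52.5632, 32.977,  34.672,  49.9905, 142.0),
    (None,   0.63, 250,  1.0056, 52.5508, 32.4826, 34.6851, 49.835,  141.6852),
    (None,   0.94, 375,  0.8609, 53.0475, 32.9384, 35.3322, 50.272,  141.6481),
    (0.8255, 1.26, 500,  0.9022, 52.2493, 31.5622, 33.389,  49.6612, 142.0),
    (0.8255, 1.57, 625,  0.8706, 53.3568, 33.2533, 35.7531, 50.4568, 141.8889),
    (0.8255, 1.88, 750,  0.8186, 52.7375, 33.4439, 37.1094, 50.5323, 142.0),
    (0.8255, 2.2,  875,  0.8041, 53.4992, 34.6929, 37.9614, 51.091,  142.0),
    (0.5295, 2.51, 1000, 0.7907, 52.6185, 33.8053, 37.1725, 50.4881, 142.0),
    (0.5295, 2.83, 1125, 0.7740, 52.7107, 33.1023, 36.0865, 50.0365, 142.0),
    (0.5295, 3.14, 1250, 0.8200, 52.5607, 33.7948, 37.2312, 50.3345, 142.0),
    (0.5295, 3.45, 1375, 0.8188, 53.9233, 34.446,  36.7566, 51.3135, 142.0),
    (0.351,  3.77, 1500, 0.8071, 53.9096, 35.5977, 38.6832, 51.4986, 142.0),
    (0.351,  4.08, 1625, 0.8347, 53.9049, 35.5953, 39.788,  51.4101, 142.0),
]


def best_row(column, lowest=False):
    """Return the row with the highest (or lowest) value in `column`."""
    idx = COLUMNS.index(column)
    pick = min if lowest else max
    return pick(ROWS, key=lambda row: row[idx])


lowest_loss = best_row("validation_loss", lowest=True)
best_rouge2 = best_row("rouge2")

# Step / epoch approximates optimizer steps per epoch; the median damps rounding noise.
steps_per_epoch = median(row[2] / row[1] for row in ROWS)

print("lowest validation loss at step", lowest_loss[2])
print("best Rouge2 at step", best_rouge2[2])
print("approx. steps per epoch:", round(steps_per_epoch))
```

On these rows, the lowest validation loss (0.7740) occurs at step 1125, whereas the headline results come from the final step, 1625. The steps-per-epoch estimate of about 398, combined with a batch size of 2, would imply roughly 800 training examples, assuming a single device and no gradient accumulation.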

Framework versions

  • Transformers 4.18.0
  • Pytorch 1.11.0+cu113
  • Datasets 2.2.1
  • Tokenizers 0.12.1