bart-cnn-pubmed-arxiv-pubmed-arxiv-arxiv-v3-e16

This model is a fine-tuned version of theojolliffe/bart-cnn-pubmed-arxiv-pubmed-arxiv-arxiv. The dataset used for fine-tuning is not recorded in this card. After the final (16th) training epoch, it achieves the following results on the evaluation set:

  • Loss: 0.8960
  • Rouge1: 57.7198
  • Rouge2: 44.5711
  • Rougel: 47.6281
  • Rougelsum: 56.2372
  • Gen Len: 142.0
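
The card does not document usage, so the following is only a minimal sketch of how a checkpoint like this one could be loaded for summarization with the Transformers `pipeline` API. The repository namespace in `MODEL_ID` is an assumption (taken from the base model's namespace), `summarize` is a hypothetical helper, and the default `max_length=142` is chosen only to match the reported Gen Len, not a documented setting.

```python
# Assumption: the checkpoint is published under the same namespace as its
# base model. Adjust MODEL_ID if the repository lives elsewhere.
MODEL_ID = "theojolliffe/bart-cnn-pubmed-arxiv-pubmed-arxiv-arxiv-v3-e16"


def summarize(text, max_length=142):
    """Return a summary of `text` (hypothetical helper, not part of the card)."""
    # Imported lazily so the module can be inspected without transformers installed.
    from transformers import pipeline

    summarizer = pipeline("summarization", model=MODEL_ID)
    # truncation=True clips inputs that exceed the model's maximum input length.
    result = summarizer(text, max_length=max_length, truncation=True)
    return result[0]["summary_text"]
```

Calling `summarize(document_text)` downloads the weights on first use, so it needs network access and enough memory for a BART-sized model.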

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 2
  • eval_batch_size: 2
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 16
  • mixed_precision_training: Native AMP
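
As a rough illustration, the hyperparameters listed above could be expressed as keyword arguments for `Seq2SeqTrainingArguments` from Transformers. This is a sketch, not the original training script: `output_dir` is a hypothetical path, the per-device batch sizes assume training on a single device, and `predict_with_generate` is an assumption made because ROUGE scores are reported at evaluation time.

```python
# Keyword arguments mirroring the hyperparameter list in this card.
TRAINING_KWARGS = {
    "output_dir": "bart-cnn-pubmed-arxiv-pubmed-arxiv-arxiv-v3-e16",  # hypothetical path
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 2,  # assumes a single device
    "per_device_eval_batch_size": 2,   # assumes a single device
    "seed": 42,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 16,
    "fp16": True,  # "Native AMP" mixed precision; requires a CUDA device
    "predict_with_generate": True,  # assumption: generate summaries during evaluation
}


def build_training_args():
    """Build the arguments object (hypothetical helper)."""
    # Imported lazily so the dictionary above can be reused without transformers.
    from transformers import Seq2SeqTrainingArguments

    return Seq2SeqTrainingArguments(**TRAINING_KWARGS)
```

Keeping the values in a plain dictionary makes it easy to compare them against the list above before constructing the arguments object.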

Training results

Training Loss  Epoch  Step  Validation Loss  Rouge1   Rouge2   Rougel   Rougelsum  Gen Len
No log         1.0    398   0.8634           53.7416  34.3731  37.1193  51.3075    142.0
0.8276         2.0    796   0.8001           53.9975  35.1019  38.2722  51.7878    142.0
0.5311         3.0    1194  0.7988           53.409   34.3201  37.5443  50.738     142.0
0.3538         4.0    1592  0.7698           53.679   34.7209  37.7895  51.2497    142.0
0.3538         5.0    1990  0.7863           54.2493  36.0643  39.1249  51.9758    142.0
0.2367         6.0    2388  0.7810           54.4042  37.4276  41.529   52.1544    142.0
0.164          7.0    2786  0.8055           56.0408  39.6744  42.8323  54.163     142.0
0.1146         8.0    3184  0.8098           55.2046  38.5399  41.9178  53.0001    142.0
0.089          9.0    3582  0.8199           57.1523  41.7614  44.5914  55.1602    142.0
0.089          10.0   3980  0.8644           56.943   41.5063  44.4929  54.9515    142.0
0.0647         11.0   4378  0.8413           57.0321  41.964   45.3971  55.0957    142.0
0.0485         12.0   4776  0.8735           56.7275  41.8577  44.3911  54.9824    142.0
0.0365         13.0   5174  0.8858           57.6103  43.8831  47.0374  56.0675    142.0
0.0271         14.0   5572  0.8974           57.39    42.8693  45.9344  55.7404    142.0
0.0271         15.0   5970  0.8990           57.9433  44.7301  47.843   56.5407    142.0
0.0232         16.0   6368  0.8960           57.7198  44.5711  47.6281  56.2372    142.0
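
To make the per-epoch results easier to compare, the short sketch below copies the validation loss and ROUGE columns from the training results and reports which epoch is best on each metric. The helper `best_epoch` is hypothetical, and the training-set size estimate assumes a single device with no gradient accumulation, which the card does not state.

```python
# (epoch, validation_loss, rouge1, rouge2, rougeL, rougeLsum), copied from
# the training results reported in this card.
RESULTS = [
    (1, 0.8634, 53.7416, 34.3731, 37.1193, 51.3075),
    (2, 0.8001, 53.9975, 35.1019, 38.2722, 51.7878),
    (3, 0.7988, 53.409, 34.3201, 37.5443, 50.738),
    (4, 0.7698, 53.679, 34.7209, 37.7895, 51.2497),
    (5, 0.7863, 54.2493, 36.0643, 39.1249, 51.9758),
    (6, 0.7810, 54.4042, 37.4276, 41.529, 52.1544),
    (7, 0.8055, 56.0408, 39.6744, 42.8323, 54.163),
    (8, 0.8098, 55.2046, 38.5399, 41.9178, 53.0001),
    (9, 0.8199, 57.1523, 41.7614, 44.5914, 55.1602),
    (10, 0.8644, 56.943, 41.5063, 44.4929, 54.9515),
    (11, 0.8413, 57.0321, 41.964, 45.3971, 55.0957),
    (12, 0.8735, 56.7275, 41.8577, 44.3911, 54.9824),
    (13, 0.8858, 57.6103, 43.8831, 47.0374, 56.0675),
    (14, 0.8974, 57.39, 42.8693, 45.9344, 55.7404),
    (15, 0.8990, 57.9433, 44.7301, 47.843, 56.5407),
    (16, 0.8960, 57.7198, 44.5711, 47.6281, 56.2372),
]

STEPS_PER_EPOCH = 398  # step count reported at epoch 1.0
TRAIN_BATCH_SIZE = 2   # train_batch_size from the hyperparameter list


def best_epoch(rows, column, lowest=False):
    """Return the epoch whose value in `column` is highest (or lowest)."""
    pick = min if lowest else max
    return pick(rows, key=lambda row: row[column])[0]


lowest_loss_epoch = best_epoch(RESULTS, 1, lowest=True)
best_rouge1_epoch = best_epoch(RESULTS, 2)
best_rougel_epoch = best_epoch(RESULTS, 4)

# Assumption: one device and no gradient accumulation, so steps * batch size
# gives an upper bound on the number of training examples.
approx_train_examples = STEPS_PER_EPOCH * TRAIN_BATCH_SIZE

print("lowest validation loss at epoch", lowest_loss_epoch)
print("highest Rouge1 at epoch", best_rouge1_epoch)
print("highest Rougel at epoch", best_rougel_epoch)
print("approximate training examples:", approx_train_examples)
```

On these numbers the validation loss is lowest at epoch 4, while every ROUGE score peaks at epoch 15, one epoch before the final checkpoint reported at the top of the card.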

Framework versions

  • Transformers 4.19.2
  • Pytorch 1.11.0+cu113
  • Datasets 2.2.2
  • Tokenizers 0.12.1