metadata
license: mit
tags:
  - generated_from_trainer
metrics:
  - rouge
model-index:
  - name: bart-cnn-pubmed-arxiv-pubmed-arxiv-earlystopping
    results: []

bart-cnn-pubmed-arxiv-pubmed-arxiv-earlystopping

This model is a fine-tuned version of theojolliffe/bart-cnn-pubmed-arxiv-pubmed-arxiv; the fine-tuning dataset is not specified. At the final evaluation step (step 4750, epoch 11.93), it achieves the following results on the evaluation set:

  • Loss: 0.8794
  • Rouge1: 55.9136
  • Rouge2: 40.6124
  • Rougel: 43.8806
  • Rougelsum: 54.2039
  • Gen Len: 142.0
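
The ROUGE values above appear to be on a 0 to 100 scale. For readers unfamiliar with the metric, the following is a minimal sketch of a ROUGE-1 F-measure based on clipped unigram overlap. It assumes lowercasing and plain whitespace tokenization, and it is not the scorer that produced the numbers in this card (the scorer and its settings are not documented here). The function name `rouge1_f` is hypothetical.

```python
from collections import Counter


def rouge1_f(reference: str, candidate: str) -> float:
    """Unigram-overlap F1 between a reference and a candidate summary (0 to 100)."""
    # Assumption: lowercase + whitespace tokenization, no stemming.
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())
    # Clipped overlap: a unigram counts at most as often as it occurs in both texts.
    overlap = sum((ref_counts & cand_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 100.0 * 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    ref = "the model summarises long scientific articles"
    cand = "the model summarises scientific papers"
    print(round(rouge1_f(ref, cand), 2))
```

Rouge2 applies the same idea to bigrams, while Rougel and Rougelsum are based on the longest common subsequence rather than fixed-length n-grams.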

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 2
  • eval_batch_size: 2
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 1000 (configured maximum; the logged results end at epoch 11.93)
  • mixed_precision_training: Native AMP
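
To make the `lr_scheduler_type: linear` setting concrete, the sketch below computes a linearly decaying learning rate from the listed peak of 2e-05. It assumes no warmup (the card lists no warmup steps) and a hypothetical total of 398,000 optimizer steps, estimated from the results table as roughly 398 steps per epoch times the configured 1000 epochs. Neither figure is stated by the author, and `linear_lr` is an illustrative helper, not a library function.

```python
def linear_lr(step: int, total_steps: int, peak_lr: float = 2e-5,
              warmup_steps: int = 0) -> float:
    """Linear warmup to peak_lr, then linear decay to zero at total_steps."""
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    # Assumption: ~398 steps per epoch x 1000 configured epochs.
    TOTAL_STEPS = 398_000
    for step in (0, 4750, 199_000, 398_000):
        print(step, linear_lr(step, TOTAL_STEPS))
```

Under these assumptions, the learning rate at step 4750 would still be about 98.8% of the peak, so the schedule would have barely decayed by the point where the logged results end.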

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len |
|---|---|---|---|---|---|---|---|---|
| No log | 0.31 | 125 | 1.2057 | 50.9436 | 30.6436 | 32.6348 | 48.0674 | 141.3519 |
| No log | 0.63 | 250 | 1.0933 | 52.0677 | 31.2561 | 32.8008 | 49.0282 | 141.9815 |
| No log | 0.94 | 375 | 0.9685 | 51.6623 | 32.148 | 34.0536 | 48.9779 | 141.5556 |
| 1.1594 | 1.26 | 500 | 0.9725 | 50.4646 | 30.6781 | 32.1995 | 47.3852 | 142.0 |
| 1.1594 | 1.57 | 625 | 0.9342 | 52.2146 | 32.2166 | 33.7256 | 49.2233 | 142.0 |
| 1.1594 | 1.88 | 750 | 0.8715 | 52.2443 | 33.66 | 36.0575 | 49.7769 | 141.6481 |
| 1.1594 | 2.2 | 875 | 0.8334 | 53.0976 | 33.9638 | 36.0616 | 50.7417 | 141.8889 |
| 0.6845 | 2.51 | 1000 | 0.8241 | 52.3152 | 32.8571 | 35.302 | 49.6273 | 142.0 |
| 0.6845 | 2.83 | 1125 | 0.7986 | 54.075 | 35.0318 | 37.4544 | 51.4955 | 142.0 |
| 0.6845 | 3.14 | 1250 | 0.8532 | 52.1242 | 32.5844 | 34.6821 | 49.6048 | 141.7037 |
| 0.6845 | 3.45 | 1375 | 0.8319 | 52.0714 | 32.8862 | 35.3255 | 49.3984 | 141.7593 |
| 0.4488 | 3.77 | 1500 | 0.8033 | 53.2189 | 34.7029 | 37.5627 | 50.8068 | 142.0 |
| 0.4488 | 4.08 | 1625 | 0.8322 | 53.1666 | 34.8916 | 37.733 | 50.9602 | 142.0 |
| 0.4488 | 4.4 | 1750 | 0.7985 | 51.8809 | 32.9926 | 36.3812 | 49.6592 | 142.0 |
| 0.4488 | 4.71 | 1875 | 0.8049 | 54.2959 | 36.648 | 39.2174 | 52.2153 | 141.8148 |
| 0.3017 | 5.03 | 2000 | 0.8148 | 53.1564 | 35.2561 | 38.4413 | 50.9793 | 141.7778 |
| 0.3017 | 5.34 | 2125 | 0.8153 | 53.5528 | 35.217 | 37.9034 | 51.3596 | 141.0741 |
| 0.3017 | 5.65 | 2250 | 0.8009 | 52.4906 | 34.9253 | 37.9829 | 50.3951 | 141.6111 |
| 0.3017 | 5.97 | 2375 | 0.7509 | 54.3645 | 37.5095 | 40.5725 | 52.1743 | 142.0 |
| 0.2052 | 6.28 | 2500 | 0.8019 | 54.5817 | 36.5587 | 40.0273 | 52.5349 | 142.0 |
| 0.2052 | 6.6 | 2625 | 0.8176 | 55.3618 | 38.556 | 41.5709 | 53.5806 | 142.0 |
| 0.2052 | 6.91 | 2750 | 0.7956 | 55.5057 | 38.0122 | 40.8857 | 53.1755 | 141.9815 |
| 0.2052 | 7.22 | 2875 | 0.7966 | 54.4586 | 37.4821 | 40.7638 | 52.4144 | 142.0 |
| 0.1465 | 7.54 | 3000 | 0.8311 | 54.3973 | 37.1016 | 40.2977 | 52.3982 | 142.0 |
| 0.1465 | 7.85 | 3125 | 0.8227 | 53.9072 | 36.5277 | 39.0963 | 51.9937 | 141.8889 |
| 0.1465 | 8.17 | 3250 | 0.7947 | 54.7043 | 38.5848 | 41.2942 | 52.8724 | 142.0 |
| 0.1465 | 8.48 | 3375 | 0.7954 | 54.5769 | 37.8265 | 40.6915 | 52.6429 | 141.9444 |
| 0.115 | 8.79 | 3500 | 0.8433 | 54.7883 | 38.0489 | 41.414 | 52.3718 | 142.0 |
| 0.115 | 9.11 | 3625 | 0.8416 | 56.5204 | 41.3216 | 44.451 | 54.7371 | 142.0 |
| 0.115 | 9.42 | 3750 | 0.8164 | 55.2908 | 39.0328 | 41.5761 | 53.4643 | 142.0 |
| 0.115 | 9.74 | 3875 | 0.8363 | 55.2659 | 39.4302 | 42.1691 | 53.7407 | 141.8889 |
| 0.0912 | 10.05 | 4000 | 0.8850 | 55.7855 | 40.6168 | 43.1968 | 54.2718 | 142.0 |
| 0.0912 | 10.36 | 4125 | 0.8268 | 56.1701 | 40.7518 | 42.987 | 54.1229 | 141.9259 |
| 0.0912 | 10.68 | 4250 | 0.8564 | 55.4179 | 39.6097 | 42.3691 | 53.4582 | 141.8889 |
| 0.0912 | 10.99 | 4375 | 0.8557 | 56.1136 | 41.4924 | 45.8591 | 54.6113 | 141.6667 |
| 0.0707 | 11.31 | 4500 | 0.8432 | 55.0109 | 39.3858 | 42.0807 | 53.4629 | 142.0 |
| 0.0707 | 11.62 | 4625 | 0.8377 | 54.3239 | 37.7401 | 40.4619 | 52.4602 | 142.0 |
| 0.0707 | 11.93 | 4750 | 0.8794 | 55.9136 | 40.6124 | 43.8806 | 54.2039 | 142.0 |
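
The final row of the training results is not the best on every metric. As a small illustration, the sketch below picks the best evaluation step by validation loss and by Rouge1 from a hand-copied subset of those rows. The helper name `best_row` and the choice of rows are illustrative only; the card does not say which metric, if any, was used to select the published checkpoint.

```python
from typing import NamedTuple


class EvalRow(NamedTuple):
    step: int
    val_loss: float
    rouge1: float


# Subset of rows copied from the training results (step, validation loss, Rouge1).
ROWS = [
    EvalRow(125, 1.2057, 50.9436),
    EvalRow(2375, 0.7509, 54.3645),
    EvalRow(3625, 0.8416, 56.5204),
    EvalRow(4375, 0.8557, 56.1136),
    EvalRow(4750, 0.8794, 55.9136),
]


def best_row(rows, metric: str, higher_is_better: bool) -> EvalRow:
    """Return the row with the best value of the named metric."""
    pick = max if higher_is_better else min
    return pick(rows, key=lambda r: getattr(r, metric))


if __name__ == "__main__":
    print(best_row(ROWS, "val_loss", higher_is_better=False))
    print(best_row(ROWS, "rouge1", higher_is_better=True))
```

On these rows the lowest validation loss is at step 2375 (0.7509) and the highest Rouge1 is at step 3625 (56.5204), which also holds across all logged evaluation steps. The final step, 4750, is best on neither.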

Framework versions

  • Transformers 4.18.0
  • PyTorch 1.11.0+cu113
  • Datasets 2.2.1
  • Tokenizers 0.12.1