finetuned_bart_large_custom

This model is a fine-tuned version of facebook/bart-large-cnn on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 4.8324
  • ROUGE-1: 39.9143
  • ROUGE-2: 10.7144
  • ROUGE-L: 21.1537
  • ROUGE-Lsum: 35.81
  • Gen Len (mean generated tokens): 131.6667

Model description

More information needed

Intended uses & limitations

More information needed
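
In the absence of documented usage, here is a minimal inference sketch, assuming the checkpoint is available under the hypothetical id `finetuned_bart_large_custom` (substitute the actual Hub id or local path):

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# Hypothetical checkpoint id; replace with the actual Hub id or local path.
checkpoint = "finetuned_bart_large_custom"

tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint)

article = "Your input document goes here."
inputs = tokenizer(article, return_tensors="pt", truncation=True, max_length=1024)

# These generation settings mirror the facebook/bart-large-cnn defaults;
# the fine-tuned model may inherit them, but adjust as needed.
summary_ids = model.generate(**inputs, num_beams=4, max_length=142, min_length=56)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```

The mean generated length of roughly 131 tokens on the evaluation set is close to the 142-token cap above, so raise max_length if summaries look truncated.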

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a sketch of the equivalent Seq2SeqTrainingArguments follows the list):

  • learning_rate: 2e-05
  • train_batch_size: 2
  • eval_batch_size: 2
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 4
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 30
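
A minimal sketch of these settings expressed as Seq2SeqTrainingArguments from Transformers 4.37. The output directory, per-epoch evaluation, and predict_with_generate flag are assumptions not stated above; the Adam betas and epsilon listed match the Transformers defaults:

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="finetuned_bart_large_custom",  # hypothetical output path
    learning_rate=2e-5,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    gradient_accumulation_steps=2,  # total train batch size: 2 x 2 = 4 on one device
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=30,
    # Adam betas=(0.9, 0.999) and epsilon=1e-08 are the library defaults,
    # so they need not be set explicitly.
    evaluation_strategy="epoch",    # assumption: the table below logs metrics per epoch
    predict_with_generate=True,     # assumption: required for ROUGE/Gen Len during eval
)
```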

Training results

| Training Loss | Epoch | Step | Validation Loss | ROUGE-1 | ROUGE-2 | ROUGE-L | ROUGE-Lsum | Gen Len  |
|:-------------:|:-----:|:----:|:---------------:|:-------:|:-------:|:-------:|:----------:|:--------:|
| No log        | 1.0   | 16   | 4.3093          | 39.1367 | 9.9819  | 21.0796 | 35.3746    | 132.0741 |
| No log        | 2.0   | 32   | 4.2921          | 39.0619 | 9.8356  | 21.7437 | 35.6597    | 131.7037 |
| No log        | 3.0   | 48   | 4.3876          | 39.5314 | 10.337  | 21.0096 | 35.9973    | 131.2593 |
| No log        | 4.0   | 64   | 4.4020          | 39.3551 | 9.9689  | 21.4343 | 35.3958    | 131.1481 |
| No log        | 5.0   | 80   | 4.3744          | 39.7603 | 10.4124 | 21.6535 | 35.4996    | 132.963  |
| No log        | 6.0   | 96   | 4.4821          | 39.9859 | 11.0712 | 22.2449 | 35.7868    | 132.4074 |
| No log        | 7.0   | 112  | 4.6017          | 38.765  | 10.3317 | 20.9319 | 34.6675    | 132.2593 |
| No log        | 8.0   | 128  | 4.4419          | 39.9964 | 10.3341 | 20.9618 | 35.8621    | 130.2222 |
| No log        | 9.0   | 144  | 4.4990          | 39.8075 | 10.3829 | 21.3509 | 35.9882    | 128.7407 |
| No log        | 10.0  | 160  | 4.7017          | 38.6152 | 9.9282  | 20.4588 | 34.4487    | 131.9259 |
| No log        | 11.0  | 176  | 4.5497          | 39.0296 | 9.9429  | 20.8087 | 34.4624    | 132.6296 |
| No log        | 12.0  | 192  | 4.7301          | 38.8819 | 9.5937  | 20.929  | 34.7983    | 131.4444 |
| No log        | 13.0  | 208  | 4.5114          | 38.4163 | 9.6869  | 20.373  | 34.1491    | 123.8519 |
| No log        | 14.0  | 224  | 4.7097          | 38.4294 | 9.5615  | 20.1514 | 35.0332    | 131.7407 |
| No log        | 15.0  | 240  | 4.6300          | 38.9564 | 9.6386  | 20.0618 | 34.8298    | 129.963  |
| No log        | 16.0  | 256  | 4.6916          | 38.5582 | 10.136  | 20.8347 | 34.4795    | 129.8519 |
| No log        | 17.0  | 272  | 4.6959          | 38.3264 | 9.5281  | 20.5576 | 34.6148    | 128.2963 |
| No log        | 18.0  | 288  | 4.6756          | 37.5569 | 9.123   | 19.8291 | 33.5111    | 126.6667 |
| No log        | 19.0  | 304  | 4.7579          | 38.5704 | 9.3654  | 20.1826 | 34.8297    | 131.4815 |
| No log        | 20.0  | 320  | 4.8128          | 40.158  | 10.3889 | 20.9267 | 36.8965    | 130.1852 |
| No log        | 21.0  | 336  | 4.7659          | 39.4144 | 10.2445 | 20.4763 | 35.328     | 134.2593 |
| No log        | 22.0  | 352  | 4.7983          | 40.2859 | 11.0388 | 21.1643 | 36.0311    | 131.9259 |
| No log        | 23.0  | 368  | 4.7954          | 39.2676 | 10.5795 | 21.1116 | 35.3949    | 130.1481 |
| No log        | 24.0  | 384  | 4.7991          | 39.8126 | 10.3955 | 21.2952 | 35.7538    | 130.5926 |
| No log        | 25.0  | 400  | 4.8371          | 39.3481 | 10.2857 | 20.9862 | 35.1724    | 125.1481 |
| No log        | 26.0  | 416  | 4.8589          | 40.0988 | 10.4426 | 21.7284 | 35.7289    | 130.3333 |
| No log        | 27.0  | 432  | 4.8423          | 39.9233 | 10.3253 | 21.5853 | 36.1194    | 131.1111 |
| No log        | 28.0  | 448  | 4.8274          | 40.0388 | 10.1713 | 20.991  | 35.3966    | 130.4444 |
| No log        | 29.0  | 464  | 4.8313          | 39.8516 | 10.6207 | 21.0394 | 35.6627    | 130.8148 |
| No log        | 30.0  | 480  | 4.8324          | 39.9143 | 10.7144 | 21.1537 | 35.81      | 131.6667 |
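
The card does not document how the ROUGE scores were computed. Below is a minimal sketch, assuming the `evaluate` library and the percentage-scale reporting used by the Hugging Face summarization examples:

```python
import evaluate

# Assumption: scores follow the standard ROUGE metric, reported as
# percentages (e.g. 39.9143 for ROUGE-1 rather than 0.399143).
rouge = evaluate.load("rouge")

predictions = ["a generated summary ..."]
references = ["the reference summary ..."]

scores = rouge.compute(predictions=predictions, references=references)
print({k: round(v * 100, 4) for k, v in scores.items()})
```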

Framework versions

  • Transformers 4.37.0
  • PyTorch 2.1.2
  • Datasets 2.1.0
  • Tokenizers 0.15.1