bert-small2bert-small-finetuned-cnn_daily_mail-summarization-finetuned-multi_news

This model is a fine-tuned version of mrm8488/bert-small2bert-small-finetuned-cnn_daily_mail-summarization on the multi_news dataset. On the evaluation set, the final logged checkpoint (step 2000, epoch 4.44) achieves the following results:

  • Loss: 4.3760
  • Rouge1: 38.5318
  • Rouge2: 12.7285
  • Rougel: 21.4358
  • Rougelsum: 33.4565
  • Gen Len: 128.985
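
To make the Rouge1 figure above easier to interpret, here is a minimal sketch of a unigram-overlap F-measure reported on a 0 to 100 scale. It is a simplification, assuming whitespace tokenization with no stemming; it is not the scoring implementation used to produce the numbers in this card, and the function name `rouge1_f` is hypothetical.

```python
from collections import Counter


def rouge1_f(reference: str, candidate: str) -> float:
    """Simplified ROUGE-1 F-measure on a 0-100 scale (illustrative only)."""
    # Assumption: lowercase whitespace tokens, no stemming or punctuation handling.
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())

    # Clipped unigram overlap: each token counts at most as often as it
    # appears in both the reference and the candidate.
    overlap = sum((ref_counts & cand_counts).values())
    if overlap == 0:
        return 0.0

    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 100 * 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    print(rouge1_f("the cat sat on the mat", "the cat lay on the mat"))
```

Rouge2 applies the same idea to bigrams, while Rougel and Rougelsum are based on the longest common subsequence instead of fixed-size n-grams.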

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 500
  • num_epochs: 5
  • mixed_precision_training: Native AMP
  • label_smoothing_factor: 0.1
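
The learning-rate settings above (learning_rate 2e-05, linear scheduler, 500 warmup steps) can be sketched as a small function. This is an illustration of a linear warmup followed by linear decay, not the training code itself. The total of 2250 steps is an assumption estimated from the Epoch and Step columns of the training results; the card does not state it, and the name `linear_schedule_lr` is hypothetical.

```python
def linear_schedule_lr(
    step: int,
    base_lr: float = 2e-5,      # learning_rate from the card
    warmup_steps: int = 500,    # lr_scheduler_warmup_steps from the card
    total_steps: int = 2250,    # ASSUMPTION: estimated, not stated in the card
) -> float:
    """Learning rate at a given optimizer step for linear warmup + linear decay."""
    if step < warmup_steps:
        # Warmup: ramp linearly from 0 up to base_lr.
        return base_lr * step / max(1, warmup_steps)
    # Decay: ramp linearly from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    for s in (0, 250, 500, 1375, 2250):
        print(s, linear_schedule_lr(s))
```

Under these assumptions the rate peaks at step 500, so the first evaluation at step 400 still falls inside the warmup phase.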

Training results

Training Loss  Epoch  Step  Validation Loss  Rouge1   Rouge2   Rougel   Rougelsum  Gen Len
4.6946         0.89   400   4.5393           37.164   11.5191  20.2519  32.1568    126.415
4.5128         1.78   800   4.4185           38.2345  12.2053  20.954   33.0667    128.975
4.2926         2.67   1200  4.3866           38.4475  12.6488  21.3046  33.2768    129.0
4.231          3.56   1600  4.3808           38.7008  12.6323  21.307   33.3693    128.955
4.125          4.44   2000  4.3760           38.5318  12.7285  21.4358  33.4565    128.985
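
The Epoch and Step columns above allow a rough estimate of how many optimizer steps make up one epoch, which the card does not state directly. The sketch below derives that estimate from the logged rows. The helper name `estimate_steps_per_epoch` is hypothetical, and the result is only approximate because the epoch values are rounded to two decimals.

```python
# (epoch, step) pairs copied from the training results table.
LOGGED = [(0.89, 400), (1.78, 800), (2.67, 1200), (3.56, 1600), (4.44, 2000)]


def estimate_steps_per_epoch(rows):
    """Average the step/epoch ratio over all logged evaluation points."""
    return sum(step / epoch for epoch, step in rows) / len(rows)


steps_per_epoch = estimate_steps_per_epoch(LOGGED)

# num_epochs is 5 in the hyperparameter list, so this approximates the
# total number of optimizer steps in the run.
estimated_total_steps = round(steps_per_epoch * 5)

if __name__ == "__main__":
    print(f"steps per epoch ~ {steps_per_epoch:.1f}")
    print(f"estimated total steps ~ {estimated_total_steps}")
```

The estimate comes out at roughly 450 steps per epoch. With train_batch_size 4, and assuming no gradient accumulation (the card lists none), that would correspond to about 1,800 training examples per epoch.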

Framework versions

  • Transformers 4.20.1
  • Pytorch 1.11.0
  • Datasets 2.1.0
  • Tokenizers 0.12.1

Dataset used to train keith97/bert-small2bert-small-finetuned-cnn_daily_mail-summarization-finetuned-multi_news: multi_news