
IC_ver3b_coco_swin_gpt2_2

This model is a fine-tuned image-captioning model (a Swin encoder paired with a GPT-2 decoder, as the model name suggests; the base checkpoint is not specified) trained on the COCO dataset. It achieves the following results on the evaluation set (a minimal usage sketch follows the list):

  • Loss: 0.8483
  • Rouge1: 41.3447
  • Rouge2: 15.7294
  • Rougel: 37.6633
  • Rougelsum: 37.6744
  • Bleu: 9.4309
  • Gen Len: 11.3368
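
Because the model name indicates a Swin encoder with a GPT-2 decoder, it can presumably be loaded as a `VisionEncoderDecoderModel`. The snippet below is a minimal inference sketch, not the author's published usage code: the checkpoint identifier is a placeholder, and it assumes the image processor and tokenizer were saved alongside the model.

```python
from PIL import Image
import torch
from transformers import AutoImageProcessor, AutoTokenizer, VisionEncoderDecoderModel

model_id = "path/to/IC_ver3b_coco_swin_gpt2_2"  # placeholder; substitute the actual checkpoint
model = VisionEncoderDecoderModel.from_pretrained(model_id)
image_processor = AutoImageProcessor.from_pretrained(model_id)  # assumes the processor was saved with the model
tokenizer = AutoTokenizer.from_pretrained(model_id)             # GPT-2 tokenizer, assumed saved with the model

image = Image.open("example.jpg").convert("RGB")
pixel_values = image_processor(images=image, return_tensors="pt").pixel_values

with torch.no_grad():
    # Generation settings are illustrative; the reported Gen Len of ~11 suggests short captions.
    output_ids = model.generate(pixel_values, max_length=16, num_beams=4)

caption = tokenizer.decode(output_ids[0], skip_special_tokens=True)
print(caption)
```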

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a hedged Seq2SeqTrainingArguments sketch follows this list):

  • learning_rate: 5e-05
  • train_batch_size: 96
  • eval_batch_size: 96
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 3
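
These settings map onto transformers' `Seq2SeqTrainingArguments` roughly as sketched below. This is an assumption about the configuration, not the author's exact script: the output directory, the 300-step evaluation/logging cadence (taken from the results table), and `predict_with_generate` are inferred, and the batch sizes are treated as per-device values.

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="IC_ver3b_coco_swin_gpt2_2",  # assumed output directory
    learning_rate=5e-5,
    per_device_train_batch_size=96,
    per_device_eval_batch_size=96,
    seed=42,
    num_train_epochs=3,
    lr_scheduler_type="linear",
    evaluation_strategy="steps",
    eval_steps=300,               # matches the 300-step cadence in the results table below
    logging_steps=300,
    predict_with_generate=True,   # required to compute ROUGE/BLEU on generated captions
    # Adam with betas=(0.9, 0.999) and epsilon=1e-08 is the Trainer default optimizer,
    # so no explicit optimizer arguments are needed.
)
```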

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum | Bleu   | Gen Len |
|:-------------:|:-----:|:----:|:---------------:|:-------:|:-------:|:-------:|:---------:|:------:|:-------:|
| 1.2141        | 0.25  | 300  | 1.0093          | 35.2179 | 11.1228 | 32.1546 | 32.167    | 6.2018 | 11.3368 |
| 1.0037        | 0.51  | 600  | 0.9600          | 36.4586 | 11.8379 | 33.324  | 33.3342   | 7.0081 | 11.3368 |
| 0.9644        | 0.76  | 900  | 0.9303          | 38.5343 | 13.2266 | 35.2902 | 35.3055   | 7.539  | 11.3368 |
| 0.9367        | 1.02  | 1200 | 0.9004          | 39.2182 | 13.7589 | 35.7747 | 35.7799   | 7.6492 | 11.3368 |
| 0.8842        | 1.27  | 1500 | 0.8876          | 39.4537 | 14.1037 | 35.9758 | 35.9776   | 8.4067 | 11.3368 |
| 0.86          | 1.53  | 1800 | 0.8758          | 40.4179 | 15.0774 | 37.0166 | 37.0401   | 8.8897 | 11.3368 |
| 0.8465        | 1.78  | 2100 | 0.8665          | 40.4073 | 15.1125 | 36.9767 | 36.9877   | 8.9602 | 11.3368 |
| 0.8421        | 2.04  | 2400 | 0.8592          | 40.62   | 15.2042 | 36.9224 | 36.9359   | 9.1313 | 11.3368 |
| 0.8106        | 2.29  | 2700 | 0.8548          | 41.0356 | 15.399  | 37.4562 | 37.4635   | 9.2534 | 11.3368 |
| 0.7963        | 2.54  | 3000 | 0.8521          | 41.1998 | 15.6442 | 37.6659 | 37.6682   | 9.4605 | 11.3368 |
| 0.795         | 2.8   | 3300 | 0.8493          | 41.1215 | 15.581  | 37.4725 | 37.4978   | 9.5488 | 11.3368 |
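
The ROUGE, BLEU, and generation-length columns above are consistent with the usual `evaluate`-based metric setup for caption generation. The sketch below mirrors those metrics; it is an assumption about how they could have been computed, not the exact code behind this card, and it presumes the tokenizer has a pad token configured (for GPT-2 this is typically set to the EOS token).

```python
import evaluate
import numpy as np

rouge = evaluate.load("rouge")
bleu = evaluate.load("bleu")

def compute_metrics(eval_pred, tokenizer):
    # With Trainer, bind the tokenizer via functools.partial(compute_metrics, tokenizer=...)
    # or a closure, since Trainer passes only the EvalPrediction object.
    predictions, labels = eval_pred
    decoded_preds = tokenizer.batch_decode(predictions, skip_special_tokens=True)
    # Replace the -100 loss-masking value before decoding the references.
    labels = np.where(labels != -100, labels, tokenizer.pad_token_id)
    decoded_labels = tokenizer.batch_decode(labels, skip_special_tokens=True)

    scores = rouge.compute(
        predictions=decoded_preds, references=decoded_labels, use_stemmer=True
    )
    scores = {k: round(v * 100, 4) for k, v in scores.items()}

    bleu_scores = bleu.compute(
        predictions=decoded_preds, references=[[ref] for ref in decoded_labels]
    )
    scores["bleu"] = round(bleu_scores["bleu"] * 100, 4)

    # Average generated length in tokens (excluding padding).
    scores["gen_len"] = float(
        np.mean([np.count_nonzero(pred != tokenizer.pad_token_id) for pred in predictions])
    )
    return scores
```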

Framework versions

  • Transformers 4.30.2
  • Pytorch 2.0.1+cu118
  • Datasets 2.13.1
  • Tokenizers 0.13.3