IC_ver6M_coco_swin_gpt2_50A_1e

This model is a fine-tuned version of VK246/IC_ver6L_coco_swin_gpt2_50B_1e on the COCO dataset. It achieves the following results on the evaluation set (the values match the last logged evaluation, at step 2000):

  • Loss: 0.8502
  • Cider: 74.2614
  • Rouge1: 41.2346
  • Rouge2: 15.6726
  • Rougel: 37.3373
  • Rougelsum: 37.3448
  • Bleu-1: 42.1725
  • Bleu-2: 24.0988
  • Bleu-3: 15.0597
  • Bleu-4: 9.8745
  • Gen Len: 11.2806

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 96
  • eval_batch_size: 96
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 1
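
As a rough illustration of what `lr_scheduler_type: linear` implies for the learning rate, the sketch below reimplements the usual linear warmup and decay rule in plain Python. It is not the training script used for this model. The helper name `linear_lr`, the zero-warmup default, and the total step count are assumptions: the card lists no warmup setting, and the step count is only estimated from the training log (step 2000 at epoch 0.68).

```python
# Hyperparameters copied from the list above.
hparams = {
    "learning_rate": 5e-05,
    "train_batch_size": 96,
    "eval_batch_size": 96,
    "seed": 42,
    "lr_scheduler_type": "linear",
    "num_epochs": 1,
}


def linear_lr(step, total_steps, base_lr=5e-05, warmup_steps=0):
    """Learning rate at `step`: linear warmup, then linear decay to zero.

    Hypothetical helper for illustration. warmup_steps=0 is an assumption,
    since the card does not report a warmup setting.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# ASSUMPTION: about 2941 optimizer steps in the single epoch,
# estimated as 2000 / 0.68 from the training log.
TOTAL_STEPS = 2941

for step in (0, 1000, 2000, TOTAL_STEPS):
    print(step, linear_lr(step, TOTAL_STEPS, hparams["learning_rate"]))
```

Under these assumptions the rate starts at 5e-05 and falls in a straight line to zero at the end of the epoch, so the two logged evaluations were taken at roughly two thirds and one third of the initial rate.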

Training results

| Training Loss | Epoch | Step | Validation Loss | Cider | Rouge1 | Rouge2 | Rougel | Rougelsum | Bleu-1 | Bleu-2 | Bleu-3 | Bleu-4 | Gen Len |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.4052 | 0.34 | 1000 | 0.9934 | 68.6639 | 39.9165 | 14.5681 | 36.2437 | 36.2466 | 41.4603 | 23.2118 | 14.2892 | 9.3187 | 11.2806 |
| 0.5281 | 0.68 | 2000 | 0.8502 | 74.2614 | 41.2346 | 15.6726 | 37.3373 | 37.3448 | 42.1725 | 24.0988 | 15.0597 | 9.8745 | 11.2806 |
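
The epoch and step columns allow a back-of-the-envelope estimate of how many optimizer steps make up one epoch, and from that the approximate number of training examples. The sketch below is only that kind of estimate. It assumes the logged epoch values are rounded to two decimals, and that the effective batch size equals the listed `train_batch_size` of 96 (a single device with no gradient accumulation), neither of which the card confirms. The function names are hypothetical.

```python
# (epoch fraction, optimizer step) pairs from the training results.
LOGGED = [(0.34, 1000), (0.68, 2000)]
TRAIN_BATCH_SIZE = 96  # from the hyperparameter list


def steps_per_epoch_bounds(epoch, step, rounding=0.005):
    """Range of steps per epoch consistent with an epoch value
    that was rounded to two decimals (so the true value lies
    within +/- 0.005 of the logged one)."""
    return step / (epoch + rounding), step / (epoch - rounding)


def estimate(logged, batch_size):
    """Intersect the ranges from every logged row and convert the
    result to an approximate training-set size.

    ASSUMPTION: effective batch size == batch_size.
    """
    low = max(steps_per_epoch_bounds(e, s)[0] for e, s in logged)
    high = min(steps_per_epoch_bounds(e, s)[1] for e, s in logged)
    return low, high, low * batch_size, high * batch_size


low, high, n_low, n_high = estimate(LOGGED, TRAIN_BATCH_SIZE)
print(f"steps per epoch: {low:.0f} to {high:.0f}")
print(f"training examples (approx.): {n_low:.0f} to {n_high:.0f}")
```

If the effective batch size differs from 96, for example because of multi-GPU training or gradient accumulation, the example count scales accordingly while the steps-per-epoch range is unaffected.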

Framework versions

  • Transformers 4.32.0
  • PyTorch 2.0.1+cu118
  • Datasets 2.14.4
  • Tokenizers 0.13.3