metadata
base_model: VK246/IC_ver6N_coco_swin_gpt2_50B_1e
tags:
  - generated_from_trainer
datasets:
  - coco
metrics:
  - rouge
model-index:
  - name: IC_ver6O_coco_swin_gpt2_50A_1e
    results: []

IC_ver6O_coco_swin_gpt2_50A_1e

This model is a fine-tuned version of VK246/IC_ver6N_coco_swin_gpt2_50B_1e on the COCO dataset. At the last logged evaluation (step 2000, epoch 0.68) it achieves the following results on the evaluation set:

  • Loss: 0.8849
  • Cider: 72.4854
  • Rouge1: 40.7439
  • Rouge2: 15.2616
  • Rougel: 36.9501
  • Rougelsum: 36.949
  • Bleu-1: 41.8372
  • Bleu-2: 23.7109
  • Bleu-3: 14.6914
  • Bleu-4: 9.6127
  • Gen Len: 11.2806
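As a rough illustration of what the Bleu-n figures above are built from, the sketch below computes clipped (modified) n-gram precision for one candidate caption against one reference. It is not the evaluation code used for this model: the whitespace tokenization, the single reference, and the example captions are all assumptions made for illustration, and a full BLEU score additionally combines several n-gram orders and applies a brevity penalty.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def clipped_precision(candidate, reference, n):
    """Clipped n-gram precision of a candidate caption against one reference."""
    cand = ngrams(candidate.split(), n)
    ref = ngrams(reference.split(), n)
    total = sum(cand.values())
    if total == 0:
        # Candidate is shorter than n tokens: no n-grams to score.
        return 0.0
    # Each candidate n-gram counts at most as often as it occurs in the reference.
    matched = sum(min(count, ref[gram]) for gram, count in cand.items())
    return matched / total


# Hypothetical captions, not taken from the evaluation set.
candidate = "a dog runs on the grass"
reference = "a dog is running on the grass"

p1 = clipped_precision(candidate, reference, 1)  # unigram precision
p2 = clipped_precision(candidate, reference, 2)  # bigram precision
print(f"p1={p1:.4f} p2={p2:.4f}")
```

The values reported above appear to be on a 0 to 100 scale, whereas this sketch returns fractions between 0 and 1.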

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 96
  • eval_batch_size: 96
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 1
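To make the combination of learning_rate: 5e-05 and lr_scheduler_type: linear concrete, here is a minimal sketch of a linear decay schedule in plain Python. The total number of optimizer steps and the warmup length are not stated in this card, so `total_steps=3000` and `warmup_steps=0` are assumptions chosen only for illustration, and `linear_lr` is a hypothetical helper, not a function from Transformers.

```python
BASE_LR = 5e-05  # learning_rate from the hyperparameter list above


def linear_lr(step, total_steps, base_lr=BASE_LR, warmup_steps=0):
    """Learning rate at `step` under linear warmup followed by linear decay to zero."""
    if step < warmup_steps:
        # Ramp up linearly from 0 to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Decay linearly from base_lr to 0 over the remaining steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Assumed run length; the real value depends on dataset size and batch size.
total_steps = 3000
for step in (0, 1000, 2000, 3000):
    print(step, linear_lr(step, total_steps))
```

With no warmup, the rate starts at the configured 5e-05 and reaches zero at the final step, so later steps of the single epoch train with a progressively smaller rate.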

Training results

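| Training Loss | Epoch | Step | Validation Loss | Cider   | Rouge1  | Rouge2  | Rougel  | Rougelsum | Bleu-1  | Bleu-2  | Bleu-3  | Bleu-4 | Gen Len |
|---------------|-------|------|-----------------|---------|---------|---------|---------|-----------|---------|---------|---------|--------|---------|
| 0.3364        | 0.34  | 1000 | 1.0724          | 68.111  | 39.6376 | 14.2539 | 35.915  | 35.9088   | 40.8738 | 22.7246 | 13.9284 | 8.9769 | 11.2806 |
| 0.4722        | 0.68  | 2000 | 0.8849          | 72.4854 | 40.7439 | 15.2616 | 36.9501 | 36.949    | 41.8372 | 23.7109 | 14.6914 | 9.6127 | 11.2806 |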
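The Epoch and Step values logged above allow a back-of-the-envelope estimate of the run length. The sketch below divides each step count by its epoch fraction to estimate steps per epoch, then multiplies by train_batch_size to estimate the number of training samples. It assumes a single device and no gradient accumulation, neither of which is stated in this card, and because the logged epoch values are rounded to two decimals the result is only approximate.

```python
TRAIN_BATCH_SIZE = 96  # train_batch_size from the hyperparameters

# (step, epoch fraction) pairs as logged in the training results.
rows = [(1000, 0.34), (2000, 0.68)]


def estimate_steps_per_epoch(step, epoch_fraction):
    """Estimate optimizer steps in one full epoch from a logged (step, epoch) pair."""
    return step / epoch_fraction


estimates = [estimate_steps_per_epoch(step, epoch) for step, epoch in rows]

# Assumes one device and no gradient accumulation, so one step consumes one batch.
samples = [round(est * TRAIN_BATCH_SIZE) for est in estimates]

for (step, epoch), est, n in zip(rows, estimates, samples):
    print(f"step={step} epoch={epoch}: ~{est:.0f} steps/epoch, ~{n} samples/epoch")
```

Both logged rows give the same estimate, on the order of 2,900 to 3,000 steps per epoch once the rounding of the epoch column is taken into account.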

Framework versions

  • Transformers 4.32.0
  • Pytorch 2.0.1+cu118
  • Datasets 2.14.4
  • Tokenizers 0.13.3