VK246/IC_ver6a_coco_swin_gpt2_50Apc_1e

Tags: Image-Text-to-Text, vision-encoder-decoder, Generated from Trainer

IC_ver6a_coco_swin_gpt2_50Apc_1e

This model is a fine-tuned version of a base checkpoint (not named in this card) on the COCO dataset. It achieves the following results on the evaluation set:

Loss: 0.8477
Rouge1: 40.2406
Rouge2: 15.0629
Rougel: 36.6294
Rougelsum: 36.6164
Bleu: 9.0728
Gen Len: 11.2806
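
The card does not document how to run the model, so the following is only a minimal sketch of captioning a single image. It assumes the repository ships image-processor and tokenizer files alongside the weights (not confirmed by this card), and the `caption_image` helper, its `image_path` argument, and the `max_new_tokens=20` limit are illustrative choices rather than the author's settings.

```python
REPO_ID = "VK246/IC_ver6a_coco_swin_gpt2_50Apc_1e"


def caption_image(image_path, repo_id=REPO_ID, max_new_tokens=20):
    """Return one generated caption for the image at image_path.

    Hypothetical helper: assumes the repo contains preprocessor and
    tokenizer configs, which this card does not confirm.
    """
    # Imported lazily so the module loads without the heavy dependencies.
    from PIL import Image
    from transformers import (
        AutoImageProcessor,
        AutoTokenizer,
        VisionEncoderDecoderModel,
    )

    model = VisionEncoderDecoderModel.from_pretrained(repo_id)
    processor = AutoImageProcessor.from_pretrained(repo_id)
    tokenizer = AutoTokenizer.from_pretrained(repo_id)

    # The encoder expects RGB input; convert in case of grayscale or RGBA files.
    image = Image.open(image_path).convert("RGB")
    pixel_values = processor(images=image, return_tensors="pt").pixel_values

    # max_new_tokens is an assumed cap, chosen to sit above the reported
    # average generation length of about 11 tokens.
    output_ids = model.generate(pixel_values=pixel_values, max_new_tokens=max_new_tokens)
    return tokenizer.batch_decode(output_ids, skip_special_tokens=True)[0].strip()
```

Calling `caption_image("example.jpg")` (a placeholder path) downloads the checkpoint on first use, so it needs network access and the `transformers`, `torch`, and `Pillow` packages.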

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 96
eval_batch_size: 96
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 1
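
To make the `lr_scheduler_type: linear` setting concrete, here is a rough sketch of a linear schedule: the rate falls in a straight line from `learning_rate` to zero over the run. The card lists no warmup, so `warmup_steps=0` is assumed, and `total_steps` is a free parameter because the card does not state the total number of optimizer steps.

```python
def linear_lr(step, total_steps, base_lr=5e-05, warmup_steps=0):
    """Learning rate at a given optimizer step under linear decay.

    Sketch only: warmup_steps=0 is an assumption (the card lists none),
    and total_steps must be supplied by the caller.
    """
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Linear decay from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    # 1000 is a hypothetical run length used only for illustration.
    for step in (0, 250, 500, 750, 1000):
        print(step, linear_lr(step, total_steps=1000))
```

Under these assumptions the rate is halved at the midpoint of training and reaches zero on the final step.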

Training results

Training Loss	Epoch	Step	Validation Loss	Rouge1	Rouge2	Rougel	Rougelsum	Bleu	Gen Len
1.1343	0.17	500	0.9708	35.1592	11.4248	32.3362	32.3316	6.404	11.2806
0.9606	0.34	1000	0.9123	37.9656	12.9721	34.5569	34.5606	7.489	11.2806
0.9286	0.51	1500	0.8828	38.7702	13.945	35.4661	35.4648	8.022	11.2806
0.8994	0.68	2000	0.8619	39.8572	14.6183	36.3345	36.3262	8.7008	11.2806
0.8843	0.85	2500	0.8525	39.8151	14.7431	36.3033	36.2918	8.8305	11.2806
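
The logged checkpoints can be used to estimate the length of one epoch and the gain between the first and last evaluation. The sketch below copies a subset of columns from the table above. The epoch column is rounded to two decimals, so the steps-per-epoch figure is approximate, and the samples-per-epoch figure additionally assumes a single device with no gradient accumulation, which the card does not state.

```python
# (epoch, step, validation_loss, rouge1, bleu), copied from the table above.
CHECKPOINTS = [
    (0.17, 500, 0.9708, 35.1592, 6.404),
    (0.34, 1000, 0.9123, 37.9656, 7.489),
    (0.51, 1500, 0.8828, 38.7702, 8.022),
    (0.68, 2000, 0.8619, 39.8572, 8.7008),
    (0.85, 2500, 0.8525, 39.8151, 8.8305),
]
TRAIN_BATCH_SIZE = 96  # from the hyperparameter list

# Average step/epoch ratio over all logged checkpoints.
steps_per_epoch = sum(step / epoch for epoch, step, *_ in CHECKPOINTS) / len(CHECKPOINTS)

# Assumes one device and no gradient accumulation (not stated in the card).
samples_per_epoch = steps_per_epoch * TRAIN_BATCH_SIZE

first, last = CHECKPOINTS[0], CHECKPOINTS[-1]
delta = {
    "validation_loss": last[2] - first[2],
    "rouge1": last[3] - first[3],
    "bleu": last[4] - first[4],
}

if __name__ == "__main__":
    print(f"approx. steps per epoch: {steps_per_epoch:.0f}")
    print(f"approx. samples per epoch: {samples_per_epoch:.0f}")
    for name, value in delta.items():
        print(f"{name}: {value:+.4f}")
```

Note that the table stops at step 2500 (epoch 0.85), so the evaluation figures reported at the top of the card do not correspond to any row logged here.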

Framework versions

Transformers 4.30.2
Pytorch 2.0.1+cu118
Datasets 2.13.1
Tokenizers 0.13.3
