TrTr-CMR-SYDNEY-MS-captioning

This model is a fine-tuned version of an unspecified base model on an unknown dataset. It achieves the following results on the evaluation set at the last logged epoch (epoch 25, step 975):

  • Loss: 1.0235
  • Accuracy: 58.76
  • Bleu-1: 0.8541
  • Bleu-2: 0.8006
  • Bleu-3: 0.7487
  • Bleu-4: 0.6993
  • Meteor: 0.7933
  • Rouge-l: 0.7655
  • Cider: 3.0086
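
The card does not say which implementation produced these scores. As a rough illustration of what the Bleu-n figures measure, and of why they fall as n grows, here is a minimal sketch of single-reference BLEU (clipped n-gram precision, geometric mean, brevity penalty). The function names and the example captions are hypothetical and are not taken from this model's evaluation code or data.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams of a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def clipped_precision(candidate, reference, n):
    """Share of candidate n-grams found in the reference, with counts clipped."""
    cand = ngrams(candidate, n)
    ref = ngrams(reference, n)
    total = sum(cand.values())
    if total == 0:
        return 0.0
    overlap = sum(min(count, ref[gram]) for gram, count in cand.items())
    return overlap / total


def bleu(candidate, reference, max_n):
    """Cumulative BLEU-max_n against one reference (no smoothing)."""
    precisions = [clipped_precision(candidate, reference, n)
                  for n in range(1, max_n + 1)]
    if min(precisions) == 0:
        return 0.0
    # Brevity penalty: candidates shorter than the reference are penalised.
    if len(candidate) >= len(reference):
        bp = 1.0
    else:
        bp = math.exp(1 - len(reference) / len(candidate))
    return bp * math.exp(sum(math.log(p) for p in precisions) / max_n)


# Hypothetical caption pair, for illustration only.
candidate = "a plane on the runway".split()
reference = "a plane parked on the runway".split()
bleu1 = bleu(candidate, reference, 1)
bleu2 = bleu(candidate, reference, 2)
print(f"BLEU-1={bleu1:.4f} BLEU-2={bleu2:.4f}")
```

Because each higher order requires longer exact matches, the cumulative score can only stay level or drop as n increases, which matches the ordering of Bleu-1 through Bleu-4 above.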

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 64
  • eval_batch_size: 64
  • seed: 50
  • optimizer: AdamW (OptimizerNames.ADAMW_TORCH_FUSED) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 1024
  • num_epochs: 128
  • mixed_precision_training: Native AMP
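
To make the learning-rate settings above concrete, here is a minimal sketch of a linear schedule with warmup: the rate rises linearly from 0 to learning_rate over the warmup steps, then decays linearly to 0 at the end of training. The total step count is an assumption. It is derived as 39 steps per epoch (from the logged steps) times num_epochs = 128, and it presumes no gradient accumulation. The function name is hypothetical.

```python
BASE_LR = 1e-4           # learning_rate from the list above
WARMUP_STEPS = 1024      # lr_scheduler_warmup_steps from the list above
TOTAL_STEPS = 39 * 128   # ASSUMPTION: 39 steps/epoch x num_epochs, no accumulation


def linear_schedule_lr(step, base_lr=BASE_LR,
                       warmup_steps=WARMUP_STEPS, total_steps=TOTAL_STEPS):
    """Learning rate at a given optimizer step under linear warmup and decay."""
    if step < warmup_steps:
        # Warmup phase: scale up linearly from 0 to base_lr.
        return base_lr * step / max(1, warmup_steps)
    # Decay phase: scale down linearly, reaching 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


for step in (0, 512, 975, 1024, 3008, TOTAL_STEPS):
    print(step, linear_schedule_lr(step))
```

Under these assumptions, step 975 (the end of epoch 25) is still inside the 1024-step warmup window, so the learning rate would not yet have reached its peak at that point.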

Training results

Training Loss  Epoch  Step  Validation Loss  Accuracy  Bleu-1  Bleu-2  Bleu-3  Bleu-4  Meteor  Rouge-l  Cider
No log         1.0    39    9.5700           65.19     0.0216  0.0015  0.0007  0.0005  0.0279  0.0799   0.0178
No log         2.0    78    4.1754           65.49     0.0379  0.0018  0.0008  0.0006  0.0577  0.1657   0.0117
No log         3.0    117   3.5413           65.74     0.2514  0.1586  0.0767  0.0187  0.1411  0.2765   0.1280
No log         4.0    156   2.8280           56.59     0.4240  0.3439  0.1929  0.1080  0.3176  0.4125   0.3287
No log         5.0    195   2.0664           54.97     0.6707  0.5901  0.4928  0.4076  0.5716  0.5987   1.4822
No log         6.0    234   1.6072           55.09     0.7323  0.6586  0.5768  0.4978  0.6541  0.6633   1.9934
No log         7.0    273   1.3725           60.7      0.8113  0.7281  0.6465  0.5685  0.6930  0.6957   2.2724
No log         8.0    312   1.2215           60.57     0.8166  0.7280  0.6365  0.5539  0.7279  0.7186   2.3927
No log         9.0    351   1.1166           59.36     0.8172  0.7370  0.6502  0.5669  0.7479  0.7404   2.4606
No log         10.0   390   1.0858           60.98     0.8254  0.7438  0.6643  0.5904  0.7643  0.7445   2.4932
No log         11.0   429   1.0154           58.16     0.8209  0.7438  0.6675  0.5908  0.7556  0.7352   2.4243
No log         12.0   468   0.9940           59.48     0.8179  0.7341  0.6502  0.5687  0.7543  0.7421   2.5154
No log         13.0   507   0.9646           57.9      0.8204  0.7470  0.6773  0.6090  0.7776  0.7448   2.6404
No log         14.0   546   0.9777           58.38     0.8203  0.7442  0.6672  0.5905  0.7714  0.7432   2.5989
No log         15.0   585   0.9076           57.9      0.8647  0.8039  0.7501  0.6976  0.8136  0.7911   3.1252
No log         16.0   624   0.9375           56.43     0.8298  0.7695  0.7144  0.6630  0.8087  0.7669   2.8870
No log         17.0   663   0.9850           55.74     0.8266  0.7412  0.6682  0.5989  0.7825  0.7382   2.6386
No log         18.0   702   0.9649           55.53     0.8539  0.7830  0.7139  0.6444  0.7944  0.7638   2.7532
No log         19.0   741   0.9414           59.45     0.8439  0.7701  0.6994  0.6318  0.7585  0.7510   2.7099
No log         20.0   780   0.9716           56.31     0.8280  0.7538  0.6811  0.6130  0.7836  0.7536   2.6435
No log         21.0   819   1.0360           57.58     0.8268  0.7444  0.6703  0.6009  0.7439  0.7311   2.5731
No log         22.0   858   0.9405           55.54     0.8197  0.7381  0.6757  0.6234  0.7705  0.7430   2.7865
No log         23.0   897   1.0226           56.77     0.8227  0.7515  0.6830  0.6153  0.7648  0.7266   2.7045
No log         24.0   936   1.0538           55.75     0.8286  0.7471  0.6761  0.6129  0.7580  0.7454   2.7123
No log         25.0   975   1.0235           58.76     0.8541  0.8006  0.7487  0.6993  0.7933  0.7655   3.0086
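
The headline results reported at the top of this card are those of the last logged epoch, which is not the best epoch on every metric. The sketch below shows one way to pick an epoch per metric from the log. The values are copied from the training results above; the helper name and the choice of columns are illustrative assumptions, not part of the original training setup.

```python
# (epoch, validation_loss, bleu4, cider), copied from the training results log.
LOGGED = [
    (1, 9.5700, 0.0005, 0.0178), (2, 4.1754, 0.0006, 0.0117),
    (3, 3.5413, 0.0187, 0.1280), (4, 2.8280, 0.1080, 0.3287),
    (5, 2.0664, 0.4076, 1.4822), (6, 1.6072, 0.4978, 1.9934),
    (7, 1.3725, 0.5685, 2.2724), (8, 1.2215, 0.5539, 2.3927),
    (9, 1.1166, 0.5669, 2.4606), (10, 1.0858, 0.5904, 2.4932),
    (11, 1.0154, 0.5908, 2.4243), (12, 0.9940, 0.5687, 2.5154),
    (13, 0.9646, 0.6090, 2.6404), (14, 0.9777, 0.5905, 2.5989),
    (15, 0.9076, 0.6976, 3.1252), (16, 0.9375, 0.6630, 2.8870),
    (17, 0.9850, 0.5989, 2.6386), (18, 0.9649, 0.6444, 2.7532),
    (19, 0.9414, 0.6318, 2.7099), (20, 0.9716, 0.6130, 2.6435),
    (21, 1.0360, 0.6009, 2.5731), (22, 0.9405, 0.6234, 2.7865),
    (23, 1.0226, 0.6153, 2.7045), (24, 1.0538, 0.6129, 2.7123),
    (25, 1.0235, 0.6993, 3.0086),
]


def best_epoch(rows, column, higher_is_better):
    """Return the epoch whose value in the given column is best."""
    pick = max if higher_is_better else min
    return pick(rows, key=lambda row: row[column])[0]


best_by_loss = best_epoch(LOGGED, column=1, higher_is_better=False)
best_by_bleu4 = best_epoch(LOGGED, column=2, higher_is_better=True)
best_by_cider = best_epoch(LOGGED, column=3, higher_is_better=True)
print(best_by_loss, best_by_bleu4, best_by_cider)  # prints: 15 25 15
```

Epoch 15 has the lowest validation loss (0.9076) and the highest Cider (3.1252), while epoch 25 is marginally ahead on Bleu-4 (0.6993 against 0.6976).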

Framework versions

  • Transformers 5.8.1
  • Pytorch 2.12.0+cu130
  • Datasets 5.0.0
  • Tokenizers 0.22.2

Safetensors

  • Model size: 0.1B params
  • Tensor types: I64, F32