How to use swadhindas324/swin-Mistral-UCM-without-captioning with Transformers:
    # Load model directly
    from transformers import AutoTokenizer, VEDM

    tokenizer = AutoTokenizer.from_pretrained("swadhindas324/swin-Mistral-UCM-without-captioning")
    model = VEDM.from_pretrained("swadhindas324/swin-Mistral-UCM-without-captioning")
swin-Mistral-UCM-captioning
This model is a fine-tuned version of an unspecified base model on an unknown dataset. It achieves the following results on the evaluation set:
- Loss: 0.7974
- Accuracy: 74.93
- Bleu-1: 0.8485
- Bleu-2: 0.7909
- Bleu-3: 0.7437
- Bleu-4: 0.7027
- Meteor: 0.8063
- Rouge-l: 0.8060
- Cider: 3.4543
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0001
- train_batch_size: 64
- eval_batch_size: 8
- seed: 50
- optimizer: OptimizerNames.ADAMW_TORCH_FUSED with betas=(0.9,0.999), epsilon=1e-08, and no additional optimizer arguments
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 1024
- num_epochs: 128
- mixed_precision_training: Native AMP
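The linear scheduler with warmup listed above can be sketched in plain Python to see how the learning rate evolves. This is a minimal sketch, not the training code: it assumes the usual linear-warmup, linear-decay shape, takes 148 optimizer steps per epoch from the step column of the training results, and assumes the schedule was sized for the full 128 configured epochs. The helper name `linear_lr` is hypothetical.

```python
# Hyperparameters from the list above.
BASE_LR = 1e-4
WARMUP_STEPS = 1024
NUM_EPOCHS = 128

# Assumption: 148 optimizer steps per epoch, read off the training results table.
STEPS_PER_EPOCH = 148
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS  # 18944 under that assumption


def linear_lr(step, base_lr=BASE_LR, warmup=WARMUP_STEPS, total=TOTAL_STEPS):
    """Learning rate at a given optimizer step (hypothetical helper)."""
    if step < warmup:
        # Linear ramp from 0 up to base_lr during warmup.
        return base_lr * step / max(1, warmup)
    # Linear decay from base_lr down to 0 at the final step.
    return base_lr * max(0.0, (total - step) / max(1, total - warmup))


if __name__ == "__main__":
    for epoch in (1, 7, 16):
        step = epoch * STEPS_PER_EPOCH
        print(f"epoch {epoch:>2} (step {step:>4}): lr = {linear_lr(step):.3e}")
```

Under these assumptions the warmup ends at step 1024, which falls between the end of epoch 6 (step 888) and the end of epoch 7 (step 1036), so the first logged epochs run below the peak learning rate.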
Training results
| Training Loss | Epoch | Step | Validation Loss | Accuracy | Bleu-1 | Bleu-2 | Bleu-3 | Bleu-4 | Meteor | Rouge-l | Cider |
|---|---|---|---|---|---|---|---|---|---|---|---|
| No log | 1.0 | 148 | 0.7681 | 69.91 | 0.3560 | 0.2384 | 0.1671 | 0.1215 | 0.2719 | 0.2982 | 0.4223 |
| No log | 2.0 | 296 | 0.6346 | 69.59 | 0.7218 | 0.6492 | 0.5907 | 0.5395 | 0.7064 | 0.6989 | 2.7599 |
| No log | 3.0 | 444 | 0.6058 | 75.06 | 0.8570 | 0.7971 | 0.7496 | 0.7049 | 0.8177 | 0.8083 | 3.3427 |
| No log | 4.0 | 592 | 0.6234 | 73.70 | 0.8099 | 0.7363 | 0.6792 | 0.6276 | 0.7820 | 0.7740 | 3.2220 |
| No log | 5.0 | 740 | 0.6354 | 73.61 | 0.8249 | 0.7579 | 0.7078 | 0.6650 | 0.7911 | 0.7806 | 3.2551 |
| No log | 6.0 | 888 | 0.6600 | 74.44 | 0.8429 | 0.7862 | 0.7373 | 0.6933 | 0.8238 | 0.8193 | 3.4049 |
| 0.5906 | 7.0 | 1036 | 0.6738 | 74.57 | 0.8390 | 0.7797 | 0.7328 | 0.6912 | 0.7946 | 0.7973 | 3.3445 |
| 0.5906 | 8.0 | 1184 | 0.7451 | 73.86 | 0.8469 | 0.7902 | 0.7409 | 0.6979 | 0.8163 | 0.8066 | 3.4162 |
| 0.5906 | 9.0 | 1332 | 0.7000 | 75.19 | 0.8503 | 0.7928 | 0.7448 | 0.7017 | 0.8172 | 0.8065 | 3.5555 |
| 0.5906 | 10.0 | 1480 | 0.7449 | 74.66 | 0.8552 | 0.7931 | 0.7448 | 0.7011 | 0.8157 | 0.8053 | 3.4820 |
| 0.5906 | 11.0 | 1628 | 0.7309 | 74.96 | 0.8498 | 0.8003 | 0.7594 | 0.7227 | 0.7923 | 0.7952 | 3.4882 |
| 0.5906 | 12.0 | 1776 | 0.7576 | 74.54 | 0.8356 | 0.7747 | 0.7271 | 0.6839 | 0.8087 | 0.8012 | 3.3285 |
| 0.5906 | 13.0 | 1924 | 0.7656 | 74.75 | 0.8474 | 0.7953 | 0.7535 | 0.7159 | 0.8160 | 0.8147 | 3.5224 |
| 0.2906 | 14.0 | 2072 | 0.7898 | 74.23 | 0.8329 | 0.7736 | 0.7267 | 0.6855 | 0.7933 | 0.7874 | 3.4272 |
| 0.2906 | 15.0 | 2220 | 0.8058 | 74.46 | 0.8438 | 0.7815 | 0.7309 | 0.6851 | 0.8047 | 0.7996 | 3.5010 |
| 0.2906 | 16.0 | 2368 | 0.7974 | 74.93 | 0.8485 | 0.7909 | 0.7437 | 0.7027 | 0.8063 | 0.8060 | 3.4543 |
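Since the metrics peak at different epochs, it can help to check which epoch is best per metric. The snippet below is a small illustrative sketch that copies a subset of columns from the table above (epoch, validation loss, accuracy, Bleu-4, Cider) and reports the best epoch for each. The names `ROWS`, `records`, and `best_epoch` are hypothetical and not part of the training code.

```python
# (epoch, validation_loss, accuracy, bleu4, cider), copied from the table above.
ROWS = [
    (1, 0.7681, 69.91, 0.1215, 0.4223),
    (2, 0.6346, 69.59, 0.5395, 2.7599),
    (3, 0.6058, 75.06, 0.7049, 3.3427),
    (4, 0.6234, 73.70, 0.6276, 3.2220),
    (5, 0.6354, 73.61, 0.6650, 3.2551),
    (6, 0.6600, 74.44, 0.6933, 3.4049),
    (7, 0.6738, 74.57, 0.6912, 3.3445),
    (8, 0.7451, 73.86, 0.6979, 3.4162),
    (9, 0.7000, 75.19, 0.7017, 3.5555),
    (10, 0.7449, 74.66, 0.7011, 3.4820),
    (11, 0.7309, 74.96, 0.7227, 3.4882),
    (12, 0.7576, 74.54, 0.6839, 3.3285),
    (13, 0.7656, 74.75, 0.7159, 3.5224),
    (14, 0.7898, 74.23, 0.6855, 3.4272),
    (15, 0.8058, 74.46, 0.6851, 3.5010),
    (16, 0.7974, 74.93, 0.7027, 3.4543),
]
FIELDS = ("epoch", "val_loss", "accuracy", "bleu4", "cider")
records = [dict(zip(FIELDS, row)) for row in ROWS]


def best_epoch(metric, higher_is_better=True):
    """Return the epoch whose row has the best value for `metric`."""
    pick = max if higher_is_better else min
    return pick(records, key=lambda r: r[metric])["epoch"]


summary = {
    "val_loss": best_epoch("val_loss", higher_is_better=False),
    "accuracy": best_epoch("accuracy"),
    "bleu4": best_epoch("bleu4"),
    "cider": best_epoch("cider"),
}

if __name__ == "__main__":
    for metric, epoch in summary.items():
        print(f"best {metric}: epoch {epoch}")
```

On the table as reported, validation loss is lowest at epoch 3, accuracy and Cider are highest at epoch 9, and Bleu-4 is highest at epoch 11, so the final epoch-16 checkpoint is not the best on any of these four metrics.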
Framework versions
- Transformers 5.12.1
- Pytorch 2.12.0+cu130
- Datasets 5.0.0
- Tokenizers 0.22.2