How to use swadhindas324/vit-Mistral-UCM-without-captioning with Transformers:
```python
# Load model directly
from transformers import AutoTokenizer, VEDM

tokenizer = AutoTokenizer.from_pretrained("swadhindas324/vit-Mistral-UCM-without-captioning")
model = VEDM.from_pretrained("swadhindas324/vit-Mistral-UCM-without-captioning")
```
vit-Mistral-UCM-without-captioning
This model was fine-tuned from an unspecified base model on an unknown dataset. It achieves the following results on the evaluation set:
- Loss: 0.7752
- Accuracy: 74.26
- Bleu-1: 0.8243
- Bleu-2: 0.7645
- Bleu-3: 0.7164
- Bleu-4: 0.6709
- Meteor: 0.8175
- Rouge-l: 0.7933
- Cider: 3.3030
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0001
- train_batch_size: 64
- eval_batch_size: 64
- seed: 50
- optimizer: OptimizerNames.ADAMW_TORCH_FUSED with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 0.1
- num_epochs: 128
- mixed_precision_training: Native AMP
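To make the scheduler settings above concrete, here is a minimal sketch of a linear schedule with warmup in plain Python. It rests on several assumptions that the card does not confirm: that the warmup value of 0.1 is a fraction of total training steps rather than a step count, that the run was planned for all 128 epochs, and that each epoch has 148 optimizer steps (taken from the Step column of the training results). The function name `linear_lr` is hypothetical.

```python
import math


def linear_lr(step, base_lr=1e-4, total_steps=148 * 128, warmup=0.1):
    """Learning rate at a given optimizer step for linear warmup, then linear decay.

    Assumption: a warmup value below 1 is a fraction of total_steps;
    a value of 1 or more is an absolute number of warmup steps.
    """
    if warmup < 1:
        warmup_steps = math.ceil(warmup * total_steps)
    else:
        warmup_steps = int(warmup)

    if step < warmup_steps:
        # Ramp linearly from 0 up to base_lr over the warmup phase.
        return base_lr * step / max(1, warmup_steps)

    # Decay linearly from base_lr down to 0 at the final step.
    remaining = total_steps - step
    return base_lr * max(0.0, remaining / max(1, total_steps - warmup_steps))


if __name__ == "__main__":
    # Steps logged at the end of epochs 1, 7, and 19 in the results table.
    for s in (148, 1036, 2812):
        print(s, linear_lr(s))
```

Under these assumptions the first logged epochs fall inside the warmup phase, so the learning rate is still rising at that point.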
Training results
| Training Loss | Epoch | Step | Validation Loss | Accuracy | Bleu-1 | Bleu-2 | Bleu-3 | Bleu-4 | Meteor | Rouge-l | Cider |
|---|---|---|---|---|---|---|---|---|---|---|---|
| No log | 1.0 | 148 | 0.8768 | 72.09 | 0.4152 | 0.2825 | 0.2067 | 0.1530 | 0.3201 | 0.3517 | 0.3399 |
| No log | 2.0 | 296 | 0.7330 | 72.01 | 0.4497 | 0.3349 | 0.2616 | 0.2125 | 0.3385 | 0.3896 | 0.9183 |
| No log | 3.0 | 444 | 0.5821 | 72.81 | 0.8068 | 0.7453 | 0.6955 | 0.6512 | 0.7747 | 0.7695 | 3.1554 |
| No log | 4.0 | 592 | 0.5829 | 73.5 | 0.8522 | 0.7882 | 0.7329 | 0.6823 | 0.8173 | 0.8069 | 3.3459 |
| No log | 5.0 | 740 | 0.6105 | 72.64 | 0.8241 | 0.7669 | 0.7169 | 0.6723 | 0.8029 | 0.7968 | 3.2587 |
| No log | 6.0 | 888 | 0.6279 | 74.23 | 0.8280 | 0.7742 | 0.7272 | 0.6845 | 0.8246 | 0.8096 | 3.3476 |
| 0.6501 | 7.0 | 1036 | 0.6323 | 74.45 | 0.8648 | 0.8175 | 0.7766 | 0.7390 | 0.8395 | 0.8265 | 3.5956 |
| 0.6501 | 8.0 | 1184 | 0.6660 | 74.41 | 0.8551 | 0.7956 | 0.7462 | 0.7025 | 0.8141 | 0.8074 | 3.4659 |
| 0.6501 | 9.0 | 1332 | 0.6622 | 74.14 | 0.8649 | 0.8160 | 0.7718 | 0.7337 | 0.8467 | 0.8354 | 3.5959 |
| 0.6501 | 10.0 | 1480 | 0.6729 | 74.22 | 0.8345 | 0.7724 | 0.7250 | 0.6831 | 0.7891 | 0.7843 | 3.4044 |
| 0.6501 | 11.0 | 1628 | 0.7075 | 73.45 | 0.8164 | 0.7500 | 0.6938 | 0.6433 | 0.7997 | 0.7847 | 3.2171 |
| 0.6501 | 12.0 | 1776 | 0.7030 | 73.81 | 0.8256 | 0.7744 | 0.7298 | 0.6853 | 0.8056 | 0.7936 | 3.3060 |
| 0.6501 | 13.0 | 1924 | 0.7365 | 73.25 | 0.8213 | 0.7630 | 0.7178 | 0.6766 | 0.7975 | 0.7907 | 3.3430 |
| 0.2977 | 14.0 | 2072 | 0.7188 | 74.21 | 0.8429 | 0.7895 | 0.7433 | 0.7005 | 0.8267 | 0.8081 | 3.3791 |
| 0.2977 | 15.0 | 2220 | 0.7222 | 74.57 | 0.8542 | 0.8002 | 0.7551 | 0.7118 | 0.8351 | 0.8270 | 3.5997 |
| 0.2977 | 16.0 | 2368 | 0.7646 | 74.77 | 0.8527 | 0.7977 | 0.7511 | 0.7078 | 0.8247 | 0.8165 | 3.4923 |
| 0.2977 | 17.0 | 2516 | 0.7979 | 73.87 | 0.8278 | 0.7664 | 0.7173 | 0.6720 | 0.8007 | 0.7841 | 3.3060 |
| 0.2977 | 18.0 | 2664 | 0.7695 | 74.56 | 0.8475 | 0.7911 | 0.7435 | 0.6987 | 0.8262 | 0.8140 | 3.4166 |
| 0.2977 | 19.0 | 2812 | 0.7752 | 74.26 | 0.8243 | 0.7645 | 0.7164 | 0.6709 | 0.8175 | 0.7933 | 3.3030 |
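The headline results at the top of the card match the final row of this table (epoch 19), not the best row for any single metric. The sketch below shows one way to locate the best epoch per metric. It copies three columns (Validation Loss, Bleu-4, Cider) from the table; the `rows` list and the helper `best_epoch` are illustrative names, not part of the training code.

```python
# (epoch, validation_loss, bleu4, cider), copied from the table above.
rows = [
    (1, 0.8768, 0.1530, 0.3399), (2, 0.7330, 0.2125, 0.9183),
    (3, 0.5821, 0.6512, 3.1554), (4, 0.5829, 0.6823, 3.3459),
    (5, 0.6105, 0.6723, 3.2587), (6, 0.6279, 0.6845, 3.3476),
    (7, 0.6323, 0.7390, 3.5956), (8, 0.6660, 0.7025, 3.4659),
    (9, 0.6622, 0.7337, 3.5959), (10, 0.6729, 0.6831, 3.4044),
    (11, 0.7075, 0.6433, 3.2171), (12, 0.7030, 0.6853, 3.3060),
    (13, 0.7365, 0.6766, 3.3430), (14, 0.7188, 0.7005, 3.3791),
    (15, 0.7222, 0.7118, 3.5997), (16, 0.7646, 0.7078, 3.4923),
    (17, 0.7979, 0.6720, 3.3060), (18, 0.7695, 0.6987, 3.4166),
    (19, 0.7752, 0.6709, 3.3030),
]

COLUMNS = {"validation_loss": 1, "bleu4": 2, "cider": 3}


def best_epoch(metric, higher_is_better=True):
    """Return the (epoch, value) pair with the best score for the given metric."""
    idx = COLUMNS[metric]
    pick = max if higher_is_better else min
    row = pick(rows, key=lambda r: r[idx])
    return row[0], row[idx]


if __name__ == "__main__":
    print("lowest validation loss:", best_epoch("validation_loss", higher_is_better=False))
    print("highest Bleu-4:", best_epoch("bleu4"))
    print("highest Cider:", best_epoch("cider"))
```

Validation loss bottoms out early while the captioning metrics peak later, so the choice of selection metric changes which checkpoint looks best.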
Framework versions
- Transformers 5.12.1
- Pytorch 2.12.1+cu130
- Datasets 5.0.0
- Tokenizers 0.22.2