---
license: mit
---
# Model Card for CLIP_COCO
## Model Description
- **Homepage:** https://imirandam.github.io/BiVLC_project_page/
- **Repository:** https://github.com/IMirandaM/BiVLC
- **Paper:** https://arxiv.org/abs/2406.09952
- **Point of Contact:** [Imanol Miranda](mailto:imanol.miranda@ehu.eus)
### Model Summary
CLIP_COCO is a model presented in the [BiVLC](https://github.com/IMirandaM/BiVLC) paper for experimentation. It was fine-tuned with the OpenCLIP framework, starting from the CLIP ViT-B-32 model pre-trained by 'openai'. The purpose of this fine-tuning is to provide a baseline against which to compare the [CLIP_TROHN-Text](https://huggingface.co/imirandam/CLIP_TROHN-Text) and [CLIP_TROHN-Img](https://huggingface.co/imirandam/CLIP_TROHN-Img) models. Hyperparameters:
* Learning rate: 1e-6.
* Scheduler: Cosine scheduler with 50 warmup steps.
* Optimizer: AdamW optimizer with beta1 = 0.9, beta2 = 0.98, eps = 1e-6 and weight decay = 0.1.
* Loss function: InfoNCE loss (a minimal sketch is shown after this list).
* Batch size: 400, i.e. each batch pairs 400 images with 400 captions for the contrastive loss.
* Epochs: All models are fine-tuned for 10 epochs; we use validation accuracy as the model selection criterion, i.e. we select the checkpoint with the highest accuracy on the corresponding validation set.
* Data: The model is fine-tuned on the COCO 2017 train split.
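As a reference for the loss function listed above, here is a minimal sketch of the symmetric InfoNCE objective used in CLIP-style contrastive training, written in PyTorch. The function name and the assumption that the features are already L2-normalized are illustrative, not taken from the BiVLC codebase:

```python
import torch
import torch.nn.functional as F

def clip_info_nce_loss(image_features, text_features, logit_scale):
    """Symmetric InfoNCE loss for CLIP-style contrastive fine-tuning.

    image_features, text_features: (batch, dim) tensors, assumed L2-normalized.
    logit_scale: learned temperature (scalar), already exponentiated.
    """
    # Similarity matrix: every image against every caption in the batch
    # (e.g. 400 images x 400 captions for a batch size of 400).
    logits_per_image = logit_scale * image_features @ text_features.t()
    logits_per_text = logits_per_image.t()

    # Matching image-caption pairs lie on the diagonal.
    labels = torch.arange(image_features.shape[0], device=image_features.device)

    # Average the image-to-text and text-to-image cross-entropy terms.
    return (F.cross_entropy(logits_per_image, labels) +
            F.cross_entropy(logits_per_text, labels)) / 2
```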
### Evaluation Data
The model is evaluated on [BiVLC](https://huggingface.co/datasets/imirandam/BiVLC).
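For reference, the snippet below is a minimal sketch of how a fine-tuned checkpoint like this could be loaded for evaluation with OpenCLIP. It assumes the released weights are an OpenCLIP-compatible ViT-B-32 state dict; the checkpoint path, image file, and captions are placeholders:

```python
import torch
from PIL import Image
import open_clip

# Placeholder path; assumes an OpenCLIP-format ViT-B-32 checkpoint.
model, _, preprocess = open_clip.create_model_and_transforms(
    "ViT-B-32", pretrained="path/to/CLIP_COCO_checkpoint.pt"
)
tokenizer = open_clip.get_tokenizer("ViT-B-32")
model.eval()

image = preprocess(Image.open("example.jpg")).unsqueeze(0)
text = tokenizer(["a photo of a dog", "a photo of a cat"])

with torch.no_grad():
    image_features = model.encode_image(image)
    text_features = model.encode_text(text)
    image_features /= image_features.norm(dim=-1, keepdim=True)
    text_features /= text_features.norm(dim=-1, keepdim=True)
    # Image-to-text retrieval probabilities over the candidate captions.
    probs = (100.0 * image_features @ text_features.T).softmax(dim=-1)
print(probs)
```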
### Licensing Information
This work is licensed under the MIT License.
## Citation Information
If you find this model useful, please consider citing our paper:
```
@misc{miranda2024bivlc,
  title={BiVLC: Extending Vision-Language Compositionality Evaluation with Text-to-Image Retrieval},
  author={Imanol Miranda and Ander Salaberria and Eneko Agirre and Gorka Azkune},
  year={2024},
  eprint={2406.09952},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}
```