---
language: it
tags:
- wit
- ctl/conceptualCaptions
- mscoco-it
- italian
- bert
- vit
- vision
---
# CLIP-Italian
CLIP-Italian is a CLIP-like model for Italian. The CLIP model (Contrastive Language–Image Pre-training) was developed by researchers at OpenAI and efficiently learns visual concepts from natural language supervision.
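As a rough sketch of the contrastive objective behind CLIP-style training (the function name, dimensions, and temperature below are illustrative, not taken from the actual training code):

```python
import numpy as np

def clip_contrastive_loss(image_emb, text_emb, temperature=0.07):
    """Symmetric InfoNCE loss over a batch of matched image-text pairs.

    image_emb, text_emb: (batch, dim) arrays; row i of each is a matched pair.
    """
    # L2-normalize so the dot product is a cosine similarity.
    image_emb = image_emb / np.linalg.norm(image_emb, axis=1, keepdims=True)
    text_emb = text_emb / np.linalg.norm(text_emb, axis=1, keepdims=True)

    # (batch, batch) similarity matrix; the diagonal holds the matched pairs.
    logits = image_emb @ text_emb.T / temperature

    # Cross-entropy pulling each row's diagonal entry above the rest,
    # applied in both directions: image-to-text and text-to-image.
    labels = np.arange(len(logits))

    def cross_entropy(l):
        log_probs = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -log_probs[labels, labels].mean()

    return (cross_entropy(logits) + cross_entropy(logits.T)) / 2
```

Matched pairs (near-identical embeddings) should yield a lower loss than randomly paired embeddings, which is what drives the encoders to align images with their captions.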
We fine-tuned a competitive Italian CLIP model with only ~1.4 million Italian image-text pairs. This model was developed as part of the [Flax/JAX Community Week](), organized by [HuggingFace](), with TPU usage sponsored by Google.
## Training Data
We considered three main sources of data:
- [WIT]()
- [Conceptual Captions]()
- [MSCOCO-IT]()
## Training Procedure
Preprocessing, hardware used, hyperparameters...
## Evaluation Performance
## Limitations
## Usage
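Usage details depend on the final model repository; as a minimal sketch of how a CLIP-like model is applied at inference time for zero-shot classification (the embeddings below are placeholders standing in for the actual vision and text encoder outputs, and all names are illustrative):

```python
import numpy as np

def zero_shot_scores(image_emb, caption_embs, temperature=0.07):
    """Return a probability distribution over candidate captions for one image.

    image_emb: (dim,) placeholder for the vision encoder output.
    caption_embs: (num_captions, dim) placeholder text encoder outputs,
    e.g. for Italian prompts like "una foto di un gatto".
    """
    # Normalize so dot products are cosine similarities.
    image_emb = image_emb / np.linalg.norm(image_emb)
    caption_embs = caption_embs / np.linalg.norm(caption_embs, axis=1, keepdims=True)

    # Temperature-scaled similarities, turned into probabilities.
    logits = caption_embs @ image_emb / temperature
    exp = np.exp(logits - logits.max())  # subtract max for numerical stability
    return exp / exp.sum()
```

The predicted label is simply the caption with the highest probability (`np.argmax` over the returned scores).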
## Team members
- Federico Bianchi ([vinid]())
- Raphael Pisoni ([4rtemi5]())
- Giuseppe Attanasio ([g8a9]())
- Silvia Terragni ([silviatti]())
- Dario Balestri ([D3Reo]())
- Gabriele Sarti ([gsarti]())
- Sri Lakshmi ([srisweet]())
## Useful links
- [CLIP Blog post]()
- [CLIP paper]()
- [Community Week README]()
- [Community Week channel]()
- [Hybrid CLIP example scripts]()
- [Model Repository]()