srisweet commited on
Commit
12ddc30
1 Parent(s): 705e8fa

Update introduction.md

Browse files
Files changed (1) hide show
  1. introduction.md +2 -2
introduction.md CHANGED
@@ -1,10 +1,10 @@
1
 
2
- CLIP-Italian is a multimodal model trained on ~1.4 million Italian text-image pairs using Italian Bert model as text encoder and Vision Transformer(ViT) as image encoder using the JAX/Flax neural network library. The training was carried out during the Hugging Face Community event on Google's TPU machines, sponsored by Google Cloud.
3
 
4
  Clip-Italian (Contrastive Language-Image Pre-training in Italian language) is based on OpenAI’s CLIP ([Radford et al., 2021](https://arxiv.org/abs/2103.00020))which is an amazing model that can learn to represent images and text jointly in the same space.
5
 
6
  In this project, we aim to propose the first CLIP model trained on Italian data, that in this context can be considered a
7
- low resource language. Using a few techniques, we have been able to fine-tune a SOTA Italian CLIP model with **only 1.4 million** training samples. Our Italian CLIP model
8
  is built upon the pre-trained [Italian BERT](https://huggingface.co/dbmdz/bert-base-italian-xxl-cased) model provided by [dbmdz](https://huggingface.co/dbmdz) and the OpenAI
9
  [vision transformer](https://huggingface.co/openai/clip-vit-base-patch32).
10
 
 
1
 
2
+ CLIP-Italian is a **multimodal** model trained on **~1.4 Million** Italian text-image pairs using **Italian Bert** model as text encoder and Vision Transformer **ViT** as image encoder using the **JAX/Flax** neural network library. The training was carried out during the **Hugging Face** Community event on **Google's TPU** machines, sponsored by **Google Cloud**.
3
 
4
  Clip-Italian (Contrastive Language-Image Pre-training in Italian language) is based on OpenAI’s CLIP ([Radford et al., 2021](https://arxiv.org/abs/2103.00020))which is an amazing model that can learn to represent images and text jointly in the same space.
5
 
6
  In this project, we aim to propose the first CLIP model trained on Italian data, that in this context can be considered a
7
+ low resource language. Using a few techniques, we have been able to fine-tune a SOTA Italian CLIP model with **only 1.4M** training samples. Our Italian CLIP model
8
  is built upon the pre-trained [Italian BERT](https://huggingface.co/dbmdz/bert-base-italian-xxl-cased) model provided by [dbmdz](https://huggingface.co/dbmdz) and the OpenAI
9
  [vision transformer](https://huggingface.co/openai/clip-vit-base-patch32).
10