The original CLIP model was trained on 400 million text-image pairs; this amount of data is currently not available for Italian.

## More Data

We eventually had to deal with the fact that we did not have the same amount of data that OpenAI used to train CLIP. We therefore opted for data of medium-to-high quality.

We considered three main sources of data:

+ WIT. Most of these captions describe ontological knowledge and encyclopedic facts (e.g., Roberto Baggio in 1994). However, this kind of text, without additional context, is not very useful for learning a good mapping between images and captions. On the other hand, the text is written in Italian and is of good quality. To prevent polluting the data with captions that are not meaningful, we ran POS tagging on the data and removed all captions composed of 80% or more proper nouns (PROPN); a sketch of such a filter follows this list.
+ MSCOCO-IT
+ CC
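
As a rough sketch of the PROPN filter described in the WIT bullet above (the spaCy Italian pipeline and the exact threshold handling are assumptions, not necessarily the tools we used), caption filtering can look like this:

```python
# Illustrative caption filter: drop captions whose tokens are 80% or more
# proper nouns (PROPN). spaCy's Italian model is an assumption here.
import spacy

nlp = spacy.load("it_core_news_sm")

def keep_caption(caption: str, max_propn_ratio: float = 0.8) -> bool:
    doc = nlp(caption)
    tokens = [tok for tok in doc if not tok.is_punct and not tok.is_space]
    if not tokens:
        return False
    propn_ratio = sum(tok.pos_ == "PROPN" for tok in tokens) / len(tokens)
    return propn_ratio < max_propn_ratio

captions = ["Roberto Baggio nel 1994", "Un gatto che dorme su un divano rosso"]
kept = [c for c in captions if keep_caption(c)]
```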
## Better Augmentations
## Better Training
### Optimizer
### Backbone Freezing
![Backbone Freezing](static/img/clip-italian.png)
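
The figure above summarises the idea. As a generic illustration (module names like `vision_model` and `text_model` are hypothetical, and this is a PyTorch-style sketch rather than our training code), freezing the backbones means excluding their parameters from optimisation:

```python
# Generic sketch of backbone freezing: the pretrained image and text encoders
# are frozen, and only the remaining layers (e.g. the projection heads) are
# updated. Attribute names are assumptions for illustration.
import torch

def freeze_backbones(model: torch.nn.Module) -> None:
    for encoder in (model.vision_model, model.text_model):
        for param in encoder.parameters():
            param.requires_grad = False

def trainable_parameters(model: torch.nn.Module):
    # Only the parameters left unfrozen reach the optimizer.
    return (p for p in model.parameters() if p.requires_grad)
```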
# Scientific Validity

Those images are definitely cool and interesting, but a model is nothing without validation. To better understand how well our clip-italian model works, we ran an experimental evaluation. Since this is the first CLIP-based model in Italian, we used the multilingual CLIP model (mCLIP) as a comparison baseline.
## mCLIP
## Tasks
We selected two different tasks:
+ image-retrieval
+ zero-shot classification
## Image Retrieval

| MRR    | CLIP-Italian | mCLIP |
| ------ | ------------ | ----- |
| MRR@1  |              |       |
| MRR@5  |              |       |
| MRR@10 |              |       |
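
As a reference for the metric in the table, here is a minimal sketch of how MRR@K can be computed for caption-to-image retrieval (an illustration under simplifying assumptions, not our exact evaluation script; caption i is assumed to match image i):

```python
# Illustrative MRR@K for retrieval. `similarity` is a (num_captions, num_images)
# score matrix; the correct image for caption i is assumed to be image i.
import numpy as np

def mrr_at_k(similarity: np.ndarray, k: int) -> float:
    ranking = np.argsort(-similarity, axis=1)        # best-scoring images first
    reciprocal_ranks = []
    for i, ranked_images in enumerate(ranking):
        match = np.where(ranked_images[:k] == i)[0]  # position of the correct image in the top-k
        reciprocal_ranks.append(1.0 / (match[0] + 1) if match.size else 0.0)
    return float(np.mean(reciprocal_ranks))
```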
## Zero-shot classification

| Accuracy     | CLIP-Italian | mCLIP |
| ------------ | ------------ | ----- |
| Accuracy@1   |              |       |
| Accuracy@5   |              |       |
| Accuracy@10  |              |       |
| Accuracy@100 | 81.08        | 67.11 |
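
To make the task concrete, here is a hedged sketch of zero-shot classification with a CLIP-style model (the Italian prompt template and the `encode_text` / `encode_image` helpers are assumptions, not part of this repository): class names become text prompts, and the image is assigned to the classes with the most similar text embeddings. Accuracy@K then checks whether the true label appears among the top K predictions.

```python
# Illustrative zero-shot classification with a CLIP-style model. `encode_text`
# and `encode_image` are assumed callables returning L2-normalised embeddings;
# the prompt template is also an assumption.
import numpy as np

def zero_shot_topk(image, class_names, encode_text, encode_image, k=5):
    prompts = [f"una foto di {name}" for name in class_names]
    text_emb = encode_text(prompts)      # shape: (num_classes, dim)
    image_emb = encode_image(image)      # shape: (dim,)
    scores = text_emb @ image_emb        # cosine similarity, since embeddings are normalised
    top = np.argsort(-scores)[:k]
    return [class_names[i] for i in top]
```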
# Broader Outlook
This README has been designed using resources from Flaticon.com.