g8a9 committed on
Commit
c158f70
2 Parent(s): bd6347d aacfe19

Merge branch 'main' of https://huggingface.co/spaces/clip-italian/clip-italian-demo into main

Files changed (1)
  1. introduction.md +6 -3
introduction.md CHANGED
@@ -66,12 +66,11 @@ We considered four main sources of data:
  [Srinivasan et al., 2021](https://arxiv.org/pdf/2103.01913.pdf)). We focused on the *Reference Description* captions
  described in the paper as they are the ones of highest quality. Nonetheless, many of these captions describe ontological knowledge and encyclopedic facts (e.g., Roberto Baggio in 1994).
  However, this kind of text, without more information, is not useful to learn a good mapping between images and captions.
- On the other hand, this text is written in Italian and it is of good quality. We cannot just remove short captions as some of those
- are still good (e.g., "running dog"). Thus, to prevent polluting the data with captions that are not meaningful, we used *POS tagging*
+ To prevent polluting the data with captions that are not meaningful, we used *POS tagging*
  on the text and removed all the captions that were composed for the 80% or more by PROPN (around ~10% of the data). This is a simple solution that allowed us to retain much
  of the dataset, without introducing noise.
 
- Captions like: *'Dora Riparia', 'Anna Maria Mozzoni', 'Joey Ramone Place', 'Kim Rhodes', 'Ralph George Hawtrey' * have been removed.
+ Captions like *'Dora Riparia', 'Anna Maria Mozzoni', 'Joey Ramone Place', 'Kim Rhodes', 'Ralph George Hawtrey' * have been removed.
 
  + [MSCOCO-IT](https://github.com/crux82/mscoco-it). This image-caption dataset comes from the work by [Scaiella et al., 2019](http://www.ai-lc.it/IJCoL/v5n2/IJCOL_5_2_3___scaiella_et_al.pdf). The captions come from the original
  MSCOCO dataset and have been translated with Microsoft Translator. The 2017 version of the MSCOCO training set contains more than
@@ -265,16 +264,20 @@ Look at the following - slightly cherry picked - examples:
 
  ### Colors
  Here's "a yellow flower"
+
  <img src="https://huggingface.co/spaces/clip-italian/clip-italian-demo/raw/main/static/img/fiore_giallo.png" alt="drawing" width="500"/>
 
  And here's "a blue flower"
+
  <img src="https://huggingface.co/spaces/clip-italian/clip-italian-demo/raw/main/static/img/fiore_blu.png" alt="drawing" width="500"/>
 
  ### Counting
  What about "one cat"?
+
  <img src="https://huggingface.co/spaces/clip-italian/clip-italian-demo/raw/main/static/img/gatto.png" alt="drawing" width="500"/>
 
  And what about "two cats"?
+
  <img src="https://huggingface.co/spaces/clip-italian/clip-italian-demo/raw/main/static/img/due_gatti.png" alt="drawing" width="500"/>
 
  ### Complex Queries
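
As a note on the first hunk: the caption filter it describes (drop captions made up of 80% or more PROPN tokens) could be sketched roughly as below. This is only an illustration, not the project's actual code. It assumes captions have already been tagged with Universal POS labels by some tagger, which the diff does not name; the function names `propn_ratio` and `keep_caption` and the hand-assigned tags in the sample data are hypothetical.

```python
from typing import Sequence, Tuple

# A tagged caption: (token, universal POS tag) pairs.
# Assumption: tags come from an external POS tagger not named in the diff.
Tagged = Sequence[Tuple[str, str]]


def propn_ratio(tagged: Tagged) -> float:
    """Return the fraction of tokens tagged PROPN (0.0 for an empty caption)."""
    if not tagged:
        return 0.0
    return sum(1 for _, pos in tagged if pos == "PROPN") / len(tagged)


def keep_caption(tagged: Tagged, threshold: float = 0.8) -> bool:
    """Keep a caption unless PROPN tokens make up `threshold` or more of it."""
    return propn_ratio(tagged) < threshold


# Hypothetical, hand-tagged sample data for illustration only.
captions = {
    "Anna Maria Mozzoni": [("Anna", "PROPN"), ("Maria", "PROPN"), ("Mozzoni", "PROPN")],
    "Dora Riparia": [("Dora", "PROPN"), ("Riparia", "PROPN")],
    "cane che corre": [("cane", "NOUN"), ("che", "PRON"), ("corre", "VERB")],
}

kept = [text for text, tagged in captions.items() if keep_caption(tagged)]
print(kept)
```

Filtering on the share of PROPN tokens rather than on caption length keeps short but descriptive captions while dropping captions that are just names.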