Silvia Terragni committed on
Commit
bb7e7d5
1 Parent(s): 413a2a7

fixed typos

Files changed (1)
  1. introduction.md +2 -2
introduction.md CHANGED
@@ -84,7 +84,7 @@ Each photo comes along with an Italian caption.
 
  Instead of relying on open-source translators, we decided to use DeepL. **Translation quality** of the data was the main
  reason of this choice. With the few images (wrt OpenAI) that we have, we cannot risk polluting our own data. CC is a great resource
- but the captions have to be handled accordingly. We translated 700K captions and we evaluated their quality:
+ but the captions have to be handled accordingly. We translated 700K captions and we evaluated their quality.
 
  Three of us looked at a sample of 100 of the translations and rated them with scores from 1 to 4.
  The meaning of the value is as follows: 1, the sentence has lost is meaning or it's not possible to understand it; 2, it is possible to get the idea
@@ -99,7 +99,7 @@ weighting - of 0.858 (great agreement!).
  | person walking down the aisle | persona che cammina lungo la navata |
  | popular rides at night at the county fair | giostre popolari di notte alla fiera della contea |
 
- \t\t\t
+
  We know that we annotated our own data; in the spirit of fairness we also share the annotations and the captions so
  that those interested can check the quality. The Google Sheet is [here](https://docs.google.com/spreadsheets/d/1m6TkcpJbmJlEygL7SXURIq2w8ZHuVvsmdEuCIH0VENk/edit?usp=sharing).
 
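Note on the agreement figure: the second hunk header quotes a weighted agreement score of 0.858 for the three annotators, but the visible lines do not say which measure was used. The following is only a minimal sketch of one common option, linearly weighted Cohen's kappa averaged over annotator pairs, on the 1 to 4 scale described in the text. The function names and the toy ratings are hypothetical and are not taken from the repository or from the shared Google Sheet.

```python
from itertools import combinations


def weighted_kappa(a, b, categories=(1, 2, 3, 4)):
    """Linearly weighted Cohen's kappa for two raters on an ordinal scale.

    Assumption: this is an illustrative measure; the source does not state
    which weighted agreement statistic the authors computed.
    """
    if len(a) != len(b) or not a:
        raise ValueError("rating lists must be non-empty and of equal length")
    k = len(categories)
    index = {c: i for i, c in enumerate(categories)}
    n = len(a)

    # Observed joint distribution of the two raters' scores.
    observed = [[0.0] * k for _ in range(k)]
    for x, y in zip(a, b):
        observed[index[x]][index[y]] += 1.0 / n

    # Marginal distribution of each rater.
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    # Linear disagreement weight: 0 on the diagonal, 1 at maximum distance.
    def weight(i, j):
        return abs(i - j) / (k - 1)

    cells = [(i, j) for i in range(k) for j in range(k)]
    disagree_obs = sum(weight(i, j) * observed[i][j] for i, j in cells)
    disagree_exp = sum(weight(i, j) * row[i] * col[j] for i, j in cells)
    if disagree_exp == 0:
        return 1.0  # both raters used one and the same category throughout
    return 1.0 - disagree_obs / disagree_exp


def mean_pairwise_kappa(ratings):
    """Average the weighted kappa over every pair of raters."""
    pairs = list(combinations(ratings, 2))
    return sum(weighted_kappa(a, b) for a, b in pairs) / len(pairs)


if __name__ == "__main__":
    # Hypothetical toy ratings for three annotators (not the real annotations).
    annotators = [
        [4, 4, 3, 2, 1, 4, 3, 4],
        [4, 3, 3, 2, 1, 4, 4, 4],
        [4, 4, 3, 1, 1, 4, 3, 3],
    ]
    print(round(mean_pairwise_kappa(annotators), 3))
```

Linear weights penalise a 4-versus-3 disagreement less than a 4-versus-1 disagreement, which suits an ordinal quality scale. Running the same function on the real annotation columns would require exporting them from the shared sheet first.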