update text for IRR
introduction.md CHANGED +16 -5
@@ -238,19 +238,30 @@ can be a killer app in this context, providing a way to search for images and te
 of photos in digital format. For example, the [Istituto Luce Cinecittà](https://it.wikipedia.org/wiki/Istituto_Luce_Cinecitt%C3%A0) is an Italian governmental entity that has collected photos of Italy since the
 early 1900s and is part of the largest movie studio in Europe (Cinecittà).

+# Limitations

+Currently, the model is not without limits. To mention one, its counting capabilities seem very cool, but from our experiments the model
+finds it difficult to count beyond three; this is a general limitation.
+There are even more serious limitations: we found some emergence of biases and stereotypes that entered our model from different sources. For example, searching for "una troia" ("a bitch") on the
+CC dataset shows the picture of a woman. This issue is common to many machine learning algorithms (check [Abid et al., 2021](https://arxiv.org/abs/2101.05783) for bias in GPT-3 as an example) and
+suggests we need to work even harder on this problem that affects our **society**.

+# References

+Abid, A., Farooqi, M., & Zou, J. (2021). [Persistent anti-Muslim bias in large language models.](https://arxiv.org/abs/2101.05783) arXiv preprint arXiv:2101.05783.

 Gwet, K. L. (2008). [Computing inter‐rater reliability and its variance in the presence of high agreement.](https://bpspsychub.onlinelibrary.wiley.com/doi/full/10.1348/000711006X126600) British Journal of Mathematical and Statistical Psychology, 61(1), 29-48.

+Radford, A., Kim, J. W., Hallacy, C., Ramesh, A., Goh, G., Agarwal, S., Sastry, G., Askell, A., Mishkin, P., Clark, J., Krueger, G., & Sutskever, I. (2021). [Learning Transferable Visual Models From Natural Language Supervision.](https://arxiv.org/abs/2103.00020) ICML.
+
 Reimers, N., & Gurevych, I. (2020, November). [Making Monolingual Sentence Embeddings Multilingual Using Knowledge Distillation.](https://aclanthology.org/2020.emnlp-main.365/) In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP) (pp. 4512-4525).

+Scaiella, A., Croce, D., & Basili, R. (2019). [Large scale datasets for Image and Video Captioning in Italian.](http://www.ai-lc.it/IJCoL/v5n2/IJCOL_5_2_3___scaiella_et_al.pdf) IJCoL. Italian Journal of Computational Linguistics, 5(5-2), 49-60.
+
+Sharma, P., Ding, N., Goodman, S., & Soricut, R. (2018, July). [Conceptual captions: A cleaned, hypernymed, image alt-text dataset for automatic image captioning.](https://aclanthology.org/P18-1238.pdf) In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) (pp. 2556-2565).
+
+Srinivasan, K., Raman, K., Chen, J., Bendersky, M., & Najork, M. (2021). [WIT: Wikipedia-based image text dataset for multimodal multilingual machine learning.](https://arxiv.org/pdf/2103.01913.pdf) arXiv preprint arXiv:2103.01913.
+

 # Other Notes
 This readme has been designed using resources from Flaticon.com