gchhablani committed
Commit 62389b1 • 1 Parent(s): 6a7132a
Update README.md
README.md CHANGED
@@ -39,7 +39,7 @@ The Spanish image captioning model was trained on a subset of the Conceptual 12M dataset
[Conceptual 12M](https://github.com/google-research-datasets/conceptual-12m), introduced by Changpinyo et al. in [Conceptual 12M: Pushing Web-Scale Image-Text Pre-Training To Recognize Long-Tail Visual Concepts](https://arxiv.org/abs/2102.08981).

### Please update the dataset link here

-The translated dataset can be downloaded from [conceptual-12m-multilingual-marian](https://huggingface.co/datasets/flax-community/conceptual-12m-multilingual-marian). We do not provide images, as we do not own any of them. One can download images from the `image_url` column of the original Conceptual 12M dataset.
+The translated dataset can be downloaded from [conceptual-12m-multilingual-marian-es](https://huggingface.co/datasets/flax-community/conceptual-12m-multilingual-marian-es). We do not provide images, as we do not own any of them. One can download images from the `image_url` column of the original Conceptual 12M dataset.
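
For reference (not part of the diff above), here is a minimal sketch of that download step. It assumes the dataset loads with the 🤗 `datasets` library, that a `train` split exists, and that `image_url` is the column named above; the output path and file-naming scheme are purely illustrative.

```python
# A minimal sketch, assuming a `train` split and the `image_url` column
# named in the README; everything else here is illustrative.
import os
from io import BytesIO

import requests
from datasets import load_dataset
from PIL import Image

os.makedirs("images", exist_ok=True)

# Streaming avoids materializing all caption rows locally.
dataset = load_dataset(
    "flax-community/conceptual-12m-multilingual-marian-es",
    split="train",  # assumption: split name
    streaming=True,
)

for i, example in enumerate(dataset):
    try:
        response = requests.get(example["image_url"], timeout=10)
        response.raise_for_status()
        image = Image.open(BytesIO(response.content)).convert("RGB")
        image.save(os.path.join("images", f"{i:08d}.jpg"))
    except Exception:
        # Many URLs are dead or return non-image payloads; skip them
        # (see the Data Cleaning section below).
        continue
```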
## Data Cleaning 🧹
Though the original dataset contains 12M image-text pairs, many of the URLs are now invalid and, in some cases, the images are corrupt or broken. We removed such examples, which left approximately 10M image-text pairs, from which we took only 2.5M image-caption pairs.
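
The commit does not include the cleaning script itself; below is a hedged sketch of the kind of validity check the paragraph describes, under the assumption that "valid" means the URL still resolves and the downloaded bytes decode to an image. The function name `is_valid_pair` is hypothetical.

```python
# A hedged sketch of the filtering described above, not the authors' code.
from io import BytesIO

import requests
from PIL import Image


def is_valid_pair(image_url: str) -> bool:
    """True if the URL resolves and its payload decodes as an image."""
    try:
        response = requests.get(image_url, timeout=10)
        response.raise_for_status()
        # PIL's verify() flags corrupt or truncated image files.
        Image.open(BytesIO(response.content)).verify()
        return True
    except Exception:
        return False  # dead link, timeout, or broken image
```

Applied over the 12M pairs (in practice, heavily parallelized), a check like this would leave the roughly 10M valid pairs mentioned above, from which the 2.5M training subset was drawn.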