# Data requirements

To execute the code, the following data are needed. Once downloaded, the path to the data must be specified in the `misc/config.py` file.

* [Ms-CoCo dataset (annotations and images)](http://cocodataset.org/#home)
* [Ms CoCo rest-val split](https://cs.stanford.edu/people/karpathy/deepimagesent/coco.zip) from "Deep Visual-Semantic Alignments for Generating Image Descriptions" by Karpathy et al.
* [Word embedding](http://www.cs.toronto.edu/~rkiros/models/utable.npy) and [dictionary](http://www.cs.toronto.edu/~rkiros/models/dictionary.txt) from the paper "Skip-Thought Vectors" by Kiros et al.
* [Pre-initialized weights of the image pipeline](https://cloud.lip6.fr/index.php/s/sEiwuVj7UXWwSjf)

## Additional data for localization evaluation

* [Visual Genome dataset (images, data and region descriptions)](https://visualgenome.org/)
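
Since every download listed in this document has to be referenced from `misc/config.py`, it can help to check the paths before launching the code. The following is a minimal sketch only: the dictionary keys, the example paths, and the `find_missing_paths` helper are all hypothetical and are not the actual variable names used in `misc/config.py`, so adapt them to whatever that file defines.

```python
import os

# Hypothetical entries: the real names in misc/config.py may differ.
# Replace each value with the location where the data was downloaded.
DATA_PATHS = {
    "coco_images": "/data/coco/images",
    "coco_annotations": "/data/coco/annotations",
    "coco_restval_split": "/data/coco/dataset.json",
    "word_embedding": "/data/skipthought/utable.npy",
    "word_dictionary": "/data/skipthought/dictionary.txt",
    "image_weights": "/data/weights/image_pipeline_weights",
    "visual_genome": "/data/visual_genome",
}


def find_missing_paths(paths):
    """Return the sorted names of entries whose path does not exist on disk."""
    return sorted(name for name, path in paths.items() if not os.path.exists(path))


if __name__ == "__main__":
    missing = find_missing_paths(DATA_PATHS)
    if missing:
        print("Missing data paths: " + ", ".join(missing))
    else:
        print("All data paths found.")
```

The Visual Genome entry is only needed for the localization evaluation, so it can be dropped from the dictionary otherwise.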