Datasets

We provide links to download our preprocessed datasets. If you would prefer to process the data yourself, we will soon provide scripts for doing so.

Pretraining

The pretraining datasets used in OFA are all publicly available. Below are the public links to these data. We recommend that you first download the data from these links and then process it into a format similar to the examples we provide.

Vision & Language Tasks

Vision Tasks

Language Tasks