Finetuning Data
We use GeoChat-Instruct to finetune our model. The instruction-following dataset is provided in GeoChat_Instruct.json, and the images are hosted in the Hugging Face repo. The images are split into multiple part files. Download all parts into the same folder and run the following command to merge them:
```bash
cat images_parta* > images.zip
```
Unzip the images into a folder and provide that folder path in the training and evaluation scripts.
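For example, after merging, the archive can be extracted and checked with standard shell tools. A minimal sketch, assuming the part files were downloaded into the current directory and `geochat_images/` is an arbitrary folder name of your choosing:

```bash
# Extract the merged archive into a folder of your choosing
unzip -q images.zip -d geochat_images/

# Quick sanity check: count the extracted image files
find geochat_images/ -type f | wc -l
```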
| Data file name | Size |
|---|---|
| GeoChat_Instruct | 263 MB |
Pretraining Dataset
We use the same pretraining dataset as LLaVA-v1.5. The pretraining dataset used in this release is a subset of the CC-3M dataset, filtered for a more balanced concept coverage distribution. Please see here for a detailed description of the dataset structure and how to download the images.
If you already have the CC-3M dataset on your disk, the image names follow this format: `GCC_train_000000000.jpg`. You may edit the `image` field correspondingly if necessary.
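If your local filenames or paths differ, one option is to rewrite the `image` field with a small script. The snippet below is a hypothetical sketch using `jq`, assuming the chat file is a JSON array of objects whose `image` key holds the filename and that your images live under a `cc3m/` subfolder (both the prefix and the output filename are placeholders):

```bash
# Hypothetical: prepend a subfolder to every "image" entry in chat.json.
# Adjust the prefix (or the transformation) to match your local layout.
jq 'map(.image |= "cc3m/" + .)' chat.json > chat_updated.json
```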
| Data | Chat File | Meta Data | Size |
|---|---|---|---|
| CC-3M Concept-balanced 595K | chat.json | metadata.json | 211 MB |
| LAION/CC/SBU BLIP-Caption Concept-balanced 558K | blip_laion_cc_sbu_558k.json | metadata.json | 181 MB |