This repo hosts the data generation pipeline, COCO-based instructional data, and the image-pair input benchmark for "Scalable Enhancement of VLMs with Pretrained Models".

Please refer to https://huggingface.co/datasets/VLMaug/VLMaug for a preview of our image-pair QA benchmark dataset.

| Filename | Description |
| --- | --- |
| VLMaug_coco_instruction_data | Our COCO-based instructional data, which consists of 118,287 folders. Each folder contains multiple possible modified/generated images along with their corresponding single-image or image-pair QAs. Please unzip all the zip files (data_partition[1~5].zip). |
| VLMaug-master.zip | Our code for data generation, DPO training, visualization, evaluation, and reproducing the baseline numbers. |
| VLMaug_twopairQA_benchmark.parquet | Our image-pair input benchmark, which consists of 15,692 images and 7,846 QAs. Please refer to https://huggingface.co/datasets/VLMaug/VLMaug for a preview. |
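
The unzip step for the instructional data can be scripted. The following is a minimal sketch, not part of the released code: the helper names `extract_partitions` and `count_sample_folders` are hypothetical, and the glob pattern assumes the five archives are named `data_partition1.zip` through `data_partition5.zip` and sit in one local directory.

```python
import zipfile
from pathlib import Path


def extract_partitions(src_dir, dest_dir, pattern="data_partition*.zip"):
    """Extract every archive in src_dir that matches pattern into dest_dir.

    The default pattern is an assumption about how the partition files are
    named. Returns the extracted archive paths in sorted order.
    """
    src = Path(src_dir)
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    archives = sorted(src.glob(pattern))
    for archive in archives:
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(dest)
    return archives


def count_sample_folders(root):
    """Count the immediate subdirectories of root."""
    return sum(1 for p in Path(root).iterdir() if p.is_dir())
```

If the archives unpack to one folder per sample directly under the destination directory (an assumption about the archive layout), `count_sample_folders` should report 118,287 once all five partitions are extracted, which gives a quick check that no partition was missed.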