### ScienceQA

#### Prepare Data

1. Please see the ScienceQA [repo](https://github.com/lupantech/ScienceQA) for setting up the dataset.
2. Convert the ScienceQA dataset to the LLaVA conversation-style format.

```Shell
python scripts/convert_sqa_to_llava.py \
    convert_to_llava \
    --base-dir /path/to/ScienceQA/data/scienceqa \
    --prompt-format "QCM-LEA" \
    --split {train,val,minival,test,minitest}
```

#### Training

1. Pretraining

You can download our pretrained projector weights from our [Model Zoo](), or train your own projector weights using [`pretrain.sh`](https://github.com/haotian-liu/LLaVA/blob/main/scripts/pretrain.sh).

2. Finetuning

See [`finetune_sqa.sh`](https://github.com/haotian-liu/LLaVA/blob/main/scripts/finetune_sqa.sh).

#### Evaluation

1. Multiple-GPU inference

You may run the evaluation on multiple GPUs and concatenate the generated jsonl files. Please refer to our scripts for [batch evaluation](https://github.com/haotian-liu/LLaVA/blob/main/scripts/sqa_eval_batch.sh) and [results gathering](https://github.com/haotian-liu/LLaVA/blob/main/scripts/sqa_eval_gather.sh).

2. Single-GPU inference

(a) Generate LLaVA responses on the ScienceQA dataset

```Shell
python -m llava.eval.model_vqa_science \
    --model-path liuhaotian/llava-lcs558k-scienceqa-vicuna-13b-v1.3 \
    --question-file /path/to/ScienceQA/data/scienceqa/llava_test_QCM-LEA.json \
    --image-folder /path/to/ScienceQA/data/scienceqa/images/test \
    --answers-file vqa/results/ScienceQA/test_llava-13b.jsonl \
    --conv-mode llava_v1
```

(b) Evaluate the generated responses

```Shell
python eval_science_qa.py \
    --base-dir /path/to/ScienceQA/data/scienceqa \
    --result-file vqa/results/ScienceQA/test_llava-13b.jsonl \
    --output-file vqa/results/ScienceQA/test_llava-13b_output.json \
    --output-result vqa/results/ScienceQA/test_llava-13b_result.json
```

For reference, we attach our prediction files [`test_sqa_llava_lcs_558k_sqa_12e_vicuna_v1_3_13b.json`](https://github.com/haotian-liu/LLaVA/blob/main/llava/eval/table/results/test_sqa_llava_lcs_558k_sqa_12e_vicuna_v1_3_13b.json) and [`test_sqa_llava_13b_v0.json`](https://github.com/haotian-liu/LLaVA/blob/main/llava/eval/table/results/test_sqa_llava_13b_v0.json) for comparison when reproducing our results, as well as for further detailed analysis.
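
The multiple-GPU path above ends with concatenating the per-GPU jsonl files into one answers file before evaluation. As a rough illustration of that step only, here is a minimal Python sketch. It is not the repository's gathering script: the helper name `merge_jsonl` is hypothetical, and it assumes each chunk file holds one JSON object per line.

```python
import json


def merge_jsonl(chunk_paths, merged_path):
    """Concatenate jsonl chunk files into one file and return the record count.

    Hypothetical helper for illustration; assumes one JSON object per line.
    """
    count = 0
    with open(merged_path, "w", encoding="utf-8") as out:
        for path in chunk_paths:
            with open(path, encoding="utf-8") as chunk:
                for line in chunk:
                    line = line.strip()
                    if not line:
                        continue  # skip blank lines between records
                    json.loads(line)  # raise early on a truncated or malformed record
                    out.write(line + "\n")
                    count += 1
    return count
```

Calling `merge_jsonl([...chunk files...], "vqa/results/ScienceQA/test_llava-13b.jsonl")` would produce a single file that can be passed as `--result-file` in step (b). The chunk file names depend on how the batch run was configured and are left unspecified here.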