LLaVA_v1

Runtime error

App Files Files Community

LLaVA_v1 / docs /ScienceQA.md

badayvedat

feat: Add LLaVA model

a824a18 about 1 year ago

preview code

raw

history blame

2.33 kB

	### ScienceQA

	#### Prepare Data
	1. Please see ScienceQA [repo](https://github.com/lupantech/ScienceQA) for setting up the dataset.
	2. Generate ScienceQA dataset for LLaVA conversation-style format.

	```Shell
	python scripts/convert_sqa_to_llava.py \
	convert_to_llava \
	--base-dir /path/to/ScienceQA/data/scienceqa \
	--prompt-format "QCM-LEA" \
	--split {train,val,minival,test,minitest}
	```

	#### Training

	1. Pretraining

	You can download our pretrained projector weights from our [Model Zoo](), or train your own projector weights using [`pretrain.sh`](https://github.com/haotian-liu/LLaVA/blob/main/scripts/pretrain.sh).

	2. Finetuning

	See [`finetune_sqa.sh`](https://github.com/haotian-liu/LLaVA/blob/main/scripts/finetune_sqa.sh).

	#### Evaluation

	1. Multiple-GPU inference
	You may evaluate this with multiple GPUs, and concatenate the generated jsonl files. Please refer to our script for [batch evaluation](https://github.com/haotian-liu/LLaVA/blob/main/scripts/sqa_eval_batch.sh) and [results gathering](https://github.com/haotian-liu/LLaVA/blob/main/scripts/sqa_eval_gather.sh).

	2. Single-GPU inference

	(a) Generate LLaVA responses on ScienceQA dataset

	```Shell
	python -m llava.eval.model_vqa_science \
	--model-path liuhaotian/llava-lcs558k-scienceqa-vicuna-13b-v1.3 \
	--question-file /path/to/ScienceQA/data/scienceqa/llava_test_QCM-LEA.json \
	--image-folder /path/to/ScienceQA/data/scienceqa/images/test \
	--answers-file vqa/results/ScienceQA/test_llava-13b.jsonl \
	--conv-mode llava_v1
	```

	(b) Evaluate the generated responses

	```Shell
	python eval_science_qa.py \
	--base-dir /path/to/ScienceQA/data/scienceqa \
	--result-file vqa/results/ScienceQA/test_llava-13b.jsonl \
	--output-file vqa/results/ScienceQA/test_llava-13b_output.json \
	--output-result vqa/results/ScienceQA/test_llava-13b_result.json \
	```

	For reference, we attach our prediction file [`test_sqa_llava_lcs_558k_sqa_12e_vicuna_v1_3_13b.json`](https://github.com/haotian-liu/LLaVA/blob/main/llava/eval/table/results/test_sqa_llava_lcs_558k_sqa_12e_vicuna_v1_3_13b.json) and [`test_sqa_llava_13b_v0.json`](https://github.com/haotian-liu/LLaVA/blob/main/llava/eval/table/results/test_sqa_llava_13b_v0.json) for comparison when reproducing our results, as well as for further analysis in detail.