Model

llava-internlm2-7b is a LLaVA model fine-tuned with XTuner from InternLM2-Chat-7B and CLIP-ViT-Large-patch14-336 on the LLaVA-Pretrain and LLaVA-Instruct datasets.

Results

Model	MMBench Test (EN)	MMBench Dev (EN)	MMBench Test (CN)	MMBench Dev (CN)	CCBench Dev	MME	SEEDBench_IMG	MMVet	MMMU Dev	MathVista MiniTest	HallusionBench aAcc
LLaVA-v1.5-7B (XTuner)	67.7	69.2	61.0	59.7	28.4	1716	66.4	32.2	33.7	24.2	46.2
LLaVA-v1.5-13B (XTuner)	68.8	69.5	64.7	63.1	32.9	1766	67.9	35.9	35.2	26.2	46.9
LLaVA-InternLM-7B (XTuner)	69.0	68.5	66.7	63.8	37.3	1637	65.7	32.4	36.9	26.3	49.1
LLaVA-InternLM2-7B	73.3	74.6	71.7	72.0	42.5	1700	71.2	35.9	40.1	25.5	46.8
LLaVA-InternLM2-20B	75.1	73.5	73.7	72.8	46.3	1868	70.2	37.2	39.4	24.6	47.7

Quickstart

Installation

pip install -U 'xtuner[deepspeed]'

Chat

xtuner chat internlm/internlm2-chat-7b \
  --visual-encoder openai/clip-vit-large-patch14-336 \
  --llava xtuner/llava-internlm2-7b \
  --prompt-template internlm2_chat \
  --image $IMAGE_PATH
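
The chat command takes its image from $IMAGE_PATH, so it can help to confirm that the variable points to a real file before launching. The following is a minimal sketch of such a check. The helper names check_image and chat_with_image are hypothetical and not part of XTuner, and the wrapper uses only the flags shown above.

```shell
# check_image: hypothetical helper (not part of XTuner).
# Returns 0 if $1 is a non-empty path to an existing regular file, 1 otherwise.
check_image() {
  if [ -z "${1:-}" ]; then
    echo "IMAGE_PATH is not set" >&2
    return 1
  fi
  if [ ! -f "$1" ]; then
    echo "image not found: $1" >&2
    return 1
  fi
  return 0
}

# chat_with_image: hypothetical wrapper that runs the chat command from above
# only when the image check passes.
chat_with_image() {
  check_image "${1:-}" || return 1
  xtuner chat internlm/internlm2-chat-7b \
    --visual-encoder openai/clip-vit-large-patch14-336 \
    --llava xtuner/llava-internlm2-7b \
    --prompt-template internlm2_chat \
    --image "$1"
}
```

Calling chat_with_image "$IMAGE_PATH" should then behave like the command above, except that it stops early with a message when the file is missing.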

Training

Alignment module pretraining (outputs are saved in ./work_dirs/ by default)

NPROC_PER_NODE=8 xtuner train llava_internlm2_chat_7b_clip_vit_large_p14_336_e1_gpu8_pretrain --deepspeed deepspeed_zero2

Instruction-following fine-tuning (outputs are saved in ./work_dirs/ by default)

NPROC_PER_NODE=8 xtuner train llava_internlm2_chat_7b_qlora_clip_vit_large_p14_336_lora_e1_gpu8_finetune --deepspeed deepspeed_zero2
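
In the LLaVA recipe, instruction fine-tuning builds on the pretrained alignment module, so the two commands above are meant to be run in order. The following is a minimal sketch of a wrapper that chains them and stops if pretraining fails. The helper names train_stage and train_all are hypothetical. DRY_RUN is an assumed environment variable of this sketch, not an XTuner option.

```shell
# train_stage: hypothetical helper (not part of XTuner).
# $1 is an XTuner config name. NPROC_PER_NODE defaults to 8, as in the commands above.
# DRY_RUN=1 is an assumption of this sketch: it prints the command instead of running it.
train_stage() {
  n="${NPROC_PER_NODE:-8}"
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "NPROC_PER_NODE=$n xtuner train $1 --deepspeed deepspeed_zero2"
  else
    NPROC_PER_NODE="$n" xtuner train "$1" --deepspeed deepspeed_zero2
  fi
}

# train_all: run pretraining, then fine-tuning only if pretraining succeeded.
train_all() {
  train_stage llava_internlm2_chat_7b_clip_vit_large_p14_336_e1_gpu8_pretrain &&
  train_stage llava_internlm2_chat_7b_qlora_clip_vit_large_p14_336_lora_e1_gpu8_finetune
}
```

Running DRY_RUN=1 train_all prints both commands without starting any training, which is a cheap way to check the config names first.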

MMBench Evaluation

XTuner integrates MMBench evaluation, which you can run with the following command:

xtuner mmbench internlm/internlm2-chat-7b \
  --visual-encoder openai/clip-vit-large-patch14-336 \
  --llava xtuner/llava-internlm2-7b \
  --prompt-template internlm2_chat \
  --data-path $MMBENCH_DATA_PATH \
  --work-dir $RESULT_PATH

When the evaluation finishes, results for a development set are printed directly. For a test set, submit mmbench_result.xlsx to the official MMBench for final evaluation to obtain the accuracy results.
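
For a test-set run, the file to submit is mmbench_result.xlsx. The following is a minimal sketch for locating it after the evaluation. It assumes only that the file is written somewhere beneath the directory passed as --work-dir, and the helper name find_mmbench_result is hypothetical.

```shell
# find_mmbench_result: hypothetical helper (not part of XTuner).
# Prints the path of every mmbench_result.xlsx found beneath directory $1.
# Assumption: the evaluation writes the file somewhere under --work-dir.
find_mmbench_result() {
  if [ ! -d "${1:-}" ]; then
    echo "not a directory: ${1:-}" >&2
    return 1
  fi
  find "$1" -type f -name 'mmbench_result.xlsx'
}
```

Calling find_mmbench_result "$RESULT_PATH" lists the candidate files. If several runs share one work directory, more than one path may be printed.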

Citation

@misc{2023xtuner,
    title={XTuner: A Toolkit for Efficiently Fine-tuning LLM},
    author={XTuner Contributors},
    howpublished = {\url{https://github.com/InternLM/xtuner}},
    year={2023}
}
