Model,Completeness,Precision,Relevance,Sum
BlueImage-GPT (Close-Source),1.96,1.57,1.95,5.49
InfiMM (Zephyr-7B),0.77,1.08,1.71,3.58
Emu2-Chat (LLaMA-33B),1.07,1.24,1.88,4.19
Fuyu-8B (Persimmon-8B),0.88,0.83,1.82,3.53
BakLLava (Mistral-7B),1.00,0.77,1.61,3.38
SPHINX,0.79,1.14,1.72,3.65
mPLUG-Owl2 (LLaMA-7B),1.06,1.24,1.36,3.67
LLaVA-v1.5 (Vicuna-v1.5-7B),0.90,1.13,1.18,3.21
LLaVA-v1.5 (Vicuna-v1.5-13B),0.91,1.28,1.29,3.47
InternLM-XComposer-VL (InternLM),1.08,1.26,1.87,4.21
IDEFICS-Instruct (LLaMA-7B),0.83,1.03,1.33,3.18
Qwen-VL (QwenLM),0.98,0.75,1.63,3.36
Shikra (Vicuna-7B),0.89,1.11,1.33,3.34
Otter-v1 (MPT-7B),0.96,0.83,1.83,3.61
Kosmos-2,1.12,1.06,1.85,4.03
InstructBLIP (Flan-T5-XL),0.87,1.04,1.11,3.02
InstructBLIP (Vicuna-7B),0.79,1.21,0.84,2.84
VisualGLM-6B (GLM-6B),0.82,0.97,1.21,2.99
mPLUG-Owl (LLaMA-7B),1.06,1.28,1.60,3.94
LLaMA-Adapter-V2,0.85,1.15,1.44,3.45
LLaVA-v1 (Vicuna-13B),0.91,1.25,1.60,3.76
MiniGPT-4 (Vicuna-13B),1.00,1.26,1.41,3.67