Model,Completeness,Precision,Relevance,Sum
BlueImage-GPT (Close-Source),1.96,1.57,1.95,5.49
InfiMM (Zephyr-7B),0.77,1.08,1.71,3.58
Emu2-Chat (LLaMA-33B),1.07,1.24,1.88,4.19
Fuyu-8B (Persimmon-8B),0.88,0.83,1.82,3.53
BakLLava (Mistral-7B),1.00,0.77,1.61,3.38
SPHINX,0.79,1.14,1.72,3.65
mPLUG-Owl2 (LLaMA-7B),1.06,1.24,1.36,3.67
LLaVA-v1.5 (Vicuna-v1.5-7B),0.90,1.13,1.18,3.21
LLaVA-v1.5 (Vicuna-v1.5-13B),0.91,1.28,1.29,3.47
InternLM-XComposer-VL (InternLM),1.08,1.26,1.87,4.21
IDEFICS-Instruct (LLaMA-7B),0.83,1.03,1.33,3.18
Qwen-VL (QwenLM),0.98,0.75,1.63,3.36
Shikra (Vicuna-7B),0.89,1.11,1.33,3.34
Otter-v1 (MPT-7B),0.96,0.83,1.83,3.61
Kosmos-2,1.12,1.06,1.85,4.03
InstructBLIP (Flan-T5-XL),0.87,1.04,1.11,3.02
InstructBLIP (Vicuna-7B),0.79,1.21,0.84,2.84
VisualGLM-6B (GLM-6B),0.82,0.97,1.21,2.99
mPLUG-Owl (LLaMA-7B),1.06,1.28,1.60,3.94
LLaMA-Adapter-V2,0.85,1.15,1.44,3.45
LLaVA-v1 (Vicuna-13B),0.91,1.25,1.60,3.76
MiniGPT-4 (Vicuna-13B),1.00,1.26,1.41,3.67