yonatanbitton committed 29a84a4 (1 parent: 4175c3d)
Update visitbench_leaderboard_Single~Image_Nov072023.tsv
visitbench_leaderboard_Single~Image_Nov072023.tsv
CHANGED
@@ -1,20 +1,17 @@
  Category Model Elo # Matches Win vs. Reference (w/ # ratings)
- Single Image
- Single Image
- Single Image
- Single Image
- Single Image LlamaAdapter-v2
- Single Image
- Single Image
- Single Image
- Single Image
- Single Image
- Single Image
- Single Image
- Single Image
- Single Image
- Single Image
- Single Image
- Single Image openflamingo 845 5591 2.95% (n=509)
- Single Image panda_gpt_13b_output 786 5573 2.70% (n=519)
- Single Image mmgpt_output 718 5604 0.19% (n=527)
+ Single Image GPT4V 1349 677 65.44% (n=136)
+ Single Image Human Verified Reference 1338 6480 ---
+ Single Image LLaVA-Plus 1187 812 30.15% (n=136)
+ Single Image LLaVA 13B 1091 5574 18.53% (n=475)
+ Single Image LlamaAdapter-v2 1066 5573 14.14% (n=488)
+ Single Image mPLUG-Owl 1025 5561 15.83% (n=480)
+ Single Image idefics9b 997 940 9.72% (n=144)
+ Single Image Lynx(8B) 990 929 11.43% (n=140)
+ Single Image InstructBLIP 964 5612 14.12% (n=503)
+ Single Image Otter 947 5597 7.01% (n=499)
+ Single Image Octopus V2 920 913 8.90% (n=146)
+ Single Image VisualGPT 911 5585 1.57% (n=510)
+ Single Image MiniGPT-4 900 5560 3.36% (n=506)
+ Single Image OpenFlamingo 845 5591 2.95% (n=509)
+ Single Image PandaGPT 13b 786 5573 2.70% (n=519)
+ Single Image MMGPT 718 5604 0.19% (n=527)
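The last column packs two values into one cell (a win rate and a rating count, e.g. `65.44% (n=136)`), and the reference row uses `---` instead. A minimal sketch of how a row of this file might be parsed, assuming the five columns are tab-separated as the `.tsv` extension suggests; the helper names `parse_win_cell` and `parse_row` are hypothetical and not part of the repository:

```python
import re

# Matches cells such as "65.44% (n=136)"; the pattern is inferred from the rows above.
WIN_CELL = re.compile(r"^(\d+(?:\.\d+)?)% \(n=(\d+)\)$")


def parse_win_cell(cell):
    """Return (win_rate_as_fraction, n_ratings), or None for the '---' placeholder."""
    cell = cell.strip()
    if cell == "---":
        return None
    match = WIN_CELL.match(cell)
    if match is None:
        raise ValueError(f"unrecognized win cell: {cell!r}")
    return float(match.group(1)) / 100.0, int(match.group(2))


def parse_row(line):
    """Split one data row (assumed tab-separated) into typed fields."""
    category, model, elo, matches, win = line.rstrip("\n").split("\t")
    return {
        "category": category,
        "model": model,
        "elo": int(elo),
        "matches": int(matches),
        "win_vs_reference": parse_win_cell(win),
    }


if __name__ == "__main__":
    # Two rows copied from the added side of the diff, with tabs assumed as separators.
    rows = [
        "Single Image\tGPT4V\t1349\t677\t65.44% (n=136)",
        "Single Image\tHuman Verified Reference\t1338\t6480\t---",
    ]
    for row in rows:
        print(parse_row(row))
```

Splitting on tabs rather than whitespace keeps multi-word model names such as `Human Verified Reference` in a single field.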