yonatanbitton committed on
Commit
29a84a4
1 Parent(s): 4175c3d

Update visitbench_leaderboard_Single~Image_Nov072023.tsv

visitbench_leaderboard_Single~Image_Nov072023.tsv CHANGED
@@ -1,20 +1,17 @@
  Category	Model	Elo	# Matches	Win vs. Reference (w/ # ratings)
- Single Image	gpt4v_response	1349	677	65.44% (n=136)
- Single Image	human_verified_reference	1338	6480	---
- Single Image	llava-a1-predictions	1187	812	30.15% (n=136)
- Single Image	llava13b_output	1091	5574	18.53% (n=475)
- Single Image	LlamaAdapter-v2 prediction	1066	5573	14.14% (n=488)
- Single Image	lynx(7B)_v2 prediction	1051	796	15.15% (n=132)
- Single Image	mPLUG-Owl prediction	1025	5561	15.83% (n=480)
- Single Image	lynx_test_2	1002	731	11.02% (n=127)
- Single Image	idefics9b_prediction	997	940	9.72% (n=144)
- Single Image	Lynx(8B) predictions	990	929	11.43% (n=140)
- Single Image	instruct_blip_output	964	5612	14.12% (n=503)
- Single Image	otter	947	5597	7.01% (n=499)
- Single Image	Octopus V2 prediction	920	913	8.90% (n=146)
- Single Image	lynx_test_1	913	738	2.82% (n=142)
- Single Image	visual_gpt_davinci003_output	911	5585	1.57% (n=510)
- Single Image	MiniGPT-4 prediction	900	5560	3.36% (n=506)
- Single Image	openflamingo	845	5591	2.95% (n=509)
- Single Image	panda_gpt_13b_output	786	5573	2.70% (n=519)
- Single Image	mmgpt_output	718	5604	0.19% (n=527)
+ Single Image	GPT4V	1349	677	65.44% (n=136)
+ Single Image	Human Verified Reference	1338	6480	---
+ Single Image	LLaVA-Plus	1187	812	30.15% (n=136)
+ Single Image	LLaVA 13B	1091	5574	18.53% (n=475)
+ Single Image	LlamaAdapter-v2	1066	5573	14.14% (n=488)
+ Single Image	mPLUG-Owl	1025	5561	15.83% (n=480)
+ Single Image	idefics9b	997	940	9.72% (n=144)
+ Single Image	Lynx(8B)	990	929	11.43% (n=140)
+ Single Image	InstructBLIP	964	5612	14.12% (n=503)
+ Single Image	Otter	947	5597	7.01% (n=499)
+ Single Image	Octopus V2	920	913	8.90% (n=146)
+ Single Image	VisualGPT	911	5585	1.57% (n=510)
+ Single Image	MiniGPT-4	900	5560	3.36% (n=506)
+ Single Image	OpenFlamingo	845	5591	2.95% (n=509)
+ Single Image	PandaGPT 13b	786	5573	2.70% (n=519)
+ Single Image	MMGPT	718	5604	0.19% (n=527)