teowu committed on
Commit
174048a
1 Parent(s): 281b093

Update README.md

Files changed (1)
  1. README.md +20 -4
README.md CHANGED
@@ -1,9 +1,25 @@
 
- Use `transformers==4.36.1`. Preview version only.

- This model has reached 75.99\% accuracy on Q-Bench A1 *test* (multi-choice questions), notably superior than GPT-4V (73.44\%).

- ### Load Model

  ```python
  import torch
@@ -16,7 +32,7 @@ model = AutoModelForCausalLM.from_pretrained("q-future/co-instruct-preview",
  device_map={"":"cuda:0"})
  ```

- ### Chat

  ```python
  import requests
 
+ ## Performance

+ This model reaches 75.12\% (*12\% better than the previous version*) / 74.98\% (*8.5\% better than the previous version*) accuracy on Q-Bench A1 *dev/test* (multi-choice questions).

+ It also outperforms the following closed-source models, which have much larger model capacities:

+ | Model | *dev* | *test* |
+ | ---- | ---- | ---- |
+ | Co-Instruct-Preview (mPLUG-Owl2) | **75.12\%** | **74.98\%** |
+ | \*GPT-4V-Turbo | 74.41\% | 74.10\% |
+ | \*Qwen-VL-**Max** | 73.63\% | 73.90\% |
+ | \*GPT-4V (Nov. 2023) | 71.78\% | 73.44\% |
+ | \*Gemini-Pro | 68.16\% | 69.46\% |
+ | Q-Instruct (mPLUG-Owl2, Nov. 2023) | 67.42\% | 70.43\% |
+ | \*Qwen-VL-Plus | 66.01\% | 68.93\% |
+ | mPLUG-Owl2 | 62.14\% | 62.68\% |
+
+ \*: Proprietary models.
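
To make the size of the gaps in the table easier to read, here is a minimal sketch that computes the margin in percentage points between Co-Instruct-Preview and every other row. The accuracies are copied from the table; the names `SCORES` and `margins` are hypothetical helpers introduced only for this illustration and are not part of the model's code.

```python
# Q-Bench A1 accuracies (%) copied from the table above, as (dev, test).
SCORES = {
    "Co-Instruct-Preview (mPLUG-Owl2)": (75.12, 74.98),
    "GPT-4V-Turbo": (74.41, 74.10),
    "Qwen-VL-Max": (73.63, 73.90),
    "GPT-4V (Nov. 2023)": (71.78, 73.44),
    "Gemini-Pro": (68.16, 69.46),
    "Q-Instruct (mPLUG-Owl2, Nov. 2023)": (67.42, 70.43),
    "Qwen-VL-Plus": (66.01, 68.93),
    "mPLUG-Owl2": (62.14, 62.68),
}


def margins(reference, scores=SCORES):
    """Return {model: (dev_margin, test_margin)} in percentage points.

    A positive margin means `reference` scores higher than that model.
    """
    ref_dev, ref_test = scores[reference]
    return {
        name: (round(ref_dev - dev, 2), round(ref_test - test, 2))
        for name, (dev, test) in scores.items()
        if name != reference
    }


if __name__ == "__main__":
    result = margins("Co-Instruct-Preview (mPLUG-Owl2)")
    for name, (dev_gap, test_gap) in result.items():
        print(f"{name}: dev +{dev_gap:.2f}, test +{test_gap:.2f}")
```

These are absolute differences in percentage points, not relative improvements, so they are not directly comparable to the "better than the previous version" figures quoted above.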
+
+ We are also constructing multi-image benchmark sets (image pairs and sets of three or four images); results on these benchmarks will be released soon!
+
+ ## Load Model

  ```python
  import torch
 
  device_map={"":"cuda:0"})
  ```

+ ## Chat

  ```python
  import requests