Zhang Hui committed on
Commit 9a424a3
1 Parent(s): 609b6ac

add more results

Files changed (3)
  1. .gitignore +1 -0
  2. MMMU.png +0 -3
  3. README.md +45 -6
.gitignore ADDED
@@ -0,0 +1 @@
+results/
MMMU.png DELETED

Git LFS Details

  • SHA256: 6992e642aaad7cb1a3aad6993760dac6984905b99b1f35478ddf46d4d89d2a3f
  • Pointer size: 131 Bytes
  • Size of remote file: 147 kB
README.md CHANGED
@@ -24,10 +24,13 @@ A multimodal large-scale model, characterized by its open-source nature, closely
 
 The vision encoder is inherited from Qwen-VL-Chat, i.e., OpenCLIP ViT-bigG.
 
+- We are continuously collecting instruction data, optimizing the model, and looking forward to supporting more tasks.
+
+We are continuously collecting instruction data and optimizing the model, and we look forward to supporting more features.
+
 ## Quick Start
 
 ```
-
 from transformers import AutoTokenizer, AutoModelForCausalLM, AutoConfig
 tokenizer = AutoTokenizer.from_pretrained(
     pretrained_model_name_or_path="huizhang0110/CatVision",
@@ -55,19 +58,33 @@ response, history = model.chat(
 
 ## Benchmark
 
-Our model achieved favorable results on the [MMMU](https://eval.ai/web/challenges/challenge-page/2179/leaderboard/5377) and [CMMMU]() leaderboards.
+Our model achieved favorable results on many leaderboards.
 
 - **[MMMU](https://eval.ai/web/challenges/challenge-page/2179/leaderboard/5377)**
 
-![MMMU](./MMMU.png)
+| Model            | Val (900) | Test (11K) |
+|------------------|:---------:|:----------:|
+| Gemini Ultra     | 59.4      | ----       |
+| GPT-4V           | 56.8      | 55.7       |
+| Gemini Pro       | 47.9      | ----       |
+| Yi-VL-34B        | 45.9      | 41.6       |
+| Qwen-VL-PLUS     | 45.2      | 40.8       |
+| *CatVision*      | 45.9      | 40.1       |
+| Marco-VL         | 41.2      | 40.4       |
+| InfiMM-Zephyr-7B | 39.4      | 35.5       |
+| Yi-VL-6B         | 39.1      | 37.8       |
+| SVIT             | 38.0      | 34.1       |
+| LLaVA-1.5-13B    | 36.4      | 33.6       |
+| Emu2-Chat        | 36.3      | 34.1       |
+| Qwen-VL-7B-Chat  | 35.9      | 32.9       |
 
 - **[CMMMU](https://github.com/CMMMU-Benchmark/CMMMU/blob/main/README.md)**
 
 | Model                      | Val (900) | Test (11K) |
 |----------------------------|:---------:|:----------:|
-| GPT-4V(ision) (Playground) | **42.5**  | **43.7**   |
+| GPT-4V(ision) (Playground) | 42.5      | 43.7       |
 | Qwen-VL-PLUS*              | 39.5      | 36.8       |
-| CatVision                  | 39.6      | ----       |
+| *CatVision*                | 39.6      | ----       |
 | Yi-VL-34B                  | 36.2      | 36.5       |
 | Yi-VL-6B                   | 35.8      | 35.0       |
 | Qwen-VL-7B-Chat            | 30.7      | 31.3       |
@@ -81,6 +98,28 @@ Our model achieved favorable results on the [MMMU](https://eval.ai/web/challenge
 | Frequent Choice | 24.1 | 26.0 |
 | Random Choice   | 21.6 | 21.6 |
 
+- **[MMBench](https://mmbench.opencompass.org.cn/leaderboard)**
+
+| Model              | mmbench_cn (test) | mmbench_cn (dev) | mmbench_en (test) | mmbench_zh (dev) | ccbench |
+|--------------------|:-----------------:|:----------------:|:-----------------:|:----------------:|:-------:|
+| Qwen-VL-PLUS(BASE) | 83.3              | 83.2             | 82.7              | 81.5             | 77.6    |
+| GPT-4V             | 77.0              | 75.1             | 74.4              | 75.0             | 46.5    |
+| Qwen-VL-PLUS       | 67.0              | 66.2             | 70.7              | 69.6             | 55.1    |
+| *CatVision*        | 70.9              | 71.8             | 70.2              | 71.6             | 49.8    |
+| Qwen-VL-Chat       | 61.8              | 60.6             | 56.3              | 56.7             | 41.2    |
+
+- **[MME](https://github.com/BradyFU/Awesome-Multimodal-Large-Language-Models)**
+
+| Model        | Perception | Cognition |
+|--------------|:----------:|:---------:|
+| GPT-4V       | 1409.43    | 517.14    |
+| Qwen-VL-PLUS | 1681.25    | 502.14    |
+| *CatVision*  | 1560.90    | 366.43    |
+| Qwen-VL-Chat | 1487.57    | 360.71    |
+
+- **OpenCompass**
+
+Results pending.
 
 - **Show Case**
 
@@ -101,7 +140,7 @@ Our model achieved favorable results on the [MMMU](https://eval.ai/web/challenge
 ```
 @misc{CatVision,
     author = {zhanghui@4paradigm.com},
-    title = {Open Qwen-VL-Plus},
+    title = {CatVision},
     year = {2024},
     publisher = {huggingface},
     howpublished = {\url{https://huggingface.co/huizhang0110/CatVision}}
 
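The Quick Start snippet above is cut off at the diff-context boundaries, so only the imports and the start of the `AutoTokenizer.from_pretrained(...)` call are visible. As a reference point only, here is a minimal end-to-end sketch; `trust_remote_code=True`, `tokenizer.from_list_format`, and the `model.chat(...)` signature are assumptions carried over from the Qwen-VL-Chat interface that CatVision inherits, not details confirmed by this commit.

```
# Minimal sketch of the Quick Start flow. Helper names and arguments follow
# the Qwen-VL-Chat convention and are assumptions, not confirmed by this diff.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained(
    pretrained_model_name_or_path="huizhang0110/CatVision",
    trust_remote_code=True,  # assumed: the repo ships custom tokenizer/model code
)
model = AutoModelForCausalLM.from_pretrained(
    pretrained_model_name_or_path="huizhang0110/CatVision",
    device_map="auto",
    trust_remote_code=True,
).eval()

# Assumed Qwen-VL-Chat-style multimodal prompt construction.
query = tokenizer.from_list_format([
    {"image": "demo.jpg"},             # hypothetical local image path
    {"text": "Describe this image."},
])
response, history = model.chat(tokenizer, query=query, history=None)
print(response)
```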