---
license: apache-2.0
language:
- en
metrics:
- accuracy
base_model:
- Qwen/Qwen2-VL-7B-Instruct
pipeline_tag: image-text-to-text
---

# MMEvol Model Card

## Model Details

The pretrained projector and instruction-tuned checkpoints are listed below (a minimal download sketch is included at the end of this card).

| Model | Pretrained Projector | Base LLM | Pretraining Data | Instruction-Tuning Data | Download |
| --------------- | -------------------- | -------- | ---------------- | ----------------------- | -------- |
| MMEvol-Qwen2-7B | [mm_projector](https://huggingface.co/Tongyi-ConvAI/MMEvol-Qwen2-7B/tree/main) | Qwen2-7B | [LLaVA-Pretrain](https://huggingface.co/datasets/liuhaotian/LLaVA-Pretrain) | MMEvol | [ckpt](https://huggingface.co/Tongyi-ConvAI/MMEvol-Qwen2-7B/tree/main) |

## Performance

### Benchmarks Supported by VLMEvalKit (OpenCompass)

| Model | MME_C | MMStar | HallBench | MathVista_mini | MMMU_val | AI2D | POPE | BLINK | RWQA |
| --------------- | ----- | ------ | --------- | -------------- | -------- | ---- | ---- | ----- | ---- |
| MMEvol-Qwen2-7B | 55.8  | 51.6   | 64.1      | 52.4           | 45.1     | 74.7 | 87.8 | 47.7  | 63.9 |

### Benchmarks Not Supported by VLMEvalKit (VQA Datasets)

| Model | VQA_v2 | GQA  | MIA  | MMSInst |
| --------------- | ------ | ---- | ---- | ------- |
| MMEvol-Qwen2-7B | 83.1   | 65.5 | 77.6 | 41.8    |

## Paper or resources for more information

- Project page: https://mmevol.github.io/
- arXiv: https://arxiv.org/pdf/2409.05840

## License

Llama 3 is licensed under the LLAMA 3 Community License, Copyright (c) Meta Platforms, Inc. All Rights Reserved.

## Contact us if you have any questions

- Run Luo — r.luo@siat.ac.cn
- Haonan Zhang — zchiowal@gmail.com
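
## Downloading the Weights (sketch)

The snippet below is a minimal sketch of fetching the checkpoint listed in the Model Details table via `huggingface_hub.snapshot_download`. The repository id `Tongyi-ConvAI/MMEvol-Qwen2-7B` comes from the download links above; the local directory name is an arbitrary assumption. Because the release ships a LLaVA-style `mm_projector` on top of Qwen2-7B, the weights are likely meant to be loaded with the MMEvol/LLaVA training and inference code rather than a plain `transformers` auto-class.

```python
# Minimal sketch: download the MMEvol-Qwen2-7B checkpoint from the Hugging Face Hub.
# The repo id comes from the Model Details table; the local directory name
# ("checkpoints/mmevol-qwen2-7b") is an assumption, not part of the release.
from huggingface_hub import snapshot_download

local_path = snapshot_download(
    repo_id="Tongyi-ConvAI/MMEvol-Qwen2-7B",
    local_dir="checkpoints/mmevol-qwen2-7b",
)
print(f"Checkpoint files downloaded to: {local_path}")
```

After downloading, point the MMEvol/LLaVA evaluation or inference scripts at `local_path`; see the project page above for the accompanying code.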