---
license: apache-2.0
language:
- en
metrics:
- accuracy
base_model:
- Qwen/Qwen2-VL-7B-Instruct
pipeline_tag: image-text-to-text
---

# MMEvol Model Card

## Model Details

The pretrained projector and instruction-tuned checkpoints are listed below (a minimal download sketch is included at the end of this card).

| Model | Pretrained Projector | Base LLM | Pretraining Data | Instruction-Tuning Data | Download |
| --------------- | -------------------- | -------- | ---------------- | ----------------------- | -------- |
| MMEvol-Qwen2-7B | [mm_projector](https://huggingface.co/Tongyi-ConvAI/MMEvol-Qwen2-7B/tree/main) | Qwen2-7B | [LLaVA-Pretrain](https://huggingface.co/datasets/liuhaotian/LLaVA-Pretrain) | MMEvol | [ckpt](https://huggingface.co/Tongyi-ConvAI/MMEvol-Qwen2-7B/tree/main) |

## Performance

### Benchmarks Supported by VLMEvalKit (OpenCompass)

| Model | MME_C | MMStar | HallBench | MathVista_mini | MMMU_val | AI2D | POPE | BLINK | RWQA |
| --------------- | ----- | ------ | --------- | -------------- | -------- | ---- | ---- | ----- | ---- |
| MMEvol-Qwen2-7B | 55.8  | 51.6   | 64.1      | 52.4           | 45.1     | 74.7 | 87.8 | 47.7  | 63.9 |

### Benchmarks Not Supported by VLMEvalKit (VQA Datasets)

| Model | VQA_v2 | GQA  | MIA  | MMSInst |
| --------------- | ------ | ---- | ---- | ------- |
| MMEvol-Qwen2-7B | 83.1   | 65.5 | 77.6 | 41.8    |

## Paper or resources for more information

- Project page: https://mmevol.github.io/
- arXiv: https://arxiv.org/pdf/2409.05840

## License

Llama 3 is licensed under the LLAMA 3 Community License, Copyright (c) Meta Platforms, Inc. All Rights Reserved.

## Contact us if you have any questions

- Run Luo — r.luo@siat.ac.cn
- Haonan Zhang — zchiowal@gmail.com
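
## Downloading the Weights (sketch)

The snippet below is a minimal sketch of fetching the checkpoint listed in the Model Details table via `huggingface_hub.snapshot_download`. The repository id `Tongyi-ConvAI/MMEvol-Qwen2-7B` comes from the download links above; the local directory name is an arbitrary assumption. Because the release ships a LLaVA-style `mm_projector` on top of Qwen2-7B, the weights are likely meant to be loaded with the MMEvol/LLaVA training and inference code rather than a plain `transformers` auto-class.

```python
# Minimal sketch: download the MMEvol-Qwen2-7B checkpoint from the Hugging Face Hub.
# The repo id comes from the Model Details table; the local directory name
# ("checkpoints/mmevol-qwen2-7b") is an assumption, not part of the release.
from huggingface_hub import snapshot_download

local_path = snapshot_download(
    repo_id="Tongyi-ConvAI/MMEvol-Qwen2-7B",
    local_dir="checkpoints/mmevol-qwen2-7b",
)
print(f"Checkpoint files downloaded to: {local_path}")
```

After downloading, point the MMEvol/LLaVA evaluation or inference scripts at `local_path`; see the project page above for the accompanying code.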