---
license: apache-2.0
tags:
- MobileVLM V2
---
## Model Summary
MobileVLM V2 is a family of vision language models that significantly improves upon MobileVLM. It shows that a careful combination of novel architectural design, a training scheme tailored to mobile VLMs, and rich, high-quality dataset curation can substantially improve VLM performance. Specifically, MobileVLM V2 1.7B achieves better or on-par performance on standard VLM benchmarks compared with much larger VLMs at the 3B scale. Notably, the MobileVLM_V2-3B model outperforms a large variety of VLMs at the 7B+ scale.

MobileVLM_V2-1.7B was built on our [MobileLLaMA-1.4B-Chat](https://huggingface.co/mtgv/MobileLLaMA-1.4B-Chat) to facilitate off-the-shelf deployment.

## Model Sources
- Repository: https://github.com/Meituan-AutoML/MobileVLM
- Paper: [MobileVLM V2: Faster and Stronger Baseline for Vision Language Model](https://arxiv.org/abs/2402.03766)

## How to Get Started with the Model
Inference examples can be found on [GitHub](https://github.com/Meituan-AutoML/MobileVLM).
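
As a rough orientation, the sketch below shows one way a single-image query could be set up from inside a clone of the repository. It rests on several assumptions that this card does not confirm: that the checkpoint is published under the repo id `mtgv/MobileVLM_V2-1.7B`, that the repository exposes an `inference_once(args)` helper in `scripts/inference.py`, and that the helper reads the attribute names used here. `build_args` and `run` are hypothetical helpers introduced only for illustration; check the repository README for the authoritative example.

```python
from types import SimpleNamespace

# Assumed Hugging Face repo id for this checkpoint; confirm it on the model page.
MODEL_PATH = "mtgv/MobileVLM_V2-1.7B"


def build_args(image_file, prompt, model_path=MODEL_PATH):
    """Bundle inference settings into an attribute-style object (hypothetical helper)."""
    return SimpleNamespace(
        model_path=model_path,
        image_file=image_file,
        prompt=prompt,
        conv_mode="v1",        # assumed conversation template name
        temperature=0,         # deterministic decoding
        top_p=None,
        num_beams=1,
        max_new_tokens=512,
        load_8bit=False,
        load_4bit=False,
    )


def run(args):
    """Call the repository's inference entry point (assumed name and location)."""
    # Imported lazily because this module only exists inside a clone of the
    # MobileVLM repository, with its dependencies installed.
    from scripts.inference import inference_once
    inference_once(args)
```

From the root of a cloned repository, calling `run(build_args("path/to/image.jpg", "Describe the image."))` would then run one query, provided the assumptions above hold. Keeping the settings in a plain `SimpleNamespace` avoids depending on a command-line parser while still giving the helper the attribute access it is assumed to expect.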