
p-MoD: Building Mixture-of-Depths MLLMs via Progressive Ratio Decay

This is the official model checkpoint of p-MoD: Building Mixture-of-Depths MLLMs via Progressive Ratio Decay. Please refer to this repository for our code.
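The checkpoint files can be fetched from the Hugging Face Hub before they are used with the p-MoD code. The sketch below is a minimal example, not the authors' documented workflow: it assumes the `huggingface_hub` package is installed, and the helper name `download_checkpoint` is hypothetical. It only downloads the files; loading the `pmod_llava_llama` architecture is assumed to require the code from the repository mentioned above.

```python
# Minimal sketch: download the checkpoint files from the Hugging Face Hub.
# Assumption: `huggingface_hub` is installed (pip install huggingface_hub).
# `download_checkpoint` is a hypothetical helper name, not part of p-MoD.

REPO_ID = "MCG-NJU/p-MoD-LLaVA-NeXT-7B"  # repository id of this model card


def download_checkpoint(local_dir=None):
    """Download every file in the model repository and return the local path.

    If `local_dir` is None, the files go to the default Hugging Face cache.
    """
    # Imported lazily so this module can be imported without the package.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=REPO_ID, local_dir=local_dir)


if __name__ == "__main__":
    path = download_checkpoint()
    print(f"Checkpoint files are in: {path}")
```

Passing `local_dir="./p-MoD-LLaVA-NeXT-7B"` places the files in a plain directory instead of the cache, which may be more convenient when pointing external scripts at the weights.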

Model Description

This model is pretrained on the LCS-558K image-caption data and instruction-tuned on the 779K LLaVA-NeXT instruction data.

Citation

If you find our model helpful for your research and applications, please cite our paper:

@article{zhang2024pmod,
  title={p-MoD: Building Mixture-of-Depths MLLMs via Progressive Ratio Decay},
  author={Zhang, Jun and Meng, Desen and Qi, Ji and Huang, Zhenpeng and Wu, Tao and Wang, Limin},
  journal={arXiv preprint arXiv:2412.04449},
  year={2024}
}

License

Llama 2 is licensed under the LLAMA 2 Community License, Copyright (c) Meta Platforms, Inc. All Rights Reserved.

Model size: 7.06B params
Tensor type: BF16
Format: Safetensors
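As a rough guide to what the figures above imply for memory, the weights-only footprint can be estimated from the parameter count and the tensor type. The sketch below is a back-of-the-envelope calculation, not a measured number: the parameter count is the rounded 7.06B shown above, BF16 uses 2 bytes per value, and activations, the KV cache, and framework overhead are not included. The function name `weight_bytes` is hypothetical.

```python
# Rough estimate of the memory needed to hold the model weights alone.
# Assumptions: the rounded 7.06B parameter count, every tensor stored in one
# dtype, and no allowance for activations, KV cache, or runtime overhead.

BYTES_PER_VALUE = {"BF16": 2, "FP16": 2, "FP32": 4}


def weight_bytes(num_params: int, dtype: str = "BF16") -> int:
    """Return the number of bytes required to store `num_params` weights."""
    return num_params * BYTES_PER_VALUE[dtype]


if __name__ == "__main__":
    n_params = 7_060_000_000  # 7.06B params (rounded)
    total = weight_bytes(n_params, "BF16")
    print(f"{total / 1e9:.2f} GB ({total / 2**30:.2f} GiB) for weights only")
```

The result, about 14 GB in BF16, is a lower bound on the device memory needed for inference at this precision.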