---
license: apache-2.0
---

# Haplo Model Card

This work presents a simple yet efficient method for building a baseline for native, end-to-end large multimodal models (LMMs) within a single transformer. The proposed model outperforms other single-transformer LMMs and significantly narrows the performance gap with compositional LMMs.
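For reference, below is a minimal inference sketch. It assumes the checkpoint is published in a Hugging Face `transformers`-compatible format; the repository id, processor behavior, and the need for `trust_remote_code` are assumptions rather than confirmed details of this release. See https://haplo-vl.github.io/ for the official usage instructions.

```python
# Minimal sketch, assuming a transformers-compatible checkpoint.
# The repo id "rayruiyang/Haplo-8B-672" and the trust_remote_code
# requirement are assumptions, not confirmed details of this release.
import torch
from PIL import Image
from transformers import AutoModelForCausalLM, AutoProcessor

model_id = "rayruiyang/Haplo-8B-672"  # hypothetical repo id
processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)

# Single-image visual question answering.
image = Image.open("example.jpg")
prompt = "Describe this image."
inputs = processor(images=image, text=prompt, return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=128)
print(processor.batch_decode(output_ids, skip_special_tokens=True)[0])
```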

**Model date:** Haplo models were trained in September 2024.

**Paper or resources for more information:** https://haplo-vl.github.io/

## Performance

| Model | SEEDB | POPE | RWQA | MMB | MMStar | VQAv2 | GQA | SQA |
|-------|-------|------|------|-----|--------|-------|-----|-----|
| Haplo-8B-672 | 75.1 | 88.6 | 61.4 | 73.6 | 57.2 | 81.0 | 65.5 | 95.3 |
| Haplo-8B-MI-672 | 75.5 | 88.2 | 62.0 | 75.0 | 57.6 | 80.7 | 65.0 | 94.4 |

## Intended use

**Primary intended uses:** The primary use of Haplo is research on large multimodal models and chatbots.

**Primary intended users:** The primary intended users of the model are researchers and hobbyists in computer vision, natural language processing, machine learning, and artificial intelligence.