arxiv:2408.08872

xGen-MM (BLIP-3): A Family of Open Large Multimodal Models

Published on Aug 16
· Submitted by akhaliq on Aug 19
#1 Paper of the day
Authors: Le Xue, An Yan, et al.

Abstract

This report introduces xGen-MM (also known as BLIP-3), a framework for developing Large Multimodal Models (LMMs). The framework comprises meticulously curated datasets, a training recipe, model architectures, and a resulting suite of LMMs. xGen-MM, short for xGen-MultiModal, expands the Salesforce xGen initiative on foundation AI models. Our models undergo rigorous evaluation across a range of tasks, including both single- and multi-image benchmarks. Our pre-trained base model exhibits strong in-context learning capabilities, and the instruction-tuned model demonstrates competitive performance among open-source LMMs of similar size. In addition, we introduce a safety-tuned model with DPO to mitigate harmful behaviors such as hallucinations and improve safety. We open-source our models, curated large-scale datasets, and our fine-tuning codebase to facilitate further advancements in LMM research. Associated resources will be available on our project page above.
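
For readers who want to try the released checkpoints, the sketch below shows one plausible way to load an xGen-MM model with the Hugging Face transformers auto classes. This is not taken from the paper: the model ID is an assumption based on the release naming and should be verified against the model links shared in the thread below, and trust_remote_code=True is passed on the assumption that the checkpoints ship custom modeling code.

    from transformers import AutoModelForVision2Seq, AutoTokenizer, AutoImageProcessor

    # Assumed model ID; verify against the released model links before use.
    model_id = "Salesforce/xgen-mm-phi3-mini-instruct-r-v1"

    # trust_remote_code=True is needed if the release ships custom modeling code.
    model = AutoModelForVision2Seq.from_pretrained(model_id, trust_remote_code=True)
    tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True, use_fast=False)
    image_processor = AutoImageProcessor.from_pretrained(model_id, trust_remote_code=True)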

Community

Paper submitter

The link gives a 404, I assume xgen-mm hasn't been merged yet?

Paper author

Hi, we plan to make the links public today. Since yesterday was the weekend, we need access from the infrastructure team to make everything public on Monday.

Paper author

Hi, our model/dataset links are live now.

Models citing this paper: 6

Datasets citing this paper: 0

Spaces citing this paper: 3

Collections including this paper: 21