Ask, Solve, Generate - BLIP3o-8B

This repository contains the released self-evolved BLIP3o-8B adapters for:

Ask, Solve, Generate: Self-Evolving Unified Multimodal Understanding and Generation via Self-Consistency Rewards

The adapters were trained with the Ask-Solve-Generate self-evolving recipe using a 10k-image unlabeled pool. The base model remains frozen; this release contains the role-specific LoRA adapters needed by the public codebase.

Links

Contents

  • solver/: LoRA adapter for visual understanding and self-consistency scoring.
  • proposer/: LoRA adapter for generating image-grounded questions.
  • generator/: LoRA adapter for generation conditioning.
  • dit_lora/: LoRA adapter for the BLIP3o diffusion transformer.
  • dit_lora_metadata.json: DiT LoRA configuration metadata.
  • trainer_state.json: lightweight step metadata.

Training optimizer state, logs, generated samples, and the private data-construction pipeline are not included.
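Before running any evaluation, it can help to confirm that a downloaded snapshot has the layout listed above. The following is a minimal sketch, not part of the public codebase: the function name check_snapshot is hypothetical, and the only assumption is that the entries named in the Contents list sit at the top level of the snapshot directory.

```shell
# Hypothetical helper (not shipped with the codebase): report entries from the
# Contents list that are missing from a snapshot directory.
# Prints one line per missing entry; returns 0 only if nothing is missing.
check_snapshot() {
  snapshot_dir="$1"
  rc=0
  # Adapter directories listed under Contents.
  for d in solver proposer generator dit_lora; do
    [ -d "$snapshot_dir/$d" ] || { echo "missing directory: $d/"; rc=1; }
  done
  # Metadata files listed under Contents.
  for f in dit_lora_metadata.json trainer_state.json; do
    [ -f "$snapshot_dir/$f" ] || { echo "missing file: $f"; rc=1; }
  done
  return "$rc"
}
```

Calling check_snapshot /path/to/this/snapshot prints nothing and returns 0 when all six entries are present.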

Usage

Clone and install the public codebase, then set CHECKPOINT_DIR to the local path of this snapshot.

git clone https://github.com/mbzuai-oryx/Ask-Solve-Generate.git
cd Ask-Solve-Generate

For understanding evaluation:

CHECKPOINT_DIR=/path/to/this/snapshot \
ADAPTER=solver \
bash BLIP3o/eval/understanding_eval_our.sh

For generation evaluation:

CHECKPOINT_DIR=/path/to/this/snapshot \
ADAPTER=generator \
bash BLIP3o/eval/geneval/generation_our.sh
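The two commands above pair each adapter with its own evaluation script. As a rough sketch of that pairing, the hypothetical helper below (eval_script_for is an illustrative name, not something the codebase provides) maps an ADAPTER value to the script path shown in this section and fails for adapters that have no evaluation command documented here.

```shell
# Hypothetical helper: map an adapter name to the evaluation script that this
# section documents for it. Paths are relative to the Ask-Solve-Generate repo root.
eval_script_for() {
  case "$1" in
    solver)    echo "BLIP3o/eval/understanding_eval_our.sh" ;;
    generator) echo "BLIP3o/eval/geneval/generation_our.sh" ;;
    *)
      # proposer and dit_lora have no evaluation command in this section.
      echo "no evaluation script documented for adapter: $1" >&2
      return 1
      ;;
  esac
}
```

From the repository root, the result can be passed straight to bash, for example: CHECKPOINT_DIR=/path/to/this/snapshot ADAPTER=solver bash "$(eval_script_for solver)".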

Citation

@article{thawkar2026asksolvegenerate,
  title={Ask, Solve, Generate: Self-Evolving Unified Multimodal Understanding and Generation via Self-Consistency Rewards},
  author={Thawkar, Ritesh and Venkatraman, Shravan and Thawakar, Omkar and Shaker, Abdelrahman and Khan, Fahad and Cholakkal, Hisham and Khan, Salman and Anwer, Rao Muhammad},
  journal={arXiv preprint arXiv:2606.27376},
  year={2026}
}

License

These released adapters are provided under the Apache License 2.0. Users must also comply with the terms of the upstream BLIP3o base model and any third-party components used with it.
