Ask, Solve, Generate - BLIP3o-8B
This repository contains the released self-evolved BLIP3o-8B adapters for:
Ask, Solve, Generate: Self-Evolving Unified Multimodal Understanding and Generation via Self-Consistency Rewards
The adapters were trained with the Ask-Solve-Generate self-evolving recipe using a 10k-image unlabeled pool. The base model remains frozen; this release contains the role-specific LoRA adapters needed by the public codebase.
Links
- Paper: https://arxiv.org/abs/2606.27376
- Code: https://github.com/mbzuai-oryx/Ask-Solve-Generate
- Project page: https://mbzuai-oryx.github.io/Ask-Solve-Generate/
Contents
- solver/: LoRA adapter for visual understanding and self-consistency scoring.
- proposer/: LoRA adapter for generating image-grounded questions.
- generator/: LoRA adapter for generation conditioning.
- dit_lora/: LoRA adapter for the BLIP3o diffusion transformer.
- dit_lora_metadata.json: DiT LoRA configuration metadata.
- trainer_state.json: lightweight step metadata.
Training optimizer state, logs, generated samples, and the private data-construction pipeline are not included.
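Before pointing the evaluation scripts at a downloaded snapshot, it can help to confirm that the entries listed under Contents are actually present. The following is a minimal sketch, not part of the released codebase: `check_snapshot` is a hypothetical helper name, and it checks only for the existence of the directories and files named above, not their validity.

```shell
# Hypothetical helper (not part of the released codebase): report which of the
# entries listed under Contents are missing from a snapshot directory.
check_snapshot() {
    _cs_dir=$1
    _cs_missing=0
    # Adapter directories named in the Contents list.
    for _cs_d in solver proposer generator dit_lora; do
        if [ ! -d "$_cs_dir/$_cs_d" ]; then
            echo "missing directory: $_cs_d/" >&2
            _cs_missing=1
        fi
    done
    # Metadata files named in the Contents list.
    for _cs_f in dit_lora_metadata.json trainer_state.json; do
        if [ ! -f "$_cs_dir/$_cs_f" ]; then
            echo "missing file: $_cs_f" >&2
            _cs_missing=1
        fi
    done
    # Exit status 0 when everything is present, 1 otherwise.
    return "$_cs_missing"
}
```

Calling `check_snapshot /path/to/this/snapshot` prints any missing entries to stderr and returns a non-zero status if at least one is absent.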
Usage
Clone the public codebase, then pass the local path of this snapshot as CHECKPOINT_DIR.
git clone https://github.com/mbzuai-oryx/Ask-Solve-Generate.git
cd Ask-Solve-Generate
For understanding evaluation:
CHECKPOINT_DIR=/path/to/this/snapshot \
ADAPTER=solver \
bash BLIP3o/eval/understanding_eval_our.sh
For generation evaluation:
CHECKPOINT_DIR=/path/to/this/snapshot \
ADAPTER=generator \
bash BLIP3o/eval/geneval/generation_our.sh
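The two invocations above pair each ADAPTER value with one evaluation script. As a small sketch of that pairing, the hypothetical helper below (`eval_script_for` is an illustrative name, not something the codebase provides) maps an adapter name to the script shown for it here. It deliberately fails for any other adapter, since this README documents no evaluation script for them.

```shell
# Hypothetical helper (not part of the released codebase): map an ADAPTER
# value to the evaluation script this README pairs it with.
eval_script_for() {
    case "$1" in
        solver)
            echo "BLIP3o/eval/understanding_eval_our.sh"
            ;;
        generator)
            echo "BLIP3o/eval/geneval/generation_our.sh"
            ;;
        *)
            # No script is shown in this README for other adapters.
            echo "no evaluation script is documented here for adapter: $1" >&2
            return 1
            ;;
    esac
}
```

Run from the root of the Ask-Solve-Generate checkout, this could be combined with the documented variables as `CHECKPOINT_DIR=/path/to/this/snapshot ADAPTER=solver bash "$(eval_script_for solver)"`.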
Citation
@article{thawkar2026asksolvegenerate,
title={Ask, Solve, Generate: Self-Evolving Unified Multimodal Understanding and Generation via Self-Consistency Rewards},
author={Thawkar, Ritesh and Venkatraman, Shravan and Thawakar, Omkar and Shaker, Abdelrahman and Khan, Fahad and Cholakkal, Hisham and Khan, Salman and Anwer, Rao Muhammad},
journal={arXiv preprint arXiv:2606.27376},
year={2026}
}
License
These released adapters are provided under the Apache License 2.0. Users must also comply with the terms of the upstream BLIP3o base model and any third-party components used with it.
Base model
BLIP3o/BLIP3o-Model-8B