AlphaGRPO-RT2I

🤗 Models | 📄 Paper | 🌐 Project Page | 💻 GitHub | 🧩 Base Model

Model Summary

AlphaGRPO-RT2I is a PEFT LoRA adapter for BAGEL-7B-MoT, trained with AlphaGRPO for reasoning text-to-image generation.

This repository contains adapter weights only; load them together with the BAGEL-7B-MoT base model. The adapter uses LoRA rank 32 and alpha 64.
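To make the rank and alpha values concrete, here is a minimal sketch of what they imply under the standard LoRA formulation, where the low-rank update is scaled by alpha / rank. It assumes plain LoRA scaling (variants such as rsLoRA scale differently), and the 4096 x 4096 layer shape is a hypothetical figure for illustration, not a dimension taken from BAGEL.

```python
# Sketch of the standard LoRA update W' = W + (alpha / r) * (B @ A).
# Assumption: plain LoRA scaling, not a variant such as rsLoRA.

LORA_RANK = 32   # from the model card
LORA_ALPHA = 64  # from the model card


def lora_scaling(rank: int, alpha: float) -> float:
    """Return the multiplier applied to the low-rank update B @ A."""
    if rank <= 0:
        raise ValueError("rank must be positive")
    return alpha / rank


def lora_param_count(d_in: int, d_out: int, rank: int) -> int:
    """Count adapter parameters for one linear layer.

    A has shape (rank, d_in) and B has shape (d_out, rank).
    """
    return rank * (d_in + d_out)


scaling = lora_scaling(LORA_RANK, LORA_ALPHA)

# Hypothetical layer shape, chosen only to illustrate the arithmetic.
params = lora_param_count(d_in=4096, d_out=4096, rank=LORA_RANK)
full = 4096 * 4096

print(f"scaling factor: {scaling}")
print(f"adapter params per layer: {params} ({params / full:.2%} of the full matrix)")
```

With rank 32 and alpha 64 the update is scaled by 2.0, so merging the adapter adds twice the raw product of the two low-rank factors to each adapted weight.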

Usage

Set the adapter path when running AlphaGRPO/BAGEL inference or evaluation:

export BAGEL_LORA_PATH=/path/to/AlphaGRPO-rt2i
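Before launching a long run, it can help to confirm that the variable points at a directory that actually holds the adapter. The sketch below is a hypothetical pre-flight check, not part of the AlphaGRPO codebase: it assumes the adapter follows the usual PEFT layout with an `adapter_config.json` file containing `r` and `lora_alpha` keys, and the helper name `check_adapter_dir` is invented for illustration. How the AlphaGRPO scripts themselves consume `BAGEL_LORA_PATH` is not documented here.

```python
import json
import os
from pathlib import Path


def check_adapter_dir(env_var: str = "BAGEL_LORA_PATH") -> dict:
    """Read the adapter directory from env_var and return its parsed config.

    Hypothetical helper: assumes the standard PEFT file name
    adapter_config.json inside the adapter directory.
    """
    raw = os.environ.get(env_var)
    if not raw:
        raise RuntimeError(f"{env_var} is not set")

    config_path = Path(raw) / "adapter_config.json"
    if not config_path.is_file():
        raise FileNotFoundError(f"missing {config_path}")

    with config_path.open(encoding="utf-8") as fh:
        return json.load(fh)


def summarize(config: dict) -> str:
    """Format the rank and alpha recorded in a PEFT-style config dict."""
    return f"rank={config.get('r')} alpha={config.get('lora_alpha')}"
```

Calling `summarize(check_adapter_dir())` after the `export` line should report the rank and alpha stored in the adapter config, which can then be compared against the values stated in the model summary.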

For installation, evaluation scripts, and full usage examples, please see the GitHub repository.

Citation

@inproceedings{huang2026alphagrpo,
  title={AlphaGRPO: Unlocking Self-Reflective Multimodal Generation in Unified Multimodal Models via Decompositional Verifiable Reward},
  author={Huang, Runhui and Wu, Jie and Yang, Rui and Liu, Zhe and Zhao, Hengshuang},
  booktitle={International Conference on Machine Learning (ICML)},
  year={2026}
}