MiniT2I Transformer
The MiniT2I diffusion transformer (MMJiT backbone) as a plain 🤗 transformers
model. It loads with trust_remote_code and needs neither diffusers nor a MiniT2I install.
import torch
from transformers import AutoModel

transformer = AutoModel.from_pretrained(
    "<user>/minit2i-transformer", trust_remote_code=True, dtype=torch.bfloat16
)
The model predicts the clean image from a noised input, conditioned on
flan-t5-large text embeddings; its forward call is forward(img, context, attn_mask).
This repo contains only the transformer. The flow-matching schedule,
classifier-free guidance, and sampling loop are provided by the minit2i library.
To generate images, pair this model with a text encoder repo (e.g. <user>/text_encoder).
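To illustrate how a clean-image prediction fits into flow-matching sampling with classifier-free guidance, here is a minimal, framework-agnostic sketch in NumPy. It is not the minit2i sampler. It assumes the interpolation z_t = t * x + (1 - t) * noise (t = 0 is pure noise, t = 1 is the clean image), a uniform Euler schedule, and a hypothetical `model(z, t, context)` callable that returns the predicted clean image. The function name, its arguments, and the default values are all illustrative.

```python
import numpy as np


def sample(model, context, null_context, shape, steps=50, guidance_scale=4.0, seed=0):
    """Euler sampler for a clean-image (x-prediction) flow-matching model.

    Assumed convention (not confirmed by the model card):
        z_t = t * x + (1 - t) * noise
    so the velocity is v = x - noise = (x - z_t) / (1 - t).
    `model` is a hypothetical callable: model(z, t, context) -> predicted clean image.
    """
    rng = np.random.default_rng(seed)
    z = rng.standard_normal(shape)          # start from pure noise at t = 0
    ts = np.linspace(0.0, 1.0, steps + 1)   # uniform time grid (an assumption)

    for t, t_next in zip(ts[:-1], ts[1:]):
        # Two predictions per step: with the prompt and with an "empty" prompt.
        x_cond = model(z, t, context)
        x_uncond = model(z, t, null_context)

        # Classifier-free guidance: extrapolate away from the unconditional prediction.
        x_pred = x_uncond + guidance_scale * (x_cond - x_uncond)

        # Convert the clean-image prediction to a velocity and take one Euler step.
        # t < 1 for every step taken here, so the denominator is never zero.
        v = (x_pred - z) / (1.0 - t)
        z = z + (t_next - t) * v

    return z
```

With the real model, `model` would be a thin wrapper around the transformer's forward(img, context, attn_mask) call. The card does not say how the timestep is supplied, so that wrapper is left hypothetical here.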
Citation
@misc{minit2i2026,
  title  = {MiniT2I: A Minimalist Baseline for Text-to-Image Synthesis},
  author = {Wang, Xianbang and Zhao, Hanhong and Lu, Yiyang and Zhou, Kangyang and Ma, Linrui and He, Kaiming},
  year   = {2026},
  url    = {https://peppaking8.github.io/#/post/minit2i}
}