---
license: apache-2.0
language:
- en
pipeline_tag: image-to-image
tags:
- Diffusion Transformer
- Image Editing
- Scepter
- ACE
---

ACE: All-round Creator and Editor Following Instructions via Diffusion Transformer

Tongyi Lab, Alibaba Group

[**Paper**](https://arxiv.org/abs/2410.00086) **|** [**Project Page**](https://ali-vilab.github.io/ace-page/) **|** [**Code**](https://github.com/ali-vilab/ACE)

ACE is a unified foundation model framework that supports a wide range of visual generation tasks. It defines a Condition Unit (CU) that unifies multi-modal inputs across different tasks, and a long-context CU that carries historical context into visual generation, paving the way for ChatGPT-like dialog systems in visual generation.
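
To make the CU idea concrete, here is a minimal Python sketch. It assumes a CU bundles one text instruction with optional image and mask inputs, and that a long-context CU is an ordered history of such units. The class names `ConditionUnit` and `LongContextConditionUnit`, their fields, and the file names are hypothetical and chosen for illustration; they are not the API of the ACE codebase.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class ConditionUnit:
    """One turn of input: a text instruction plus optional image/mask pairs.

    Hypothetical structure for illustration; not taken from the ACE code.
    """
    instruction: str
    images: List[str] = field(default_factory=list)            # image paths (placeholder type)
    masks: List[Optional[str]] = field(default_factory=list)   # one optional mask per image

    def __post_init__(self) -> None:
        # If no masks are given, treat every image as unmasked.
        if not self.masks:
            self.masks = [None] * len(self.images)
        if len(self.masks) != len(self.images):
            raise ValueError("each image needs exactly one mask entry (or None)")


@dataclass
class LongContextConditionUnit:
    """Ordered history of turns; the last turn is the current request."""
    turns: List[ConditionUnit] = field(default_factory=list)

    def append(self, unit: ConditionUnit) -> None:
        self.turns.append(unit)

    def flatten(self) -> Tuple[List[str], List[Tuple[str, Optional[str]]]]:
        """Return all instructions and all (image, mask) pairs in turn order."""
        instructions = [u.instruction for u in self.turns]
        visuals = [(img, m) for u in self.turns for img, m in zip(u.images, u.masks)]
        return instructions, visuals


# Two-turn dialog: generate an image, then edit the result with a mask.
history = LongContextConditionUnit()
history.append(ConditionUnit("Generate a cat on a sofa"))
history.append(ConditionUnit("Make the sofa red", images=["turn1.png"], masks=["sofa_mask.png"]))
instructions, visuals = history.flatten()
```

Keeping earlier turns in one ordered structure is what would let a later instruction such as "make the sofa red" refer back to an image produced in a previous turn.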

## BibTeX

```bibtex
@article{han2024ace,
  title={ACE: All-round Creator and Editor Following Instructions via Diffusion Transformer},
  author={Han, Zhen and Jiang, Zeyinzi and Pan, Yulin and Zhang, Jingfeng and Mao, Chaojie and Xie, Chenwei and Liu, Yu and Zhou, Jingren},
  journal={arXiv preprint arXiv:2410.00086},
  year={2024}
}
```