wuxiaojun committed
Commit e551372
1 Parent(s): fd6f6ce

init readme

Files changed (1):
1. README.md (+13, -3)
README.md CHANGED
@@ -61,7 +61,7 @@ Our machine evaluation involved a comprehensive comparison of various models. Th

 As shown in the figures below, a comparison of different models in Chinese and English text-to-image generation performance is presented. The XL version models, such as SD-XL and Taiyi-XL, show significant improvements over the 1.5 version models like SD-v1.5 and Alt-Diffusion. DALL-E 3 is renowned for its vibrant colors and its ability to closely follow text prompts, setting a high standard. Our Taiyi-XL model, with its photographic style, closely matches the performance of Midjourney and excels in bilingual (Chinese and English) text-to-image generation.

-Although Taiyi-XL may not yet rival commercial models, it is considerably better than current bilingual open-source models. We believe the gap between our model and commercial models is mainly due to differences in the quantity, quality, and diversity of training data. Our model is trained only on copyright-compliant image-text data. As is well known, copyright remains the biggest problem for text-to-image and AIGC models.
+Although Taiyi-XL may not yet rival commercial models, it is considerably better than current bilingual open-source models. We believe the gap between our model and commercial models is mainly due to differences in the quantity, quality, and diversity of training data. Our model is trained only on academic datasets and copyright-compliant image-text data. As is well known, copyright remains the biggest problem for text-to-image and AIGC models. For Chinese portraits and other Chinese elements, we also hope the open-source community will further fine-tune the model on additional data.

 Although Taiyi-XL may not yet rival commercial models, it excels among current bilingual open-source models. The gap with commercial models is mainly due to differences in the quantity, quality, and diversity of training data. Our model is trained exclusively on copyright-compliant image-text data. As is well known, copyright issues remain the biggest challenge in text-to-image and AI-generated content (AIGC) models.

@@ -77,9 +77,19 @@ We also evaluated the impact of using Latent Consistency Models (LCM) to acceler

 ## Citation

-If you use our model in your work, you can cite our [overview paper](https://arxiv.org/abs/2209.02970):
+If you use our model in your work, you can cite our paper:

-If you are using the resource for your work, please cite our [paper](https://arxiv.org/abs/2209.02970):
+If you are using the resource for your work, please cite our paper:
+```text
+@misc{wu2024taiyidiffusionxl,
+  title={Taiyi-Diffusion-XL: Advancing Bilingual Text-to-Image Generation with Large Vision-Language Model Support},
+  author={Xiaojun Wu and Dixiang Zhang and Ruyi Gan and Junyu Lu and Ziwei Wu and Renliang Sun and Jiaxing Zhang and Pingjian Zhang and Yan Song},
+  year={2024},
+  eprint={2401.14688},
+  archivePrefix={arXiv},
+  primaryClass={cs.CL}
+}
+```

 ```text
 @article{fengshenbang,