wuxiaojun committed
Commit e551372
1 Parent(s): fd6f6ce

init readme

Files changed (1):
1. README.md (+13, -3)
README.md CHANGED
@@ -61,7 +61,7 @@ Our machine evaluation involved a comprehensive comparison of various models. Th

 As shown in the figures below, a comparison of different models in Chinese and English text-to-image generation performance is presented. The XL version models, such as SD-XL and Taiyi-XL, show significant improvements over the 1.5 version models like SD-v1.5 and Alt-Diffusion. DALL-E 3 is renowned for its vibrant colors and its ability to closely follow text prompts, setting a high standard. Our Taiyi-XL model, with its photographic style, closely matches the performance of Midjourney and excels in bilingual (Chinese and English) text-to-image generation.

-Although Taiyi-XL may not yet rival commercial models, it is considerably better than current bilingual open-source models. We believe the gap between our model and commercial models is mainly due to differences in the quantity, quality, and diversity of training data. Our model is trained only on copyright-compliant image-text data. As is well known, copyright remains the biggest problem for text-to-image and AIGC models.
+Although Taiyi-XL may not yet rival commercial models, it is considerably better than current bilingual open-source models. We believe the gap between our model and commercial models is mainly due to differences in the quantity, quality, and diversity of training data. Our model is trained only on academic datasets and copyright-compliant image-text data. As is well known, copyright remains the biggest problem for text-to-image and AIGC models. For Chinese portraits and other Chinese elements, we also hope the open-source community will further fine-tune the model on additional data.

 Although Taiyi-XL may not yet rival commercial models, it excels among current bilingual open-source models. The gap with commercial models is mainly due to differences in the quantity, quality, and diversity of training data. Our model is trained exclusively on copyright-compliant image-text data. As is well known, copyright issues remain the biggest challenge in text-to-image and AI-generated content (AIGC) models.

@@ -77,9 +77,19 @@ We also evaluated the impact of using Latent Consistency Models (LCM) to acceler

 ## Citation

-If you use our model in your work, you can cite our [overview paper](https://arxiv.org/abs/2209.02970):
+If you use our model in your work, you can cite our paper:

-If you are using the resource for your work, please cite our [paper](https://arxiv.org/abs/2209.02970):
+If you are using the resource for your work, please cite our paper:
+```text
+@misc{wu2024taiyidiffusionxl,
+  title={Taiyi-Diffusion-XL: Advancing Bilingual Text-to-Image Generation with Large Vision-Language Model Support},
+  author={Xiaojun Wu and Dixiang Zhang and Ruyi Gan and Junyu Lu and Ziwei Wu and Renliang Sun and Jiaxing Zhang and Pingjian Zhang and Yan Song},
+  year={2024},
+  eprint={2401.14688},
+  archivePrefix={arXiv},
+  primaryClass={cs.CL}
+}
+```

 ```text
 @article{fengshenbang,