wuxiaojun committed on
Commit
5da2d8e
1 Parent(s): 9f3ebd3

init readme

Files changed (1)
  1. README.md +3 -3
README.md CHANGED
@@ -16,7 +16,7 @@ inference: False
 
  ![不同风格、不同prompt的生成效果展示](./imgs/high-resolution.jpg)
 
- 文生图模型如谷歌的Imagen、OpenAI的DALL-E 3和Stability AI的Stable Diffusion引领了AIGC和数字艺术创作的新浪潮。然而,基于SD v1.5的中文文生图模型,如taiyi-diffusion-v0.1和alt-diffusion的效果仍然一般。中国的许多AI绘画平台仅支持英文,或依赖中译英的翻译工具。目前的开源文生图模型主要支持英文,双语支持有限。我们的工作,Taiyi-Diffusion-XL(Taiyi-XL),在这些发展的基础上,专注于保留英文理解能力的同时增强中文文生图生成能力,更好地支持双语文生图。
+ 文生图模型如谷歌的Imagen、OpenAI的DALL-E 3和Stability AI的Stable Diffusion引领了AIGC和数字艺术创作的新浪潮。然而,基于SD v1.5的中文文生图模型,如Taiyi-Diffusion-v0.1和Alt-Diffusion的效果仍然一般。中国的许多AI绘画平台仅支持英文,或依赖中译英的翻译工具。目前的开源文生图模型主要支持英文,双语支持有限。我们的工作,Taiyi-Diffusion-XL(Taiyi-XL),在这些发展的基础上,专注于保留英文理解能力的同时增强中文文生图生成能力,更好地支持双语文生图。
 
  The surge in text-to-image models like Google's Imagen, OpenAI's DALL-E 3, and Stability AI's Stable Diffusion has revolutionized digital art creation. However, the effectiveness of Chinese text-to-image models, such as taiyi-diffusion-v0.1 and alt-diffusion based on SD v1.5, remains moderate. Many AI art platforms in China support only English or rely on Chinese-to-English translation tools. Current open-source text-to-image models predominantly support English, with limited bilingual capabilities. Our work, Taiyi-Diffusion-XL (Taiyi-XL), builds on these developments, focusing on enhancing Chinese text-to-image generation while retaining English proficiency, addressing the unique challenges of bilingual language processing.
 
@@ -55,11 +55,11 @@ Our machine evaluation involved a comprehensive comparison of various models. Th
 
  ## 人类偏好评估 Human Preference Evaluation
 
- 如下图所示,比较了不同模型在中英文文生图生成方面的表现。XL版本模型,如SD-XL和Taiyi-XL,在1.5版本模型如SD-v1.5和Alt-Diffusion上显示出显著改进。DALL-E 3以其生动的色彩和紧跟文本提示的能力而著称,设定了高标准。我们的Taiyi-XL模型以其摄影风格紧密匹配Midjourney的表现,并在双语(中英文)文生图生成方面表现出色。
+ 如下图所示,比较了不同模型在中英文文生图生成方面的表现。XL版本模型,如SD-XL和Taiyi-XL,在1.5版本模型如SD-v1.5和Alt-Diffusion上显示出显著改进。DALL-E 3以其生动的色彩和prompt-following的能力而著称。Taiyi-XL模型偏向生成摄影风格的图片,与Midjourney较为类似,但是Taiyi-XL并在双语(中英文)文生图生成方面表现更出色。
 
  As shown in the figures below, a comparison of different models in Chinese and English text-to-image generation performance is presented. The XL version models, such as SD-XL and Taiyi-XL, show significant improvements over the 1.5 version models like SD-v1.5 and Alt-Diffusion. DALL-E 3 is renowned for its vibrant colors and its ability to closely follow text prompts, setting a high standard. Our Taiyi-XL model, with its photographic style, closely matches the performance of Midjourney and excels in bilingual (Chinese and English) text-to-image generation.
 
- 尽管Taiyi-XL可能还未能与商业模型相媲美,但它在当前双语开源模型中表现卓越。我们任务我们模型与商业模型的差距主要归因于训练数据的数量、质量和多样性的差异。我们的模型仅使用符合版权要求的图文数据进行训练。正如大家所知的,版权问题仍然是文生图和AIGC模型最大的问题。
+ 尽管Taiyi-XL可能还未能与商业模型相媲美,但它比当前双语开源模型优越不少。我们认为我们模型与商业模型的差距主要归因于训练数据的数量、质量和多样性的差异。我们的模型仅使用符合版权要求的图文数据进行训练。正如大家所知的,版权问题仍然是文生图和AIGC模型最大的问题。
 
  Although Taiyi-XL may not yet rival commercial models, it excels among current bilingual open-source models. The gap with commercial models is mainly due to differences in the quantity, quality, and diversity of training data. Our model is trained exclusively on copyright-compliant image-text data. As is well known, copyright issues remain the biggest challenge in text-to-image and AI-generated content (AIGC) models.
 