diff --git a/.gitattributes b/.gitattributes
index eb343f9f0f6bff5dcb70a1b6ef74f00e4121bd9f..267f364ba162717db9bd792ae19a34c98cd1fd51 100644
--- a/.gitattributes
+++ b/.gitattributes
@@ -34,3 +34,7 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
 fc/sd_webui_masactrl_orig/resources/img/xyz_grid-0010-1508457017.png filter=lfs diff=lfs merge=lfs -text
+gf/sd-webui-rich-text/assets/color.png filter=lfs diff=lfs merge=lfs -text
+gf/sd-webui-rich-text/assets/font.png filter=lfs diff=lfs merge=lfs -text
+gf/sd-webui-rich-text/assets/footnote.png filter=lfs diff=lfs merge=lfs -text
+gf/sd-webui-rich-text/assets/size.png filter=lfs diff=lfs merge=lfs -text
diff --git a/gf/LightDiffusionFlow/.gitignore b/gf/LightDiffusionFlow/.gitignore
new file mode 100644
index 0000000000000000000000000000000000000000..4f1a899b9288a2855487058b3902225955da3248
--- /dev/null
+++ b/gf/LightDiffusionFlow/.gitignore
@@ -0,0 +1,3 @@
+.idea
+__pycache__/
+dev
\ No newline at end of file
diff --git a/gf/LightDiffusionFlow/CONTRIBUTING.md b/gf/LightDiffusionFlow/CONTRIBUTING.md
new file mode 100644
index 0000000000000000000000000000000000000000..ef41e992189d2d49508923a09e747d26dd40e95f
--- /dev/null
+++ b/gf/LightDiffusionFlow/CONTRIBUTING.md
@@ -0,0 +1,31 @@
+### Contributing
+
+Contributions to LightDiffusionFlow are welcome: you can give us feedback and suggestions, report bugs, or contribute code. Please read the following guidelines before contributing.
+
+#### Asking questions
+
+For simple questions, such as how to use the extension or other Stable Diffusion related technical questions, please ask in the official WeChat group; that is more efficient.
+
+#### About issues
+
+If you have feedback or suggestions on the current state of LightDiffusionFlow, or you have found a bug, feel free to raise it through an issue. Before filing an issue, please read the following guidelines.
+
+- Search past issues to check whether it has already been reported, to avoid duplicates;
+- Confirm whether the problem you encountered has already been fixed in the latest version;
+- If you are reporting a bug, please provide conditions to reproduce it whenever possible:
+  - Operating system and browser version
+  - Stable Diffusion WebUI version and the versions of related extensions (if any)
+  - LightDiffusionFlow extension version
+  - Observed behavior
+  - Steps to reproduce
+  - Whether it reproduces every time
+  - Screenshots, if possible
+- If your problem has been resolved, please close your issue.
+
+#### Contributors
+
+Many thanks to the following people for their contributions to LightDiffusionFlow:
+
+- Big thanks to [Hali](https://github.com/ThisHaliHali) for the inspiration, suggestions, and all kinds of support during development!
+
+- Our project is a derivative work based on [Ilian Iliev](https://github.com/ilian6806)'s [stable-diffusion-webui-state](https://github.com/ilian6806/stable-diffusion-webui-state) project, which has been a great help to us. Thank you, Ilian Iliev!
diff --git a/gf/LightDiffusionFlow/ChangeLog.md b/gf/LightDiffusionFlow/ChangeLog.md
new file mode 100644
index 0000000000000000000000000000000000000000..63eb27ecbf83aa6e14824a4fd6708d45dc54bac4
--- /dev/null
+++ b/gf/LightDiffusionFlow/ChangeLog.md
@@ -0,0 +1,126 @@
+## 2.1.4
+
+### Features:
+ * Added local saving of Flow configurations; saved Flow files can be picked directly from a dropdown list. (Before importing or using a new flow, refreshing the page is recommended; the extension currently has no way to reset parameters.)
+ * Added recording of extra LoRA model data, so that missing LoRA models can be searched on Civitai at import time.
+ * Improved fuzzy model matching; models in subfolders are now matched as well.
+ * Added a settings-page option for the default flow save path, plus the equivalent command-line option "--local-flows-path".
+
+### Bug Fixes:
+ * Fixed some known cases where parameters failed to apply, such as ControlNet's "Canny Low/High Threshold" parameters in Canny mode.
+
+## 2.1.3
+
+### Features:
+ * Warning for missing LoRA models.
+ * Added two run modes, Core and All. Core mode supports only the WebUI's basic parameters plus ControlNet parameters; All mode supports as many on-page parameters as possible, at the cost of occasionally freezing the UI in an endless spinner.
+
+### Bug Fixes:
+ * Added a correction step for invalid parameters, handling wrongly saved indices for the ControlNet tab.
+ * Fixed some known issues.
+ * Fixed some parameters that were lost in img2img mode.
+
+## 2.1.2
+
+### Features:
+ * Added elem_id to UI components.
+ * Flow files can be given a custom name on export.
+ * ControlNet models support fuzzy matching.
+ * If a ControlNet model is missing locally, a download link is provided.
+ * Warning for missing extensions.
+ * Added Chinese/English language switching.
+
+### Bug Fixes:
+ * Improved saving of image components; all image components with an elem_id are supported.
+   Due to a gradio component bug, saving components without an elem_id is temporarily unsupported.
+   The newest gradio release appears to have fixed this, but the gradio bundled with the 秋叶 launcher still seems to be the old version; to be followed up.
+   Issue link: https://github.com/gradio-app/gradio/issues/5548
+ * Fixed a bug in saving refiner parameters.
+ * Fixed wrong save indices for extension components that contain tabs.
+ * Improved import messages.
+ * Improved parameter-saving logic.
+ * Earlier versions saved redundant parameters; this is fixed in the current version, but note that a "missing extension" warning does not necessarily mean the extension is really required.
+
+## 2.1.1
+
+### Bug Fixes:
+ * Resolved compatibility issues with sd webui versions earlier than v1.6.0.
+
+## 2.1.0
+
+### Bug Fixes:
+ * Component numbering is now generated in Python the same way as in JS; import logic was optimized and some dropdown components are now handled in Python, reducing the chance of the UI panel spinning forever.
+ * Fixed some known issues.
+
+### Features:
+ * Moved the UI panel to the right side, below the image info bar.
+
+## 2.0.4
+
+### Bug Fixes:
+ * Fixed saving of some extension parameters located outside tab panels.
+ * Support parsing png info from jpeg images.
+ * Fixed some known issues.
+
+## 2.0.3
+
+### Bug Fixes:
+ * Fixed a bug with the model comparison option.
+ * Fixed some known issues.
+ * Finalized the extension name and file extension.
+
+### Features:
+ * When importing a flow file, related models are automatically searched on Civitai; if a matching model is found, a download link is provided.
+
+## 2.0.2
+
+### Bug Fixes:
+ * Fixed an error caused by the ControlNet max_models_num setting.
+ * Fixed some known issues.
+
+### Features:
+ * Supports the latest WebUI v1.6.
+ * When setting a model, the hash and the file name are matched independently; matching either one is enough.
+
+## 2.0.1
+
+### Bug Fixes:
+ * Fixed some known issues.
+
+### Features:
+ * Added an HTTP API with support for setting a preload file.
+
+## 2.0.0
+
+### Bug Fixes:
+ * Better compatibility with localization files.
+
+### Features:
+ * Rewrote the third-party extension support code; all extension parameters are now supported generically (images excluded).
+
+## 1.2.2
+
+### Features:
+ * Extension messages now use English throughout; the documentation was reorganized and a Chinese version was added.
+
+## 1.2.1
+
+### Bug Fixes:
+ * Fixed some keys being set incorrectly when parsing and converting png_info.
+
+## 1.2.0
+
+### Features:
+ * Added support for parsing image files, reading their png info and restoring it to the UI.
+
+### Bug Fixes:
+ * Fixed reading errors with some localization files; compatibility with localization files is now better.
+
+## 1.1.3
+
+### Bug Fixes:
+ * Fixed some options not being applied after "dropdown menus replaced radio lists".
+ * Resolved some compatibility issues on the venus platform and the 秋叶 launcher; with the 秋叶 launcher, the "启动云端页面汉化" (cloud page localization) option must be turned off before starting.
+ * Fixed a missed save when an image option already set on the page was overwritten with another image.
\ No newline at end of file
diff --git a/gf/LightDiffusionFlow/LICENSE b/gf/LightDiffusionFlow/LICENSE
new file mode 100644
index 0000000000000000000000000000000000000000..16a64428d1abd8e9f0a28f3c3ac26215a44061e5
--- /dev/null
+++ b/gf/LightDiffusionFlow/LICENSE
@@ -0,0 +1,237 @@
+Tencent is pleased to support the open source community by making LightDiffusionFlow (extension for Stable Diffusion) available.
+
+Copyright (C) 2023 THL A29 Limited, a Tencent company. All rights reserved. The below software in this distribution may have been modified by THL A29 Limited ("Tencent Modifications"). All Tencent Modifications are Copyright (C) THL A29 Limited.
+
+LightDiffusionFlow (extension for Stable Diffusion) is licensed under the GNU AFFERO GENERAL PUBLIC LICENSE v3 except for the third-party components listed below.
+ + +Terms of the GNU AFFERO GENERAL PUBLIC LICENSE v3: +-------------------------------------------------------------------- +GNU AFFERO GENERAL PUBLIC LICENSE +Version 3, 19 November 2007 + +Copyright © 2007 Free Software Foundation, Inc. +Everyone is permitted to copy and distribute verbatim copies of this license document, but changing it is not allowed. + +Preamble +The GNU Affero General Public License is a free, copyleft license for software and other kinds of works, specifically designed to ensure cooperation with the community in the case of network server software. + +The licenses for most software and other practical works are designed to take away your freedom to share and change the works. By contrast, our General Public Licenses are intended to guarantee your freedom to share and change all versions of a program--to make sure it remains free software for all its users. + +When we speak of free software, we are referring to freedom, not price. Our General Public Licenses are designed to make sure that you have the freedom to distribute copies of free software (and charge for them if you wish), that you receive source code or can get it if you want it, that you can change the software or use pieces of it in new free programs, and that you know you can do these things. + +Developers that use our General Public Licenses protect your rights with two steps: (1) assert copyright on the software, and (2) offer you this License which gives you legal permission to copy, distribute and/or modify the software. + +A secondary benefit of defending all users' freedom is that improvements made in alternate versions of the program, if they receive widespread use, become available for other developers to incorporate. Many developers of free software are heartened and encouraged by the resulting cooperation. However, in the case of software used on network servers, this result may fail to come about. The GNU General Public License permits making a modified version and letting the public access it on a server without ever releasing its source code to the public. + +The GNU Affero General Public License is designed specifically to ensure that, in such cases, the modified source code becomes available to the community. It requires the operator of a network server to provide the source code of the modified version running there to the users of that server. Therefore, public use of a modified version, on a publicly accessible server, gives the public access to the source code of the modified version. + +An older license, called the Affero General Public License and published by Affero, was designed to accomplish similar goals. This is a different license, not a version of the Affero GPL, but Affero has released a new version of the Affero GPL which permits relicensing under this license. + +The precise terms and conditions for copying, distribution and modification follow. + +TERMS AND CONDITIONS +0. Definitions. +"This License" refers to version 3 of the GNU Affero General Public License. + +"Copyright" also means copyright-like laws that apply to other kinds of works, such as semiconductor masks. + +"The Program" refers to any copyrightable work licensed under this License. Each licensee is addressed as "you". "Licensees" and "recipients" may be individuals or organizations. + +To "modify" a work means to copy from or adapt all or part of the work in a fashion requiring copyright permission, other than the making of an exact copy. 
The resulting work is called a "modified version" of the earlier work or a work "based on" the earlier work. + +A "covered work" means either the unmodified Program or a work based on the Program. + +To "propagate" a work means to do anything with it that, without permission, would make you directly or secondarily liable for infringement under applicable copyright law, except executing it on a computer or modifying a private copy. Propagation includes copying, distribution (with or without modification), making available to the public, and in some countries other activities as well. + +To "convey" a work means any kind of propagation that enables other parties to make or receive copies. Mere interaction with a user through a computer network, with no transfer of a copy, is not conveying. + +An interactive user interface displays "Appropriate Legal Notices" to the extent that it includes a convenient and prominently visible feature that (1) displays an appropriate copyright notice, and (2) tells the user that there is no warranty for the work (except to the extent that warranties are provided), that licensees may convey the work under this License, and how to view a copy of this License. If the interface presents a list of user commands or options, such as a menu, a prominent item in the list meets this criterion. + +1. Source Code. +The "source code" for a work means the preferred form of the work for making modifications to it. "Object code" means any non-source form of a work. + +A "Standard Interface" means an interface that either is an official standard defined by a recognized standards body, or, in the case of interfaces specified for a particular programming language, one that is widely used among developers working in that language. + +The "System Libraries" of an executable work include anything, other than the work as a whole, that (a) is included in the normal form of packaging a Major Component, but which is not part of that Major Component, and (b) serves only to enable use of the work with that Major Component, or to implement a Standard Interface for which an implementation is available to the public in source code form. A "Major Component", in this context, means a major essential component (kernel, window system, and so on) of the specific operating system (if any) on which the executable work runs, or a compiler used to produce the work, or an object code interpreter used to run it. + +The "Corresponding Source" for a work in object code form means all the source code needed to generate, install, and (for an executable work) run the object code and to modify the work, including scripts to control those activities. However, it does not include the work's System Libraries, or general-purpose tools or generally available free programs which are used unmodified in performing those activities but which are not part of the work. For example, Corresponding Source includes interface definition files associated with source files for the work, and the source code for shared libraries and dynamically linked subprograms that the work is specifically designed to require, such as by intimate data communication or control flow between those subprograms and other parts of the work. + +The Corresponding Source need not include anything that users can regenerate automatically from other parts of the Corresponding Source. + +The Corresponding Source for a work in source code form is that same work. + +2. Basic Permissions. 
+All rights granted under this License are granted for the term of copyright on the Program, and are irrevocable provided the stated conditions are met. This License explicitly affirms your unlimited permission to run the unmodified Program. The output from running a covered work is covered by this License only if the output, given its content, constitutes a covered work. This License acknowledges your rights of fair use or other equivalent, as provided by copyright law. + +You may make, run and propagate covered works that you do not convey, without conditions so long as your license otherwise remains in force. You may convey covered works to others for the sole purpose of having them make modifications exclusively for you, or provide you with facilities for running those works, provided that you comply with the terms of this License in conveying all material for which you do not control copyright. Those thus making or running the covered works for you must do so exclusively on your behalf, under your direction and control, on terms that prohibit them from making any copies of your copyrighted material outside their relationship with you. + +Conveying under any other circumstances is permitted solely under the conditions stated below. Sublicensing is not allowed; section 10 makes it unnecessary. + +3. Protecting Users' Legal Rights From Anti-Circumvention Law. +No covered work shall be deemed part of an effective technological measure under any applicable law fulfilling obligations under article 11 of the WIPO copyright treaty adopted on 20 December 1996, or similar laws prohibiting or restricting circumvention of such measures. + +When you convey a covered work, you waive any legal power to forbid circumvention of technological measures to the extent such circumvention is effected by exercising rights under this License with respect to the covered work, and you disclaim any intention to limit operation or modification of the work as a means of enforcing, against the work's users, your or third parties' legal rights to forbid circumvention of technological measures. + +4. Conveying Verbatim Copies. +You may convey verbatim copies of the Program's source code as you receive it, in any medium, provided that you conspicuously and appropriately publish on each copy an appropriate copyright notice; keep intact all notices stating that this License and any non-permissive terms added in accord with section 7 apply to the code; keep intact all notices of the absence of any warranty; and give all recipients a copy of this License along with the Program. + +You may charge any price or no price for each copy that you convey, and you may offer support or warranty protection for a fee. + +5. Conveying Modified Source Versions. +You may convey a work based on the Program, or the modifications to produce it from the Program, in the form of source code under the terms of section 4, provided that you also meet all of these conditions: + +a) The work must carry prominent notices stating that you modified it, and giving a relevant date. +b) The work must carry prominent notices stating that it is released under this License and any conditions added under section 7. This requirement modifies the requirement in section 4 to "keep intact all notices". +c) You must license the entire work, as a whole, under this License to anyone who comes into possession of a copy. 
This License will therefore apply, along with any applicable section 7 additional terms, to the whole of the work, and all its parts, regardless of how they are packaged. This License gives no permission to license the work in any other way, but it does not invalidate such permission if you have separately received it. +d) If the work has interactive user interfaces, each must display Appropriate Legal Notices; however, if the Program has interactive interfaces that do not display Appropriate Legal Notices, your work need not make them do so. +A compilation of a covered work with other separate and independent works, which are not by their nature extensions of the covered work, and which are not combined with it such as to form a larger program, in or on a volume of a storage or distribution medium, is called an "aggregate" if the compilation and its resulting copyright are not used to limit the access or legal rights of the compilation's users beyond what the individual works permit. Inclusion of a covered work in an aggregate does not cause this License to apply to the other parts of the aggregate. + +6. Conveying Non-Source Forms. +You may convey a covered work in object code form under the terms of sections 4 and 5, provided that you also convey the machine-readable Corresponding Source under the terms of this License, in one of these ways: + +a) Convey the object code in, or embodied in, a physical product (including a physical distribution medium), accompanied by the Corresponding Source fixed on a durable physical medium customarily used for software interchange. +b) Convey the object code in, or embodied in, a physical product (including a physical distribution medium), accompanied by a written offer, valid for at least three years and valid for as long as you offer spare parts or customer support for that product model, to give anyone who possesses the object code either (1) a copy of the Corresponding Source for all the software in the product that is covered by this License, on a durable physical medium customarily used for software interchange, for a price no more than your reasonable cost of physically performing this conveying of source, or (2) access to copy the Corresponding Source from a network server at no charge. +c) Convey individual copies of the object code with a copy of the written offer to provide the Corresponding Source. This alternative is allowed only occasionally and noncommercially, and only if you received the object code with such an offer, in accord with subsection 6b. +d) Convey the object code by offering access from a designated place (gratis or for a charge), and offer equivalent access to the Corresponding Source in the same way through the same place at no further charge. You need not require recipients to copy the Corresponding Source along with the object code. If the place to copy the object code is a network server, the Corresponding Source may be on a different server (operated by you or a third party) that supports equivalent copying facilities, provided you maintain clear directions next to the object code saying where to find the Corresponding Source. Regardless of what server hosts the Corresponding Source, you remain obligated to ensure that it is available for as long as needed to satisfy these requirements. +e) Convey the object code using peer-to-peer transmission, provided you inform other peers where the object code and Corresponding Source of the work are being offered to the general public at no charge under subsection 6d. 
+A separable portion of the object code, whose source code is excluded from the Corresponding Source as a System Library, need not be included in conveying the object code work. + +A "User Product" is either (1) a "consumer product", which means any tangible personal property which is normally used for personal, family, or household purposes, or (2) anything designed or sold for incorporation into a dwelling. In determining whether a product is a consumer product, doubtful cases shall be resolved in favor of coverage. For a particular product received by a particular user, "normally used" refers to a typical or common use of that class of product, regardless of the status of the particular user or of the way in which the particular user actually uses, or expects or is expected to use, the product. A product is a consumer product regardless of whether the product has substantial commercial, industrial or non-consumer uses, unless such uses represent the only significant mode of use of the product. + +"Installation Information" for a User Product means any methods, procedures, authorization keys, or other information required to install and execute modified versions of a covered work in that User Product from a modified version of its Corresponding Source. The information must suffice to ensure that the continued functioning of the modified object code is in no case prevented or interfered with solely because modification has been made. + +If you convey an object code work under this section in, or with, or specifically for use in, a User Product, and the conveying occurs as part of a transaction in which the right of possession and use of the User Product is transferred to the recipient in perpetuity or for a fixed term (regardless of how the transaction is characterized), the Corresponding Source conveyed under this section must be accompanied by the Installation Information. But this requirement does not apply if neither you nor any third party retains the ability to install modified object code on the User Product (for example, the work has been installed in ROM). + +The requirement to provide Installation Information does not include a requirement to continue to provide support service, warranty, or updates for a work that has been modified or installed by the recipient, or for the User Product in which it has been modified or installed. Access to a network may be denied when the modification itself materially and adversely affects the operation of the network or violates the rules and protocols for communication across the network. + +Corresponding Source conveyed, and Installation Information provided, in accord with this section must be in a format that is publicly documented (and with an implementation available to the public in source code form), and must require no special password or key for unpacking, reading or copying. + +7. Additional Terms. +"Additional permissions" are terms that supplement the terms of this License by making exceptions from one or more of its conditions. Additional permissions that are applicable to the entire Program shall be treated as though they were included in this License, to the extent that they are valid under applicable law. If additional permissions apply only to part of the Program, that part may be used separately under those permissions, but the entire Program remains governed by this License without regard to the additional permissions. 
+ +When you convey a copy of a covered work, you may at your option remove any additional permissions from that copy, or from any part of it. (Additional permissions may be written to require their own removal in certain cases when you modify the work.) You may place additional permissions on material, added by you to a covered work, for which you have or can give appropriate copyright permission. + +Notwithstanding any other provision of this License, for material you add to a covered work, you may (if authorized by the copyright holders of that material) supplement the terms of this License with terms: + +a) Disclaiming warranty or limiting liability differently from the terms of sections 15 and 16 of this License; or +b) Requiring preservation of specified reasonable legal notices or author attributions in that material or in the Appropriate Legal Notices displayed by works containing it; or +c) Prohibiting misrepresentation of the origin of that material, or requiring that modified versions of such material be marked in reasonable ways as different from the original version; or +d) Limiting the use for publicity purposes of names of licensors or authors of the material; or +e) Declining to grant rights under trademark law for use of some trade names, trademarks, or service marks; or +f) Requiring indemnification of licensors and authors of that material by anyone who conveys the material (or modified versions of it) with contractual assumptions of liability to the recipient, for any liability that these contractual assumptions directly impose on those licensors and authors. +All other non-permissive additional terms are considered "further restrictions" within the meaning of section 10. If the Program as you received it, or any part of it, contains a notice stating that it is governed by this License along with a term that is a further restriction, you may remove that term. If a license document contains a further restriction but permits relicensing or conveying under this License, you may add to a covered work material governed by the terms of that license document, provided that the further restriction does not survive such relicensing or conveying. + +If you add terms to a covered work in accord with this section, you must place, in the relevant source files, a statement of the additional terms that apply to those files, or a notice indicating where to find the applicable terms. + +Additional terms, permissive or non-permissive, may be stated in the form of a separately written license, or stated as exceptions; the above requirements apply either way. + +8. Termination. +You may not propagate or modify a covered work except as expressly provided under this License. Any attempt otherwise to propagate or modify it is void, and will automatically terminate your rights under this License (including any patent licenses granted under the third paragraph of section 11). + +However, if you cease all violation of this License, then your license from a particular copyright holder is reinstated (a) provisionally, unless and until the copyright holder explicitly and finally terminates your license, and (b) permanently, if the copyright holder fails to notify you of the violation by some reasonable means prior to 60 days after the cessation. 
+ +Moreover, your license from a particular copyright holder is reinstated permanently if the copyright holder notifies you of the violation by some reasonable means, this is the first time you have received notice of violation of this License (for any work) from that copyright holder, and you cure the violation prior to 30 days after your receipt of the notice. + +Termination of your rights under this section does not terminate the licenses of parties who have received copies or rights from you under this License. If your rights have been terminated and not permanently reinstated, you do not qualify to receive new licenses for the same material under section 10. + +9. Acceptance Not Required for Having Copies. +You are not required to accept this License in order to receive or run a copy of the Program. Ancillary propagation of a covered work occurring solely as a consequence of using peer-to-peer transmission to receive a copy likewise does not require acceptance. However, nothing other than this License grants you permission to propagate or modify any covered work. These actions infringe copyright if you do not accept this License. Therefore, by modifying or propagating a covered work, you indicate your acceptance of this License to do so. + +10. Automatic Licensing of Downstream Recipients. +Each time you convey a covered work, the recipient automatically receives a license from the original licensors, to run, modify and propagate that work, subject to this License. You are not responsible for enforcing compliance by third parties with this License. + +An "entity transaction" is a transaction transferring control of an organization, or substantially all assets of one, or subdividing an organization, or merging organizations. If propagation of a covered work results from an entity transaction, each party to that transaction who receives a copy of the work also receives whatever licenses to the work the party's predecessor in interest had or could give under the previous paragraph, plus a right to possession of the Corresponding Source of the work from the predecessor in interest, if the predecessor has it or can get it with reasonable efforts. + +You may not impose any further restrictions on the exercise of the rights granted or affirmed under this License. For example, you may not impose a license fee, royalty, or other charge for exercise of rights granted under this License, and you may not initiate litigation (including a cross-claim or counterclaim in a lawsuit) alleging that any patent claim is infringed by making, using, selling, offering for sale, or importing the Program or any portion of it. + +11. Patents. +A "contributor" is a copyright holder who authorizes use under this License of the Program or a work on which the Program is based. The work thus licensed is called the contributor's "contributor version". + +A contributor's "essential patent claims" are all patent claims owned or controlled by the contributor, whether already acquired or hereafter acquired, that would be infringed by some manner, permitted by this License, of making, using, or selling its contributor version, but do not include claims that would be infringed only as a consequence of further modification of the contributor version. For purposes of this definition, "control" includes the right to grant patent sublicenses in a manner consistent with the requirements of this License. 
+ +Each contributor grants you a non-exclusive, worldwide, royalty-free patent license under the contributor's essential patent claims, to make, use, sell, offer for sale, import and otherwise run, modify and propagate the contents of its contributor version. + +In the following three paragraphs, a "patent license" is any express agreement or commitment, however denominated, not to enforce a patent (such as an express permission to practice a patent or covenant not to sue for patent infringement). To "grant" such a patent license to a party means to make such an agreement or commitment not to enforce a patent against the party. + +If you convey a covered work, knowingly relying on a patent license, and the Corresponding Source of the work is not available for anyone to copy, free of charge and under the terms of this License, through a publicly available network server or other readily accessible means, then you must either (1) cause the Corresponding Source to be so available, or (2) arrange to deprive yourself of the benefit of the patent license for this particular work, or (3) arrange, in a manner consistent with the requirements of this License, to extend the patent license to downstream recipients. "Knowingly relying" means you have actual knowledge that, but for the patent license, your conveying the covered work in a country, or your recipient's use of the covered work in a country, would infringe one or more identifiable patents in that country that you have reason to believe are valid. + +If, pursuant to or in connection with a single transaction or arrangement, you convey, or propagate by procuring conveyance of, a covered work, and grant a patent license to some of the parties receiving the covered work authorizing them to use, propagate, modify or convey a specific copy of the covered work, then the patent license you grant is automatically extended to all recipients of the covered work and works based on it. + +A patent license is "discriminatory" if it does not include within the scope of its coverage, prohibits the exercise of, or is conditioned on the non-exercise of one or more of the rights that are specifically granted under this License. You may not convey a covered work if you are a party to an arrangement with a third party that is in the business of distributing software, under which you make payment to the third party based on the extent of your activity of conveying the work, and under which the third party grants, to any of the parties who would receive the covered work from you, a discriminatory patent license (a) in connection with copies of the covered work conveyed by you (or copies made from those copies), or (b) primarily for and in connection with specific products or compilations that contain the covered work, unless you entered into that arrangement, or that patent license was granted, prior to 28 March 2007. + +Nothing in this License shall be construed as excluding or limiting any implied license or other defenses to infringement that may otherwise be available to you under applicable patent law. + +12. No Surrender of Others' Freedom. +If conditions are imposed on you (whether by court order, agreement or otherwise) that contradict the conditions of this License, they do not excuse you from the conditions of this License. If you cannot convey a covered work so as to satisfy simultaneously your obligations under this License and any other pertinent obligations, then as a consequence you may not convey it at all. 
For example, if you agree to terms that obligate you to collect a royalty for further conveying from those to whom you convey the Program, the only way you could satisfy both those terms and this License would be to refrain entirely from conveying the Program. + +13. Remote Network Interaction; Use with the GNU General Public License. +Notwithstanding any other provision of this License, if you modify the Program, your modified version must prominently offer all users interacting with it remotely through a computer network (if your version supports such interaction) an opportunity to receive the Corresponding Source of your version by providing access to the Corresponding Source from a network server at no charge, through some standard or customary means of facilitating copying of software. This Corresponding Source shall include the Corresponding Source for any work covered by version 3 of the GNU General Public License that is incorporated pursuant to the following paragraph. + +Notwithstanding any other provision of this License, you have permission to link or combine any covered work with a work licensed under version 3 of the GNU General Public License into a single combined work, and to convey the resulting work. The terms of this License will continue to apply to the part which is the covered work, but the work with which it is combined will remain governed by version 3 of the GNU General Public License. + +14. Revised Versions of this License. +The Free Software Foundation may publish revised and/or new versions of the GNU Affero General Public License from time to time. Such new versions will be similar in spirit to the present version, but may differ in detail to address new problems or concerns. + +Each version is given a distinguishing version number. If the Program specifies that a certain numbered version of the GNU Affero General Public License "or any later version" applies to it, you have the option of following the terms and conditions either of that numbered version or of any later version published by the Free Software Foundation. If the Program does not specify a version number of the GNU Affero General Public License, you may choose any version ever published by the Free Software Foundation. + +If the Program specifies that a proxy can decide which future versions of the GNU Affero General Public License can be used, that proxy's public statement of acceptance of a version permanently authorizes you to choose that version for the Program. + +Later license versions may give you additional or different permissions. However, no additional obligations are imposed on any author or copyright holder as a result of your choosing to follow a later version. + +15. Disclaimer of Warranty. +THERE IS NO WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING, REPAIR OR CORRECTION. + +16. Limitation of Liability. 
+IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MODIFIES AND/OR CONVEYS THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES. + +17. Interpretation of Sections 15 and 16. +If the disclaimer of warranty and limitation of liability provided above cannot be given local legal effect according to their terms, reviewing courts shall apply local law that most closely approximates an absolute waiver of all civil liability in connection with the Program, unless a warranty or assumption of liability accompanies a copy of the Program in return for a fee. + +END OF TERMS AND CONDITIONS + +How to Apply These Terms to Your New Programs +If you develop a new program, and you want it to be of the greatest possible use to the public, the best way to achieve this is to make it free software which everyone can redistribute and change under these terms. + +To do so, attach the following notices to the program. It is safest to attach them to the start of each source file to most effectively state the exclusion of warranty; and each file should have at least the "copyright" line and a pointer to where the full notice is found. + + + Copyright (C) + + This program is free software: you can redistribute it and/or modify + it under the terms of the GNU Affero General Public License as + published by the Free Software Foundation, either version 3 of the + License, or (at your option) any later version. + + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU Affero General Public License for more details. + + You should have received a copy of the GNU Affero General Public License + along with this program. If not, see . +Also add information on how to contact you by electronic and paper mail. + +If your software can interact with users remotely through a computer network, you should also make sure that it provides a way for users to get its source. For example, if your program is a web application, its interface could display a "Source" link that leads users to an archive of the code. There are many ways you could offer source, and different solutions will be better for different programs; see section 13 for the specific requirements. + +You should also get your employer (if you work as a programmer) or school, if any, to sign a "copyright disclaimer" for the program, if necessary. For more information on this, and how to apply and follow the GNU AGPL, see . + + + +Other dependencies and licenses: + + +Open Source Software Licensed under the MIT License: +The below software in this distribution may have been modified by THL A29 Limited ("Tencent Modifications"). All Tencent Modifications are Copyright (C) 2023 THL A29 Limited. +-------------------------------------------------------------------- +1. 
stable-diffusion-webui-state +Copyright (c) 2023 Ilian Iliev + + +Terms of the MIT License: +-------------------------------------------------------------------- +Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. \ No newline at end of file diff --git a/gf/LightDiffusionFlow/LICENSES/Licenses.md b/gf/LightDiffusionFlow/LICENSES/Licenses.md new file mode 100644 index 0000000000000000000000000000000000000000..40430bd3ce8a92e7bab0489040a9a2888890acd9 --- /dev/null +++ b/gf/LightDiffusionFlow/LICENSES/Licenses.md @@ -0,0 +1,24 @@ + +## [stable-diffusion-webui-state](https://github.com/ilian6806/stable-diffusion-webui-state) + +MIT License + +Copyright (c) 2023 Ilian Iliev + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. diff --git a/gf/LightDiffusionFlow/README.md b/gf/LightDiffusionFlow/README.md new file mode 100644 index 0000000000000000000000000000000000000000..ccaf6fc0998b7a9313b4829030a27e46ca3affda --- /dev/null +++ b/gf/LightDiffusionFlow/README.md @@ -0,0 +1,63 @@ +


+# sd-webui-lightdiffusionflow
+
+[**English**](./README.md) | [**中文**](./README_CN.md)
+
+This extension for AUTOMATIC1111's [Stable Diffusion web UI](https://github.com/AUTOMATIC1111/stable-diffusion-webui) provides import and export of UI parameters.
+"Stable Diffusion Web UI" is hereafter referred to as "SD Web UI".
+* [**Plugin demonstration and other instructions**](https://fvkij7wuqx9.feishu.cn/docx/HgZndihraotmmzxFni7cFZISnvb)
+* [**LightFlow workflow community**](https://www.lightflow.ai/)
+* [**Discord**](https://discord.gg/CaD4mchHxW)
+* [**Twitter: @LightFlow_AI**](https://twitter.com/LightFlow_AI)
+
+
+### Capabilities
+
+* Export/import web UI parameters with a single file (images, inputs, sliders, checkboxes, etc.).
+* Parse the PNG info of an image and restore the parameters to the web UI.
+* Supported extensions:
+  - In theory, any extension is supported, except for images in certain extensions, which require a corresponding elem_id.
+
+![lightflow_en](https://github.com/Tencent/LightDiffusionFlow/assets/20501414/e03cc556-9962-41a3-8738-606ee9e38a04)
+
+### Install
+
+Use the **Install from URL** option with this repo's URL.
+
+### Requirements
+
+*None at all.*
+
+### Usage
+ * Export parameters:
+In the SD Web UI, configure the desired options and click the "Export" button to export a flow file that records the current parameters.
+
+ * Import parameters:
+In the SD Web UI, click the "FileBox", select the corresponding flow file, and its parameters are imported.
+You can also drag and drop the flow file directly onto the "FileBox" to import it.
+
+**Currently, the plugin supports SD WebUI v1.5 and above; compatibility testing for earlier versions has not been conducted.**
+**Unless compatibility issues with essential functionality prevent it, upgrading to v1.6 or above before using this plugin is strongly recommended!**
+
+### Examples
+
+You can find several official examples in the example/ folder. If you want to try out more possibilities, please visit [**our open-source community**](https://www.lightflow.ai/) for more public examples.
+
+### Contributing
+
+If you have comments or suggestions about our project, or run into issues, please feel free to raise them through an issue; pull requests are welcome at any time!
+You are also more than welcome to share your own LightDiffusionFlow on [**our open-source community**](https://www.lightflow.ai/).
+
+### Credits
+
+Licenses for borrowed code can be found in LICENSES/[**Licenses.md**](./LICENSES/Licenses.md)
+
+- stable-diffusion-webui-state - https://github.com/ilian6806/stable-diffusion-webui-state
+
+  *Our team urgently needed a solution for easily sharing Stable Diffusion Web UI settings, and Ilian Iliev's stable-diffusion-webui-state project has been a great help to us. Thank you, Ilian Iliev!*
+
+- Big thanks to [Hali](https://github.com/ThisHaliHali) for the inspiration, suggestions, and various forms of support during the development of this project. Much appreciated!
diff --git a/gf/LightDiffusionFlow/README_CN.md b/gf/LightDiffusionFlow/README_CN.md
new file mode 100644
index 0000000000000000000000000000000000000000..b4c951cb8e5c790a846a6ce2e6f4e766bed740f9
--- /dev/null
+++ b/gf/LightDiffusionFlow/README_CN.md
@@ -0,0 +1,62 @@


+# sd-webui-lightdiffusionflow
+
+[**English**](./README.md) | [**中文**](./README_CN.md)
+
+This is an extension developed for the [Stable Diffusion Web UI](https://github.com/AUTOMATIC1111/stable-diffusion-webui) that imports/exports parameter settings.
+Stable Diffusion Web UI is referred to as SD Web UI below.
+* [**Plugin demonstration and other instructions**](https://fvkij7wuqx9.feishu.cn/docx/HgZndihraotmmzxFni7cFZISnvb)
+* [**Lightflow open-source community**](https://www.lightflow.ai/)
+* [**Discord**](https://discord.gg/CaD4mchHxW)
+* [**Twitter: @LightFlow_AI**](https://twitter.com/LightFlow_AI)
+
+### Features
+
+* Import/export SD Web UI parameter settings with a single file.
+* Parse an image's PNG Info and automatically restore the parameters to the SD Web UI panel, including some extension parameters.
+* Currently supported extensions:
+  - In theory, all extensions are currently supported.
+    Image parameters of some extensions are the exception, because a corresponding elem_id is required.
+
+![lightflow_CN](https://github.com/Tencent/LightDiffusionFlow/assets/20501414/492f7408-7729-4370-b0f9-c17a80211029)
+
+### Install
+
+Paste this project's URL into the **Install from URL** option and click the install button.
+
+### Dependencies
+
+None
+
+### Usage
+
+* Export a configuration:
+  - Set up the desired options in the SD Web UI and click the Export button to export a flow file recording the current settings.
+* Import a configuration:
+  - In the SD Web UI, click the file box and select the corresponding flow file to import the settings it contains.
+  - You can also drag and drop the flow file directly into the file box to import it.
+
+**Currently, the plugin supports SD WebUI v1.5 and above; compatibility testing for earlier versions has not been conducted.**
+**Unless compatibility issues with essential functionality prevent it, upgrading to v1.6 or above before using this plugin is strongly recommended.**
+
+### Notes
+
+* Before importing a new flow file, it is best to refresh the SD Web UI page, because the extension only modifies saved parameter options.
+* The current version of the extension does not support the 秋叶 launcher's "启动云端页面汉化" (cloud page localization) option; it is recommended to turn it off when using this extension.
+
+### Contributing
+
+If you have comments or suggestions about our project, or run into problems while using it, feel free to raise them through an issue; PRs are welcome at any time!
+
+### Credits
+
+Open-source licenses for third-party code referenced and used by this project are in LICENSES/[**Licenses.md**](./LICENSES/Licenses.md)
+
+- stable-diffusion-webui-state - https://github.com/ilian6806/stable-diffusion-webui-state
+
+  *Our team urgently needed a solution for quickly sharing SD Web UI settings, and Ilian Iliev's stable-diffusion-webui-state project has been a great help to us. Thank you, Ilian Iliev!*
+
+- Big thanks to [Hali](https://github.com/ThisHaliHali) for the inspiration, suggestions, and all kinds of support during development!
diff --git a/gf/LightDiffusionFlow/__pycache__/preload.cpython-310.pyc b/gf/LightDiffusionFlow/__pycache__/preload.cpython-310.pyc
new file mode 100644
index 0000000000000000000000000000000000000000..2110445e16fbbd6fa8704588525b8edbdddd5caf
Binary files /dev/null and b/gf/LightDiffusionFlow/__pycache__/preload.cpython-310.pyc differ
diff --git "a/gf/LightDiffusionFlow/example/Product Design_\347\272\277\347\250\277-\345\267\245\344\270\232\344\272\247\345\223\201.flow" "b/gf/LightDiffusionFlow/example/Product Design_\347\272\277\347\250\277-\345\267\245\344\270\232\344\272\247\345\223\201.flow"
new file mode 100644
index 0000000000000000000000000000000000000000..2f2291b0c9f99a8c15d8ff89027047a8eacd4f1a
--- /dev/null
+++ "b/gf/LightDiffusionFlow/example/Product Design_\347\272\277\347\250\277-\345\267\245\344\270\232\344\272\247\345\223\201.flow"
@@ -0,0 +1,21 @@
+{
+    "state-ext-control-net-txt2img_0-pixel-perfect": "true",
+    "state-ext-control-net-txt2img_0-model": "control_v11p_sd15_scribble [d4ba51ff]",
+    "state-txt2img_steps": "30",
+    "state-txt2img_height": "768",
+    "state-txt2img_seed": "1405674945",
+    "state-ext-control-net-txt2img-tab": "ControlNet Unit 0 [Scribble]",
+    "state-txt2img_neg_prompt": "colorful,illustration, painting, cartoons, sketch, (worst quality:2), (low quality:2), (normal quality:2), lowres, bad anatomy, bad hands, ((monochrome)), ((grayscale)), oversaturated, extra limb, missing limbs, deformed hands, long neck, long body, imperfect, (bad hands), signature, watermark, username, artist name, conjoined fingers, deformed fingers, ugly eyes, imperfect eyes, skewed eyes, unnatural face, unnatural body",
+    "state-ext-control-net-txt2img_0-preprocessor": "scribble_pidinet",
+    "state-txt2img_sampling": "DPM++ SDE Karras",
+    "state-ext-control-net-txt2img_0-allow-preview": "true",
+    "state-ext-control-net-txt2img_0-control-type": "Scribble",
+
"state-txt2img_enable_hr": "true", + "state-txt2img_hr_upscaler": "R-ESRGAN 4x+", + "state-tab": "txt2img", + "state-setting_sd_model_checkpoint": "chilloutmix_NiPrunedFp32Fix.safetensors [fc2511737a]", + "state-ext-control-net-txt2img_0-enable": "true", + "state-ext-control-net-txt2img-toggled": "false", + "state-txt2img_prompt": "product design,white material, Apple style,modern style,minimalist style,white background,classic color matching,light red,C4d rendering,ultra-high definition,32K UHD", + "txt2img_controlnet_ControlNet-0_input_image": "data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAhwAAALQCAAAAAD+v/ZvAACgg0lEQVR4nOy9d5wkV3X2/z3nVnXPzkbtSlpplXOWUELkKEzO0SYZbILB5rVJr20MNsYYcCY6YptgcrRBZEQQAoEQyjmnlXa1q80z3XXveX5/9ITe3VkhtPP+1NOe5/ORZru7+nbVrafuPfmYmMc8Zobf3ycwj8HFPDnmsUvMk2Meu8Q8OeaxS8yTYx67xDw55rFLzJNjHrvEPDnmsUvMk2Meu8Q8OeaxS8yTYx67xDw55rFLzJNjHrvEPDnmsUvMk2Meu8Q8OeaxS8yTYx67xDw55rFLzJNjHrvEPDnmsUvMk2Meu8Q8OeaxS8yTYx67xDw55rFLzJNjHrvEPDnmsUvMk2Meu8Q8OeaxS8yTYx67xDw55rFLzJNjHrvEPDnmsUvMk2Meu8Q8OeaxS8yTYx67xDw55rFLzJNjHrvEPDnmsUvMk2Meu8Q8OeaxS8yTYx67xDw55rFLzJNjHrvEPDnmsUvMk2Meu8Q8OeaxS8yTYx67xPCRI6DXQ0ZSiJAEZCjzrWV+RdhwzVhOBoQTSsgIB5BcGr7H4P85howckC0BJdFtQUkgDEAGUer7+ezmFoaMHCUROBJmNGoVcyAUNSXd3yc35zBk5IDeGoEUVJQEhSQD6BFlHvcewzhfRTTmVZXHU8lIE28mL/fzic01VPf3CcwuZBQSub7uiq3rr1jxlJJ117oNt2899kAWnbb0/j67uYah21Yk73zrixdfv546y0SVqRuosj3p9Y+2+/vs5haGhBzdliZv/MZPf/p7AR6AIQML6sZS5nl/vb8jm1BxcxWOjJLmhdWZMSTkoNsiJyOXn//eBanggckQLTVAKu0O7a72efKrjvc2JQVmdNrhUDxiXsOdEcNCjpJAYVve9t5UqBta3f2XqnX58oO6V4+P2OJN4wWoG6pDX/jW3hpTSiucaUvZPHbCkJCj2wpv6rjs1eeanLCDn3bGr63I1cWLDylXjS2sFzUX/vu3txUw0fo/j35cddll+56yMLyYk6sJw9k8dsSQkANKGv/aZz8dHiPjPOgNZy6jQCokeoIFXPLej2TMrFTxa695wbZ6r4OPeue+bBtVzDNjFxgWcnTaY0/9DpYy8Gv/clBJklMSTXLIVOQqn/WJr25VKtQ0eJg48vlvrZmwms1jZwwJOcLP/7d/b+qGuqx8xocUiZysKBn0VoaeTHLTlz9yEa1uFfbAx//Vtrrhie85YV5T2SWGhBxrzvqDzWVkPJUHP/UVe5YE4SVqZIFZT1MNl8Ga971/E4B/eK+/PbvK7P9nL/PweZF0ZmhOItRIihKhonF9f3+ghlPeO66ushpFVqNQVkhFMfXFrO8dX2GO/27zzSMNeOG6iQ/Ud9g8JGmOkqNIufT+GUUX7o+74R8vTZbUlRqV6cNyPzcUzRcWABVv1PVPIhknfU3RkVRUyv1wKQOMOUqOidsdHUnbbn8uAIv/ZptCKh11pa4UXYWihPrZETlLH0gY9aquNr8Jkrde8fX75yoGHXOUHGqiIzXSlv/6v0csS0AFR3zwh2qy1JEi7rhia0fSDntFlrRZ+hNGSfZvyvrPUYxkb1UpUjO/cmyHuUqO8d6fL59uUFEbS+tk2B9KXYV0xSnLF+3xkH+SQsol9930LGWNvwSv2PO/OqFvn0KqWfARqavu/XMtA4s5rK107baX/zDzwOP2fc7Sn+zLb97sgX326VfcePNDTjrvIVjV+Ht/J2Vt7ziRISMffhO4nvqR0dbWq590B7b0H17szBvSd8D9zc77hiwVfeQI2Pt1d+ZGUudUko2M+Jnf/tlfsP/q5s2PXIYzclZXHZXSt7V0pKJm474V4HbKD0Pl48eZkb6srvL9dkEDiTlKDknlQ4afdp2KVEpc3XJY3LYVN+ljFRdm6dpXYrS/FNN6jaQoio4a3bgHtACW3aQmLl5MzVvmVdkdMUfW0aCAUBAiCrp62/teU/G67x4qp+DXvqgbtDZXWv9HHF5xVoKDP/BSWfPSb5GTiRw0RJhjLSrGBQ0s3Gvjb26q4oTTreGHQYMIRL6/L3dQcH+z894hirKaLClCUr746V9bSvqQFEVdxYZjoAZItD6r4+xJkkJ3ngks+mJIKlLRtr4Bf69lkOrDrb6kq85Lwe3tTW9bmd9bJjFHyNGzcn7pf7aoSBe//1MnPnrf5K+RiqTSxBvwqnYMFrDswt9gxR1FOeumw7xF+99D0TOUTqsj3Q9ilvDlyT6epWueDcbTb5BKFHXnNdoe5gg5siR92f1R3aK/JRmM8FfbcmmiSN2zIIH1Yunt4e9M/OUFG6RxXXwIWHr0p6WsRto6PeDXDEiphieo29UdxyfgoMvUVc9YOg/NGXKodOITq4yld+g7qWqR4A1qJGXlokeDY5VDAufvjvHR6i81LuknR9cAL/za5th+t7htYUrpoJWYrdqkKLp8b6/xx1wuqTO/sUxgjpAjtPrVGNbacPV+CRJ2eJY6Cin0ORupAHPMqeFJn0zwtCI1KptfNkqC1jNK06+qjm/cE9jjcBz7UGdc0vdW4qR9vjdvCJvGHCFH2XASTk31msdM5GG9RSpSRFZzPAYkgwoH2i9bCAdc3e1JJOd/+BhL+MnXqF/WHDsT95GnVFC/XVlFm294NCROvFA9wXcec4Yc684ElpwCJDj6eOwFm4qyupK6l1NBBY6DkyDt64kXKStUVHTpQ0jOGTf3Dxh/6GDPaNXwV42krvLWN44AR153/1zhIGLQydGVsjp6uWP+nm8cCkuoLr3zAx/cqChSFGnLaTAV5+fQIj33sqecUh24dsIZm7XuTU7imVKRJvTV8g9O4qClnjhmk3prSuchgD2xq6ysyTCQiPK/1ucy6OQokkJvbYNDBdUxn/mW1OlMmT27/4C1++w2jnPEBn1whE+qKEKRizb9xwOo/G+n1JBQ/FsFdvqp4Htu7Kio
qOj9i5Ilf6IUPd9+ViNJ0f1fajodcHL0fKjvhmRWQarhfKkoRyNJJSsv3t6xVgF8/JIV+Cu2qSspK7KaG5bAvhdL6t3v0vk4hp/6TPDqk5LU1Zh09nKg/dubIpQnFpnuuP7X2tUH3HzuYPHSP8OLlK0qTc2NFNddm5wC7vq3zQ19IcKWIfGOX7QfzqfW5jo3gBnV/i82Vj/8e0BA4K19kce1t1Ue+VwilBgZ16O+UEP+t98PSqI4CdVtgv+l4ekDTg6L4l/49LgHqXblFsHXmoStbOecCOz8PzKjTN88x+uSrrrsAadEd13VVDWpFEf+9iPabPxLjdMCkdl/AbDl4hywAHezLiOZR/42qaR/P8eSwGnIgTR3wxp2DwNOjuzp2ldVRLLSqLKul/ZaR4W6doC1L9sopWr67hUPIfvWwke0O5+gDhqSR5it+EibBQtjpFBIVCxaSFD2W+CJy3LOWJWpaT70NqfYv8I4xnjdrR1L/C9lx4CTw9FX109WAixihG4riVSSWWQv77ksqqpMF2UxjGyhgx5z4p6MdSKoKOZeojzwJBv77ldIJEClE5hzcDvCzr62qgivOkHNn+3LAn13NT993oP/4JmP+s9GoP+tWU+DTg7d/REqDAzJ2FYJCZIkr1j7X3gupimhQxQH+NYXNy+1i82rCAyUSPZ8s82vuyNEBEoVSLQXYmz7cpC9oe2Q7a2tsdbqpz3rMZ/96Xu/vvVnbv+LM+IGlRwNAoF96yJKkiFAUBgxmcKKI96wFvXebYH1LiYQbNuw3mzEwBMORsJetI9zy98Uk5uc2sC5rBDGR4xEDRlV/PYRdPn5Fx3X6KLjUpfiJgJ661MQ/2sCPgaUHKoxwuBnfxmmvgqRxo83JcmoIH/2s6VSMiWjixvTKdExdsZKrhub2m8y41p6dJE+sTob2YSNWCps2ZoWYNd9KqwEJMhweoJqH1mw7Qd/8MJtpHFZJisFgGNl2Ipl7QoDSg4TyOGSp16MRqyZej+44TKyWxBUGxuKP+zzS4pqICqmhY9rN+0Ta5spQbJixHjLaFWtfncVVCWXrRtV8E25jJ5p3ZfdQnIBVpEfVNybf/ij/VsQ9Sce9ZENI0blGAYlhFX5f4uAen8bWnaBrhqplPdQV+DThgaz9E6pqCiKbnsm+Ld0wxKAdr81IqXLXpLal/UNGBrXK4DRi4saSbctS6TksOQXx8N3NRFqmpUvGWnDK7Xxgg+cCMD+f7GuZ00rKr0/8b8k3mNAVw5VJDp+yTtS405MLeNJlCsBuSxY9dpVi992ZvfA30hplGayVjFA8aWHls41241Z8eaV0P0Hj4pg0R5YKUG1acv+Fdch9erZklZ6x7lYS4557XdWmZvf+raTHvy1hpICopDE/5aCpoN6mVYs2vzjljLSjda0AFiSfPV7vuGGcPTISy/802j5+48pjYf5tD0ilXOf2rafxeRrYSQd/ionXdB1sufF7eJm1Llefkb2c7PMUVBDOgVLV67TiPZ4Rx0y/Nbznv7M937+YgNPQRKx4+kOKe7vpWtXKBrXhUu91fOWTO8Ye+Icv1EqoV7EcCj0+T2pd9A3T7pkbx4wFRfYUUdSXFkDfxeSGj2cfY0a0id/bJw6LklNKHLo423wn4Symise2Asg8sqonvULdaT8vyeIcEBXDslply9tjIa69C9v1RbClywBN1jQBFg2PeutrUb0+VjMLrp6iW1YMPm6Fa0MduCRbtxmWaXSae0Hj9BgZcV6WWlTwA1LcHzHLX4ukqoj/muFoDFloS+c+exvd7H/PfbSQSWHdeiu+xCIRgQyA6vMcpe04O2Ty3rtQIVxZoBZn49FrcWHyAWTnjYqYMHjkriOKoQ9qHtV5SDG9kJbmuJgvf1n6WiIbSEgHf56eopTZcG6Lz35Rf9z+Ze/fgWQozf0MGNAyeHRzq3PrZl6bRKuLJIx8u9n7nTWx7zCkE9fjepuNNWWO9BUVDoAzyjYD+7Kqepw7JKbNkcF3HnkwXbtpamXNWwFFo6AXV2ZLODNb1xIRRnJCIIvPO3ED333bIoqJ7oDOnuzhQG9vOKyq9+znRxhyTGVsvz7z975+OYFqijTdg6Di07O66+e+PcUHrRH0qZLK2jp+CdsPd4z+HVLj/TqAkoBhInlh1LrszfKwjL+1/+zMFOP7/PmZywEIml8+endJELeGvJK+wNKjhTGn9487Ss3TE1A4jmfPSGN73R869SRnPquxZoFDz3a853sICCkRxXy/0Rgisdr79qSxageUfJZnVRhAjPSw2jouOFWdV2Peg/esObOz138W57UTj/4k4f8TxgehSGvNTeg5KCr6z7bzn2xoYIKtT/wr49pqb3T4Wo/gBI+dXzY2BfWe/0dWa/5yuT71SpamLwr98cvuySpyFbEaYkfb4TAjCCPPxwo3wlQtOjy6nctMfSR3z/0X775tteNH7Qg5ec+68c0KcWwi6b3t7o0M7L0f8CnqFuDVTDyu0XdyRjhfoTe2+7fPmr86Ze3eVS3Fw86FeYXzwbeokYq0ktZbk796zqXVvuGXnWoCElbHpDgYZuloq1FKvqYkbCfdhX6K6sSYC+9RL1QxSHGgK4czs/+FbcpbaAYKix/+ftNCWnn5TyO76jPHdZgJxx+PFd1hfoMp9gpTv3NjRVdb3jlwvVKNGdddcSq7kgKkhUwKRYeUKqFe4wKYhRkHCeC1sW1jNc/OQuSf+Tp/5SreZnj/oA2vXqcPj9aOKTqSR+UhTcz9Tiw/SYKg/XgVm6rO7R3cpE9Pmh+9uNCIvGQp9apqW3jq5efWh21n2d6pDPXSsrWJW7FwYSVVYuh6rx3IyrpfSsMU0nX/87TfjQvc/z/iQIQ4H9yZRBmYD1BIgz2/5gMUdNl0sgw+Z84YqllmzKzB1x22V5246UQPd//BA7YC2csyTF4enYapbU8Lb+wYFgQxVC5GOO7a0jZMTPSyo+6Cpe8zSzFIe8orjDB15/6khunAj1iIuBkmDBg5EghhWDLOdtICHdCeIWMJZ+QaKqcaTe+5dLPf/7TX3rr369jzBGu2zerL5TUlc5/8QYtHnUAnybN8kPduCTMijUcYo0nyqL49S88vw4DzylRzO8iGB2NaYff09+GsE/cmXFe/YhwFG7c/Zkz//zcXn3kxg11hy1k7P4WenZAqBQ13QvTRFEFAG+P1rD0J73agCpSnPM8KqtwDn2nItRIdx5F33bjYK3j2nyrq+2zTsoroH5GjqKQvljhVK2Hd3u/PJHSEtIToX1SXzkPxYajauATRVL5fgurMFIC7MXfU0jRlBLDlt8yYOToqSFFnwSoHHM3zBMs+WKoW3JHXeUt72jThp4z5Y9vKY2y9KrtSZ/af/JbFWer2Y4cWefsmVh6tVTUab7XWzlXrNtachOKotJEUdP5e4NltzbT2kjojRj1M7JKzvHrJMDoefvSK3/aK6MdQ8aNQSOHckiNur87CgbJrKfOGh/qqZ/KWvNHB5MwbCIOdK8vSSp6X88ZNwHjqPzPtN+n0lsLpjC2MsH5PWfu11uYU/FjabK0dVGO0Of
wtOy2/iIu5WfLMOqPKkvl8grMjEWP+z9PrsBbT3vxFxRNHrak2kEjR1MkxXsNquldwpOddpc6RVlZPzrSAU5713+e/ZNr3+EG+4yHSv4MfZbyiurl+k7Nr43v8ECP39wG/4yKGuktmIPzF02PG5OJ12MvAkau3j7i6w1tsH2vCo0r/yYpUeEHdLqffPooyfA3TJQeGyYMGDkiVKSNR1ADqcasd78XntcUaauK/n7vBMnab/xRV0V6O26t8xXSN7eTrV97UdZliceqNGW7jWXsMW3jj3orxzsrDFj+OUmK0hRF7+F/nQE/7VsHSuhLGPAbkkr3ytFetgRfCpX/XgEVvPBujf3/Olf/7zFg5ChSUWfdkoRNCKQ9VfbpPXEk60KfCts4+eN3lKI3uvF/IlS+S1+ntlPXKnTFCGf2HuYpcjTSq7DW2xRq1DwVoOYo5V49uVKKijSu70KbH/XbYXO5fAF42uPuokZ6Fhg19tyQ4gcfPI4KHnvuvED6/xahRt0Lt5Msqc2+p0bKKhccS0XdM6fD4mP+tJzfhiW/kPQ/NumdN566uUjdS/AH7rjU584RNelbIXXVORpGcB7Wt0RkKaQvk0jn9IscReO/byTaN3fUKP8IPIEtvCNLoeZVUHP4ul6MWQxLUbFBI0cT6urDo9PUSDj8mXT+uz54l769d0oTmmrds4f+37EDk9mvSfE28wRVBfbBTVm56JZRju4ocl8Zp0bdw4DfUi7S+n0dYMEfqjN1QJQs6Swc3rtdWY6i9b9R49XfSgo1D5+Qbf5R0VWj8nLD7OBPKJeYwfUzRzFg6Tm5aur62ndsm3xdN7S61Z4v5Uu/ubkaP+r5W6tclTNOvuvODTdkBVX+271+652K825bWcnCRkrDAvuz18jAWLC4k7sVqS+hsWK8S6t7uxLkTRuCVpex/jhEM4hx8zD/RU2emh4p73FMk0q85/hfQ6pe9EPDFfljr6IC9P59/7Ypt71y9OkwRP2t72927oASyn/Xd7OqRLL/0XV74A85sgVO/ZJxSbrj0wfRwvBjvYb/6eoz1gLaC199WZYUReXKBTxuh22l6TbHOHxIakLXLcChbe+f3lZyiSJp9b7A6f1qTkfKP1lutHiaQtKa/XpZTiOXhXrS7YWPbMPy/26GqFblgJnPcYv05T4fRQ70uMfEJzbbvq0bu3WK5R/6j7bosvJ57zptdFSeLremWnpQnTep68bDvvOPxxoRcjyqamMnnJKnvLtVnZJoPRSZ6boxx6yjA6eXz+TmFFbu5XVV+lJi1VKTznic0U1XZ8tir6eTwHz8kxa9jJmTvvq8tq9/4dmK1rAEAQ0YORrkm6/qf6dmj/c19pOsdd/rpqakp/x2yaZWU3jBj29+E9G0S5UbSNe4o4M+d0ZpHLkrFE3es+30ytdOYmuI7s1hiXx1j0QLj5q2nUVTwBSLoskt2y5KsSrlaYCu+UeSFb0YMyn4bIkaugQL//6Y8M1P/YrT/X88S/9/YcDIUXfEX94x3SEoeZen7b/o29+A8YrAn/L71FXHSp0aq1on1okOhfyDjo1ZxOgXlkWqu70n1+R0AHza99+w6GhLbHNyqdcugFbBxvocdrVLbp5hwTKmPXnWxZI/59mGyqfHgmQnH9sIsGu/X4G1kFjx7/uJzksujtaQeGcHjByl7Xe+r6/eRglGHtHyd2QzMvInfv7kBrVJUEvtJx1dEgbd5W2ecpL7y07BGGtJAtfd29KtW6FoepmvuyxRsQNFlfIlY3gXW3X0NBlDZkF0tsDY6vXN9NLRoqDW42SyDQtSRq3nGAaKL1BkkIJy8rtF2vCca4YmZeH+Fnp2QNbXnHrKS2LYk7pxxx7m5o49Zk1nwhqee8bUN/eO8vPV1fhPL+xTSUP5u/BkaQev7NrlVeIfpa7yw3DcKv/p9gJkqOjFyTl6uyjAolDnUqrK9l2jUNbFVCRI+23r+4E/XIHbgZdLWd3t+tnOSQzYytGkcnm0mwXRUz5N1Z7vqu0bWxQK2PvflrYmqrgoZFjk2nAT66ijfcrxrclxwoy0xyilK7DpJzmcdi40Us34TYCUR0anVxb1wkm4oYStStORaJIjWvs9Jhfd+VGZnINPLlaccuc5mg5gftc/tKK+/Sk/xKlzX3GAuYkBI0fdbHg3HcZ6QVXm5GNO7OINZlXNsYe0ejYLo3IPsNsbTG66oYNIaUq9cEK5jPdqiTH9CHjec1XLWseZQbMhhYnq4Uf25bsY7jjLFqA9J8q1TLyPBctejVJ8uIsZix/fS4XLn7Y+ErzwB3t28w2Pv8ggzfmMuAEjR9SXr+1NtaXkKtZ6Ph7vN8xzZ8Fri2Ii5VFgmB0wIpmkw1qgvoo7wiwtaY2skdEfJUi18a4uXYlGNzSBjHh4vV3Asihaf0sXW+X9YX/CEA9YTuHmbRJdHodMkL62te+wOOMjD07qvIUuGrjZ/VUxcKd/DguEuamUwOzI11DFehGFdOazE5ZcEaJkjBg/a9xMwaKDrKQ+vxsCWaXx1Y1gu5IJP7sJo8nU9p2u3MAe0RdDHj0+rLmoJPbqE2SLCByOONGM/HUgcfJBiMBv/970/lMST3pTsfjmB1thKnNcaxkwcnj32z5GUpAcML1YuXPVLTiJ8iQaMxBuVBWRvV5cu5Sss02pV5NpCmZc575Pi179yQk0sR4X+9WRxz8fuMPep6fpwGRhRrI7vDJFw3bDBQb7y+j8mywSSx/oGDT26WkSpQ489RlY87v/7FbSPDlmFa03HgUFVMKMUNuq9lkdC0T7pNyL/3HIQCRiBcW9yBcbEWk6EswFzdfH9Kg2lD7BsPZSB6v2gWrk7nbKRb6yztMepoSplHyj5Uyr1VcMRjiI5pRK2CUbMKRf643bOns64bu0Kf6PK1Tz+nOjGrjZ/RUxYI432RNP//76n9x82C1X3zaG4Q/Klc5H1M3Chx1fhatXty0RVHRuOLepIqBdKVJv4Z8YB1TfiI6nVEaZrtNUvt9U+ZQ95bFubcfMSyztdz6GCze6smSL+xtUd1vmQH1Epsrjm5d5kB4s5GGdWy/bc/JAJ5L2+b0/b2ie/7N9is9tdWXAyGElLX+2XgFrPtTa+q0LSuxVccOlpj3/rjrpaO9VjjNIlJRd9X/fXWWnYr8De93dpsdhfCQf2upuoMJzNS1zpKuAk0z4f28wFLCyvwytCxPsFylzQ/g0O9q9UeORK+8sjCxwuXTEUVeSomrqjz0aaGrCezvY79z1fprVn3mdlwGb3l8Rg3b2CUcm/uTD/t9v/K2bXnRgiY9fTnXm89qo70HMlSrkV5AJsp284zDhI1TnddNoJyXru0ZZRR45UsBXw8NUeZ0YH5n+XJi6XyRs5LG+sxXItzYs2JYlw2iddG1uaFrdc7aORqpLCu95//d655rPRvmzR540aLP7K2IAd8VwdP7nYtk+e3zh/N+rvL6dKp/SpihN584aQfbrvmlUuOupO47hQff2S2id0a6Mvqss2gDjtVnZdhEyee5eVZqRvlIPIsr4V0y27BEzmSnGx9hmd38bUPDQ0iJZ49e8dE
0aJ1GRkmUrefGbNcLdV8z1QseDRg4FRvi/bOUpJ49hVdfyTZY5kuw+UdccIGE4H7sdC0ILT91pHKd1/ia0rOToz7mtxjaR8KL09euREfgTUh3bVR20qg4R+y+YaXIOPK0NnGuAOF1digv73gvXjahXaZuK5Jzy5HG3D945x9kxcOTolQq+rFv9frOATIv1PxMj26Ky7XriyPCt/40scL38iJ0HirgrWofvkyrvr+0T37+asuCQZNv+nooA0hN74aA9GEbmunHgtAjbaekovqRTacFeQpY4ak+cQqQDTr9Tlg0I9dJtn2zonJvnt5VZRa9m29VXcsJJvZ6PcfM6YzwJDJ+2G8gyVoEgRl4xQykE95Xqji5AUfospH5VStWjHsj4xeekDMl9/+Ppt2cE1NwxZvgpzOAbSWUPMmOXhwJYejRysHzNsw+IkiiGZzxZcKgFfH2Ol2gYMHKEgYnP3V09K1sg4dsqVd5ytF07LVExukevWehTTpihXgflcuMIYZ5cU8v7+s+XUk4gRq4eDSqKYtvC8bqPHQIyWa6F2y05EyjpyZ6Mi82MEn4yCkuw9acLUzISmYoMbicuBb9gjkeEDRg5oAG6X6GcUCGZBUtFRDYwm47biwTadFVPZHxWs/MwRvmi/PAKFfCp5X3LORU6rPH1H9rWE0XSMWMj1Or7XlAdshTXhpnk0RTLvcgPdxxzjnFcRVTfcAUopgrq7v0M6rj08tmalfsJ93PIwA4IKUvbDmCP2xVSCTXnge2zfscs1JylH1I52FM6E+EYpS+WOOu6hfhZky+LFCWkdwPty0KXTD4W7Yuycl8kcZYaXeZOfZM0U3mpZ5PwB2zqZcb9CHpk2HPTxK9M/l6sXmZUx61TnsNpLAO2chgksWaMo5bGhBPlBuBhi2qkvsJv8pSb13gOrP7TVipZEu7Tz7qzeauvOGHqZRRzo2yyylYeaLq1Z/hK+NKETweLNi5V3FQFR64kdg4UNh5XF49LP9qti0octMR6BY03/FTRFxmA2fLfE/mqbyhhJc2wtM0FDBg5CGT6wfrWUaNuVpD0C5KuTT0nfV8Wfa4+dwlQt449LUqqzLT9LhCXteOY/adeKiHwO5V1+qjsTkc44Sv32i4YqDYzxn7crXhoixmcqg0v3svD8vmGLPl++00Y5vO3bLuy10b9ktRO+WNditLAzfK9xKCddmBwW3RNKgSy+DkVKqHtY8Ej8YkaUOex6mmirj7fSmN+TocF07GhiRDa9nOon+yFW82Qm2LPVu735XagsGCtZ+uQa9tJ26hLezxMLK2pIMehAZhh52D9VBK2/ymdwoXgczfLacDIoQqy7ZV8f5lTITZeRCe90Xvp9n0V5GL9xU1A3uc3jUZRRH+Hz5o7vunt0X4+hVM2XZdSsx+uO4pqMrCoStbn0m9BinU/DdIW266W2BSag12wLY9TqPzIifrZunRNn4kOgPb/rWHt9zCbs2ksg0YOhHOrx/EhFVOK9dtalFbq3abpWa7SP99q0F76xeNpajyh/muR/uy6aFZOscUhoSo3BAq3Xjc4kUYx+poWW+nCdRfgOZRUdpqdThp5KJhvqdoEOY6e6FPIxgtI/QXjhD37sHbKf5Uj5mwhuQEjBxLefK9ZfFxlXgjymm1dRtZP3oMp5Ev+IZSss8+DaGpKl+2tmWbnYjx5u7cCruoU9jjMaW6rempxOThHv1QeqZYtalkQDbZzTcs2W1aEeVxXTDWVH8ZE0X777nZGkTAKJ3ZK+t7VbnM2lHTQyGGicGfvvCpSbi1d6oyvkqAXgT6BrS9ZY6mI56qpy+Zzbu7i2+06eSE640nT96sA1v022Mh+EWvP6ZU2Jz24Co/tIsHEir1l6SGtme5psOiAWsUuXRsGYi/vScph5wfNND8SYTy/thLfw7YXl+YQBowcTiLVLQ4+CnqxVweMGmmrOXgvjxUKJb55KVXBl7zJaj7zkEef+Guf2CAqCBq6yK+72u05fReXSnH8Sicet8C5ex0hC0dHqmr6tImE3D5/F5RHFazsRA8rvHB56rUkLDL23UOOubALNveLFhmHkxvDf9S/a80xDBg5ABjbyrGpFFDUsXptIZZMfSZKSUr8BzSkeOYI8Y2XXMbYD194/B9fTwPUnRZ5/PUbYuSx/fJrSsGWiwqt12b8fNpJQhZtUfXNQg4rWz80brZ0/yRLO91VS42fUfA4y0iEFu4BKKBsuTr6otUryVn0gLD47pVUc9TMMWjkEMCWLdX+pNRTMLY2Fdpj6oCqWCJ439eKY2W/d7Wy/Us2ELe/+5QXX4iX0kb1dWdFffxJfQ43l+DCG6lGjrDgPJpClULHH+PqD06vHDZf1SKW7LGLttOpHjWLbd8qTTG30X16eRRGOW/76xCdvf6iVrrjyhL1vEA6K1Aor1mTm/HSCwlNDRmmMtnUALLxT40ozPzP94nqjm949NInN3780S/6p58VqbCmsuYRTMsSvbj1n4yYTh5NvumnKUAFRltSv/M1Cv4TdVWtHKGeQckQrmtMqo5MdYpusK9Eb5f5mfc5BotZt12WNjjtNIMDb25g0MhhbpWstWwk4Vg03mpBX9Z6nbyLf/Jn4xWKpU8wb+5sAovw5Ma2T//Ow0958Tfh7Oy8hOnQ4VwzxtoPjas8tZ25+BeQUsGrdt6+R6w79guR8uET1q0dTy9oDgg8X1iUvfbYFxmELF2yXVBR0KLst5CG8+Zuy8ABI4eBYrx0m4gI4TXdGrqrJz8vZJJt/kCgyvjDVWCf7VbIrBe2kReUy/7r6Z/tfDOVY4/ts2FVYoHfcCs1Z1JpYxQIkvIjKyjb2bq62hKtwoNTzBiM4dH6Dce4NFNFBKt6ke5EueHuvljz4upS77PUsc81c5QaA0eOCOS12f64m4Dmmq2k9lQAcKVKadNfXoAX1wNeQzRr/47S68VkwYJFYanqnH/BecXfaGGdqYFNlE+ZNUsOaIitmIdSqHpEZFJf/Y6wlrWB0Ufh9C8F0wMxYgRNWHhyX8nkprHxyr5gpBSesLFFclu73vP8tjIbcAdS0rEeyJHqdqsqad/Jz4sB3393Qo9zf9+i4vXXx5PcIKHk28Y7lMKq79fs9ZxU5el+XzIu+vuQP31pTboUIaT65JO8IvrSHl2ZuupWtqrb89ruBMtnFcRjOz1GLnYwc6vQ9f2Kr0Uilh6s0JorVc1RK9iAkCNQhsI1zz/tceceeWhaVi770k0EKifmPNI9q/cMF5xs3X8g0kkvbd718GJwI0EICpSggPnzTn9nw0EtmM6SR/BFq2n9ucO6j5oFFjRv3ZvtKqYjqvLjnMqBy1qm2PmeCtnZGHbEUtwC7ROGFGTzC/pU2TAC0wcXgn/SdjbDzw0MSAisyyqR+I/PuP3VFz/4tX+9+Su291mnYPGjSOMsUVIkDIuKP/kBKs9fd8brSpLy97cbRwT+sty1+q11v0Apz/qqGo7b24J/vk3UjVs+ZaeodStp3YWpcLg3Ke3CnboIoWuh2xK2wDVBiVgd0z3pHA9ZlbzKqmfoOjY3MCDkABAWa4j0krvXn/fjQ
Hf+7JTifjViycMiWSKcXPGVvyumh7787E+N0G3xlXN7UacTMLecN36tW618ilBfwpn8+5dg9sI24R/NRoM3/jurdjqJFJetdXx/94zvfFcNS/vWWT4CLSpYUndtwu9zbXdKNCKUHHXv2oyl0WDQyr3eSwzIgicju8kvZuSQnx7//B+HtaguIZme3oIlJ6QIFaepQh8s+HFf3uv5B9Btlbv/cnvro0qujrztY1V+nCz6Ut2M5h8yWvXSRHXeTWaChvbpM9i54sYUHgsKyWcyg+XSbTUibpqQVhe3exXagZvGprcVB8laey9DWutpbnJjUMgBVJjdfDPj1/79HQh1Pd8MUU5Z7L7pW8gsRanJb/p68vLaFU2GltLnz6+kfjNClfJTb49cvyrYzvQZl/4Pbm9Yrsy7xxVgLT/h+J1uWlDdVIjWY5Ogr5pg3/it5RXO5l5nOS1oayKiw+5e36fdpMiGHXR4y8pl43PVLTsg5Aigy/ifr8UoQjXubAKvtlnWpjfcmiwXT0355/dT4tDnd+qKQokP7ThQNmuuNj/ltKTUZ2BouEApljwxVN1+DlbTVjceuvPFO/oJlZ30eMLyDEawKJCCSCenSMUK9YIpaja39l+PV0QO6zpXXzVvId0tGBTn0g+HWSqM0BCwZWtQlpzhSpec8ne3Valw2yte103pqM8ub1sh5ercXyzafulvkXX+D+DphYa++hh1Optif3CUm/55K2q8Y+z5lJ3DAEPnfp+s/ayBagZtxVOMjQXJjIiEWzVqOMgQN/erxIBVfjDOhrPmaufqAdkNXZboXINZgI0brsxtnYUkXv2d8cJdb/yLFx27Zc1/rnMVPnFKKAmqfLttwaNPIO267Eew4IkVrfDpAhtx55dden64rXt/p91V4PHox8ROAqfzo3Gjcmpy5XmnRyfcF4ykUrghqkKuSC2ThzBhd/U7aRIYzaHeWHXFgDyBvzIGhBzIRHtNr4agPCik0rgMnvKxN96EfOP7TZAK/MHxKFER3vyToC8KzyzCA3jgA6CXDVcSspL8gm2MjG9w2VmbvJMKXvzAGZfN/aps+SiHaqbJ8fDuLwTp6rGFPWVmYVAgFZPfNF3ng4QBdQqjaWl+W9ldGOugt0j30gU2fdIKuTzn3H8+fTTCahgpPPLj725tSTRgnX/+/o5jOGG07V1TbyQaSIy9w9L4845rgo8UKC2CeNoMNco7/CLjnLTLk3SVbQHllJFixZjsDiVE2bIzCR7TqsXGnROy5wYGZeUwQGswzIQ8hLy5SalURate+duXf+Gza1Yur6tjnvzwFiwKq9k2uvEDO066IFnuPPFBk280NTWZatslwl+1GK74gQVp39Xd+szTacWOz0Z705dB++9U72MSubIFe7jUflECl2G1SRAm2LAzOR594vmu763ed6axBh+DRI7IU7UePTzgJCu9uH4//sjXt1oqVUnQs5VqVP983QwDFermj6Ysm3U4JOP746ksP6jb2vb2xkXZ0oVX1OzEDTj3WohD2ju9P4FESfuE0bm2IbmAuifwmExbdj7en3I+Wveu9/2q0zEYGJBtJQgrvkfvn0YIEbYZdykioLWoEj7RDgHPGF/+0x1PXlIL8kMfPC1oOg2Gro7Ein1aMf6FOgxb5zQLq/EZnK53QJVes0t7txXnQBPV5+paGFNPlzC6O09meV6Kyi79VadjQDAg5AAw26tndTKQydC/bzYac3eEehmtjTAvuYILX6IZnvxMrTdN39sQTsF+XnV52ELsShdWLSbsASc2IzunHsQliayj7yHo02hVPadL6oBIEwsH+AwG1XTgweDjnZ0/mQsYEHIYjlMVrN0LihDAeRdkEgEYpYBBhUSq6Fz52s1pO8ujGeBB89inTKu2E+X9rjurpNYrwP4uZ6TGWPKFFTU72x/yL0rlDzx4Z9ZMIkVc0eA6bLKwpWtCgVXMENWTnZTLL26+D1MyABgUcgRB6YiqdgQVLrP2BY5JBIXKhZHNzApE/cpzq7KzxBTmo+/pz4sEVxr/x3GVPY4vceNZpUI+Jt50SF1i5zKhl37fsl64ZJe1vDLutzjBtVjUhUlmGEnsnMlAlbcWJ89RI9iAkENeAlsKsawAZAtc8SWX59Rzj5oBlXpmDX/tT8l9C7mBZLgrjj56h3uUOetv3XjPYu+8bCxlrKlpPYtIfanPeSKlrrhoX2K7lDlSmahmvJ+slCQvE54Viu2YLAtAazFOJCLomX3nEgaFHKSq8cfs52W1euU/a4LmF2uMOuc+y6MFUUlf+EynJk3nMaqiMrUjfPSvFvbVliWi1HyTrFNfGnb296wkU2rSe44h+l0n1UTI0LXAeGl2abQyJ+8jl+8FNUxFC0xvhTug/eIqV9U/h3tQD8ps31sMyOk6wjj6cE+lqnBQoDabXxmQfLqkm6hk2PlvWo8I9T/6WajyOPyx6ne1u8OGnyd4ERY/NFESeHk03Yr+2zlhSb8mk9LD7kG/N2xbclnbGlIImzLQ7pjNC0DDm06scve/7gRjzm0vA0IOukYV46flYnUDVAgat6+uo2PeTAmINpYNvvjg62vPqC+g1zCnK/Fcot/VLlLnz84vVI+gWf0BURMUe8QBtChl+m7KCRytSRQ7aWeH3DS07eKSXTeQuphj2aYEjxmcr3UeWeFwx61dbKZUh8HGoJCjRZaPvGEVFaKQwYigvoh29J/kgkT89fMjNQF9VZ6Qy3yfwhEvJ/XXo1fBP01VPeIUWudvgaaOmpG/XQ7aruKOkCuv/+8Creoe4vpEdbBhWtv0ljd+Saqjd/bvWh3RghJzLX9lQMhRoIKy73NoMibDJAyNPeM7ePQFCnPNrx/6J40KFf3pSGEtymmHJr1gFblf7PPEnZvqnN8UUb6ZzCk0nHwaYBVNv1aSsGpsLaSnHmm71FYkW/DciqIDM63ijrpMrRgz9EjI3n4sKlxPxn3n3NvBxoCQIxEYrqeO9B7FNGEKta3PvrX4tBpwxSse+qmbu9Q12ZVb0wJpa9wWH3RjqZ9WVNXTWkEG/rXTpPZx5tu+VIgUbTv1LUw03ZgSLnoVvZqvd0jlKe1dh3yah7zB2ocs6OX7qysmui7MtGk45Ugj7GeAEXMso3pAyCF6Bd9OXVG5yaQs70XQbHnkmjLdJutV/7mOFjSNg/UZFrxDevnm2/1Bp5rRnb6qilj3V4Vy2kqN/cntSRS6OuMJEz/Z1zUyMPLZr5CX1qP7qhbuiCDsIlydqyOVREKFyVZQNsNceqTFleGXq4Jg18a1wcT9W+lyJ/wTjk/PoeE8Yo1KiV6P1u8cuKLa/gn1ia4I1ds27MHIVzsqvVqiUVSkUOevMdo/KnoZda/J4z53ztDvNdQN/bqB/36eof7oJIo0fjxgr1DvF/IxiV6pY+NhM37t7gMwW3qJIvJcK0g6ICvHFJ51dPSvvkpq/eD5P3U3ZSjlf+5Yn6Vpy6gRMjN5ddgfX7ElfezMVk93qCiGg0V5ZyV/5gP9xx9JDahGJ+y5s1oRorY1v5CneLizc4TYJDzEoYmkQykGRnSjVw7IZNUMXytq
e4U2frqYpblWOW7AyNHs9QqSM+Uyt6JudfbjPlZQXbB0WVdtg37LqKWEIr+vvbT13me1iIQRypWhKNj/bMxuz63KD8OpSI3zG7azT8WF+NmVqBz7WAvfpSVTbu1xWWGUKI5FM94LfnfD6xksoG43r8mQLkhiBl/OYGPAyFHHo/Yt8ikvZoV5ZutL3iwnyXlSRUepzwCK1Umi9eF146vOfq1HcYrhSkQxTzT/okWxz2P5yofIZBX0zBfZLuwNZ5PEny1t/J5uYok1YbCClBDWHesXcHYet6t9n2ByW9Crl3+fJuV+w4CRo/gpv9mfaNZUKWoXf3dBESaOyyM7lOqqUynYQV/4eFp6+jieKIbCDSVK6NLvsKV60tIrXnUT1gKqJX9caWd7gwzrfC8VHv2cqPr6r+wIk/mxFnB3ccKwbZ0eIwRaMEOsaNuWvKYi5VWkidIzcwgDRo6UeXS734OVc50jnI0yYXQYN+/Pfk+WM6RDv/p7dYeRpoSSUyxQIpI3b7YWj/lrvnGnY93RlPLvnDY+w8phnrnw4iC9wbRrKwdIzqIa53KjMWBzYwKZRCzbeVfJKvpSYw1Ncc1Uu3KgMWDkUBWPeF7f3tyCRlTxzIdUjSE95XVJxfrrZpQGS/rmq3+ttBvVyQ3lyhxRnO7rv6OuvX5J9/OElRVLSjnk1RqZ+QmOzzSyhaflRLVz4PEkzKNzURe3TU4dgo2NYnLpWLnzZFaWrOCptSERc65B9YCRw8Lbf7ywLw+lV4ztzI+1qDPFeMz29fGJjNVefvMfPfWsCIESeSJu79sfdOP4X7M/+xEtcfcd1K88UMh2MnKpVP6dZLKmonuPgbXe3uZk3UqjJDQ26bIHls0QZV4yv70sSvcOlMzmGDsGjBwYHPAfE3OYkhkYPPzLCx0qKtg3aptID+nlluCrjigP+atmogg5jhtVKSmc1mc9Uv1u+5t3iZwWAIteNJX3vB0ice4lBcpCqcbvqf3WWAZYXGoBdpf32ksljAN2ljlKqjjiWOGLs/WXopwbGDhyqFnw1BeQklFHkSqiqt42OvX5UQ9uRCVwIo2OIA455bpTvrW03t5nnhIFNv4o2vm1T3rXm+uE9iOx8G/3lTFDvenU8AVH/sJRs5kr+kygxHlXB1V1cq8/JVv77veynQ9PdFj8kMqjW8V2IQZzAoNGDqhi5PkjpagXHJxpNcsfOiUjNEvfMMKyPFm9cSy1+J2Lln1+pO75N6af3PFSc+Orr6Mz+orb/kRN4LdXTXrIy5ITM111ffP7cuKAt7eR7sEeUdzPKRU5H9erIhjrRY+WLttnhp5w1FiVw0yOZjKwDzIG7XTDDJ764UWYJaoAK/x2a+pJrjgCuhQjvGqrlObgsZv/z4Eubd/+QiOpxD9+OhK/cex7ooVayps5/Y9owGeqA1e+2GDp6Su7snuqVe7GpcjY4wRMRW5rJ8OLYWSfnY9PHW/iFtAF3wpPcy4t8v623++AoqKir7ZIJGsZ5jz47v4DznN6zgwbaRs1oxx6s0IhRelv8tbVtgdgPHr912Ai4eWMjqKoKzU7/25zXIL2tRqXolvuwQeS/8Aw2leGchTpt6cDfVZt29llo5B+1E7Ax0v09YCbGxiwlUOO4zqva4XSWrAXCl6wbNpcLi2sJ7qlqNtRanzbmWfv33OLSt53Nb7xNRe5eMGy/5taHlUhLf+3FuZUM4oUX76sjNixh9FW12rf9ayUdGJCdDYVMC+sASb0lVUjM+xGUbi84ymNHuj5ngXdAcSAkcOCRrLDTCygu2WdGXZ6XwqCWbsiUXrOrmq0XvCxLx8YE+TZrmuBveTjCtvzjL+52Lvhuaa86PgigWmGBkzN20nj+l26shb5HgRH54oM9rjDKhyS1k3JGbb/DLtRTsYBC5oS226gHrjZ/mUYtNP1qA1Wqz7uQbVUJFt6UJo2WarsvTwyXoMJGysvetFopG41oadMHViVt/13cV70s/P+iKaqgoYT3kaSUZDvfPPPusQKPKZpqZc0s6vTk3FW5eghy+magrwVJpTjavkMEcZVWH78YzHZDYXufGrC7kEOFsegGz/8N3uEUeuoVVGJIEPB3PfE9/BsRE1TYktD0CpYz+4R9KI69XfvIUXrne9/VUAvt+FtK0qvm5f3l6AkgMh/bYn0z/tVPd/YruVG4/Ybc1A/Z5waTOOrLZALozlmhi/KzXkMwhYZrblWc3LQyGFEhB/czmM/eu03FgpxLGAdV0IJbNFyi7uzlGhapoc/tSY105PuRK4R7/q/mfy473/m75NwMNmJj4+0czSw4Qi/6kcq9pgX/XLXR/BHW92qAw8ZQZjZxm3q2UfD6oNmIJUF3vzY8GjbPZFuMDFg5Ahwj0Ku4ofrvr0M8sF/iXdpy6yYKEQ1FeHZ1fvOfr4622318iqwc/9cno776uI/VMEiIfb4lwWOdrr7DblYlx+STC8Y3XUy0xRu+JjC8jMW9joMa822XkUwRzp4JgOoUVafXREs0Uwa9GBjwMjhFOFp5X65+vYH//g2vOLfb6dVpELCSMRzyCTqAo89/7WpUVtV3z015HnTazuuPf+r+r1MtUAUsFec4dLODrU6V4nWR/+wKvHAZ9+LUrK6VgsQKgUIbO30DW/tN0O7LhHpjrsKVh9sgzbX9wL3ty69M6KRXuqMfO6pOOAHve16hRpFVldbNHZGqiEx+pFOKEJqFFOGiyhS1rtx55zuOyYX8han3Cw1mtnQMK4jgPrz5R5iR6fwvRpjwbslKYf00V4jHoDDuzMZMrL0YbBq1e3SDKGrg40BI0cJSbk0b8e5RD9+DG2MauGjXvGf6zvSuEKd7z8RMA56b0iN1N3ulodC+vfF7ry2+wwnVWDtmr2vkMZVdr47WRF6J+02p0Xn3tioXg3AhySVIpV3pilB4nGlzECO0MbHYXDieMz08WBjwMih0hRJ8e0K+4Sk16WeecBYdfL771Togt9oU5F48bVSVmRJ6k7f9FApdx28APbc+voECUvU7PcxKUtlZ9NoVyp3raiBt9+rB/vaERLse1XJUaRcfitN6Enw2mYGw6uUr19CgjObPOe4MWjkiIm1fcNJ5odtUuTjmXDM4zz06m88YAGJGh6Z1Uzc60bRd1c7sfVxYCMfXl9N5JO0aH1aRerMuGlE5CeD2zO7inuxq/yEGnjdxBLV6CmGGZWB/c2M16PYuMqg+o3YjsRzAwNGjin8g2FfC101MhmVu/dr3/n2R9ZQ0WL0rVua7R/0IqlRSKE/h4oXb3wubWABtKvnzrTaZylUGuk7JODKzj26Pnq0afQt3ODzoSiKkE62Ceutc9ZMIk2o/LgFcFEzeZpzCINKjktJ9SuVH5oS1Da6py/Y33oNYBOt/1AjbbdgSF0psrI+Y2047NxT6wQ8anlVw5UzPrDd3p/O8xLOMzarlwu1azQh6R8TKS26RBPU3HhAb00zGPmFZtqYut13Yws4YK2U5xg1Bs7xNonm8ENL8+kb074pnEbbtqSx1WSItpWVb/zNGQL5aoWlYu96bepw0Keu/XlTnKOWbMoNrzp85/C
9ArWiCemy/ynEk96/KCLFrmcjICG4q1DKM46btGjcuXaqJtieBzOD+bzU9WpsjEP2JJjBbD/YGJQ6pDugbg65gY3XrfqHb3SVSirjpIIHCx/yqIcdvg9jC9je3hgus24rfe6PsXrRd9a/1mR13f4K2Gvf6Ts3XEpNjdwp6b1jqfA7+yscvwcTZriRKzZTZ+1heWLAO8bxjJBpv2XMEAeS2HINVbeqIpJPF/KYIxhQcqAVgmseu+pTv7HRAFOw15GHH/rcYwoJFmgHW7QwdUa2veWfaHeavznsOZtTUefkCwP7nb9uk3ZmRy05KtW1nyHY+8gyURdkV9PhyFTR/ToNrYNLBTLCb6PXLA5xKLBzME/ErT+kW+V93ScCmeYU7u99bWZkfYEWjw7p0jcejCWjah3+l6uLVHoBQRHaTuroKLTt/VDTfkvnZVZhPOAZYG9ZqxIz5U0XqZt11WlAeo3U3FOAjxRSKYob2l7z4A2l96roHWYTBfj9XWVCsd7hi1+1ihZ/XBpJ9/wbg4cBJYc6t49iI9coq2x6KzACbY49X1mlOyn0b3/TS/NWM2r/C/2gDTB6SAt/9UY1pWda2wFdRdb4r1PBA+5URETus7TuiKJQqHtlDfb4CW4V6YUwEZfmZzUzGkqa90IL+6bUyXNMkR1ccqi8NBl/UVSk8bcs6D2eHPilbY00eRf6JrursY8/ITks+br0BJLTftAibP/NPdPXTnpCUVGj8Y96bVTv7b23K/N67wtdqejulQYr16moSFnlkZj16uD65Y1m+H7W56sEfHZMmslCO9gYVHI02rDY2H9r77H/8bFuvdqOR39qS1av8kaWpKJtd20du33r51+AG3bQf0h/lgBfscjgA7saPhShcucygDM33KMKO3G8Gql5h+HU1/c2qtD4QZObs6/aVmY2z3/RcR+9rEeOOYYBTcLKFWtOv63wx++UCWu2vvE/rEBNw4qHvmj8F6tXr3z6EXddu231zTeu02jePHJ7ToUUX3hGueZh61N2AhZ+8Pkju/yFktjyjO+k4ntetPe9ERQ77dLceAy0ukeevziccNlNx4xNfvywHxjhOxdj3/yM73pw5MVtZJAHVf7fBe5vdu4ar6fNih9MWCfHf/inFSOAmfVqGSQc3LyaVlwe9e3SfedioAXue39/12OXolz+tAJOuGBGkWQnhDr6D09G+y+mdrZzpqfxdyXlGVaOrctw/KBNKjN8OugYVHKURueMuvGosaLIRVK8YRSMml6XURI9RdysV8pn9MwvF+kFLawCEntfco/jF/0hkPjOpDvnlyG06RSgfvJkbkPWR6fJ8c/dmUe5eRkYK9b37PtzDANKjq4k/QVg35l6Z+vP/vZgWpg5LQyztsMCx7AKXniNSuhN0HO4Weuz96B8SF1dv6wGniM1Pcv4PSPGS/f14PibYzwkqTT6o2lyfH9mZUXvqSuoXrWjajU3MKAyhyyU1h7UifTrH5X3qrHQ1Hf947k/3tZQN/QKDKb0wOVWb7E9Tz351AXeNH/9Z3UDo9vqZt/PPIxczdiLHqCp8x98gLo5+Md7VjMZr3ZEOOjA27y0ut86E8KFnOd+fnL2Rq88gPCdB8oP/AVVPOD7i1QsMVNrqEHGgEpIJvNmr+d+1MqXvvDsYp4gvGL5W5ttP/pmHHhuPm7B3aMPPNgXH2SgSNmSuPVxN9DUjW9LzUv+eu+cZirDNIH6nE//i6tZ/Jl9KClX20Z3deAEvLia5IXMqm4VCTmU66YercP2BpuBZBdfChbrt456BQMXlPlLcX8vXbtEFK1Z5viDxhQqCjXKCo1JXUVM5E2GiiJPmKSe24vgqOGUtXHPgsQHR8Cctys3IY3fi7ORvj5ClTh5rBfLoRJ37z01i89SV4ruTl/7MLRgxU2ak/vKgHI5aAyW/rVMPznzFiuGNRUZ04hIyORhmHp5LvLAv/fMz5rJRPOQr35zz53qKExkw48TcNubX9vgigf/Pqky2GW/v2l00b93KOLENg0mMLt1jUFPV3qA6rxdVacGAkr2qov97oHMvar4MKgrR5ZUGnUeAikdeYcURVEkRTRTkTe9uL8JNeGHi6laGNibtPMTrJ682DSKaL64tIIWrb/b+CucTzN2rOHw75MrUsRZ0Mt8gU9LE0b1vh/UJp0JpEO3zjGfyiQGlBy9NbjRV5cD7aM/0qh0d/aglVBnwvC45fkOOFX9htDO7AhpXD3nxvq3UEHNY777K+iWJcd/Qwt7/d0TbxSV93iPHWbtq1RU+iykHUUpUue4UWreN/csHD0MKDkm7NtFn9zHAP5qi4qk3ISi00hq8kTIcFYU5Y+daLAA4ORNmimMvOf5aIq+dQiO+cj/aTSmzq9wPh8nmVUX9ipnS738CQDMDxzTDPHDsfFlbfD0lTlnN5/AoJIjT5oNbjoGErxjgyK6mnB7Z6koKzTec7081Wpw2vAbt2fNYHGIRkWK0Kf2rA3sKT/PoTLD9rMLFOkNtYOdG1EiegXWHz4hchiP6RGm9B3fSPFVEiS+P9M+NxcwoOSI3nx2S+j2RyTH0sG/8911Y7krSdGoqx5HQoqfvWgUEiO08N+NMvlo74CikDb8U8/A+tht0ozhF7vGpauwhF/TSOqqqGjjoUBPPf0DTfrmpn5NyuVZhsNJa7fd12m4nzGw5FCO6GUArH4MJCrY6/TX/tu3VqsjdbKKxqTYqh8+zrAEbUiP/IesMrOZOqRxffVYDDjhO42k/Ks8zlkvSyR4jKLpxbmHLl84yQ3+a+KU+78gXddOpLTo0jnoj+1hYI1gJFTMw9nnG597w+0ErF/7Mzj0kOqEdO4Bex918LXrbrigLLpicypWTF2O/4OXWcZLqna2RJZEt/2jV92KTIu+cFC1dfPSBYnmXlcUTuUXpdWlfnNTO7mSibJmK5gElo7oxY9OK89Kgf+oA4UHH5erATUY/FLc3+y8Z0SEijo/OKomQQV1f9K09WptGPWix3/3kq0TyQWhiJ5iEhP/76k0567ESGm/D4T0T62P3StdZSJxKnSV1VQclCc9uEX6QM/1VxkrN+z8zUY3H0wF1bn36OMZaAzoyjGJXi2X1hkXX/zNr1+4udDqAq6USSUFBFXDqqc89dSFi4vwnlW7sslnuGdbVysnL19541pAr/vDlYJndB5ORb6H4l89OBVhGGcpyPUym6oPF7qmYCpkOHThjt8Lq5q/u43c6v7pA8Ore79EDRbub3beM0KNNFYkqXPdx/7gZQ8f6UX0tqAXx+O0Xn6nunnCxN77Tk+bkSR1izZJiouf620qWh+Vcqirrfeua1KUXr5l56FUCd7cb2l53EQdZec1O3+viWv3pAJ+IDVzVFkZ+JXDDEYUTqkPPfT5NT/fumnt+us268d3LePmRScvrh5+5gOUa2S9KiqSzHACkMxqlcV0//XjP7fo4Hzp8XRbRJ1HcWS/1EtqE7WSP/MjMoy+dHJTk1GuNWESKU7eaRyr9N27yBV7H04M+BzvGgN+4r0Nwizcc/K6pJOSyYrbOp39jmrLxr0eccKR65p9irkMrGDeq4
nvgLBwS6x/xRc8YMEYL3hilJbkkSip2XVblWlI5pH4DFVJ8Zxje6OCodW3Bx4mCsfs/L3ofiSFcuv39qU43dYsz8v/T7i/l65fghLKzcR+UULdCQNXUeeVPX+Z7b/vAW/77N985ee5W1SkpvTyqSfQqHP+E6gNr3j0j9WZSomNe2WZip7R7Zv7gPcieiak3KLveY+7zopbd/peE+sWkCp+vYmZ81nmBAZ85SDcEhBOkRuJxhPIScd611RnrS78RQAPfPCZj1jS9FrFSoBBU1frP/buO2hSIT3gMysmaoF6mBmJ+KUBFoZnWf70HWCcdnrFRMCqjCtSeBDAUXvv9D238xPF65dWYWUu+mOBgd9WokIymcASyOWJMAtLv3XALed/usFLFeGS//Sn7z9szyecceLevY0IwdjmT33nJ2sAir/pNw9u59oUCXkgk9+b4JumrijVXaQord/ua2kqXdtgeCA7tN4p4szzB7fgobHxkVzNvajzCQz4WftEAHHq3cieMbR3Vxc9PXX+6Is//vnGyHi2qBu/5tofs/DRxy1f8YBTINfXvP4HW4K6MYv2B17clureOmQTY90LbqiWSD/9LqWVu8dU06FexsW49aJEHtg3UEm5IjzWfcdElVc4juZay9BJDDg5doVeInX7sLc0W8buvObW8UWfPUcUF62tX/mabPT39vidhb/3qbtcWEPd9U8+tl38nrLodwEjEp1Pbsa7HHLYtFIi27YaZWQWHNX/hUjgxBe3esrxijNaYRgzFNSeE7i/hZ77jsi9AnNRpM7mq993KBM+GBzjYU+E1oQV9bAv9lId75NkGEWvSC0q/rjPRxK6YTFguLF49bT828vCy83G/aDilHUqYxMlh+Yi5ujKATLzQkKYUVr1EUc87brbtn3vpzfUDYE4x6i6qch48NtOXR5qWuR7o7vugJJw9iqF3FqrftHits0msCAdsnJ6t1HvK1x6m3lOb10eNjJz5PGcwJwlR0+PiV7FsARNfdBBJb3qmo+dV914tUqVRVhZeNRzn3DQHihRQboPt8iwXK6Edqfb6rNmGddhvf60Os6mSVOhcAr/nqJQDuq0e2c3N7kx54Llp2Am1Dv9CLBaIpXuEU8YP+jrKwqCOrTsW+f84Yl7MGbRK5TwKzf9lVPUvr6mYyMvT/S1b7iYXoUyi5P6F5RiySKt/YyULLXaIhdm6uExJzBHyZGbiWhuBbiLglngLX7/B/905V6kgMae850HLyieWYAB0cvT/5WgKF6vvrtx7IwTt1tnL/LoFcTn5P6uj4ayc/62cNNvHQVUCc2xThpTmKPkmCx3HuYKMWnRMjavTlr8+4tCFfu9+0OniEQiMEl+H+6SuSz/9U0WFkurvu6O4iaBzKSRw/rWBRmWGPvXqCJXv12JUMirOcqOuSpzVDBhsJi0PzqAPIkbX/6Ir11685GvO7wnCPaOgvuWOuKqzoNUuEuuKcHSbrod6kZmOuKgMi3MWEkY9kNylRcfgPXq5M7ZR/D+PoHZhVkdjNkhrwvldtl945PhjG/0KPDo7XSOG7dCkIpzmKXpTyJRXDdhntl7j9399fsbc5TTu0JhMWkhHp7aambBMBnQWRfgx73KrDPNjkvACoCd3t/D2Ilk/pXNYZ6e0s555wHnEoaMHGl079ESHfdCsTY7N1j5VeHQdHDskAPAp2SOuMIRFCJO1nbblYgbP57JYS9lzsaOTmCOn/6O0B0XbrMDW0QvEGRWwigu3+ZBuS6H6ily+DUwkbZ9hPdtNyGHT17owOknjM9V+8YkhowcNraVkT2suEeeDRUygvFvdwzYcyJCtYfu9YEjxIH7GNP7isyyfx6zyn+Tkblq35jEkJGDjhhbj4W8wna/VadDdRmAnxjYtD3j1tsmnX8nLsKml44EfvtVrWLZzxifc02Gd8SwkWPdVrfNxd2gsVm4uKJqS1VSHVe3FNPbxE3jlQtEdRilX40p2E+3dFNUJ580Mgva0v2LISNHPOCQ4LbeRc1KB+ja2JygYSTbROeXArooZTDM80NKCp/eVhJ3vh4L8nPJc5waQ0cOX78WdXduoXHfUS67tkOh/ZCqCaBAQnZ9IAxF+6DUv60Y+V9vcIm9foNqhm6UcwtDRg72Odi5dfaK4Il01VoqyIeCk5EE4iK5wGHvQ5D1aSW5Os+Dmhfs150lZel+xLCR4/aNwQGzF9FrcBsEXkpTK0+0M9fWqwEEfugy+rt7lGrLtSI1x7+pJJh7bYa3x5CZz/W5W2x2naDNzRAWBx5moYmegvLVd1v0CoOdYFhfcpTxj9fIzE4+AIrPq7IDBXXcufBetB2/lyhwN45aD3pgpUQjIMyubAQgyslAmZ5F52NRK9tJTYmkX5qKO+CY46e/I/zMurBoNtOW6wqM7q1deVaNBYKL1MuUJJ0U25Ut1IVrvYF4Yi1nNuXi+wVDRg4eeJiztMyaRJpghIA6WsUqYQSiXK+JkutLDzXCNOVgs7ffEST7tWObStgc97sNGzmChWHXzWKKmfh5MlGWUvXS26qoSJcaFhh21FLhssRElp02g0nvou71O57bGDJy+IXXUB3ts/fIGmMIi6OjEZUCAu66CTCEjnCThFGKQQndmVCcctKs/f79iiEjR/x4A3npbOpgW9YGUJ/pdSrQq+102zojEjJOwSwM5IkG9y9fWSqvnj+Y3QZ+ZQwZOWyMlq7pzp4kGLffIVxPeCITAewydE0AAWYnItwgDAwrn8iUyI8aEgPBkJGDvch+6WyqkLd3Mdgz5YnVIIi4LGQIi9HDJmjoFGpi7X9TKY2umOtqygSGjBw61CPOv3181sbzNZiCNd0qKYLAZOkK5AZi1UoDTXaabvy2bsrEifsPyawOyWVMwg9eYr7xrl33/fsVYdyeZHBYRVhyhJvlK4xettvxbTN5AJEIOLsqVDpjjpaH2wlDRg4WegrunL3xtLbgsJ8T9IpKke++zggBrSMCehkPBVndfD4zkkee7fPbyiBC2zblirtnT5XVpViBEVEpCShUF2/tVQ3x7kO9520hasxk66FT1w/IQzKrQ3IZk7C9Hltnv3n2tIWxOxA1Sya0EcPFDRZIKOr9Jg/zyBS75S7HmtPacz7KZwJDRg7ql9To6lkbrpx9AU7jG0ohCgks+MVkFZg9Dp460KqS+MndGBxTDWhXxV8Zw0aO8ogKXTdrcRTpexklQinhPRO5cfHkvV+1YvK4bDhskKS0CObJMZBI+61yu2bzrI3384TCq1FoSAohH7sxesVz7bDpn6VYt7MvoLLfrkebYxg2cmB7uDqzF8+xpeAiexcIJYBb10+kZXPclFZi4bTazYhBepAPy6wOyWVMIXf2ytTrZ2u4dFQFHna0U4NhMl0+jpBAx07PnluX2/9pXM5xJ8zWj9/vGDZyWPs0/M5vzdZweUlOBA88o4oSbsVE3Bi9iGJj1dRxAuPa71aSnb5gSKwcw0cOOAHnWzARYLGbqNaThI+28ESvulc63wW4aeHhU79gos5LCENPG9fuZ9oNBoaNHMFerdwL4ZsNwSMOoyt0/UYCQ4Eot
4YjQrb3oumfMKjWtEI67JEjMdcDiycxbOSotb5LNR6z9Oz6NiehNZsgrGc933zNRAllHbRo+kARfC4DeyyN2ufJMYgo2JJU5Qs12Z58d3FeuKCzfrIGh3HHHRTDQIf0Gbss/JrPh1s6eFj2FIaOHImxjSX7IbM1XpNQ1LAly3omMF1TTJhw+n4lyHzz7hSUfamGhh5DRg5iwX6LzTqT+sTuon5ccgoL9qqmahFe6N6rcMnh/bJFix8h1DquQ5nrpRcmMWTkaJz1HStjsxULNn5AKW52zEFAcQPZRSZkEOnQ6V3Fu+gmBZz2jPZEV4YhwJBcxiRqYnOX1kpjdhwcI0sTKlrUplAZQXBLRMIwfK++n6iLbaoMO3hlls/RUuc7YcjIIXwZ1q1AmGZhdb+gJE0UXgBc6a7rlUSYWLlqmgTZfPWt2SBFZcxeUtX9iyEjh2V9meJ3NhjMwt5ywz9QqDh58rV08xZKgLAVrem4jarYhzZamJ5hwdxtRbEDhowcUdktiRhrJkt23WdIAm7cWiXPLJl81+3qDr1Z8wN9WpVt0p0fkExHP9qcZq5XEZzEkJHDyccWqw+arWd3I7mET/diElfJe+JMHNzHvpo7xkj1wnftEWiuZ0FOYcjIoahOcTV3tUC7J5L2lOEMjuqDJt8tcSXRG1gH9RUCGeejHaL7xGdozrbsmgFDRg4Td1WVtdn95kgG8PMWTtVM9WmpmuunDjg4psk3cuFHqcSppXExa1kz9zeGjBykTrebtWFMs2EEU/ykW2fEwqm3Nt2YJjsw7N/X+Sn+712j2dkrtRS/eleXQcWQkUO0H+awZlswC0Zs6VZvqPNvP2DynWbtGk3sOGlF3+St/n69zWLkkUJ1L4VhGDBk5LBCy+H4JQksdsPe0KWA3bUGCHtqe/Lt+lLrefSMPVf2CNgFuDQ3Jp52qOHYsCgrw1YwLldckM1bNYpkuyGRtkih+JvNIpW+1nD5WtFr/GTL2gaevaVQdY4h6ucN2aM2ZJdTFW7ByrXbiiV2Q+gIFZTq7xgGR5w4Pf6lPVnGYFXLABxL1YZPZFrYacPijp3AkJGD3p1rAlDc94XDEc41awMVzjhi+qZfnrCeKrsKkCpKUTnrRrzL8fsO2WwO2bbS1M0ixJGLetvKfYclbPy/bwXc9p9+hO6+qUyEEWlfGXIiofStQD76R608XNM5dFyvVxCsG5el3QkxFkGMrCUZXg6aduDdstV7nTWMVZoINgvGzklIxz0nhosbw0YO63I7cOtYBnYj0NdkBHdXiGwnpSl2XK2JYA3Zvr1IUsfthrsKC/i13T/9AcOQcT23umeTdPDiijC/7zJHU5OrrRdlPOo9j5ouQ3td7pU8l3yy96Ns/JwNZmOMdtrDEsgxgSFbOSpVLS+2xIQT931bqWSp3HWeEVXzqEUAgQIundKAdEAxKJZF8xYEaXN7rhfC3xFDRg4aXywrl+QizO97ioAhSz+qSOT6Ba1eD3uBbsMVELLRxQmRSiW6WB1ePW/o1uFhI0eru1FWHZ3cdsspm81ovpVVnJWPI1fq1Znccu1kg2lbtgyAwPFV1hCd9pCtG0NHjkK1kIh9k4N031ObEkG1zSBSN8swgRt33MlkLt2yhcikulgsOTRItqQ1LGU5pjBk5EiN79UibpHA/L7HCRpg43IjttxNmiTZjd1e9y6I5b3EJoSncVKxPQ4elkS3KQwZOaib1V1sZHdjzwOnuR4ZOmJZz5tWgKvNhBFm7K0A86iCWy+2IE5lWKo9TWHIyFHQelDlZuyOEcxV2HxXIuCgJT3HqwGXGxAOsn16bWsFfPV2w+uXD9lUMnTkSKVe6satvVf3vbVGWOK2u4qcan2OlgIcxU2hXgF00jIzBAnXd0Gho2mGbekYMnKE2zMkOk0EuxPv40IXCQsPKu+1ZQrjeggo4HnVFBHWfNMDSjd8aAI5JjBk5ECsTFTL8MB3J47Uojm/oDpiVSAzA2ft6smPg6W9NgpIG8cE9RMO8WEpXDyFISOHW6zJ5Gu/SaXubmwrBW/d5BDUj3XFxCJ024apA2xv1OsPadc1bt4cOoINGzuGjByNsRpj4w3jxVp9EVy/KhKFu8GDlU/CkhVAunn6AFtOT+SNcnYUQaWsYTN0DJ3B10erTL3viIrb7jF/HBDdpRkzx0HXTEmclpbYREyYflYVi9ap5tv1iRwGDNnKkYjl2Wg2FUPc915vhUSCSNVY41XPSWN+1XSBuNFFgCykfEcoqB+Cz/lukDtiyMgR+AhWpQckkhe7z+tiQnnPgJKTea/saBDXT8kwWrIIYThW7RUJuq37vocNLIaMHA7ZQjW517PxPg8kq06scFiyWKUX3kP3jqmVQ6MjPT050L6AlXWJYQsEGzZyCD1qbxj/bjLTblTYCYOLswfc+Z+WCBOJu26e1n5WVCX1DLBxAyV5bIyeyXSYMGTkMKyuEG2j7E5YlhHrzyVSovPDaCAQec34tDayoGfwUtL6G/BS9tivmmz1NjwYMnJQaLdNWJA8ym4E+/hlGy2VUoHXuLnBzdO7lC/tJa8YTStXASsPYNzv++8NJoaNHMFYR7C/N+pVpL5vEKysKEZmOUgGpOunVwZNRJAWWs2ijKUnthkZtoVj6Mhhced6zBK1oea+ywCFIx9e4Qn2JWRByK6dNnJpOYCRKB9eA9IJFGIW26QPBIaNHMm2NJiuHYNs9X2/ugT7FUpJvr/ccMx0Q9/SsHzib1M+1qVFjJI6Q9NnZRJDdjky65iLlSPKuxNFKsY65wWpKjFqYQFGc0vfZC2dEEhr21TTZcmpmfbQzeaQXY7RbG0ibIlRJZr7LAQ4C1hcUTJ+OMIRZXzNdPKtLwBAVi68o0nYQQcmCruRnDuQGDJyQH30Hpi+Habdcxzl9sMywIoDIpFlOd2ybtp3Ens3FDDS94oV9Bo3EsMWRDpk5Cjgo+ac4mGxO9qDqo3ntIHq1KUeUQnjrugzco1MVH9qtkletU6tAuUYsuSEISNHonPX7Sq2HLDdcaEbf3FeByc/iF7UkMVNTOdX+iLHEbnekFLkBQeTUbUbuvNAYsjIIdq3Vi3suuK7FQwu4huGhVFF4GY4N/R97osmS1leXAosqWhNhhsPEYaMHNZtPtN0ia29Pmu7kWXvD5SBOLeX/BL4zX33vlrYe1FdexmpxbhAw1PzfBJDRg5SvQZSdXIlY7e2FR2BnF5oMQHolr6Y1PaC3r/j03dSGpaNSJHYzeiigcOQXU4kCvjeD8BMft+9pAVrasCqJ1oJOc74bX3bxkgbQwaX4NB+xahNvB4qDBk5XOeeZ1UTI4HM4j4LiKmEZSue8t6Yi0Cdu/s+b9UGMjwj1Pm1wByzIXPZD1t8in11M3DwMrDdYX6k1f8psMxVYSUJ7Pa1fZ8vbsvwMLsdL3V7j4lfGrZH7f4+gdmF4meVZ56/u+N4M36nJUpVtd16SW4b+40Ybe9VBms2JJH3OmB3f28wMWTkMG6MYO9X7v5IyxeqQM4pEGbE6v4GLgt79c/ZuLGYqR6y7WQSQ0YO8p4By9q//MBfgnrp
yb19ablN9Le+uV8zXjDxd+lxFOeO83b79wYSw0aO+njHFu2+GVscbYkg7WXqZajcQp+2siCEYapPryhsuni3f28gMWzkKJ0Adr9XkkWvUKUtmyw3eov1ZVeOqmdFUWRg9Jjd/r2BxJCRQ1suADat2f2RysVkvEqLzJGQ7ui3qC2cqDlmV2BwxCN3//cGEUNGDkY3OSxd/ssPvGcoxn6eIHKnCZAZ2thvb21PcGPDBZVRLd7dnxtQDBk5bNM2of1223pjXlolWaq8bRQH23QHaboi8oKedKpP3ZrD8pA5Y6cwZOTQJXepprP7qYka2ZuiEkv2MRKgTdv681KWRc8meolw6rt2++cGE0NGDruVqmHDbndZU2zZCGbx4L16DalZu7XfU1OFoyKtxcOahw9ZkM8khowcnNDOcNAvP+6XIa27Dcd4JBIGtmY7t9qCihzJbA1B4klDuq8MGzkiW22/sdvDGKN7EwrbJu/FF6/ZbqpGQYnYclvlln3bkMVxTGLIyFG+VazRBbttzhYHPN0ws6+P9VqqcIfTZwSrpJrs42tzJMWSYfPVT2DIyJHaHm2+sPtXVexYMOLiTbjAWG39gsWoQXHWYFY46eT5lWNOYC+qTvXk3V45zMo2IWGbZcIoGybqWvdQSeDj/7MxSemV+8yvHIOMoEAIloY8r9n9q7J0Y4WBLc69LOpbcE0Ma7DIhFNfjkCPGLa6HJMYEnI4qVewuFOXiqW7P2DJt2QMVi2te1HsW6drZQtaSlikx1qkym8buvT6CQxLJFj0StJu/vum7i77vdhtzictqsgeR9QAFs3d6s88aPciRg8QBWbBkzOYGJKVA5Ah3XhuamzpobOwrfjdOUx+QurN0bYNbEeOLAsv38CBvYZ15RgWcjhAsdtV0Cnju18no9EW5IrDexkIvnnLdi3qVRWc6/8V89qO2O1fG1AMybYSjhGVvtjupPKikd0fsOY3f17sFdUrAHAOdICjHnbLtwPMKgRsXk8hluyh+x7mPtAYGnKUJOeWfwsvfmCz274V4CGZ9qsOn3r5tf/4T07+p8Rxfw/mo+MjhGvheIGHjNpwcmNYtpWKALpXmsJO2X82uMHtVKXv2Rn93dfxxAqeApjRZtybn28tZvaYkXlVdsBRY/gnM1Qf23tWEpodS2ybfm0v3m+h4Hag1G4xwuY/xiSr0NDM4vYYksvKlCyqu41qxd6zc00lNYW9pl8HD8tN56K3AVgNdJftXSmNxrJxG1KX/ZDIHIkEJa40aTzCZyFr9bf+I/n4g8rnnjX91qo/+9OJ9ASvi1HbSKFsq05Jw/KI7YghuSzLQgkXpZSYhasqFylX5CW9aFQVgBrohYN5wsNuuF24nXhSHfN2joGGW4izb5VzxIpqFuRD7d2uutjCIwEmayv0RBmD5IkcH7vTMJ0ydDVbpjAk20px8PylzaQ4cVbKgVbHf21k/1WpXtV7aZjY9+RbJqJFPUFLP8a9qZ5Y0pBqssNCjgSiurLKcGrTno1KGacu2vKC53Bq/1sPexjn/8WtgHlSJPMqFw55YqIMWxnBCQzLtiJSaE1Y0zq9PSu91p7/54sTfQaT3pCnfeRgDMwtRbMog0ld0nByY2jIEcjjQWGM7sdu9Fnpw/95TLD/1KsthjnY0hdZgVYC1wUJ10Ht1m50vh5sDAs5EiKtTSK63dBsWCxtH1XTs9OdXI4eCWAWhfNuLRZpNEXc9xalg40hkTlysgLXF7Nj965tVlzotuI/vrrndydfPf9O/ITX1rAYwM2cr40DZSHezIZdZRAxJOSooOamNSidsgjKLKgPJW2748491+zVu+2rz9tsuujUh2MmwDG6P3ai1X1wrur5eI7BRlZw3jrQIcxOe9fE9Yo82UJwrAOwZeIzOZHTeoLuyONVhtbQMSQrBxWmm0oqsSRKitlQH8oNNfss6c2PtuUeAWRgIrlZWhh4dchRIObjOQYZsrDSQtTZZ6cnTnft6sa6kx61bbFdn3pLlu5cj9E9bWwBNmydqCcxJNuKgVX7I2tuKb0Mtd1Fa793na477uj2hu/2+v1NzpZZ1o+vwsOfuSDKsIZzDM3KIe9lPOsSM+1OSfxp/Pb3fza+9pDevzcf/rcmWx6mjQCV3C72gNiIl2GVR4eFHEa3VZZb0HqKlzQb2gpQKN3eWhHdm+JAeQA/7X3mcVt4QG6ou7tfgmwwMSTbCtaiOnQRdO/KtlvNNKbRIWldbyQfb/66t0Cs/RCAOWUNkvkX65pWHtJ9ZUjIoSC4oYtzaWUdnxVy3BjF696Okcc59wXfXTt262decjvCLKIebRmK796EohqSWdwRQ7Kt4LLy9Q5R7wvt3ae8sNVnbKN9KL369x24/s3TH1sqqAtOK2EaUgPpsKwchlkaJZGPnJU7ZSp3b2lYsj+BIR+fEjoNMCM1jqF4+MocHvPbyiBDDcGZ7YIXo7v7u0p4ai1uVWObcYhgfPuPTdSH1y3TglfWyTvzLvuBhtXy7pGCcg20bLfTIb3Emo3d2HxFIDzRQH8BY6zLnk3HGBuXlXpYzedDQg4wWj8wau4uUHZbkorkP3fcNhkWgbaiXtCGCSOgjvNTEv4VU5qd+JEBxPBcVvDtDg2npiDt9sphaFXxEncYco+dWzCFnV4KFnflGFZxdIjIocJtWNU+GI/d18GMfAAh28K4gcXObDM5prpbJeYjwQYbgbmORtH5aWc2rNmF+thHA+to0YR5A9s75hPfMTeajnq9eoYSQ0IOR25PXgDpmja2+6qlw6Lnk/SfNzi1kxuD7QPMSqcEoBIMaTLk0JADZKx0D19LmYWrMuXmQQZrfg4S1c7bSvrNGlGl5EMzhTthSK6s4HRv3RreLGwSvvtGKauqfRYU+HZGhmaQcJ95qIly5zYNyxzujCG5sBQNrTpZ2ANqZiFXNqBsEhU3VEAzoa30131SvS9yrdkWs+TnG0AMCTnAxYYE7EvHd18IMIovGSWzF2E4YYaICaHUQgarRHgiaWjZMSTkKJ6subarpENKe1aCb5JrxLxaSTZSSNuvG2BsAWK/PYMYVgvpkHhlUxj1SlepqjQbjDcFXqT4RdMSeL/jtVeQVLqLXlyz2e6XPR1MDM1lmZVDJeKHMBtBnWZp+YK6soWORa/oeZ8qKxFlI1i+5qZhtYAxPOQIkfasvY5LimZBfwhctkeTbVnqkkrf/e/VJQVzB7xZk0KzoB0NJIaEHHKjubobDQfYbBR+lIql49ziFiooZtBrJIvAJOFpeULRXF0SwxphPCTkMIrqu2unXur9rS/uKxIO46aqLaO0+gzkBhPOlD3CqGgZs5ObO4AYEnLIE6xvwpsrEbNQo6sYmy4s5LvNSD3vSR/jZCoxIpFHjvTuvONtwCFZXGpgR8/OcBUs3Ac4ughZr5cLCVy4iMB9ccJpRnJrflsZbMgi/AQ8OKgQux99bopI+xjVsgRGt3f3Nfl/EYYXD+qFVZlfOQYckcjHK/CTk9kshN/IYWki3wwNtHYwB0mp4XaMltaSGNKSYMNiBPPG3TomyhaYjXpx4M1q4W2shjyRbN9
bPExFiD0QXdZm15DmUQ/NylEHdo1b4udiVryycPdFxbR6YwBV/zMkCEqLYxPm7SWVpzJv5xhkBA77FuHMimopK2xbjbOh7V2FaqyXZdmTLkpJcFVI0bkQTdawHToMyWV5jqItTmk/ECnPigywcG+kFSPU5jsW4JBQXChR8QMNrWtlWMgRFWn9X0SqV52OaTYEKWf5oRZ+1R2Yyo5BokrFmn0dy9xueRZCBAYTQ0IOp7B5jZXmVSMxSyIH6UilWP3TIBJtTJMKssu6TVL9qEAwUirSvCo7yCjUzaYxVbaCiaic3UWIPepCGnES1EyXRRaQwZePADw+lVKGVJUdEnIY1Ne2yVqNyWYl5teibvC4HYD2duZzSRjLx6mpLpTSfCTYQMMj66wupCsCi91PTZASvorKdB4ZaG9vBI1M5pTFlj1/7JMpz049kMHDkJADvKzLpNIuwmdh5TC6HJtycDOOaAHmUyqyhDhwiUnoh5aHZxa3x5BcVjGvHpEIP6MWze6nGQkSBy+vRS4OjEy93fsb1MFR4Uqt/Rjp7u7PDSiGhBwJyiGOArzX8G334Ch1Ut3A7ZvBGNnBZV8Ct9/0gO6xJeYLxg00ZJG2NWZ+VUa++93KIWgfcmyquGKzhFq9ULCpnwPIj5QRrWOHtXTL0JDDise/mRRHMxshpNCkghaVXI9sM8KWBL3efwYGJZuoPmFC5SpmpSjuIGJIyIFz48+F7f2ISj4LVrColcKektrN1k0FKzv0uBZm6PyokpWbu0NbGH9YyGGxblNNigMIZkV7EEUnlw5paZj5yPYjqhEwRinmq1vztc8HHOGbKeQl3aai2n07h2HUtrYFdmFdsFbfPAmiQNhISyiupAzpwjEswT7E1n8jPJ69DPIs1Iw1UJQfdYHVJRHt/jhBKIAYy2AajzQr3X8GEEOyckSVPw/GysikWRABArPUWlAn8k2J0OjkQ9QbOrJI5ZhIFq2jHA3JLO6IobmsssCwg8/wajbSVuhproeVQjoNPI1u30RUXcCeZuGE051vVz7Q8HL2Jrf8yIdCng3lwYSM/QPKSjIaqYC+1KZwFT9qDxXyLWqlIQ3omOPkCFAvpeSvDHiUdamEgVQgegYIBUQmprux/VKBNYyADSnBmqbCWgsMs2nvW2Be6k6qsBu3DqvIMdfJYZOVhW+7Lor2fyb1RP6RWcpyOeRiXuRV19WrChn3ovKPA8RYKfWy42pCdWv79aiUbGltLjnpoitjfuUYSJh668TmP7/bbGSfpWGEmdNQqCIbogIc1MKb3s5w7zRdxwqUsaAxa41i/RkPShW6uQuWuuwUYjosmOPkmMhEtAv/BanzBFDgKGpSvv1mQVc0mxqLYhHRkyrvjUNfIcOyW3RuoDZsoaw/67ERpRkTZC1eyWz4cgYRQ2DnkMH5VQTtF2pCrXTKh//l+rz3qtjmvrlZOLpowcI/OjF7roh71TzSBMRYgF/yTMlZPJnR1EPTtKy1p6lXHmxIF44hIUf5WgTtFx0NOJLzi7/7ZEll7Lo+aeAHH3p6rjDC701GnAOWMcW1RY6WsF1duFxhjKWiVmds87DmUc/9bQVA2y6Jms7zghDI2fiMTxcr9IoLe2V44s5nPenK0mAm3cvEZzMXjKTKg6W+XS79mFO6V2TRULZqSOXROU8OgVHG7m41tvBwDyeA5vbbGktOpNEEZKUoVuxrD/ydTQ1m9ssTn3tKTQ7MVhBgC7afqCBXrbWGi1SGNb54zm8r4TJV32qiyq3lWK7A8dEVhXWkNzxvzzvWXLj8hC0Lr377nTls7N++/vRXHJfo9frM1a63FysJ5+ZUxCiOa+9c5b6MGGWHS2RhHHTEsHrs5zw5quJk+1l45qgqqMKV67T/z257rtV7/XntB/G4Oo0tePDLv/b66wu65R//678fNiFA+j2IHj1vydpisn0yVVMvRJimdODxKnun1/tv5VKqIU2InONXlUlGVd1gqus/WehOwWq6aY+33KbuQ9seESMpWrTSk39+1slYXe5+xm9fkEIUXLsWFmQYuhHY8/FVxf/X3nnH2XVV9/671t7n3BmNerFkW+6WjW1wwRWbEnChd0xJCGB4QHBeIAkEeCR5JISElJfQAiEUU/Ighk8otunVYIztuIF7l6ss2eplZu45e6/1/rgjaSR0ZZXx09yj+/3YHs+9d86dfc9vdlm1YIba+DIc2SKtk8XADw3W1HWlx8URMR+VNde51/EUvDMTWrns9F+JhDNBNTsSvHaZds53LhioldWfe95qFULGu9cdVLKw/j6cww40r2HmuJQ3Oub16lYXhRPITT3L9rg4MJGS65cQ5XkzSSBYrenvb2677/sccvJCDCg8w9zzbv9ClMyqD3pOhPRYvv07HxQpTkelcGYY2GbXfFvQB79GdNHjKLZ3kV6m18WhDiO/MU1+TOEAlRT84GOOLLhkH0J0cDPJGtoUtu/r3+6U1Yd/EbQibtcBF8jXt8XrfTOQmCky3kRa1VgaoXaX4I3dkPa6ODy5Ds4uDIaQwpxSqC4xWsddcfxABjVCwEKmBRr9Lz52ZIW97BpVYDv1mhzkLjzS8hoKZsRORZ8xUoFtkBIh393YKMFeF4dJKRVza4TDKlwlA+VNSnvOwYngGfExb4pnw5n+R/8ssPIPVsa83RLpjum66Ik5MTqZoejjo4jMcqy9CoLcF6W/IZ2UKKSSewjCEwqEHMCqRyUQUXMJGwN0AhIUEeNpJ6vwmz+l0yqlG4L6dUkIs5FMYOp0YXOuLJnA/UQMX5SpG2o/73Fx4BZzukqyzZsuCYKD/uqBbJzu2/K+pmDTf7zQY/ryBW7btXob9QYI2Y1I9tYU8M0zxHoltYqUkXhDoOjPHJOSrIR0lQeWX2kRXEh8dgSf/1LZVn+tmJTyH/ZJ2J/cYqHq/hdvyuJ7kMx0AwKDM7eomp0hnjFfwdN9qf24jGwS0OviUNqEI8j4YusUrA/pDiQsetI2zxAWzVqvvHK+2tqnfLMut7esEFqCDi2MndqjM7cIXHY3WzJiUeC42GpqT+qeF0du5XAiStmKhqubyEl4FgjbKJCvSd31kNdZlPVvumk7G1LNHH5sYVJMwYQszGS8q78S5SerSA77QN1fViYnRpBfA9WISSeULxwMVKRt3foccyD7X55oXqz7SHfvrKGpWl+RRckQyHO3mIkqR/e1Fhr0dqzszxyTEi8cGcZKCgTJKrU/jBbDG6JsI6Fa8VqQaT94RZG45Lqu8X3iIotvR6MmgjswT23caSVBPvXQtljWXxlN7Und4+Jwccm8TkLFTDUsQFE/gtW3XYlsoxaDEYocsDkfi66rFnf1SQs5DIqQVi5xFUBmblEXOYuFA96HB9Itqyegj+3kpNeHZWLBF3qO3Jo7sT4e3z4E9V+2x8xejA83d8SDIayoMQ6qu15XNE+dEowDj7BOQeS5hoVN12kDeojgaNLY1DjBHheHKCS7DJIsCCimiJ70LkWveHtbgzp4HmfxiOSOMevuFISHurrMDBEbTuiBQ4EIyPROctQYNQrHPglTqz47IXVPJyM9Lg
4jawwMAAs6IYOJOrzzD1Eu+KKBCBJk3J4gB3DLtnBqFh7sel3FmTIA1t6A47jMYXz0ecrA3CeBkP9jIvq7TEp6XByKgM8eBS/dQIhImnY+WdMfXZrBrZPKMoYHLImGPGU9NvvZ3S/sRmsOcMvDY/aN2VvsOUwd4wjRHHzx8r44JikK9jAERjx0/KYx1ge/1k3rl15kLor45tsqYJFsxRKV0vfruhyYBPX5IOsTgoJN32JnUQloPkmMTPVfPf8hdqHHx+WOZ7sUlPVjQ8lt4sC/viqYb3jzRRnPhM17AsMjOSh/Z7E6MnQdvQDhePBirXe+H5o6Pt1lFHEPx89FkPZn1z8+g9vj9Lg4RJBQzI1k1oOQlaAmzLjwvICsfPlVw1uGAipODrSvvIxa39HqOnMI5nJ8ALl87IGhGeM9NRncmX8minP9JY/P4PY4PS4OsrnxrCQmo2YQqL1QwN4wVZP48z5+v1NvDu9zHKH9o+e2haGndC+6YqjYnIxWPxcyjrSGtigK46haeCFZKeyrj+sQ9xw9Lg4LiHJhFLyNJqDotIyWM655N4Hh9x7/zmuWG9Q4ViMZUx79k9Utjx8+aOseO5tRHJk5iPGoISQszBk/zVSAaf6deaC13jDinTqlDTvS9rg4FMijG5Ijy9Q2GzxF7PD3vCrFJKs+fvKR77zhh7dsGDEtSCHr8vNPvos2f3Iesl2zt9hIobrCa60KhKHxz9WOC2HWWwozsSU3iGKdnWuT6PWkJoRAKybCvPG3JqvWMz935hd/pTmx7hOfSMXC+bPbac6Zesv6m6/Ngj/nb9Rz15vp5GCD1DA1l53Mg1k6LpI4d6KNB55Sqwntu09WBGptVo5Cr4vDECRbZ0nJm9ysIcUiD5133ue+fdk6K2snL15ceIrfUHcC+Dve26rK7lUAhUjYd9YqGCqpShDmjU+/Tp3drxw1UIOzXHFcmpaj0OsToWBWjxjOveRxMaEx1wHiW7917XsHK1VMqDMJd8oc933TR+YB3Tv3ZQzqNqIPd5qt4PPGu+zr7I66zzsoI8SfjYA4uWFVXHp95hCB1AYPJxLGzQSmIcVoGg77wMuX3XTNPfdWrUddXKYdsPbgc85aNAehbLe6XjYgsGpY3O+75EW4BZg3/nnv1B1j6IzbNUu66I4nijSviEuviwPw1StA9nudado0r+cAMQd1sfDEJz83s7ZqP6TD99YnHV1NRaEu2I42cBNnyoKlIVf/+dygAZg1/vnsgru6Pv0CwdUfOA7P3bMre5QeF4cJwobhQJ433yk2ZbsHSDGASyoFD8wQ2V8FXIKSQy5Ire3UYBDU9MinXJyFdVHaLZCh8f142qns1IE6qajBuOyc0iLesFLGPT4axfFf1uYseWirXmux83wnRAwVggBCQaeVU2Q7cRidymFzM8j6TGE4c8dvSGvBFZSZZUSQbwSE1LS25b0+HBF8GdHi0tsnpBH1GCqWjeMiYsviqCowvsL12J4DZJ9THMdrXL1xTSJ7XByWJOmB1ICw7ZjiXbuue1BObYlxKJ0c2ZmD48znljuzjsVnZDRQrRD8satJ9Rg9Lg5VVI4ZokgLFmYmoJfGGCJ4qmUDFLMZAIdpU8c9vzFbTv25MzBj6feybVGLshH0uDjQpKxOoS6OWRQn0LchBIlFHoL6JyutVmBw2rgSDDnRacqipxwouMq3Q8Oso9Dz4nDcfHU7a33b4jzBvg2vl23wwMq7NQAeZox7Lo/V2Dc41BW1n95Px6jeJHpcHEKB7B9wWq2QJy59xHEXKxGjviUrCDp9ixIMna+B6rlAiqu/A90PPz1Kj4sDgFkafPD4/Txsp6TCTiIYtA6d5R704NB5ZHDcvXcb22BY+apZKMm+lUndzfG9Sa+Lw1yTJAkjd9XsSNnqHcVDhtAGm3EiuGC+/7gdTapxBUdtxkvNFG5egjfNft7j4kgqHpcXKUtZjCsTutuYE52hQvHhq8faukzZaubofFU5tCWGPvR9K/qnlUlFBGOfGYIvX8sE+jZUcWf+4Q4jv0gGCDO2FEfnO00sr1zUuU+bpo1eF0c23C9f4cqqtRNrZ8iinoNTlutiJ2JsvOdtzM4hENNhjrvHGmtaI8AeF0cQoqyyYDplcPtBfzuHozj1cqWq1jsBx2ePL1KbxoKVDX3xFMQl3d23kE42zDLzyoyUE7riiwtJopkEYaxa4Ozxb7tRiGq6z35ocHlY+kfZyYUHDdxfIznVE+l4M6AgHYJnX2wdN9sWe44MnblDCYcUGeLqdtNWlV4Xh5hbngXQPXttV1DJFWngLIXwkHZc8ePtHBtnKbFMPLgml/Xi+5qmjV4XByLKoAdnYFBqJHUah2awHegeux0sFAT/H/OR/MhwdhNnZrH57ksF4kpSp30cSkX7q9KwENKeF4c74bSpWZgX2oVbzCpOpz4LvhvlmBKCyIZ1ImH9TcEFGBwY/76dLzFrap0+zYjEb1S7NZJJSI+Lw1UsVxtw7h9pjbV+FQxNYoR61w0fJe7Uq4fdfOSL6G/77Me+ikaOe1UMieo318R+xtskQ23EC5EVG7xG0ZTJIVt0g2KXb1YNSSgO2F+CcVmdQXxg6viz7MZ3B9cDBQbhhgZ8mlvQ4wHG4kicFbJzwFQv6qApdhq0LclzjV0vAVlAwejAzEEXBo7s+Ox16riSyJmOE9YlRZ9fwwi+vb5gPUmPa91FnNswKRZMU0NTIEN45C2nHn7EuT/aDcNHMhjgzqWSQ/VEMUCZPu59x7yyLh5NXnAkBa14b8O00fPiAJErVL1+4CG8IAih1kfOvuDBasnPL0+7freiulPfvdZdLKmCwYzxQaSddxcz3Pd7A0lS+srShh1XelwcmhxfXxvlM/bPwZGaqrjqLTcYz7tu1V8W3UtJPgaOi1AcMA9N4ZFObTGZEsa/AICkgSB2QvCQZflojy/SW9Prw4l49f2YqN7WSWEscnnx71f47316CoNe5FAXKYgLKbrkHT69iCmOHHvEqkw+3nGJpnM2n4x1GFMXxtKi9g3ZEbaTQteT9Lg4TElrViZ1v+VkR6Aq171xLQxecYq3ppz0tCcdXuQgiGlEsB0/2ioIXq9OwcqDY4oAg5vLZW9OcHJxUavUkKGm7Uh7XBzuokXEYrorB0ihrP9uRVnpyP1Jjcs/OvPkV7xyqKijulgudlwbprggctRtIbc3Bi6Pd66MxXO4OC7U4opJP8B4UhGEMDTUkhQOwiui+Df/PlSFk1AIsvrStz79GqKQ0dgp4rFDjN3lorAK1qKAMGMrO8dYu0hx2u45+7LF/RjSyYS51yuXt11mnhlEHZNPYEXtAhkyWnPz2f8qWMxIvePzZMfQ6ix1Avfk5CA+fZxNbcwrC45sSokI/RjSyYSKqCXUh2eOEMX0Z9cFt2OmFvs+70VTYnSjZRve+VduaJu4E3/Y2UGEtQJc3xGVb2XngE2HlpktpJShHv8wf4te33PgGiMWi9FZYHr1y9cjr/vsTeWMfbmK+qfvD21a7b+5/u8OmN7y7sUDf/u6Di6sXyce/NbRIQFkfCXSjmdPvGMW3
WfI6orhi09oVuGnHhe7iAqGpLU3UQHVKkp/Bsc+YT7Vqaee/r//eZpKW+ySU/54GWknYkwlCEZeucSkstVagyFbJK5sfH/AmTG3Fgq/tFna6HVxQGZdJYI8REHmpxKq4nDrFOZA+dMrPqQCDH/+tcsFNTr1Qh9TJElIAZl1AB54yWB0DJ8+ThyJDIoIglC8DI81y3bZ6DY56XFx1ATWjjr47Cyp4Jee9fRTN6+V6Yh3f+ZYcOXHZ95DHYyNkUDbp7PJ0BALnPXugiKtcdE+W8srSCZY32U/mSgAHRBTBgRl1W+INiduDvKJml//lUPBlBue/bXCTJ1a6scetXkAfFWNMV0UU5jSGi+Ojf/vDuSM472+g9uaHhcH5KoedbE4QxG7dYUmnta5sR3M9ei/PekgsZIHfvcVP4xJqgEPj7kz9c6CsWaDBgmRzkwxOODjXzD+9U/TADanbwSbXIRyn2mqOmWqY3pHMg44b5ytK2nMnHv1N54eqpj966/4XqBM8ttdI7emE1Kc16y37Jui2svBLU8rne9EgBMsBYp9J3Bck4EeF4cn8GDZ1t7p5JHrA2HqjM0VJ4ltDyhP/smzcbzccP4dTtyBM4thGUJQBDuSUQQ8jit/Pr56HO4PDQajXtvjn+bW9PhwJFLftopCuLnycjCI5Pa6cWOqW25gdfjg26bgVbz3VSN4fuxRjzWun1oghUxjwNVBx4tj3H9x/+BIlsjtKyZybHueHhcHeFwFtQ/MGZCq+k1yPXBAN4eBF6gaWnDCx783K2Dym0+Z7EAccECVRGsAS35p1Tn9amvbpxXHbw3BUqkDv32lXqbHxWGIHSbA6LHmxQO/JPgzwlapb50h2hmfDCC856UJBcNtOydazwpx9J6VqLOqdDXFffq4Dal0ciYBRMOxlhWT/mllMqFGmNISZOo8JS+t1ByVbRQEzUX96n+bbVak757+dRxx0e0MXgKeGdh3VjCmjZ1MRFpdTyMbHCOtWToRY5o89Lg40Iq5M1yYOReJw6rOgdW2KvwEhDd+af+yxq5+xWWPXb7LkZBtxfqMrrviEQCEKV1/agUqoq2GFejocXFkCqLivuqenKksEzodMH4bxZ5z1TQpaPFJyQhpe2Z0SRAYrhRYs/FoPLXr66eCu0yb1e353qTHxRGQvHgpxA03EWgjmu/f1lbCKjyp7vf9w2tr8/VPBs/odjtJSwYtvMDCwfuC4PhQV3GcEEwkNy0fssfFAVZbEBLtYPkRzAZOs21MCFpWEhNy4qdPK6H+3Igoup1U2kRQSFOkjeYNwwDOlm3etuDti3BncEbXF/QkPS6OhA7sN8thcAoa7g4wdLRuq8KOl+pBkKdd8TwN3PCJtuDePXArgDvxwCcALF3aiRiV7uI4pBDg0et3dzyTix4XR7RsK1eijDwyAjdllRXf8G3F+QqYUBP8HWWW9Ge34LKd7pCSQPCpRwUs5NyJId1OeuWDD7tQrLlyQgY1aehxcZiGTpNH+dmg8yxMeNe/hxpIQL2Ffz5AAfY7RwWX+F2XJN1TFTyglsX2AfHBIQB3n7Z5vVKHcZV8ZgziJL70m2SkxoQZ97g4xPGFR2DCBozz3yUW7W2veAAjUnvhBFzHx2+0w8hgRtIhQtyenUNA3XVJbrmZOo7I1t0fx80jU8+lULj6+MP/LMewG3VBJhW9Lg5B5pxMwANuQ3/9ppA8XHT6V5RRCt/Y4jNtGmUuGTy+wDicvL14MMdA4ZjQhnY95mYb2PxpbbXAyAuoLcZ81KueHKBqSBR6jxt8XaoizTYolxFpT/lM8W9ovez3Hn77wPAUNYlkHV+7NjAyOKOm1S5cOolL3a5rHpT8MKjNmCkde/m4INKtNr31gUVNDeveuU8idrG09Bw9PnNAKcX6ANz7CLnV9k9+mDpI8d733zqFSkkESUZ708utZGnUNp3U+a4oIWR85XezOHOmIyLAuGAf2TLcp1j4ZAYIPPhwinG0KT3te10cFc4DGarhnwCtzDv+8zl4nf7l6LNvKWuJJKJuznB2C//91RQ4Y/8CtmdEd0Hw9krU2bBurDHkVuIY//LWmYzicuA+sc1AU2pO9rg4JEI6LYLk2wiWovPqb3+gFUI7/Pipf7oMQzK2qaQsEkevGNZa37GA7e4a3R1Bpz8JhVUjncfGZ9FvdfuNwwYpzO//GmWiKZ0Ae1wcSbPE9x0ZxMMhJI1JMuF917034rL2w2ec/fEHQkDGddj6zVv/EuXgF/lYRcouiBpiTH9rrClMcHcY3zGjk0a9SQQhPzhCDfKjJJGmtJft8WE4ITEYsgeOS9E9EoGjP/iFpxpw/0//+LRz/m1ZndwBc/j3p/3HupDkr1oine6yXQmgMF2QHCsTUfChzT8Q0fH9XTxMUQAf1AZ18+rx00qRNfod9wHzD4t5c9PfV7/8y/9+DUWtS5f87P2DT37KSXNX2v7lFf/xfUJ2XvL7O3r9i2tC2nQUDt2PsgyYupr/4uH9CLYTiZeTmR4XBy7kO9YEy5WMP162W294/eUXX/cTg7TC7r/YAnlARqSscqt91t/ucJGVh5HEmlsP6mxQ4+aZo1O/Y/MLbTZg3npk6f5V2ePT8SZ6fRwhW/SQhdXLsmw+QrZAjvvHS9572MwhATzkTD2CVnHO6V/9wVE7fPkjEFquY2mOm6u/bN2Y3GU/wZ02P6acwBame5ZeF4egVBGTFlt0Sko103zwb26887tvOenEI6Y4A+QBsp7365+eq+0dnvWfgtGesm/RiecY19R4rCngpl9D5gRAonx9GO/5T3WMHl9WqtLChtVtwQOSN2eypeimVg/E8ulPx4fXfe/2csOwLp/yu88KKdY7XtjttFZODH/kw53SHMX4ZWXLfoMes4MnblrbGANpr4ujRBg8buoG8fVLDh93ryLihJAl4JKHBs8DsiimOVJgO/qnPb1cT1F/8f3TEP/tPcdmTMuiAhgY0hLazagr2PMTYDA9fr5btCU+fl1xlGRBsU5bccyC4moBsx0f9X5HOHU4ZrY7jmw+rfz2cWRoahCQ0XWPmDWl5mSPiyMjmXYJOf6njM9RFCAqoEJABe0E7Chs16myFTKVqHnaVFEPuG/y2dsA6jL+7ZJ1Eh6iqDTEQNrr4ghGweAJJZIeqmzrM8TuU4+STO9c1vGtjLv+Vu8k1l6XBQ++YSki/dPK5MBwObwCQvE4WJ6K4woRW75srInseAspW1jCNESckNHKU+9/qh16fBhZ1dAAJnNEthe+s2ukodqdsgTw8T6arat/+QYpQBSTpnhWel4cDojdHARftcG344LfRXQtBYOb2jJtFke5lQ7tkdFaSCYDC1T7RrBJQcRh9bXZlTvXPA7Lis6QmtHq3rFaxZtPK4UzvgN2CqNjh9tyENvhk/Ikp9eH4S4UEVxWr3scLp+HnYKNCdRbGFK2QFlVCi7R1z5YNUUbvS4OR51p+wV14pqJ33KQrqSovd0C2VjqvMNW8TzavmNUgCTRyx43LG6mx8UhYmKLf5nB4+zHY6lvARKXAD6+tnkZlfGVxYwvqEUcjQdP/C+xp+hx
cXjWpHP2w4hTZjLx1qfWGdRKumPj2218PHSS3zbNJPqTh4wcguTR1akpaSu9Lg4JFDb9pUBuu038aYVnTcEJczrfbNZeMbBl9Hl9SQq4Z5F9FsTH4dfYM/S4OCBRsx90egFO+MxRV8Mt8UIYK8Kw8fGtxVHcJQ6GsXAhqSn28x4Xh1u0FjcOgterH4fgvGKeJmN0xVjxwE03vSyQ8TPJ0tsxgRh8uLLQF8ekQNSF4dGRAO37H7uY006T1fIAeszWj08ptow0vPMBR4UsrFixuTlPr9P7wxCmPKPIhGLZ42A+DyZU2I/Hvt10/amFM16LK0TIjiRyJlt/5pgcSNuQDLl6ME38zOEGSLhs7LtNjw/qFgEC3OECiqFxULZVzbAn6XVxVKjW15qgrN6B4rM7zaUezPPzO23rN2fOTRnT4ViJKX9AHASija6f+F9iT9Hr4iiJxBcULkaqJz6OQgJ5QDm2s1BsFscQHTEokDOy1DuellTUGW1KVlOPi6NyBDlqiiJySjHxwT5+VBlGTT7aEUfeJI5ZbNxxSA7B0z0ApoT6NYfCY7fs6A16XBylGDZ63RpzZ/8d6MC0s8jSSim8GMuN3XT9WXR2IB21+Or7Oq8mM9d9dKJ/iz1Fj4sDQAeGFcHvrib+tGIHSV3UvLIzF2xeVma5gDvqBCq9e1XHpU/Boloak5rQ4+IwMPeFJog8Um43M3qX0KcfChTrOrfeNnX5mdvZhyIGqeSbpp0GxPXhLy1NfYc7X09uety9rKgL83BgCROfwOxXPkQd6wc63+VN3R/naSdV1gMeGb4IENyIfzvdxXOPf6ob6fGZo2Ns2BAQ5eqtC3VNAJJGW5rCQ50uxLZJHNNx78wmWVi/5EExQJzTnlOINCTHvufFkcUdpou6UO9EU+Ed5sTZ7UB+qBLIMlqjncl2Xq3WEUxwhi5a75SOR95VYKSGFBPsdXEYqlCbSVZ/HHwrbKiolXtWIAhZcTAYmBbo5MpmJKXvCGUlQjr9WYMo/Tqkk4MC8ljFA5tZTbyF1B5aLyqsfyiBMlLjIo5OnaUoCB7xWNzjWkkw4vumkcEnfmO8Z+hxcVjtQVwsuBaHlRPv01CXYJkYxMFGE4Dj06dtrAqGG1emTlcwnvX8nMPY/NIEelwcWgjY/PkZq3+9Mk/4dG7iGYrDDgngnqRTftRnlxvN51mCf3kpChblxYSA6ThjWW/T4+KAnDwcdIyWwuX3hgk/QtqgeIEunImBtqGz01gg7ptqkcpIxggkPznh5kL3gvu9Rc+LI0RhxWKrPPDwxJvP46KDqGmvsQwiG8a89LIfIg4uBGoWtkBz5PCjIqJhx6t/THZ6fRw1hj+6WBjIcm2aeMfb1BmqgIaAMzJWmENnd7YcjpGKtKodgmlqvWKq4clNpSEW0l4XR4EhByx0HS38mIk/yYrMMI88tArDWNdp6kNexCaHvTD6DbJjevD7AYmiPW933kivi8NRKJ+DlXUIj8NSLy8e9MSyB5KCrwdzRGXB5heE+p4HKYxgTy16/tPcih4fTmdbWPxuSUKvtIk/rVDUQQiDEYKsGTu/xgM2Pq/U4XqJNSIcExpygt1Ej0+AYwvJ8YM5k66deKXrw/83QTxqIXVQVik4+JR9N70gF7TUEE/Rurdv6VF6fOaALIjb1Fzg96yY+Mtf998tIa3Gg3pegzk4+8zc+LRB+km2KJB6/O9sG/S4ODppZ3ndcurAg2snfF73BWXbZYDKFQvrrZNsf+AmHWioFn9WSCbC0U1bVXpdHGIK6JShQJa8ZOJd9kceJeKVqZKVNYgicPC4kpblJZ04Hz34hIl+8z1Oj4sDI2d07isFivzNCe8Xbjfc7R5tpHZU8hpxB4p9N7fSkHypBaJgx8zt+Q9za3p9PJEQQP/EwPiaksHTxBlK9Z71JYnTZwviYfVYLOATHUi5xoXv34BhoqzZIsupETRkFzV39kpJsn8k4ETyRFk80sFaSVG/GUcY2SBuIh72FYxIwOXD37kv5KIGjp+wN5009PrMsZFoDusfTR0D94TtPcKVql75PgC+fk3HKjr1UFBwN6n/a0SFmhKODzTEar6Jhohj+jwCfvNFEYc8cZ6vfEECWssRYP2wB8zYZ372sW483725liRKjcx5PALR9iwNEUc4BiP6d8B8p4qbPwZeRYN2J7B4bacztR5QqGRMyf5/6+sNBw8635oSALaJhohDzi0yiUvvzRObiujzsohyCIizorPz9CNdALUcH/2ZJQQrNE87VB+HjLs9S0PEwdkLKWDNmgxuExem117h6j4wE0dkmQMu/iQ6zSOD/GzlaMsdrY0XzKY5n+YYTRnOtFnUgRkLSksygbP7DXdqDt5egoGvQcVN5FDJkB1GPuu0FfZXOejdRVPy3DbTEHF4p2jcugfxkLffbHoHr0dnkRDTPOAFwdB7gznBiqMQUMGu+DGCwQt//u2rnmT+eNQH2aM0RBzwO0EM+0UO4hNgvXGrQcm/LhBG5ywCcVYaSJaDZqHupKzXUTgF+rQTz5qPNSbofBMNEYfYgRnH7wyZOAHB36oFJMLPKrJywBAIPOzg6KIhdzGJgYsUQs3sZwyWWGyehbQh4kDW4iA/XCNG1t1eVgxwxR8SwXhSJ878YQD8CSKOOdz5q+eciiMbhrO54t4Qc/MmGiKOJDMiit97k6rH3R+VYIjmZbdGB56qOLRXCCJqxwtKIeTv2J8/ARMdWS0quDWmz8pGGiKOyDOORyXmb45i7H5Nrk4pUfcqg+pMR2DlGsGD6XE4FTX8cN8jr1b1fMiBmpNJwBpmIm2IONzmvbSVPPHzjLL7y4oD2ePc41RFrNMDatl6F4z9D8IpCTxyxSEf+40pLJqXQ1Qeh/Ige5iGiEMyr1kokeKaG7PUu3+TJGeCe70yqedyHwF8Te0iFg6egTqjyoOrn/xlISkzCY5ntCnZ9RtpiDhQO7j0RB2WBIoJaMEXgpnKB+6QFDnhbHHgDjAXP6UGbKDmcl5/mBNk4M8qBAlb9IBrBA0RRwr6yDKUkFeMYrtfj9RxVdb+CNcUTi8RkCUIwf2EAhdNxdrPzwu3Cjk/7YTGVIjbioaIQ2HqmZhmUkk9IaeVTP7ldUIgn9NpQnkH4BQnYbhFHrjpNTMegMjzGzZfbKYp4sgMvbmAIjygtCbgSOkaCPcHSCw6xQSHu3Gc/Q7IKq7Ul9mLZpdKis+sH/tyvUlDxGEhc6Bgdf7srfUEjCoJjLIiO64vmZ3BaC8VUJ48KLh4Kn42tCgtyIGp+8amWUY30hBxQEhD0yDqo/cUE3BoUISBdJuA2HQvQFi6WpDMU1BMJD78nacc8OP7iQFrnt18jIaIQ2vi/LOQZEPXwe5vSBUSP/2+E2CkxlEWrwMhnAamkK9qn5b/G9p54ZBowyyjG2mIOKxIFOcOeNQNX3qkvfu1wdyI1aXLA+ZyaJkBX2ogMvNowSV7+GZ+0cqvaKQ8e2pqXHzgGA0Rh2aFl58n2bj7ba0JOK1orkvIwcOMZxMckxtLI/tJ08mBwMofvvKkHz1iKUz9A4+N89WP0RBxeFC
MU/EW8u1rJ+SCxfrLkQwvXOioK3dXqHJyJGRHblr66nSTx5Dr+WLthpnNN9IQcRjgnA61SvX3o7t/syxVd/0SLZCzMMNJtxPM9KngQWr7t4PPvOdzpMyLp2Qtdn8Ak5KGiKNTAnL/fTGz+NPdNzzkEIsVHnIt+QgLjrL6Qckw54kYeHHdN/9x2n8+StAFHyp4PMpqTwoaIg41RymeHxBP1SO7fbOUSn7mAurzFQF7YLUjcuz+qGTnf576vEf/DVyetTCF5mUzjdEQceAKxN/PBBi+YrePlkLJVTghHzWXpJjebuD+TAwL+pOrLxh6YAUty3/gAW9KN8itaYo4BEBOOUKS4lfu9tHS4Sc3k8mH/c9pRNy5HoHizKw4Gy5+42F+IbRl/jFJcu4vK5Mfo/UmF1O/Z7cvJeRPPFxCcf75OEbgZnNkwXFCCmnxo5/MclkO4v80u0ghNqYq7VY0ZVja+fdsFAs3j2C7mZvoS64kK/UT25iZJLsJGDhjUD1a/NEfFeE391H4QS+riT4B4SOTk4aN65GWo3l4Le7IbpwiTP5uqWSTfRa1PGgk3LsSGHl+RtAl55xE+vNlMlo8q4zgfd9Kb7C8bcFl5a2d+OBdP0Xo8m+rg7zgEDJGLTeuQ6Q8S8HXPHxEQbzVVOqhQujUkGskDRPH7GlkF/sBIshu/D3b6Eonip1JO5IJXAtw8jwx0uh+hecbV2DoMwCksepomDjO+csStfLbw7t5Ha/NSB4PtRYWEa4G95cISjFtX5dwzRrcT3mZW0NXFKBx4ggHJZS0eHVy39gPZZeu8/HRIMR5hyqImG+4lUB8pho1U8jYpS3gFO00T2iqQhomDnvIQy7M27uZnfDQp3CXNG0eKQuutz2Cc8IRQIAUufJrbeDJY81l+8tKT6Di7rVE7ZjBdvmmfWakMHVON4/BEG4YwfQp01yzWo7YTyoKylM7ByJp2Ie4iYaNKx8rFvG1S0CQnQ8XdDMyfvs/UEfjlL8NkrOa8ysEe3UlBPcAw9+0WHPsoRNcu3Cy0TBxhBNOI4HeCrUTdtoOJgoBuXM0kCg/uJ8TNKtWV6jrohNLz0jInr5xvaYpnFY0NDxwIw0TRzXr3KDgn78GEXbFdKkG6ULNwLxn1g4Y9tAdGuzs0iQAgfhDnOHWm5v26W1Nw4YXOCEb+C9/VoVdMpAmHIYvt0DJEbEQ3KPJNZYyr81jH5Zce6l4qV86urHryRhNE0f7qEUQiFaKZ935Em6RQB6cSqDipNEsuIjrjw2ecGrnFZ75yEMeqnhKY/NVNtIwcXg5/0UtkPQwuQ47PzoDUiicSokHDQRcJYd0mUdeoZIBI9zwQ5FclHMb61PZSMPEIfj+Ndn5dTuU7HwrUc3g3HivBEj1xmveuFjT0IuzGhUqIx9/xIXy/wygDV9XGiYORKJFhKuuzrvYO8GK+gcbyMbgCeYIpvbrUePokwKFldn49VcIFn/vrY0rWPxbNEwcTj5bI8roB3cttDOgFPfiQlj0DMRBTS8OymsT4BbyF148IlnnvY/YXAPHGA0ThxCecEoiU/z0tiqknd8T1CR74JsBxwddR4WMrrwm29wXxxpXrde/71H3wt5yUOPnjcaJw4Djk2pR1z8qq10oKVwQ9RNLchCx8yQNgCSuf7DgtIO8QITig0sJ1DPOhcwEFDyd1DRMHACHillNuJxy53NYnZrcipiHgWeSwTTwQ+riLZgZnr/34SBZ9M1HVxSNKxC3NQ0ThwJP1RJoffv2XeicJBQepiiuKUyjZRiSvlXKwucQVB39KMmV9/yxKHhfHD2G1aeeWAk6vOHdG3Z+cDnjV3yqAnjeTJJaxG++w/x1ReXJR+QrP0Cxw9+/fx1z8z67rWnYABMa9S+e5Bh85x90p/Mig6KX3EuMdsy/D9Kxon23TFPOy6VEpgx/FCzEN7cQFLGGlZbcmoaJI2aT+oVfHFLAPn3Xzmc4C+0SUsJn4WJk/L9qXnBgIMHw+66OHvI5b8KjizWutOTWNEwcFoTAsU+BgC475zaHip0xcxsbLgQlLsIzkkO+/UbnfHEi/q8fk0Q++kszkM7U8TiOZDLQMHEogrq+xhDJcfF7hKqsdsYalrn4LqINplMgmgQPX6s54yQTJ1WfcoLyjjkNnzA20TBxkN0Rec0fdurMXnyBl3VMO/EXHliWyYzMfUEWBJP664E3DCnU8YLF0bO97twJ6ATVGzRNHEEcqsG/ONqVDJ9KOWrMO26sUtaW4hL3PTaQceWa2/zQl5kZxa1/WybC9A/Oyk2tAbY1TRMHaELTgg9kKzG55r9Cwnc80dl91beTBzcqT8EF+3Ky189WVV/7p0tqJb/ogL1FGo0ThydHiWIvfQW1qvOJRwqXHT+0iFx8g5Gx80oiSh65SGe+MZNcL/yhBiP+QU1/WelNJApqqPKRBY4Fv/ytbdspH8h3o0bCcefhJAg/fTC/ZqES5Y5/MYvoi84oYG+ZOxomDocacUH3n6emOfCt7wfbiWVlZFWyJPbemUlHI85nwtCfIIb9/HZtjRb2NmAXzPK9ScPEIVAgAsY0Q8iBD63coQLDhmWQv/4REA58OUoJ3HGpvXRRdpJeg7WZeslTMXLjGsh2oWHi2Eito29QzRKyXPf3VjzmHqFC0UC6/9MMMpBmKOKQ5KvrWv/Lgmh84GJahLPPGsAaXHNhKxoqjoKBNx5lASfUXxx57NeXZPMK/atVjIRRfWFAXIlrPs+5Rwvm69+9NLaLfMAAOe5OgnZv0VBxQB3+bCi7kYvllz32BtLaQSnzrf9RClnf8wEMJfP9xdPfC+h9z75QUqhbL7EUSNLcT21LmjpMi7z+GEC95mPrHvPl2gJJ4a9SRYGelVDHtPpU+eIn1M79L/xVKMnysqdqJO1OOaneoqniSOL1GZRYgf/gwh34gdqIP76kFM8ceFJpAHLVpfo3WuhPT7kp5jbqr/FsHgM7Y4/vZRoqjlxixXlPrwDEPr8Df+mF+uVvbVdg4XnTDQRtf4a3HeTefttqUknJvNMkqEBqWlvyrngzSZVn/yUCSthv3WO+3pJXRwICT16W3LK53cOcZW23P6dj9Zp5UfZs2c29/v8wgklAQ2cOQnTllNe4YmVefhckqm1ZvQ3PZEwkf+IOQDnsa/PURBPyIf7XvNJ+8c9AVvT85ysqisBeMnXsTsm9SUwdxSXFy5/maoh+4TXBPHQacG1Jp2STS4qPLloTXev9LzoRSKrcc+LMWwar8qlXmKrUcf7PD8P3ks3GGA2dOQpBMM54LibB843B1bdVu8ulU9PLwoY/XBuT1a0Pn1hTWyT7R1d/dNDLRxebFqnm9G8fSr13aaOp4sANysz7iuglLCcTs/z2H76BY+bK178uSQkLzq0KomcND3/ynBfiq960JFg7lPtccrxUj21pbRYNFUcW9UzwM85KjMKFv4iJsI2ukcE63ab9qgvMxDy/IZVkCcH4h/Rhof3iS8hTJFcvn55GdyFJqrdpqDgEXNzFTmDAWzr80SqCbcMn4ihoeuvv/FyRqM9+XcQCnvXOj73rqMyFl8XIsD
PjfHRg73HHbmRPH5ceH8wsuZv516cjaGh9zd1Ht/VC9+T5ApBYwrvMfdRTdvfXzh9xv/cgoETlWe62zZ9vNE2dOZIEEPGXfVAQy+2vGN7ahqddDOXu96Ceaqa9UaBlQRP//eXPlF6PPiBChfvZ1JJbe9vE0VBxsDEy0F49FVf41g1be9oNMqQM95z7KKbRn/DjozBQI6R3vuKFUNwg7oJ6OKsu8i5UkepxGj5e0bmnO0iInxsd38bcMwrBiAWjL74JwVL8u1OyKxUgF935YVTyQU7huPzv4woXfC8oybEle3pde3yp3a9ZVKBoeaclz5ufyeaW3G3dZ1+JxgDF+0Y9u7mbp9FFX/K25ztOLgPAp90t+fif3jtouDi8dj+jY/D+hZttejh5Z9tpPzxBUYHA57NnH/vnL37Xa/+P58xVkMi7K0vZvU57mzyaLg73dEoZCMSzloy7tbWbe135DYOgUOq8H5m7V9nd88jtp6zI/sGAEEp4fvbK3Wvb6+aOhu852qRwZpUjIf3kn2F04+MRh1ikq9qBKEV12g/P8pStcMg68IEPtf7rTX8BkCte9n4dKfAssg3XTLNpqONtE6Z2/3NvAyB+9PzND1uk3Vr91X++k5DFT/3erLoAcsjBhy++Y/pnb+kEiqr90cdSpFPv1qqBPTOIPUXDxWGaA9c+bVQMtYO+fMbGx11y4M5P/YuaiHHE5XM9FWPy4IG/ufFKAFG3p5z1fhVTMLWmn+x+m4aLA4AVxz0UE4hP/eYzAyaCS1ZZ/MKbEfC48MJTXUxyBFPzv/70smBOyDz9Vec/9tWbzJ7e9Dy+VO5e+z8toGPRmf35FZ69nT25j7yy40aL5X+6WXZ3s+zm39o3ArQoP7S3bUC3pvEzR4op8ptnrVSTWIuf9rXpN+w3vECu/f7VlwPqvv93jpUc6iJLLjzfc8GnVwWtofi9dz/B9jIv7NY0XBzmwWVkkE+/Y7SsoEx21JqHy/a8vFoTKuYce8GJkFXAR++94q6PjWY1gs342O/XQfeyyK+tabg4MB0dwC186U0pBCqI2VvtTaMu6muPjakTE/rvH7vDDHFCPvT85x6NbyM4aO+i4eJwHBcFPvqR+xzp/IOLGOKonPV9UnTZ8Ksv3XVt3Xla7ajvHOyaVK1u7ekB7FEaHkft6upJTOUlj177s9qCJIIhrqghs179jy7xwUc/89Xh0aKO5uJFthf+n0OQuqAq925tNH3mwNQFF7/iRStlynqgqFFDrdWe/ow3nzE7h+Uf+sJKilqcooaB6olv/91BE3Gh8fbjx6Lp4hgjH3dnFVN0MSvqwp6g7ZOf+fwFdbzyxru+cY8H82Aesuo/nZMOnGm6t9nJu9DwZWUjeZ+bSZgRZNrpb180995qQbnkU99c9iiaYspolpA56eVva9VF2ssPsJvZS2YO7nnu3TkmcQIznl3d9nBVt1uVl8kQ11jBcYec9oIjCnIu2629zsPWhb1FHNWFf7xKHDWgqOnsOwQvawdO+P1Dnl9UJVVJJ/1tD/+2k4S9RBx1wQO3/I8lmkBjRUwQE2pq0DrrtS+aUhWuiZgDLrvYObB57CXiAIzvfuCOdZqdztkkZAkJ+L0/PglDMXVx7xhF651vt9BE9hJxmEXTurjq7HUxq0Gn6yMzXn/ck44t6sI0qVKVORh4f94YYy8Rx8a14h8+tEYcZGhgdOoZLznyoLljp1YXcnAxzYG9qTrL9tlLxNFnV+if2fp0pS+OPl3pi6NPV/ri6NOVvjj6dKUvjj5d6YujT1f64ujTlb44+nSlL44+XemLo09X+uLo05W+OPp0pS+OPl3pi6NPV/ri6NOVvjj6dKUvjj5d6YujT1f64ujTlb44+nSlL44+XemLo09X+uLo05W+OPp0pS+OPl3pi6NPV/ri6NOVvjj6dKUvjj5d6YujT1f64ujTlb44+nSlL44+XemLo09X+uLo05W+OPp0pS+OPl3pi6NPV/ri6NOVvjj6dKUvjj5d6YujT1f64ujTlb44+nSlL44+XemLo09X+uLo05W+OPp0pS+OPl3pi6NPV/ri6NOVvjj6dKUvjj5d6YujT1f+H89ZERddMTXsAAAAAElFTkSuQmCC" +} \ No newline at end of file diff --git "a/gf/LightDiffusionFlow/example/lightflow\350\211\272\346\234\257\344\272\214\347\273\264\347\240\201.flow" "b/gf/LightDiffusionFlow/example/lightflow\350\211\272\346\234\257\344\272\214\347\273\264\347\240\201.flow" new file mode 100644 index 0000000000000000000000000000000000000000..a14ce53ca6ff72de6f2df2998ab6a515584853e4 --- /dev/null +++ "b/gf/LightDiffusionFlow/example/lightflow\350\211\272\346\234\257\344\272\214\347\273\264\347\240\201.flow" @@ -0,0 +1,36 @@ +{ + "state-ext-control-net-txt2img_0-pixel-perfect": "true", + "state-ext-control-net-txt2img_0-model": "control_v1p_sd15_qrcode_monster [a6e58995]", + "state-ext-control-net-txt2img_3-control-weight": "0.4", + "state-txt2img_seed": "2587406147", + "state-ext-control-net-txt2img_3-enabled": "true", + "state-ext-control-net-txt2img_0-ending-control-step": "0.87", + "state-ext-control-net-txt2img-tab": "ControlNet Unit 1 ", + "state-ext-control-net-txt2img_0-guidance-start-(t)": "0.21", + "state-setting_sd_model_checkpoint": "天空之境.safetensors [c1d961233a]", + "state-txt2img_sampling": "DPM++ 2M Karras", + "state-ext-control-net-txt2img_0-active": "true", + "state-ext-control-net-txt2img_3-pixel-perfect": "true", + "state-ext-control-net-txt2img_3-enable": "true", + "state-ext-control-net-txt2img_0-starting-control-step": "0.21", + "state-tab": "txt2img", + "state-ext-control-net-txt2img_0-enabled": "true", + 
"state-ext-control-net-txt2img_0-control-weight": "1.2", + "state-ext-control-net-txt2img_3-guidance-end-(t)": "0.75", + "state-txt2img_steps": "30", + "state-txt2img_height": "768", + "state-ext-control-net-txt2img_1-model": "control_v1p_sd15_brightness [5f6aa6ed]", + "state-ext-control-net-txt2img_3-model": "control_v1p_sd15_brightness [5f6aa6ed]", + "state-ext-control-net-txt2img_3-active": "true", + "state-ext-control-net-txt2img_0-guidance-end-(t)": "0.87", + "state-txt2img_prompt": "A RAW photo of a delicious plate of Rice,chocolate rice dumpling shrimp,mixed vegetables,thinly sliced chicken in a coconut milk curry sauce,hyperrealism,hyperrealistic,octane render,hyper detailed,8K,", + "state-txt2img_neg_prompt": "(worst quality, low quality:1.4), negative_hand Negative Embedding, verybadimagenegative_v1.3, 2girls, nsfw:1.4, bad anatomy, bad hands, cropped, missing fingers, too many fingers, missing arms, long neck, Humpbacked, deformed, disfigured, poorly drawn face, distorted face, mutation, mutated, extra limb, ugly, poorly drawn hands, missing limb, floating limbs, disconnected limbs, malformed hands, out of focus, long body, missing toes, too many toes, monochrome, symbol, text, logo, door frame, window frame, mirror frame, ", + "state-txt2img_width": "768", + "state-ext-control-net-txt2img_3-guidance-start-(t)": "0.45", + "state-ext-control-net-txt2img_3-ending-control-step": "0.75", + "state-ext-control-net-txt2img_3-starting-control-step": "0.45", + "state-ext-control-net-txt2img_0-enable": "true", + "state-ext-control-net-txt2img-toggled": "false", + "txt2img_controlnet_ControlNet-0_input_image": "data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAZAAAAGQCAYAAACAvzbMAAAOpUlEQVR42u3cPY7iWhCA0d4OCRIkkJmdwX6ICRCBnRGwBXZC4JGROpgfjehpxvfeqvNJzp6s50u5Tr8X+GOUJOkf+nAEkiSASJIAIkkCiCQJIJIkAUSSBBBJEkAkSQCRJAFEkiSASJIAIkkCiCQJIJIkgEiSBBBJEkAkSQCRJAFEkgQQSZIAIkkCiCQJIJIkgEiSACJJEkAkSQCRJAFEkgQQSRJAJEkCiCQJIJIkgEiSACJJAkgt9X0/7vd71y/XMAyv/ZgfH2+9Xh6iQvcr9bxvfwkLPW/tc2AffG8fpANkOpzaX6YS1+FwAAhA0gFiH3xvHwDEBRCAAMQFEAMDEIAABCAAAQhAAAIQgAAEIAABCEAAAhCAAAQgAAGIfQAQF0AAAhD7ACAAAQhAAAIQgAAEIAABCEAA0jQgu93ueYitX13XVT0wUWCIslBLnXPtgNgHAPnSwNR0gJGfFyAAaQEQ+wAgBgYgAAGIfQAQgAAEIAABCEAcIEAAAhCA2AcAMTAAAQhA7AOAGBiAAAQg9gFADAxAAAIQ+wAgAAEIQAACEIA4QM8LEIAAxD5IDMj9fh9Xq9Vsi2WxWIyn06nY80ZZqNmu2gEp9bz2AUCKAjLnsHxe6/V6vF6vAAEIQCoDJNs+AEijC7Xve4AABCCVAZJtHwDEwAAEIACxDwBiYAACC4AABCAAAQhAAAIQgAAEIAABCEAAAhCAAAQgALEPAGJgAAIQgNgHAAEIQAACEIAABCAGpqbFVurfL9siB4h9ABADAxCAAMQ+AIiBAQhAAGIfAMTAAAQgALEPAAIQgAAEIAABCEAMDEAAAhD7ACAGBiAAAYh9ABADAxCAAMQ+AIiBAQhAAGIfAAQgAAEIQAACEIBk+JRJqYUaZeHLPgAIQAACEIAABCAAMTAAAQhA7AOAGBiAAET2AUAMDEAAIvsAIAYGIAABiH0AEIAABCAAAQhAAGJgAAIQgNgHAHn/AS6Xy9mHZbPZjJfLBSAAkX1QdB8A5JsHeD6fx+12O+uwHI/H8fF4AAQgsg+K7gOAvOEAb7fb8z8h57imvzReHZaSAxNl8WYDrvbnqP1TJvYBQEJ9GwogAAGIfQAQgAAEIAABCEAcIEAAAhCA2H8AMTAAAQhA7AOAGBiAAAQg9gFADAxAAAIQ+wAgDhAgAAEIQADiAAECEIAAxP4DiIEBCEAAYh8A5A9X13XPf7b1a3qOTJ8ycT/3+x+A2AcA+dLAZLsA4n4AcQHEwADE/dzPPgAIQADifu4HEIAABCDu534AAQhAAGJhuR9A7AOAuADifgBxAcTAAMT93A8gAAEIQNzP/QACEIAAxP3cDyAACQHIMAzPw3H9fE3n8s4XuNSnKUototr//Ur9HrXPgX3wvX2QDhDNs9gAApAWAFEDO8cRAAQgAAGIAAIQgAAEIAKIAAIQgAggAghAACKACCAAAQhABBCAAAQgABFABBCAAEQAEUAAAhABxCJve0FHAclCbfNc3K/+uQIIQAACEIAABCAAAQhAAOJ+APHCAcTvBhD3AwhAAAIQ82zhA0QAAQhAAAIQgAAEIAABCEAA4oUDCEAA4n4A8cIBxO9mngECEIAABCDm2cIHSOzBB0PsT5RYHGBo4WpyxwIEIAABCEAAAhCAAAQgAAEIQAACEIAABCAAAQhAAAIQgAAEIAABCEAAAhCAAAQgAAEIQAACEIAABCAAAQhAAAIQgAAEIAABCEAAAhCAhAbEYqtrsdW+OPwBE/t3i/KHE0AAAhCAAAQgAAEIQAACEIAABCAAsYgAAhCAAAQgAAGI3w0gAAEIQAACEIAABCAAAQhAAAIQgAAEIAABCEAAAhCLCC
AAAQhAAAIQgPjdAJIPEDC0uSizvejmtM3z88kTgAAEIAABCEAA4sUECEAAAhCAAAQgAAEIQAACEIAAxJwCBCAAAQhAAAIQgAAEIAABCEAAAhAvJkAAAhCAAAQgAAGIOQUIQAACEICYU4AApDZAsg20Rd7mOZeav2wLEOgAAQhAAAIQgAAEIAABCEAAAhCAAAQgAAGIeQEIQABiIQAEIAABCEAAAhCAAAQgAAEIQAACEIAABCAAAQhAAAIQgAAEIOYFIAABiIUAEIAAJNaC8SK1OS9gsFAzzhVAAAIQgADEXAEEIADxogMEIAABCEAAAhCAAAQgAAEIQAACEIAABCAAAYi5AghAAAIQgJgrgAAEIF50gAAEIAABCEAAAhCAAAQgAAEIQABSPyDZPhXi0y1tgmQBOhdzDxCAeJEAAhBzDxCAAAQgAAEIQAACEIAABCAAAQhAAAIQgADEDwcQLxJAnIu5BwhAvEgAAYi5BwhAAAIQgAAEIAABCEAAAhCAAAQgAAEIQADS8otpkcf+NEXtn/YApvmLPi8AAYgXGCAAAQhAAOIFBghAvJcAAQhAAAIQ8wcQgBhUgADE/AEEIAbVCwwQgAAEIADxAgMEIN5LgAAEIAABiPcSIAABCEAAYv4AAhCDChCAmD+AGNQWgIuysGp/0Us9R5Tfwx+ecc4ZIAABiIXgOQACEIBYWAABCEAAAhCAAAQgAAEIQAACEL8HQAACEIAAxOIFiHkBCEAAYiF4DoAABCAAAQhAAAIQgAAEIAABCEAAAhCA+D0AAhAwtLCIsn2qwadqnHPGT8YABCAAAYhzBghAAOIFdi7OGSAAAYgXyWIDCEAEEIBYbAAxfwABCEAAAhDnDBCAAAQgAHHOAAEIQLzAFptzBghAAOJFstgAYu4FEIBYbAAxfwDJB0jtcNW+EHy6xbn4RE78P2QBAhCL0rkABCAAAQhALECAAAQgAAEIQJwLQJwLQABiUToXgAAEIACxKJ0LQAACEIBYlM4FIAABCEAAAhCAAAQgAAGIRelcLEqAAAQgFqVzAQhAAGJBx3peUM9zLlEWajZYWwQJIAABiMUBEHMAEIAAxLkAxBwABCAWpXMBCEAAAhDPCxCAAAQgAAEIQAACEIAABCAAsTgAYg4AAhCAOBeAmAOAAMSidC4AAQhAAOJ5AQIQgAAkNiC1P0e2c4my2KKcX7Y/iLyXAAEIQAACEIAABCAGFSAAAQhAAAIQgAAEIAABCEAAAhCAAAQgAAEIQAACEIAABCAAAQhAAAIQgwoQgAAEIAABCEAAAhCAAAQgAAEIQLyXtQFS+0BnezFrXwig8R5FmtMm/5gHCEAAAhCAAAQgAAEIQLxHAAGIwQcIQLxHAAGIwQcIQAACEIAABCAAAQhAAAIQgAAEIAABCEAAAhCAeI8AAhCDDxCAeI8AAhCDDxCAAAQgAAFIA59MsGDcL+MnWfwhARCAAMSCBghAAAIQgFioAAEIQAACEIC4H0CcM0AAAhD3AwhAAAIQgLifxQYQgAAEIBY0QAACEIAAxEIFCEAAAhCAAMT9AOKcAQIQgLgfQAACEJ+SyPDvF2WuonzixR8mPmUCEIAABCDeN4AABCAAAQhAAAIQAw0QgAAEIAABCEAAAhCAAAQgAAEIQAACEIAABCAA8b4BBCAAAQhAAAIQgBhogAAEIAABCEAAAhCAAAQgbbzo2QY/CkjZFrmrzcsfdgABiN/N5QIIQAACEIC4AAIQgAAEIC6AAAQgAAGICyAAAQhAAOJyAQQgAPG7uVwAAQhAAAIQF0AAAhCAAMQFEIAABCAAcQEEILUBojYHMMonHTxH7POL8ocOQGRxOD9zABCAyOKweD0HQAAigFi8ngMgAAEIQCwOz2EOAAIQWRwAMQcAAYgsDudnDgACEFkcFq/nAAhABBCL13MABCBNA9L3/bjf712/XMMwWBwAAQhAAPK3pmXpUwq/X4fDIdXC8mmKNs85yvOCASAAAQhAAGJOAeICCEAAAhCAAAQgAAGI8wMIQADixbQAAQIQgAAEIAABCEAAAhCAAAQg5hQgLoAABCAAAQhAAAIQgAAEIBUAstvtnku19avrOoAABCDODyBzAlL7Qi31vLV/wqL2C8DOOeO5AAQgAAGIRelcAAIQgFhsAAEIQAACEIAABCAAAQhAAAIQgAAEIAABCECcs3MBCEAAAhCL0rkABCAAsdgAAhCAAAQgAAEIQAACEIAABCAAAUg+QO73+7harWYbgMViMZ5Op+rBzPYtsVKLPMoC9IdEXb8bQGZaqHPi8Xmt1+vxer0CBCAAAQhAMvwvnXdffd8DBCAAAQhAAAIQgAAEIAABCEAAAhCAAAQgAAEIQAACEIAABCAAAQhAAAIQgAAEIAABCEAAAhCAAAQgAAEIQAACEIAABCCzDXSpFynb4s22iKLAFeUPE4AABCAAAQhAAAIQgAAEIAABCEAAAhCAAAQgAAEIQAACEIAABCAAAQhAAAIQgAAEIAABCEAAAhCAAAQgAAEIQAACEIAABCAAAQhAAAIQgADECwe4thdWiwD7lAlAAAIQgAAEIAABCEAAAhCAAAQgAAEIQAACEIAABCAAAQhAAAIQgAAEIAABCEAAAhCAAAQgAAEIQAACEIC8tlCXy+XsL+VmsxkvlwtAAAIQgACkZUDO5/O43W5nxeN4PI6PxwMgAAEIQADSMiBTt9vt+b+U5rim//J4FY8WAKn9ftkWbxQYonwyRgkAqTmAAAQgAAEIQAACEIAABCAAAQhAAAIQgAAEIAABCEAAAhCAAAQgAAEIQAACEIAABCAAAQhAAAIQgAAEIAABCEAAAhCAAAQgAAEIQADyh6vruuc/2/o1PUfNYPpESZvA+RQHCAHiAghAAAIQgAAEIAABCEAAAhCAAAQgAHEuAAEIQABiUQIEIAABCEAAAhCAAAMgAAEIQAACEIAABCAAAQhAAAIQgADEuQAEIAABiEUJkFiADMPwXJaun6/pXCQJIJIkgEiSACJJEkAkSQCRJAFEkgQQSRJAHIEkCSCSJIBIkgAiSQKIJEkAkSQBRJIEEEkSQCRJAJEkCSCSJIBIkgAiSQKIJAkgkiQBRJIEEEkSQCRJAJEkAUSSJIBIkgAiSQKIJAkgkiSASJIEEEkSQCRJNfUD/Ye7UNZHtNkAAAAASUVORK5CYII=", + "txt2img_controlnet_ControlNet-1_input_image": 
"data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAZAAAAGQCAYAAACAvzbMAAAOpUlEQVR42u3cPY7iWhCA0d4OCRIkkJmdwX6ICRCBnRGwBXZC4JGROpgfjehpxvfeqvNJzp6s50u5Tr8X+GOUJOkf+nAEkiSASJIAIkkCiCQJIJIkAUSSBBBJEkAkSQCRJAFEkiSASJIAIkkCiCQJIJIkgEiSBBBJEkAkSQCRJAFEkgQQSZIAIkkCiCQJIJIkgEiSACJJEkAkSQCRJAFEkgQQSRJAJEkCiCQJIJIkgEiSACJJAkgt9X0/7vd71y/XMAyv/ZgfH2+9Xh6iQvcr9bxvfwkLPW/tc2AffG8fpANkOpzaX6YS1+FwAAhA0gFiH3xvHwDEBRCAAMQFEAMDEIAABCAAAQhAAAIQgAAEIAABCEAAAhCAAAQgAAGIfQAQF0AAAhD7ACAAAQhAAAIQgAAEIAABCEAA0jQgu93ueYitX13XVT0wUWCIslBLnXPtgNgHAPnSwNR0gJGfFyAAaQEQ+wAgBgYgAAGIfQAQgAAEIAABCEAcIEAAAhCA2AcAMTAAAQhA7AOAGBiAAAQg9gFADAxAAAIQ+wAgAAEIQAACEIA4QM8LEIAAxD5IDMj9fh9Xq9Vsi2WxWIyn06nY80ZZqNmu2gEp9bz2AUCKAjLnsHxe6/V6vF6vAAEIQCoDJNs+AEijC7Xve4AABCCVAZJtHwDEwAAEIACxDwBiYAACC4AABCAAAQhAAAIQgAAEIAABCEAAAhCAAAQgALEPAGJgAAIQgNgHAAEIQAACEIAABCAGpqbFVurfL9siB4h9ABADAxCAAMQ+AIiBAQhAAGIfAMTAAAQgALEPAAIQgAAEIAABCEAMDEAAAhD7ACAGBiAAAYh9ABADAxCAAMQ+AIiBAQhAAGIfAAQgAAEIQAACEIBk+JRJqYUaZeHLPgAIQAACEIAABCAAMTAAAQhA7AOAGBiAAET2AUAMDEAAIvsAIAYGIAABiH0AEIAABCAAAQhAAGJgAAIQgNgHAHn/AS6Xy9mHZbPZjJfLBSAAkX1QdB8A5JsHeD6fx+12O+uwHI/H8fF4AAQgsg+K7gOAvOEAb7fb8z8h57imvzReHZaSAxNl8WYDrvbnqP1TJvYBQEJ9GwogAAGIfQAQgAAEIAABCEAcIEAAAhCA2H8AMTAAAQhA7AOAGBiAAAQg9gFADAxAAAIQ+wAgDhAgAAEIQADiAAECEIAAxP4DiIEBCEAAYh8A5A9X13XPf7b1a3qOTJ8ycT/3+x+A2AcA+dLAZLsA4n4AcQHEwADE/dzPPgAIQADifu4HEIAABCDu534AAQhAAGJhuR9A7AOAuADifgBxAcTAAMT93A8gAAEIQNzP/QACEIAAxP3cDyAACQHIMAzPw3H9fE3n8s4XuNSnKUototr//Ur9HrXPgX3wvX2QDhDNs9gAApAWAFEDO8cRAAQgAAGIAAIQgAAEIAKIAAIQgAggAghAACKACCAAAQhABBCAAAQgABFABBCAAEQAEUAAAhABxCJve0FHAclCbfNc3K/+uQIIQAACEIAABCAAAQhAAOJ+APHCAcTvBhD3AwhAAAIQ82zhA0QAAQhAAAIQgAAEIAABCEAA4oUDCEAA4n4A8cIBxO9mngECEIAABCDm2cIHSOzBB0PsT5RYHGBo4WpyxwIEIAABCEAAAhCAAAQgAAEIQAACEIAABCAAAQhAAAIQgAAEIAABCEAAAhCAAAQgAAEIQAACEIAABCAAAQhAAAIQgAAEIAABCEAAAhCAhAbEYqtrsdW+OPwBE/t3i/KHE0AAAhCAAAQgAAEIQAACEIAABCAAsYgAAhCAAAQgAAGI3w0gAAEIQAACEIAABCAAAQhAAAIQgAAEIAABCEAAAhCLCCAAAQhAAAIQgPjdAJIPEDC0uSizvejmtM3z88kTgAAEIAABCEAA4sUECEAAAhCAAAQgAAEIQAACEIAAxJwCBCAAAQhAAAIQgAAEIAABCEAAAhAvJkAAAhCAAAQgAAGIOQUIQAACEICYU4AApDZAsg20Rd7mOZeav2wLEOgAAQhAAAIQgAAEIAABCEAAAhCAAAQgAAGIeQEIQABiIQAEIAABCEAAAhCAAAQgAAEIQAACEIAABCAAAQhAAAIQgAAEIOYFIAABiIUAEIAAJNaC8SK1OS9gsFAzzhVAAAIQgADEXAEEIADxogMEIAABCEAAAhCAAAQgAAEIQAACEIAABCAAAYi5AghAAAIQgJgrgAAEIF50gAAEIAABCEAAAhCAAAQgAAEIQABSPyDZPhXi0y1tgmQBOhdzDxCAeJEAAhBzDxCAAAQgAAEIQAACEIAABCAAAQhAAAIQgADEDwcQLxJAnIu5BwhAvEgAAYi5BwhAAAIQgAAEIAABCEAAAhCAAAQgAAEIQADS8otpkcf+NEXtn/YApvmLPi8AAYgXGCAAAQhAAOIFBghAvJcAAQhAAAIQ8wcQgBhUgADE/AEEIAbVCwwQgAAEIADxAgMEIN5LgAAEIAABiPcSIAABCEAAYv4AAhCDChCAmD+AGNQWgIuysGp/0Us9R5Tfwx+ecc4ZIAABiIXgOQACEIBYWAABCEAAAhCAAAQgAAEIQAACEL8HQAACEIAAxOIFiHkBCEAAYiF4DoAABCAAAQhAAAIQgAAEIAABCEAAAhCA+D0AAhAwtLCIsn2qwadqnHPGT8YABCAAAYhzBghAAOIFdi7OGSAAAYgXyWIDCEAEEIBYbAAxfwABCEAAAhDnDBCAAAQgAHHOAAEIQLzAFptzBghAAOJFstgAYu4FEIBYbAAxfwDJB0jtcNW+EHy6xbn4RE78P2QBAhCL0rkABCAAAQhALECAAAQgAAEIQJwLQJwLQABiUToXgAAEIACxKJ0LQAACEIBYlM4FIAABCEAAAhCAAAQgAAGIRelcLEqAAAQgFqVzAQhAAGJBx3peUM9zLlEWajZYWwQJIAABiMUBEHMAEIAAxLkAxBwABCAWpXMBCEAAAhDPCxCAAAQgAAEIQAACEIAABCAAsTgAYg4AAhCAOBeAmAOAAMSidC4AAQhAAOJ5AQIQgAAkNiC1P0e2c4my2KKcX7Y/iLyXAAEIQAACEIAABCAGFSAAAQhAAAIQgAAEIAABCEAAAhCAAAQgAAEIQAACEIAABCAAAQhAAAIQgwoQgAAEIAABCEAAAhCAAAQgAAEIQLyXtQFS+0BnezFrXwig8R5FmtMm/5gHCEAAAhCAAAQgAAEIQLxHAAGIwQcIQLxHAAGIwQcIQAACEIAABCAAAQhAAAIQgAAEIAABCEAAAhCAeI8AAhCDDxCAeI8AAhCDDxCAAAQgAAFIA59MsGDcL+MnWfwhARCAAMSCBghAAAIQgFioAAEIQAACEIC4H0CcM0AAAhD3AwhAAAIQgLifxQYQgAAEIBY0QAACEIAAxEIFCEAAAhCAAMT9AOKcAQIQgLgfQAACEJ+SyPDvF2WuonzixR8mPmUCEIAABCDeN4AABCAAAQhAAAIQAw0QgAAEIAABCEAAAhCA
AAQgAAEIQAACEIAABCAA8b4BBCAAAQhAAAIQgBhogAAEIAABCEAAAhCAAAQgbbzo2QY/CkjZFrmrzcsfdgABiN/N5QIIQAACEIC4AAIQgAAEIC6AAAQgAAGICyAAAQhAAOJyAQQgAPG7uVwAAQhAAAIQF0AAAhCAAMQFEIAABCAAcQEEILUBojYHMMonHTxH7POL8ocOQGRxOD9zABCAyOKweD0HQAAigFi8ngMgAAEIQCwOz2EOAAIQWRwAMQcAAYgsDudnDgACEFkcFq/nAAhABBCL13MABCBNA9L3/bjf712/XMMwWBwAAQhAAPK3pmXpUwq/X4fDIdXC8mmKNs85yvOCASAAAQhAAGJOAeICCEAAAhCAAAQgAAGI8wMIQADixbQAAQIQgAAEIAABCEAAAhCAAAQg5hQgLoAABCAAAQhAAAIQgAAEIBUAstvtnku19avrOoAABCDODyBzAlL7Qi31vLV/wqL2C8DOOeO5AAQgAAGIRelcAAIQgFhsAAEIQAACEIAABCAAAQhAAAIQgAAEIAABCECcs3MBCEAAAhCL0rkABCAAsdgAAhCAAAQgAAEIQAACEIAABCAAAUg+QO73+7harWYbgMViMZ5Op+rBzPYtsVKLPMoC9IdEXb8bQGZaqHPi8Xmt1+vxer0CBCAAAQhAMvwvnXdffd8DBCAAAQhAAAIQgAAEIAABCEAAAhCAAAQgAAEIQAACEIAABCAAAQhAAAIQgAAEIAABCEAAAhCAAAQgAAEIQAACEIAABCCzDXSpFynb4s22iKLAFeUPE4AABCAAAQhAAAIQgAAEIAABCEAAAhCAAAQgAAEIQAACEIAABCAAAQhAAAIQgAAEIAABCEAAAhCAAAQgAAEIQAACEIAABCAAAQhAAAIQgADECwe4thdWiwD7lAlAAAIQgAAEIAABCEAAAhCAAAQgAAEIQAACEIAABCAAAQhAAAIQgAAEIAABCEAAAhCAAAQgAAEIQAACEIC8tlCXy+XsL+VmsxkvlwtAAAIQgACkZUDO5/O43W5nxeN4PI6PxwMgAAEIQADSMiBTt9vt+b+U5rim//J4FY8WAKn9ftkWbxQYonwyRgkAqTmAAAQgAAEIQAACEIAABCAAAQhAAAIQgAAEIAABCEAAAhCAAAQgAAEIQAACEIAABCAAAQhAAAIQgAAEIAABCEAAAhCAAAQgAAEIQADyh6vruuc/2/o1PUfNYPpESZvA+RQHCAHiAghAAAIQgAAEIAABCEAAAhCAAAQgAHEuAAEIQABiUQIEIAABCEAAAhCAAAMgAAEIQAACEIAABCAAAQhAAAIQgADEuQAEIAABiEUJkFiADMPwXJaun6/pXCQJIJIkgEiSACJJEkAkSQCRJAFEkgQQSRJAHIEkCSCSJIBIkgAiSQKIJEkAkSQBRJIEEEkSQCRJAJEkCSCSJIBIkgAiSQKIJAkgkiQBRJIEEEkSQCRJAJEkAUSSJIBIkgAiSQKIJAkgkiSASJIEEEkSQCRJNfUD/Ye7UNZHtNkAAAAASUVORK5CYII=" +} \ No newline at end of file diff --git a/gf/LightDiffusionFlow/install.py b/gf/LightDiffusionFlow/install.py new file mode 100644 index 0000000000000000000000000000000000000000..21785edea8ace4971f0d9e6ef2eee144eacf90ef --- /dev/null +++ b/gf/LightDiffusionFlow/install.py @@ -0,0 +1,6 @@ +import launch +import os,sys + + +# if not launch.is_installed("js2py"): +# launch.run_pip("install js2py", "js2py") \ No newline at end of file diff --git a/gf/LightDiffusionFlow/javascript/state.app.js b/gf/LightDiffusionFlow/javascript/state.app.js new file mode 100644 index 0000000000000000000000000000000000000000..774d5dbbe8c343fd8ffe495be1da6c262e33a34c --- /dev/null +++ b/gf/LightDiffusionFlow/javascript/state.app.js @@ -0,0 +1,4 @@ + +document.addEventListener('DOMContentLoaded', function() { + onUiLoaded(state.core.init); +}); diff --git a/gf/LightDiffusionFlow/javascript/state.constants.js b/gf/LightDiffusionFlow/javascript/state.constants.js new file mode 100644 index 0000000000000000000000000000000000000000..274b1199cf369a75d1bdad3d985454a86fd3c2f3 --- /dev/null +++ b/gf/LightDiffusionFlow/javascript/state.constants.js @@ -0,0 +1,6 @@ +window.state = window.state || {}; +state = window.state; + +state.constants = { + LS_PREFIX: 'state-' +}; \ No newline at end of file diff --git a/gf/LightDiffusionFlow/javascript/state.core.js b/gf/LightDiffusionFlow/javascript/state.core.js new file mode 100644 index 0000000000000000000000000000000000000000..5d956d1cadf1a4016dd6f249176fee0a9d002e67 --- /dev/null +++ b/gf/LightDiffusionFlow/javascript/state.core.js @@ -0,0 +1,1006 @@ +window.state = window.state || {}; +state = window.state; + + +state.core = (function () { + + const TABS = ['txt2img', 'img2img']; + + // settingId, element + const ELEMENTS = { + 'prompt': 'prompt', + 'negative_prompt': 'neg_prompt', + 'sampling_steps': 'steps', + 'restore_faces': 'restore_faces', + 'tiling': 'tiling', + 'hires_fix': 'enable_hr', + 'hires_steps': 'hires_steps', + 'hires_scale': 'hr_scale', + 
'hires_resize_x': 'hr_resize_x', + 'hires_resize_y': 'hr_resize_y', + 'hires_denoising_strength': 'denoising_strength', + 'width': 'width', + 'height': 'height', + 'batch_count': 'batch_count', + 'batch_size': 'batch_size', + 'cfg_scale': 'cfg_scale', + 'denoising_strength': 'denoising_strength', + 'seed': 'seed', + 'sampling': 'sampling', + 'switch_at': 'switch_at' + }; + + const ACCORDION = { + "hires_fix": "hr", + "refiner": "enable" + } + + const ELEMENTS_WITHOUT_PREFIX = { + 'resize_mode': 'resize_mode', + 'setting_inpainting_mask_weight': 'setting_inpainting_mask_weight', + 'setting_CLIP_stop_at_last_layers': 'setting_CLIP_stop_at_last_layers', + 'setting_eta_noise_seed_delta': 'setting_eta_noise_seed_delta', + 'img2img_mask_blur': 'img2img_mask_blur', + 'img2img_mask_mode': 'img2img_mask_mode', + 'img2img_inpainting_fill': 'img2img_inpainting_fill', + 'img2img_inpaint_full_res_padding': 'img2img_inpaint_full_res_padding', + 'img2img_inpaint_full_res': 'img2img_inpaint_full_res', + 'img2img_mask_alpha': 'img2img_mask_alpha' + //'generation_info_txt2img': 'generation_info_txt2img' // 可能因为是visible=false 所以触发不了onchange事件? + }; + + const SELECTS = { + 'sampling': 'sampling', + 'hires_upscaler': 'hr_upscaler', + 'script': '#script_list', + 'checkpoint': 'checkpoint', + }; + + const SELECTS_WITHOUT_PREFIX = { + 'setting_sd_model_checkpoint': 'setting_sd_model_checkpoint', + 'setting_sd_vae': 'setting_sd_vae' + }; + + const MULTI_SELECTS = { + 'styles': 'styles' + }; + + const TOGGLE_BUTTONS = { + 'extra_networks': 'extra_networks', + }; + + var IMAGES_WITHOUT_PREFIX = { + }; + + const ELEMENTS_ALWAYS_SAVE = { + 'setting_sd_model_checkpoint': 'setting_sd_model_checkpoint', + }; + + const Image_extensions = [".png", ".jpg", ".jpeg"] + + let store = null; + let timer = null; + let inited = false + let sd_versions = "0.0.0" + let waiting_second_apply = false + let timeout_id = undefined + let img_elem_keys=[]; + let ext_list=[]; + let flow_save_mode = "Core" + + + function hasSetting(id, tab) { + return true // 需要默认保存全部选项 不需要判断 + const suffix = tab ? 
`_${tab}` : ''; + return this[`state${suffix}`] && this[`state${suffix}`].indexOf(id) > -1; + } + + function fn_timer(){ + + // if(inited){ + fetch('/lightdiffusionflow/local/need_preload') + .then(response => response.json()) + .then(data => { + console.log(`fn_timer`) + if (data != ""){ + //state.core.actions.handleLightDiffusionFlow([{"name":data}]); + const btn1 = gradioApp().querySelector(`button#set_lightdiffusionflow_file`); + state.utils.triggerMouseEvent(btn1); + setTimeout(() => { + const btn2 = gradioApp().querySelector(`button#preload_button`); + state.utils.triggerMouseEvent(btn2); + }, 1000); + } + }).catch(function(e) { + clearInterval(timer) + console.log("Oops, error"); + }); + // } + // else{ + + // fetch('/lightdiffusionflow/local/get_imgs_elem_key') //初始化部分图片组件id, 后续设置onchanged事件 + // .then(response => response.json()) + // .then(data => { + // if(data != ""){ + + // img_elem_keys = data.split(",") + // img_elem_keys.forEach(key => { + // IMAGES_WITHOUT_PREFIX[key] = key + // }); + + // // 等上面的组件ID同步过来后 再加载其他配置 + // fetch('/lightdiffusionflow/local/config.json?_=' + (+new Date())) + // .then(response => response.json()) + // .then(config => { + // try { + // store = new state.Store(); + // store.clearAll(); + // load(config); + // inited = true + // } catch (error) { + // console.error('[state]: Error:', error); + // } + // }) + // .catch(error => console.error('[state]: Error getting JSON file:', error)); + // } + // }); + + + // } + + } + + function get_js_local_data(){ + + if(state.utils.getCurSeed('txt2img') != undefined){ + store.set(`txt2img_seed`,state.utils.getCurSeed('txt2img')) + } + if(state.utils.getCurSeed('img2img') != undefined){ + store.set(`img2img_seed`,state.utils.getCurSeed('img2img')) + } + + stored_config = store.getAll() + + for (let key in stored_config){ + if(key.indexOf("allow-preview") !== -1 && key.indexOf("ext-control-net") !== -1) + { + console.log("allow-preview改成false") + stored_config[key] = "false" + } + } + + return stored_config + } + + function get_imgs_elem_key(){ + + fetch('/lightdiffusionflow/local/get_imgs_elem_key') //初始化部分图片组件id, 后续设置onchanged事件 + .then(response => response.json()) + .then(data => { + console.log(data) + if(data == ''){ + setTimeout(() => { + get_imgs_elem_key() + }, 500); + } + else{ + img_elem_keys = data.split(",") + img_elem_keys.forEach(key => { + IMAGES_WITHOUT_PREFIX[key] = key + }); + + fetch('/lightdiffusionflow/local/get_ext_list') + .then(response => response.json()) + .then(data => { + ext_list = data.split(",") + }); + + // 等上面的组件ID同步过来后 再加载其他配置 + fetch('/lightdiffusionflow/local/config.json?_=' + (+new Date())) + .then(response => response.json()) + .then(config => { + try { + + try{ + flow_save_mode = config['lightdiffusionflow-mode'] + }catch (error) { + flow_save_mode = "Core" + } + + store = new state.Store(); + store.clearAll(); + load(config); + timer = window.setInterval(fn_timer,1000); // 初始化页面完成后再启动timer读取文件 + } catch (error) { + console.error('[state]: Error:', error); + } + }) + .catch(error => console.error('[state]: Error getting JSON file:', error)); + } + }); + + } + + function init() { + + //console.log(window.localization) + fetch('/lightdiffusionflow/local/refresh_ui') // 刷新页面触发python重置图片数据 + .then(response => response.json()) + .then(data => { + sd_versions = data + }); + + get_imgs_elem_key() + + } + + // function forEachImageElement(list, action) { + // for (const [settingId, element] of Object.entries(list)) { + // TABS.forEach(tab => { + // //if (config.hasSetting(settingId, 
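+    // The ELEMENTS/SELECTS/... maps above pair a setting key with a Gradio elem_id
+    // suffix; the load() handlers below combine each suffix with every entry of TABS to
+    // address the per-tab components. A minimal sketch of that expansion, assuming the
+    // stock WebUI ids such as "txt2img_steps" (hypothetical helper, illustration only):
+    function exampleExpandElemIds() {
+        const ids = [];
+        for (const element of Object.values(ELEMENTS)) {
+            TABS.forEach(tab => ids.push(`${tab}_${element}`)); // e.g. "txt2img_steps", "img2img_cfg_scale"
+        }
+        return ids;
+    }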
tab)) { + // action(element, tab); + // //} + // }); + // } + // } + + function forEachElement_WithoutTabs(list, action) { + for (const [settingId, element] of Object.entries(list)) { + action(element); + } + } + + function forEachElement(list, config, action) { + for (const [settingId, element] of Object.entries(list)) { + TABS.forEach(tab => { + if (config.hasSetting(settingId, tab)) { + action(element, tab); + } + }); + } + } + + function load(config, addEvtLsner=true) { + config.hasSetting = hasSetting + + //loadUI(); // 往页面上添加按钮 + + for (let tab of TABS) + { + //console.log(`${tab}_script_container start`) + let script_container = getElement(`${tab}_script_container`) + state.utils.onFrameContentChange(script_container, function (el) { + clearTimeout(timeout_id); + timeout_id = setTimeout(() => { + if(waiting_second_apply) + { + waiting_second_apply = false + actions.applyState(false); + setTimeout(() => { + actions.preset_output_log("finished") + }, 3000); + } + }, 3000); + }); + } + + forEachElement(ACCORDION, config, (element, tab) => { + handleSavedAccordion(`${tab}_${element}`, addEvtLsner); + }); + + forEachElement_WithoutTabs(SELECTS_WITHOUT_PREFIX, (element) => { + handleSavedSelects(element, addEvtLsner); + }); + + forEachElement(ELEMENTS, config, (element, tab) => { + handleSavedInput(`${tab}_${element}`, addEvtLsner); + }); + + forEachElement_WithoutTabs(ELEMENTS_WITHOUT_PREFIX, (element) => { + handleSavedInput(element, addEvtLsner); + }); + + forEachElement(SELECTS, config, (element, tab) => { + handleSavedSelects(`${tab}_${element}`, addEvtLsner); + }); + + forEachElement(MULTI_SELECTS, config, (element, tab) => { + handleSavedMultiSelects(`${tab}_${element}`, addEvtLsner); + }); + + forEachElement(TOGGLE_BUTTONS, config, (element, tab) => { + handleToggleButton(`${tab}_${element}`, addEvtLsner); + }); + + forEachElement_WithoutTabs(IMAGES_WITHOUT_PREFIX, (element) => { + handleSavedImage(`${element}`, addEvtLsner); + }); + + handleExtensions(config, addEvtLsner); + //handleSettingsPage(); + + restoreTabs(config, addEvtLsner); // 恢复到最后点击的tab页面 + + forEachElement_WithoutTabs(ELEMENTS_ALWAYS_SAVE, (element) => { + state.utils.forceSaveSelect(getElement(element), element, store); //每次无论有没有修改都需要导出的选项 + }); + } + + function createHeaderButton(title, text, className, style, action) { + + const button = state.utils.html.create('button', { + title: title, + innerHTML: text, + className: className, + }, style); + + if (action) { + button.addEventListener('click', action); + } + + return button; + } + + // function createHeaderFileInput(title, text, className) { + + // let inputId = 'state-import-file-inline'; + + // let importBtn = createHeaderButton(title,text, className, { + // display: 'none' + // }, () => { + // actions.importState(inputId); + // }); + + // let label = state.utils.html.create('label', {}, { cursor: 'pointer' }); + // label.appendChild(state.utils.html.create('input', { + // type: 'file', + // id: inputId, + // accept: 'application/json', + // }, { + // display: 'none' + // })); + // label.appendChild(document.createTextNode(text)); + // label.addEventListener('change', () => { + // importBtn.dispatchEvent(new Event('click')); + // }); + + // let button = createHeaderButton(title, '', className, {}); + // button.appendChild(label); + + // return { + // hiddenButton: importBtn, + // button: button + // }; + // } + + // function loadUI() { + // let quickSettings = gradioApp().getElementById("quicksettings"); + // let className = 
quickSettings.querySelector('button').className; + // quickSettings.appendChild(createHeaderButton('State: Reset', "*️⃣", className, {}, actions.resetAll)); + // quickSettings.appendChild(createHeaderButton('State: Export',"📤", className, {}, actions.exportState)); + // quickSettings.appendChild(createHeaderButton('State: test',"📤", className, {}, actions.test)); + // let fileInput = createHeaderFileInput('State: Import',"📥", className); + // quickSettings.appendChild(fileInput.hiddenButton); + // quickSettings.appendChild(fileInput.button); + // } + + + function restoreTabs(config, addEvtLsner=true) { + + if (! config.hasSetting('tabs')) { + return; + } + + const tabs = gradioApp().querySelectorAll('#tabs > div:first-child button'); + const value = store.get('tab'); + if (value) { + for (var i = 0; i < tabs.length; i++) { + if (tabs[i].textContent === state.utils.getTranslation(value)) { + state.utils.triggerEvent(tabs[i], 'click'); + break; + } + } + } + + // Use this when onUiTabChange is fixed + // onUiTabChange(function () { + // store.set('tab', gradioApp().querySelector('#tabs .tab-nav button.selected').textContent); + // }); + if(addEvtLsner){ + bindTabClickEvents(); + } + } + + function bindTabClickEvents() { + Array.from(gradioApp().querySelectorAll('#tabs .tab-nav button')).forEach(tab => { + tab.removeEventListener('click', storeTab); + tab.addEventListener('click', storeTab); + }); + } + + function storeTab() { + let tab_name = gradioApp().querySelector('#tabs .tab-nav button.selected').textContent; + store.set('tab', state.utils.reverseTranslation(tab_name)[0]); + bindTabClickEvents(); // dirty hack here... + } + + function getElement(id) { + for (let i = 0; i < TABS.length; i++) { + if (id.startsWith(`${TABS[i]}_#`)) { + // handle elements with same ids in different tabs... + return gradioApp().querySelector('#tab_' + id.replace(`${TABS[i]}_#`, `${TABS[i]} #`)); + } + } + return gradioApp().getElementById(id); + } + + function handleSavedInput(id, addEvtLsner=true) { + + const elements = gradioApp().querySelectorAll(`#${id} textarea, #${id} input, #${id} img`); + const events = ['change', 'input']; + + if (! elements || ! elements.length) { + return; + } + + let forEach = function (action) { + events.forEach(function(event) { + elements.forEach(function (element) { + action.call(element, event); + }); + }); + }; + + if(addEvtLsner){ + forEach(function (event) { + this.addEventListener(event, function () { + let value = this.value; + if (this.type && this.type === 'checkbox') { + value = this.checked; + } + else if (this.className === 'img') { + value = this.checked; + } + store.set(id, value); + }); + }); + + TABS.forEach(tab => { + const seedInput = gradioApp().querySelector(`#${tab}_seed input`); + ['random_seed', 'reuse_seed'].forEach(id => { + const btn = gradioApp().querySelector(`#${tab}_${id}`); + btn.addEventListener('click', () => { + setTimeout(() => { + state.utils.triggerEvent(seedInput, 'change'); + }, 100); + }); + }); + }); + } + + let value = store.get(id); + if (! 
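+    // getElement above special-cases ids written with a "#" (e.g. SELECTS maps 'script'
+    // to '#script_list'), because that elem_id appears in both tabs; the id is rewritten
+    // into a query scoped to the tab's own panel. A minimal illustration of that rewrite
+    // (hypothetical helper, not part of the extension):
+    function exampleScopedSelector(tab, id) {
+        // "txt2img" + "_#script_list"  ->  "#tab_txt2img #script_list"
+        return '#tab_' + `${tab}_${id}`.replace(`${tab}_#`, `${tab} #`);
+    }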
value) { + return; + } + forEach(function (event) { + state.utils.setValue(this, value, event); + }); + } + + function handleSavedSelects(id, addEvtLsner=true) { + state.utils.handleSelect(getElement(id), id, store, force=false, addEvtLsner); + } + + function handleSavedAccordion(id, addEvtLsner=true) { + state.utils.handleAccordion(getElement(id), id, store, addEvtLsner); + } + + function handleSavedMultiSelects(id, addEvtLsner=true) { + const select = gradioApp().getElementById(`${id}`); + state.utils.handleMultipleSelect(select, id, store, addEvtLsner); + } + + function handleSavedImage(id, addEvtLsner=true) { + state.utils.handleImage(getElement(id), id, store, addEvtLsner); // 图片有修改就发回到python保存 + } + + function handleToggleButton(id, addEvtLsner=true) { + const btn = gradioApp().querySelector(`button#${id}`); + if (! btn) { return; } + // legionfu + if (store.get(id) === 'true') { + state.utils.triggerMouseEvent(btn); + } + if(addEvtLsner){ + btn.addEventListener('click', function () { + store.set(id, Array.from(this.classList).indexOf('secondary-down') === -1); + }); + } + } + + function handleExtensions(config, addEvtLsner=true) { + // if (config['state_extensions']) { + // config['state_extensions'].forEach(function (ext) { + // if (ext in state.extensions) { + // state.extensions[ext].init(); + // } + // }); + // } + for (const [name, obj] of Object.entries(state.extensions)) { + obj.init(flow_save_mode == "Core", addEvtLsner); + } + + } + + // function handleSettingsPage() { // settings state 界面 绑定按钮事件等操作 + + // const page = gradioApp().querySelector('#settings_state'); + // state.utils.html.setStyle(page.querySelectorAll('fieldset'), { + // 'marginTop': '20px', + // 'marginBottom': '10px' + // }); + + // let buttonsContainer = gradioApp().querySelector('#settings_state_buttons'); + // if (buttonsContainer) { + // buttonsContainer.parentNode.removeChild(buttonsContainer); + // } + // buttonsContainer = document.createElement('div'); + // buttonsContainer.id = 'settings_state_buttons'; + + // let setCheckboxes = function (value, checkFunc) { + // checkFunc = checkFunc || function () { return true; }; + // Array.from(page.querySelectorAll('input[type="checkbox"]')).forEach(function (el) { + // if (checkFunc(el)) { + // if (el.checked !== value) { + // el.checked = value; + // state.utils.triggerEvent(el, 'change'); + // } + // } else if (el.checked === value) { + // el.checked = !value; + // state.utils.triggerEvent(el, 'change'); + // } + // }); + // }; + // buttonsContainer.appendChild(state.utils.html.createButton('Select All', function () { + // setCheckboxes(true); + // })); + // buttonsContainer.appendChild(state.utils.html.createButton('Select All Except Seeds', function () { + // setCheckboxes(true, function (el) { + // return el.nextElementSibling.textContent.indexOf('seed') === -1; + // }); + // })); + // buttonsContainer.appendChild(state.utils.html.createButton('Unselect All', function () { + // setCheckboxes(false); + // })); + // state.utils.html.setStyle(buttonsContainer, { + // 'marginTop': '20px', + // 'marginBottom': '10px' + // }); + // buttonsContainer.appendChild(state.utils.html.create('hr')); + // buttonsContainer.appendChild(state.utils.html.create('div', + // { innerHTML: 'Actions' }, + // { marginBottom: '10px' })); + // buttonsContainer.appendChild(state.utils.html.createButton('Reset All', actions.resetAll)); + // buttonsContainer.appendChild(state.utils.html.createButton('Export State', actions.exportState)); + // 
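+    // Every value round-trips through localStorage as a string, which is why the toggle
+    // and checkbox handlers above compare against the literal 'true' instead of a boolean.
+    // A minimal illustration (hypothetical, assuming a state.Store instance is passed in):
+    function exampleBooleanRoundTrip(exampleStore) {
+        exampleStore.set('hires_fix', true);              // persisted as the string "true"
+        return exampleStore.get('hires_fix') === 'true';  // so reads must compare with 'true'
+    }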
buttonsContainer.appendChild(state.utils.html.createButton('Import State', actions.importState)); + // buttonsContainer.appendChild(state.utils.html.create('input', { + // id: 'state-import-file', type: 'file', accept: 'application/json' + // })); + // page.appendChild(buttonsContainer); + // } + + let actions = { + // resetAll: function () { + // let confirmed = confirm('Reset all state values?'); + // if (confirmed) { + // store.clearAll(); + // alert('All state values deleted!'); + // } + // }, + applyState: async function (addEvtLsner=true) { + console.log("applyState") + await fetch('/lightdiffusionflow/local/config.json?_=' + (+new Date())) + .then(response => response.json()) + .then(config => { + try { + config.hasSetting = hasSetting + + try{ + flow_save_mode = config['lightdiffusionflow-mode'] + }catch (error) { + flow_save_mode = "Core" + } + //console.log(config) + //restoreTabs(config); // 恢复到最后点击的tab页面 + load(config, addEvtLsner); + // forEachElement_WithoutTabs(SELECTS_WITHOUT_PREFIX, (element) => { + // handleSavedSelects(element); + // }); + + // forEachElement(ELEMENTS, config, (element, tab) => { + // handleSavedInput(`${tab}_${element}`); + // }); + + // forEachElement_WithoutTabs(ELEMENTS_WITHOUT_PREFIX, (element) => { + // handleSavedInput(element); + // }); + + // forEachElement(SELECTS, config, (element, tab) => { + // handleSavedSelects(`${tab}_${element}`); + // }); + + // forEachElement(MULTI_SELECTS, config, (element, tab) => { + // handleSavedMultiSelects(`${tab}_${element}`); + // }); + + // forEachElement(TOGGLE_BUTTONS, config, (element, tab) => { + // handleToggleButton(`${tab}_${element}`); + // }); + + // forEachElement_WithoutTabs(IMAGES_WITHOUT_PREFIX, (element) => { + // handleSavedImage(element); + // }); + + // handleExtensions(config); + + + //handleSettingsPage(); + } catch (error) { + console.error('[state]: Error:', error); + } + }) + .catch(error => console.error('[state]: Error getting JSON file:', error)); + }, + + exportState: function () { + + // if(state.utils.getCurSeed('txt2img') != undefined){ + // store.set(`txt2img_seed`,state.utils.getCurSeed('txt2img')) + // } + // if(state.utils.getCurSeed('img2img') != undefined){ + // store.set(`img2img_seed`,state.utils.getCurSeed('img2img')) + // } + let stored_config = get_js_local_data() + + fetch('/lightdiffusionflow/local/lightdiffusionflow_config?data2export=true') + .then(response => response.json()) + .then(config => { + + config = JSON.parse(config) + //stored_config = store.getAll() + + let data = { + method: 'POST', + headers: { 'Content-Type': 'application/json' }, + body: JSON.stringify({ + "config_data":stored_config + }) + } + fetch(`/lightdiffusionflow/local/parse_lora_info`, data) + .then(response => response.json()) + .then(response_lora_info => { + + for (let key in response_lora_info){ + stored_config[key] = response_lora_info[key] + } + + for (let key in config){ + if(config[key] != ""){ + stored_config[key] = config[key] + } + } + + // for (let key in stored_config){ + // if(key.indexOf("allow-preview") !== -1 && key.indexOf("ext-control-net") !== -1) + // { + // console.log("allow-preview改成false") + // stored_config[key] = "false" + // } + // } + + var checkTime = function (i) { + if (i < 10) { i = "0" + i; } + return i; + } + let nowdate = new Date(); + let year = String(nowdate.getFullYear()) + let month = String(checkTime(nowdate.getMonth() + 1)) + let day = String(checkTime(nowdate.getDate())) + let h = String(checkTime(nowdate.getHours())) + let m = 
String(checkTime(nowdate.getMinutes())) + let s = String(checkTime(nowdate.getSeconds())) + let time_str = year+month+day+h+m+s + + filename = 'flow-'+time_str+'.flow' + filename = prompt("Export workflow as:", filename); + if (!filename) return; + if (!filename.toLowerCase().endsWith(".flow")) { + filename += ".flow"; + } + if(filename != ".flow"){ + // const handle = window.showDirectoryPicker(); + // console.log(handle) + + state.utils.saveFile(filename, stored_config); + + fetch('https://api.lightflow.ai/openapi/access?action=export') + .then(response => response.json()) + .then(config => { + console.log(config) + }).catch(function(e) { + console.log("Oops, export callback error!"); + }); + + } + + }).catch(error => console.error('[state]: Error getting Flow file:', error)); + + }).catch(error => console.error('[state]: Error getting Flow file:', error)); + + //config = JSON.stringify(store.getAll(), null, 4); + //fetch(`/lightdiffusionflow/local/ExportLightDiffusionFlow?config=${config}`) + }, + saveFlowToLocal: function saveFlowToLocal(){ + var checkTime = function (i) { + if (i < 10) { i = "0" + i; } + return i; + } + let nowdate = new Date(); + let year = String(nowdate.getFullYear()) + let month = String(checkTime(nowdate.getMonth() + 1)) + let day = String(checkTime(nowdate.getDate())) + let h = String(checkTime(nowdate.getHours())) + let m = String(checkTime(nowdate.getMinutes())) + let s = String(checkTime(nowdate.getSeconds())) + let time_str = year+month+day+h+m+s + + filename = 'flow-'+time_str+'.flow' + filename = prompt("Save workflow as:", filename); + if (!filename) return; + if (!filename.toLowerCase().endsWith(".flow")) { + filename += ".flow"; + } + if(filename != ".flow"){ + + let data = { + method: 'POST', + headers: { 'Content-Type': 'application/json' }, + body: JSON.stringify({ + "file_path":filename + }) + } + fetch(`/lightdiffusionflow/local/file_exist`, data) + .then(response => response.json()) + .then(data => { + if(!data || (data && confirm("Overwrite the existing file with the same name?"))) + { + let stored_config = get_js_local_data() + let flow_data = { + method: 'POST', + headers: {'Content-Type': 'application/json'}, + body: JSON.stringify({ + "file_name":filename, + "file_data":stored_config, + "overwrite":true + }) + } + fetch("/lightdiffusionflow/local/save_flow_to_local",flow_data) + + fetch('https://api.lightflow.ai/openapi/access?action=save') + .then(response => response.json()) + .then(config => { + console.log(config) + }).catch(function(e) { + console.log("Oops, export callback error!"); + }); + } + }); + } + }, + + handleLightDiffusionFlow: function (fileInput){ + actions.preset_output_log("start") + //actions.output_log("
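+    // Both exportState and saveFlowToLocal above build the same default file name,
+    // "flow-YYYYMMDDhhmmss.flow", with duplicated date code; the convention could be
+    // factored into one helper. A minimal sketch (hypothetical name, same behaviour):
+    function exampleDefaultFlowName(now = new Date()) {
+        const pad = n => String(n).padStart(2, '0');
+        return 'flow-' + now.getFullYear() + pad(now.getMonth() + 1) + pad(now.getDate())
+            + pad(now.getHours()) + pad(now.getMinutes()) + pad(now.getSeconds()) + '.flow';
+    }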
Start parsing settings...") + console.log(fileInput) + let temp_fileInput = undefined + try{temp_fileInput = fileInput[0]} catch(error){} + if ( !temp_fileInput ) {temp_fileInput = fileInput} + if ( !temp_fileInput ) { + //alert('Please select a JSON file!'); + actions.preset_output_log("invalid") + //actions.output_log("Please select a valid lightdiffusionflow or image file!") + return; + } + + let file_name = temp_fileInput.name; + console.log(file_name) + let extension = file_name.substring(file_name.lastIndexOf(".")); + if( Image_extensions.indexOf(extension) != -1 ){ + let data = { + method: 'POST', + headers: { 'Content-Type': 'application/json' }, + body: JSON.stringify({ + "img_path":file_name + }) + } + fetch(`/lightdiffusionflow/local/png_info`, data) + .then(response => response.json()) + .then(data => { + //console.log(data) + actions.importLightDiffusionFlow(data) + }); + } + else{ + // const file = new Blob([fileInput[0].name]); + const file = temp_fileInput.blob; + const reader = new FileReader(); + reader.onload = function (event) { + actions.importLightDiffusionFlow(event.target.result) + }; + try{ reader.readAsText(file); } catch (error) { + console.log("read from python") + if(temp_fileInput.name != ""){ + let data = { + method: 'POST', + headers: { 'Content-Type': 'application/json' }, + body: JSON.stringify({ + "file_path":temp_fileInput.name + }) + } + fetch(`/lightdiffusionflow/local/read_file`, data) + .then(response => response.json()) + .then(data => { + actions.importLightDiffusionFlow(data) + }); + } + + } + } + return fileInput + }, + importLightDiffusionFlow: function (inputData){ + + forEachElement_WithoutTabs(IMAGES_WITHOUT_PREFIX, (image_id) => { + state.utils.clearImage(getElement(image_id)); + }); + + let json_obj = {} + try { json_obj = JSON.parse(inputData) } catch (error) { + actions.preset_output_log("invalid") + //actions.output_log("Please select a valid lightdiffusionflow or image file!") + return; + } + + // 筛选掉默认值参数 + let data = { + method: 'POST', + headers: { 'Content-Type': 'application/json' }, + body: JSON.stringify({ + "config_data":json_obj + }) + } + fetch(`/lightdiffusionflow/local/useless_config_filter`, data) + .then(response => response.json()) + .then(data => { + json_obj = data + console.log(ext_list) + // 缺少的插件 + missing_ext_list = [] + for (let key in json_obj){ + ext_name = key.match(/ext-(\S+?)-(txt2img|img2img)/) + //console.log(key) + if(ext_name != null){ + ext_name = ext_name[1] + //console.log(ext_name) + if(ext_list.indexOf(ext_name) === -1){ + if(missing_ext_list.indexOf(ext_name) === -1){ + missing_ext_list.push(ext_name) + } + } + } + } + + if(missing_ext_list.length > 0){ + actions.preset_output_log("missing_exts","",missing_ext_list.join(';')) + } + + forEachElement_WithoutTabs(IMAGES_WITHOUT_PREFIX, (image_id) => { + json_obj[image_id] = "" + }); + // webui主界面 没有localization相关的兼容问题 所以不用管 + + waiting_second_apply = true + store.clearAll(); + store.load(json_obj); + actions.applyState(); + }); + return; + }, + startImportImage: function (index){ + index = Number(index) + + //console.log(`-------startImportImage--'${index}'---------------`) + if(index+1 < img_elem_keys.length){ + //console.log(`---------${img_elem_keys}---------------`) + //console.log(`---------'${index}'-----'${img_elem_keys.length}'-----------`) + switch_tab_dict = { + "img2img_invisible_img2img_image": "switch_to_img2img()", + "img2img_invisible_img2img_sketch": "switch_to_sketch()", + "img2img_invisible_img2maskimg": "switch_to_inpaint()", + 
"img2img_invisible_inpaint_sketch": "switch_to_inpaint_sketch()", + "img2img_invisible_img_inpaint_base": "state.utils.switch_to_img_inpaint()", + "img2img_invisible_img_inpaint_mask": "state.utils.switch_to_img_inpaint()", + "img2img_invisible_txt2img_controlnet_ControlNet_input_image": "state.utils.switch_to_txt2img_ControlNet(0)", + "img2img_invisible_img2img_controlnet_ControlNet_input_image": "state.utils.switch_to_img2img_ControlNet(0)" + } + + for (let i = 0; i < 10; i++) { + switch_tab_dict[`img2img_invisible_txt2img_controlnet_ControlNet-${i}_input_image`] = `state.utils.switch_to_txt2img_ControlNet(${i})` + switch_tab_dict[`img2img_invisible_img2img_controlnet_ControlNet-${i}_input_image`] = `state.utils.switch_to_img2img_ControlNet(${i})` + } + + state.utils.sleep(300).then(() => { + try{ + key = "img2img_invisible_"+img_elem_keys[index+1] + eval( switch_tab_dict[key] ) // 跳转界面 + const button = gradioApp().getElementById(key); + button.click(); + } catch (error) { + console.warn('[startImportImage]: Error:', error); + if(index+1 < img_elem_keys.length){ + // 图片组件设置出错了,但是需要继续后续的流程 + index = img_elem_keys.length-1 + } + } + }); + } + + switch(index+1 - img_elem_keys.length){ + case 0:// 图片导入完成,开始导入Dropdown + state.utils.sleep(500).then(() => { + try{ + const button = gradioApp().getElementById("lightdiffusionflow_set_dropdowns"); + button.click(); + } catch (error) { + console.warn('[set_dropdowns]: Error:', error); + } + }); + break + // case 1:// 触发了导入Dropdown,现在导入其他参数会卡死,触发导入按钮,等下一轮正式开始导入 + // state.utils.sleep(500).then(() => { + // const button = gradioApp().getElementById("lightdiffusionflow_set_js_params"); + // console.log("lightdiffusionflow_set_js_params") + // button.click(); + // }); + // break + // case 2:// 导入其他参数 + // state.utils.sleep(500).then(() => { + // console.log("导入其他参数") + // actions.applyState(); + // }); + // break + } + }, + preset_output_log: function (preset, key="", value=""){ + fetch(`/lightdiffusionflow/local/preset_output_log?preset=${preset}&key=${key}&value=${value}`).then(() => { + gradioApp().getElementById("img2img_invisible_refresh_log").click(); + }); + }, + output_log: function (msg, msg_style=""){ + fetch(`/lightdiffusionflow/local/output_log?msg=${msg}&style=${msg_style}`).then(() => { + gradioApp().getElementById("img2img_invisible_refresh_log").click(); + }); + }, + output_warning: function (msg, msg_style="color:Orange;"){ + actions.output_log(msg,msg_style) + }, + output_error: function (msg, msg_style="color:Red;"){ + actions.output_log(msg,msg_style) + }, + get_sd_version: function (){ + return sd_versions + } + }; + + return { init, actions }; +}()); diff --git a/gf/LightDiffusionFlow/javascript/state.ext.general.js b/gf/LightDiffusionFlow/javascript/state.ext.general.js new file mode 100644 index 0000000000000000000000000000000000000000..55665c8ead05eefe8c5a8a8bbb6cada375a73eb7 --- /dev/null +++ b/gf/LightDiffusionFlow/javascript/state.ext.general.js @@ -0,0 +1,501 @@ +window.state = window.state || {}; +window.state.extensions = window.state.extensions || {}; +state = window.state; + +function general_ext(tab_name, extension_name, root_container) { + + let container = root_container; + let store = null; + let cnTabs = []; + let root_not_tabs = null; + let cur_tab_name = tab_name; + let ext_name = extension_name + let LS_PREFIX = 'ext-'+ ext_name.replace(" ","-").toLowerCase() + "-" + + function handleToggle(addEvtLsner=true) { + let value = store.get('toggled'); + let toggleBtn = container.querySelector('div.cursor-pointer, 
.label-wrap'); + //for(let toggleBtn of toggleBtns){ + //if(!toggleBtn){continue} + if(toggleBtn && toggleBtn.className.split(' ').pop() != "open"){ + if (value && value === 'true') { + state.utils.triggerEvent(toggleBtn, 'click'); + //load(); + } + + if(addEvtLsner) + { + toggleBtn.addEventListener('click', function () { + let span = this.querySelector('.transition, .icon'); + store.set('toggled', span.style.transform !== 'rotate(90deg)'); + //load(); + }); + } + } + //} + } + + function bindTabEvents(addEvtLsner=true) { + const tabs = container.querySelectorAll('.tabs > div > button'); + if(addEvtLsner) + { + tabs.forEach(tab => { // dirty hack here + tab.removeEventListener('click', onTabClick); + tab.addEventListener('click', onTabClick); + }); + } + return tabs; + } + + function handleTabs(addEvtLsner=true) { + let tabs = bindTabEvents(addEvtLsner); + let value = store.get('tab'); + if (value) { + for (var i = 0; i < tabs.length; i++) { + let translations = state.utils.reverseTranslation(tabs[i].textContent) + if (value in translations) { + //if (tabs[i].textContent === value) { + state.utils.triggerEvent(tabs[i], 'click'); + break; + } + } + } + } + + function onTabClick() { + store.set('tab', state.utils.reverseTranslation(this.textContent)[0]); + bindTabEvents(); + } + + function handleCheckbox(checkbox, store, addEvtLsner=true) { + let label = checkbox.nextElementSibling; + let translations = state.utils.reverseTranslation(label.textContent) + for (var text of translations){ + var id = state.utils.txtToId(text); + var value = store.get(id); + if (value) {break} + } + if (value) { + state.utils.setValue(checkbox, value, 'change'); + } + + if(addEvtLsner){ + checkbox.addEventListener('change', function () { + let label = checkbox.nextElementSibling; + let translations = state.utils.reverseTranslation(label.textContent) + for (var text of translations){ + var id = state.utils.txtToId(text); + store.set(id, this.checked); + } + }); + } + } + function handleCheckboxes(addEvtLsner=true) { + let root_checkboxes = root_not_tabs.container.querySelectorAll('input[type="checkbox"]'); + root_checkboxes.forEach(function (root_checkbox) { + if(cnTabs.length == 0){ + handleCheckbox(root_checkbox, root_not_tabs.store, addEvtLsner) + } + else{ + let needsHandle = true + for(let tab of cnTabs){ + if(tab.container.contains(root_checkbox)){ + needsHandle = false + break + } + } + if(needsHandle){handleCheckbox(root_checkbox, root_not_tabs.store, addEvtLsner)} + } // else + }); + + cnTabs.forEach(({ container, store }) => { + let checkboxes = container.querySelectorAll('input[type="checkbox"]'); + checkboxes.forEach(function (checkbox) { + handleCheckbox(checkbox, store, addEvtLsner) + }); + }); + + } + + function handleTextArea(textarea, index, store, addEvtLsner=true) { + var id = state.utils.txtToId(`textarea_${index}`); + var value = store.get(id); + if (value) { + state.utils.setValue(textarea, value, 'change'); + } + + if(addEvtLsner){ + textarea.addEventListener('change', function () { + let text = this.value; + store.set(id, text); + //console.log(`id = ${id} value = ${text}`) + }); + } + } + function handleTextAreas(addEvtLsner=true) { + let textArea_index = 0; // 因为文本框的顺序不会变,所以命名直接使用序号区分 "textarea_0" + + let root_textareas = root_not_tabs.container.querySelectorAll('textarea'); + root_textareas.forEach(function (root_textarea) { + + if(cnTabs.length == 0){ + handleTextArea(root_textarea, textArea_index, root_not_tabs.store, addEvtLsner) + textArea_index += 1 + } + else{ + let needsHandle = 
true + for(let tab of cnTabs){ + if(tab.container.contains(root_textarea)){ + needsHandle = false + break + } + } + if(needsHandle){ + handleTextArea(root_textarea, textArea_index, root_not_tabs.store, addEvtLsner) + textArea_index += 1 + } + } // else + + }); + + cnTabs.forEach(({ container, store }) => { + container.querySelectorAll('textarea').forEach(textarea => { + handleTextArea(textarea, textArea_index, store, addEvtLsner) + textArea_index += 1 + }); + }); + + } + + function handleSelect(select, store, addEvtLsner=true) { + let translations = state.utils.reverseTranslation(select.querySelector('label').firstChild.textContent) + for (var text of translations){ + var id = state.utils.txtToId(text); + var value = store.get(id); + if (value) {break} + } + //id = state.utils.txtToId(translations[0]); + //if (value) { //前面不需要判断是否有值,因为需要执行handleSelect绑定onchange事件 + state.utils.handleSelect(select, id, store, force=true, addEvtLsner); + //} + + // let id = state.constants.LS_PREFIX+LS_PREFIX+"dropdown_"+index + // state.utils.onContentChange(select, function (el) { + // console.log(`onContentChange ${id}`) + // let selected = el.querySelector('span.single-select'); + // if (selected) { + // store.setWithNoPrefix(id, selected.textContent); + // } else { + // // new gradio version... + // let input = el.querySelector('input'); + // if (input) { + // store.setWithNoPrefix(id, input.value); + // } + // } + // }); + + if (id === 'preprocessor' && value && value.toLowerCase() !== 'none') { + state.utils.onNextUiUpdates(handleSliders); // update new sliders if needed + } + } + function handleSelects(addEvtLsner=true) { + // let root_selects = root_not_tabs.container.querySelectorAll('.gradio-dropdown'); + // root_selects.forEach(function (root_select) { + // if(cnTabs.length == 0){ + // handleSelect(root_select, global_dropdown_index[ext_name], root_not_tabs.store) + // global_dropdown_index[ext_name] += 1 + // console.log(`global_dropdown_index = ${global_dropdown_index[ext_name]}`) + // } + // else{ + // let needsHandle = true + // for(let tab of cnTabs){ + // if(tab.container.contains(root_select)){ + // needsHandle = false + // break + // } + // } + // if(needsHandle){ + // handleSelect(root_select, global_dropdown_index[ext_name], root_not_tabs.store) + // global_dropdown_index[ext_name] += 1 + // console.log(`global_dropdown_index = ${global_dropdown_index[ext_name]}`) + // } + // } // else + // }); + + // cnTabs.forEach(({ container, store }) => { + // container.querySelectorAll('.gradio-dropdown').forEach(select => { + // handleSelect(select, global_dropdown_index[ext_name], store) + // global_dropdown_index[ext_name] += 1 + // console.log(`global_dropdown_index = ${global_dropdown_index[ext_name]}`) + // }); + // }); + + let root_selects = root_not_tabs.container.querySelectorAll('.gradio-dropdown'); + root_selects.forEach(function (root_select) { + if(cnTabs.length == 0){ + handleSelect(root_select, root_not_tabs.store, addEvtLsner) + } + else{ + let needsHandle = true + for(let tab of cnTabs){ + if(tab.container.contains(root_select)){ + needsHandle = false + break + } + } + if(needsHandle){handleSelect(root_select, root_not_tabs.store, addEvtLsner)} + } // else + }); + + cnTabs.forEach(({ container, store }) => { + container.querySelectorAll('.gradio-dropdown').forEach(select => { + handleSelect(select, store, addEvtLsner) + }); + }); + + } + + function handleSlider(slider, store, addEvtLsner=true) { + let label = slider.previousElementSibling.querySelector('label span'); + let 
translations = state.utils.reverseTranslation(label.textContent) + for (var text of translations){ + var id = state.utils.txtToId(text); + var value = store.get(id); + if (value) {break} + } + if (value) { + state.utils.setValue(slider, value, 'change'); + } + if(addEvtLsner){ + slider.addEventListener('change', function () { + //store.set(id, state.utils.reverseTranslation(this.value)[0]); + let label = slider.previousElementSibling.querySelector('label span'); + let translations = state.utils.reverseTranslation(label.textContent) + for (var text of translations){ + var id = state.utils.txtToId(text); + store.set(id, state.utils.reverseTranslation(this.value)[0]); + } + }); + } + } + function handleSliders(addEvtLsner=true) { + + let root_sliders = root_not_tabs.container.querySelectorAll('input[type="range"]'); + root_sliders.forEach(function (root_slider) { + if(cnTabs.length == 0){ + handleSlider(root_slider, root_not_tabs.store, addEvtLsner) + } + else{ + let needsHandle = true + for(let tab of cnTabs){ + if(tab.container.contains(root_slider)){ + needsHandle = false + break + } + } + if(needsHandle){handleSlider(root_slider, root_not_tabs.store, addEvtLsner)} + } // else + }); + + cnTabs.forEach(({ container, store }) => { + let sliders = container.querySelectorAll('input[type="range"]'); + sliders.forEach(function (slider) { + handleSlider(slider, store, addEvtLsner) + }); + }); + } + + function handleRadioButton(fieldset, store, addEvtLsner=true) { + let label = fieldset.firstChild.nextElementSibling; + let radios = fieldset.querySelectorAll('input[type="radio"]'); + let translations = state.utils.reverseTranslation(label.textContent) + for (var text of translations){ + var id = state.utils.txtToId(text); + var value = store.get(id); + if (value) {break} + } + if (value) { + radios.forEach(function (radio) { + state.utils.setValue(radio, value, 'change'); + }); + } + + if(addEvtLsner){ + radios.forEach(function (radio) { + radio.addEventListener('change', function () { + let label = fieldset.firstChild.nextElementSibling; + let translations = state.utils.reverseTranslation(label.textContent) + for (var text of translations){ + var id = state.utils.txtToId(text); + store.set(id, state.utils.reverseTranslation(this.value)[0]); + } + }); + }); + } + } + function handleRadioButtons(addEvtLsner=true) { + + let root_fieldsets = root_not_tabs.container.querySelectorAll('fieldset'); + root_fieldsets.forEach(function (root_fieldset) { + if(cnTabs.length == 0){ + handleRadioButton(root_fieldset, root_not_tabs.store, addEvtLsner) + } + else{ + let needsHandle = true + for(let tab of cnTabs){ + if(tab.container.contains(root_fieldset)){ + needsHandle = false + break + } + } + if(needsHandle){handleRadioButton(root_fieldset, root_not_tabs.store, addEvtLsner)} + } // else + }); + + cnTabs.forEach(({ container, store }) => { + let fieldsets = container.querySelectorAll('fieldset'); + fieldsets.forEach(function (fieldset) { + handleRadioButton(fieldset, store, addEvtLsner) + }); + }); + } + + + function load(addEvtLsner=true) { + setTimeout(function () { + handleTabs(addEvtLsner); + handleCheckboxes(addEvtLsner); + handleTextAreas(addEvtLsner); + if(state.utils.target_is_newer_version(state.core.actions.get_sd_version(), "v1.6.0")){ + //console.log("-----------general handleSelects-------") + handleSelects(addEvtLsner); + } + handleSliders(addEvtLsner); + handleRadioButtons(addEvtLsner); + }, 500); + } + + function init(addEvtLsner=true) { + + store = new state.Store(LS_PREFIX + cur_tab_name); + + 
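+    // Keys for third-party extension controls combine the store prefix built just above in
+    // init() with an id derived from the control's (untranslated) label, so e.g. a ControlNet
+    // weight slider on the first txt2img unit is stored under something like
+    // "state-ext-control-net-txt2img_0-control-weight"; state.core.js later recovers the
+    // extension name from such keys with /ext-(\S+?)-(txt2img|img2img)/ when warning about
+    // missing extensions. A minimal sketch of the assembly (hypothetical label and unit):
+    function exampleExtensionKey() {
+        const prefix = 'state-' + 'ext-control-net-' + 'txt2img' + '_0' + '-'; // Store prefix for tab unit 0
+        const id = 'Control Weight'.split(' ').join('-').toLowerCase();        // what state.utils.txtToId does
+        return prefix + id; // -> "state-ext-control-net-txt2img_0-control-weight"
+    }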
if (! container) { + return; + } + + let tabnav = container.getElementsByClassName('tabs'); + let tabs = [] + if(tabnav.length > 0 ){ + tabs = Array.from(tabnav[0].childNodes).filter(item => item.className && item.className.indexOf("tabitem") !== -1) + } + + cnTabs = []; + if (tabs.length) { + tabs.forEach((tabContainer, i) => { + cnTabs.push({ + container: tabContainer, + store: new state.Store(LS_PREFIX + cur_tab_name + "_" + i) + }); + }); + } + //else { + root_not_tabs = { + container: container, + store: new state.Store(LS_PREFIX + cur_tab_name) + } + //} + + handleToggle(addEvtLsner); + load(addEvtLsner); + } + return { init,LS_PREFIX }; +} + + +function general_ext_main(tab){ + + let cur_tab_name = tab + let general_ext_list = [] + // 遍历第一级子节点 每个节点选出一个层级最小且innerText不为空的子节点 + function walks_element(element, cur_gen){ + if(element.innerText != "" && element.innerText != undefined && element.children.length == 0){ + return [[element.innerText,cur_gen]] + } + let res = [] + for(child of element.children){ + res = res.concat(walks_element(child,cur_gen+1,res)) + } + + return res + } + + function init(core_mode = true, addEvtLsner=true) { + console.log(`------------${cur_tab_name}----init--- addEvtLsner=${addEvtLsner} ----`) + + if(addEvtLsner==false) + { + for (let obj of general_ext_list) + { + obj.init(addEvtLsner); + } + return + } + + let container = gradioApp().getElementById(cur_tab_name+'_script_container'); // main container + + let extensions_root = container.children + if(extensions_root.length > 0 && extensions_root[0].className.split(' ')[0] != "gr-group" && extensions_root[0].className.split(' ')[0] != "gradio-group"){ + extensions_root = extensions_root[0].children // webui v1.6.0 版本,UI结构有变更 + } + //console.log(extensions_root) + for (child of extensions_root){ + let root_container = child + res = walks_element(child, 0) + let min_gen = 99 + let title = undefined + for(pair of res){ + if(pair[1] < min_gen){ + min_gen = pair[1] + title = pair[0] + } + } + + if(title == undefined + || title.toLowerCase() == "lightdiffusionflow" // 自己存自己就不用了 + ){continue} + + let translations = state.utils.reverseTranslation(title) + title = translations[0] // 标题翻译一般只会有一个? + if(title.toLowerCase() == 'script'){break} // script后面的面板暂时不考虑 + console.log(title) + + reg = /(.+) v[0-9\.]+/ + if(reg.test(title)){title = RegExp.$1} // 匹配 xxx v0.0.0 格式的标题,把后半部分的版本号去掉 + + if(title == "ControlNet"){title = "Control Net"} // 兼容旧命名 + else{ + if(core_mode){continue} + } + + let ext_name = title.replace(" ","-").toLowerCase() + console.log(ext_name) + + // if(!global_dropdown_index[ext_name]){ + // global_dropdown_index[ext_name] = 0 + // } + general_ext_obj = general_ext(cur_tab_name, ext_name, root_container) + general_ext_list.push(general_ext_obj) + general_ext_obj.init(addEvtLsner); + } + + } + return {init} +} + +//global_dropdown_index = {} // py里是不分txt2img和img2img的,但是这里是需要区分的。。 + +const TABS = ['txt2img', 'img2img']; +for (tab of TABS){ + state.extensions[`${tab}-ext-general`] = general_ext_main(tab); +} + diff --git a/gf/LightDiffusionFlow/javascript/state.store.js b/gf/LightDiffusionFlow/javascript/state.store.js new file mode 100644 index 0000000000000000000000000000000000000000..c06f4b323dbc13716d9d3cf68f142c0c29dbd0d3 --- /dev/null +++ b/gf/LightDiffusionFlow/javascript/state.store.js @@ -0,0 +1,60 @@ +window.state = window.state || {}; +state = window.state; + +state.Store = function Store (prefix) { + this.prefix = state.constants.LS_PREFIX + (prefix ? 
prefix + '-' : ''); +} + +state.Store.prototype.setWithNoPrefix = function (key, value) { + localStorage.setItem(key, value); +}; + +state.Store.prototype.set = function (key, value) { + if (key.startsWith(this.prefix)) { + localStorage.setItem(key, value); + } else { + localStorage.setItem(this.prefix + key, value); + } +}; + +state.Store.prototype.get = function (key) { + return localStorage.getItem(this.prefix + key); +}; + +state.Store.prototype.remove = function (key) { + localStorage.removeItem(this.prefix + key); +}; + +state.Store.prototype.clear = function () { + localStorage.clear(); +}; + +state.Store.prototype.clearAll = function () { + let keys = Object.keys(localStorage); + for (let i = 0; i < keys.length; i++) { + if (keys[i].startsWith(state.constants.LS_PREFIX)) { + localStorage.removeItem(keys[i]); + } + } +}; + +state.Store.prototype.getAll = function () { + let result = {}; + let keys = Object.keys(localStorage); + for (let i = 0; i < keys.length; i++) { + if (keys[i].startsWith(state.constants.LS_PREFIX)) { + result[keys[i]] = localStorage[keys[i]]; + } + } + return result; +}; + +state.Store.prototype.load = function (json) { + this.clearAll(); + let keys = Object.keys(json); + for (let i = 0; i < keys.length; i++) { + if (keys[i].startsWith(state.constants.LS_PREFIX)) { + this.set(keys[i], json[keys[i]]); + } + } +}; diff --git a/gf/LightDiffusionFlow/javascript/state.utils.js b/gf/LightDiffusionFlow/javascript/state.utils.js new file mode 100644 index 0000000000000000000000000000000000000000..d089a8b3cb34846637024acad848fc4d11fe4940 --- /dev/null +++ b/gf/LightDiffusionFlow/javascript/state.utils.js @@ -0,0 +1,684 @@ +window.state = window.state || {}; +state = window.state; +let selectingQueue = 0; // 默认都延时一下再开始触发 + +state.utils = { + + testFunction: function testFunction() { + //console.log(state.extensions) + // const button = gradioApp().getElementById("lightdiffusionflow_set_elements"); + // button.click(); + value = "新建文件夹\\anything-v5-PrtRE.safetensors" + value = value.replace("/","\\") + parts = value.split("\\") + console.log(parts[parts.length - 1]) + }, + + target_is_newer_version: function(cur_version, target_version){ + + let cur = cur_version.replace("v","") + cur = cur.split(".") + + let target = target_version.replace("v","") + target = target.split(".") + let version_len = Math.min(cur.length, target.length) + + // 逐个版本号比较 v1.2.3 和 v1.2比较时,只比较前面两个数字 + for (let i=0; i < version_len; i++){ + if(Number(cur[i]) > Number(target[i])){ + return false + } + else if(Number(cur[i]) < Number(target[i])){ + return true + } + } + + // 前面的版本号一样,再看谁的版本号更长 + if(cur.length >= target.length){ + return false + } + + return true + }, + + searchCheckPointByHash: async function searchCheckPointByHash(hash){ + let downloadUrl = undefined + hash_str = hash.replace("[","").replace("]","").replace(/^\s+|\s+$/g,"") + await fetch("https://civitai.com/api/v1/model-versions/by-hash/"+hash_str) + .then(response => response.json()) + .then(data => { + //try{ + //console.log(data["files"]) + for (file of data["files"]){ + for (key of Object.keys(file["hashes"])){ + if(file["hashes"][key].toLowerCase() === hash_str.toLowerCase()) + { + downloadUrl = file["downloadUrl"] + console.log(downloadUrl) + break + } + } + } + if(downloadUrl == undefined){downloadUrl = data["files"][0]["downloadUrl"]} + //} catch (error) {} + }).catch(function(e) { + console.log("search model error!"); + }); + + return downloadUrl + }, + + getTranslation: function getTranslation(key){ + new_key = key + try{ + 
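+    // state.Store above is a thin wrapper over localStorage: every key is namespaced under
+    // "state-" plus an optional per-extension prefix, and getAll()/clearAll() only touch keys
+    // inside that "state-" namespace, so flow data coexists with the rest of the WebUI's
+    // localStorage. A minimal usage sketch (hypothetical prefix and values):
+    function exampleStoreUsage() {
+        const s = new state.Store('txt2img');   // keys get the prefix "state-txt2img-"
+        s.set('seed', '12345');                 // written as "state-txt2img-seed"
+        return [s.get('seed'), Object.keys(s.getAll())]; // getAll() returns every "state-..." entry
+    }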
if(window.localization[new_key.replace(/^\s+|\s+$/g,"")] != undefined){ + new_key = window.localization[new_key] + } + } catch (error) { + console.warn('getTranslation error:', error); + } + return new_key + }, + + reverseTranslation: function reverseTranslation(key){ + new_key = [] + try{ + //key=key.replace(/^\s+|\s+$/g,""); + for (localize_key of Object.keys(window.localization)) { + if(key.replace(/^\s+|\s+$/g,"") === window.localization[localize_key].replace(/^\s+|\s+$/g,"")){ + tmp_key = localize_key + new_key.push(tmp_key) + //break + } + } + } catch (error) { + console.warn('reverseTranslation error:', error); + } + + if(new_key.length == 0){new_key.push(key)} + //console.log(`---------reverseTranslation---------${key}---------`) + //console.log(new_key) + return new_key + }, + + sleep: function sleep(time) { + return new Promise((resolve) => setTimeout(resolve, time)); + }, + + switch_to_img_inpaint: function switch_to_img_inpaint() { + switch_to_img2img_tab(4); + return Array.from(arguments); + }, + switch_to_txt2img_ControlNet: function switch_to_txt2img_ControlNet(unit) { + + switch_to_txt2img() + + let elem = undefined + elem = gradioApp().getElementById('txt2img_controlnet') + elem = elem.querySelector("#controlnet") + + try{ + if(elem.className.split(' ').pop() != "open"){ + state.utils.triggerMouseEvent(elem, 'click') + } + for(e of elem.children){ + if(e.className.split(' ').pop() != "open"){ + state.utils.triggerMouseEvent(e, 'click') + } + } + } catch(error){console.log(error)} + + try{ + gradioApp().getElementById('txt2img_controlnet_tabs').querySelectorAll('button')[Number(unit)].click() + } catch (error) { + console.warn('[switch_to_txt2img_ControlNet]: Error:', error); + } + }, + switch_to_img2img_ControlNet: function switch_to_img2img_ControlNet(unit) { + + switch_to_img2img() + + let elem = undefined //gradioApp().getElementById('txt2img_controlnet').children[0].children[1] + elem = gradioApp().getElementById('img2img_controlnet') + elem = elem.querySelector("#controlnet") + + try{ + if(elem.className.split(' ').pop() != "open"){ + state.utils.triggerMouseEvent(elem, 'click') + } + for(e of elem.children){ + if(e.className.split(' ').pop() != "open"){ + state.utils.triggerMouseEvent(e, 'click') + } + } + } catch(error){console.log(error)} + + try{ + gradioApp().getElementById('img2img_controlnet_tabs').querySelectorAll('button')[Number(unit)].click() + } catch (error) { + console.warn('[switch_to_img2img_ControlNet]: Error:', error); + } + }, + triggerEvent: function triggerEvent(element, event) { + if (! element) { + return; + } + element.dispatchEvent(new Event(event.trim())); + return element; + }, + triggerMouseEvent: function triggerMouseEvent(element, event) { + if (! 
element) { + return; + } + event = event || 'click'; + element.dispatchEvent(new MouseEvent(event, { + view: window, + bubbles: true, + cancelable: true, + })); + return element; + }, + setValue: function setValue(element, value, event) { + switch (element.type) { + case 'checkbox': + element.checked = value === 'true'; + this.triggerEvent(element, event); + break; + case 'radio': + if (element.value === value) { + element.checked = true; + this.triggerEvent(element, event); + } + else if(element.value == "Scribble/Sketch" && value == "Scribble"){ + element.checked = true; + this.triggerEvent(element, event); + } + else { + element.checked = false; + } + break; + default: + element.value = value; + this.triggerEvent(element, event); + } + }, + onFrameContentChange: function onFrameContentChange(targetNode, func) { + if(targetNode) { + const observer = new MutationObserver((mutationsList, observer) => { + for (const mutation of mutationsList) { + if (mutation.type === 'childList' || + (mutation.type === 'attributes' && mutation.attributeName == 'src') // 图片被更改 + ) { + //console.log(`onFrameContentChange ${mutation.type} `) + func(targetNode); + } + } + }); + observer.observe(targetNode, { + //attributes: true, + childList: true, + //characterData: true, + subtree: true + }); + } + }, + + onContentChange: function onContentChange(targetNode, func) { + if(targetNode) { + const observer = new MutationObserver((mutationsList, observer) => { + for (const mutation of mutationsList) { + if (mutation.type === 'childList' || + (mutation.type === 'attributes' && mutation.attributeName == 'src') // 图片被更改 + ) { + func(targetNode); + } + } + }); + observer.observe(targetNode, { + attributes: true, + childList: true, + characterData: true, + subtree: true + }); + } + }, + + onAccordionChange: function onAccordionChange(targetNode, func) { + if(targetNode) { + const observer = new MutationObserver((mutationsList, observer) => { + for (const mutation of mutationsList) { + if (mutation.type === 'attributes' ) { + func(targetNode); + } + } + }); + observer.observe(targetNode, { + attributes: true, + }); + } + }, + + getCurSeed: function getCurSeed(tab) { + const elements = gradioApp().getElementById(`html_info_${tab}`).querySelectorAll(`#html_info_${tab}`); + if (! elements || ! elements.length || !elements[0].innerText) { + return undefined; + } + seed = undefined + values = elements[0].innerText.split(',') + for (value of values){ + pair = value.split(':') + if(pair[0].replace(/^\s+|\s+$/g,"") == 'Seed'){ + seed = pair[1].replace(/^\s+|\s+$/g,"") + } + } + return seed + }, + + handleImage: function handleImage(select, id, store, addEvtLsner=true) { + if(addEvtLsner){ + setTimeout(() => { + state.utils.onContentChange(select, function (el) { + + let data = { + method: 'POST', + headers: { 'Content-Type': 'application/json' }, + body: JSON.stringify({ + "id":id, + "img":"" + }) + } + + try { + // new gradio version... 
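+    // getCurSeed above scrapes the seed from the generation-info text the WebUI renders under
+    // the result image, which is a comma-separated list of "key: value" pairs. A minimal
+    // illustration of that parsing (hypothetical info string):
+    function exampleParseSeed() {
+        const info = 'Steps: 20, Sampler: Euler a, CFG scale: 7, Seed: 1508457017, Size: 512x512';
+        let seed;
+        for (const part of info.split(',')) {
+            const pair = part.split(':');
+            if (pair[0].trim() === 'Seed') { seed = pair[1].trim(); }
+        }
+        return seed; // -> "1508457017"
+    }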
+ let img = el.querySelector('img'); + if (img) { + data.body = JSON.stringify({ + "id":id, + "img":img.src + }) + } + } catch (error) { + console.warn('[state]: Error:', error); + } + //console.log(`image changed ${id}`) + fetch(`/lightdiffusionflow/local/imgs_callback`, data) + }); + }, 150); + } + }, + + clearImage: function clearImage(select) { + try { + if(select){ + + let buttons = select.querySelectorAll('button'); + buttons.forEach(button => { + if(button.getAttribute("aria-label") == "Clear"){ + button.click(); + //state.utils.triggerMouseEvent(button, 'mousedown'); + } + }); + + } + } catch (error) { + console.warn('[state]: Error:', error); + } + }, + // handleSelect: function handleSelect(select, id, store) { + // try { + // let value = store.get(id); + // if (value) { + + // let input = select.querySelector('input'); + // state.utils.triggerMouseEvent(input, 'focus'); + + // setTimeout(() => { + // let items = Array.from(select.querySelectorAll('ul li')); + // for (li of items){ + // if (li.lastChild.wholeText.trim() === value) { + // state.utils.triggerMouseEvent(li, 'mousedown'); + // //return false; + // break + // } + // } + // state.utils.triggerMouseEvent(input, 'blur'); + // }, 100); + // } + + // setTimeout(() => { + // state.utils.onContentChange(select, function (el) { + // let selected = el.querySelector('span.single-select'); + // if (selected) { + // store.set(id, selected.textContent); + // } else { + // // new gradio version... + // let input = select.querySelector('input'); + // if (input) { + // store.set(id, input.value); + // } + // } + // }); + // }, 150); + // } catch (error) { + // console.warn('[state]: Error:', error); + // } + // }, + + forceSaveSelect: function forceSaveSelect(select, id, store) { + let selected = select.querySelector('span.single-select'); + if (selected) { + store.set(id, selected.textContent); + } else { + // new gradio version... 
+ let input = select.querySelector('input'); + if (input) { + store.set(id, input.value); + } + } + }, + handleAccordion: function handleAccordion(accordion, id, store, addEvtLsner=true){ + try{ + let value = store.get(id); + let child = accordion.querySelector('div.cursor-pointer, .label-wrap'); + if (value) { + //for(child of children){ + //let span = child.querySelector('.transition, .icon'); + //if(span.style.transform !== 'rotate(90deg)'){ + if(child.className.split(' ').pop() != "open"){ + state.utils.triggerMouseEvent(child, 'click') + } + //} + } + + if(addEvtLsner){ + setTimeout(() => { + state.utils.onAccordionChange(child, function (el) { + store.set(id, el.className.split(' ').pop() == "open"); + //console.log(`accordion on change ${id}`) + //let span = el.querySelector('.transition, .icon'); + //store.set(id, span.style.transform !== 'rotate(90deg)'); + }); + }, 150); + } + } catch (error) { + console.warn(`accordion:${accordion}, id:${id}`) + console.warn('[state]: Error:', error); + } + }, + handleSelect: function handleSelect(select, id, store, force=false, addEvtLsner=true) { + try { + + let value = store.get(id); + if ( value ) { //&& value != 'None' + value = value.replace("/","\\") + let parts = value.split("\\") + value = parts[parts.length - 1] + + selectingQueue += 1; + setTimeout(() => { + + let input = select.querySelector('input'); + state.utils.triggerMouseEvent(input, 'focus'); + setTimeout(() => { + let items = Array.from(select.querySelectorAll('ul li')); + let localized_value = this.getTranslation(value) + let successed = false + for (li of items){ + // li.lastChild.wholeText.trim() === value + let option = li.lastChild.wholeText.trim().replace(/^\s+|\s+$/g,"") + option = option.replace("/","\\") + let parts = option.split("\\") + option = parts[parts.length - 1] + if (localized_value.replace(/^\s+|\s+$/g,"") === option) { + state.utils.triggerMouseEvent(li, 'mousedown'); + successed = true + break + } + } + + let hash_res = localized_value.match(/\[[0-9A-Fa-f]{8,10}\]/) + if(!successed){ // && hash_pos != -1 找不到对应选项 并且选项里有10位哈希值 + for (li of items){ + + // 去掉Hash比较 + let text = li.lastChild.wholeText.trim() + text = text.replace("/","\\") + let parts = text.split("\\") + text = parts[parts.length - 1] + + let localized_value_no_hash = localized_value.replace(/\[[0-9A-Fa-f]{8,10}\]/,"").replace(/^\s+|\s+$/g,"") + let text_no_hash = text.replace(/\[[0-9A-Fa-f]{8,10}\]/, "").replace(/^\s+|\s+$/g,"") + + if (localized_value_no_hash === text_no_hash) { + successed = true + } + + // 只比较Hash + if(!successed && hash_res != null){ + let hash_str = hash_res[0].replace(/^\s+|\s+$/g,"") + let text_hash_res = text.match(/\[[0-9A-Fa-f]{8,10}\]/) + if(text_hash_res != null){ + let text_hash = text_hash_res[0].replace(/^\s+|\s+$/g,"") + if (hash_str === text_hash) { + successed = true + } + } + } + + if(successed){ + state.utils.triggerMouseEvent(li, 'mousedown'); + // state.core.actions.output_log( + // `Note: \'${value}\' not found. 
An approximate match \'${li.lastChild.wholeText.trim()}\' has been automatically selected as replacement.` + // ) + state.core.actions.preset_output_log("alt_option", value, li.lastChild.wholeText.trim()) + break + } + } + } + + if(!successed && items.length > 0) // 下拉框一个选项都没找到说明就没有这个下拉框,可能是界面设置把下拉框替换成了radio button + { + let option_name = store.prefix + id + if(option_name === "state-setting_sd_model_checkpoint"){ + // 大模型找不到就只用warning提示,因为不影响运行 + // state.core.actions.output_log(`Note: \'${value}\' not found.`) + state.core.actions.preset_output_log("no_option", "stable diffusion checkpoint", value) + } + else{ + //state.core.actions.output_log(`Error: \'${option_name}\' import failed! The option \'${value}\' was not found!`) + state.core.actions.preset_output_log("no_option", option_name, value) + } + if(hash_res != null){ + let model_name = value + let hash_str = hash_res[0] + state.utils.searchCheckPointByHash(hash_str).then( downloadUrl => { + if(downloadUrl != undefined){ + // let warning_str = encodeURIComponent(`Click to download \ + // ${model_name} `) + // state.core.actions.output_warning(warning_str) + state.core.actions.preset_output_log("download_url", model_name, downloadUrl) + } + }); + } + } + + state.utils.triggerMouseEvent(input, 'blur'); + selectingQueue -= 1; + //console.log(`selectingQueue = ${selectingQueue}`) + }, 100); + + }, selectingQueue * 200) + } + + if(addEvtLsner) + { + setTimeout(() => { + state.utils.onContentChange(select, function (el) { + let selected = el.querySelector('span.single-select'); + if(force){ + let localized_id = state.utils.getTranslation(id) + let id_translations = state.utils.reverseTranslation(localized_id) + //宁可错存一千,也不漏存一个 + for (trans_id of id_translations){ + if (selected) { + store.set(trans_id, selected.textContent); + } else { + // new gradio version... + let input = el.querySelector('input'); + if (input) { + store.set(trans_id, input.value); + } + } + } + } + else{ + if (selected) { + store.set(id, selected.textContent); + } else { + // new gradio version... + let input = el.querySelector('input'); + if (input) { + store.set(id, input.value); + } + } + } + }); + }, 150); + } + } catch (error) { + console.warn('[state]: Error:', error); + } + }, + handleMultipleSelect: function handleMultipleSelect(select, id, store, addEvtLsner=true) { + try { + let value = store.get(id); + + if (value) { + + value = value.split(',').reverse(); + + if (value.length) { + + let input = select.querySelector('input'); + + let selectOption = function () { + + if (! value.length) { + state.utils.triggerMouseEvent(input, 'blur'); + return; + } + + let option = value.pop(); + state.utils.triggerMouseEvent(input, 'focus'); + + setTimeout(() => { + let successed = false + let items = Array.from(select.querySelectorAll('ul li')); + items.forEach(li => { + if (li.lastChild.wholeText.trim() === option) { + state.utils.triggerMouseEvent(li, 'mousedown'); + successed = true + return false; + } + }); + if(!successed){ + state.core.actions.preset_output_log("no_option", store.prefix + id, value) + //state.core.actions.output_log(`Error: \'${store.prefix + id}\' import failed! 
The option \'${value}\' was not found!`) + } + setTimeout(selectOption, 100); + }, 100); + } + selectOption(); + } + } + + if(addEvtLsner){ + state.utils.onContentChange(select, function (el) { + const selected = Array.from(el.querySelectorAll('.token > span')).map(item => item.textContent); + store.set(id, selected); + }); + } + } catch (error) { + console.warn('[state]: Error:', error); + } + }, + txtToId: function txtToId(txt) { + return txt.split(' ').join('-').toLowerCase(); + }, + callXTimes: function callXTimes(func, times) { + let called = 0; + return function() { + if (called < times) { + called++; + return func.apply(this); + } + } + }, + saveFile: function saveJSON(fileName ,data) { + const json = JSON.stringify(data, null, 4); + const blob = new Blob([json], {type: 'application/json'}); + const url = URL.createObjectURL(blob); + console.log(url) + const link = document.createElement('a'); + link.href = url; + link.download = fileName; + + document.body.appendChild(link); + link.click(); + link.parentNode.removeChild(link); + }, + debounce: function debounce(func, delay) { + let lastCallTime = 0; + return function() { + const currentCallTime = new Date().getTime(); + if (currentCallTime - lastCallTime > delay) { + lastCallTime = currentCallTime; + func.apply(this, arguments); + } + } + }, + onNextUiUpdates: function (func) { + // brute force this to to ensure that the method is called after next few updates + onUiUpdate(this.callXTimes(function () { setTimeout(func, 5); }, 150)); + } +}; + +state.utils.html = { + setStyle: function setStyle(elements, style) { + if (elements instanceof NodeList) { + elements = Array.from(elements); + } else if (elements instanceof Node){ + elements = [elements]; + } else { + return; + } + elements.forEach(element => { + for (let key in style) { + if (style.hasOwnProperty(key)) { + element.style[key] = style[key]; + } + } + }); + }, + create: function create(type, props, style) { + const element = document.createElement(type); + if (props) { + for (let key in props) { + if (props.hasOwnProperty(key)) { + element[key] = props[key]; + } + } + } + if (style) { + this.setStyle(element, style); + } + return element; + }, + createButton: function createButton(text, onclick) { + const btn = document.createElement('button'); + btn.innerHTML = text; + btn.onclick = onclick || function () {}; + btn.className = 'gr-button gr-button-lg gr-button-primary'; + return btn; + } +}; diff --git a/gf/LightDiffusionFlow/preload.py b/gf/LightDiffusionFlow/preload.py new file mode 100644 index 0000000000000000000000000000000000000000..d8561955382ae36a479879ff4a79ffe080631d6a --- /dev/null +++ b/gf/LightDiffusionFlow/preload.py @@ -0,0 +1,4 @@ +import argparse + +def preload(parser: argparse.ArgumentParser): + parser.add_argument("--local-flows-path", type=str, default="models/LightDiffusionFlow", help="Path to save flow files.") diff --git a/gf/LightDiffusionFlow/scripts/__pycache__/lightdiffusionflow_config.cpython-310.pyc b/gf/LightDiffusionFlow/scripts/__pycache__/lightdiffusionflow_config.cpython-310.pyc new file mode 100644 index 0000000000000000000000000000000000000000..357faa6f2b430138ee53adef4c03b45075063fa6 Binary files /dev/null and b/gf/LightDiffusionFlow/scripts/__pycache__/lightdiffusionflow_config.cpython-310.pyc differ diff --git a/gf/LightDiffusionFlow/scripts/__pycache__/lightdiffusionflow_version.cpython-310.pyc b/gf/LightDiffusionFlow/scripts/__pycache__/lightdiffusionflow_version.cpython-310.pyc new file mode 100644 index 
0000000000000000000000000000000000000000..0aeace1ae986ce83ac3981e93ea138254c2e63e5 Binary files /dev/null and b/gf/LightDiffusionFlow/scripts/__pycache__/lightdiffusionflow_version.cpython-310.pyc differ diff --git a/gf/LightDiffusionFlow/scripts/__pycache__/state_api.cpython-310.pyc b/gf/LightDiffusionFlow/scripts/__pycache__/state_api.cpython-310.pyc new file mode 100644 index 0000000000000000000000000000000000000000..0ec353337e4e39b0dbee325eaf4ae47464f8be6e Binary files /dev/null and b/gf/LightDiffusionFlow/scripts/__pycache__/state_api.cpython-310.pyc differ diff --git a/gf/LightDiffusionFlow/scripts/__pycache__/state_settings.cpython-310.pyc b/gf/LightDiffusionFlow/scripts/__pycache__/state_settings.cpython-310.pyc new file mode 100644 index 0000000000000000000000000000000000000000..aecb540c4ba0c4c72efedea376ed0edb580ba01a Binary files /dev/null and b/gf/LightDiffusionFlow/scripts/__pycache__/state_settings.cpython-310.pyc differ diff --git a/gf/LightDiffusionFlow/scripts/lightdiffusionflow_config.py b/gf/LightDiffusionFlow/scripts/lightdiffusionflow_config.py new file mode 100644 index 0000000000000000000000000000000000000000..1f673bea3c7feb10361d4c0dd2780aa53599f8e8 --- /dev/null +++ b/gf/LightDiffusionFlow/scripts/lightdiffusionflow_config.py @@ -0,0 +1,320 @@ +import json,os +from modules.paths_internal import data_path + +PNGINFO_2_LIGHTDIFFUSIONFLOW = {} +PNGINFO_CN_2_LIGHTDIFFUSIONFLOW = {} +Image_Components_Key = {} +LoRAs_In_Use = "loras_in_use" +Flow_Save_mode = "All" +Auto_Fix_Params = True +Local_Flows_Path = "models/LightDiffusionFlow" + +class OutputPrompt_English: + + def startimport(): + return "
<br>Start parsing settings..." + + def invalid_file(): + return "Please select a valid lightdiffusionflow or image file!" + + def importing_image(image_name): + return f"Importing image: '{image_name}'." + + def import_completed(): + return "Import Completed!" + + def import_completed_info(): + return "Import Completed!" + + def save_completed(): + return "Save Completed!" + + def save_failed(): + return "Save Failed!"
+ + def alternative_option(target_value, new_value): + return f'''Note: '{target_value}' not found,<br>\ + An approximate match '{new_value}' has been automatically selected as replacement.'''
+ + def no_option(option_name, value): + return f'''Error: '{option_name}' import failed!<br>\ + The option '{value}' was not found!'''
+ + def missing_extensions(ext_list:[]): + if(Flow_Save_mode == "Core"): + error_str = "Note: The current flow file contains parameters for other plugins:<br><br>" + for ext in ext_list: + error_str+="<br><br>- "+ext+"<br><br>" + error_str+="You can switch to 'ALL' mode to load more parameters. (This prompt is for reference only, please choose the plugin mode according to the actual situation.)<br><br>" + else: + error_str = "Note: Found missing extensions.<br><br>" + for ext in ext_list: + error_str+="<br><br>- "+ext+"<br><br>" + error_str+="The above Extension Missing Reminder is for reference only. Please determine the necessary plugins based on your actual needs and specific conditions.<br><br>" + return error_str
+ + def missing_loras(loras_list:[]): + error_str = "Note: Found missing LoRAs.<br><br>" + for lora in loras_list: + error_str+="<br><br>- "+lora+"<br><br>" + return error_str
+ + def click_to_download(file_name, file_url): + return f'''<br><br>Click to download \ + {file_name} + '''
+ + def download_link(file_name, file_url): + return f''' {file_name} '''
+ + def note_for_save_mode(): + return "Core mode only supports basic parameters for Text-to-Image and Image-to-Image, along with ControlNet parameters. \ +The All mode, on the other hand, aims to support as many parameters as possible on the page, \ +but the downside is that it may occasionally cause the UI to freeze with an infinite loading circle."
+ +class OutputPrompt_Chinese: + + def startimport(): + return "<br><br><br>开始解析设置..." + + def invalid_file(): + return "请选择一个有效的flow文件,或者含png_info数据的图片!" + + def importing_image(image_name): + return f"导入图片'{image_name}'..." + + def import_completed(): + return "导入完成!" + + def import_completed_info(): + return "导入完成!" + + def save_completed(): + return "保存完毕!" + + def save_failed(): + return "保存失败!"
+ + def alternative_option(target_value, new_value): + return f'''注意: 未找到选项'{target_value}',<br>\ + 已使用近似选项'{new_value}'代替.'''
+ + def no_option(option_name, value): + if(option_name == "stable diffusion checkpoint"): + return f'''未找到大模型'{value}'!''' + clear_option_name = option_name.replace("state-ext-","") + return f'''错误: '{clear_option_name}'导入失败!<br>\ + 未找到选项'{value}'!'''
+ + def missing_extensions(ext_list:[]): + global Flow_Save_mode + if(Flow_Save_mode == "Core"): + error_str = "注意, 当前flow文件含有其他插件参数:<br><br>" + for ext in ext_list: + error_str+="<br><br>- "+ext+"<br><br>" + error_str+="可切换至'ALL'模式加载更多参数。(提示仅供参考,请根据实际情况选择插件模式。)<br><br>" + else: + error_str = "注意, 发现缺失的插件:<br><br>" + for ext in ext_list: + error_str+="<br><br>- "+ext+"<br><br>" + error_str+="以上插件缺失提示仅供参考,请注意辨别实际情况下您所需要安装的插件。<br><br>" + return error_str
+ + def missing_loras(loras_list:[]): + error_str = "注意, 发现缺失的LoRA模型:<br><br>" + for lora in loras_list: + error_str+="<br><br>- "+lora+"<br><br>" + return error_str
+ + def click_to_download(file_name, file_url): + name = file_name + if(name == "ControlNet Models"): + name = "常用ControlNet模型" + return f'''<br><br>点击下载 \ + 常用ControlNet模型 + ''' + + return f'''<br><br>
点击下载 \ + {name} + ''' + + def download_link(file_name, file_url): + return f''' {file_name} ''' + + def note_for_save_mode(): + return "Core模式仅支持文生图和图生图的基本参数+ControlNet参数。All模式则会尽可能多的支持页面上的参数,但是缺点是有概率导致UI卡住,无限转圈。" + + +OutputPrompt = OutputPrompt_English + +def create_local_flow_path(): + global Local_Flows_Path + + if os.path.isabs(Local_Flows_Path): + flows_path = Local_Flows_Path + else: + flows_path = os.path.join(data_path, Local_Flows_Path) + flows_path = flows_path.replace("\\","/") # linux下反斜杠有问题 + #print(flows_path) + try: + if(not os.path.exists(flows_path)): + os.makedirs(flows_path) + if(os.path.exists(flows_path)): + print(f"本地文件夹'{flows_path}'创建成功!") + else: + print(f"本地文件夹'{flows_path}'创建失败!") + except BaseException as e: + pass + if(not os.path.exists(flows_path)): + print(f"The creation of the folder '{Local_Flows_Path}' has failed! Please create this folder manually to ensure the proper functioning of the extension.") + print(f"创建文件夹'{Local_Flows_Path}'失败!请手动创建该文件夹,以保证插件功能正常运行。") + + Local_Flows_Path = flows_path + + +# 改成函数调用,修改配置之后能及时刷新 +def init(): + global PNGINFO_2_LIGHTDIFFUSIONFLOW,PNGINFO_CN_2_LIGHTDIFFUSIONFLOW + global OutputPrompt,Flow_Save_mode,Auto_Fix_Params,Local_Flows_Path,Image_Components_Key + # PNG Info的功能除了主要的选项以外其他的都靠第三方插件的主动支持,后续再考虑能否有优化的办法 + #print(parameters_copypaste.paste_fields) + PNGINFO_2_LIGHTDIFFUSIONFLOW = { + "Prompt": "state-txt2img_prompt", + "Negative prompt": "state-txt2img_neg_prompt", + "Steps": "state-txt2img_steps", + "Sampler": "state-txt2img_sampling", + "CFG scale": "state-txt2img_cfg_scale", + "Seed": "state-txt2img_seed", + "Face restoration": "state-txt2img_restore_faces", + "Size-1": "state-txt2img_width", + "Size-2": "state-txt2img_height", + "Model hash": "state-setting_sd_model_checkpoint", + "Denoising strength": "state-txt2img_denoising_strength", + "Hires upscale": "state-txt2img_hr_scale", + "Hires steps": "state-txt2img_hires_steps", + "Hires upscaler": "state-txt2img_hr_upscaler", + "Hires resize-1": "state-txt2img_hr_resize_x", + "Hires resize-2": "state-txt2img_hr_resize_y", + "Clip skip": "setting_CLIP_stop_at_last_layers", + "ENSD": "setting_eta_noise_seed_delta" + } + + PNGINFO_CN_2_LIGHTDIFFUSIONFLOW = { + "Module": "state-ext-control-net-txt2img_0-preprocessor", + "preprocessor": "state-ext-control-net-txt2img_0-preprocessor", + "model": "state-ext-control-net-txt2img_0-model", + "weight": "state-ext-control-net-txt2img_0-control-weight", + "starting": "state-ext-control-net-txt2img_0-starting-control-step", + "ending": "state-ext-control-net-txt2img_0-guidance-end-(t)", + "guidance start": "state-ext-control-net-txt2img_0-starting-control-step", + "guidance end": "state-ext-control-net-txt2img_0-ending-control-step", + "resize mode": "state-ext-control-net-txt2img_0-resize-mode", + "pixel perfect": "state-ext-control-net-txt2img_0-pixel-perfect", + "control mode": "state-ext-control-net-txt2img_0-control-mode", + "preprocessor params": "" + } + + try: + import modules.shared as shared + webui_settings = {} + with open(shared.cmd_opts.ui_settings_file, mode='r') as f: + json_str = f.read() + webui_settings = json.loads(json_str) + + # 保存模式 + try: + Flow_Save_mode = webui_settings["lightdiffusionflow-mode"] + except: + pass + + # 自动纠正错误的参数 + try: + Auto_Fix_Params = webui_settings["lightdiffusionflow-auto-fix-params"] + except: + pass + + # 本地flow保存位置,先读本地设置,再读命令行设置 + try: + Local_Flows_Path = webui_settings["lightdiffusionflow-local-flows-path"] + except: + pass + if shared.cmd_opts.local_flows_path: + 
Local_Flows_Path = shared.cmd_opts.local_flows_path + create_local_flow_path() + + + language_successed = False + auto_language = False + try: + # 优先读取自己的设置 + if(webui_settings['lightdiffusionflow-language'] == "default"): + auto_language = True + elif(webui_settings['lightdiffusionflow-language'] == "english"): + OutputPrompt = OutputPrompt_English + language_successed = True + else: + OutputPrompt = OutputPrompt_Chinese + language_successed = True + except: + OutputPrompt = OutputPrompt_English + + # 如果是default就读取其他设置配合 + if(auto_language and not language_successed): + # 自带的本地化文件 + localization_files = ["zh_CN", "zh-Hans (Stable) [vladmandic]", "zh-Hans (Stable)", + "zh-Hans (Testing) [vladmandic]", "zh-Hans (Testing)","chinese-all-1024","chinese-english-1024"] + try: + # 如果用户使用了中文汉化文件,插件也默认显示中文 + localization_files.index(webui_settings["localization"]) + OutputPrompt = OutputPrompt_Chinese + language_successed = True + except: + pass + + # 第三方翻译插件bilingual-localization + if(not language_successed): + try: + if(webui_settings["bilingual_localization_enabled"] and webui_settings["bilingual_localization_file"] != "None"): + OutputPrompt = OutputPrompt_Chinese + language_successed = True + except: + OutputPrompt = OutputPrompt_English + except: + pass + + Image_Components_Key = [ + # 第一个组件是用来预计算第一张有效图的索引 防止出现有没用的页面跳转 + "useless_Textbox", + # 每个图片组件的elem_id + "img2img_image","img2img_sketch","img2maskimg","inpaint_sketch","img_inpaint_base","img_inpaint_mask", + ] # 只保存图片组件id,其他参数js里搞定 + + + # # init number of controlnet + # try: + # webui_settings = {} + # with open(shared.cmd_opts.ui_settings_file, mode='r') as f: + # json_str = f.read() + # webui_settings = json.loads(json_str) + + # Multi_ControlNet = webui_settings.get("control_net_unit_count", None) # controlnet数量,新版名字 + # if(Multi_ControlNet == None): + # Multi_ControlNet = webui_settings.get("control_net_max_models_num", 0) + # print(f"Multi_ControlNet = {Multi_ControlNet}") + # if(Multi_ControlNet == 1): + # Image_Components_Key.append(f"txt2img_controlnet_ControlNet_input_image") + # Image_Components_Key.append(f"img2img_controlnet_ControlNet_input_image") + # else: + # for i in range(Multi_ControlNet): + # Image_Components_Key.append(f"txt2img_controlnet_ControlNet-{i}_input_image") + # Image_Components_Key.append(f"img2img_controlnet_ControlNet-{i}_input_image") + + # except: + # pass + + # # Segment Anything images + # Image_Components_Key.extend(["txt2img_sam_input_image","img2img_sam_input_image"]) + + + +init() \ No newline at end of file diff --git a/gf/LightDiffusionFlow/scripts/lightdiffusionflow_version.py b/gf/LightDiffusionFlow/scripts/lightdiffusionflow_version.py new file mode 100644 index 0000000000000000000000000000000000000000..950318a9b1249317b843661618645afe797f656e --- /dev/null +++ b/gf/LightDiffusionFlow/scripts/lightdiffusionflow_version.py @@ -0,0 +1 @@ +lightdiffusionflow_version = "v2.1.4" diff --git a/gf/LightDiffusionFlow/scripts/state_api.py b/gf/LightDiffusionFlow/scripts/state_api.py new file mode 100644 index 0000000000000000000000000000000000000000..e7ffe835e699e8ee9a520057a4efa2351522a256 --- /dev/null +++ b/gf/LightDiffusionFlow/scripts/state_api.py @@ -0,0 +1,1419 @@ +from fastapi import FastAPI, Body, HTTPException, Request, Response +from fastapi.responses import FileResponse +from pydantic import BaseModel +import gradio as gr + +import os, io, sys +import json +from PIL import Image +import re,base64,copy +import time,requests +import shutil +from urllib.parse import urlparse +import tempfile 
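# Illustrative usage sketch (not part of the extension): the StateApi class defined
# further below registers HTTP routes under /lightdiffusionflow, so an external script
# can ask a running WebUI to preload a flow file through the set_preload route.
# Host, port and file path here are assumptions for illustration only.
def _example_set_preload(flow_path="my_workflow.flow", base_url="http://127.0.0.1:7860"):
    import requests
    # POST a JSON body matching the file_params model ({"file_path": ...})
    resp = requests.post(f"{base_url}/lightdiffusionflow/set_preload", json={"file_path": flow_path})
    # The route answers "OK,Local File!", "OK,Network File!" or "Invalid File!"
    return resp.json()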
+ +from modules import localization, images +import modules.shared as shared +import modules.scripts as scripts +import modules.script_callbacks as script_callbacks +import modules.generation_parameters_copypaste as parameters_copypaste +from modules.generation_parameters_copypaste import paste_fields, registered_param_bindings, parse_generation_parameters +from modules.sd_models import checkpoints_list +from modules import ui_components +from modules.paths_internal import data_path +import launch + +from scripts import lightdiffusionflow_version, lightdiffusionflow_config +import scripts.lightdiffusionflow_config as lf_config +# from scripts.lightdiffusionflow_config import OutputPrompt, Flow_Save_mode, Auto_Fix_Params, LoRAs_In_Use # + +Flow_Save_mode = lf_config.Flow_Save_mode +Auto_Fix_Params = lf_config.Auto_Fix_Params +LoRAs_In_Use = lf_config.LoRAs_In_Use +OutputPrompt = lf_config.OutputPrompt +Local_Flows_Path = lf_config.Local_Flows_Path #"models/LightDiffusionFlow" + +# current_path = os.path.abspath(os.path.dirname(__file__)) +# sys.path.append(os.path.join(current_path,"lib")) +api = None +workflow_json = {} +State_Comps = {} # 当前页面的按钮组件 +invisible_buttons = {} +Webui_Comps = {} # webui上需要操作的图片组件 +Webui_Comps_Cur_Val = [] # 顺序与Image_Components_Key一致 +Output_Log = "" +g_msg_info = "" + +conponents_originlist = [] +extensions_conponents = {} +extensions_id_conponents = {} +extensions_id_conponents_value = {} +txt2img_script_container = None +img2img_script_container = None + +local_flow_list = [] +Need_Preload = False +Preload_File = r"" +File_extension = ".flow" + +paste_symbol = '\u2199\ufe0f' # ↙ +refresh_symbol = '\U0001f504' # 🔄 +save_style_symbol = '\U0001f4be' # 💾 +clear_prompt_symbol = '\U0001f5d1\ufe0f' # 🗑️ +apply_style_symbol = '\U0001f4cb' # 📋 + + +def test_func(): + gr.Warning("hello") + # global extensions_conponents, extensions_id_conponents + # global Output_Log + # print("test_func") + #print(Output_Log) + #print(networks.available_networks) + #print(preprocessor_filters) + + #print(extensions_id_conponents["dropdown"]["state-ext-control-net-txt2img_0-model"].get_config()) + + # print(parameters_copypaste.paste_fields) + + +# fastapi触发不了弹窗 +def custom_msg_box(): + global g_msg_info + if(g_msg_info != ""): + print(f"gr.Info({g_msg_info})") + gr.Info(g_msg_info) + g_msg_info = "" + +def clear_markup(html_str): + clearly_str = html_str + results = re.findall("(<.+?>)",clearly_str) + for res in results: + clearly_str = clearly_str.replace(res,"") + return clearly_str + +def add_output_log(msg:str="", style:str=""): + global Output_Log + if(msg != ""): + clear_msg = clear_markup(msg) + print(clear_msg) + Output_Log += f"
<p style='{style}'>{msg}</p>
" + + #custom_msg_box() + return Output_Log, Output_Log + +def add_output_warning(msg:str=""): + add_output_log(msg, style="color:Orange;") + +def add_output_error(msg:str=""): + add_output_log(msg, style="color:Red;") + +def add_preset_output_log(preset, key, value): + global g_msg_info + g_msg_info = "" + if(preset == "start"): + add_output_log(OutputPrompt.startimport()) + elif(preset == "finished"): + g_msg_info = OutputPrompt.import_completed_info() + add_output_log(OutputPrompt.import_completed()) + elif(preset == "invalid"): + add_output_log(OutputPrompt.invalid_file()) + elif(preset == "importing_img"): + add_output_log(OutputPrompt.importing_image(key)) + elif(preset == "alt_option"): + add_output_log(OutputPrompt.alternative_option(key,value)) + elif(preset == "no_option"): + add_output_log(OutputPrompt.no_option(key,value)) + elif(preset == "missing_exts"): + ext_list = value.split(";") + add_output_log(OutputPrompt.missing_extensions(ext_list)) + elif(preset == "missing_loras"): + lora_list = value.split(";") + add_output_log(OutputPrompt.missing_loras(lora_list)) + elif(preset == "download_url"): + add_output_log(OutputPrompt.click_to_download(key, value)) + + +def read_chunks(file, size=io.DEFAULT_BUFFER_SIZE): + """Yield pieces of data from a file-like object until EOF.""" + while True: + chunk = file.read(size) + if not chunk: + break + yield chunk +def gen_file_sha256(filname): + import hashlib + """ pip-style sha256 hash generation""" + print("Use Memory Optimized SHA256") + blocksize=1 << 20 + sha256_hash = hashlib.sha256() + length = 0 + with open(os.path.realpath(filname), 'rb') as read_file: + for block in read_chunks(read_file, size=blocksize): + length += len(block) + sha256_hash.update(block) + + hash_value = sha256_hash.hexdigest() + print(f"sha256: {hash_value}") + print(f"length: {length}") + return hash_value + +def find_checkpoint_from_name(name:str): + + for checkpoint in checkpoints_list.keys(): + res = re.search(r"(.+)\.(.+)", checkpoint) + try: + if(res.group(1) == name): + return checkpoint + except: + pass + return name + +def find_checkpoint_from_hash(hash:str): + + for checkpoint in checkpoints_list.keys(): + res = re.search(r"\[([0-9a-fA-F]{8,10})\]", checkpoint) + try: + if(res.group(1) == hash): + return checkpoint + except: + pass + return hash + +def SearchingCheckPointByHashFromCivitai(hash:str): + hash_str = hash.replace("[","").replace("]","").strip() + try: + response = requests.get("https://civitai.com/api/v1/model-versions/by-hash/"+hash_str) + if(response.status_code == 200): + #print(response.content) + return response.json() + except: + pass + return {} + +def refresh_local_flows(*inputs): + print("refresh_local_flows") + global local_flow_list,Local_Flows_Path + try: + #flows_path = os.path.join(data_path, Local_Flows_Path) + local_flow_list = [f for f in os.listdir(Local_Flows_Path) if os.path.isfile( + os.path.join(Local_Flows_Path, f)) and os.path.splitext(f)[-1] == '.flow'] + except: + local_flow_list = [] + # print(inputs) + # print(local_flow_list) + ret = [] + for dd in inputs: + if dd in local_flow_list: + selected = dd + else: + selected = None + ret.append(gr.Dropdown.update(choices=local_flow_list, value=selected)) + #ret = [gr.Dropdown.update(choices=local_flow_list, value=selected) for i in inputs] + return ret + +def apply_local_flow(selected): + global local_flow_list,Local_Flows_Path + global Need_Preload,Preload_File + + if(selected != "" and selected != None): + flow_path = os.path.join(Local_Flows_Path, selected) + 
if(os.path.exists(flow_path)): + print("OK,Local File!") + print(flow_path) + Preload_File = flow_path + Need_Preload = True + gr.Info(clear_markup(OutputPrompt.startimport())) + +def delete_local_flow(selected): + global local_flow_list,Local_Flows_Path + if(selected != "" and selected != None): + flow_path = os.path.join(Local_Flows_Path, selected) + if(os.path.exists(flow_path)): + os.remove(flow_path) + print("Local File Deleted!") + +def set_lightdiffusionflow_file(): + global Preload_File + return Preload_File + +def on_dropdown_changed(*component): + global extensions_id_conponents, extensions_id_conponents_value + + extensions_id_conponents_value["dropdown"] = {} + i = 0 + for id in extensions_id_conponents["dropdown"].keys(): + extensions_id_conponents_value["dropdown"][id] = component[i] + i+=1 + +def cn_get_model_type(cn_model_name): + + cn_type_list = ['canny','depth','normalmap','openpose','mlsd','lineart_anime','lineart','softedge','scribble', + 'seg','shuffle','tile','inpaint','ip2p','brightness','illumination','qrcode_monster','qrcode','normalbae'] + + type_pattern = "("+"|".join(cn_type_list)+")" + res = re.search(type_pattern,cn_model_name) + if(res != None): + return res.group() + + return None + +def set_dropdowns(): + global extensions_id_conponents, workflow_json + global temp_index,next_index + global Webui_Comps_Cur_Val, Output_Log + + temp_index = len(Webui_Comps_Cur_Val) + next_index = temp_index + + return_vals = [] + for comp_id in extensions_id_conponents["dropdown"].keys(): + value = None + new_value = None + try: + value = workflow_json.get(comp_id, None) + if(value == None): + new_value = extensions_id_conponents["dropdown"][comp_id].get_config()["value"] + else: + new_value = value + matching_successed = False + options = extensions_id_conponents["dropdown"][comp_id].get_config()["choices"] + + for option in options: + if(option == new_value): + matching_successed = True + break + else: + #print(f"去掉多余文件夹路径---{temp_option}----{temp_new_value}----") + temp_option = os.path.split(option)[-1] + temp_new_value = os.path.split(new_value)[-1] + if(temp_option == temp_new_value): + matching_successed = True + break + + # 去掉模型的多余路径? 
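# Illustrative example of the basename comparison above (hypothetical names):
#   option    = "sd15\\anythingV5.safetensors [7f96a1a9]"   (choice offered by the dropdown)
#   new_value = "anythingV5.safetensors [7f96a1a9]"         (value stored in the flow file)
# os.path.split(...)[-1] keeps only the file name, so the two still match even though
# the saved value lacks the sub-folder prefix.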
+ # if(os.path.split(option)[-1] == os.path.split(new_value)[-1]): + # matching_successed = True + # break + + # 没有完全匹配 + if(not matching_successed): + + # controlnet模型 + res = re.search(r"state-ext-control-net-txt2img_[0-9]-model", comp_id) + if(res != None): + cn_model = cn_get_model_type(new_value) + if(cn_model != None): + if(len(options) <= 1): + add_preset_output_log("download_url", "ControlNet Models", "https://huggingface.co/lllyasviel/ControlNet-v1-1/tree/main") + for option in options: + if(cn_model == cn_get_model_type(option)): + new_value = option + matching_successed = True + break + + # 哈希值匹配 + if(not matching_successed): + + # 寻找哈希值 + value_hash_val = None + value_no_hash = None + res = re.search(r"(\[[0-9A-Fa-f]{8,10}\])", new_value) + if(res != None): + value_hash_val = res.group(1) + value_no_hash = new_value.replace(value_hash_val,"").rstrip() + + for option in options: + + option_hash_val = None + option_no_hash = None + res = re.search(r"(\[[0-9A-Fa-f]{8,10}\])", option) + if(res != None): # 选项有哈希 + option_hash_val = res.group(1) + option_no_hash = option.replace(option_hash_val,"").rstrip() + if(value_hash_val == None): # 值没有哈希 + if(new_value.rstrip() == option_no_hash): + new_value = option + matching_successed = True + break + else: # 值有哈希 + if(value_hash_val == option_hash_val or option_no_hash == value_no_hash): + new_value = option + matching_successed = True + break + else: # 选项没有哈希 + if(value_hash_val == None): # 值没有哈希 + if(new_value.rstrip() == option.rstrip()): + new_value = option + matching_successed = True + break + else: # 值有哈希 + if(value_no_hash == option.rstrip()): + new_value = option + matching_successed = True + break + + if(matching_successed): + add_output_log(OutputPrompt.alternative_option(value,new_value)) + #add_output_log(f"Note: '{value}' not found. An approximate match '{new_value}' has been automatically selected as replacement.") + #print(f"Note: '{value}' not found. An approximate match '{new_value}' has been automatically selected as replacement.") + else: + add_output_log(OutputPrompt.no_option(comp_id,value)) + #add_output_log(f"Error: '{comp_id}' import failed! The option '{value}' was not found!") + #print(f"'{comp_id}' import failed! 
The option '{value}' was not found!") + new_value = extensions_id_conponents["dropdown"][comp_id].get_config()["value"] + + except KeyError as e: + print(e) + return_vals.append(new_value) + + return_vals.append(temp_index) # 给json2js + return_vals.append(Output_Log) + return_vals.append(Output_Log) + #print(return_vals) + return tuple(return_vals) + +def set_js_params(): + global temp_index,next_index + temp_index = next_index+1 + next_index = temp_index + print("set_js_params") + return temp_index + +# use_elem_id 是为了兼容旧版的图片组件直接使用elem_id作为组件名称的情况 +def comp_create_id(component, tab_name, ext_name, sub_tab, use_elem_id = False): + comp_id = None + if(use_elem_id): + try: + comp_id = component.elem_id + except: + pass + + if(comp_id == None): + #try: + comp_name = component.get_config()['label'].replace(" ", "-").lower() + temp_sub_tab = ("_"+sub_tab) if sub_tab != "base" else "" + comp_id = 'state-ext-'+ ext_name.replace(" ","-").lower() + "-" + tab_name + temp_sub_tab + "-" + comp_name + #except: + # pass + + return comp_id + +def params_create_ids(): + global extensions_id_conponents, extensions_conponents + extensions_id_conponents = {"dropdown":{}, "image":{}} + + for key in lf_config.Image_Components_Key: + if(key == "useless_Textbox"): + extensions_id_conponents["image"]["useless_Textbox"] = State_Comps["useless_Textbox"] + else: + extensions_id_conponents["image"][key] = Webui_Comps[key] + + for tab_name in extensions_conponents.keys(): # tab name + for ext_name in extensions_conponents[tab_name].keys(): # plugin name + comp_index = 0 + for sub_tab in extensions_conponents[tab_name][ext_name].keys(): # sub_tab 如果没有默认就是base + for comp in extensions_conponents[tab_name][ext_name][sub_tab]: # component + try: + # 先只考虑有label的组件 + if(isinstance(comp, gr.Dropdown)): + # try: + # comp_name = comp.get_config()['label'].replace(" ", "-").lower() + # temp_sub_tab = ("_"+sub_tab) if sub_tab != "base" else "" + # comp_id = 'state-ext-'+ ext_name.replace(" ","-").lower() + "-" + tab_name + temp_sub_tab + "-" + comp_name + # except: + # pass + comp_id = comp_create_id(comp, tab_name, ext_name, sub_tab) + # comp_name = comp.get_config()['name'] + "_" + str(comp_index) + # comp_index += 1 + # comp_id = 'state-ext-'+ ext_name.replace(" ","-").lower() + "-" + comp_name + extensions_id_conponents["dropdown"][comp_id] = comp + elif(isinstance(comp, gr.Image)): + comp_id = comp_create_id(comp, tab_name, ext_name, sub_tab, True) + if("generated_image" in comp_id): + pass + #print(f"skip component: {comp_id}") + else: + extensions_id_conponents["image"][comp_id] = comp + + except BaseException as e: + pass + #print(extensions_id_conponents) + +def get_lora_info(using_loras:list): + import networks + print(f"using_loras = {using_loras}") + loras_info = {} + for lora_name in using_loras: + for network in networks.available_networks.keys(): + if( (lora_name == networks.available_networks[network].name or lora_name == networks.available_networks[network].alias) and networks.available_networks[network].shorthash != ""): + print(networks.available_networks[network].filename) + sha256 = gen_file_sha256(networks.available_networks[network].filename) + loras_info[sha256] = { + 'name':networks.available_networks[network].name, + 'alias':networks.available_networks[network].alias + } + print(f"loras_info = {loras_info}") + return loras_info + +def get_lora_from_prompt(prompt): + import networks + + available_loras = [] + for network in networks.available_networks.keys(): + 
available_loras.append((networks.available_networks[network].name,networks.available_networks[network].alias)) + #print(available_loras) + + used_loras = [] + re_lora_prompt = re.compile("", re.IGNORECASE) + results = re.findall(re_lora_prompt, prompt) + # print("使用到的lora:") + # print(results) + for res in results: + lora_name = res[0] + for lora in available_loras: + if(lora_name == lora[0] or lora_name == lora[1]): + used_loras.append(lora_name) + return used_loras + +def get_missing_lora(workflow_json): + import networks + + data_str = json.dumps(workflow_json) + # re_parens = re.compile(r"[\\/\[\](){}]+") + # prompt = re.sub( + # re_parens, "", prompt.replace(",", ",")#.replace("-", " ").replace("_", " ") + # ) + + available_loras = [] + for network in networks.available_networks.keys(): + available_loras.append((networks.available_networks[network].name,networks.available_networks[network].alias)) + #print(available_loras) + + missing_loras = [] + loras_info = workflow_json[LoRAs_In_Use] + re_lora_prompt = re.compile("", re.IGNORECASE) + results = re.findall(re_lora_prompt, data_str) + # print("使用到的lora:") + # print(results) + for res in results: + lora_name = res[0] + for lora in available_loras: + if(lora_name == lora[0] or lora_name == lora[1]): + break + else: + # SD默认计算的lora的hash值无法用来在C站搜索,下载链接需要的Hash需单独计算导出才能支持。 + # name,alias,hash,downloadUrl + for sha256 in loras_info.keys(): + if(loras_info[sha256]["name"] == lora_name or loras_info[sha256]["alias"] == lora_name): + try: + model_info = SearchingCheckPointByHashFromCivitai(sha256) + if(model_info != {}): + lora_link = OutputPrompt.download_link(lora_name, model_info["downloadUrl"]) + missing_loras.append(lora_link) + break + except: + pass + else: + missing_loras.append(lora_name) + # print("丢失的lora:") + # print(missing_loras) + return missing_loras + +def get_extname_from_label(label): + ext_name = label + res = re.search(r"(.+) v[0-9\.]+", ext_name) + if(res != None): + ext_name = res.group(1) + #兼容旧命名 + if(ext_name == "ControlNet"): + ext_name = "Control-Net" + return ext_name + + +def get_script_container(component): + global txt2img_script_container, img2img_script_container + if(txt2img_script_container == None): + temp = component + #i = 10 + while temp: + if(temp.elem_id == "txt2img_script_container"): + txt2img_script_container = temp + break + else: + temp = temp.parent + #i-=1 + + if(img2img_script_container == None): + temp = component + i = 10 + while temp and i>0: + if(temp.elem_id == "img2img_script_container"): + img2img_script_container = temp + break + else: + temp = temp.parent + i-=1 + +def searching_extensions_title(): + global txt2img_script_container, img2img_script_container, extensions_conponents, Flow_Save_mode + + if(Flow_Save_mode == "Core"): + extensions_conponents["txt2img"] = {} + extensions_conponents["img2img"] = {} + extensions_conponents["txt2img"]["Control-Net"] = {"base":[]} + extensions_conponents["img2img"]["Control-Net"] = {"base":[]} + + else: + for group in txt2img_script_container.children: # 遍历读取所有的插件名称 + + label = "" + try: + label = get_extname_from_label(group.children[0].label) + except BaseException as e: + pass + + if(label == ""): + try: + label = get_extname_from_label(group.children[0].children[0].label) + except BaseException as e: + pass + + if(label != ""): + try: + extensions_conponents["txt2img"] + except: + extensions_conponents["txt2img"] = {} + extensions_conponents["img2img"] = {} + extensions_conponents["txt2img"][label] = {"base":[]} + 
extensions_conponents["img2img"][label] = {"base":[]} + #extensions_conponents[label] = [] + #print(extensions_conponents) + +def on_img_changed(*component): + global extensions_id_conponents, extensions_id_conponents_value + + #test_vals = [] + extensions_id_conponents_value["image"] = {} + i = 0 + for id in extensions_id_conponents["image"].keys(): + extensions_id_conponents_value["image"][id] = component[i] + # if(component[i] == None): + # test_vals.append("0") + # else: + # test_vals.append("1") + i+=1 + + #print(test_vals) + +temp_index = -1 +next_index = -1 +def func_for_invisiblebutton(): + global temp_index,next_index + global extensions_id_conponents, Webui_Comps_Cur_Val, Output_Log + + temp_index = next_index+1 + next_index = temp_index + + try: + while( next_index < len(Webui_Comps_Cur_Val) and Webui_Comps_Cur_Val[next_index+1] == None ): + next_index += 1 + except: + pass + + # try: + # print(f"func_for_invisiblebutton {temp_index} {next_index} {len(Webui_Comps_Cur_Val)}") + # print(f"func_for_invisiblebutton {lf_config.Image_Components_Key[temp_index]} {Webui_Comps_Cur_Val[temp_index]} ") + # except: + # pass + + + # 第一个组件是用来预计算第一张图的索引 防止出现有没用的页面跳转 所以不用输出日志信息 + if(temp_index > 0): + add_output_log(OutputPrompt.importing_image(list(extensions_id_conponents['image'].keys())[temp_index])) + #add_output_log(f"importing image: \'{list(extensions_id_conponents['image'].keys())[temp_index]}\' ") + + #if(next_index+1 == len(Webui_Comps_Cur_Val)): + #add_output_log(OutputPrompt.import_completed()) + #add_output_log(f"import completed!") + + # 因为显示日志的窗口分txt2img和img2img两个位置 所以两个位置同步导出 + return Webui_Comps_Cur_Val[temp_index], next_index, Output_Log, Output_Log + + +def config_filter(config): + global extensions_id_conponents + new_config = config + for comp_type in extensions_id_conponents.keys(): + for comp_id in extensions_id_conponents[comp_type].keys(): + try: + # 筛掉python相关组件的默认值选项 + default_val = extensions_id_conponents[comp_type][comp_id].get_config()["value"] + if(default_val == new_config[comp_id]): + del new_config[comp_id] + except KeyError as e: + pass + + if(Auto_Fix_Params == False): + return new_config + + # 处理旧版插件保存的错误参数问题 + print("-------错误参数纠正-------") + found_tabs = [] + fixed_config = {} + for param in new_config.keys(): + res = re.match("state-ext-control-net-(txt2img|img2img)_([0-9]+)-(.+)",param) + if(res != None): + if(res.group(3) not in ["presets","preprocessor","model"]): + try: + found_tabs.index(res.group(2)) + except ValueError: + found_tabs.append(res.group(2)) + + for param in new_config.keys(): + + # 缝缝补补。。。 js保存的参数因为是汉语反推回英文,所以就会有 model->模型->models 的情况。 + res = re.search("[-_](model|models|checkpoint|checkpoints)$",param) + if(res != None): + for key in ["model","models","checkpoint","checkpoints"]: + target_word = str(res.group()).replace(res.group(1),key) + new_param = re.sub("[-_](model|models|checkpoint|checkpoints)$", target_word, param) + fixed_config[new_param] = new_config[param] + #print(f"根据{param},补充{new_param}") + + # 纠正编号 + res = re.match("state-ext-control-net-(txt2img|img2img)_([0-9]+)-(.+)",param) + if(res != None): + tab_num = int(res.group(2)) + if(res.group(2) != "0" and tab_num%3 == 0 ): + try: + found_tabs.index(str(tab_num/3)) + # 如果是9 如果发现3的位置有参数,就还需要检查一下1 + if(tab_num == 9): + found_tabs.index("1") + except ValueError: + new_key = f"state-ext-control-net-{res.group(1)}_{int(tab_num/3)}-{res.group(3)}" + # 这三个选项,旧版是js控制会有序号问题,新版本py控制没有序号问题,所以这里判断纠正之后的位置有没有值,有的话就不动。 + if(res.group(3) in ["presets","preprocessor","model"]): + 
#print(param) + if(new_config.get(new_key,None) == None): + fixed_config[new_key] = new_config[param] + print(f" {param} 改为 {new_key}") + continue + else: + fixed_config[new_key] = new_config[param] + print(f" {param} 改为 {new_key}") + continue + + # 其余参数照搬 + fixed_config[param] = new_config[param] + + #print(fixed_config) + #print("-------处理错误-------") + + new_config = fixed_config + return new_config + + +def fn_import_workflow(workflow_file): + global workflow_json, Output_Log + global extensions_id_conponents, Webui_Comps_Cur_Val, temp_index, next_index + temp_index = -1 # 重置索引 + next_index = -1 + + workflow_json = {} + workflow_json_str = "" + if(workflow_file): + try: + config_file = workflow_file[0].name + except: + config_file = workflow_file.name + + print("fn_import_workflow "+str(config_file)) + if (os.path.splitext(config_file)[-1] in [File_extension, ".lightflow", ".json"]): # 兼容部分旧版本文件 + with open(config_file, mode='r', encoding='UTF-8') as f: + workflow_json_str = f.read() + workflow_json = json.loads(workflow_json_str) + + try: + response = requests.get("https://api.lightflow.ai/openapi/access?action=import") + if(response.status_code == 200): + pass + except: + pass + + else: + print("invalid file!") + + + workflow_json = config_filter(workflow_json) + + Webui_Comps_Cur_Val = [] + for key in extensions_id_conponents["image"].keys(): + #for key in lf_config.Image_Components_Key: + image = None + successed = 2 + tempkey = key + while successed > 0: + try: + image_data = workflow_json[key] + matchObj = re.match("data:image/[a-zA-Z0-9]+;base64,",image_data) + if matchObj != None: + image_data = image_data[len(matchObj.group()):] + image_data = base64.decodebytes(image_data.encode('utf-8')) + image = Image.open(io.BytesIO(image_data)) + successed = 0 + except: + # 如果是controlnet 第一张图 就修改一下key值重试一遍 + if(key == "txt2img_controlnet_ControlNet_input_image"): + key = "txt2img_controlnet_ControlNet-0_input_image" + elif(key == "img2img_controlnet_ControlNet_input_image"): + key = "img2img_controlnet_ControlNet-0_input_image" + + elif(key == "txt2img_controlnet_ControlNet-0_input_image"): + key = "txt2img_controlnet_ControlNet_input_image" + elif(key == "img2img_controlnet_ControlNet-0_input_image"): + key = "img2img_controlnet_ControlNet_input_image" + else: + successed = 0 + successed-=1 + + Webui_Comps_Cur_Val.append(image) + + # 检查lora + missing_loras = [] + try: + missing_loras = get_missing_lora(workflow_json) + if(len(missing_loras) > 0): + add_output_log(OutputPrompt.missing_loras(missing_loras)) + + # # SD默认计算的lora的hash值无法用来在C站搜索,下载链接需要的Hash需单独计算导出才能支持。 + # # name,alias,hash,downloadUrl + # loras_info = workflow_json[LoRAs_In_Use] + # for lora_name in missing_loras: + # for sha256 in loras_info.keys(): + # if(loras_info[sha256]["name"] == lora_name or loras_info[sha256]["alias"] == lora_name): + # try: + # model_info = SearchingCheckPointByHashFromCivitai(sha256) + # if(model_info != {}): + # add_preset_output_log("download_url", lora_name, model_info["downloadUrl"]) + # except: + # pass + except KeyError as e: + pass + #print(f"except missing loras error: {e}") + + #print(Webui_Comps_Cur_Val) + #set_elements() + # return_vals.append(str(time.time())) # 用来触发json2js事件,python设置完图片 js继续设置其他参数 弃用 + # return tuple(return_vals) + return str(temp_index)#, Output_Log, Output_Log + +class config_params(BaseModel): + config_data:dict + +class imgs_callback_params(BaseModel): + id:str + img:str + +class png_info_params(BaseModel): + img_path:str + +class file_params(BaseModel): + 
file_path:str + +class savefile_params(BaseModel): + file_name:str + file_data:dict + overwrite:bool + +class StateApi(): + + BASE_PATH = '/lightdiffusionflow' + + def get_path(self, path): + return f"{self.BASE_PATH}{path}" + + def add_api_route(self, path: str, endpoint, **kwargs): + return self.app.add_api_route(self.get_path(path), endpoint, **kwargs) + + def start(self, _: gr.Blocks, app: FastAPI): + print("----------------- light_diffusion_flow api start------------------") + + self.app = app + # 读取本地的config.json + self.add_api_route('/local/config.json', self.get_config, methods=['GET']) + # python已经加载好的配置workflow_json 发送给 js + self.add_api_route('/local/lightdiffusionflow_config', self.get_lightdiffusionflow_config, methods=['GET']) + self.add_api_route('/local/parse_lora_info', self.parse_lora_info, methods=['POST']) + # 获取图片的组件id 由js来设置onchange事件 + self.add_api_route('/local/get_imgs_elem_key', self.get_img_elem_key, methods=['GET']) + # 获取当前已安装的插件列表 + self.add_api_route('/local/get_ext_list', self.get_ext_list, methods=['GET']) + # 用户设置了新图片 触发回调保存到 workflow_json + self.add_api_route('/local/imgs_callback', self.imgs_callback, methods=['POST']) + self.add_api_route('/local/useless_config_filter', self.useless_config_filter, methods=['POST']) + # 刷新页面之后触发 + self.add_api_route('/local/refresh_ui', self.refresh_ui, methods=['GET']) + self.add_api_route('/local/output_log', add_output_log, methods=['GET']) + self.add_api_route('/local/preset_output_log', add_preset_output_log, methods=['GET']) + self.add_api_route('/local/png_info', self.png_info, methods=['POST']) # + # 传入一个文件路径,返回文件内容 + self.add_api_route('/local/read_file', self.read_file, methods=['POST']) + self.add_api_route('/local/file_exist', self.file_exist, methods=['POST']) + self.add_api_route('/local/need_preload', self.need_preload, methods=['GET']) + # 保存当前配置到本地文件夹 + self.add_api_route('/local/save_flow_to_local', self.saveFlowToLocal, methods=['POST']) + + self.add_api_route('/set_preload', self.set_preload, methods=['POST']) + + def get_config(self): + return FileResponse(shared.cmd_opts.ui_settings_file) + + def get_ext_list(self): + global extensions_conponents + ext_str = "" + try: + ext_str = ",".join(list(extensions_conponents["txt2img"].keys())).lower().replace(" ", "-") + except: + pass + return ext_str + + def useless_config_filter(self, config:config_params): + return config_filter(config.config_data) + + def get_lightdiffusionflow_config(self, data2export:bool = False): + global workflow_json, extensions_id_conponents, extensions_id_conponents_value + temp_json = {} + if(data2export): + + for key in extensions_id_conponents["image"].keys(): + # for key in lf_config.Image_Components_Key: + try: + temp_json[key] = workflow_json[key] + except: + pass + + # using_loras = get_lora_from_prompt(json.dumps(temp_json)) + # temp_json[LoRAs_In_Use] = get_lora_info(using_loras) + + # 导出时调用,这里把py负责的其他组件一起读进来 + for comp_type in extensions_id_conponents_value.keys(): + for comp_id in extensions_id_conponents_value[comp_type].keys(): + try: + # 默认值的选项不导出 + default_val = extensions_id_conponents[comp_type][comp_id].get_config()["value"] + if(default_val != extensions_id_conponents_value[comp_type][comp_id]): + temp_json[comp_id] = extensions_id_conponents_value[comp_type][comp_id] + except KeyError as e: + pass + else: + temp_json = copy.deepcopy(workflow_json) + for key in extensions_id_conponents["image"].keys(): + #for key in lf_config.Image_Components_Key: + temp_json[key] = "" + + return json.dumps(temp_json) + + def 
parse_lora_info(self, config:config_params): + temp_json = {} + json_str = json.dumps(config.config_data) + using_loras = get_lora_from_prompt(json_str) + temp_json[LoRAs_In_Use] = get_lora_info(using_loras) + return temp_json + + def str_2_json(self, str_data:str): + out_json = {} + res = re.findall(r"([^:]+:[^:]{1,})(,|$)",str_data) + for field in res: + data = field[0].split(":") + try: + out_json[data[0].strip()] = data[1].strip() + except IndexError as e: + print(f"str_2_json [key error]: {e}") + return out_json + + def png_info(self, img_data:png_info_params): + + geninfo, items = images.read_info_from_image(Image.open(img_data.img_path)) + geninfo = parse_generation_parameters(geninfo) + temp_json = {} + for key in geninfo.keys(): + + matchObj = re.match("ControlNet ([0-9])", key) + if(matchObj != None): # controlnet + cn_info = self.str_2_json(geninfo[key]) + if(len(cn_info.keys()) > 0): + temp_json["state-ext-control-net-txt2img_0-enable".replace("0",matchObj.group(1))] = True + + for cn_key in cn_info.keys(): + if(cn_key == "starting/ending"): + cn_key_split = cn_key.split("/") + data = cn_info[cn_key].replace("(","").replace(")","").split(",") + temp_json[lf_config.PNGINFO_CN_2_LIGHTDIFFUSIONFLOW[cn_key_split[0]].replace("0",matchObj.group(1))]\ + = data[0].strip() + temp_json[lf_config.PNGINFO_CN_2_LIGHTDIFFUSIONFLOW[cn_key_split[1]].replace("0",matchObj.group(1))]\ + = data[1].strip() + elif(cn_key == "pixel perfect"): + temp_json[lf_config.PNGINFO_CN_2_LIGHTDIFFUSIONFLOW[cn_key].replace("0",matchObj.group(1))]\ + = (cn_info[cn_key].lower() == "true") + else: + try: + temp_json[lf_config.PNGINFO_CN_2_LIGHTDIFFUSIONFLOW[cn_key.lower()].replace("0",matchObj.group(1))] = cn_info[cn_key] + except KeyError as e: + print(f"ControlNet option '{cn_key}' parsing failed.") + + elif(key == "Model hash"): + target_model = find_checkpoint_from_hash(geninfo[key]) + if(target_model == geninfo[key]):#说明没有找到相同hash值的模型,改用名称查找 + try: + target_model = find_checkpoint_from_name(geninfo["Model"]) + except: + pass + temp_json[lf_config.PNGINFO_2_LIGHTDIFFUSIONFLOW[key]] = target_model + + elif(key == "Face restoration"): + temp_json[lf_config.PNGINFO_2_LIGHTDIFFUSIONFLOW[key]] = True + else: + try: + temp_json[lf_config.PNGINFO_2_LIGHTDIFFUSIONFLOW[key]] = geninfo[key] + except KeyError as e: + pass + #print(e) + + if(key in ["Hires upscale","Hires steps","Hires upscaler","Hires resize-1","Hires resize-2"]): + temp_json["state-txt2img_enable_hr"] = True + + return json.dumps(temp_json) + + def read_file(self, params:file_params): + print("read_file") + file_content = "" + with open(params.file_path, mode='r', encoding='UTF-8') as f: + file_content = f.read() + return file_content + + def file_exist(self, params:file_params): + print("file_exist") + if(not os.path.exists(params.file_path)): + flow_path = os.path.join(Local_Flows_Path, params.file_path) + if(os.path.exists(flow_path)): + return True + else: + return False + return False + + def get_img_elem_key(self): + global extensions_id_conponents + keys_str = "" + #keys_str = ",".join(lf_config.Image_Components_Key) + try: + keys_str = ",".join(list(extensions_id_conponents["image"].keys())) + except: + pass + return keys_str + + def imgs_callback(self, img_data:imgs_callback_params): + global workflow_json + workflow_json[img_data.id] = img_data.img + + def refresh_ui(self): + global workflow_json, Output_Log + workflow_json = {} + Output_Log = "" + print("refresh_ui") + tag = launch.git_tag() + return tag + + def set_preload(self, 
params:file_params): + global Need_Preload,Preload_File + print(params.file_path) + res = "OK" + if(params.file_path): + if(os.path.exists(params.file_path)): + Preload_File = params.file_path + Need_Preload = True + res = "OK,Local File!" + else: + response = requests.get(params.file_path) + if(response.status_code == 200): + parsed_url = urlparse(params.file_path) + file_name = os.path.basename(parsed_url.path) + tempdir = os.path.join(tempfile.gettempdir(),"lightdiffusionflow_temp") + if(os.path.exists(tempdir)): + shutil.rmtree(tempdir) + if(not os.path.exists(tempdir)): + os.mkdir(tempdir) + temp_file = os.path.join(tempdir,file_name) + + with open(temp_file,"wb") as f: + f.write(response.content) + + print(temp_file) + Preload_File = temp_file + Need_Preload = True + res = "OK,Network File!" + else: + res = "Invalid File!" + + return res + + def need_preload(self): + global Need_Preload,Preload_File + if(Need_Preload): + Need_Preload = False + return Preload_File + return "" + + def saveFlowToLocal(self, data_to_save:savefile_params): + global Local_Flows_Path + global add_output_log + + overall_data = {} + print(data_to_save.file_name) + print(data_to_save.file_data) + filedata = data_to_save.file_data + + # python处理的图片和dropdown信息 + overall_data = json.loads(self.get_lightdiffusionflow_config(True)) + for key in filedata.keys(): + if(filedata[key] != ""): + overall_data[key] = filedata[key] + + # Lora信息 + params = config_params(config_data=overall_data) + lora_info = self.parse_lora_info(params) + for key in lora_info.keys(): + overall_data[key] = lora_info[key] + + # 过滤掉一些没用的默认值的信息 + overall_data = config_filter(overall_data) + + flow_path = os.path.join(Local_Flows_Path, data_to_save.file_name) + print(flow_path) + if(not os.path.exists(flow_path) or (data_to_save.overwrite)): + with open(flow_path,"w") as f: + #json.dump(overall_data,f) + f.write(json.dumps(overall_data, ensure_ascii=False, indent=4)) + add_output_log(OutputPrompt.save_completed()) + else: + add_output_log(OutputPrompt.save_failed()) + + return flow_path + +class Script(scripts.Script): + + def __init__(self) -> None: + super().__init__() + #记录无id元素的id + self.new_ids={} + + def title(self): + return "lightdiffusionflow plugin" + + def show(self, is_img2img): + return scripts.AlwaysVisible + + def after_component(self, component, **kwargs): + ''' + python触发导入事件,按正常触发逻辑先执行js代码,把除图片以外的参数全部设置好, + 然后回到python代码,读取图片保存到Webui_Comps_Cur_Val, + 再用json2js的onchange事件触发js来点击隐藏按钮开始触发设置图片的事件队列。 + 代码从on_after_component 移到script里的after_component: + txt2img和img2img面板以外的组件回调不再有效 + 但是可以读取self.is_img2img确定当前组件存在的面板 + ''' + # #当前模式 + # teb_mode="img2img" if self.is_img2img else "txt2img" + # #空id补全方法 + # def in_zone(dom,id=""): + # if dom: + # if dom.elem_id: + # id+="-"+dom.elem_id + # return in_zone(dom.parent,id) + # elif dom.elem_classes and dom.elem_classes[0]!='gradio-blocks': + # if dom.elem_classes[0]=='gradio-accordion': + # id+='-'+dom.label+"?" 
+ # return in_zone(dom.parent,id) + # if re.search('2img_(textual|hypernetworks|checkpoints|lora)_',id): + # return False + # else: + # id=re.sub(r'\?[^\?]+$|[ \?]','',id) + # if id in self.new_ids: + # self.new_ids[id]+=1 + # else: + # self.new_ids[id]=1 + # return id+'-'+str(self.new_ids[id]) + + # #记录组件 + # try: + # #拉取id + # id=component.elem_id + # #若没有就重构 + # if id==None: + # id=component.elem_id==in_zone(component.parent,teb_mode) + # Webui_Comps[id] = component + # except BaseException as e: + # pass + + global txt2img_script_container, img2img_script_container, extensions_id_conponents, extensions_conponents + + conponents_originlist.append((component, 'img2img' if self.is_img2img else 'txt2img')) + #print(f"after_component {component} {kwargs.get('elem_id', None)} {'img2img' if self.is_img2img else 'txt2img'} ") + try: + + if kwargs["elem_id"] == "txt2img_generation_info_button": # or kwargs["elem_id"] == "img2img_generation_info_button": + self.custom_ui() + + if(Webui_Comps.get(kwargs["elem_id"], None) == None): + Webui_Comps[kwargs["elem_id"]] = component + #print(kwargs["elem_id"]) + + except BaseException as e: + pass + + get_script_container(component) + + if (isinstance(component, gr.Button) and kwargs["elem_id"] == "img2img_generation_info_button"): # 加载到最后一个组件了。 兼容旧版,暂时不使用“img2img_preview_filename” + + searching_extensions_title() + #print(extensions_conponents) + + for comp_tuple in conponents_originlist: + comp = comp_tuple[0] + temp_parent = comp.parent + mode_tab = comp_tuple[1] + tab = None + tabs = None + ext_name = "" + + # --------------------------------------组件分类-------------------------------------------------- + while temp_parent: + try: + # tab 如果有多层只存最上层 + if(isinstance(temp_parent,gr.Tab)): + tab = temp_parent + if(isinstance(temp_parent,gr.Tabs)): + tabs = temp_parent + + temp_ext_name = get_extname_from_label(temp_parent.label) + if(extensions_conponents[mode_tab].get(temp_ext_name, None) != None): + ext_name = temp_ext_name + break + except BaseException as e: + pass + temp_parent = temp_parent.parent + + try: + if(tabs): + tab_index = 0 + for temp_tab in tabs.children: + if (tab == temp_tab): + break + tab_index+=1 + if(extensions_conponents[mode_tab][ext_name].get(str(tab_index), None) != None): + extensions_conponents[mode_tab][ext_name][str(tab_index)].append(comp) + else: + extensions_conponents[mode_tab][ext_name][str(tab_index)] = [comp] + else: + extensions_conponents[mode_tab][ext_name]["base"].append(comp) + except KeyError as e: + pass + + # print(extensions_conponents) # 整理好的第三方插件用到的组件 + # --------------------------------------组件分类-------------------------------------------------- + + if(self.is_img2img): + State_Comps["useless_Textbox"] = \ + gr.Textbox(value='useless_Textbox', elem_id='useless_Textbox', visible=False) + + params_create_ids() + + # img2img下的面板 + self.custom_ui() + + target_comps = [] + target_comps.append(State_Comps["json2js"]) # 触发事件传递json给js + #target_comps.append(State_Comps["outlog"][0]) + #target_comps.append(State_Comps["outlog"][1]) # 因为显示日志的窗口分txt2img和img2img两个位置 所以两个位置同步导出 + + for i in range(len(State_Comps["local_flows"])): + #State_Comps["local_flows"] + State_Comps["refresh"][i].click(refresh_local_flows, inputs=State_Comps["local_flows"],outputs=State_Comps["local_flows"]) + State_Comps["apply"][i].click(apply_local_flow, inputs=[State_Comps["local_flows"][i]],outputs=[]) + State_Comps["save"][i].click(fn=None,_js="state.core.actions.saveFlowToLocal", inputs=[],outputs=[]) + 
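# For reference, the _js handler bound above is expected to gather the current UI values
# and post them to the save_flow_to_local route registered by StateApi; a hand-rolled
# request would look roughly like this (placeholder values, assumed default port):
#   requests.post("http://127.0.0.1:7860/lightdiffusionflow/local/save_flow_to_local",
#                 json={"file_name": "example.flow", "file_data": {}, "overwrite": False})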
#State_Comps["delete"][i].click(delete_local_flow, inputs=[State_Comps["local_flows"][i]],outputs=State_Comps["local_flows"]) + + + for btn in State_Comps["export"]: + btn.click(None,_js="state.core.actions.exportState") #, inputs=[],outputs=[] + + for btn in State_Comps["import"]: + # js里加载除图片以外的参数 python加载图片 + btn.upload(fn_import_workflow, _js=f"state.core.actions.handleLightDiffusionFlow", + inputs=[btn],outputs=target_comps, show_progress=False) + + State_Comps["json2js"].change(fn=None,_js="state.core.actions.startImportImage", + inputs=[State_Comps["json2js"]]) + + State_Comps["test_button"].click(test_func,_js="state.utils.testFunction",inputs=[]) + + State_Comps["refresh_log"].click(add_output_log,inputs=[],outputs=[State_Comps["outlog"][0], State_Comps["outlog"][1]]) + + on_dropdown_change_inputs = list(extensions_id_conponents["dropdown"].values()) + for comp_to_bind in extensions_id_conponents["dropdown"].keys(): + extensions_id_conponents["dropdown"][comp_to_bind].change(on_dropdown_changed,inputs=on_dropdown_change_inputs,outputs=[]) + + # change事件,每次上传一张图会触发change时间几十次 + # clear事件,删除图片的时候不触发,官方bug好像新版本有解决。 + # edit事件,每次编辑都能触发一次,上传图片的时候也能触发。 + # 没有合适的清除图片的事件,无法使用python监控图片组件 + # on_img_change_inputs = list(extensions_id_conponents["image"].values()) + # for comp_to_bind in extensions_id_conponents["image"].keys(): + # extensions_id_conponents["image"][comp_to_bind].edit(on_img_changed,inputs=on_img_change_inputs,outputs=[]) + # extensions_id_conponents["image"][comp_to_bind].clear(on_img_changed,inputs=on_img_change_inputs,outputs=[]) + + temp_dropdown_outputs = list(extensions_id_conponents["dropdown"].values()) + temp_dropdown_outputs.append(State_Comps["json2js"]) # json2js触发完成事件 + temp_dropdown_outputs.append(State_Comps["outlog"][0]) # 输出日志 + temp_dropdown_outputs.append(State_Comps["outlog"][1]) # 输出日志 + State_Comps["set_dropdowns"].click(set_dropdowns,inputs=[],outputs=temp_dropdown_outputs,show_progress=False) + + State_Comps["set_js_params"].click(set_js_params,inputs=[],outputs=State_Comps["json2js"]) + + input_component = State_Comps["background_import"] #State_Comps["import"][0] + State_Comps["set_file_button"].click(set_lightdiffusionflow_file,inputs=[],outputs=[input_component]) + State_Comps["preload_button"].click(fn_import_workflow, _js=f"state.core.actions.handleLightDiffusionFlow", + inputs=[input_component],outputs=target_comps) + + for key in invisible_buttons.keys(): + segs = key.split("_") + comp_name = "_".join(segs[2:]) + #print(comp_name) + try: + invisible_buttons[key].click(func_for_invisiblebutton, + inputs=[], + outputs=[ + #Webui_Comps[comp_name], + extensions_id_conponents["image"][comp_name], + State_Comps["json2js"], + State_Comps["outlog"][0], + State_Comps["outlog"][1] + ]) + except KeyError: + print(f"No such component: {comp_name}") + + print("LightDiffusionFlow 绑定完成") + + def ui(self, is_img2img): + pass + + def custom_ui(self): + global File_extension, extensions_id_conponents + try: + State_Comps["import"] + State_Comps["export"] + State_Comps["outlog"] + State_Comps["local_flows"] + State_Comps["apply"] + State_Comps["delete"] + State_Comps["save"] + State_Comps["refresh"] + except: + State_Comps["import"] = [] + State_Comps["export"] = [] + State_Comps["outlog"] = [] + State_Comps["local_flows"] = [] + State_Comps["apply"] = [] + State_Comps["delete"] = [] + State_Comps["save"] = [] + State_Comps["refresh"] = [] + + + cur_mode = "img2img" if self.is_img2img else "txt2img" + save_mode = " (for all extensions)" + if(Flow_Save_mode == 
"Core"): + save_mode = " (only ControlNet)" + + global local_flow_list,Local_Flows_Path + try: + #flows_path = os.path.join(data_path, Local_Flows_Path) + local_flow_list = [f for f in os.listdir(Local_Flows_Path) if os.path.isfile( + os.path.join(Local_Flows_Path, f)) and os.path.splitext(f)[-1] == '.flow'] + except: + local_flow_list = [] + + with gr.Accordion('LightDiffusionFlow '+lightdiffusionflow_version.lightdiffusionflow_version + save_mode, open=True, visible=True, elem_id=cur_mode+'_lightdiffusionflow'): + + with gr.Row(): + State_Comps["local_flows"].append(gr.Dropdown(label="", show_label=False , multiselect=False, choices=local_flow_list,value='',elem_id=cur_mode+'_ldf_local_flows')) + State_Comps["apply"].append(ui_components.ToolButton(value=paste_symbol,elem_id=cur_mode+'_ldf_apply')) + #State_Comps["delete"].append(ui_components.ToolButton(value=clear_prompt_symbol,elem_id=cur_mode+'_ldf_delete')) + State_Comps["save"].append(ui_components.ToolButton(value=save_style_symbol,elem_id=cur_mode+'_ldf_save')) + State_Comps["refresh"].append(ui_components.ToolButton(value=refresh_symbol,elem_id=cur_mode+'_ldf_refresh')) + + with gr.Row(): + gr.HTML(label="",value="
") + + with gr.Row(): + lightdiffusionflow_file = gr.File(label="LightDiffusionFlow File",file_count="single", file_types=[File_extension], elem_id=cur_mode+'_ldf_import') + State_Comps["import"].append(lightdiffusionflow_file) + + with gr.Column(scale=1): + gr.HTML(label="",value=''' + 开源社区/open-source community''') + State_Comps["outlog"].append(gr.HTML(label="Output Log",elem_id=cur_mode+'_ldf_outlog',value=''' +

Welcome to LightDiffusionFlow! \(^o^)/~

+

Welcome to LightDiffusionFlow! \(^o^)/~

+

Welcome to LightDiffusionFlow! \(^o^)/~

''')) + + with gr.Row(): + export_config = gr.Button(value='导出/Export',elem_id=cur_mode+'_ldf_export') + State_Comps["export"].append(export_config) + + if(self.is_img2img): + State_Comps["background_import"] = gr.File(label="LightDiffusionFlow File",file_count="single", + file_types=[File_extension],visible=False) + State_Comps["json2js"] = gr.Textbox(label="json2js",visible=False) + State_Comps["test_button"] = gr.Button(value='测试',elem_id='test_button',visible=True) + State_Comps["refresh_log"] = gr.Button(value='刷新日志',elem_id='img2img_invisible_refresh_log',visible=False) + State_Comps["set_dropdowns"] = gr.Button(value='设置部分参数',elem_id='lightdiffusionflow_set_dropdowns',visible=False) + State_Comps["set_js_params"] = gr.Button(value='设置剩下的js参数',elem_id='lightdiffusionflow_set_js_params',visible=False) + State_Comps["set_file_button"] = gr.Button(value='设置文件',elem_id='set_lightdiffusionflow_file',visible=False) + State_Comps["preload_button"] = gr.Button(value='预加载',elem_id='preload_button',visible=False) + + with gr.Row(): + #State_Comps["useless_Textbox"] = \ + # gr.Textbox(value='useless_Textbox', elem_id='useless_Textbox', visible=False) + + #for key in lf_config.Image_Components_Key: + for key in extensions_id_conponents["image"].keys(): + #print(key) + elem_id = ("img2img_" if self.is_img2img else "txt2img_") + "invisible_" + key + invisible_buttons[elem_id] = gr.Button(value=elem_id, elem_id=elem_id, visible=False) + + + +def on_after_component(component, **kwargs): + global Webui_Comps + + # img2img和txt2img面板以外的组件信息只能在这里获取 + # 如果希望python里控制这些组件,还需要把部分代码移到这里实现 + try: + if(Webui_Comps.get(kwargs["elem_id"], None) == None): + Webui_Comps[kwargs["elem_id"]] = component + #if( kwargs["elem_id"] == "setting_sd_model_checkpoint"): + # print("--------------setting_sd_model_checkpoint---------------") + except KeyError: + pass + + +def on_before_reload(): + lightdiffusionflow_config.init() + +# add callbacks +api = StateApi() +script_callbacks.on_app_started(api.start) +script_callbacks.on_after_component(on_after_component) +script_callbacks.on_before_reload(on_before_reload) diff --git a/gf/LightDiffusionFlow/scripts/state_settings.py b/gf/LightDiffusionFlow/scripts/state_settings.py new file mode 100644 index 0000000000000000000000000000000000000000..3ca8bfdb3e8b53db9965a21f9ea0d7481318acbb --- /dev/null +++ b/gf/LightDiffusionFlow/scripts/state_settings.py @@ -0,0 +1,52 @@ +import gradio as gr +import modules.shared as shared +from modules import scripts +from scripts import lightdiffusionflow_config +#from scripts.lightdiffusionflow_config import OutputPrompt +OutputPrompt = lightdiffusionflow_config.OutputPrompt + +def on_ui_settings(): + + section = ("lightdiffusionflow", "Light Diffusion Flow") + + # "sd_lora": shared.OptionInfo("None", "Add network to prompt", gr.Dropdown, + # lambda: {"choices": ["None", *networks.available_networks]}, refresh=networks.list_available_networks), + shared.opts.add_option("lightdiffusionflow-language", + shared.OptionInfo( + "default", + "显示语言/Language", + gr.Dropdown, + lambda: { + "choices": ["default","中文","english"] + }, section=section) + ) + + shared.opts.add_option("lightdiffusionflow-mode", + shared.OptionInfo( + "All", + f"模式/Mode: ({OutputPrompt.note_for_save_mode()})", + gr.Dropdown, + lambda: { + "choices": ["Core","All"] + }, section=section) + ) + + shared.opts.add_option("lightdiffusionflow-local-flows-path", + shared.OptionInfo( + default="models/LightDiffusionFlow", + label=f"本地保存的flow文件路径/Local path to save flow files", + 
#component=gr.TextBox, + section=section) + ) + + shared.opts.add_option("lightdiffusionflow-auto-fix-params", + shared.OptionInfo( + True, + f"自动纠正错误的参数/Automatically fix incorrect parameters", + gr.Checkbox, + {"interactive": True}, + section=section) + ) + + +scripts.script_callbacks.on_ui_settings(on_ui_settings) diff --git a/gf/LightDiffusionFlow/style.css b/gf/LightDiffusionFlow/style.css new file mode 100644 index 0000000000000000000000000000000000000000..9eaccc58ea4fb3ea83ecf5ba838c8ded40c6864f --- /dev/null +++ b/gf/LightDiffusionFlow/style.css @@ -0,0 +1,20 @@ +#settings_state_buttons button, +#settings_state_buttons input[type="file"] { + color: white; + background: rgb(249, 115, 22); + border-radius: 8px; + padding: 5px 15px; + font-weight: bold; + font-size: 16px; + margin-left: 5px; + margin-right: 5px; +} + +#settings_state_buttons input[type="file"] { + padding: 3px 5px; +} + +#settings_state hr { + margin-top: 32px; + margin-bottom: 32px; +} \ No newline at end of file diff --git a/gf/a1111-stable-diffusion-webui-vram-estimator/.gitignore b/gf/a1111-stable-diffusion-webui-vram-estimator/.gitignore new file mode 100644 index 0000000000000000000000000000000000000000..349612d6a598727ca38610e6f9f565e94c34258b --- /dev/null +++ b/gf/a1111-stable-diffusion-webui-vram-estimator/.gitignore @@ -0,0 +1,6 @@ +.DS_Store +.idea +*.log +tmp/ + +stats.json diff --git a/gf/a1111-stable-diffusion-webui-vram-estimator/LICENSE b/gf/a1111-stable-diffusion-webui-vram-estimator/LICENSE new file mode 100644 index 0000000000000000000000000000000000000000..3edb4ee93d82d43e8f29deb4c934208b63c96153 --- /dev/null +++ b/gf/a1111-stable-diffusion-webui-vram-estimator/LICENSE @@ -0,0 +1,9 @@ +MIT License + +Copyright 2023 space-nuko + +Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the “Software”), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. diff --git a/gf/a1111-stable-diffusion-webui-vram-estimator/README.md b/gf/a1111-stable-diffusion-webui-vram-estimator/README.md new file mode 100644 index 0000000000000000000000000000000000000000..72484ffb6ecb7ba060f0bd4ea98db5fc9a338691 --- /dev/null +++ b/gf/a1111-stable-diffusion-webui-vram-estimator/README.md @@ -0,0 +1,13 @@ +# VRAM Estimator for stable-diffusion-webui + +When using stable-diffusion-webui, if you find yourself frequently running out of VRAM or worried that pushing your settings too far will break your webui, this extension might be of use. It gathers a set of statistics based on running `txt2img` and `img2img` with various different settings and uses extrapolation to estimate the amount of VRAM your settings will use. 
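
To make that idea concrete, here is a minimal, self-contained sketch of how benchmark samples can be turned into an estimate (illustrative only: the sample numbers and the `estimate_vram` helper are invented for this example, not taken from the extension). The approach mirrors the one used here: fit a curve of peak reserved VRAM against image size for each batch size, then evaluate that curve for new settings.

```python
import math
import numpy as np

# Hypothetical benchmark samples: (pixels, batch_size, reserved_peak_mb).
samples = [
    (512 * 512, 1, 2300), (640 * 640, 1, 3100), (768 * 768, 1, 4200), (896 * 896, 1, 5600),
    (512 * 512, 2, 3900), (640 * 640, 2, 5400), (768 * 768, 2, 7300), (896 * 896, 2, 9800),
]

# Fit one cubic of reserved VRAM vs. sqrt(pixels) per batch size.
fits = {}
for batch in sorted({b for _, b, _ in samples}):
    xs = [math.sqrt(p) for p, b, _ in samples if b == batch]
    ys = [v for _, b, v in samples if b == batch]
    fits[batch] = np.poly1d(np.polyfit(xs, ys, 3))

def estimate_vram(width: int, height: int, batch_size: int) -> float:
    """Extrapolate peak reserved VRAM (MB) for a given resolution and batch size."""
    return float(fits[batch_size](math.sqrt(width * height)))

print(f"~{estimate_vram(832, 832, 2):.0f} MB reserved for 832x832 at batch size 2")
```

The extension's own `vram_estimator.py` (added below) additionally fits a cubic surface over (size, batch size) with `scipy.linalg.lstsq`, so it can fall back to that surface for batch sizes that were never benchmarked.
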
+ +![Preview](./static/preview.png) + +## Usage + +First you will need to gather some data from running a bunch of image generations for the extension to use (you'll only have to do this step once). Go to the `VRAM Estimator` tab and set the `Max Image Size` and `Max Batch Count` parameters to the maximum that your system can handle when generating with `txt2img` and Hires Fix enabled (Hires Fix uses more VRAM than plain `txt2img` even when the target resolution is the same). The default settings of size 960 and 8 batch count are appropriate for an RTX 3090. Make sure you've closed any other programs that can take up VRAM, then click `Run Benchmark` and watch the console for progress. The benchmark will generate several images at different resolutions with `txt2img` and `img2img` and save the VRAM used by each run. + +After the benchmark has finished, you can go to the `txt2img` or `img2img` tab and adjust the sliders for Width/Height/Hires Fix to see the estimated VRAM usage. Note that the estimate is based on the amount of used VRAM on your system at the time you change the slider. As a rule of thumb you should expect about 1 GB more of VRAM to actually be used by the generating operation you choose. + +Also note that this extension does not take into account VRAM allocated by other extensions like ControlNet; it only looks at the width, height and batch count of the vanilla generation config. If you update webui, it's worth running the benchmark again if you find the estimates are no longer accurate. diff --git a/gf/a1111-stable-diffusion-webui-vram-estimator/scripts/__pycache__/vram_estimator.cpython-310.pyc b/gf/a1111-stable-diffusion-webui-vram-estimator/scripts/__pycache__/vram_estimator.cpython-310.pyc new file mode 100644 index 0000000000000000000000000000000000000000..8b9cd14d816f2b2caeb51132599f09d01c0834b0 Binary files /dev/null and b/gf/a1111-stable-diffusion-webui-vram-estimator/scripts/__pycache__/vram_estimator.cpython-310.pyc differ diff --git a/gf/a1111-stable-diffusion-webui-vram-estimator/scripts/vram_estimator.py b/gf/a1111-stable-diffusion-webui-vram-estimator/scripts/vram_estimator.py new file mode 100644 index 0000000000000000000000000000000000000000..7d0d94c1c4790c000bc92ae398fcff69f214e8b7 --- /dev/null +++ b/gf/a1111-stable-diffusion-webui-vram-estimator/scripts/vram_estimator.py @@ -0,0 +1,405 @@ +import time +import logging +import socket +import json +import math +import os.path +import torch.cuda +import numpy as np +import scipy.linalg +import scipy.optimize +from collections import defaultdict +from hashlib import sha256 +from PIL import Image +from logging.handlers import SysLogHandler +from modules import shared, script_callbacks, devices, scripts +from modules.processing import StableDiffusionProcessingTxt2Img, StableDiffusionProcessingImg2Img, Processed, process_images +import gradio as gr +import pandas as pd + + +curves = {} +stats_file = os.path.join(scripts.basedir(), "stats.json") + + +class VRAMCurve(): + """3d plane where x=pixels, y=batch_size, z=reserved_peak""" + def __init__(self, data): + arr = [] + batch_sizes_x = defaultdict(list) + batch_sizes_y = defaultdict(list) + for entry in data: + size = math.sqrt(entry["pixels"]) + batch_size = entry["batch_size"] + reserved = entry["reserved_peak"] + + arr.append([size, batch_size, reserved]) + batch_sizes_x[batch_size].append(size) + batch_sizes_y[batch_size].append(reserved) + + data = np.array(arr) + + # VRAM usage seems to scale linearly within one batch size, but the rate + # of increase 
differs between batch sizes with no real pattern + self.batch_sizes = {} + for k in batch_sizes_x.keys(): + z = np.polyfit(batch_sizes_x[k], batch_sizes_y[k], 3) + f = np.poly1d(z) + self.batch_sizes[k] = f + + # best-fit cubic curve + # M = [ones(size(x)), x, y, x.^2, x.*y, y.^2, x.^3, x.^2.*y, x.*y.^2, y.^3] + A = np.c_[np.ones(data.shape[0]), data[:,:2], data[:,0]**2, np.prod(data[:,:2], axis=1), \ + data[:,1]**2, data[:,0]**3, np.prod(np.c_[data[:,0]**2,data[:,1]],axis=1), \ + np.prod(np.c_[data[:,0],data[:,1]**2],axis=1), data[:,1]**3] + self.C,_,_,_ = scipy.linalg.lstsq(A, data[:,2]) + + def estimate(self, pixels, batch_size): + x = math.sqrt(pixels) + y = batch_size + + if batch_size in self.batch_sizes: + return self.batch_sizes[batch_size](x) + + return np.dot([1, x, y, + x ** 2, x * y, y ** 2, + x ** 3, x ** 2 * y, x * y ** 2, y ** 3], self.C) + + +def load_curve(): + global curves + if not os.path.isfile(stats_file): + print("[VRAMEstimator] No stats available, run benchmark first") + return None, None + + with open(stats_file, "r", encoding="utf-8") as f: + data = json.load(f) + + if "txt2img" not in data or "txt2img_hr" not in data or "txt2img_hr_latent" not in data or "img2img" not in data: + print("[VRAMEstimator] No stats available, run benchmark first") + return None, None + + curves = {} + for k, v in data.items(): + curves[k] = VRAMCurve(v) + + print("[VRAMEstimator] Loaded benchmark data.") + return make_plots(data) + + +def get_memory_stats(): + devices.torch_gc() + torch.cuda.reset_peak_memory_stats() + shared.mem_mon.monitor() + return {k: -(v//-(1024*1024)) for k, v in shared.mem_mon.stop().items()} + + +def run_benchmark(max_width, max_batch_count): + global curves + DEFAULT_ARGS = { + 'sd_model': shared.sd_model, + 'prompt': 'postapocalyptic steampunk city, exploration, cinematic, realistic, hyper detailed, photorealistic maximum detail, volumetric light, (((focus))), wide-angle, (((brightly lit))), (((vegetation))), lightning, vines, destruction, devastation, wartorn, ruins', + 'sampler_name': 'Euler a', + 'batch_size': 1, + 'n_iter': 1, + 'steps': 1, + 'cfg_scale': 15.0, + 'width': 512, + 'height': 512, + 'restore_faces': False, + 'tiling': False, + 'do_not_save_samples': True, + 'do_not_save_grid': True, + 'negative_prompt': '(((blurry))), ((foggy)), (((dark))), ((monochrome)), sun, (((depth of field)))', + 'do_not_reload_embeddings': True + } + results = {} + + print("[VRAMEstimator] Starting benchmark...") + mem_stats = get_memory_stats() + base_active = mem_stats["active_peak"] + base_reserved = mem_stats["reserved_peak"] + base_used = mem_stats["total"] - mem_stats["free"] + + shared.state.begin() + shared.state.job = "VRAM Estimator Benchmark" + shared.state.job_count = max_batch_count * int((max_width - 256) / 64) + + for op in ["txt2img", "txt2img_hr", "txt2img_hr_latent", "img2img"]: + results[op] = [] + for b in range(1, max_batch_count+1): + for i in range(256, max_width+64, 64): + #More verbose and informative progress strings + print(f"run benchmark:\nOperation: {op} \nBatch Count: {b}\nWidth: {i}") + devices.torch_gc() + torch.cuda.reset_peak_memory_stats() + shared.mem_mon.monitor() + shared.state.begin() + + args = DEFAULT_ARGS.copy() + args["batch_size"] = b + + if op == "txt2img": + args["width"] = i + args["height"] = i + p = StableDiffusionProcessingTxt2Img(**args) + elif op.startswith("txt2img_hr"): + args["width"] = int(i / 2) + args["height"] = int(i / 2) + args["enable_hr"] = True + args["hr_second_pass_steps"] = 1 + args["hr_resize_x"] = 
i + args["hr_resize_y"] = i + if op == "txt2img_hr_latent": + args["hr_upscaler"] = "Latent" + else: + args["hr_upscaler"] = "Lanczos" + p = StableDiffusionProcessingTxt2Img(**args) + elif op == "img2img": + args["width"] = i + args["height"] = i + args["init_images"] = [Image.new("RGB", (512, 512))] + p = StableDiffusionProcessingImg2Img(**args) + else: + print(f'unknown operation: {op}') + return 'error' + + t0 = time.time() + try: + process_images(p) + except Exception as e: + print(f'benchmark error: {e}') + shared.state.end() + break + + t1 = time.time() + shared.state.end() + + its = args["steps"] * args["batch_size"] / (t1 - t0) + + mem_stats = {k: -(v//-(1024*1024)) for k, v in shared.mem_mon.stop().items()} + results[op].append({ + "width": i, + "pixels": i * i, + "batch_size": args["batch_size"], + "active_peak": mem_stats["active_peak"] - base_active, + "reserved_peak": mem_stats["reserved_peak"] - base_reserved, + "sys_peak": mem_stats["system_peak"], + "sys_total": mem_stats["total"], + "its": round(its, 2) + }) + + with open(stats_file, "w", encoding="utf-8") as f: + json.dump(results, f) + + del p + del args + + curves = {} + for k, v in results.items(): + curves[k] = VRAMCurve(v) + + shared.state.end() + print("[VRAMEstimator] Benchmark finished.") + + results = make_plots(results) + results.append("Benchmark finished.") + return results + + +def make_plots(results): + dfs = [] + + for op in ["txt2img", "txt2img_hr", "txt2img_hr_latent", "img2img"]: + x = [] + y = [] + z = [] + + for result in results[op]: + x.append(result["pixels"]) + y.append(result["reserved_peak"]) + z.append(result["batch_size"]) + + df = pd.DataFrame({"pixels": x, "reserved_peak": y, "batch_size": z}) + dfs.append(df) + + return dfs + + +def on_ui_tabs(): + load_curve() + + with gr.Blocks(analytics_enabled=False) as vram_estimator_tab: + with gr.Row(): + plot = gr.LinePlot(title="txt2img Reserved VRAM", x="pixels", y="reserved_peak", color="batch_size", width=400, height=400, tooltip=["pixels", "reserved_peak", "batch_size"]) + plot2 = gr.LinePlot(title="txt2img Highres Fix Reserved VRAM", x="pixels", y="reserved_peak", color="batch_size", width=400, height=400, tooltip=["pixels", "reserved_peak", "batch_size"]) + with gr.Row(): + plot3 = gr.LinePlot(title="txt2img Highres Fix (Latent) Reserved VRAM", x="pixels", y="reserved_peak", color="batch_size", width=400, height=400, tooltip=["pixels", "reserved_peak", "batch_size"]) + plot4 = gr.LinePlot(title="img2img Reserved VRAM", x="pixels", y="reserved_peak", color="batch_size", width=400, height=400, tooltip=["pixels", "reserved_peak", "batch_size"]) + with gr.Row(): + width = gr.Slider(minimum=256, maximum=2048, step=64, label="Max Image Size", value=1024) + batch_count = gr.Slider(minimum=1, maximum=16, step=1, label="Max Batch Count", value=8) + with gr.Column(): + bench_run_btn = gr.Button("Run benchmark", variant="primary").style(full_width=False) + load_results_button = gr.Button("Load results").style(full_width=False) + with gr.Row(): + status = gr.HTML("") + + bench_run_btn.click(run_benchmark, inputs=[width, batch_count], outputs=[plot, plot2, plot3, plot4, status]) + load_results_button.click(load_curve, inputs=[], outputs=[plot, plot2, plot3, plot4]) + + return [(vram_estimator_tab, "VRAM Estimator", "vram_estimator_tab")] + + +heatmap = [ + [-1.0, (0.2, 0.2, 1.0)], + [ 0.0, (0.2, 0.2, 1.0)], + [ 0.2, (0.0, 1.0, 1.0)], + [ 0.4, (0.0, 1.0, 0.0)], + [ 0.6, (1.0, 1.0, 0.0)], + [ 0.8, (1.0, 0.0, 0.0)], + [ 1.0, (1.0, 0.0, 0.8)], + [ 2.0, (1.0, 
0.0, 0.8)], +] + + +def gaussian(x, a, b, c, d=0): + return a * math.exp(-(x - b)**2 / (2 * c**2)) + d + +def get_color(x, map=[], spread=1): + x = min(1.0, max(0, x)) + r = min(1.0, sum([gaussian(x, p[1][0], p[0], 1.0 / (spread * len(map))) for p in map])) + g = min(1.0, sum([gaussian(x, p[1][1], p[0], 1.0 / (spread * len(map))) for p in map])) + b = min(1.0, sum([gaussian(x, p[1][2], p[0], 1.0 / (spread * len(map))) for p in map])) + return 'rgb(%.2f, %.2f, %.2f)' % (r * 255, g * 255, b * 255) + + +def make_span(reserved_vram_estimate): + if shared.state.job_count != 0: + return '
(Currently generating...)
' + + mem_stats = get_memory_stats() + + # VRAM estimate is taken after used system VRAM is subtracted + base_rest = mem_stats["total"] - mem_stats["free"] + estimate_total = base_rest + reserved_vram_estimate * 1.1 # buffer + + formatted_est = f"{round(estimate_total, 2):.2f} MB" + formatted_total = f"{mem_stats['total']} MB" + percent_usage = estimate_total / mem_stats['total'] + color = get_color(percent_usage, map=heatmap, spread=1.5) + + return f''' +
+
Estimated VRAM usage: {formatted_est} / {formatted_total} ({percent_usage * 100:.2f}%)
+
({base_rest} MB system + {reserved_vram_estimate:.2f} MB used)
+
+ ''' + + +def estimate_vram_txt2img(width, height, batch_size, enable_hr, hr_upscaler, hr_scale, hr_resize_x, hr_resize_y): + global curves + if "txt2img" not in curves or "txt2img_hr" not in curves: + return "(No stats yet, run benchmark in VRAM Estimator tab)" + + final_width = width + final_height = height + + if enable_hr: + if hr_resize_x == 0 and hr_resize_y == 0: + final_width = width * hr_scale + final_height = height * hr_scale + else: + if hr_resize_y == 0: + final_width = hr_resize_x + final_height = hr_resize_x * height // width + elif hr_resize_x == 0: + final_width = hr_resize_y * width // height + final_height = hr_resize_y + else: + src_ratio = width / height + dst_ratio = hr_resize_x / hr_resize_y + + if src_ratio < dst_ratio: + final_width = hr_resize_x + final_height = hr_resize_x * height // width + else: + final_width = hr_resize_y * width // height + final_height = hr_resize_y + + if hr_upscaler in shared.latent_upscale_modes: + upscaler_curve = "txt2img_hr_latent" + else: + upscaler_curve = "txt2img_hr" + print(hr_upscaler) + print(upscaler_curve) + + vram_estimate_normal = curves["txt2img"].estimate(final_width * final_height, batch_size) + vram_estimate_hr = curves[upscaler_curve].estimate(final_width * final_height, batch_size) + vram_estimate = max(vram_estimate_normal, vram_estimate_hr) + else: + vram_estimate = curves["txt2img"].estimate(final_width * final_height, batch_size) + + return make_span(vram_estimate) + + +def estimate_vram_img2img(width, height, batch_size): + global curves + if "img2img" not in curves: + return "(No stats yet, run benchmark in VRAM Estimator tab)" + + vram_estimate = curves["img2img"].estimate(width * height, batch_size) + return make_span(vram_estimate) + + +class Script(scripts.Script): + def title(self): + return "VRAM Usage Estimator" + + def show(self, is_img2img): + return scripts.AlwaysVisible + + def ui(self, is_img2img): + with gr.Group(): + with gr.Row(): + vram_status = gr.HTML() + + #Some extensions co-exist with vram estimator by throwing an error on load but nothing is wrong + try: + if is_img2img: + inputs = [self.i2i_width, self.i2i_height, self.i2i_batch_size] + fn = estimate_vram_img2img + else: + inputs = [self.t2i_width, self.t2i_height, self.t2i_batch_size, self.t2i_enable_hr, self.t2i_hr_upscaler, self.t2i_hr_scale, self.t2i_hr_resize_x, self.t2i_hr_resize_y] + fn = estimate_vram_txt2img + + + for input in inputs: + input.change( + fn=fn, + inputs=inputs, + outputs=[vram_status], + show_progress=False, + ) + #So if it thinks something is wrong, we'll just tell it to carry on + except Exception as e: + return [vram_status] + return [vram_status] + + def after_component(self, component, **kwargs): + elem_id = kwargs.get("elem_id") + + if elem_id == "txt2img_width": self.t2i_width = component + elif elem_id == "txt2img_height": self.t2i_height = component + elif elem_id == "txt2img_batch_size": self.t2i_batch_size = component + elif elem_id == "txt2img_enable_hr": self.t2i_enable_hr = component + elif elem_id == "txt2img_hr_upscaler": self.t2i_hr_upscaler = component + elif elem_id == "txt2img_hr_scale": self.t2i_hr_scale = component + elif elem_id == "txt2img_hr_resize_x": self.t2i_hr_resize_x = component + elif elem_id == "txt2img_hr_resize_y": self.t2i_hr_resize_y = component + elif elem_id == "img2img_width": self.i2i_width = component + elif elem_id == "img2img_height": self.i2i_height = component + elif elem_id == "img2img_batch_size": self.i2i_batch_size = component + + 
+script_callbacks.on_ui_tabs(on_ui_tabs) diff --git a/gf/a1111-stable-diffusion-webui-vram-estimator/static/preview.png b/gf/a1111-stable-diffusion-webui-vram-estimator/static/preview.png new file mode 100644 index 0000000000000000000000000000000000000000..fe4736551be599ecf5dd4c806e442dfa9a4c44eb Binary files /dev/null and b/gf/a1111-stable-diffusion-webui-vram-estimator/static/preview.png differ diff --git a/gf/asymmetric-tiling-sd-webui/.gitignore b/gf/asymmetric-tiling-sd-webui/.gitignore new file mode 100644 index 0000000000000000000000000000000000000000..7e99e367f8443d86e5e8825b9fda39dfbb39630d --- /dev/null +++ b/gf/asymmetric-tiling-sd-webui/.gitignore @@ -0,0 +1 @@ +*.pyc \ No newline at end of file diff --git a/gf/asymmetric-tiling-sd-webui/LICENSE b/gf/asymmetric-tiling-sd-webui/LICENSE new file mode 100644 index 0000000000000000000000000000000000000000..0e259d42c996742e9e3cba14c677129b2c1b6311 --- /dev/null +++ b/gf/asymmetric-tiling-sd-webui/LICENSE @@ -0,0 +1,121 @@ +Creative Commons Legal Code + +CC0 1.0 Universal + + CREATIVE COMMONS CORPORATION IS NOT A LAW FIRM AND DOES NOT PROVIDE + LEGAL SERVICES. DISTRIBUTION OF THIS DOCUMENT DOES NOT CREATE AN + ATTORNEY-CLIENT RELATIONSHIP. CREATIVE COMMONS PROVIDES THIS + INFORMATION ON AN "AS-IS" BASIS. CREATIVE COMMONS MAKES NO WARRANTIES + REGARDING THE USE OF THIS DOCUMENT OR THE INFORMATION OR WORKS + PROVIDED HEREUNDER, AND DISCLAIMS LIABILITY FOR DAMAGES RESULTING FROM + THE USE OF THIS DOCUMENT OR THE INFORMATION OR WORKS PROVIDED + HEREUNDER. + +Statement of Purpose + +The laws of most jurisdictions throughout the world automatically confer +exclusive Copyright and Related Rights (defined below) upon the creator +and subsequent owner(s) (each and all, an "owner") of an original work of +authorship and/or a database (each, a "Work"). + +Certain owners wish to permanently relinquish those rights to a Work for +the purpose of contributing to a commons of creative, cultural and +scientific works ("Commons") that the public can reliably and without fear +of later claims of infringement build upon, modify, incorporate in other +works, reuse and redistribute as freely as possible in any form whatsoever +and for any purposes, including without limitation commercial purposes. +These owners may contribute to the Commons to promote the ideal of a free +culture and the further production of creative, cultural and scientific +works, or to gain reputation or greater distribution for their Work in +part through the use and efforts of others. + +For these and/or other purposes and motivations, and without any +expectation of additional consideration or compensation, the person +associating CC0 with a Work (the "Affirmer"), to the extent that he or she +is an owner of Copyright and Related Rights in the Work, voluntarily +elects to apply CC0 to the Work and publicly distribute the Work under its +terms, with knowledge of his or her Copyright and Related Rights in the +Work and the meaning and intended legal effect of CC0 on those rights. + +1. Copyright and Related Rights. A Work made available under CC0 may be +protected by copyright and related or neighboring rights ("Copyright and +Related Rights"). Copyright and Related Rights include, but are not +limited to, the following: + + i. the right to reproduce, adapt, distribute, perform, display, + communicate, and translate a Work; + ii. moral rights retained by the original author(s) and/or performer(s); +iii. 
publicity and privacy rights pertaining to a person's image or + likeness depicted in a Work; + iv. rights protecting against unfair competition in regards to a Work, + subject to the limitations in paragraph 4(a), below; + v. rights protecting the extraction, dissemination, use and reuse of data + in a Work; + vi. database rights (such as those arising under Directive 96/9/EC of the + European Parliament and of the Council of 11 March 1996 on the legal + protection of databases, and under any national implementation + thereof, including any amended or successor version of such + directive); and +vii. other similar, equivalent or corresponding rights throughout the + world based on applicable law or treaty, and any national + implementations thereof. + +2. Waiver. To the greatest extent permitted by, but not in contravention +of, applicable law, Affirmer hereby overtly, fully, permanently, +irrevocably and unconditionally waives, abandons, and surrenders all of +Affirmer's Copyright and Related Rights and associated claims and causes +of action, whether now known or unknown (including existing as well as +future claims and causes of action), in the Work (i) in all territories +worldwide, (ii) for the maximum duration provided by applicable law or +treaty (including future time extensions), (iii) in any current or future +medium and for any number of copies, and (iv) for any purpose whatsoever, +including without limitation commercial, advertising or promotional +purposes (the "Waiver"). Affirmer makes the Waiver for the benefit of each +member of the public at large and to the detriment of Affirmer's heirs and +successors, fully intending that such Waiver shall not be subject to +revocation, rescission, cancellation, termination, or any other legal or +equitable action to disrupt the quiet enjoyment of the Work by the public +as contemplated by Affirmer's express Statement of Purpose. + +3. Public License Fallback. Should any part of the Waiver for any reason +be judged legally invalid or ineffective under applicable law, then the +Waiver shall be preserved to the maximum extent permitted taking into +account Affirmer's express Statement of Purpose. In addition, to the +extent the Waiver is so judged Affirmer hereby grants to each affected +person a royalty-free, non transferable, non sublicensable, non exclusive, +irrevocable and unconditional license to exercise Affirmer's Copyright and +Related Rights in the Work (i) in all territories worldwide, (ii) for the +maximum duration provided by applicable law or treaty (including future +time extensions), (iii) in any current or future medium and for any number +of copies, and (iv) for any purpose whatsoever, including without +limitation commercial, advertising or promotional purposes (the +"License"). The License shall be deemed effective as of the date CC0 was +applied by Affirmer to the Work. Should any part of the License for any +reason be judged legally invalid or ineffective under applicable law, such +partial invalidity or ineffectiveness shall not invalidate the remainder +of the License, and in such case Affirmer hereby affirms that he or she +will not (i) exercise any of his or her remaining Copyright and Related +Rights in the Work or (ii) assert any associated claims and causes of +action with respect to the Work, in either case contrary to Affirmer's +express Statement of Purpose. + +4. Limitations and Disclaimers. + + a. 
No trademark or patent rights held by Affirmer are waived, abandoned, + surrendered, licensed or otherwise affected by this document. + b. Affirmer offers the Work as-is and makes no representations or + warranties of any kind concerning the Work, express, implied, + statutory or otherwise, including without limitation warranties of + title, merchantability, fitness for a particular purpose, non + infringement, or the absence of latent or other defects, accuracy, or + the present or absence of errors, whether or not discoverable, all to + the greatest extent permissible under applicable law. + c. Affirmer disclaims responsibility for clearing rights of other persons + that may apply to the Work or any use thereof, including without + limitation any person's Copyright and Related Rights in the Work. + Further, Affirmer disclaims responsibility for obtaining any necessary + consents, permissions or other rights required for any use of the + Work. + d. Affirmer understands and acknowledges that Creative Commons is not a + party to this document and has no duty or obligation with respect to + this CC0 or use of the Work. diff --git a/gf/asymmetric-tiling-sd-webui/README.md b/gf/asymmetric-tiling-sd-webui/README.md new file mode 100644 index 0000000000000000000000000000000000000000..dcb16274a1adbfe705d42f1e3141c70690720938 --- /dev/null +++ b/gf/asymmetric-tiling-sd-webui/README.md @@ -0,0 +1,22 @@ +# Asymmetric Tiling for stable-diffusion-webui + +An always visible script extension for [stable-diffusion-webui](https://github.com/AUTOMATIC1111/stable-diffusion-webui/) to configure seamless image tiling independently for the X and Y axes. + +To use, install this repository from the "Extensions" tab in stable-diffusion-webui, restart your server, open the `Asymmetric tiling` foldout on txt2img or img2img, and make it active, and check `Tile X` or `Tile Y` as desired. While this script is active, the `Tiling` checkbox in the main UI will be ignored. + +Like existing tiling options this won't guarantee seamless tiling 100% of the time, but it should manage it for most prompts. You can check that images tile seamlessly using online tools like [Seamless Texture Check](https://www.pycheung.com/checker/). + +For the old, non-extension version of this script, use the "classic_script" branch. 
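
The core trick, visible in the patched `_conv_forward` inside `asymmetric_tiling.py` further down, is to pad each spatial axis separately before the convolution: circular padding makes features wrap around that axis, while constant (zero) padding leaves it untiled. A minimal stand-alone sketch of that idea, with made-up tensor shapes:

```python
import torch
import torch.nn.functional as F

x = torch.randn(1, 3, 64, 64)             # (N, C, H, W) feature map
weight = torch.randn(8, 3, 3, 3)           # a 3x3 convolution kernel

# (left, right, top, bottom) padding for the last two dimensions of x
pad_x = (1, 1, 0, 0)                       # pad the width only
pad_y = (0, 0, 1, 1)                       # pad the height only

out = F.pad(x, pad_x, mode="circular")     # wrap-around along X -> seamless horizontally
out = F.pad(out, pad_y, mode="constant")   # ordinary zero padding along Y
out = F.conv2d(out, weight, padding=0)     # the convolution sees the wrapped border

print(out.shape)                           # torch.Size([1, 8, 64, 64])
```

In the extension this swap is applied to every `Conv2d` in the model, and the start/stop step options gate the circular padding on `shared.state.sampling_step`, so the wrap-around can be enabled for only part of the sampling process.
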
+ +## X axis tiling examples + +![00817-3274117678-midnight cityscape, stunning environment, wide-angle, massive scale, landscape, panoramic, lush vegetation, idyllic](https://user-images.githubusercontent.com/19196175/195132862-8c050327-92f3-44a4-9c02-0f11cce0b609.png) +![01064-1316547214-(((domino run))), domino toppling, line of standing dominoes, domino cascade, domino effect, black dominos](https://user-images.githubusercontent.com/19196175/195137782-e72fc69a-14f1-4ae7-bac2-219734509aea.png) + +## Y Axis tiling examples + +![00840-2320166501-man climbing ladder, safety diagram](https://user-images.githubusercontent.com/19196175/195132867-1b36848e-135d-4103-8e10-1d760b3a0a4e.png)![01095-949590403-tree, thick branches, photograph, 80mm Sigma f1 4, studio quality, nature photography](https://user-images.githubusercontent.com/19196175/195140638-49b0a4be-fbca-45bc-8e52-6c985202ce29.png) + +## Update 2024/04/05 + +Support Forge (https://github.com/lllyasviel/stable-diffusion-webui-forge) diff --git a/gf/asymmetric-tiling-sd-webui/scripts/__pycache__/asymmetric_tiling.cpython-310.pyc b/gf/asymmetric-tiling-sd-webui/scripts/__pycache__/asymmetric_tiling.cpython-310.pyc new file mode 100644 index 0000000000000000000000000000000000000000..a428eab296db7d8f3843922f0e30bdf8368f42f0 Binary files /dev/null and b/gf/asymmetric-tiling-sd-webui/scripts/__pycache__/asymmetric_tiling.cpython-310.pyc differ diff --git a/gf/asymmetric-tiling-sd-webui/scripts/asymmetric_tiling.py b/gf/asymmetric-tiling-sd-webui/scripts/asymmetric_tiling.py new file mode 100644 index 0000000000000000000000000000000000000000..6010c549dd22f50fe785e5b13e19859ef7cba0a1 --- /dev/null +++ b/gf/asymmetric-tiling-sd-webui/scripts/asymmetric_tiling.py @@ -0,0 +1,116 @@ +import typing as tg + +import gradio as gr +from torch import Tensor +from torch.nn import Conv2d +from torch.nn import functional as F +from torch.nn.modules.utils import _pair + +import modules.scripts +import modules.sd_hijack +import modules.shared + +try: + from modules_forge.unet_patcher import UnetPatcher # type: ignore + + IS_FORGE = True +except ImportError: + IS_FORGE = False + + +# Asymmetric tiling script for stable-diffusion-webui +# +# This script allows seamless tiling to be enabled separately for the X and Y axes. +# When this script is in use, the "Tiling" option in the regular UI is ignored. +class Script(modules.scripts.Script): + # Override from modules.scripts.Script + def title(self): + return "Asymmetric tiling" + + # Override from modules.scripts.Script + def show(self, is_img2img): + return modules.scripts.AlwaysVisible + + # Override from modules.scripts.Script + def ui(self, is_img2img): + with gr.Accordion("Asymmetric tiling", open=False): + active = gr.Checkbox(False, label="Active") + tileX = gr.Checkbox(True, label="Tile X") + tileY = gr.Checkbox(False, label="Tile Y") + startStep = gr.Number(0, label="Start tiling from step N", precision=0) + stopStep = gr.Number(-1, label="Stop tiling after step N (-1: Don't stop)", precision=0) + + return [active, tileX, tileY, startStep, stopStep] + + # Override from modules.scripts.Script + def process(self, p, active, tileX, tileY, startStep, stopStep): + if (active): + # Record tiling options chosen for each axis. + p.extra_generation_params = { + "Tile X": tileX, + "Tile Y": tileY, + "Start Tiling From Step": startStep, + "Stop Tiling After Step": stopStep, + } + + # Modify the model's Conv2D layers to perform our chosen tiling. 
+ if IS_FORGE: + unet_patcher = tg.cast(UnetPatcher, p.sd_model.forge_objects.unet).clone() + layers = [layer for layer in unet_patcher.model.modules() if isinstance(layer, Conv2d)] + self.__hijackConv2DMethods(layers, tileX, tileY, startStep, stopStep) + p.sd_model.forge_objects.unet = unet_patcher + else: + layers = [layer for layer in tg.cast(list, modules.sd_hijack.model_hijack.layers) if isinstance(layer, Conv2d)] + self.__hijackConv2DMethods(layers, tileX, tileY, startStep, stopStep) + else: + # Restore model behaviour to normal. + if IS_FORGE: + unet_patcher = tg.cast(UnetPatcher, p.sd_model.forge_objects.unet).clone() + layers = [layer for layer in unet_patcher.model.modules() if isinstance(layer, Conv2d)] + self.__restoreConv2DMethods(layers) + p.sd_model.forge_objects.unet = unet_patcher + else: + layers = [layer for layer in tg.cast(list, modules.sd_hijack.model_hijack.layers) if layer if isinstance(layer, Conv2d)] + self.__restoreConv2DMethods(layers) + + def postprocess(self, *args): + # Restore model behaviour to normal. + if not IS_FORGE: + layers = [layer for layer in tg.cast(list, modules.sd_hijack.model_hijack.layers) if isinstance(layer, Conv2d)] + self.__restoreConv2DMethods(layers) + + # [Private] + # Go through all the "Conv2D" layers in the model and patch them to use the requested asymmetric tiling mode. + def __hijackConv2DMethods(self, layers: list, tileX: bool, tileY: bool, startStep: int, stopStep: int): + for layer in layers: + layer.padding_modeX = 'circular' if tileX else 'constant' + layer.padding_modeY = 'circular' if tileY else 'constant' + layer.paddingX = (layer._reversed_padding_repeated_twice[0], layer._reversed_padding_repeated_twice[1], 0, 0) + layer.paddingY = (0, 0, layer._reversed_padding_repeated_twice[2], layer._reversed_padding_repeated_twice[3]) + layer.paddingStartStep = startStep + layer.paddingStopStep = stopStep + layer._conv_forward = TilingConv2d.replacementConv2DConvForward.__get__(layer, Conv2d) + + # [Private] + # Go through all the "Conv2D" layers in the model and restore them to their origanal behaviour. + def __restoreConv2DMethods(self, layers: list): + for layer in layers: + layer._conv_forward = Conv2d._conv_forward.__get__(layer, Conv2d) + + +class TilingConv2d(Conv2d): + # A replacement for the Conv2d._conv_forward method that pads axes asymmetrically. 
+ # This replacement method performs the same operation (as of torch v1.12.1+cu113), but it pads the X and Y axes separately based on the members + # padding_modeX (string, either 'circular' or 'constant') + # padding_modeY (string, either 'circular' or 'constant') + # paddingX (tuple, cached copy of _reversed_padding_repeated_twice with the last two values zeroed) + # paddingY (tuple, cached copy of _reversed_padding_repeated_twice with the first two values zeroed) + def replacementConv2DConvForward(self, input: Tensor, weight: Tensor, bias: Tensor | None): + step = modules.shared.state.sampling_step + if ((self.paddingStartStep < 0 or step >= self.paddingStartStep) and (self.paddingStopStep < 0 or step <= self.paddingStopStep)): + working = F.pad(input, self.paddingX, mode=self.padding_modeX) + working = F.pad(working, self.paddingY, mode=self.padding_modeY) + else: + working = F.pad(input, self.paddingX, mode='constant') + working = F.pad(working, self.paddingY, mode='constant') + return F.conv2d(working, weight, bias, self.stride, _pair(0), self.dilation, self.groups) diff --git a/gf/sd-webui-crop-extension/.gitignore b/gf/sd-webui-crop-extension/.gitignore new file mode 100644 index 0000000000000000000000000000000000000000..1670801e7193c0b44430e3a8369f4a57102468cd --- /dev/null +++ b/gf/sd-webui-crop-extension/.gitignore @@ -0,0 +1,2 @@ + +*.pyc \ No newline at end of file diff --git a/gf/sd-webui-crop-extension/install.py b/gf/sd-webui-crop-extension/install.py new file mode 100644 index 0000000000000000000000000000000000000000..46fb87ba6df2f2e482982862f238e6b0eb042825 --- /dev/null +++ b/gf/sd-webui-crop-extension/install.py @@ -0,0 +1,6 @@ +import launch +# launch is imported in context of webui +if not launch.is_installed("dghs-imgutils") and not launch.is_installed("dghs-imgutils[gpu]"): + import torch.cuda as cuda + print("Installing dghs-imgutils") + launch.run_pip("install dghs-imgutils[gpu]" if cuda.is_available() else "install dghs-imgutils") \ No newline at end of file diff --git a/gf/sd-webui-crop-extension/scripts/__pycache__/ui.cpython-310.pyc b/gf/sd-webui-crop-extension/scripts/__pycache__/ui.cpython-310.pyc new file mode 100644 index 0000000000000000000000000000000000000000..d925a247afef5194667fc92b2fb726817215c154 Binary files /dev/null and b/gf/sd-webui-crop-extension/scripts/__pycache__/ui.cpython-310.pyc differ diff --git a/gf/sd-webui-crop-extension/scripts/ui.py b/gf/sd-webui-crop-extension/scripts/ui.py new file mode 100644 index 0000000000000000000000000000000000000000..d7a250a2b083dab1711b6cbc2775b4b2abc644b8 --- /dev/null +++ b/gf/sd-webui-crop-extension/scripts/ui.py @@ -0,0 +1,32 @@ +from modules import script_callbacks +import gradio as gr +from PIL import Image +import numpy as np +from imgutils.detect import detect_halfbody, detect_heads + + +def on_ui_tab_called(): + with gr.Blocks() as transparent_interface: + with gr.Row(): + with gr.Tabs(): + with gr.TabItem("CropHead"): + image_upload_input = gr.Image(label="Upload Image", source="upload",type="pil") + button = gr.Button(label="Convert") + image_upload_output = gr.Image(label="Output Image",type="numpy") + + def convert_image(image:Image.Image): + # first convert to RGB + if image.mode == "RGBA": + # convert transparent pixels to white + white_image = Image.new("RGB", image.size, (255, 255, 255)) + white_image.paste(image, mask=image.split()[3]) + image = white_image + else: + image = image.convert("RGB") + result = detect_heads(image) + new_image = image.crop(result[0][0]) + return new_image # 
return the new image + button.click(convert_image, inputs=[image_upload_input], outputs=[image_upload_output]) + return (transparent_interface, "CropHead", "script_crophead_interface"), + +script_callbacks.on_ui_tabs(on_ui_tab_called) diff --git a/gf/sd-webui-lua/.github/FUNDING.yml b/gf/sd-webui-lua/.github/FUNDING.yml new file mode 100644 index 0000000000000000000000000000000000000000..c4e17a6add911705ce00822a7b12da63fb7eb24e --- /dev/null +++ b/gf/sd-webui-lua/.github/FUNDING.yml @@ -0,0 +1 @@ +ko_fi: Yownas diff --git a/gf/sd-webui-lua/.gitignore b/gf/sd-webui-lua/.gitignore new file mode 100644 index 0000000000000000000000000000000000000000..bee8a64b79a99590d5303307144172cfe824fbf7 --- /dev/null +++ b/gf/sd-webui-lua/.gitignore @@ -0,0 +1 @@ +__pycache__ diff --git a/gf/sd-webui-lua/LICENSE b/gf/sd-webui-lua/LICENSE new file mode 100644 index 0000000000000000000000000000000000000000..f288702d2fa16d3cdf0035b15a9fcbc552cd88e7 --- /dev/null +++ b/gf/sd-webui-lua/LICENSE @@ -0,0 +1,674 @@ + GNU GENERAL PUBLIC LICENSE + Version 3, 29 June 2007 + + Copyright (C) 2007 Free Software Foundation, Inc. + Everyone is permitted to copy and distribute verbatim copies + of this license document, but changing it is not allowed. + + Preamble + + The GNU General Public License is a free, copyleft license for +software and other kinds of works. + + The licenses for most software and other practical works are designed +to take away your freedom to share and change the works. By contrast, +the GNU General Public License is intended to guarantee your freedom to +share and change all versions of a program--to make sure it remains free +software for all its users. We, the Free Software Foundation, use the +GNU General Public License for most of our software; it applies also to +any other work released this way by its authors. You can apply it to +your programs, too. + + When we speak of free software, we are referring to freedom, not +price. Our General Public Licenses are designed to make sure that you +have the freedom to distribute copies of free software (and charge for +them if you wish), that you receive source code or can get it if you +want it, that you can change the software or use pieces of it in new +free programs, and that you know you can do these things. + + To protect your rights, we need to prevent others from denying you +these rights or asking you to surrender the rights. Therefore, you have +certain responsibilities if you distribute copies of the software, or if +you modify it: responsibilities to respect the freedom of others. + + For example, if you distribute copies of such a program, whether +gratis or for a fee, you must pass on to the recipients the same +freedoms that you received. You must make sure that they, too, receive +or can get the source code. And you must show them these terms so they +know their rights. + + Developers that use the GNU GPL protect your rights with two steps: +(1) assert copyright on the software, and (2) offer you this License +giving you legal permission to copy, distribute and/or modify it. + + For the developers' and authors' protection, the GPL clearly explains +that there is no warranty for this free software. For both users' and +authors' sake, the GPL requires that modified versions be marked as +changed, so that their problems will not be attributed erroneously to +authors of previous versions. + + Some devices are designed to deny users access to install or run +modified versions of the software inside them, although the manufacturer +can do so. 
This is fundamentally incompatible with the aim of +protecting users' freedom to change the software. The systematic +pattern of such abuse occurs in the area of products for individuals to +use, which is precisely where it is most unacceptable. Therefore, we +have designed this version of the GPL to prohibit the practice for those +products. If such problems arise substantially in other domains, we +stand ready to extend this provision to those domains in future versions +of the GPL, as needed to protect the freedom of users. + + Finally, every program is threatened constantly by software patents. +States should not allow patents to restrict development and use of +software on general-purpose computers, but in those that do, we wish to +avoid the special danger that patents applied to a free program could +make it effectively proprietary. To prevent this, the GPL assures that +patents cannot be used to render the program non-free. + + The precise terms and conditions for copying, distribution and +modification follow. + + TERMS AND CONDITIONS + + 0. Definitions. + + "This License" refers to version 3 of the GNU General Public License. + + "Copyright" also means copyright-like laws that apply to other kinds of +works, such as semiconductor masks. + + "The Program" refers to any copyrightable work licensed under this +License. Each licensee is addressed as "you". "Licensees" and +"recipients" may be individuals or organizations. + + To "modify" a work means to copy from or adapt all or part of the work +in a fashion requiring copyright permission, other than the making of an +exact copy. The resulting work is called a "modified version" of the +earlier work or a work "based on" the earlier work. + + A "covered work" means either the unmodified Program or a work based +on the Program. + + To "propagate" a work means to do anything with it that, without +permission, would make you directly or secondarily liable for +infringement under applicable copyright law, except executing it on a +computer or modifying a private copy. Propagation includes copying, +distribution (with or without modification), making available to the +public, and in some countries other activities as well. + + To "convey" a work means any kind of propagation that enables other +parties to make or receive copies. Mere interaction with a user through +a computer network, with no transfer of a copy, is not conveying. + + An interactive user interface displays "Appropriate Legal Notices" +to the extent that it includes a convenient and prominently visible +feature that (1) displays an appropriate copyright notice, and (2) +tells the user that there is no warranty for the work (except to the +extent that warranties are provided), that licensees may convey the +work under this License, and how to view a copy of this License. If +the interface presents a list of user commands or options, such as a +menu, a prominent item in the list meets this criterion. + + 1. Source Code. + + The "source code" for a work means the preferred form of the work +for making modifications to it. "Object code" means any non-source +form of a work. + + A "Standard Interface" means an interface that either is an official +standard defined by a recognized standards body, or, in the case of +interfaces specified for a particular programming language, one that +is widely used among developers working in that language. 
+ + The "System Libraries" of an executable work include anything, other +than the work as a whole, that (a) is included in the normal form of +packaging a Major Component, but which is not part of that Major +Component, and (b) serves only to enable use of the work with that +Major Component, or to implement a Standard Interface for which an +implementation is available to the public in source code form. A +"Major Component", in this context, means a major essential component +(kernel, window system, and so on) of the specific operating system +(if any) on which the executable work runs, or a compiler used to +produce the work, or an object code interpreter used to run it. + + The "Corresponding Source" for a work in object code form means all +the source code needed to generate, install, and (for an executable +work) run the object code and to modify the work, including scripts to +control those activities. However, it does not include the work's +System Libraries, or general-purpose tools or generally available free +programs which are used unmodified in performing those activities but +which are not part of the work. For example, Corresponding Source +includes interface definition files associated with source files for +the work, and the source code for shared libraries and dynamically +linked subprograms that the work is specifically designed to require, +such as by intimate data communication or control flow between those +subprograms and other parts of the work. + + The Corresponding Source need not include anything that users +can regenerate automatically from other parts of the Corresponding +Source. + + The Corresponding Source for a work in source code form is that +same work. + + 2. Basic Permissions. + + All rights granted under this License are granted for the term of +copyright on the Program, and are irrevocable provided the stated +conditions are met. This License explicitly affirms your unlimited +permission to run the unmodified Program. The output from running a +covered work is covered by this License only if the output, given its +content, constitutes a covered work. This License acknowledges your +rights of fair use or other equivalent, as provided by copyright law. + + You may make, run and propagate covered works that you do not +convey, without conditions so long as your license otherwise remains +in force. You may convey covered works to others for the sole purpose +of having them make modifications exclusively for you, or provide you +with facilities for running those works, provided that you comply with +the terms of this License in conveying all material for which you do +not control copyright. Those thus making or running the covered works +for you must do so exclusively on your behalf, under your direction +and control, on terms that prohibit them from making any copies of +your copyrighted material outside their relationship with you. + + Conveying under any other circumstances is permitted solely under +the conditions stated below. Sublicensing is not allowed; section 10 +makes it unnecessary. + + 3. Protecting Users' Legal Rights From Anti-Circumvention Law. + + No covered work shall be deemed part of an effective technological +measure under any applicable law fulfilling obligations under article +11 of the WIPO copyright treaty adopted on 20 December 1996, or +similar laws prohibiting or restricting circumvention of such +measures. 
+ + When you convey a covered work, you waive any legal power to forbid +circumvention of technological measures to the extent such circumvention +is effected by exercising rights under this License with respect to +the covered work, and you disclaim any intention to limit operation or +modification of the work as a means of enforcing, against the work's +users, your or third parties' legal rights to forbid circumvention of +technological measures. + + 4. Conveying Verbatim Copies. + + You may convey verbatim copies of the Program's source code as you +receive it, in any medium, provided that you conspicuously and +appropriately publish on each copy an appropriate copyright notice; +keep intact all notices stating that this License and any +non-permissive terms added in accord with section 7 apply to the code; +keep intact all notices of the absence of any warranty; and give all +recipients a copy of this License along with the Program. + + You may charge any price or no price for each copy that you convey, +and you may offer support or warranty protection for a fee. + + 5. Conveying Modified Source Versions. + + You may convey a work based on the Program, or the modifications to +produce it from the Program, in the form of source code under the +terms of section 4, provided that you also meet all of these conditions: + + a) The work must carry prominent notices stating that you modified + it, and giving a relevant date. + + b) The work must carry prominent notices stating that it is + released under this License and any conditions added under section + 7. This requirement modifies the requirement in section 4 to + "keep intact all notices". + + c) You must license the entire work, as a whole, under this + License to anyone who comes into possession of a copy. This + License will therefore apply, along with any applicable section 7 + additional terms, to the whole of the work, and all its parts, + regardless of how they are packaged. This License gives no + permission to license the work in any other way, but it does not + invalidate such permission if you have separately received it. + + d) If the work has interactive user interfaces, each must display + Appropriate Legal Notices; however, if the Program has interactive + interfaces that do not display Appropriate Legal Notices, your + work need not make them do so. + + A compilation of a covered work with other separate and independent +works, which are not by their nature extensions of the covered work, +and which are not combined with it such as to form a larger program, +in or on a volume of a storage or distribution medium, is called an +"aggregate" if the compilation and its resulting copyright are not +used to limit the access or legal rights of the compilation's users +beyond what the individual works permit. Inclusion of a covered work +in an aggregate does not cause this License to apply to the other +parts of the aggregate. + + 6. Conveying Non-Source Forms. + + You may convey a covered work in object code form under the terms +of sections 4 and 5, provided that you also convey the +machine-readable Corresponding Source under the terms of this License, +in one of these ways: + + a) Convey the object code in, or embodied in, a physical product + (including a physical distribution medium), accompanied by the + Corresponding Source fixed on a durable physical medium + customarily used for software interchange. 
+ + b) Convey the object code in, or embodied in, a physical product + (including a physical distribution medium), accompanied by a + written offer, valid for at least three years and valid for as + long as you offer spare parts or customer support for that product + model, to give anyone who possesses the object code either (1) a + copy of the Corresponding Source for all the software in the + product that is covered by this License, on a durable physical + medium customarily used for software interchange, for a price no + more than your reasonable cost of physically performing this + conveying of source, or (2) access to copy the + Corresponding Source from a network server at no charge. + + c) Convey individual copies of the object code with a copy of the + written offer to provide the Corresponding Source. This + alternative is allowed only occasionally and noncommercially, and + only if you received the object code with such an offer, in accord + with subsection 6b. + + d) Convey the object code by offering access from a designated + place (gratis or for a charge), and offer equivalent access to the + Corresponding Source in the same way through the same place at no + further charge. You need not require recipients to copy the + Corresponding Source along with the object code. If the place to + copy the object code is a network server, the Corresponding Source + may be on a different server (operated by you or a third party) + that supports equivalent copying facilities, provided you maintain + clear directions next to the object code saying where to find the + Corresponding Source. Regardless of what server hosts the + Corresponding Source, you remain obligated to ensure that it is + available for as long as needed to satisfy these requirements. + + e) Convey the object code using peer-to-peer transmission, provided + you inform other peers where the object code and Corresponding + Source of the work are being offered to the general public at no + charge under subsection 6d. + + A separable portion of the object code, whose source code is excluded +from the Corresponding Source as a System Library, need not be +included in conveying the object code work. + + A "User Product" is either (1) a "consumer product", which means any +tangible personal property which is normally used for personal, family, +or household purposes, or (2) anything designed or sold for incorporation +into a dwelling. In determining whether a product is a consumer product, +doubtful cases shall be resolved in favor of coverage. For a particular +product received by a particular user, "normally used" refers to a +typical or common use of that class of product, regardless of the status +of the particular user or of the way in which the particular user +actually uses, or expects or is expected to use, the product. A product +is a consumer product regardless of whether the product has substantial +commercial, industrial or non-consumer uses, unless such uses represent +the only significant mode of use of the product. + + "Installation Information" for a User Product means any methods, +procedures, authorization keys, or other information required to install +and execute modified versions of a covered work in that User Product from +a modified version of its Corresponding Source. The information must +suffice to ensure that the continued functioning of the modified object +code is in no case prevented or interfered with solely because +modification has been made. 
+ + If you convey an object code work under this section in, or with, or +specifically for use in, a User Product, and the conveying occurs as +part of a transaction in which the right of possession and use of the +User Product is transferred to the recipient in perpetuity or for a +fixed term (regardless of how the transaction is characterized), the +Corresponding Source conveyed under this section must be accompanied +by the Installation Information. But this requirement does not apply +if neither you nor any third party retains the ability to install +modified object code on the User Product (for example, the work has +been installed in ROM). + + The requirement to provide Installation Information does not include a +requirement to continue to provide support service, warranty, or updates +for a work that has been modified or installed by the recipient, or for +the User Product in which it has been modified or installed. Access to a +network may be denied when the modification itself materially and +adversely affects the operation of the network or violates the rules and +protocols for communication across the network. + + Corresponding Source conveyed, and Installation Information provided, +in accord with this section must be in a format that is publicly +documented (and with an implementation available to the public in +source code form), and must require no special password or key for +unpacking, reading or copying. + + 7. Additional Terms. + + "Additional permissions" are terms that supplement the terms of this +License by making exceptions from one or more of its conditions. +Additional permissions that are applicable to the entire Program shall +be treated as though they were included in this License, to the extent +that they are valid under applicable law. If additional permissions +apply only to part of the Program, that part may be used separately +under those permissions, but the entire Program remains governed by +this License without regard to the additional permissions. + + When you convey a copy of a covered work, you may at your option +remove any additional permissions from that copy, or from any part of +it. (Additional permissions may be written to require their own +removal in certain cases when you modify the work.) You may place +additional permissions on material, added by you to a covered work, +for which you have or can give appropriate copyright permission. 
+ + Notwithstanding any other provision of this License, for material you +add to a covered work, you may (if authorized by the copyright holders of +that material) supplement the terms of this License with terms: + + a) Disclaiming warranty or limiting liability differently from the + terms of sections 15 and 16 of this License; or + + b) Requiring preservation of specified reasonable legal notices or + author attributions in that material or in the Appropriate Legal + Notices displayed by works containing it; or + + c) Prohibiting misrepresentation of the origin of that material, or + requiring that modified versions of such material be marked in + reasonable ways as different from the original version; or + + d) Limiting the use for publicity purposes of names of licensors or + authors of the material; or + + e) Declining to grant rights under trademark law for use of some + trade names, trademarks, or service marks; or + + f) Requiring indemnification of licensors and authors of that + material by anyone who conveys the material (or modified versions of + it) with contractual assumptions of liability to the recipient, for + any liability that these contractual assumptions directly impose on + those licensors and authors. + + All other non-permissive additional terms are considered "further +restrictions" within the meaning of section 10. If the Program as you +received it, or any part of it, contains a notice stating that it is +governed by this License along with a term that is a further +restriction, you may remove that term. If a license document contains +a further restriction but permits relicensing or conveying under this +License, you may add to a covered work material governed by the terms +of that license document, provided that the further restriction does +not survive such relicensing or conveying. + + If you add terms to a covered work in accord with this section, you +must place, in the relevant source files, a statement of the +additional terms that apply to those files, or a notice indicating +where to find the applicable terms. + + Additional terms, permissive or non-permissive, may be stated in the +form of a separately written license, or stated as exceptions; +the above requirements apply either way. + + 8. Termination. + + You may not propagate or modify a covered work except as expressly +provided under this License. Any attempt otherwise to propagate or +modify it is void, and will automatically terminate your rights under +this License (including any patent licenses granted under the third +paragraph of section 11). + + However, if you cease all violation of this License, then your +license from a particular copyright holder is reinstated (a) +provisionally, unless and until the copyright holder explicitly and +finally terminates your license, and (b) permanently, if the copyright +holder fails to notify you of the violation by some reasonable means +prior to 60 days after the cessation. + + Moreover, your license from a particular copyright holder is +reinstated permanently if the copyright holder notifies you of the +violation by some reasonable means, this is the first time you have +received notice of violation of this License (for any work) from that +copyright holder, and you cure the violation prior to 30 days after +your receipt of the notice. + + Termination of your rights under this section does not terminate the +licenses of parties who have received copies or rights from you under +this License. 
If your rights have been terminated and not permanently +reinstated, you do not qualify to receive new licenses for the same +material under section 10. + + 9. Acceptance Not Required for Having Copies. + + You are not required to accept this License in order to receive or +run a copy of the Program. Ancillary propagation of a covered work +occurring solely as a consequence of using peer-to-peer transmission +to receive a copy likewise does not require acceptance. However, +nothing other than this License grants you permission to propagate or +modify any covered work. These actions infringe copyright if you do +not accept this License. Therefore, by modifying or propagating a +covered work, you indicate your acceptance of this License to do so. + + 10. Automatic Licensing of Downstream Recipients. + + Each time you convey a covered work, the recipient automatically +receives a license from the original licensors, to run, modify and +propagate that work, subject to this License. You are not responsible +for enforcing compliance by third parties with this License. + + An "entity transaction" is a transaction transferring control of an +organization, or substantially all assets of one, or subdividing an +organization, or merging organizations. If propagation of a covered +work results from an entity transaction, each party to that +transaction who receives a copy of the work also receives whatever +licenses to the work the party's predecessor in interest had or could +give under the previous paragraph, plus a right to possession of the +Corresponding Source of the work from the predecessor in interest, if +the predecessor has it or can get it with reasonable efforts. + + You may not impose any further restrictions on the exercise of the +rights granted or affirmed under this License. For example, you may +not impose a license fee, royalty, or other charge for exercise of +rights granted under this License, and you may not initiate litigation +(including a cross-claim or counterclaim in a lawsuit) alleging that +any patent claim is infringed by making, using, selling, offering for +sale, or importing the Program or any portion of it. + + 11. Patents. + + A "contributor" is a copyright holder who authorizes use under this +License of the Program or a work on which the Program is based. The +work thus licensed is called the contributor's "contributor version". + + A contributor's "essential patent claims" are all patent claims +owned or controlled by the contributor, whether already acquired or +hereafter acquired, that would be infringed by some manner, permitted +by this License, of making, using, or selling its contributor version, +but do not include claims that would be infringed only as a +consequence of further modification of the contributor version. For +purposes of this definition, "control" includes the right to grant +patent sublicenses in a manner consistent with the requirements of +this License. + + Each contributor grants you a non-exclusive, worldwide, royalty-free +patent license under the contributor's essential patent claims, to +make, use, sell, offer for sale, import and otherwise run, modify and +propagate the contents of its contributor version. + + In the following three paragraphs, a "patent license" is any express +agreement or commitment, however denominated, not to enforce a patent +(such as an express permission to practice a patent or covenant not to +sue for patent infringement). 
To "grant" such a patent license to a +party means to make such an agreement or commitment not to enforce a +patent against the party. + + If you convey a covered work, knowingly relying on a patent license, +and the Corresponding Source of the work is not available for anyone +to copy, free of charge and under the terms of this License, through a +publicly available network server or other readily accessible means, +then you must either (1) cause the Corresponding Source to be so +available, or (2) arrange to deprive yourself of the benefit of the +patent license for this particular work, or (3) arrange, in a manner +consistent with the requirements of this License, to extend the patent +license to downstream recipients. "Knowingly relying" means you have +actual knowledge that, but for the patent license, your conveying the +covered work in a country, or your recipient's use of the covered work +in a country, would infringe one or more identifiable patents in that +country that you have reason to believe are valid. + + If, pursuant to or in connection with a single transaction or +arrangement, you convey, or propagate by procuring conveyance of, a +covered work, and grant a patent license to some of the parties +receiving the covered work authorizing them to use, propagate, modify +or convey a specific copy of the covered work, then the patent license +you grant is automatically extended to all recipients of the covered +work and works based on it. + + A patent license is "discriminatory" if it does not include within +the scope of its coverage, prohibits the exercise of, or is +conditioned on the non-exercise of one or more of the rights that are +specifically granted under this License. You may not convey a covered +work if you are a party to an arrangement with a third party that is +in the business of distributing software, under which you make payment +to the third party based on the extent of your activity of conveying +the work, and under which the third party grants, to any of the +parties who would receive the covered work from you, a discriminatory +patent license (a) in connection with copies of the covered work +conveyed by you (or copies made from those copies), or (b) primarily +for and in connection with specific products or compilations that +contain the covered work, unless you entered into that arrangement, +or that patent license was granted, prior to 28 March 2007. + + Nothing in this License shall be construed as excluding or limiting +any implied license or other defenses to infringement that may +otherwise be available to you under applicable patent law. + + 12. No Surrender of Others' Freedom. + + If conditions are imposed on you (whether by court order, agreement or +otherwise) that contradict the conditions of this License, they do not +excuse you from the conditions of this License. If you cannot convey a +covered work so as to satisfy simultaneously your obligations under this +License and any other pertinent obligations, then as a consequence you may +not convey it at all. For example, if you agree to terms that obligate you +to collect a royalty for further conveying from those to whom you convey +the Program, the only way you could satisfy both those terms and this +License would be to refrain entirely from conveying the Program. + + 13. Use with the GNU Affero General Public License. 
+ + Notwithstanding any other provision of this License, you have +permission to link or combine any covered work with a work licensed +under version 3 of the GNU Affero General Public License into a single +combined work, and to convey the resulting work. The terms of this +License will continue to apply to the part which is the covered work, +but the special requirements of the GNU Affero General Public License, +section 13, concerning interaction through a network will apply to the +combination as such. + + 14. Revised Versions of this License. + + The Free Software Foundation may publish revised and/or new versions of +the GNU General Public License from time to time. Such new versions will +be similar in spirit to the present version, but may differ in detail to +address new problems or concerns. + + Each version is given a distinguishing version number. If the +Program specifies that a certain numbered version of the GNU General +Public License "or any later version" applies to it, you have the +option of following the terms and conditions either of that numbered +version or of any later version published by the Free Software +Foundation. If the Program does not specify a version number of the +GNU General Public License, you may choose any version ever published +by the Free Software Foundation. + + If the Program specifies that a proxy can decide which future +versions of the GNU General Public License can be used, that proxy's +public statement of acceptance of a version permanently authorizes you +to choose that version for the Program. + + Later license versions may give you additional or different +permissions. However, no additional obligations are imposed on any +author or copyright holder as a result of your choosing to follow a +later version. + + 15. Disclaimer of Warranty. + + THERE IS NO WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY +APPLICABLE LAW. EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT +HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY +OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, +THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR +PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE PROGRAM +IS WITH YOU. SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF +ALL NECESSARY SERVICING, REPAIR OR CORRECTION. + + 16. Limitation of Liability. + + IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING +WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MODIFIES AND/OR CONVEYS +THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY +GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE +USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED TO LOSS OF +DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD +PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER PROGRAMS), +EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF +SUCH DAMAGES. + + 17. Interpretation of Sections 15 and 16. + + If the disclaimer of warranty and limitation of liability provided +above cannot be given local legal effect according to their terms, +reviewing courts shall apply local law that most closely approximates +an absolute waiver of all civil liability in connection with the +Program, unless a warranty or assumption of liability accompanies a +copy of the Program in return for a fee. 
+ + END OF TERMS AND CONDITIONS + + How to Apply These Terms to Your New Programs + + If you develop a new program, and you want it to be of the greatest +possible use to the public, the best way to achieve this is to make it +free software which everyone can redistribute and change under these terms. + + To do so, attach the following notices to the program. It is safest +to attach them to the start of each source file to most effectively +state the exclusion of warranty; and each file should have at least +the "copyright" line and a pointer to where the full notice is found. + + + Copyright (C) + + This program is free software: you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation, either version 3 of the License, or + (at your option) any later version. + + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with this program. If not, see . + +Also add information on how to contact you by electronic and paper mail. + + If the program does terminal interaction, make it output a short +notice like this when it starts in an interactive mode: + + Copyright (C) + This program comes with ABSOLUTELY NO WARRANTY; for details type `show w'. + This is free software, and you are welcome to redistribute it + under certain conditions; type `show c' for details. + +The hypothetical commands `show w' and `show c' should show the appropriate +parts of the General Public License. Of course, your program's commands +might be different; for a GUI interface, you would use an "about box". + + You should also get your employer (if you work as a programmer) or school, +if any, to sign a "copyright disclaimer" for the program, if necessary. +For more information on this, and how to apply and follow the GNU GPL, see +. + + The GNU General Public License does not permit incorporating your program +into proprietary programs. If your program is a subroutine library, you +may consider it more useful to permit linking proprietary applications with +the library. If this is what you want to do, use the GNU Lesser General +Public License instead of this License. But first, please read +. diff --git a/gf/sd-webui-lua/README.md b/gf/sd-webui-lua/README.md new file mode 100644 index 0000000000000000000000000000000000000000..cfb788947d692015768610e363a8d46d7573d226 --- /dev/null +++ b/gf/sd-webui-lua/README.md @@ -0,0 +1,38 @@ +# sd-webui-lua +Generate images with Lua scripts in Stable Diffusion webui. + +This is an extension for [Vlad's SD.next](https://github.com/vladmandic/automatic/) and [automatic1111's stable-diffusion-webui](https://github.com/AUTOMATIC1111/stable-diffusion-webui) to let you run Lua code to generate images (and more?). + +# Installation + +Under `Extensions` go to `Manual install` and paste the link to this repository. Click `Install`. + +# Usage + +It too early to try writing some sort of proper guide. But all functions should be listed if you open up the "Lua Extras..." at the bottom, but the best way to get started is probably by looking in the `examples` folder and tinker with those. There are also some information in the wiki. + +![sd-webui-lua](https://github.com/yownas/sd-webui-lua/assets/13150150/f5bbd9f0-1519-4219-8ff0-a296c4ec1172) + +# Why? 
+ +This project started as a joke, but quickly turned into a fun way to dig into and learn more about the code behind the webui. It also turned out +to be a fun tool to generate images in weird ways that normally isn't really possible through the webui. So, is this an amazing tool to help you +generate beautiful images? No. But it will probably be fun to play with if you know a little about how Stable Diffusion works, but not enough +to be comfortable to write your own code from scratch. + +With this extension you can let the webui do the heavy lifting but still have access to poke at things on a fairly low level. + +# Issues & todo + +* Better css for the poor Code-box. + +* Needs a lot more functions to manipulate/access things. (and a save button/function) + +* Split sd.sample() into more parts. + +* Maybe import the diffusers-library, but I'm not sure how to use that with ckpt/safetensors. + +* More examples to show how things work. (Or actual documentation) + +* More ideas. I'd rather have a PR than a vague Issue. But I'm happy for any input. :) + diff --git a/gf/sd-webui-lua/examples/basics/README_example.lua b/gf/sd-webui-lua/examples/basics/README_example.lua new file mode 100644 index 0000000000000000000000000000000000000000..ba48be67355f01cd769d779c16001aece1cc4d1c --- /dev/null +++ b/gf/sd-webui-lua/examples/basics/README_example.lua @@ -0,0 +1,9 @@ +-- This is the example in the README - Yownas + +for i = 1, 12, 1 do + ui.status("Generating " .. input .. " #" .. i) + img = sd.process('a cute ' .. input) + ui.gallery.add(img) +end + +return("Done...") diff --git a/gf/sd-webui-lua/examples/basics/show_samplers.lua b/gf/sd-webui-lua/examples/basics/show_samplers.lua new file mode 100644 index 0000000000000000000000000000000000000000..9f0dc3d618b0db872ac3163f88fc12f29f902ab4 --- /dev/null +++ b/gf/sd-webui-lua/examples/basics/show_samplers.lua @@ -0,0 +1,10 @@ +-- Simple example on how to list available samplers - Yownas + +-- To iterate over the list from python we'll use python.iter() + +samplers = sd.getsamplers() + +ui.out("Samplers:") +for name in python.iter(samplers) do + ui.out(name) +end diff --git a/gf/sd-webui-lua/examples/basics/simple_output.lua b/gf/sd-webui-lua/examples/basics/simple_output.lua new file mode 100644 index 0000000000000000000000000000000000000000..d171ec92d657224afc65e70ab2e6c9dc34a484c3 --- /dev/null +++ b/gf/sd-webui-lua/examples/basics/simple_output.lua @@ -0,0 +1,21 @@ +-- Simple output examples - Yownas + +-- Clear Output box +ui.clear() + +-- Add text to Output +ui.out("Adding text to Output with ui.out()") + +-- These two have basically the same functionality +print("Using print()") +ui.console("Using ui.consol()") + +-- Log +ui.log.info("Some information.") +ui.log.warning("A warning!") +ui.log.error("An error! Help!") + +-- An optional return() will write to the Output box +a = 1 +b = 2 +return(a + b) diff --git a/gf/sd-webui-lua/examples/basics/start_end_debugging.lua b/gf/sd-webui-lua/examples/basics/start_end_debugging.lua new file mode 100644 index 0000000000000000000000000000000000000000..08f256b6d373195cdba8012838a08b0a14403037 --- /dev/null +++ b/gf/sd-webui-lua/examples/basics/start_end_debugging.lua @@ -0,0 +1,15 @@ +-- Example to show how to execute only part of a script - Yownas + +ui.console("This will not happen.") + +--START-- + +-- Adding a START comment like this will skip execution of the script above it. + +ui.console("Debug here!") + +--END-- + +-- Adding a END comment like this will stop execution. 
+ +ui.console("This will not happen either.") diff --git a/gf/sd-webui-lua/examples/basics/text_to_image.lua b/gf/sd-webui-lua/examples/basics/text_to_image.lua new file mode 100644 index 0000000000000000000000000000000000000000..a39f9a94f34489c8abd1a0e049fccf44d474ab24 --- /dev/null +++ b/gf/sd-webui-lua/examples/basics/text_to_image.lua @@ -0,0 +1,25 @@ +-- text_to_image.lua - Different way to generate images - Yownas + +-- Simple, with hardcoded defaults +img = sd.process('a cute cat') +ui.gallery.add(img) + +-- Simple, using a Processing object (with caption) +p = sd.getp() +p.prompt = 'a cute puppy' +p.negative_prompt = 'angry bear' +p.steps = 25 +img = sd.pipeline(p) +ui.gallery.addc(img, 'Not an angry bear') + +-- Do the steps manually, also show latent before it is parsed by vae +p = sd.getp() +c = sd.cond('bunny') +uc = sd.negcond('banana') +latent = sd.sample(p, c, uc) +tmp = torch.clamp(torch.mul(torch.add(latent, 1.0) , 0.5), 0.0, 1.0) -- Convert range -1..1 to 0..1 +img = sd.toimage(tmp) +ui.gallery.add(img) +vae = sd.vae(latent) +img = sd.toimage(vae) +ui.gallery.add(img) diff --git a/gf/sd-webui-lua/examples/gifs/seed_travel_cat.lua b/gf/sd-webui-lua/examples/gifs/seed_travel_cat.lua new file mode 100644 index 0000000000000000000000000000000000000000..198fcb5ef954c7fb50e31548dc77ec2755b5ae36 --- /dev/null +++ b/gf/sd-webui-lua/examples/gifs/seed_travel_cat.lua @@ -0,0 +1,21 @@ +-- Seed travel, making a gif using the images in the Gallery - Yownas +-- Get a Processing-object +p = sd.getp() + +p.prompt = 'a cute cat' +p.seed = 42 +p.subseed = 1337 + +-- Change subseed_strength from 0 to 1 +for s = 0,1,0.1 +do + p.subseed_strength = s + img = sd.process(p) + ui.gallery.addc(img, 'Seed strength: ' .. tostring(s)) +end + +-- Get a gif from the images in the gallery, show each image 100 ms +gif = ui.gallery.getgif(100) +-- Remove this comment to clear the gallery before adding the gif +-- ui.gallery.clear() +ui.gallery.addc(gif, 'A cute cat gif') diff --git a/gf/sd-webui-lua/examples/gifs/seed_travel_puppy.lua b/gf/sd-webui-lua/examples/gifs/seed_travel_puppy.lua new file mode 100644 index 0000000000000000000000000000000000000000..35cbb630873577dead986ad6c10475ea9c873a08 --- /dev/null +++ b/gf/sd-webui-lua/examples/gifs/seed_travel_puppy.lua @@ -0,0 +1,21 @@ +-- Seed travel, making a gif using a table of images - Yownas +-- Get a Processing-object +p = sd.getp() + +p.prompt = 'a cute puppy' +p.seed = 62345 +p.subseed = 3455 +p.steps = 10 + +gif = {} + +-- Change subseed_strength from 0 to 1 +for s = 0,1,0.1 +do + p.subseed_strength = s + gif[#gif + 1] = sd.process(p) +end + +anim= sd.makegif(gif, 500) + +ui.gallery.addc(anim, 'A cute puppy gif') diff --git a/gf/sd-webui-lua/examples/misc/play_telephone.lua b/gf/sd-webui-lua/examples/misc/play_telephone.lua new file mode 100644 index 0000000000000000000000000000000000000000..8b4aa34aabd50c35fe3e9f6687f6317fdbe988e8 --- /dev/null +++ b/gf/sd-webui-lua/examples/misc/play_telephone.lua @@ -0,0 +1,28 @@ +-- Playing "telephone" - Yownas +p = sd.getp() + +p.prompt = [[view of a futuristic city at night, +quantum technology, stars, moon, night sky, neon signs, +highly detailed, 4k uhd, sci-fi, intricate detail, +high quality, sharp focus, 2.8f, fractal landscape]] + +p.negative_prompt = [[nsfw, illustration, drawing, +painting, out of focus, blurry, distorted, 3d, sketch, +digital art, watermark, signature, hands, giger]] + +p.seed = -1 +p.width = 640 +p.height = 480 +p.steps = 10 +p.sampler = "UniPC" +p.cfg_scale = 3 + +for i = 1, 12, 1 
do + ui.status("Generating: " .. i) + ui.console(p.prompt) + img = sd.restorefaces(sd.pipeline(p)) + ui.gallery.addc(img, i) + p.prompt = sd.interrogate.clip(img) +end + +return("Seed: " .. p.seed) diff --git a/gf/sd-webui-lua/examples/misc/random_prompt.lua b/gf/sd-webui-lua/examples/misc/random_prompt.lua new file mode 100644 index 0000000000000000000000000000000000000000..2985cc7f00b03bd735e9c7ee1a550dab6cd30bd0 --- /dev/null +++ b/gf/sd-webui-lua/examples/misc/random_prompt.lua @@ -0,0 +1,15 @@ +-- Create random prompts - Yownas +adjective = {'cute', 'big', 'small', 'cool'} +animals = {'dog', 'cat', 'mouse', 'rabbit'} +verb = {'sitting', 'sleeping', 'running', 'playing'} + +function rnd (table) + return(table[math.random(1,#table)]) +end + +for i = 1, 4, 1 +do + prompt = "A " .. rnd(adjective) .. " " .. rnd(animals) .. " " .. rnd(verb) + img = sd.process(prompt) + ui.gallery.addc(img, prompt) +end diff --git a/gf/sd-webui-lua/examples/misc/vae_tensor.lua b/gf/sd-webui-lua/examples/misc/vae_tensor.lua new file mode 100644 index 0000000000000000000000000000000000000000..bf83ec83a91b61681b10531e6251d78885fe355f --- /dev/null +++ b/gf/sd-webui-lua/examples/misc/vae_tensor.lua @@ -0,0 +1,29 @@ +p = sd.getp() +c = sd.cond('bunny') +uc = sd.negcond('banana') +latent = sd.sample(p, c, uc) + +-- Generate original images +tmp = torch.clamp(torch.mul(torch.add(latent, 1.0), 0.5), 0.0, 1.0) +ui.gallery.add(sd.toimage(tmp)) + +ui.gallery.add(sd.toimage(sd.vae(latent))) + +-- The latent space we got from the sampler has +-- shape [1][4][64][64], lets put random numbers +-- in the upper left corner. +-- (Yes, this is Lua, but the tensors start at 0.) +for z = 0, 3, 1 do + for x = 0, 32, 1 do + for y = 0, 32, 1 do + -- Poke the tensor + latent[0][z][x][y] = math.random() + end + end +end + +-- Generate new images +tmp = torch.clamp(torch.mul(torch.add(latent, 1.0), 0.5), 0.0, 1.0) +ui.gallery.add(sd.toimage(tmp)) + +ui.gallery.add(sd.toimage(sd.vae(latent))) diff --git a/gf/sd-webui-lua/examples/textencodings/clip_mixing.lua b/gf/sd-webui-lua/examples/textencodings/clip_mixing.lua new file mode 100644 index 0000000000000000000000000000000000000000..f4189b3c75b39cc9f36909c373ba81d4539b37a2 --- /dev/null +++ b/gf/sd-webui-lua/examples/textencodings/clip_mixing.lua @@ -0,0 +1,32 @@ +-- Mix Clip encoded texts - Yownas + +prompt = 'dog' +prompt2 = 'cat' + +-- We need to clear the gallery before we start. +-- ui.gallery.getgif() will fail later if there is anthing +-- but images in the gellery. +ui.gallery.clear() + +p = sd.getp() +p.seed = 4 + +-- Get encoded text from both prompts +p1 = sd.textencode(prompt) +p2 = sd.textencode(prompt2) + +for weight = 0, 1, 0.1 +do + -- Mix p1 and p2 + c = torch.lerp(p1, p2, weight) + -- sd.sample wants a conditional and unconditional object + -- but will also take a string as argument and change it + -- to what it needs, so we don't have to bother creating + -- a proper unconditional value for it here. 
+ latent = sd.sample(p, c, '') + ui.gallery.addc(sd.toimage(sd.vae(latent)), weight) +end + +-- Create a gif from the images in the gallery +gif = ui.gallery.getgif(200) +ui.gallery.addc(gif, 'Dog to cat gif') diff --git a/gf/sd-webui-lua/examples/textencodings/clip_poking.lua b/gf/sd-webui-lua/examples/textencodings/clip_poking.lua new file mode 100644 index 0000000000000000000000000000000000000000..8b02f13b398238d32683d9209d929be2a419754d --- /dev/null +++ b/gf/sd-webui-lua/examples/textencodings/clip_poking.lua @@ -0,0 +1,27 @@ +-- Mess with the Clip encoded text - Yownas +prompt = 'cat' + +p = sd.getp() +p.seed = 1234 +c= sd.cond(prompt) +uc = sd.negcond('') + +latent = sd.sample(p, c, uc) +ui.gallery.addc(sd.toimage(sd.vae(latent)), 'Original') + +a = sd.textencode(prompt) + +ui.gallery.addc(sd.toimage(a), 'Original text encode') + +-- Print the tensor and size to Output +ui.out(a) +ui.out(torch.size(a)) + +-- Poke the encoded text (Just change one of the 77*768 numbers) +a[0][0][0] = 1.0 + +ui.gallery.addc(sd.toimage(a), 'Text encode after change') + +c = sd.negcond2cond(sd.clip2negcond(a)) +latent = sd.sample(p, c, uc) +ui.gallery.addc(sd.toimage(sd.vae(latent)), 'Poked latent') diff --git a/gf/sd-webui-lua/examples/textencodings/mixing_textencodings.lua b/gf/sd-webui-lua/examples/textencodings/mixing_textencodings.lua new file mode 100644 index 0000000000000000000000000000000000000000..85a36bb3760806cae60f159500b7f3780270c815 --- /dev/null +++ b/gf/sd-webui-lua/examples/textencodings/mixing_textencodings.lua @@ -0,0 +1,56 @@ +-- Mixing two Clip encoded texts - Yownas +prompt1 = [[painting concept art of a futuristic city in a future city at night, +highly detailed, concept art,4k uhd, digital painting, octane render, sci-fi, +highly detailed, intricate detail, high quality, photorealistic, 4k uhd render, +sharp focus, hdr, smooth, sharp focus]] + +prompt2 = [[portrait of a male wizard casting a magic spell in a dark forest, fantasy, +intricate, elegant, highly detailed, digital painting, artstation, concept art, matte, +sharp focus, illustration, 4k ultra hd, illustration, natural lighting, octane render]] + +uc = sd.negcond('out of focus, blurry, distorted, watermark, signature') + +-- We want garbage collecting running +collectgarbage(restart) +p = sd.getp() +p.seed = -1 +p.width = 640 +p.height = 512 +p.steps = 15 +p.sampler = "UniPC" +p.cfg_scale = 4 + +ui.status("Encoding text") +c1 = sd.textencode(prompt1) +c2 = sd.textencode(prompt2) + +-- Get two empty tensors with the same size as a text-encode +e1 = torch.new_zeros({1, 77, 768}) +e2 = torch.new_zeros({1, 77, 768}) + +-- Copy half of c1 & c2 into e1 & e2 +ui.status("Mixing") +for x = 0, 76, 1 do + for y = 0, 767, 2 do + -- even numbers; c1 -> e1, c2 -> e2 + e1[0][x][y] = c1[0][x][y] + e2[0][x][y] = c2[0][x][y] + end + for y = 1, 767, 2 do + -- odd numbers; c2 -> e1, c1 -> e2 + e1[0][x][y] = c2[0][x][y] + e2[0][x][y] = c1[0][x][y] + end +end + +ui.status("Generating image 1/4") +ui.gallery.addc(sd.toimage(sd.vae(sd.sample(p, c1, uc))), "Original prompt 1") +ui.status("Generating image 2/4") +ui.gallery.addc(sd.toimage(sd.vae(sd.sample(p, c2, uc))), "Original prompt 2") +ui.status("Generating image 3/4") +ui.gallery.addc(sd.toimage(sd.vae(sd.sample(p, e1, uc))), "Mixed 1") +ui.status("Generating image 4/4") +ui.gallery.addc(sd.toimage(sd.vae(sd.sample(p, e2, uc))), "Mixed 2") + +-- Return the seed to the Output box +return(p.seed) diff --git a/gf/sd-webui-lua/examples/textencodings/parts_of_encoded_text.lua 
b/gf/sd-webui-lua/examples/textencodings/parts_of_encoded_text.lua new file mode 100644 index 0000000000000000000000000000000000000000..a2dfa3c7fc9f0b2d362c6c2268f57d908b715a64 --- /dev/null +++ b/gf/sd-webui-lua/examples/textencodings/parts_of_encoded_text.lua @@ -0,0 +1,28 @@ +-- Copy parts of the Clip encoded text - Yownas +prompt = [[portrait of a cyberpunk assassin, blade runner style, cyberpunk style, blade runner style, sci-fi character, +dark environment. highly detailed painting, 8k, mid shot. cinematic lighting. futurism dystopian setting. +realistic proportions and composition. oil on canvas. award winning. realistic proportions and faces. +dramatic scene, cute sparkles and rainbows]] + +-- We want garbage collecting running +collectgarbage(restart) +p = sd.getp() +p.seed = -1 +c= sd.textencode(prompt) +uc = sd.negcond('nsfw, out of focus, blurry, boring and other things we do not want to see') + +-- Get a completely empty "text encode" +e = torch.new_zeros({1, 77, 768}) + +-- Copy parts of c into e and generate images along the way +for y = 0, 767, 1 do + for x = 0, 76, 1 do + e[0][x][y] = c[0][x][y] + end + ui.status("Generating step: " .. tostring(y) .. "/768") + -- Generate an image every 50th row + if y%50 == 0 then + ui.gallery.addc(sd.toimage(sd.vae(sd.sample(p, e, uc))), tostring(math.floor(100*y/767)) .. "%") + end +end +ui.gallery.addc(sd.toimage(sd.vae(sd.sample(p, c, uc))), "100%") diff --git a/gf/sd-webui-lua/examples/textencodings/swapping_textencodings.lua b/gf/sd-webui-lua/examples/textencodings/swapping_textencodings.lua new file mode 100644 index 0000000000000000000000000000000000000000..e1826ed35f9a3c35084c4c25c95af4a5d57afcc5 --- /dev/null +++ b/gf/sd-webui-lua/examples/textencodings/swapping_textencodings.lua @@ -0,0 +1,49 @@ +-- Randomly swapping numbers in two Clip encoded texts - Yownas +prompt1 = [[painting concept art of a futuristic city in a future city at night, +highly detailed, concept art,4k uhd, digital painting, octane render, sci-fi, +highly detailed, intricate detail, high quality, photorealistic, 4k uhd render, +sharp focus, hdr, smooth, sharp focus]] + +prompt2 = [[portrait of a female steampunk wizard casting a magic spell in a magical forest, fantasy, +intricate, elegant, highly detailed, digital painting, artstation, concept art, matte, +sharp focus, illustration, 4k ultra hd, illustration, natural lighting, octane render, +sparkles, vivid colors, fractal landscape, smiling, rainbow]] + +uc = sd.negcond('out of focus, blurry, distorted, watermark, signature') + +-- We want garbage collecting running +collectgarbage(restart) +p = sd.getp() +p.seed = -1 +p.width = 640 +p.height = 512 +p.steps = 15 +p.sampler = "UniPC" +p.cfg_scale = 4 + +ui.status("Encoding text") +c1 = sd.textencode(prompt1) +c2 = sd.textencode(prompt2) +ui.status("Generating image 1/4") +ui.gallery.addc(sd.toimage(sd.vae(sd.sample(p, c1, uc))), "Original prompt 1") +ui.status("Generating image 2/4") +ui.gallery.addc(sd.toimage(sd.vae(sd.sample(p, c2, uc))), "Original prompt 2") + +-- Swap some random tensors +ui.status("Mixing") +for i = 0, 50000, 1 do + x = math.random(76) + y = math.random(767) + t = torch.t2f(c1[0][x][y]) -- convert into a float + c1[0][x][y] = c2[0][x][y] + c2[0][x][y] = t +end + +ui.status("Generating image 3/4") +ui.gallery.addc(sd.restorefaces(sd.toimage(sd.vae(sd.sample(p, c1, uc)))), "Mixed 1") +ui.status("Generating image 4/4") +ui.gallery.addc(sd.restorefaces(sd.toimage(sd.vae(sd.sample(p, c2, uc)))), "Mixed 2") + +ui.status("Done.") +-- Return 
the seed to the Output box +return(p.seed) diff --git a/gf/sd-webui-lua/extras.html b/gf/sd-webui-lua/extras.html new file mode 100644 index 0000000000000000000000000000000000000000..56d43591310229b47d625c5b34ac2cf894071100 --- /dev/null +++ b/gf/sd-webui-lua/extras.html @@ -0,0 +1,235 @@ +
+ sd-webui-lua links: + Github | + Wiki | + Lupa +
+
+
Functions
+
UI
ui.out(string) - Write string to the Output box.
ui.clear() - Clear Output box.
ui.console(string) - Print to console.
ui.out(string) - Print to Output box.
ui.gallery.add(image) - Add image to Gallery.
ui.gallery.addc(image, string) - Add image with caption to Gallery.
ui.gallery.clear() - Clear the gallery.
ui.gallery.del(index) - Delete image from gallery. (Starts at 1 since this is Lua.)
ui.gallery.getgif(duration) - Get a gif from the images in the gallery. Show each image for "duration" ms.
ui.gallery.saveall() - Save all images in the gallery.
ui.image.save(image, name) - Save image.
ui.status(text) - Update status-text under the buttons during run.
ui.log.info(text) - Write info log to console.
ui.log.warning(text) - Write warning log to console.
ui.log.error(text) - Write error log to console.
+
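A minimal Lua sketch using the ui.* helpers above, assuming they behave as listed (the prompt string is only an illustration):

ui.clear()                          -- empty the Output box
ui.status("Generating...")          -- status text shown under the buttons
img = sd.process('a cute cat')      -- generate an image from a plain prompt
ui.gallery.addc(img, 'a cute cat')  -- add it to the Gallery with a caption
ui.out("Done")                      -- write to the Output box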
SD
sd.pipeline(p) - Deconstructed pipeline from the webui. Generate a picture from a Processing object.
sd.process(string) - Webui pipeline; generate an image from a prompt string or Processing object.
sd.getp() - Returns a default Processing object (see below).
sd.cond(string) - Run a prompt string through clip.
sd.negcond(string) - Run a negative prompt string through clip. (These are unfortunately slightly different at the moment.)
sd.sample(p, cond, negcond) - Turn noise into something that can be turned into an image. Takes a Processing object, a cond and a negcond value. Cond and negcond can also be Null, a string, or a tensor from sd.textencode().
sd.vae(latent) - Variational auto-encoder.
sd.toimage(latent) - Last step to get an image after the vae.
sd.textencode(string) - Get a tensor from Clip's text encoder.
sd.clip2negcond(text encode) - Convert a tensor to a negative conditioning used by functions from the webui.
sd.negcond2cond(negcond) - Convert a negative conditioning to a conditioning used by functions from the webui. The regular prompt and the negative prompt are treated slightly differently internally, which is why this is needed.
sd.getsamplers() - Get a list of samplers.
sd.restorefaces(image) - Postprocess an image to restore faces.
sd.interrogate.clip(image) - Get a prompt from an image.
sd.interrogate.blip(image) - (Same as sd.interrogate.clip()) Get a prompt from an image.
sd.interrogate.deepbooru(image) - Get a prompt from an image, using DeepBooru.
+
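A minimal sketch of the deconstructed pipeline, assuming the functions behave as listed above (this mirrors examples/basics/text_to_image.lua):

p = sd.getp()                       -- default Processing object
p.prompt = 'bunny'                  -- illustrative prompt
c = sd.cond(p.prompt)               -- conditioning
uc = sd.negcond('blurry')           -- negative conditioning
latent = sd.sample(p, c, uc)        -- sample a latent
img = sd.toimage(sd.vae(latent))    -- decode with the VAE, then convert to an image
ui.gallery.add(img)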
Torch
torch.clamp(v1, min, max) - Clamp vector v1 between min and max.
torch.lerp(v1, v2, weight) - Linear interpolation of v1 and v2 by weight: v1 + weight * (v2 - v1).
torch.abs(v1) - Absolute value of v1.
torch.add(v1, v2) - Add v2 (vector or float) to v1.
torch.sub(v1, v2) - Subtract v2 (vector or float) from v1.
torch.mul(v1, v2) - Multiply v1 by v2 (vector or float).
torch.div(v1, v2) - Divide v1 by v2 (vector or float).
torch.size(v1) - Return the size of vector v1.
torch.new_zeros(size) - Take a Lua table, size, and create a zero-filled tensor.
torch.max(v) - Return the max value in v.
torch.min(v) - Return the min value in v.
torch.f2t(tensor) - Return a tensor from a float.
torch.t2f(tensor) - Return a float from a tensor.
torch.cat({table, with, tensors, ...}, dim) - Concatenate tensors in dimension dim. For example, text encodings can be concatenated in dimension 1.
+
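A minimal sketch of mixing two text encodings with the torch helpers above; shapes and behaviour are assumed to match examples/textencodings/clip_mixing.lua:

p = sd.getp()
p1 = sd.textencode('dog')           -- tensor of shape {1, 77, 768}
p2 = sd.textencode('cat')
c = torch.lerp(p1, p2, 0.5)         -- halfway between the two encodings
latent = sd.sample(p, c, '')        -- sd.sample also accepts a string negcond
ui.gallery.add(sd.toimage(sd.vae(latent)))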
+
+
+ Default Processing-object:
+
+p = StableDiffusionProcessingTxt2Img(
+ sd_model=shared.sd_model,
+ outpath_samples=shared.opts.outdir_samples or shared.opts.outdir_txt2img_samples,
+ outpath_grids=shared.opts.outdir_grids or shared.opts.outdir_txt2img_grids,
+ prompt='',
+ styles=[],
+ negative_prompt='',
+ seed=-1,
+ subseed=-1,
+ subseed_strength=0,
+ seed_resize_from_h=0,
+ seed_resize_from_w=0,
+ seed_enable_extras=True,
+ sampler_name='Euler a',
+ batch_size=1,
+ n_iter=1,
+ steps=20,
+ cfg_scale=7,
+ width=512,
+ height=512,
+ restore_faces=False,
+ tiling=False,
+ enable_hr=False,
+ denoising_strength=0,
+ hr_scale=0,
+ hr_upscaler=None,
+ hr_second_pass_steps=0,
+ hr_resize_x=0,
+ hr_resize_y=0,
+ override_settings=[],
+)
+	
+
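A small sketch of overriding a few of the defaults above before generating; the field values are only illustrative and follow the bundled examples:

p = sd.getp()
p.prompt = 'a cute puppy'
p.negative_prompt = 'blurry'
p.steps = 25
p.width = 640
p.height = 512
p.cfg_scale = 4
img = sd.pipeline(p)
ui.gallery.addc(img, p.prompt)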
diff --git a/gf/sd-webui-lua/install.py b/gf/sd-webui-lua/install.py new file mode 100644 index 0000000000000000000000000000000000000000..42a5f1ed0d7a5f824322fbf0ad977ab0fa64e8f1 --- /dev/null +++ b/gf/sd-webui-lua/install.py @@ -0,0 +1,4 @@ +import launch + +if not launch.is_installed("lupa"): + launch.run_pip("install lupa", "requirements for Lua") diff --git a/gf/sd-webui-lua/javascript/sd-webui-lua.js b/gf/sd-webui-lua/javascript/sd-webui-lua.js new file mode 100644 index 0000000000000000000000000000000000000000..3b6a65b97a0e337a3ec644b86b30d8c025b0fd68 --- /dev/null +++ b/gf/sd-webui-lua/javascript/sd-webui-lua.js @@ -0,0 +1,8 @@ +function submit_sd_webui_lua(){ + var id = randomId() + requestProgress(id, gradioApp().getElementById('sd_webui_lua_results'), null, function(){}) + + var res = create_submit_args(arguments) + res[0] = id + return res +} diff --git a/gf/sd-webui-lua/scripts/__pycache__/sd-webui-lua.cpython-310.pyc b/gf/sd-webui-lua/scripts/__pycache__/sd-webui-lua.cpython-310.pyc new file mode 100644 index 0000000000000000000000000000000000000000..482dad4fc9ff5df8a4c17efebb7f70938d675259 Binary files /dev/null and b/gf/sd-webui-lua/scripts/__pycache__/sd-webui-lua.cpython-310.pyc differ diff --git a/gf/sd-webui-lua/scripts/sd-webui-lua.py b/gf/sd-webui-lua/scripts/sd-webui-lua.py new file mode 100644 index 0000000000000000000000000000000000000000..9f5d94891cd2e8a42818a2180f32c0218b262ef6 --- /dev/null +++ b/gf/sd-webui-lua/scripts/sd-webui-lua.py @@ -0,0 +1,578 @@ +from collections import namedtuple +import gradio as gr +import lupa +import numpy as np +import os +from pathlib import Path +from PIL import Image +import re +import torch +from torchvision import transforms +import traceback +from modules.call_queue import wrap_gradio_gpu_call +from modules import scripts, script_callbacks, devices, ui, shared, processing, sd_samplers, sd_samplers_common, paths +from modules import prompt_parser, ui, face_restoration, deepbooru +import modules.images as images +from modules.shared import opts, cmd_opts, state +from modules.processing import StableDiffusionProcessingTxt2Img, Processed, process_images, fix_seed, decode_first_stage, apply_overlay, apply_color_correction, create_infotext, create_random_tensors +try: + from modules.rng import ImageRNG +except ImportError: + ImageRNG = None + +sd_webui_lua_dir = scripts.basedir() + +def filter_attribute_access(obj, attr_name, is_setting): + if isinstance(attr_name, (str)): + if not attr_name.startswith('_'): + return attr_name + raise AttributeError('access denied') + +L = lupa.LuaRuntime(register_eval=False, attribute_filter=filter_attribute_access) +G = L.globals() +LUA_output = '' +LUA_gallery = [] + +def lua_run(id_task, lua_input, lua_code): + global LUA_output, LUA_gallery + print(f"LUA: START: {id_task}") + # Crop --START-- and --END-- + lua_code = re.sub('.*\n--START--', '', lua_code, 1, flags=re.S) + lua_code = re.sub('\n--END--.*', '', lua_code, 1, flags=re.S) + # Prepend input + lua_code = f"input = [[{lua_input}]]\n{lua_code}" + try: + result = L.execute(lua_code) + if result: + LUA_output += str(result)+'\n' + except Exception as err: + traceback.print_exc() + result = f"ERROR: {err}" + print(f"LUA {result}") + raise gr.Error(result) + print(f"LUA: END: {id_task}") + return LUA_output, LUA_gallery, '' + +def lua_reset(): + global L, G, LUA_output, LUA_gallery + L = lupa.LuaRuntime(register_eval=False, attribute_filter=filter_attribute_access) + G = L.globals() + LUA_output = '' + LUA_gallery = [] + # Setup python 
functions (messy list. Will most likely change) + G.sd = { + 'empty_latent': sd_lua_empty_latent, + 'pipeline': sd_lua_pipeline, + 'process': sd_lua_process, + 'getp': sd_lua_getp, + 'cond': sd_lua_cond, + 'negcond': sd_lua_negcond, + 'sample': sd_lua_sample, + 'vae': sd_lua_vae, + 'toimage': sd_lua_toimage, + 'makegif': sd_lua_makegif, + 'getsamplers': sd_lua_getsamplers, + 'restorefaces': sd_lua_restorefaces, + 'textencode': sd_lua_textencode, + 'clip2negcond': sd_lua_clip2negcond, + 'negcond2cond': sd_lua_negcond2cond, + 'interrogate': { + 'clip': sd_lua_interrogate_clip, + 'blip': sd_lua_interrogate_clip, + 'deepbooru': sd_lua_interrogate_deepbooru, + } + } + G.ui = { + 'clear': ui_lua_output_clear, + 'console': ui_lua_console, + 'out': ui_lua_output, + 'gallery': { + 'add': ui_lua_gallery_add, + 'addc': ui_lua_gallery_addc, + 'clear': ui_lua_gallery_clear, + 'del': ui_lua_gallery_del, + 'getgif': ui_lua_gallery_getgif, + 'saveall': ui_lua_gallery_saveall, + }, + 'image': { + 'save': ui_lua_imagesave, + }, + 'status': ui_status, + 'log': { + 'info': ui_lua_log_info, + 'warning': ui_lua_log_warning, + 'error': ui_lua_log_error, + } + } + G.torch = { + 'abs': torch_abs, + 'add': torch_add, + 'sub': torch_sub, + 'clamp': torch_clamp, + 'lerp': torch_lerp, + 'mul': torch_mul, + 'div': torch_div, + 'size': torch_size, + 'new_zeros': torch_new_zeros, + 'max': torch_max, + 'min': torch_min, + 't2f': torch_t2f, + 'f2t': torch_f2t, + 'cat': torch_cat, + } + devices.torch_gc() + return LUA_output, LUA_gallery + +def lua_refresh(): + global LUA_output, LUA_gallery + devices.torch_gc() + return LUA_output, LUA_gallery + +# Functions for Lua +def ui_lua_console(text): + print(f"Lua: {text}") + +def ui_lua_output(text): + global LUA_output + LUA_output += str(text)+'\n' + +def ui_lua_output_clear(): + global LUA_output + LUA_output = '' + +def ui_lua_gallery_add(image): + ui_lua_gallery_addc(image, '') + +def ui_lua_gallery_addc(image, caption): + global LUA_gallery + LUA_gallery.insert(0, (image, caption)) + +def ui_lua_gallery_getgif(duration): + global LUA_gallery + gif = [] + for i in LUA_gallery: + gif.insert(0, i[0]) + gif_path = sd_lua_list2gif(gif, duration) + return(gif_path) + +def ui_lua_gallery_saveall(): + path_to_save = os.path.join(opts.outdir_extras_samples, 'lua') + if not os.path.exists(path_to_save): + try: + os.makedirs(path_to_save, exist_ok=True) + print('LUA: Creating folder:', path_to_save) + except: + pass + for image in LUA_gallery[::-1]: + #gif.insert(0, i[0]) + name = images.get_next_sequence_number(path_to_save, '') + path_to_image = os.path.join(path_to_save, f"{name}.png") + image[0].save(path_to_image) + +def ui_lua_gallery_clear(): + global LUA_gallery + LUA_gallery = [] + +def ui_lua_gallery_del(index): + global LUA_gallery + # FIXME add code here to match caption + del LUA_gallery[index-1] + +def ui_status(text): + shared.state.textinfo = text + +def ui_lua_log_info(text): + #log.info(text) + print(f"Lua INFO: {text}") + +def ui_lua_log_warning(text): + #log.warning(text) + print(f"Lua WARNING: {text}") + +def ui_lua_log_error(text): + #log.error(text) + print(f"Lua ERROR: {text}") + +def sd_lua_interrogate_clip(image): + return(shared.interrogator.interrogate(image)) + +def sd_lua_interrogate_deepbooru(image): + return(deepbooru.model.tag(image)) + +# Empty latent +# IN: width, height +# OUT: latent +# FIXME, remove this, use torch.new_zeros() instead +def sd_lua_empty_latent (w, h): + tensor = torch.tensor((), dtype=torch.float32) + return tensor.new_zeros((w, 
h)) + +def sd_lua_getsamplers(): + return([x.name for x in sd_samplers.all_samplers]) + +def sd_lua_restorefaces(image): + return(Image.fromarray(face_restoration.restore_faces(np.asarray(image)))) + +# IN: +# OUT: p +def sd_lua_getp(): + p = StableDiffusionProcessingTxt2Img( + sd_model=shared.sd_model, + outpath_samples=shared.opts.outdir_samples or shared.opts.outdir_txt2img_samples, + outpath_grids=shared.opts.outdir_grids or shared.opts.outdir_txt2img_grids, + prompt='', + styles=[], + negative_prompt='', + seed=-1, + subseed=-1, + subseed_strength=0, + seed_resize_from_h=0, + seed_resize_from_w=0, + seed_enable_extras=True, + sampler_name='Euler a', + batch_size=1, + n_iter=1, + steps=20, + cfg_scale=7, + width=512, + height=512, + restore_faces=False, + tiling=False, + enable_hr=False, + denoising_strength=0, + hr_scale=0, + hr_upscaler=None, + hr_second_pass_steps=0, + hr_resize_x=0, + hr_resize_y=0, + override_settings=[], + ) + return(p) + +# Conditioning functions +ScheduledPromptConditioning = namedtuple("ScheduledPromptConditioning", ["end_at_step", "cond"]) +class ComposableScheduledPromptConditioning: + def __init__(self, schedules, weight=1.0): + self.schedules: List[ScheduledPromptConditioning] = schedules + self.weight: float = weight +class MulticondLearnedConditioning: + def __init__(self, shape, batch): + self.shape: tuple = shape # the shape field is needed to send this object to DDIM/PLMS + self.batch: List[List[ComposableScheduledPromptConditioning]] = batch + +def sd_lua_cond(prompt): + with devices.autocast(): + cond = prompt_parser.get_multicond_learned_conditioning(shared.sd_model, [prompt], 1000) + return(cond) +def sd_lua_negcond(prompt): + with devices.autocast(): + cond = prompt_parser.get_learned_conditioning(shared.sd_model, [prompt], 1000) + return(cond) + +def sd_lua_textencode(prompt): # Prompt to learned conditioning + with devices.autocast(): + conds = shared.sd_model.get_learned_conditioning([prompt]) + return(conds) + +def sd_lua_clip2negcond(conds): # Probably need a better name than clip + res = [] + cond_schedule = [] + + with devices.autocast(): + for cond in conds: + cond_schedule.append(ScheduledPromptConditioning(1000, cond)) # Hardcoded steps, hopefully not breaking too many things + res.append(cond_schedule) + return(res) + +def sd_lua_negcond2cond(negcond): + res = [] + cond_schedule = [] + with devices.autocast(): + for spcs in negcond: + cond_schedule.append([ComposableScheduledPromptConditioning(spcs, 1.0)]) + + # [[]] + return MulticondLearnedConditioning(shape=(1,), batch=cond_schedule) + +# IN: p, c, uc +# OUT: latent +def sd_lua_sample(p, c, uc): + fix_seed(p) + + opt_C = 4 + opt_f = 8 + if ImageRNG is not None: + p.rng = ImageRNG((opt_C, p.height // opt_f, p.width // opt_f), [p.seed]) + + # Fix c and uc so they are of the correc type + if c == None: + c = '' + if uc == None: + uc = '' + if isinstance(c, str): + c = sd_lua_cond(c) + if torch.is_tensor(c): + c = sd_lua_clip2negcond(c) + c = sd_lua_negcond2cond(c) + if isinstance(uc, str): + uc = sd_lua_negcond(uc) + if torch.is_tensor(uc): + uc = sd_lua_clip2negcond(uc) + + with devices.without_autocast() if devices.unet_needs_upcast else devices.autocast(): + samples_ddim = p.sample(conditioning=c, unconditional_conditioning=uc, seeds=[p.seed], subseeds=[p.subseed], subseed_strength=p.subseed_strength, prompts=[p.prompt]) + + devices.torch_gc() + + return(samples_ddim) + +# IN: latent +# OUT: latent +def sd_lua_vae(samples_ddim): + x_samples_ddim = 
[decode_first_stage(shared.sd_model, samples_ddim.to(dtype=devices.dtype_vae))[0].cpu()] + try: + for x in x_samples_ddim: + devices.test_for_nans(x, "vae") + except devices.NansException as e: + if not shared.cmd_opts.no_half and not shared.cmd_opts.no_half_vae and shared.cmd_opts.rollback_vae: + print('\nA tensor with all NaNs was produced in VAE, try converting to bf16.') + devices.dtype_vae = torch.bfloat16 + vae_file, vae_source = sd_vae.resolve_vae(p.sd_model.sd_model_checkpoint) + sd_vae.load_vae(p.sd_model, vae_file, vae_source) + x_samples_ddim = [decode_first_stage(p.sd_model, samples_ddim[i:i+1].to(dtype=devices.dtype_vae))[0].cpu() for i in range(samples_ddim.size(0))] + for x in x_samples_ddim: + devices.test_for_nans(x, "vae") + else: + raise e + x_samples_ddim = torch.stack(x_samples_ddim).float() + x_samples_ddim = torch.clamp((x_samples_ddim + 1.0) / 2.0, min=0.0, max=1.0) + + devices.torch_gc() + + return(x_samples_ddim) + +# IN: latent +# OUT: image (maybe) +def sd_lua_toimage(latent): + if len(latent.size()) > 3: # Really ugly kludge + latent = latent[0] + #for i, x_sample in enumerate(latent): + # x_sample = 255. * np.moveaxis(x_sample.cpu().numpy(), 0, 2) + # x_sample = x_sample.astype(np.uint8) + #image = Image.fromarray(x_sample) + T = transforms.ToPILImage() + image = T(latent) + return image + +def sd_lua_list2gif(gif, duration): + path_to_save = os.path.join(opts.outdir_extras_samples, 'lua') + if not os.path.exists(path_to_save): + try: + os.makedirs(path_to_save, exist_ok=True) + print('LUA: Creating folder:', path_to_save) + except: + pass + name = images.get_next_sequence_number(path_to_save, '') + path_to_save = os.path.join(path_to_save, f"{name}.gif") + gif[0].save(path_to_save, save_all=True, append_images=gif[1:], optimize=False, duration=duration, loop=0) + return(path_to_save) + +# IN: Lua-table with images +# OUT: string, name of gif +def sd_lua_makegif(table, duration): + gif = [x for x in table.values()] + gif_path = sd_lua_list2gif(gif, duration) + return(gif_path) + +# IN: image +# OUT: string (path to image) +def ui_lua_imagesave(image, name): + path_to_save = os.path.join(opts.outdir_extras_samples, 'lua') + if not os.path.exists(path_to_save): + try: + os.makedirs(path_to_save, exist_ok=True) + print('LUA: Creating folder:', path_to_save) + except: + pass + path_to_save = os.path.join(path_to_save, name) + # FIXME check so it doesn't overwrite images? + image.save(path_to_save) + return(path_to_save) + +# IN: p +# OUT: image +def sd_lua_pipeline(p): + devices.torch_gc() + + fix_seed(p) + + seed = p.seed + subseed = p.subseed + + comments = {} + + # FIXME remove? 
ignoring infotext will make things simpler + p.all_prompts = [p.prompt] + p.all_negative_prompts = [p.negative_prompt] + p.all_seeds = [int(seed)] + p.all_subseeds = [int(subseed)] + + def infotext(iteration=0, position_in_batch=0): + return create_infotext(p, [p.prompt], [p.seed], [p.subseed], comments, iteration, position_in_batch) + + infotexts = [] + output_images = [] + + #with torch.no_grad(), p.sd_model.ema_scope(): + with torch.no_grad(): + prompts = [p.prompt] + negative_prompts = [p.negative_prompt] + seeds = [p.seed] + subseeds = [p.subseed] + + c = sd_lua_cond(p.prompt) + uc = sd_lua_negcond(p.negative_prompt) + + # Sample + samples_ddim = sd_lua_sample(p, c, uc) + + x_samples_ddim = sd_lua_vae(samples_ddim) + + del samples_ddim + + if shared.cmd_opts.lowvram or shared.cmd_opts.medvram: + lowvram.send_everything_to_cpu() + + devices.torch_gc() + + for i, x_sample in enumerate(x_samples_ddim): + x_sample = 255. * np.moveaxis(x_sample.cpu().numpy(), 0, 2) + x_sample = x_sample.astype(np.uint8) + + + image = Image.fromarray(x_sample) + + if opts.samples_save and not p.do_not_save_samples: + images.save_image(image, p.outpath_samples, "", seeds[i], prompts[i], opts.samples_format, info=infotext(0, i), p=p) + + text = infotext(0, i) + infotexts.append(text) + if opts.enable_pnginfo: + image.info["parameters"] = text + output_images.append(image) + + del x_samples_ddim + + devices.torch_gc() + + return output_images[0] + +############################################################################3 + +# IN: p or string +# OUT: image +def sd_lua_process(prompt): + if isinstance(prompt, str): + p = sd_lua_getp() + p.prompt = prompt + else: + p = prompt + + processed = process_images(p) + p.close() + return processed.images[0] + +### Torch functions + +def torch_lerp(v1, v2, weight): + return torch.lerp(v1, v2, weight) + +def torch_abs(v1): + return torch.abs(v1) + +def torch_add(v1, v2): + return torch.add(v1, v2) + +def torch_sub(v1, v2): + return torch.sub(v1, v2) + +def torch_mul(v1, v2): + return torch.mul(v1, v2) + +def torch_div(v1, v2): + return torch.div(v1, v2) + +def torch_clamp(v1, min, max): + return torch.clamp(v1, min=min, max=max) + +def torch_max(v1): + return float(torch.max(v1)) + +def torch_min(v1): + return float(torch.min(v1)) + +def torch_size(v1): + return list(v1.size()) + +def torch_new_zeros(size): + tensor = torch.tensor((), dtype=torch.float32) + # Lua will return a "table", which for python looks like a dict, so we need to + # pick out the values (skipping the keys) and make it into a tuple + return tensor.new_zeros(tuple(size.values()), device=torch.device('cuda')) + +def torch_t2f(tensor): + return float(tensor) +def torch_f2t(tensor): + return float(tensor) + +def torch_cat(tensor_table, dim): + tensors = [x for x in tensor_table.values()] + return torch.cat(tuple(tensors), dim) + +def torch_copy(v): + return copy(v) + +def add_tab(): + with gr.Blocks(analytics_enabled=False) as tab: + with gr.Row(): + with gr.Column(scale=1): + # Would use this if the css wasn't broken + lua_code = gr.Code(label="Lua", elem_id="lua_code", language=None, show_label=False, lines=30, placeholder="(Lua code)") + #lua_code = gr.Textbox(label="Lua", show_label=False, lines=30, placeholder="(Lua code)") + with gr.Group(elem_id="sd_webui_lua_results"): + with gr.Row(): + res_info = gr.HTML() + with gr.Row(visible=False): + res = gr.Label("") + with gr.Column(scale=1): + with gr.Row(): + lua_input = gr.Textbox(label="Input", lines=1) + with gr.Row(): + run = gr.Button('Run', 
variant='primary') + reset = gr.Button('Reset') + refresh = gr.Button('Refresh') + with gr.Row(): + gallery = gr.Gallery(label="Gallery").style(preview=True, grid=4) + with gr.Row(): + results = gr.Textbox(label="Output", show_label=True, lines=10) + + run.click( + show_progress=False, + fn=ui.wrap_gradio_gpu_call(lua_run, extra_outputs=['']), + _js="submit_sd_webui_lua", + inputs=[res, lua_input, lua_code], + outputs=[results, gallery, res_info] + ) + reset.click(lua_reset, show_progress=False, inputs=[], outputs=[results, gallery]) + refresh.click(lua_refresh, show_progress=False, inputs=[], outputs=[results, gallery]) + with gr.Row(): + with gr.Accordion(label='Lua Extras...', open=False): + with open(os.path.join(sd_webui_lua_dir, "extras.html"), encoding="utf8") as file: + extras = file.read() + gr.HTML(extras) + + return [(tab, "Lua", "lua")] + +x,y = lua_reset() + +script_callbacks.on_ui_tabs(add_tab) diff --git a/gf/sd-webui-lua/style.css b/gf/sd-webui-lua/style.css new file mode 100644 index 0000000000000000000000000000000000000000..e2f9564162da878d11ca15a85502a0aa65eec367 --- /dev/null +++ b/gf/sd-webui-lua/style.css @@ -0,0 +1,7 @@ +#lua_code { + background-color: #333333; +} + +.cm-line { + color: #AAFFEE; +} diff --git a/gf/sd-webui-refiner/.gitattributes b/gf/sd-webui-refiner/.gitattributes new file mode 100644 index 0000000000000000000000000000000000000000..dfe0770424b2a19faf507a501ebfc23be8f54e7b --- /dev/null +++ b/gf/sd-webui-refiner/.gitattributes @@ -0,0 +1,2 @@ +# Auto detect text files and perform LF normalization +* text=auto diff --git a/gf/sd-webui-refiner/README.md b/gf/sd-webui-refiner/README.md new file mode 100644 index 0000000000000000000000000000000000000000..0a8b4d4f9ca5b4c5ac8eff8becac914b6c9994dc --- /dev/null +++ b/gf/sd-webui-refiner/README.md @@ -0,0 +1,11 @@ +# Refiner (webui Extension) +## Webui Extension for integration refiner in generation process + +Extension loads from refiner checkpoint only UNET and replaces base UNET with it at last steps of generation. + +## Usage + +1. Activate extension and choose refiner checkpoint in extension settings on txt2img tab. +2. Set percent of refiner steps from total sampling steps. + +Use Tiled VAE if you have 12GB or less VRAM. 
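
As a rough, hypothetical sketch of the mechanism described above (the actual implementation lives in `scripts/refiner.py` below and works through WebUI denoiser callbacks), the behaviour can be pictured as picking which UNet denoises each step based on the configured refiner percentage. `choose_unet`, `base_unet`, `refiner_unet`, and `refiner_percent` are illustrative names only.

```python
# Minimal sketch of the "hand the last N% of steps to the refiner" idea.
# Hypothetical helper for illustration; the real extension swaps the UNet
# via script_callbacks hooks rather than a per-step function like this.

def choose_unet(step: int, total_steps: int, base_unet, refiner_unet,
                refiner_percent: float = 20.0):
    """Return the UNet that should denoise `step` out of `total_steps`."""
    refiner_steps = round(total_steps * refiner_percent / 100)
    if step >= total_steps - refiner_steps:
        return refiner_unet   # e.g. the final 20% of steps
    return base_unet          # e.g. the first 80% of steps
```
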
diff --git a/gf/sd-webui-refiner/scripts/__pycache__/refiner.cpython-310.pyc b/gf/sd-webui-refiner/scripts/__pycache__/refiner.cpython-310.pyc new file mode 100644 index 0000000000000000000000000000000000000000..4fb0a307c20935afba6d36dee256a6f1275924c1 Binary files /dev/null and b/gf/sd-webui-refiner/scripts/__pycache__/refiner.cpython-310.pyc differ diff --git a/gf/sd-webui-refiner/scripts/refiner.py b/gf/sd-webui-refiner/scripts/refiner.py new file mode 100644 index 0000000000000000000000000000000000000000..3f6e1d055e55f299706404ce09a6c53558aaad15 --- /dev/null +++ b/gf/sd-webui-refiner/scripts/refiner.py @@ -0,0 +1,146 @@ +from math import ceil +import torch +from modules import scripts, script_callbacks, devices, sd_models, sd_models_config, shared +import gradio as gr +import sgm.modules.diffusionmodules.denoiser_scaling +import sgm.modules.diffusionmodules.discretizer +from sgm.modules.encoders.modules import ConcatTimestepEmbedderND +from safetensors.torch import load_file, load +from sgm.modules.diffusionmodules.wrappers import OPENAIUNETWRAPPER +from omegaconf import OmegaConf +from sgm.util import ( + disabled_train, + get_obj_from_str, + instantiate_from_config, +) + +class Refiner(scripts.Script): + def __init__(self): + super().__init__() + self.callback_set = False + self.model = None + self.conditioner = None + self.base = None + self.swapped = False + self.model_name = '' + self.embedder = ConcatTimestepEmbedderND(256) + self.c_ae = None + self.uc_ae = None + + def title(self): + return "Refiner" + + def show(self, is_img2img): + return scripts.AlwaysVisible + + def build_model(self): + refiner_config = OmegaConf.load(sd_models_config.config_sdxl_refiner).model.params.network_config + self.model = instantiate_from_config(refiner_config) + self.model = get_obj_from_str(OPENAIUNETWRAPPER)( + self.model, compile_model=False + ).eval() + self.model.to('cpu', devices.dtype_unet) + self.model.train = disabled_train + self.model.diffusion_model.dtype = devices.dtype_unet + self.model.conditioning_key = 'crossattn' + self.model.cond_stage_key = 'txt' + self.model.parameterization = 'v' + discretization = sgm.modules.diffusionmodules.discretizer.LegacyDDPMDiscretization() + self.model.alphas_cumprod = torch.asarray(discretization.alphas_cumprod, device=devices.device, dtype=devices.dtype_unet) + for param in self.model.parameters(): + param.requires_grad = False + + def load_model(self, model_name): + if not shared.opts.disable_mmap_load_safetensors: + ckpt = load_file(sd_models.checkpoints_list[model_name].filename) + else: + ckpt = load(open(sd_models.checkpoints_list[model_name].filename, 'rb').read()) + model_type = '' + for key in ckpt.keys(): + if 'conditioner' in key: + model_type = 'Refiner' + if 'input_blocks.7.1.transformer_blocks.4.attn1.to_k.weight' in key: + model_type = 'Base' + break + if model_type != 'Refiner': + self.enable = False + script_callbacks.remove_current_script_callbacks() + if model_type == 'Base': + print('\nIt\'s Base model, use Refiner, extension disabled!\n') + else: + print('\nNot refiner, extension disabled!\n') + return False + + print('\nLoading refiner...\n') + self.build_model() + + state_dict = dict() + for key in ckpt.keys(): + if 'model.diffusion_model' in key: + state_dict[key.replace('model.d', 'd')] = ckpt[key].half() + self.model.load_state_dict(state_dict) + self.model_name = model_name + return True + + def ui(self, is_img2img): + with gr.Accordion(label='Refiner', open=False): + enable = gr.Checkbox(label='Enable Refiner', 
value=False) + with gr.Row(): + checkpoint = gr.Dropdown(choices=['None', *sd_models.checkpoints_list.keys()], label='Model', value='None') + steps = gr.Slider(minimum=0, maximum=50, step=1, label='Percent of refiner steps from total sampling steps', value=20) + + gr.HTML('
It\'s recommended to keep the percentage at 20% (80% base steps, 20% refiner steps). Higher values may result in distortions.
') + + ui = [enable, checkpoint, steps] + return ui + + def process(self, p, enable, checkpoint, steps): + if self.base != None or self.swapped == True or self.callback_set == True: + self.reset(p) + if not enable or checkpoint == 'None': + script_callbacks.remove_current_script_callbacks() + self.model = None + return + if self.model == None or self.model_name != checkpoint: + if not self.load_model(checkpoint): return + self.c_ae = self.embedder(torch.tensor(shared.opts.sdxl_refiner_high_aesthetic_score).unsqueeze(0).to(devices.device).repeat(p.batch_size, 1)) + self.uc_ae = self.embedder(torch.tensor(shared.opts.sdxl_refiner_low_aesthetic_score).unsqueeze(0).to(devices.device).repeat(p.batch_size, 1)) + p.extra_generation_params['Refiner model'] = checkpoint.rsplit('.', 1)[0] + p.extra_generation_params['Refiner steps'] = ceil((p.steps * (steps / 100))) + + def denoiser_callback(params: script_callbacks.CFGDenoiserParams): + if params.sampling_step > params.total_sampling_steps * (1 - steps / 100) - 2: + params.text_cond['vector'] = torch.cat((params.text_cond['vector'][:, :2304], self.c_ae), 1) + params.text_uncond['vector'] = torch.cat((params.text_uncond['vector'][:, :2304], self.uc_ae), 1) + params.text_cond['crossattn'] = params.text_cond['crossattn'][:, :, -1280:] + params.text_uncond['crossattn'] = params.text_uncond['crossattn'][:, :, -1280:] + if not self.swapped: + self.base = p.sd_model.model.to('cpu', devices.dtype_unet) + devices.torch_gc() + p.sd_model.model = self.model.to(devices.device, devices.dtype_unet) + self.swapped = True + + def denoised_callback(params: script_callbacks.CFGDenoiserParams): + if params.sampling_step == params.total_sampling_steps - 2: + self.reset(p, keep_hook=True) + + if not self.callback_set: + script_callbacks.on_cfg_denoiser(denoiser_callback) + script_callbacks.on_cfg_denoised(denoised_callback) + self.callback_set = True + + def reset(self, p, keep_hook=False): + if self.model is not None: + self.model.to('cpu', devices.dtype_unet) + p.sd_model.model = (self.base or p.sd_model.model).to(devices.device, devices.dtype_unet) + devices.torch_gc() + self.base = None + self.swapped = False + if not keep_hook: + script_callbacks.remove_current_script_callbacks() + self.callback_set = False + + def postprocess(self, p, processed, enable, checkpoint, steps): + if enable and checkpoint != 'None': + self.reset(p) + diff --git a/gf/sd-webui-rich-text/.gitignore b/gf/sd-webui-rich-text/.gitignore new file mode 100644 index 0000000000000000000000000000000000000000..074342a4f9899f1fe6d65a9235113bc66e5adb5f --- /dev/null +++ b/gf/sd-webui-rich-text/.gitignore @@ -0,0 +1,4 @@ +venv +__pycache__/ +*.pyc +gradio_cached_examples/ \ No newline at end of file diff --git a/gf/sd-webui-rich-text/README.md b/gf/sd-webui-rich-text/README.md new file mode 100644 index 0000000000000000000000000000000000000000..5ce9c306c5ef277d17c7f68469db241ba3e2537b --- /dev/null +++ b/gf/sd-webui-rich-text/README.md @@ -0,0 +1,49 @@ +# Rich-Text-to-Image for Stable Diffusion WebUI +#### [Project Page](https://rich-text-to-image.github.io/) | [Paper](https://arxiv.org/abs/2304.06720) | [Code](https://github.com/songweige/rich-text-to-image) | [HuggingFace Demo](https://huggingface.co/spaces/songweig/rich-text-to-image) | [Video](https://youtu.be/ihDbAUh0LXk) + +The WebUI extension for integrating a rich-text editor for text-to-image generation. 
+ +![image](https://github.com/songweige/sd-webui-rich-text/assets/22885450/c57cf981-8332-41fb-8f47-b03238311ca4) + +This extension is for [AUTOMATIC1111's Stable Diffusion web UI](https://github.com/AUTOMATIC1111/stable-diffusion-webui); it allows the Web UI to add [rich-text-to-image](https://rich-text-to-image.github.io/) generation on top of the original Stable Diffusion model. + +## Installation + +1. Open the "Extensions" tab. +1. Open the "Install from URL" tab within it. +1. Enter the URL of this repo (https://github.com/songweige/sd-webui-rich-text) into "URL for extension's git repository". +1. Press the "Install" button. +1. Restart the Web UI. + +## Usage + +The extension currently supports [SD-v1.5](https://huggingface.co/runwayml/stable-diffusion-v1-5) (default), [SD-XL-v1.0](https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0), and [ANIMAGINE-XL](https://huggingface.co/Linaqruf/animagine-xl). The checkpoints are downloaded automatically when a model is first selected. Each rich-text attribute below maps onto a different generation control; a purely illustrative sketch follows at the end of this README. + + +#### Font Color + +![color](assets/color.png) + +Font color controls the precise color of the generated objects. + +#### Footnote + +![footnote](assets/footnote.png) + +Footnotes provide supplementary descriptions for the selected text elements. + +#### Font Style + +![style](assets/font.png) + +Just as a font style distinguishes individual text elements, it is used here to define the artistic style of specific regions in the generated image. + +#### Font Size + +![size](assets/size.png) + +Font size indicates the weight of each token in the final generation. + +## Acknowledgement + +The extension is built on the [extension-templates](https://github.com/udon-universe/stable-diffusion-webui-extension-templates). The rich-text editor is built on [Quill](https://quilljs.com/). The model code is built on [huggingface / diffusers](https://github.com/huggingface/diffusers#readme).
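
As a purely illustrative follow-up to the Usage section above: because the editor is built on Quill, a rich-text prompt can be pictured as a Quill-style delta, shown here as a plain Python dict. The attribute names and values below are assumptions for illustration only and may not match the exact schema the extension passes to the model.

```python
# Purely illustrative: a Quill-style "delta" for a rich-text prompt, written as
# a plain Python dict. Each attribute loosely corresponds to a control from the
# Usage section (color -> object color, size -> token weight, font -> regional
# artistic style, link -> footnote-like supplementary description).
# These attribute names are assumptions, not the extension's actual schema.
rich_text_prompt = {
    "ops": [
        {"insert": "a close-up of a rose", "attributes": {"color": "#b10f2e"}},  # precise object color
        {"insert": " in a garden, "},
        {"insert": "highly detailed", "attributes": {"size": "20px"}},           # heavier token weight
        {"insert": ", background "},
        {"insert": "in the style of Monet", "attributes": {"font": "mirza"}},    # regional style
        {"insert": "\n"},
    ]
}
```
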
diff --git a/gf/sd-webui-rich-text/__pycache__/share_btn.cpython-310.pyc b/gf/sd-webui-rich-text/__pycache__/share_btn.cpython-310.pyc new file mode 100644 index 0000000000000000000000000000000000000000..6a9cab5d43961a3b20f719c617a0a2ef7333a04f Binary files /dev/null and b/gf/sd-webui-rich-text/__pycache__/share_btn.cpython-310.pyc differ diff --git a/gf/sd-webui-rich-text/assets/color.png b/gf/sd-webui-rich-text/assets/color.png new file mode 100644 index 0000000000000000000000000000000000000000..aa01631ca032b7c466bcda04427baf043f3793c8 --- /dev/null +++ b/gf/sd-webui-rich-text/assets/color.png @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:02c0054b37908ec3ed5f08271c9794695ef72061e8dffa2fbd1c81bcdd7f7608 +size 1808879 diff --git a/gf/sd-webui-rich-text/assets/font.png b/gf/sd-webui-rich-text/assets/font.png new file mode 100644 index 0000000000000000000000000000000000000000..790340fd094b49c3f4c11a6a0da33c8044c57c82 --- /dev/null +++ b/gf/sd-webui-rich-text/assets/font.png @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:6532f8df043dd5abd6ecbc57511d47592ab1fdda6f11204ce4f2d49d7679d666 +size 1752028 diff --git a/gf/sd-webui-rich-text/assets/footnote.png b/gf/sd-webui-rich-text/assets/footnote.png new file mode 100644 index 0000000000000000000000000000000000000000..5b0ae8605bd0ff35188a48adf864d52d5428b3a1 --- /dev/null +++ b/gf/sd-webui-rich-text/assets/footnote.png @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:c0c5f4695baee5e75a593ac50fb6cd9efedcc8d3aeaf5c82fbe90734962bbb69 +size 1486571 diff --git a/gf/sd-webui-rich-text/assets/size.png b/gf/sd-webui-rich-text/assets/size.png new file mode 100644 index 0000000000000000000000000000000000000000..14e2f73b4b7be66c758747df7b1f93384e7f65f0 --- /dev/null +++ b/gf/sd-webui-rich-text/assets/size.png @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:e577682573b36c021381e11371c4916145a8c8ca02224a80639741f81ff18955 +size 1867147 diff --git a/gf/sd-webui-rich-text/diffusers_official/__init__.py b/gf/sd-webui-rich-text/diffusers_official/__init__.py new file mode 100644 index 0000000000000000000000000000000000000000..f455dcc3961ebce98602ba3dcb9631f5c8f6178a --- /dev/null +++ b/gf/sd-webui-rich-text/diffusers_official/__init__.py @@ -0,0 +1,175 @@ +__version__ = "0.18.2" + +from .configuration_utils import ConfigMixin +from .utils import ( + OptionalDependencyNotAvailable, + is_flax_available, + is_inflect_available, + is_invisible_watermark_available, + is_k_diffusion_available, + is_k_diffusion_version, + is_librosa_available, + is_note_seq_available, + is_onnx_available, + is_scipy_available, + is_torch_available, + is_torchsde_available, + is_transformers_available, + is_transformers_version, + is_unidecode_available, + logging, +) + + +try: + if not is_onnx_available(): + raise OptionalDependencyNotAvailable() +except OptionalDependencyNotAvailable: + from .utils.dummy_onnx_objects import * # noqa F403 +else: + from .pipelines import OnnxRuntimeModel + +try: + if not is_torch_available(): + raise OptionalDependencyNotAvailable() +except OptionalDependencyNotAvailable: + from .utils.dummy_pt_objects import * # noqa F403 +else: + from .models import ( + AutoencoderKL, + ControlNetModel, + ModelMixin, + PriorTransformer, + T5FilmDecoder, + Transformer2DModel, + UNet1DModel, + UNet2DConditionModel, + UNet2DModel, + UNet3DConditionModel, + VQModel, + ) + from .optimization import ( + get_constant_schedule, + get_constant_schedule_with_warmup, + 
get_cosine_schedule_with_warmup, + get_cosine_with_hard_restarts_schedule_with_warmup, + get_linear_schedule_with_warmup, + get_polynomial_decay_schedule_with_warmup, + get_scheduler, + ) + from .schedulers import ( + CMStochasticIterativeScheduler, + DDIMInverseScheduler, + DDIMParallelScheduler, + DDIMScheduler, + DDPMParallelScheduler, + DDPMScheduler, + DEISMultistepScheduler, + DPMSolverMultistepInverseScheduler, + DPMSolverMultistepScheduler, + DPMSolverSinglestepScheduler, + EulerAncestralDiscreteScheduler, + EulerDiscreteScheduler, + HeunDiscreteScheduler, + IPNDMScheduler, + KarrasVeScheduler, + KDPM2AncestralDiscreteScheduler, + KDPM2DiscreteScheduler, + PNDMScheduler, + RePaintScheduler, + SchedulerMixin, + ScoreSdeVeScheduler, + UnCLIPScheduler, + UniPCMultistepScheduler, + VQDiffusionScheduler, + ) + from .training_utils import EMAModel + +try: + if not (is_torch_available() and is_scipy_available()): + raise OptionalDependencyNotAvailable() +except OptionalDependencyNotAvailable: + from .utils.dummy_torch_and_scipy_objects import * # noqa F403 +else: + from .schedulers import LMSDiscreteScheduler + +try: + if not (is_torch_available() and is_torchsde_available()): + raise OptionalDependencyNotAvailable() +except OptionalDependencyNotAvailable: + from .utils.dummy_torch_and_torchsde_objects import * # noqa F403 +else: + from .schedulers import DPMSolverSDEScheduler + +try: + if not (is_torch_available() and is_transformers_available()): + raise OptionalDependencyNotAvailable() +except OptionalDependencyNotAvailable: + from .utils.dummy_torch_and_transformers_objects import * # noqa F403 + +try: + if not (is_torch_available() and is_transformers_available() and is_invisible_watermark_available()): + raise OptionalDependencyNotAvailable() +except OptionalDependencyNotAvailable: + from .utils.dummy_torch_and_transformers_and_invisible_watermark_objects import * # noqa F403 +else: + from .pipelines import StableDiffusionXLImg2ImgPipeline, StableDiffusionXLPipeline + +try: + if not (is_torch_available() and is_transformers_available() and is_k_diffusion_available()): + raise OptionalDependencyNotAvailable() +except OptionalDependencyNotAvailable: + from .utils.dummy_torch_and_transformers_and_k_diffusion_objects import * # noqa F403 + +try: + if not (is_torch_available() and is_transformers_available() and is_onnx_available()): + raise OptionalDependencyNotAvailable() +except OptionalDependencyNotAvailable: + from .utils.dummy_torch_and_transformers_and_onnx_objects import * # noqa F403 + +try: + if not (is_torch_available() and is_librosa_available()): + raise OptionalDependencyNotAvailable() +except OptionalDependencyNotAvailable: + from .utils.dummy_torch_and_librosa_objects import * # noqa F403 + +try: + if not (is_transformers_available() and is_torch_available() and is_note_seq_available()): + raise OptionalDependencyNotAvailable() +except OptionalDependencyNotAvailable: + from .utils.dummy_transformers_and_torch_and_note_seq_objects import * # noqa F403 + +try: + if not is_flax_available(): + raise OptionalDependencyNotAvailable() +except OptionalDependencyNotAvailable: + from .utils.dummy_flax_objects import * # noqa F403 +else: + from .models.controlnet_flax import FlaxControlNetModel + from .models.modeling_flax_utils import FlaxModelMixin + from .models.unet_2d_condition_flax import FlaxUNet2DConditionModel + from .models.vae_flax import FlaxAutoencoderKL + from .pipelines import FlaxDiffusionPipeline + from .schedulers import ( + FlaxDDIMScheduler, + FlaxDDPMScheduler, 
+ FlaxDPMSolverMultistepScheduler, + FlaxKarrasVeScheduler, + FlaxLMSDiscreteScheduler, + FlaxPNDMScheduler, + FlaxSchedulerMixin, + FlaxScoreSdeVeScheduler, + ) + + +try: + if not (is_flax_available() and is_transformers_available()): + raise OptionalDependencyNotAvailable() +except OptionalDependencyNotAvailable: + from .utils.dummy_flax_and_transformers_objects import * # noqa F403 + +try: + if not (is_note_seq_available()): + raise OptionalDependencyNotAvailable() +except OptionalDependencyNotAvailable: + from .utils.dummy_note_seq_objects import * # noqa F403 diff --git a/gf/sd-webui-rich-text/diffusers_official/__pycache__/__init__.cpython-310.pyc b/gf/sd-webui-rich-text/diffusers_official/__pycache__/__init__.cpython-310.pyc new file mode 100644 index 0000000000000000000000000000000000000000..eb248b187fae6452d9dc179c78ef82cef0ab1fcb Binary files /dev/null and b/gf/sd-webui-rich-text/diffusers_official/__pycache__/__init__.cpython-310.pyc differ diff --git a/gf/sd-webui-rich-text/diffusers_official/__pycache__/configuration_utils.cpython-310.pyc b/gf/sd-webui-rich-text/diffusers_official/__pycache__/configuration_utils.cpython-310.pyc new file mode 100644 index 0000000000000000000000000000000000000000..3df1bb67f8402d73f1c16748b50a5ac3590f8dc3 Binary files /dev/null and b/gf/sd-webui-rich-text/diffusers_official/__pycache__/configuration_utils.cpython-310.pyc differ diff --git a/gf/sd-webui-rich-text/diffusers_official/__pycache__/image_processor.cpython-310.pyc b/gf/sd-webui-rich-text/diffusers_official/__pycache__/image_processor.cpython-310.pyc new file mode 100644 index 0000000000000000000000000000000000000000..08e463312de7dff6e9318d2306677bb88cf063c3 Binary files /dev/null and b/gf/sd-webui-rich-text/diffusers_official/__pycache__/image_processor.cpython-310.pyc differ diff --git a/gf/sd-webui-rich-text/diffusers_official/__pycache__/loaders.cpython-310.pyc b/gf/sd-webui-rich-text/diffusers_official/__pycache__/loaders.cpython-310.pyc new file mode 100644 index 0000000000000000000000000000000000000000..d887c0da151fe0360f0557850a2eb4fda8fe6ced Binary files /dev/null and b/gf/sd-webui-rich-text/diffusers_official/__pycache__/loaders.cpython-310.pyc differ diff --git a/gf/sd-webui-rich-text/diffusers_official/__pycache__/optimization.cpython-310.pyc b/gf/sd-webui-rich-text/diffusers_official/__pycache__/optimization.cpython-310.pyc new file mode 100644 index 0000000000000000000000000000000000000000..f2c01a856d66ec7bcb893fec25a1e726933f3bfd Binary files /dev/null and b/gf/sd-webui-rich-text/diffusers_official/__pycache__/optimization.cpython-310.pyc differ diff --git a/gf/sd-webui-rich-text/diffusers_official/__pycache__/training_utils.cpython-310.pyc b/gf/sd-webui-rich-text/diffusers_official/__pycache__/training_utils.cpython-310.pyc new file mode 100644 index 0000000000000000000000000000000000000000..f10e4d9e06a8df90d330e796cedd8a248aaf4e30 Binary files /dev/null and b/gf/sd-webui-rich-text/diffusers_official/__pycache__/training_utils.cpython-310.pyc differ diff --git a/gf/sd-webui-rich-text/diffusers_official/configuration_utils.py b/gf/sd-webui-rich-text/diffusers_official/configuration_utils.py new file mode 100644 index 0000000000000000000000000000000000000000..f5c8e8919c9fcd48de5a89e0664bd6c00643f515 --- /dev/null +++ b/gf/sd-webui-rich-text/diffusers_official/configuration_utils.py @@ -0,0 +1,664 @@ +# coding=utf-8 +# Copyright 2023 The HuggingFace Inc. team. +# Copyright (c) 2022, NVIDIA CORPORATION. All rights reserved. 
+# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +""" ConfigMixin base class and utilities.""" +import dataclasses +import functools +import importlib +import inspect +import json +import os +import re +from collections import OrderedDict +from pathlib import PosixPath +from typing import Any, Dict, Tuple, Union + +import numpy as np +from huggingface_hub import hf_hub_download +from huggingface_hub.utils import EntryNotFoundError, RepositoryNotFoundError, RevisionNotFoundError +from requests import HTTPError + +from . import __version__ +from .utils import ( + DIFFUSERS_CACHE, + HUGGINGFACE_CO_RESOLVE_ENDPOINT, + DummyObject, + deprecate, + extract_commit_hash, + http_user_agent, + logging, +) + + +logger = logging.get_logger(__name__) + +_re_configuration_file = re.compile(r"config\.(.*)\.json") + + +class FrozenDict(OrderedDict): + def __init__(self, *args, **kwargs): + super().__init__(*args, **kwargs) + + for key, value in self.items(): + setattr(self, key, value) + + self.__frozen = True + + def __delitem__(self, *args, **kwargs): + raise Exception(f"You cannot use ``__delitem__`` on a {self.__class__.__name__} instance.") + + def setdefault(self, *args, **kwargs): + raise Exception(f"You cannot use ``setdefault`` on a {self.__class__.__name__} instance.") + + def pop(self, *args, **kwargs): + raise Exception(f"You cannot use ``pop`` on a {self.__class__.__name__} instance.") + + def update(self, *args, **kwargs): + raise Exception(f"You cannot use ``update`` on a {self.__class__.__name__} instance.") + + def __setattr__(self, name, value): + if hasattr(self, "__frozen") and self.__frozen: + raise Exception(f"You cannot use ``__setattr__`` on a {self.__class__.__name__} instance.") + super().__setattr__(name, value) + + def __setitem__(self, name, value): + if hasattr(self, "__frozen") and self.__frozen: + raise Exception(f"You cannot use ``__setattr__`` on a {self.__class__.__name__} instance.") + super().__setitem__(name, value) + + +class ConfigMixin: + r""" + Base class for all configuration classes. All configuration parameters are stored under `self.config`. Also + provides the [`~ConfigMixin.from_config`] and [`~ConfigMixin.save_config`] methods for loading, downloading, and + saving classes that inherit from [`ConfigMixin`]. + + Class attributes: + - **config_name** (`str`) -- A filename under which the config should stored when calling + [`~ConfigMixin.save_config`] (should be overridden by parent class). + - **ignore_for_config** (`List[str]`) -- A list of attributes that should not be saved in the config (should be + overridden by subclass). + - **has_compatibles** (`bool`) -- Whether the class has compatible classes (should be overridden by subclass). + - **_deprecated_kwargs** (`List[str]`) -- Keyword arguments that are deprecated. Note that the `init` function + should only have a `kwargs` argument if at least one argument is deprecated (should be overridden by + subclass). 
+ """ + config_name = None + ignore_for_config = [] + has_compatibles = False + + _deprecated_kwargs = [] + + def register_to_config(self, **kwargs): + if self.config_name is None: + raise NotImplementedError(f"Make sure that {self.__class__} has defined a class name `config_name`") + # Special case for `kwargs` used in deprecation warning added to schedulers + # TODO: remove this when we remove the deprecation warning, and the `kwargs` argument, + # or solve in a more general way. + kwargs.pop("kwargs", None) + + if not hasattr(self, "_internal_dict"): + internal_dict = kwargs + else: + previous_dict = dict(self._internal_dict) + internal_dict = {**self._internal_dict, **kwargs} + logger.debug(f"Updating config from {previous_dict} to {internal_dict}") + + self._internal_dict = FrozenDict(internal_dict) + + def __getattr__(self, name: str) -> Any: + """The only reason we overwrite `getattr` here is to gracefully deprecate accessing + config attributes directly. See https://github.com/huggingface/diffusers/pull/3129 + + Tihs funtion is mostly copied from PyTorch's __getattr__ overwrite: + https://pytorch.org/docs/stable/_modules/torch/nn/modules/module.html#Module + """ + + is_in_config = "_internal_dict" in self.__dict__ and hasattr(self.__dict__["_internal_dict"], name) + is_attribute = name in self.__dict__ + + if is_in_config and not is_attribute: + deprecation_message = f"Accessing config attribute `{name}` directly via '{type(self).__name__}' object attribute is deprecated. Please access '{name}' over '{type(self).__name__}'s config object instead, e.g. 'scheduler.config.{name}'." + deprecate("direct config name access", "1.0.0", deprecation_message, standard_warn=False) + return self._internal_dict[name] + + raise AttributeError(f"'{type(self).__name__}' object has no attribute '{name}'") + + def save_config(self, save_directory: Union[str, os.PathLike], push_to_hub: bool = False, **kwargs): + """ + Save a configuration object to the directory specified in `save_directory` so that it can be reloaded using the + [`~ConfigMixin.from_config`] class method. + + Args: + save_directory (`str` or `os.PathLike`): + Directory where the configuration JSON file is saved (will be created if it does not exist). + """ + if os.path.isfile(save_directory): + raise AssertionError(f"Provided path ({save_directory}) should be a directory, not a file") + + os.makedirs(save_directory, exist_ok=True) + + # If we save using the predefined names, we can load using `from_config` + output_config_file = os.path.join(save_directory, self.config_name) + + self.to_json_file(output_config_file) + logger.info(f"Configuration saved in {output_config_file}") + + @classmethod + def from_config(cls, config: Union[FrozenDict, Dict[str, Any]] = None, return_unused_kwargs=False, **kwargs): + r""" + Instantiate a Python class from a config dictionary. + + Parameters: + config (`Dict[str, Any]`): + A config dictionary from which the Python class is instantiated. Make sure to only load configuration + files of compatible classes. + return_unused_kwargs (`bool`, *optional*, defaults to `False`): + Whether kwargs that are not consumed by the Python class should be returned or not. + kwargs (remaining dictionary of keyword arguments, *optional*): + Can be used to update the configuration object (after it is loaded) and initiate the Python class. + `**kwargs` are passed directly to the underlying scheduler/model's `__init__` method and eventually + overwrite the same named arguments in `config`. 
+ + Returns: + [`ModelMixin`] or [`SchedulerMixin`]: + A model or scheduler object instantiated from a config dictionary. + + Examples: + + ```python + >>> from diffusers import DDPMScheduler, DDIMScheduler, PNDMScheduler + + >>> # Download scheduler from huggingface.co and cache. + >>> scheduler = DDPMScheduler.from_pretrained("google/ddpm-cifar10-32") + + >>> # Instantiate DDIM scheduler class with same config as DDPM + >>> scheduler = DDIMScheduler.from_config(scheduler.config) + + >>> # Instantiate PNDM scheduler class with same config as DDPM + >>> scheduler = PNDMScheduler.from_config(scheduler.config) + ``` + """ + # <===== TO BE REMOVED WITH DEPRECATION + # TODO(Patrick) - make sure to remove the following lines when config=="model_path" is deprecated + if "pretrained_model_name_or_path" in kwargs: + config = kwargs.pop("pretrained_model_name_or_path") + + if config is None: + raise ValueError("Please make sure to provide a config as the first positional argument.") + # ======> + + if not isinstance(config, dict): + deprecation_message = "It is deprecated to pass a pretrained model name or path to `from_config`." + if "Scheduler" in cls.__name__: + deprecation_message += ( + f"If you were trying to load a scheduler, please use {cls}.from_pretrained(...) instead." + " Otherwise, please make sure to pass a configuration dictionary instead. This functionality will" + " be removed in v1.0.0." + ) + elif "Model" in cls.__name__: + deprecation_message += ( + f"If you were trying to load a model, please use {cls}.load_config(...) followed by" + f" {cls}.from_config(...) instead. Otherwise, please make sure to pass a configuration dictionary" + " instead. This functionality will be removed in v1.0.0." + ) + deprecate("config-passed-as-path", "1.0.0", deprecation_message, standard_warn=False) + config, kwargs = cls.load_config(pretrained_model_name_or_path=config, return_unused_kwargs=True, **kwargs) + + init_dict, unused_kwargs, hidden_dict = cls.extract_init_dict(config, **kwargs) + + # Allow dtype to be specified on initialization + if "dtype" in unused_kwargs: + init_dict["dtype"] = unused_kwargs.pop("dtype") + + # add possible deprecated kwargs + for deprecated_kwarg in cls._deprecated_kwargs: + if deprecated_kwarg in unused_kwargs: + init_dict[deprecated_kwarg] = unused_kwargs.pop(deprecated_kwarg) + + # Return model and optionally state and/or unused_kwargs + model = cls(**init_dict) + + # make sure to also save config parameters that might be used for compatible classes + model.register_to_config(**hidden_dict) + + # add hidden kwargs of compatible classes to unused_kwargs + unused_kwargs = {**unused_kwargs, **hidden_dict} + + if return_unused_kwargs: + return (model, unused_kwargs) + else: + return model + + @classmethod + def get_config_dict(cls, *args, **kwargs): + deprecation_message = ( + f" The function get_config_dict is deprecated. Please use {cls}.load_config instead. This function will be" + " removed in version v1.0.0" + ) + deprecate("get_config_dict", "1.0.0", deprecation_message, standard_warn=False) + return cls.load_config(*args, **kwargs) + + @classmethod + def load_config( + cls, + pretrained_model_name_or_path: Union[str, os.PathLike], + return_unused_kwargs=False, + return_commit_hash=False, + **kwargs, + ) -> Tuple[Dict[str, Any], Dict[str, Any]]: + r""" + Load a model or scheduler configuration. 
+ + Parameters: + pretrained_model_name_or_path (`str` or `os.PathLike`, *optional*): + Can be either: + + - A string, the *model id* (for example `google/ddpm-celebahq-256`) of a pretrained model hosted on + the Hub. + - A path to a *directory* (for example `./my_model_directory`) containing model weights saved with + [`~ConfigMixin.save_config`]. + + cache_dir (`Union[str, os.PathLike]`, *optional*): + Path to a directory where a downloaded pretrained model configuration is cached if the standard cache + is not used. + force_download (`bool`, *optional*, defaults to `False`): + Whether or not to force the (re-)download of the model weights and configuration files, overriding the + cached versions if they exist. + resume_download (`bool`, *optional*, defaults to `False`): + Whether or not to resume downloading the model weights and configuration files. If set to `False`, any + incompletely downloaded files are deleted. + proxies (`Dict[str, str]`, *optional*): + A dictionary of proxy servers to use by protocol or endpoint, for example, `{'http': 'foo.bar:3128', + 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request. + output_loading_info(`bool`, *optional*, defaults to `False`): + Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages. + local_files_only (`bool`, *optional*, defaults to `False`): + Whether to only load local model weights and configuration files or not. If set to `True`, the model + won't be downloaded from the Hub. + use_auth_token (`str` or *bool*, *optional*): + The token to use as HTTP bearer authorization for remote files. If `True`, the token generated from + `diffusers-cli login` (stored in `~/.huggingface`) is used. + revision (`str`, *optional*, defaults to `"main"`): + The specific model version to use. It can be a branch name, a tag name, a commit id, or any identifier + allowed by Git. + subfolder (`str`, *optional*, defaults to `""`): + The subfolder location of a model file within a larger model repository on the Hub or locally. + return_unused_kwargs (`bool`, *optional*, defaults to `False): + Whether unused keyword arguments of the config are returned. + return_commit_hash (`bool`, *optional*, defaults to `False): + Whether the `commit_hash` of the loaded configuration are returned. + + Returns: + `dict`: + A dictionary of all the parameters stored in a JSON configuration file. + + """ + cache_dir = kwargs.pop("cache_dir", DIFFUSERS_CACHE) + force_download = kwargs.pop("force_download", False) + resume_download = kwargs.pop("resume_download", False) + proxies = kwargs.pop("proxies", None) + use_auth_token = kwargs.pop("use_auth_token", None) + local_files_only = kwargs.pop("local_files_only", False) + revision = kwargs.pop("revision", None) + _ = kwargs.pop("mirror", None) + subfolder = kwargs.pop("subfolder", None) + user_agent = kwargs.pop("user_agent", {}) + + user_agent = {**user_agent, "file_type": "config"} + user_agent = http_user_agent(user_agent) + + pretrained_model_name_or_path = str(pretrained_model_name_or_path) + + if cls.config_name is None: + raise ValueError( + "`self.config_name` is not defined. Note that one should not load a config from " + "`ConfigMixin`. 
Please make sure to define `config_name` in a class inheriting from `ConfigMixin`" + ) + + if os.path.isfile(pretrained_model_name_or_path): + config_file = pretrained_model_name_or_path + elif os.path.isdir(pretrained_model_name_or_path): + if os.path.isfile(os.path.join(pretrained_model_name_or_path, cls.config_name)): + # Load from a PyTorch checkpoint + config_file = os.path.join(pretrained_model_name_or_path, cls.config_name) + elif subfolder is not None and os.path.isfile( + os.path.join(pretrained_model_name_or_path, subfolder, cls.config_name) + ): + config_file = os.path.join(pretrained_model_name_or_path, subfolder, cls.config_name) + else: + raise EnvironmentError( + f"Error no file named {cls.config_name} found in directory {pretrained_model_name_or_path}." + ) + else: + try: + # Load from URL or cache if already cached + config_file = hf_hub_download( + pretrained_model_name_or_path, + filename=cls.config_name, + cache_dir=cache_dir, + force_download=force_download, + proxies=proxies, + resume_download=resume_download, + local_files_only=local_files_only, + use_auth_token=use_auth_token, + user_agent=user_agent, + subfolder=subfolder, + revision=revision, + ) + except RepositoryNotFoundError: + raise EnvironmentError( + f"{pretrained_model_name_or_path} is not a local folder and is not a valid model identifier" + " listed on 'https://huggingface.co/models'\nIf this is a private repository, make sure to pass a" + " token having permission to this repo with `use_auth_token` or log in with `huggingface-cli" + " login`." + ) + except RevisionNotFoundError: + raise EnvironmentError( + f"{revision} is not a valid git identifier (branch name, tag name or commit id) that exists for" + " this model name. Check the model page at" + f" 'https://huggingface.co/{pretrained_model_name_or_path}' for available revisions." + ) + except EntryNotFoundError: + raise EnvironmentError( + f"{pretrained_model_name_or_path} does not appear to have a file named {cls.config_name}." + ) + except HTTPError as err: + raise EnvironmentError( + "There was a specific connection error when trying to load" + f" {pretrained_model_name_or_path}:\n{err}" + ) + except ValueError: + raise EnvironmentError( + f"We couldn't connect to '{HUGGINGFACE_CO_RESOLVE_ENDPOINT}' to load this model, couldn't find it" + f" in the cached files and it looks like {pretrained_model_name_or_path} is not the path to a" + f" directory containing a {cls.config_name} file.\nCheckout your internet connection or see how to" + " run the library in offline mode at" + " 'https://huggingface.co/docs/diffusers/installation#offline-mode'." + ) + except EnvironmentError: + raise EnvironmentError( + f"Can't load config for '{pretrained_model_name_or_path}'. If you were trying to load it from " + "'https://huggingface.co/models', make sure you don't have a local directory with the same name. 
" + f"Otherwise, make sure '{pretrained_model_name_or_path}' is the correct path to a directory " + f"containing a {cls.config_name} file" + ) + + try: + # Load config dict + config_dict = cls._dict_from_json_file(config_file) + + commit_hash = extract_commit_hash(config_file) + except (json.JSONDecodeError, UnicodeDecodeError): + raise EnvironmentError(f"It looks like the config file at '{config_file}' is not a valid JSON file.") + + if not (return_unused_kwargs or return_commit_hash): + return config_dict + + outputs = (config_dict,) + + if return_unused_kwargs: + outputs += (kwargs,) + + if return_commit_hash: + outputs += (commit_hash,) + + return outputs + + @staticmethod + def _get_init_keys(cls): + return set(dict(inspect.signature(cls.__init__).parameters).keys()) + + @classmethod + def extract_init_dict(cls, config_dict, **kwargs): + # Skip keys that were not present in the original config, so default __init__ values were used + used_defaults = config_dict.get("_use_default_values", []) + config_dict = {k: v for k, v in config_dict.items() if k not in used_defaults and k != "_use_default_values"} + + # 0. Copy origin config dict + original_dict = dict(config_dict.items()) + + # 1. Retrieve expected config attributes from __init__ signature + expected_keys = cls._get_init_keys(cls) + expected_keys.remove("self") + # remove general kwargs if present in dict + if "kwargs" in expected_keys: + expected_keys.remove("kwargs") + # remove flax internal keys + if hasattr(cls, "_flax_internal_args"): + for arg in cls._flax_internal_args: + expected_keys.remove(arg) + + # 2. Remove attributes that cannot be expected from expected config attributes + # remove keys to be ignored + if len(cls.ignore_for_config) > 0: + expected_keys = expected_keys - set(cls.ignore_for_config) + + # load diffusers library to import compatible and original scheduler + diffusers_library = importlib.import_module(__name__.split(".")[0]) + + if cls.has_compatibles: + compatible_classes = [c for c in cls._get_compatibles() if not isinstance(c, DummyObject)] + else: + compatible_classes = [] + + expected_keys_comp_cls = set() + for c in compatible_classes: + expected_keys_c = cls._get_init_keys(c) + expected_keys_comp_cls = expected_keys_comp_cls.union(expected_keys_c) + expected_keys_comp_cls = expected_keys_comp_cls - cls._get_init_keys(cls) + config_dict = {k: v for k, v in config_dict.items() if k not in expected_keys_comp_cls} + + # remove attributes from orig class that cannot be expected + orig_cls_name = config_dict.pop("_class_name", cls.__name__) + if orig_cls_name != cls.__name__ and hasattr(diffusers_library, orig_cls_name): + orig_cls = getattr(diffusers_library, orig_cls_name) + unexpected_keys_from_orig = cls._get_init_keys(orig_cls) - expected_keys + config_dict = {k: v for k, v in config_dict.items() if k not in unexpected_keys_from_orig} + + # remove private attributes + config_dict = {k: v for k, v in config_dict.items() if not k.startswith("_")} + + # 3. Create keyword arguments that will be passed to __init__ from expected keyword arguments + init_dict = {} + for key in expected_keys: + # if config param is passed to kwarg and is present in config dict + # it should overwrite existing config dict key + if key in kwargs and key in config_dict: + config_dict[key] = kwargs.pop(key) + + if key in kwargs: + # overwrite key + init_dict[key] = kwargs.pop(key) + elif key in config_dict: + # use value from config dict + init_dict[key] = config_dict.pop(key) + + # 4. 
Give nice warning if unexpected values have been passed + if len(config_dict) > 0: + logger.warning( + f"The config attributes {config_dict} were passed to {cls.__name__}, " + "but are not expected and will be ignored. Please verify your " + f"{cls.config_name} configuration file." + ) + + # 5. Give nice info if config attributes are initiliazed to default because they have not been passed + passed_keys = set(init_dict.keys()) + if len(expected_keys - passed_keys) > 0: + logger.info( + f"{expected_keys - passed_keys} was not found in config. Values will be initialized to default values." + ) + + # 6. Define unused keyword arguments + unused_kwargs = {**config_dict, **kwargs} + + # 7. Define "hidden" config parameters that were saved for compatible classes + hidden_config_dict = {k: v for k, v in original_dict.items() if k not in init_dict} + + return init_dict, unused_kwargs, hidden_config_dict + + @classmethod + def _dict_from_json_file(cls, json_file: Union[str, os.PathLike]): + with open(json_file, "r", encoding="utf-8") as reader: + text = reader.read() + return json.loads(text) + + def __repr__(self): + return f"{self.__class__.__name__} {self.to_json_string()}" + + @property + def config(self) -> Dict[str, Any]: + """ + Returns the config of the class as a frozen dictionary + + Returns: + `Dict[str, Any]`: Config of the class. + """ + return self._internal_dict + + def to_json_string(self) -> str: + """ + Serializes the configuration instance to a JSON string. + + Returns: + `str`: + String containing all the attributes that make up the configuration instance in JSON format. + """ + config_dict = self._internal_dict if hasattr(self, "_internal_dict") else {} + config_dict["_class_name"] = self.__class__.__name__ + config_dict["_diffusers_version"] = __version__ + + def to_json_saveable(value): + if isinstance(value, np.ndarray): + value = value.tolist() + elif isinstance(value, PosixPath): + value = str(value) + return value + + config_dict = {k: to_json_saveable(v) for k, v in config_dict.items()} + # Don't save "_ignore_files" or "_use_default_values" + config_dict.pop("_ignore_files", None) + config_dict.pop("_use_default_values", None) + + return json.dumps(config_dict, indent=2, sort_keys=True) + "\n" + + def to_json_file(self, json_file_path: Union[str, os.PathLike]): + """ + Save the configuration instance's parameters to a JSON file. + + Args: + json_file_path (`str` or `os.PathLike`): + Path to the JSON file to save a configuration instance's parameters. + """ + with open(json_file_path, "w", encoding="utf-8") as writer: + writer.write(self.to_json_string()) + + +def register_to_config(init): + r""" + Decorator to apply on the init of classes inheriting from [`ConfigMixin`] so that all the arguments are + automatically sent to `self.register_for_config`. To ignore a specific argument accepted by the init but that + shouldn't be registered in the config, use the `ignore_for_config` class variable + + Warning: Once decorated, all private arguments (beginning with an underscore) are trashed and not sent to the init! + """ + + @functools.wraps(init) + def inner_init(self, *args, **kwargs): + # Ignore private kwargs in the init. + init_kwargs = {k: v for k, v in kwargs.items() if not k.startswith("_")} + config_init_kwargs = {k: v for k, v in kwargs.items() if k.startswith("_")} + if not isinstance(self, ConfigMixin): + raise RuntimeError( + f"`@register_for_config` was applied to {self.__class__.__name__} init method, but this class does " + "not inherit from `ConfigMixin`." 
+ ) + + ignore = getattr(self, "ignore_for_config", []) + # Get positional arguments aligned with kwargs + new_kwargs = {} + signature = inspect.signature(init) + parameters = { + name: p.default for i, (name, p) in enumerate(signature.parameters.items()) if i > 0 and name not in ignore + } + for arg, name in zip(args, parameters.keys()): + new_kwargs[name] = arg + + # Then add all kwargs + new_kwargs.update( + { + k: init_kwargs.get(k, default) + for k, default in parameters.items() + if k not in ignore and k not in new_kwargs + } + ) + + # Take note of the parameters that were not present in the loaded config + if len(set(new_kwargs.keys()) - set(init_kwargs)) > 0: + new_kwargs["_use_default_values"] = list(set(new_kwargs.keys()) - set(init_kwargs)) + + new_kwargs = {**config_init_kwargs, **new_kwargs} + getattr(self, "register_to_config")(**new_kwargs) + init(self, *args, **init_kwargs) + + return inner_init + + +def flax_register_to_config(cls): + original_init = cls.__init__ + + @functools.wraps(original_init) + def init(self, *args, **kwargs): + if not isinstance(self, ConfigMixin): + raise RuntimeError( + f"`@register_for_config` was applied to {self.__class__.__name__} init method, but this class does " + "not inherit from `ConfigMixin`." + ) + + # Ignore private kwargs in the init. Retrieve all passed attributes + init_kwargs = dict(kwargs.items()) + + # Retrieve default values + fields = dataclasses.fields(self) + default_kwargs = {} + for field in fields: + # ignore flax specific attributes + if field.name in self._flax_internal_args: + continue + if type(field.default) == dataclasses._MISSING_TYPE: + default_kwargs[field.name] = None + else: + default_kwargs[field.name] = getattr(self, field.name) + + # Make sure init_kwargs override default kwargs + new_kwargs = {**default_kwargs, **init_kwargs} + # dtype should be part of `init_kwargs`, but not `new_kwargs` + if "dtype" in new_kwargs: + new_kwargs.pop("dtype") + + # Get positional arguments aligned with kwargs + for i, arg in enumerate(args): + name = fields[i].name + new_kwargs[name] = arg + + # Take note of the parameters that were not present in the loaded config + if len(set(new_kwargs.keys()) - set(init_kwargs)) > 0: + new_kwargs["_use_default_values"] = list(set(new_kwargs.keys()) - set(init_kwargs)) + + getattr(self, "register_to_config")(**new_kwargs) + original_init(self, *args, **kwargs) + + cls.__init__ = init + return cls diff --git a/gf/sd-webui-rich-text/diffusers_official/dependency_versions_check.py b/gf/sd-webui-rich-text/diffusers_official/dependency_versions_check.py new file mode 100644 index 0000000000000000000000000000000000000000..4f8578c52957bf6c06decb0d97d3139437f0078f --- /dev/null +++ b/gf/sd-webui-rich-text/diffusers_official/dependency_versions_check.py @@ -0,0 +1,47 @@ +# Copyright 2023 The HuggingFace Team. All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+import sys + +from .dependency_versions_table import deps +from .utils.versions import require_version, require_version_core + + +# define which module versions we always want to check at run time +# (usually the ones defined in `install_requires` in setup.py) +# +# order specific notes: +# - tqdm must be checked before tokenizers + +pkgs_to_check_at_runtime = "python tqdm regex requests packaging filelock numpy tokenizers".split() +if sys.version_info < (3, 7): + pkgs_to_check_at_runtime.append("dataclasses") +if sys.version_info < (3, 8): + pkgs_to_check_at_runtime.append("importlib_metadata") + +for pkg in pkgs_to_check_at_runtime: + if pkg in deps: + if pkg == "tokenizers": + # must be loaded here, or else tqdm check may fail + from .utils import is_tokenizers_available + + if not is_tokenizers_available(): + continue # not required, check version only if installed + + require_version_core(deps[pkg]) + else: + raise ValueError(f"can't find {pkg} in {deps.keys()}, check dependency_versions_table.py") + + +def dep_version_check(pkg, hint=None): + require_version(deps[pkg], hint) diff --git a/gf/sd-webui-rich-text/diffusers_official/dependency_versions_table.py b/gf/sd-webui-rich-text/diffusers_official/dependency_versions_table.py new file mode 100644 index 0000000000000000000000000000000000000000..68c8e914d9405f4295d2d935b5e16a9d09bb4a63 --- /dev/null +++ b/gf/sd-webui-rich-text/diffusers_official/dependency_versions_table.py @@ -0,0 +1,44 @@ +# THIS FILE HAS BEEN AUTOGENERATED. To update: +# 1. modify the `_deps` dict in setup.py +# 2. run `make deps_table_update`` +deps = { + "Pillow": "Pillow", + "accelerate": "accelerate>=0.11.0", + "compel": "compel==0.1.8", + "black": "black~=23.1", + "datasets": "datasets", + "filelock": "filelock", + "flax": "flax>=0.4.1", + "hf-doc-builder": "hf-doc-builder>=0.3.0", + "huggingface-hub": "huggingface-hub>=0.13.2", + "requests-mock": "requests-mock==1.10.0", + "importlib_metadata": "importlib_metadata", + "invisible-watermark": "invisible-watermark", + "isort": "isort>=5.5.4", + "jax": "jax>=0.2.8,!=0.3.2", + "jaxlib": "jaxlib>=0.1.65", + "Jinja2": "Jinja2", + "k-diffusion": "k-diffusion>=0.0.12", + "torchsde": "torchsde", + "note_seq": "note_seq", + "librosa": "librosa", + "numpy": "numpy", + "omegaconf": "omegaconf", + "parameterized": "parameterized", + "protobuf": "protobuf>=3.20.3,<4", + "pytest": "pytest", + "pytest-timeout": "pytest-timeout", + "pytest-xdist": "pytest-xdist", + "ruff": "ruff>=0.0.241", + "safetensors": "safetensors", + "sentencepiece": "sentencepiece>=0.1.91,!=0.1.92", + "scipy": "scipy", + "onnx": "onnx", + "regex": "regex!=2019.12.17", + "requests": "requests", + "tensorboard": "tensorboard", + "torch": "torch>=1.4", + "torchvision": "torchvision", + "transformers": "transformers>=4.25.1", + "urllib3": "urllib3<=2.0.0", +} diff --git a/gf/sd-webui-rich-text/diffusers_official/image_processor.py b/gf/sd-webui-rich-text/diffusers_official/image_processor.py new file mode 100644 index 0000000000000000000000000000000000000000..6ccf9b465ebd4cd6ce48a40dfe45bbc70d1f3416 --- /dev/null +++ b/gf/sd-webui-rich-text/diffusers_official/image_processor.py @@ -0,0 +1,366 @@ +# Copyright 2023 The HuggingFace Team. All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. 
+# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +import warnings +from typing import List, Optional, Union + +import numpy as np +import PIL +import torch +from PIL import Image + +from .configuration_utils import ConfigMixin, register_to_config +from .utils import CONFIG_NAME, PIL_INTERPOLATION, deprecate + + +class VaeImageProcessor(ConfigMixin): + """ + Image processor for VAE. + + Args: + do_resize (`bool`, *optional*, defaults to `True`): + Whether to downscale the image's (height, width) dimensions to multiples of `vae_scale_factor`. Can accept + `height` and `width` arguments from [`image_processor.VaeImageProcessor.preprocess`] method. + vae_scale_factor (`int`, *optional*, defaults to `8`): + VAE scale factor. If `do_resize` is `True`, the image is automatically resized to multiples of this factor. + resample (`str`, *optional*, defaults to `lanczos`): + Resampling filter to use when resizing the image. + do_normalize (`bool`, *optional*, defaults to `True`): + Whether to normalize the image to [-1,1]. + do_convert_rgb (`bool`, *optional*, defaults to be `False`): + Whether to convert the images to RGB format. + """ + + config_name = CONFIG_NAME + + @register_to_config + def __init__( + self, + do_resize: bool = True, + vae_scale_factor: int = 8, + resample: str = "lanczos", + do_normalize: bool = True, + do_convert_rgb: bool = False, + ): + super().__init__() + + @staticmethod + def numpy_to_pil(images: np.ndarray) -> PIL.Image.Image: + """ + Convert a numpy image or a batch of images to a PIL image. + """ + if images.ndim == 3: + images = images[None, ...] + images = (images * 255).round().astype("uint8") + if images.shape[-1] == 1: + # special case for grayscale (single channel) images + pil_images = [Image.fromarray(image.squeeze(), mode="L") for image in images] + else: + pil_images = [Image.fromarray(image) for image in images] + + return pil_images + + @staticmethod + def pil_to_numpy(images: Union[List[PIL.Image.Image], PIL.Image.Image]) -> np.ndarray: + """ + Convert a PIL image or a list of PIL images to NumPy arrays. + """ + if not isinstance(images, list): + images = [images] + images = [np.array(image).astype(np.float32) / 255.0 for image in images] + images = np.stack(images, axis=0) + + return images + + @staticmethod + def numpy_to_pt(images: np.ndarray) -> torch.FloatTensor: + """ + Convert a NumPy image to a PyTorch tensor. + """ + if images.ndim == 3: + images = images[..., None] + + images = torch.from_numpy(images.transpose(0, 3, 1, 2)) + return images + + @staticmethod + def pt_to_numpy(images: torch.FloatTensor) -> np.ndarray: + """ + Convert a PyTorch tensor to a NumPy image. + """ + images = images.cpu().permute(0, 2, 3, 1).float().numpy() + return images + + @staticmethod + def normalize(images): + """ + Normalize an image array to [-1,1]. + """ + return 2.0 * images - 1.0 + + @staticmethod + def denormalize(images): + """ + Denormalize an image array to [0,1]. + """ + return (images / 2 + 0.5).clamp(0, 1) + + @staticmethod + def convert_to_rgb(image: PIL.Image.Image) -> PIL.Image.Image: + """ + Converts an image to RGB format. 
+ """ + image = image.convert("RGB") + return image + + def resize( + self, + image: PIL.Image.Image, + height: Optional[int] = None, + width: Optional[int] = None, + ) -> PIL.Image.Image: + """ + Resize a PIL image. Both height and width are downscaled to the next integer multiple of `vae_scale_factor`. + """ + if height is None: + height = image.height + if width is None: + width = image.width + + width, height = ( + x - x % self.config.vae_scale_factor for x in (width, height) + ) # resize to integer multiple of vae_scale_factor + image = image.resize((width, height), resample=PIL_INTERPOLATION[self.config.resample]) + return image + + def preprocess( + self, + image: Union[torch.FloatTensor, PIL.Image.Image, np.ndarray], + height: Optional[int] = None, + width: Optional[int] = None, + ) -> torch.Tensor: + """ + Preprocess the image input. Accepted formats are PIL images, NumPy arrays or PyTorch tensors. + """ + supported_formats = (PIL.Image.Image, np.ndarray, torch.Tensor) + if isinstance(image, supported_formats): + image = [image] + elif not (isinstance(image, list) and all(isinstance(i, supported_formats) for i in image)): + raise ValueError( + f"Input is in incorrect format: {[type(i) for i in image]}. Currently, we only support {', '.join(supported_formats)}" + ) + + if isinstance(image[0], PIL.Image.Image): + if self.config.do_convert_rgb: + image = [self.convert_to_rgb(i) for i in image] + if self.config.do_resize: + image = [self.resize(i, height, width) for i in image] + image = self.pil_to_numpy(image) # to np + image = self.numpy_to_pt(image) # to pt + + elif isinstance(image[0], np.ndarray): + image = np.concatenate(image, axis=0) if image[0].ndim == 4 else np.stack(image, axis=0) + image = self.numpy_to_pt(image) + _, _, height, width = image.shape + if self.config.do_resize and ( + height % self.config.vae_scale_factor != 0 or width % self.config.vae_scale_factor != 0 + ): + raise ValueError( + f"Currently we only support resizing for PIL image - please resize your numpy array to be divisible by {self.config.vae_scale_factor}" + f"currently the sizes are {height} and {width}. You can also pass a PIL image instead to use resize option in VAEImageProcessor" + ) + + elif isinstance(image[0], torch.Tensor): + image = torch.cat(image, axis=0) if image[0].ndim == 4 else torch.stack(image, axis=0) + _, channel, height, width = image.shape + + # don't need any preprocess if the image is latents + if channel == 4: + return image + + if self.config.do_resize and ( + height % self.config.vae_scale_factor != 0 or width % self.config.vae_scale_factor != 0 + ): + raise ValueError( + f"Currently we only support resizing for PIL image - please resize your pytorch tensor to be divisible by {self.config.vae_scale_factor}" + f"currently the sizes are {height} and {width}. You can also pass a PIL image instead to use resize option in VAEImageProcessor" + ) + + # expected range [0,1], normalize to [-1,1] + do_normalize = self.config.do_normalize + if image.min() < 0: + warnings.warn( + "Passing `image` as torch tensor with value range in [-1,1] is deprecated. The expected value range for image tensor is [0,1] " + f"when passing as pytorch tensor or numpy Array. 
You passed `image` with value range [{image.min()},{image.max()}]", + FutureWarning, + ) + do_normalize = False + + if do_normalize: + image = self.normalize(image) + + return image + + def postprocess( + self, + image: torch.FloatTensor, + output_type: str = "pil", + do_denormalize: Optional[List[bool]] = None, + ): + if not isinstance(image, torch.Tensor): + raise ValueError( + f"Input for postprocessing is in incorrect format: {type(image)}. We only support pytorch tensor" + ) + if output_type not in ["latent", "pt", "np", "pil"]: + deprecation_message = ( + f"the output_type {output_type} is outdated and has been set to `np`. Please make sure to set it to one of these instead: " + "`pil`, `np`, `pt`, `latent`" + ) + deprecate("Unsupported output_type", "1.0.0", deprecation_message, standard_warn=False) + output_type = "np" + + if output_type == "latent": + return image + + if do_denormalize is None: + do_denormalize = [self.config.do_normalize] * image.shape[0] + + image = torch.stack( + [self.denormalize(image[i]) if do_denormalize[i] else image[i] for i in range(image.shape[0])] + ) + + if output_type == "pt": + return image + + image = self.pt_to_numpy(image) + + if output_type == "np": + return image + + if output_type == "pil": + return self.numpy_to_pil(image) + + +class VaeImageProcessorLDM3D(VaeImageProcessor): + """ + Image processor for VAE LDM3D. + + Args: + do_resize (`bool`, *optional*, defaults to `True`): + Whether to downscale the image's (height, width) dimensions to multiples of `vae_scale_factor`. + vae_scale_factor (`int`, *optional*, defaults to `8`): + VAE scale factor. If `do_resize` is `True`, the image is automatically resized to multiples of this factor. + resample (`str`, *optional*, defaults to `lanczos`): + Resampling filter to use when resizing the image. + do_normalize (`bool`, *optional*, defaults to `True`): + Whether to normalize the image to [-1,1]. + """ + + config_name = CONFIG_NAME + + @register_to_config + def __init__( + self, + do_resize: bool = True, + vae_scale_factor: int = 8, + resample: str = "lanczos", + do_normalize: bool = True, + ): + super().__init__() + + @staticmethod + def numpy_to_pil(images): + """ + Convert a NumPy image or a batch of images to a PIL image. + """ + if images.ndim == 3: + images = images[None, ...] + images = (images * 255).round().astype("uint8") + if images.shape[-1] == 1: + # special case for grayscale (single channel) images + pil_images = [Image.fromarray(image.squeeze(), mode="L") for image in images] + else: + pil_images = [Image.fromarray(image[:, :, :3]) for image in images] + + return pil_images + + @staticmethod + def rgblike_to_depthmap(image): + """ + Args: + image: RGB-like depth image + + Returns: depth map + + """ + return image[:, :, 1] * 2**8 + image[:, :, 2] + + def numpy_to_depth(self, images): + """ + Convert a NumPy depth image or a batch of images to a PIL image. + """ + if images.ndim == 3: + images = images[None, ...] 
+ images_depth = images[:, :, :, 3:] + if images.shape[-1] == 6: + images_depth = (images_depth * 255).round().astype("uint8") + pil_images = [ + Image.fromarray(self.rgblike_to_depthmap(image_depth), mode="I;16") for image_depth in images_depth + ] + elif images.shape[-1] == 4: + images_depth = (images_depth * 65535.0).astype(np.uint16) + pil_images = [Image.fromarray(image_depth, mode="I;16") for image_depth in images_depth] + else: + raise Exception("Not supported") + + return pil_images + + def postprocess( + self, + image: torch.FloatTensor, + output_type: str = "pil", + do_denormalize: Optional[List[bool]] = None, + ): + if not isinstance(image, torch.Tensor): + raise ValueError( + f"Input for postprocessing is in incorrect format: {type(image)}. We only support pytorch tensor" + ) + if output_type not in ["latent", "pt", "np", "pil"]: + deprecation_message = ( + f"the output_type {output_type} is outdated and has been set to `np`. Please make sure to set it to one of these instead: " + "`pil`, `np`, `pt`, `latent`" + ) + deprecate("Unsupported output_type", "1.0.0", deprecation_message, standard_warn=False) + output_type = "np" + + if do_denormalize is None: + do_denormalize = [self.config.do_normalize] * image.shape[0] + + image = torch.stack( + [self.denormalize(image[i]) if do_denormalize[i] else image[i] for i in range(image.shape[0])] + ) + + image = self.pt_to_numpy(image) + + if output_type == "np": + if image.shape[-1] == 6: + image_depth = np.stack([self.rgblike_to_depthmap(im[:, :, 3:]) for im in image], axis=0) + else: + image_depth = image[:, :, :, 3:] + return image[:, :, :, :3], image_depth + + if output_type == "pil": + return self.numpy_to_pil(image), self.numpy_to_depth(image) + else: + raise Exception(f"This type {output_type} is not supported") diff --git a/gf/sd-webui-rich-text/diffusers_official/loaders.py b/gf/sd-webui-rich-text/diffusers_official/loaders.py new file mode 100644 index 0000000000000000000000000000000000000000..5d452b3b683663b97d7ad19493fe9b857cc3ccd7 --- /dev/null +++ b/gf/sd-webui-rich-text/diffusers_official/loaders.py @@ -0,0 +1,1492 @@ +# Copyright 2023 The HuggingFace Team. All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+import os +import warnings +from collections import defaultdict +from pathlib import Path +from typing import Callable, Dict, List, Optional, Union + +import torch +import torch.nn.functional as F +from huggingface_hub import hf_hub_download + +from .models.attention_processor import ( + AttnAddedKVProcessor, + AttnAddedKVProcessor2_0, + CustomDiffusionAttnProcessor, + CustomDiffusionXFormersAttnProcessor, + LoRAAttnAddedKVProcessor, + LoRAAttnProcessor, + LoRAAttnProcessor2_0, + LoRAXFormersAttnProcessor, + SlicedAttnAddedKVProcessor, + XFormersAttnProcessor, +) +from .utils import ( + DIFFUSERS_CACHE, + HF_HUB_OFFLINE, + TEXT_ENCODER_ATTN_MODULE, + _get_model_file, + deprecate, + is_safetensors_available, + is_transformers_available, + logging, +) + + +if is_safetensors_available(): + import safetensors + +if is_transformers_available(): + from transformers import PreTrainedModel, PreTrainedTokenizer + + +logger = logging.get_logger(__name__) + +TEXT_ENCODER_NAME = "text_encoder" +UNET_NAME = "unet" + +LORA_WEIGHT_NAME = "pytorch_lora_weights.bin" +LORA_WEIGHT_NAME_SAFE = "pytorch_lora_weights.safetensors" + +TEXT_INVERSION_NAME = "learned_embeds.bin" +TEXT_INVERSION_NAME_SAFE = "learned_embeds.safetensors" + +CUSTOM_DIFFUSION_WEIGHT_NAME = "pytorch_custom_diffusion_weights.bin" +CUSTOM_DIFFUSION_WEIGHT_NAME_SAFE = "pytorch_custom_diffusion_weights.safetensors" + + +class AttnProcsLayers(torch.nn.Module): + def __init__(self, state_dict: Dict[str, torch.Tensor]): + super().__init__() + self.layers = torch.nn.ModuleList(state_dict.values()) + self.mapping = dict(enumerate(state_dict.keys())) + self.rev_mapping = {v: k for k, v in enumerate(state_dict.keys())} + + # .processor for unet, .self_attn for text encoder + self.split_keys = [".processor", ".self_attn"] + + # we add a hook to state_dict() and load_state_dict() so that the + # naming fits with `unet.attn_processors` + def map_to(module, state_dict, *args, **kwargs): + new_state_dict = {} + for key, value in state_dict.items(): + num = int(key.split(".")[1]) # 0 is always "layers" + new_key = key.replace(f"layers.{num}", module.mapping[num]) + new_state_dict[new_key] = value + + return new_state_dict + + def remap_key(key, state_dict): + for k in self.split_keys: + if k in key: + return key.split(k)[0] + k + + raise ValueError( + f"There seems to be a problem with the state_dict: {set(state_dict.keys())}. {key} has to have one of {self.split_keys}." + ) + + def map_from(module, state_dict, *args, **kwargs): + all_keys = list(state_dict.keys()) + for key in all_keys: + replace_key = remap_key(key, state_dict) + new_key = key.replace(replace_key, f"layers.{module.rev_mapping[replace_key]}") + state_dict[new_key] = state_dict[key] + del state_dict[key] + + self._register_state_dict_hook(map_to) + self._register_load_state_dict_pre_hook(map_from, with_module=True) + + +class UNet2DConditionLoadersMixin: + text_encoder_name = TEXT_ENCODER_NAME + unet_name = UNET_NAME + + def load_attn_procs(self, pretrained_model_name_or_path_or_dict: Union[str, Dict[str, torch.Tensor]], **kwargs): + r""" + Load pretrained attention processor layers into [`UNet2DConditionModel`]. Attention processor layers have to be + defined in + [`cross_attention.py`](https://github.com/huggingface/diffusers/blob/main/src/diffusers/models/cross_attention.py) + and be a `torch.nn.Module` class. 
+ + Parameters: + pretrained_model_name_or_path_or_dict (`str` or `os.PathLike` or `dict`): + Can be either: + + - A string, the model id (for example `google/ddpm-celebahq-256`) of a pretrained model hosted on + the Hub. + - A path to a directory (for example `./my_model_directory`) containing the model weights saved + with [`ModelMixin.save_pretrained`]. + - A [torch state + dict](https://pytorch.org/tutorials/beginner/saving_loading_models.html#what-is-a-state-dict). + + cache_dir (`Union[str, os.PathLike]`, *optional*): + Path to a directory where a downloaded pretrained model configuration is cached if the standard cache + is not used. + force_download (`bool`, *optional*, defaults to `False`): + Whether or not to force the (re-)download of the model weights and configuration files, overriding the + cached versions if they exist. + resume_download (`bool`, *optional*, defaults to `False`): + Whether or not to resume downloading the model weights and configuration files. If set to `False`, any + incompletely downloaded files are deleted. + proxies (`Dict[str, str]`, *optional*): + A dictionary of proxy servers to use by protocol or endpoint, for example, `{'http': 'foo.bar:3128', + 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request. + local_files_only (`bool`, *optional*, defaults to `False`): + Whether to only load local model weights and configuration files or not. If set to `True`, the model + won't be downloaded from the Hub. + use_auth_token (`str` or *bool*, *optional*): + The token to use as HTTP bearer authorization for remote files. If `True`, the token generated from + `diffusers-cli login` (stored in `~/.huggingface`) is used. + revision (`str`, *optional*, defaults to `"main"`): + The specific model version to use. It can be a branch name, a tag name, a commit id, or any identifier + allowed by Git. + subfolder (`str`, *optional*, defaults to `""`): + The subfolder location of a model file within a larger model repository on the Hub or locally. + mirror (`str`, *optional*): + Mirror source to resolve accessibility issues if you’re downloading a model in China. We do not + guarantee the timeliness or safety of the source, and you should refer to the mirror site for more + information. + + """ + + cache_dir = kwargs.pop("cache_dir", DIFFUSERS_CACHE) + force_download = kwargs.pop("force_download", False) + resume_download = kwargs.pop("resume_download", False) + proxies = kwargs.pop("proxies", None) + local_files_only = kwargs.pop("local_files_only", HF_HUB_OFFLINE) + use_auth_token = kwargs.pop("use_auth_token", None) + revision = kwargs.pop("revision", None) + subfolder = kwargs.pop("subfolder", None) + weight_name = kwargs.pop("weight_name", None) + use_safetensors = kwargs.pop("use_safetensors", None) + # This value has the same meaning as the `--network_alpha` option in the kohya-ss trainer script. + # See https://github.com/darkstorm2150/sd-scripts/blob/main/docs/train_network_README-en.md#execute-learning + network_alpha = kwargs.pop("network_alpha", None) + + if use_safetensors and not is_safetensors_available(): + raise ValueError( + "`use_safetensors`=True but safetensors is not installed. 
Please install safetensors with `pip install safetensors" + ) + + allow_pickle = False + if use_safetensors is None: + use_safetensors = is_safetensors_available() + allow_pickle = True + + user_agent = { + "file_type": "attn_procs_weights", + "framework": "pytorch", + } + + model_file = None + if not isinstance(pretrained_model_name_or_path_or_dict, dict): + # Let's first try to load .safetensors weights + if (use_safetensors and weight_name is None) or ( + weight_name is not None and weight_name.endswith(".safetensors") + ): + try: + model_file = _get_model_file( + pretrained_model_name_or_path_or_dict, + weights_name=weight_name or LORA_WEIGHT_NAME_SAFE, + cache_dir=cache_dir, + force_download=force_download, + resume_download=resume_download, + proxies=proxies, + local_files_only=local_files_only, + use_auth_token=use_auth_token, + revision=revision, + subfolder=subfolder, + user_agent=user_agent, + ) + state_dict = safetensors.torch.load_file(model_file, device="cpu") + except IOError as e: + if not allow_pickle: + raise e + # try loading non-safetensors weights + pass + if model_file is None: + model_file = _get_model_file( + pretrained_model_name_or_path_or_dict, + weights_name=weight_name or LORA_WEIGHT_NAME, + cache_dir=cache_dir, + force_download=force_download, + resume_download=resume_download, + proxies=proxies, + local_files_only=local_files_only, + use_auth_token=use_auth_token, + revision=revision, + subfolder=subfolder, + user_agent=user_agent, + ) + state_dict = torch.load(model_file, map_location="cpu") + else: + state_dict = pretrained_model_name_or_path_or_dict + + # fill attn processors + attn_processors = {} + + is_lora = all("lora" in k for k in state_dict.keys()) + is_custom_diffusion = any("custom_diffusion" in k for k in state_dict.keys()) + + if is_lora: + is_new_lora_format = all( + key.startswith(self.unet_name) or key.startswith(self.text_encoder_name) for key in state_dict.keys() + ) + if is_new_lora_format: + # Strip the `"unet"` prefix. + is_text_encoder_present = any(key.startswith(self.text_encoder_name) for key in state_dict.keys()) + if is_text_encoder_present: + warn_message = "The state_dict contains LoRA params corresponding to the text encoder which are not being used here. To use both UNet and text encoder related LoRA params, use [`pipe.load_lora_weights()`](https://huggingface.co/docs/diffusers/main/en/api/loaders#diffusers.loaders.LoraLoaderMixin.load_lora_weights)." 
+ warnings.warn(warn_message) + unet_keys = [k for k in state_dict.keys() if k.startswith(self.unet_name)] + state_dict = {k.replace(f"{self.unet_name}.", ""): v for k, v in state_dict.items() if k in unet_keys} + + lora_grouped_dict = defaultdict(dict) + for key, value in state_dict.items(): + attn_processor_key, sub_key = ".".join(key.split(".")[:-3]), ".".join(key.split(".")[-3:]) + lora_grouped_dict[attn_processor_key][sub_key] = value + + for key, value_dict in lora_grouped_dict.items(): + rank = value_dict["to_k_lora.down.weight"].shape[0] + hidden_size = value_dict["to_k_lora.up.weight"].shape[0] + + attn_processor = self + for sub_key in key.split("."): + attn_processor = getattr(attn_processor, sub_key) + + if isinstance( + attn_processor, (AttnAddedKVProcessor, SlicedAttnAddedKVProcessor, AttnAddedKVProcessor2_0) + ): + cross_attention_dim = value_dict["add_k_proj_lora.down.weight"].shape[1] + attn_processor_class = LoRAAttnAddedKVProcessor + else: + cross_attention_dim = value_dict["to_k_lora.down.weight"].shape[1] + if isinstance(attn_processor, (XFormersAttnProcessor, LoRAXFormersAttnProcessor)): + attn_processor_class = LoRAXFormersAttnProcessor + else: + attn_processor_class = ( + LoRAAttnProcessor2_0 if hasattr(F, "scaled_dot_product_attention") else LoRAAttnProcessor + ) + + attn_processors[key] = attn_processor_class( + hidden_size=hidden_size, + cross_attention_dim=cross_attention_dim, + rank=rank, + network_alpha=network_alpha, + ) + attn_processors[key].load_state_dict(value_dict) + elif is_custom_diffusion: + custom_diffusion_grouped_dict = defaultdict(dict) + for key, value in state_dict.items(): + if len(value) == 0: + custom_diffusion_grouped_dict[key] = {} + else: + if "to_out" in key: + attn_processor_key, sub_key = ".".join(key.split(".")[:-3]), ".".join(key.split(".")[-3:]) + else: + attn_processor_key, sub_key = ".".join(key.split(".")[:-2]), ".".join(key.split(".")[-2:]) + custom_diffusion_grouped_dict[attn_processor_key][sub_key] = value + + for key, value_dict in custom_diffusion_grouped_dict.items(): + if len(value_dict) == 0: + attn_processors[key] = CustomDiffusionAttnProcessor( + train_kv=False, train_q_out=False, hidden_size=None, cross_attention_dim=None + ) + else: + cross_attention_dim = value_dict["to_k_custom_diffusion.weight"].shape[1] + hidden_size = value_dict["to_k_custom_diffusion.weight"].shape[0] + train_q_out = True if "to_q_custom_diffusion.weight" in value_dict else False + attn_processors[key] = CustomDiffusionAttnProcessor( + train_kv=True, + train_q_out=train_q_out, + hidden_size=hidden_size, + cross_attention_dim=cross_attention_dim, + ) + attn_processors[key].load_state_dict(value_dict) + else: + raise ValueError( + f"{model_file} does not seem to be in the correct format expected by LoRA or Custom Diffusion training." + ) + + # set correct dtype & device + attn_processors = {k: v.to(device=self.device, dtype=self.dtype) for k, v in attn_processors.items()} + + # set layers + self.set_attn_processor(attn_processors) + + def save_attn_procs( + self, + save_directory: Union[str, os.PathLike], + is_main_process: bool = True, + weight_name: str = None, + save_function: Callable = None, + safe_serialization: bool = False, + **kwargs, + ): + r""" + Save an attention processor to a directory so that it can be reloaded using the + [`~loaders.UNet2DConditionLoadersMixin.load_attn_procs`] method. + + Arguments: + save_directory (`str` or `os.PathLike`): + Directory to save an attention processor to. Will be created if it doesn't exist. 
+ is_main_process (`bool`, *optional*, defaults to `True`): + Whether the process calling this is the main process or not. Useful during distributed training and you + need to call this function on all processes. In this case, set `is_main_process=True` only on the main + process to avoid race conditions. + save_function (`Callable`): + The function to use to save the state dictionary. Useful during distributed training when you need to + replace `torch.save` with another method. Can be configured with the environment variable + `DIFFUSERS_SAVE_MODE`. + + """ + weight_name = weight_name or deprecate( + "weights_name", + "0.20.0", + "`weights_name` is deprecated, please use `weight_name` instead.", + take_from=kwargs, + ) + if os.path.isfile(save_directory): + logger.error(f"Provided path ({save_directory}) should be a directory, not a file") + return + + if save_function is None: + if safe_serialization: + + def save_function(weights, filename): + return safetensors.torch.save_file(weights, filename, metadata={"format": "pt"}) + + else: + save_function = torch.save + + os.makedirs(save_directory, exist_ok=True) + + is_custom_diffusion = any( + isinstance(x, (CustomDiffusionAttnProcessor, CustomDiffusionXFormersAttnProcessor)) + for (_, x) in self.attn_processors.items() + ) + if is_custom_diffusion: + model_to_save = AttnProcsLayers( + { + y: x + for (y, x) in self.attn_processors.items() + if isinstance(x, (CustomDiffusionAttnProcessor, CustomDiffusionXFormersAttnProcessor)) + } + ) + state_dict = model_to_save.state_dict() + for name, attn in self.attn_processors.items(): + if len(attn.state_dict()) == 0: + state_dict[name] = {} + else: + model_to_save = AttnProcsLayers(self.attn_processors) + state_dict = model_to_save.state_dict() + + if weight_name is None: + if safe_serialization: + weight_name = CUSTOM_DIFFUSION_WEIGHT_NAME_SAFE if is_custom_diffusion else LORA_WEIGHT_NAME_SAFE + else: + weight_name = CUSTOM_DIFFUSION_WEIGHT_NAME if is_custom_diffusion else LORA_WEIGHT_NAME + + # Save the model + save_function(state_dict, os.path.join(save_directory, weight_name)) + logger.info(f"Model weights saved in {os.path.join(save_directory, weight_name)}") + + +class TextualInversionLoaderMixin: + r""" + Load textual inversion tokens and embeddings to the tokenizer and text encoder. + """ + + def maybe_convert_prompt(self, prompt: Union[str, List[str]], tokenizer: "PreTrainedTokenizer"): + r""" + Processes prompts that include a special token corresponding to a multi-vector textual inversion embedding to + be replaced with multiple special tokens each corresponding to one of the vectors. If the prompt has no textual + inversion token or if the textual inversion token is a single vector, the input prompt is returned. + + Parameters: + prompt (`str` or list of `str`): + The prompt or prompts to guide the image generation. + tokenizer (`PreTrainedTokenizer`): + The tokenizer responsible for encoding the prompt into input tokens. + + Returns: + `str` or list of `str`: The converted prompt + """ + if not isinstance(prompt, List): + prompts = [prompt] + else: + prompts = prompt + + prompts = [self._maybe_convert_prompt(p, tokenizer) for p in prompts] + + if not isinstance(prompt, List): + return prompts[0] + + return prompts + + def _maybe_convert_prompt(self, prompt: str, tokenizer: "PreTrainedTokenizer"): + r""" + Maybe convert a prompt into a "multi vector"-compatible prompt. 
If the prompt includes a token that corresponds + to a multi-vector textual inversion embedding, this function will process the prompt so that the special token + is replaced with multiple special tokens each corresponding to one of the vectors. If the prompt has no textual + inversion token or a textual inversion token that is a single vector, the input prompt is simply returned. + + Parameters: + prompt (`str`): + The prompt to guide the image generation. + tokenizer (`PreTrainedTokenizer`): + The tokenizer responsible for encoding the prompt into input tokens. + + Returns: + `str`: The converted prompt + """ + tokens = tokenizer.tokenize(prompt) + unique_tokens = set(tokens) + for token in unique_tokens: + if token in tokenizer.added_tokens_encoder: + replacement = token + i = 1 + while f"{token}_{i}" in tokenizer.added_tokens_encoder: + replacement += f" {token}_{i}" + i += 1 + + prompt = prompt.replace(token, replacement) + + return prompt + + def load_textual_inversion( + self, + pretrained_model_name_or_path: Union[str, List[str], Dict[str, torch.Tensor], List[Dict[str, torch.Tensor]]], + token: Optional[Union[str, List[str]]] = None, + **kwargs, + ): + r""" + Load textual inversion embeddings into the text encoder of [`StableDiffusionPipeline`] (both 🤗 Diffusers and + Automatic1111 formats are supported). + + Parameters: + pretrained_model_name_or_path (`str` or `os.PathLike` or `List[str or os.PathLike]` or `Dict` or `List[Dict]`): + Can be either one of the following or a list of them: + + - A string, the *model id* (for example `sd-concepts-library/low-poly-hd-logos-icons`) of a + pretrained model hosted on the Hub. + - A path to a *directory* (for example `./my_text_inversion_directory/`) containing the textual + inversion weights. + - A path to a *file* (for example `./my_text_inversions.pt`) containing textual inversion weights. + - A [torch state + dict](https://pytorch.org/tutorials/beginner/saving_loading_models.html#what-is-a-state-dict). + + token (`str` or `List[str]`, *optional*): + Override the token to use for the textual inversion weights. If `pretrained_model_name_or_path` is a + list, then `token` must also be a list of equal length. + weight_name (`str`, *optional*): + Name of a custom weight file. This should be used when: + + - The saved textual inversion file is in 🤗 Diffusers format, but was saved under a specific weight + name such as `text_inv.bin`. + - The saved textual inversion file is in the Automatic1111 format. + cache_dir (`Union[str, os.PathLike]`, *optional*): + Path to a directory where a downloaded pretrained model configuration is cached if the standard cache + is not used. + force_download (`bool`, *optional*, defaults to `False`): + Whether or not to force the (re-)download of the model weights and configuration files, overriding the + cached versions if they exist. + resume_download (`bool`, *optional*, defaults to `False`): + Whether or not to resume downloading the model weights and configuration files. If set to `False`, any + incompletely downloaded files are deleted. + proxies (`Dict[str, str]`, *optional*): + A dictionary of proxy servers to use by protocol or endpoint, for example, `{'http': 'foo.bar:3128', + 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request. + local_files_only (`bool`, *optional*, defaults to `False`): + Whether to only load local model weights and configuration files or not. If set to `True`, the model + won't be downloaded from the Hub. 
+ use_auth_token (`str` or *bool*, *optional*): + The token to use as HTTP bearer authorization for remote files. If `True`, the token generated from + `diffusers-cli login` (stored in `~/.huggingface`) is used. + revision (`str`, *optional*, defaults to `"main"`): + The specific model version to use. It can be a branch name, a tag name, a commit id, or any identifier + allowed by Git. + subfolder (`str`, *optional*, defaults to `""`): + The subfolder location of a model file within a larger model repository on the Hub or locally. + mirror (`str`, *optional*): + Mirror source to resolve accessibility issues if you're downloading a model in China. We do not + guarantee the timeliness or safety of the source, and you should refer to the mirror site for more + information. + + Example: + + To load a textual inversion embedding vector in 🤗 Diffusers format: + + ```py + from diffusers import StableDiffusionPipeline + import torch + + model_id = "runwayml/stable-diffusion-v1-5" + pipe = StableDiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16).to("cuda") + + pipe.load_textual_inversion("sd-concepts-library/cat-toy") + + prompt = "A backpack" + + image = pipe(prompt, num_inference_steps=50).images[0] + image.save("cat-backpack.png") + ``` + + To load a textual inversion embedding vector in Automatic1111 format, make sure to download the vector first + (for example from [civitAI](https://civitai.com/models/3036?modelVersionId=9857)) and then load the vector + locally: + + ```py + from diffusers import StableDiffusionPipeline + import torch + + model_id = "runwayml/stable-diffusion-v1-5" + pipe = StableDiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16).to("cuda") + + pipe.load_textual_inversion("./charturnerv2.pt", token="charturnerv2") + + prompt = "charturnerv2, multiple views of the same character in the same outfit, a character turnaround of a woman wearing a black jacket and red shirt, best quality, intricate details." + + image = pipe(prompt, num_inference_steps=50).images[0] + image.save("character.png") + ``` + + """ + if not hasattr(self, "tokenizer") or not isinstance(self.tokenizer, PreTrainedTokenizer): + raise ValueError( + f"{self.__class__.__name__} requires `self.tokenizer` of type `PreTrainedTokenizer` for calling" + f" `{self.load_textual_inversion.__name__}`" + ) + + if not hasattr(self, "text_encoder") or not isinstance(self.text_encoder, PreTrainedModel): + raise ValueError( + f"{self.__class__.__name__} requires `self.text_encoder` of type `PreTrainedModel` for calling" + f" `{self.load_textual_inversion.__name__}`" + ) + + cache_dir = kwargs.pop("cache_dir", DIFFUSERS_CACHE) + force_download = kwargs.pop("force_download", False) + resume_download = kwargs.pop("resume_download", False) + proxies = kwargs.pop("proxies", None) + local_files_only = kwargs.pop("local_files_only", HF_HUB_OFFLINE) + use_auth_token = kwargs.pop("use_auth_token", None) + revision = kwargs.pop("revision", None) + subfolder = kwargs.pop("subfolder", None) + weight_name = kwargs.pop("weight_name", None) + use_safetensors = kwargs.pop("use_safetensors", None) + + if use_safetensors and not is_safetensors_available(): + raise ValueError( + "`use_safetensors`=True but safetensors is not installed. 
Please install safetensors with `pip install safetensors" + ) + + allow_pickle = False + if use_safetensors is None: + use_safetensors = is_safetensors_available() + allow_pickle = True + + user_agent = { + "file_type": "text_inversion", + "framework": "pytorch", + } + + if not isinstance(pretrained_model_name_or_path, list): + pretrained_model_name_or_paths = [pretrained_model_name_or_path] + else: + pretrained_model_name_or_paths = pretrained_model_name_or_path + + if isinstance(token, str): + tokens = [token] + elif token is None: + tokens = [None] * len(pretrained_model_name_or_paths) + else: + tokens = token + + if len(pretrained_model_name_or_paths) != len(tokens): + raise ValueError( + f"You have passed a list of models of length {len(pretrained_model_name_or_paths)}, and list of tokens of length {len(tokens)}" + f"Make sure both lists have the same length." + ) + + valid_tokens = [t for t in tokens if t is not None] + if len(set(valid_tokens)) < len(valid_tokens): + raise ValueError(f"You have passed a list of tokens that contains duplicates: {tokens}") + + token_ids_and_embeddings = [] + + for pretrained_model_name_or_path, token in zip(pretrained_model_name_or_paths, tokens): + if not isinstance(pretrained_model_name_or_path, dict): + # 1. Load textual inversion file + model_file = None + # Let's first try to load .safetensors weights + if (use_safetensors and weight_name is None) or ( + weight_name is not None and weight_name.endswith(".safetensors") + ): + try: + model_file = _get_model_file( + pretrained_model_name_or_path, + weights_name=weight_name or TEXT_INVERSION_NAME_SAFE, + cache_dir=cache_dir, + force_download=force_download, + resume_download=resume_download, + proxies=proxies, + local_files_only=local_files_only, + use_auth_token=use_auth_token, + revision=revision, + subfolder=subfolder, + user_agent=user_agent, + ) + state_dict = safetensors.torch.load_file(model_file, device="cpu") + except Exception as e: + if not allow_pickle: + raise e + + model_file = None + + if model_file is None: + model_file = _get_model_file( + pretrained_model_name_or_path, + weights_name=weight_name or TEXT_INVERSION_NAME, + cache_dir=cache_dir, + force_download=force_download, + resume_download=resume_download, + proxies=proxies, + local_files_only=local_files_only, + use_auth_token=use_auth_token, + revision=revision, + subfolder=subfolder, + user_agent=user_agent, + ) + state_dict = torch.load(model_file, map_location="cpu") + else: + state_dict = pretrained_model_name_or_path + + # 2. Load token and embedding correcly from file + loaded_token = None + if isinstance(state_dict, torch.Tensor): + if token is None: + raise ValueError( + "You are trying to load a textual inversion embedding that has been saved as a PyTorch tensor. Make sure to pass the name of the corresponding token in this case: `token=...`." + ) + embedding = state_dict + elif len(state_dict) == 1: + # diffusers + loaded_token, embedding = next(iter(state_dict.items())) + elif "string_to_param" in state_dict: + # A1111 + loaded_token = state_dict["name"] + embedding = state_dict["string_to_param"]["*"] + + if token is not None and loaded_token != token: + logger.info(f"The loaded token: {loaded_token} is overwritten by the passed token {token}.") + else: + token = loaded_token + + embedding = embedding.to(dtype=self.text_encoder.dtype, device=self.text_encoder.device) + + # 3. 
Make sure we don't mess up the tokenizer or text encoder + vocab = self.tokenizer.get_vocab() + if token in vocab: + raise ValueError( + f"Token {token} already in tokenizer vocabulary. Please choose a different token name or remove {token} and embedding from the tokenizer and text encoder." + ) + elif f"{token}_1" in vocab: + multi_vector_tokens = [token] + i = 1 + while f"{token}_{i}" in self.tokenizer.added_tokens_encoder: + multi_vector_tokens.append(f"{token}_{i}") + i += 1 + + raise ValueError( + f"Multi-vector Token {multi_vector_tokens} already in tokenizer vocabulary. Please choose a different token name or remove the {multi_vector_tokens} and embedding from the tokenizer and text encoder." + ) + + is_multi_vector = len(embedding.shape) > 1 and embedding.shape[0] > 1 + + if is_multi_vector: + tokens = [token] + [f"{token}_{i}" for i in range(1, embedding.shape[0])] + embeddings = [e for e in embedding] # noqa: C416 + else: + tokens = [token] + embeddings = [embedding[0]] if len(embedding.shape) > 1 else [embedding] + + # add tokens and get ids + self.tokenizer.add_tokens(tokens) + token_ids = self.tokenizer.convert_tokens_to_ids(tokens) + token_ids_and_embeddings += zip(token_ids, embeddings) + + logger.info(f"Loaded textual inversion embedding for {token}.") + + # resize token embeddings and set all new embeddings + self.text_encoder.resize_token_embeddings(len(self.tokenizer)) + for token_id, embedding in token_ids_and_embeddings: + self.text_encoder.get_input_embeddings().weight.data[token_id] = embedding + + +class LoraLoaderMixin: + r""" + Load LoRA layers into [`UNet2DConditionModel`] and + [`CLIPTextModel`](https://huggingface.co/docs/transformers/model_doc/clip#transformers.CLIPTextModel). + """ + text_encoder_name = TEXT_ENCODER_NAME + unet_name = UNET_NAME + + def load_lora_weights(self, pretrained_model_name_or_path_or_dict: Union[str, Dict[str, torch.Tensor]], **kwargs): + r""" + Load pretrained LoRA attention processor layers into [`UNet2DConditionModel`] and + [`CLIPTextModel`](https://huggingface.co/docs/transformers/model_doc/clip#transformers.CLIPTextModel). + + Parameters: + pretrained_model_name_or_path_or_dict (`str` or `os.PathLike` or `dict`): + Can be either: + + - A string, the *model id* (for example `google/ddpm-celebahq-256`) of a pretrained model hosted on + the Hub. + - A path to a *directory* (for example `./my_model_directory`) containing the model weights saved + with [`ModelMixin.save_pretrained`]. + - A [torch state + dict](https://pytorch.org/tutorials/beginner/saving_loading_models.html#what-is-a-state-dict). + + cache_dir (`Union[str, os.PathLike]`, *optional*): + Path to a directory where a downloaded pretrained model configuration is cached if the standard cache + is not used. + force_download (`bool`, *optional*, defaults to `False`): + Whether or not to force the (re-)download of the model weights and configuration files, overriding the + cached versions if they exist. + resume_download (`bool`, *optional*, defaults to `False`): + Whether or not to resume downloading the model weights and configuration files. If set to `False`, any + incompletely downloaded files are deleted. + proxies (`Dict[str, str]`, *optional*): + A dictionary of proxy servers to use by protocol or endpoint, for example, `{'http': 'foo.bar:3128', + 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request. + local_files_only (`bool`, *optional*, defaults to `False`): + Whether to only load local model weights and configuration files or not. 
If set to `True`, the model + won't be downloaded from the Hub. + use_auth_token (`str` or *bool*, *optional*): + The token to use as HTTP bearer authorization for remote files. If `True`, the token generated from + `diffusers-cli login` (stored in `~/.huggingface`) is used. + revision (`str`, *optional*, defaults to `"main"`): + The specific model version to use. It can be a branch name, a tag name, a commit id, or any identifier + allowed by Git. + subfolder (`str`, *optional*, defaults to `""`): + The subfolder location of a model file within a larger model repository on the Hub or locally. + mirror (`str`, *optional*): + Mirror source to resolve accessibility issues if you're downloading a model in China. We do not + guarantee the timeliness or safety of the source, and you should refer to the mirror site for more + information. + + """ + # Load the main state dict first which has the LoRA layers for either of + # UNet and text encoder or both. + cache_dir = kwargs.pop("cache_dir", DIFFUSERS_CACHE) + force_download = kwargs.pop("force_download", False) + resume_download = kwargs.pop("resume_download", False) + proxies = kwargs.pop("proxies", None) + local_files_only = kwargs.pop("local_files_only", HF_HUB_OFFLINE) + use_auth_token = kwargs.pop("use_auth_token", None) + revision = kwargs.pop("revision", None) + subfolder = kwargs.pop("subfolder", None) + weight_name = kwargs.pop("weight_name", None) + use_safetensors = kwargs.pop("use_safetensors", None) + + # set lora scale to a reasonable default + self._lora_scale = 1.0 + + if use_safetensors and not is_safetensors_available(): + raise ValueError( + "`use_safetensors`=True but safetensors is not installed. Please install safetensors with `pip install safetensors" + ) + + allow_pickle = False + if use_safetensors is None: + use_safetensors = is_safetensors_available() + allow_pickle = True + + user_agent = { + "file_type": "attn_procs_weights", + "framework": "pytorch", + } + + model_file = None + if not isinstance(pretrained_model_name_or_path_or_dict, dict): + # Let's first try to load .safetensors weights + if (use_safetensors and weight_name is None) or ( + weight_name is not None and weight_name.endswith(".safetensors") + ): + try: + model_file = _get_model_file( + pretrained_model_name_or_path_or_dict, + weights_name=weight_name or LORA_WEIGHT_NAME_SAFE, + cache_dir=cache_dir, + force_download=force_download, + resume_download=resume_download, + proxies=proxies, + local_files_only=local_files_only, + use_auth_token=use_auth_token, + revision=revision, + subfolder=subfolder, + user_agent=user_agent, + ) + state_dict = safetensors.torch.load_file(model_file, device="cpu") + except IOError as e: + if not allow_pickle: + raise e + # try loading non-safetensors weights + pass + if model_file is None: + model_file = _get_model_file( + pretrained_model_name_or_path_or_dict, + weights_name=weight_name or LORA_WEIGHT_NAME, + cache_dir=cache_dir, + force_download=force_download, + resume_download=resume_download, + proxies=proxies, + local_files_only=local_files_only, + use_auth_token=use_auth_token, + revision=revision, + subfolder=subfolder, + user_agent=user_agent, + ) + state_dict = torch.load(model_file, map_location="cpu") + else: + state_dict = pretrained_model_name_or_path_or_dict + + # Convert kohya-ss Style LoRA attn procs to diffusers attn procs + network_alpha = None + if all((k.startswith("lora_te_") or k.startswith("lora_unet_")) for k in state_dict.keys()): + state_dict, network_alpha = 
self._convert_kohya_lora_to_diffusers(state_dict) + + # If the serialization format is new (introduced in https://github.com/huggingface/diffusers/pull/2918), + # then the `state_dict` keys should have `self.unet_name` and/or `self.text_encoder_name` as + # their prefixes. + keys = list(state_dict.keys()) + if all(key.startswith(self.unet_name) or key.startswith(self.text_encoder_name) for key in keys): + # Load the layers corresponding to UNet. + unet_keys = [k for k in keys if k.startswith(self.unet_name)] + logger.info(f"Loading {self.unet_name}.") + unet_lora_state_dict = { + k.replace(f"{self.unet_name}.", ""): v for k, v in state_dict.items() if k in unet_keys + } + self.unet.load_attn_procs(unet_lora_state_dict, network_alpha=network_alpha) + + # Load the layers corresponding to text encoder and make necessary adjustments. + text_encoder_keys = [k for k in keys if k.startswith(self.text_encoder_name)] + text_encoder_lora_state_dict = { + k.replace(f"{self.text_encoder_name}.", ""): v for k, v in state_dict.items() if k in text_encoder_keys + } + if len(text_encoder_lora_state_dict) > 0: + logger.info(f"Loading {self.text_encoder_name}.") + attn_procs_text_encoder = self._load_text_encoder_attn_procs( + text_encoder_lora_state_dict, network_alpha=network_alpha + ) + self._modify_text_encoder(attn_procs_text_encoder) + + # save lora attn procs of text encoder so that it can be easily retrieved + self._text_encoder_lora_attn_procs = attn_procs_text_encoder + + # Otherwise, we're dealing with the old format. This means the `state_dict` should only + # contain the module names of the `unet` as its keys WITHOUT any prefix. + elif not all( + key.startswith(self.unet_name) or key.startswith(self.text_encoder_name) for key in state_dict.keys() + ): + self.unet.load_attn_procs(state_dict) + warn_message = "You have saved the LoRA weights using the old format. To convert the old LoRA weights to the new format, you can first load them in a dictionary and then create a new dictionary like the following: `new_state_dict = {f'unet'.{module_name}: params for module_name, params in old_state_dict.items()}`." + warnings.warn(warn_message) + + @property + def lora_scale(self) -> float: + # property function that returns the lora scale which can be set at run time by the pipeline. + # if _lora_scale has not been set, return 1 + return self._lora_scale if hasattr(self, "_lora_scale") else 1.0 + + @property + def text_encoder_lora_attn_procs(self): + if hasattr(self, "_text_encoder_lora_attn_procs"): + return self._text_encoder_lora_attn_procs + return + + def _remove_text_encoder_monkey_patch(self): + # Loop over the CLIPAttention module of text_encoder + for name, attn_module in self.text_encoder.named_modules(): + if name.endswith(TEXT_ENCODER_ATTN_MODULE): + # Loop over the LoRA layers + for _, text_encoder_attr in self._lora_attn_processor_attr_to_text_encoder_attr.items(): + # Retrieve the q/k/v/out projection of CLIPAttention + module = attn_module.get_submodule(text_encoder_attr) + if hasattr(module, "old_forward"): + # restore original `forward` to remove monkey-patch + module.forward = module.old_forward + delattr(module, "old_forward") + + def _modify_text_encoder(self, attn_processors: Dict[str, LoRAAttnProcessor]): + r""" + Monkey-patches the forward passes of attention modules of the text encoder. + + Parameters: + attn_processors: Dict[str, `LoRAAttnProcessor`]: + A dictionary mapping the module names and their corresponding [`~LoRAAttnProcessor`]. 
+ """ + + # First, remove any monkey-patch that might have been applied before + self._remove_text_encoder_monkey_patch() + + # Loop over the CLIPAttention module of text_encoder + for name, attn_module in self.text_encoder.named_modules(): + if name.endswith(TEXT_ENCODER_ATTN_MODULE): + # Loop over the LoRA layers + for attn_proc_attr, text_encoder_attr in self._lora_attn_processor_attr_to_text_encoder_attr.items(): + # Retrieve the q/k/v/out projection of CLIPAttention and its corresponding LoRA layer. + module = attn_module.get_submodule(text_encoder_attr) + lora_layer = attn_processors[name].get_submodule(attn_proc_attr) + + # save old_forward to module that can be used to remove monkey-patch + old_forward = module.old_forward = module.forward + + # create a new scope that locks in the old_forward, lora_layer value for each new_forward function + # for more detail, see https://github.com/huggingface/diffusers/pull/3490#issuecomment-1555059060 + def make_new_forward(old_forward, lora_layer): + def new_forward(x): + result = old_forward(x) + self.lora_scale * lora_layer(x) + return result + + return new_forward + + # Monkey-patch. + module.forward = make_new_forward(old_forward, lora_layer) + + @property + def _lora_attn_processor_attr_to_text_encoder_attr(self): + return { + "to_q_lora": "q_proj", + "to_k_lora": "k_proj", + "to_v_lora": "v_proj", + "to_out_lora": "out_proj", + } + + def _load_text_encoder_attn_procs( + self, pretrained_model_name_or_path_or_dict: Union[str, Dict[str, torch.Tensor]], **kwargs + ): + r""" + Load pretrained attention processor layers for + [`CLIPTextModel`](https://huggingface.co/docs/transformers/model_doc/clip#transformers.CLIPTextModel). + + + + This function is experimental and might change in the future. + + + + Parameters: + pretrained_model_name_or_path_or_dict (`str` or `os.PathLike` or `dict`): + Can be either: + + - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. + Valid model ids should have an organization name, like `google/ddpm-celebahq-256`. + - A path to a *directory* containing model weights saved using [`~ModelMixin.save_config`], e.g., + `./my_model_directory/`. + - A [torch state + dict](https://pytorch.org/tutorials/beginner/saving_loading_models.html#what-is-a-state-dict). + + cache_dir (`Union[str, os.PathLike]`, *optional*): + Path to a directory in which a downloaded pretrained model configuration should be cached if the + standard cache should not be used. + force_download (`bool`, *optional*, defaults to `False`): + Whether or not to force the (re-)download of the model weights and configuration files, overriding the + cached versions if they exist. + resume_download (`bool`, *optional*, defaults to `False`): + Whether or not to delete incompletely received files. Will attempt to resume the download if such a + file exists. + proxies (`Dict[str, str]`, *optional*): + A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', + 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request. + local_files_only (`bool`, *optional*, defaults to `False`): + Whether or not to only look at local files (i.e., do not try to download the model). + use_auth_token (`str` or *bool*, *optional*): + The token to use as HTTP bearer authorization for remote files. If `True`, will use the token generated + when running `diffusers-cli login` (stored in `~/.huggingface`). 
+ revision (`str`, *optional*, defaults to `"main"`): + The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a + git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any + identifier allowed by git. + subfolder (`str`, *optional*, defaults to `""`): + In case the relevant files are located inside a subfolder of the model repo (either remote in + huggingface.co or downloaded locally), you can specify the folder name here. + mirror (`str`, *optional*): + Mirror source to accelerate downloads in China. If you are from China and have an accessibility + problem, you can set this option to resolve it. Note that we do not guarantee the timeliness or safety. + Please refer to the mirror site for more information. + + Returns: + `Dict[name, LoRAAttnProcessor]`: Mapping between the module names and their corresponding + [`LoRAAttnProcessor`]. + + + + It is required to be logged in (`huggingface-cli login`) when you want to use private or [gated + models](https://huggingface.co/docs/hub/models-gated#gated-models). + + + """ + + cache_dir = kwargs.pop("cache_dir", DIFFUSERS_CACHE) + force_download = kwargs.pop("force_download", False) + resume_download = kwargs.pop("resume_download", False) + proxies = kwargs.pop("proxies", None) + local_files_only = kwargs.pop("local_files_only", HF_HUB_OFFLINE) + use_auth_token = kwargs.pop("use_auth_token", None) + revision = kwargs.pop("revision", None) + subfolder = kwargs.pop("subfolder", None) + weight_name = kwargs.pop("weight_name", None) + use_safetensors = kwargs.pop("use_safetensors", None) + network_alpha = kwargs.pop("network_alpha", None) + + if use_safetensors and not is_safetensors_available(): + raise ValueError( + "`use_safetensors`=True but safetensors is not installed. 
Please install safetensors with `pip install safetensors" + ) + + allow_pickle = False + if use_safetensors is None: + use_safetensors = is_safetensors_available() + allow_pickle = True + + user_agent = { + "file_type": "attn_procs_weights", + "framework": "pytorch", + } + + model_file = None + if not isinstance(pretrained_model_name_or_path_or_dict, dict): + # Let's first try to load .safetensors weights + if (use_safetensors and weight_name is None) or ( + weight_name is not None and weight_name.endswith(".safetensors") + ): + try: + model_file = _get_model_file( + pretrained_model_name_or_path_or_dict, + weights_name=weight_name or LORA_WEIGHT_NAME_SAFE, + cache_dir=cache_dir, + force_download=force_download, + resume_download=resume_download, + proxies=proxies, + local_files_only=local_files_only, + use_auth_token=use_auth_token, + revision=revision, + subfolder=subfolder, + user_agent=user_agent, + ) + state_dict = safetensors.torch.load_file(model_file, device="cpu") + except IOError as e: + if not allow_pickle: + raise e + # try loading non-safetensors weights + pass + if model_file is None: + model_file = _get_model_file( + pretrained_model_name_or_path_or_dict, + weights_name=weight_name or LORA_WEIGHT_NAME, + cache_dir=cache_dir, + force_download=force_download, + resume_download=resume_download, + proxies=proxies, + local_files_only=local_files_only, + use_auth_token=use_auth_token, + revision=revision, + subfolder=subfolder, + user_agent=user_agent, + ) + state_dict = torch.load(model_file, map_location="cpu") + else: + state_dict = pretrained_model_name_or_path_or_dict + + # fill attn processors + attn_processors = {} + + is_lora = all("lora" in k for k in state_dict.keys()) + + if is_lora: + lora_grouped_dict = defaultdict(dict) + for key, value in state_dict.items(): + attn_processor_key, sub_key = ".".join(key.split(".")[:-3]), ".".join(key.split(".")[-3:]) + lora_grouped_dict[attn_processor_key][sub_key] = value + + for key, value_dict in lora_grouped_dict.items(): + rank = value_dict["to_k_lora.down.weight"].shape[0] + cross_attention_dim = value_dict["to_k_lora.down.weight"].shape[1] + hidden_size = value_dict["to_k_lora.up.weight"].shape[0] + + attn_processor_class = ( + LoRAAttnProcessor2_0 if hasattr(F, "scaled_dot_product_attention") else LoRAAttnProcessor + ) + attn_processors[key] = attn_processor_class( + hidden_size=hidden_size, + cross_attention_dim=cross_attention_dim, + rank=rank, + network_alpha=network_alpha, + ) + attn_processors[key].load_state_dict(value_dict) + + else: + raise ValueError(f"{model_file} does not seem to be in the correct format expected by LoRA training.") + + # set correct dtype & device + attn_processors = { + k: v.to(device=self.device, dtype=self.text_encoder.dtype) for k, v in attn_processors.items() + } + return attn_processors + + @classmethod + def save_lora_weights( + self, + save_directory: Union[str, os.PathLike], + unet_lora_layers: Dict[str, Union[torch.nn.Module, torch.Tensor]] = None, + text_encoder_lora_layers: Dict[str, torch.nn.Module] = None, + is_main_process: bool = True, + weight_name: str = None, + save_function: Callable = None, + safe_serialization: bool = False, + ): + r""" + Save the LoRA parameters corresponding to the UNet and text encoder. + + Arguments: + save_directory (`str` or `os.PathLike`): + Directory to save LoRA parameters to. Will be created if it doesn't exist. + unet_lora_layers (`Dict[str, torch.nn.Module]` or `Dict[str, torch.Tensor]`): + State dict of the LoRA layers corresponding to the UNet. 
+ text_encoder_lora_layers (`Dict[str, torch.nn.Module] or `Dict[str, torch.Tensor]`): + State dict of the LoRA layers corresponding to the `text_encoder`. Must explicitly pass the text + encoder LoRA state dict because it comes 🤗 Transformers. + is_main_process (`bool`, *optional*, defaults to `True`): + Whether the process calling this is the main process or not. Useful during distributed training and you + need to call this function on all processes. In this case, set `is_main_process=True` only on the main + process to avoid race conditions. + save_function (`Callable`): + The function to use to save the state dictionary. Useful during distributed training when you need to + replace `torch.save` with another method. Can be configured with the environment variable + `DIFFUSERS_SAVE_MODE`. + """ + if os.path.isfile(save_directory): + logger.error(f"Provided path ({save_directory}) should be a directory, not a file") + return + + if save_function is None: + if safe_serialization: + + def save_function(weights, filename): + return safetensors.torch.save_file(weights, filename, metadata={"format": "pt"}) + + else: + save_function = torch.save + + os.makedirs(save_directory, exist_ok=True) + + # Create a flat dictionary. + state_dict = {} + if unet_lora_layers is not None: + weights = ( + unet_lora_layers.state_dict() if isinstance(unet_lora_layers, torch.nn.Module) else unet_lora_layers + ) + + unet_lora_state_dict = {f"{self.unet_name}.{module_name}": param for module_name, param in weights.items()} + state_dict.update(unet_lora_state_dict) + + if text_encoder_lora_layers is not None: + weights = ( + text_encoder_lora_layers.state_dict() + if isinstance(text_encoder_lora_layers, torch.nn.Module) + else text_encoder_lora_layers + ) + + text_encoder_lora_state_dict = { + f"{self.text_encoder_name}.{module_name}": param for module_name, param in weights.items() + } + state_dict.update(text_encoder_lora_state_dict) + + # Save the model + if weight_name is None: + if safe_serialization: + weight_name = LORA_WEIGHT_NAME_SAFE + else: + weight_name = LORA_WEIGHT_NAME + + save_function(state_dict, os.path.join(save_directory, weight_name)) + logger.info(f"Model weights saved in {os.path.join(save_directory, weight_name)}") + + def _convert_kohya_lora_to_diffusers(self, state_dict): + unet_state_dict = {} + te_state_dict = {} + network_alpha = None + + for key, value in state_dict.items(): + if "lora_down" in key: + lora_name = key.split(".")[0] + lora_name_up = lora_name + ".lora_up.weight" + lora_name_alpha = lora_name + ".alpha" + if lora_name_alpha in state_dict: + alpha = state_dict[lora_name_alpha].item() + if network_alpha is None: + network_alpha = alpha + elif network_alpha != alpha: + raise ValueError("Network alpha is not consistent") + + if lora_name.startswith("lora_unet_"): + diffusers_name = key.replace("lora_unet_", "").replace("_", ".") + diffusers_name = diffusers_name.replace("down.blocks", "down_blocks") + diffusers_name = diffusers_name.replace("mid.block", "mid_block") + diffusers_name = diffusers_name.replace("up.blocks", "up_blocks") + diffusers_name = diffusers_name.replace("transformer.blocks", "transformer_blocks") + diffusers_name = diffusers_name.replace("to.q.lora", "to_q_lora") + diffusers_name = diffusers_name.replace("to.k.lora", "to_k_lora") + diffusers_name = diffusers_name.replace("to.v.lora", "to_v_lora") + diffusers_name = diffusers_name.replace("to.out.0.lora", "to_out_lora") + if "transformer_blocks" in diffusers_name: + if "attn1" in diffusers_name or "attn2" 
in diffusers_name: + diffusers_name = diffusers_name.replace("attn1", "attn1.processor") + diffusers_name = diffusers_name.replace("attn2", "attn2.processor") + unet_state_dict[diffusers_name] = value + unet_state_dict[diffusers_name.replace(".down.", ".up.")] = state_dict[lora_name_up] + elif lora_name.startswith("lora_te_"): + diffusers_name = key.replace("lora_te_", "").replace("_", ".") + diffusers_name = diffusers_name.replace("text.model", "text_model") + diffusers_name = diffusers_name.replace("self.attn", "self_attn") + diffusers_name = diffusers_name.replace("q.proj.lora", "to_q_lora") + diffusers_name = diffusers_name.replace("k.proj.lora", "to_k_lora") + diffusers_name = diffusers_name.replace("v.proj.lora", "to_v_lora") + diffusers_name = diffusers_name.replace("out.proj.lora", "to_out_lora") + if "self_attn" in diffusers_name: + te_state_dict[diffusers_name] = value + te_state_dict[diffusers_name.replace(".down.", ".up.")] = state_dict[lora_name_up] + + unet_state_dict = {f"{UNET_NAME}.{module_name}": params for module_name, params in unet_state_dict.items()} + te_state_dict = {f"{TEXT_ENCODER_NAME}.{module_name}": params for module_name, params in te_state_dict.items()} + new_state_dict = {**unet_state_dict, **te_state_dict} + return new_state_dict, network_alpha + + +class FromSingleFileMixin: + """ + Load model weights saved in the `.ckpt` format into a [`DiffusionPipeline`]. + """ + + @classmethod + def from_ckpt(cls, *args, **kwargs): + deprecation_message = "The function `from_ckpt` is deprecated in favor of `from_single_file` and will be removed in diffusers v.0.21. Please make sure to use `StableDiffusionPipeline.from_single_file(...)` instead." + deprecate("from_ckpt", "0.21.0", deprecation_message, standard_warn=False) + return cls.from_single_file(*args, **kwargs) + + @classmethod + def from_single_file(cls, pretrained_model_link_or_path, **kwargs): + r""" + Instantiate a [`DiffusionPipeline`] from pretrained pipeline weights saved in the `.ckpt` format. The pipeline + is set in evaluation mode (`model.eval()`) by default. + + Parameters: + pretrained_model_link_or_path (`str` or `os.PathLike`, *optional*): + Can be either: + - A link to the `.ckpt` file (for example + `"https://huggingface.co//blob/main/.ckpt"`) on the Hub. + - A path to a *file* containing all pipeline weights. + torch_dtype (`str` or `torch.dtype`, *optional*): + Override the default `torch.dtype` and load the model with another dtype. If `"auto"` is passed, the + dtype is automatically derived from the model's weights. + force_download (`bool`, *optional*, defaults to `False`): + Whether or not to force the (re-)download of the model weights and configuration files, overriding the + cached versions if they exist. + cache_dir (`Union[str, os.PathLike]`, *optional*): + Path to a directory where a downloaded pretrained model configuration is cached if the standard cache + is not used. + resume_download (`bool`, *optional*, defaults to `False`): + Whether or not to resume downloading the model weights and configuration files. If set to `False`, any + incompletely downloaded files are deleted. + proxies (`Dict[str, str]`, *optional*): + A dictionary of proxy servers to use by protocol or endpoint, for example, `{'http': 'foo.bar:3128', + 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request. + local_files_only (`bool`, *optional*, defaults to `False`): + Whether to only load local model weights and configuration files or not. 
If set to True, the model + won't be downloaded from the Hub. + use_auth_token (`str` or *bool*, *optional*): + The token to use as HTTP bearer authorization for remote files. If `True`, the token generated from + `diffusers-cli login` (stored in `~/.huggingface`) is used. + revision (`str`, *optional*, defaults to `"main"`): + The specific model version to use. It can be a branch name, a tag name, a commit id, or any identifier + allowed by Git. + use_safetensors (`bool`, *optional*, defaults to `None`): + If set to `None`, the safetensors weights are downloaded if they're available **and** if the + safetensors library is installed. If set to `True`, the model is forcibly loaded from safetensors + weights. If set to `False`, safetensors weights are not loaded. + extract_ema (`bool`, *optional*, defaults to `False`): + Whether to extract the EMA weights or not. Pass `True` to extract the EMA weights which usually yield + higher quality images for inference. Non-EMA weights are usually better to continue finetuning. + upcast_attention (`bool`, *optional*, defaults to `None`): + Whether the attention computation should always be upcasted. + image_size (`int`, *optional*, defaults to 512): + The image size the model was trained on. Use 512 for all Stable Diffusion v1 models and the Stable + Diffusion v2 base model. Use 768 for Stable Diffusion v2. + prediction_type (`str`, *optional*): + The prediction type the model was trained on. Use `'epsilon'` for all Stable Diffusion v1 models and + the Stable Diffusion v2 base model. Use `'v_prediction'` for Stable Diffusion v2. + num_in_channels (`int`, *optional*, defaults to `None`): + The number of input channels. If `None`, it will be automatically inferred. + scheduler_type (`str`, *optional*, defaults to `"pndm"`): + Type of scheduler to use. Should be one of `["pndm", "lms", "heun", "euler", "euler-ancestral", "dpm", + "ddim"]`. + load_safety_checker (`bool`, *optional*, defaults to `True`): + Whether to load the safety checker or not. + text_encoder (`CLIPTextModel`, *optional*, defaults to `None`): + An instance of + [CLIP](https://huggingface.co/docs/transformers/model_doc/clip#transformers.CLIPTextModel) to use, + specifically the [clip-vit-large-patch14](https://huggingface.co/openai/clip-vit-large-patch14) + variant. If this parameter is `None`, the function will load a new instance of [CLIP] by itself, if + needed. + tokenizer (`CLIPTokenizer`, *optional*, defaults to `None`): + An instance of + [CLIPTokenizer](https://huggingface.co/docs/transformers/v4.21.0/en/model_doc/clip#transformers.CLIPTokenizer) + to use. If this parameter is `None`, the function will load a new instance of [CLIPTokenizer] by + itself, if needed. + kwargs (remaining dictionary of keyword arguments, *optional*): + Can be used to overwrite load and saveable variables (for example the pipeline components of the + specific pipeline class). The overwritten components are directly passed to the pipelines `__init__` + method. See example below for more information. + + Examples: + + ```py + >>> from diffusers import StableDiffusionPipeline + + >>> # Download pipeline from huggingface.co and cache. + >>> pipeline = StableDiffusionPipeline.from_single_file( + ... "https://huggingface.co/WarriorMama777/OrangeMixs/blob/main/Models/AbyssOrangeMix/AbyssOrangeMix.safetensors" + ... 
) + + >>> # Download pipeline from local file + >>> # file is downloaded under ./v1-5-pruned-emaonly.ckpt + >>> pipeline = StableDiffusionPipeline.from_single_file("./v1-5-pruned-emaonly") + + >>> # Enable float16 and move to GPU + >>> pipeline = StableDiffusionPipeline.from_single_file( + ... "https://huggingface.co/runwayml/stable-diffusion-v1-5/blob/main/v1-5-pruned-emaonly.ckpt", + ... torch_dtype=torch.float16, + ... ) + >>> pipeline.to("cuda") + ``` + """ + # import here to avoid circular dependency + from .pipelines.stable_diffusion.convert_from_ckpt import download_from_original_stable_diffusion_ckpt + + cache_dir = kwargs.pop("cache_dir", DIFFUSERS_CACHE) + resume_download = kwargs.pop("resume_download", False) + force_download = kwargs.pop("force_download", False) + proxies = kwargs.pop("proxies", None) + local_files_only = kwargs.pop("local_files_only", HF_HUB_OFFLINE) + use_auth_token = kwargs.pop("use_auth_token", None) + revision = kwargs.pop("revision", None) + extract_ema = kwargs.pop("extract_ema", False) + image_size = kwargs.pop("image_size", None) + scheduler_type = kwargs.pop("scheduler_type", "pndm") + num_in_channels = kwargs.pop("num_in_channels", None) + upcast_attention = kwargs.pop("upcast_attention", None) + load_safety_checker = kwargs.pop("load_safety_checker", True) + prediction_type = kwargs.pop("prediction_type", None) + text_encoder = kwargs.pop("text_encoder", None) + tokenizer = kwargs.pop("tokenizer", None) + + torch_dtype = kwargs.pop("torch_dtype", None) + + use_safetensors = kwargs.pop("use_safetensors", None if is_safetensors_available() else False) + + pipeline_name = cls.__name__ + file_extension = pretrained_model_link_or_path.rsplit(".", 1)[-1] + from_safetensors = file_extension == "safetensors" + + if from_safetensors and use_safetensors is False: + raise ValueError("Make sure to install `safetensors` with `pip install safetensors`.") + + # TODO: For now we only support stable diffusion + stable_unclip = None + model_type = None + controlnet = False + + if pipeline_name == "StableDiffusionControlNetPipeline": + # Model type will be inferred from the checkpoint. + controlnet = True + elif "StableDiffusion" in pipeline_name: + # Model type will be inferred from the checkpoint. 
+ pass + elif pipeline_name == "StableUnCLIPPipeline": + model_type = "FrozenOpenCLIPEmbedder" + stable_unclip = "txt2img" + elif pipeline_name == "StableUnCLIPImg2ImgPipeline": + model_type = "FrozenOpenCLIPEmbedder" + stable_unclip = "img2img" + elif pipeline_name == "PaintByExamplePipeline": + model_type = "PaintByExample" + elif pipeline_name == "LDMTextToImagePipeline": + model_type = "LDMTextToImage" + else: + raise ValueError(f"Unhandled pipeline class: {pipeline_name}") + + # remove huggingface url + for prefix in ["https://huggingface.co/", "huggingface.co/", "hf.co/", "https://hf.co/"]: + if pretrained_model_link_or_path.startswith(prefix): + pretrained_model_link_or_path = pretrained_model_link_or_path[len(prefix) :] + + # Code based on diffusers.pipelines.pipeline_utils.DiffusionPipeline.from_pretrained + ckpt_path = Path(pretrained_model_link_or_path) + if not ckpt_path.is_file(): + # get repo_id and (potentially nested) file path of ckpt in repo + repo_id = "/".join(ckpt_path.parts[:2]) + file_path = "/".join(ckpt_path.parts[2:]) + + if file_path.startswith("blob/"): + file_path = file_path[len("blob/") :] + + if file_path.startswith("main/"): + file_path = file_path[len("main/") :] + + pretrained_model_link_or_path = hf_hub_download( + repo_id, + filename=file_path, + cache_dir=cache_dir, + resume_download=resume_download, + proxies=proxies, + local_files_only=local_files_only, + use_auth_token=use_auth_token, + revision=revision, + force_download=force_download, + ) + + pipe = download_from_original_stable_diffusion_ckpt( + pretrained_model_link_or_path, + pipeline_class=cls, + model_type=model_type, + stable_unclip=stable_unclip, + controlnet=controlnet, + from_safetensors=from_safetensors, + extract_ema=extract_ema, + image_size=image_size, + scheduler_type=scheduler_type, + num_in_channels=num_in_channels, + upcast_attention=upcast_attention, + load_safety_checker=load_safety_checker, + prediction_type=prediction_type, + text_encoder=text_encoder, + tokenizer=tokenizer, + ) + + if torch_dtype is not None: + pipe.to(torch_dtype=torch_dtype) + + return pipe diff --git a/gf/sd-webui-rich-text/diffusers_official/models/__init__.py b/gf/sd-webui-rich-text/diffusers_official/models/__init__.py new file mode 100644 index 0000000000000000000000000000000000000000..23839c84af45777df5bfb2cfa1a727dfefdc1897 --- /dev/null +++ b/gf/sd-webui-rich-text/diffusers_official/models/__init__.py @@ -0,0 +1,35 @@ +# Copyright 2023 The HuggingFace Team. All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
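Editorial aside on the `_convert_kohya_lora_to_diffusers` helper above: its UNet branch is a fixed sequence of string substitutions. The sketch below is illustrative only and not part of the vendored file; it applies the same substitutions to one assumed kohya-style key so the resulting diffusers name is easy to see. The real loader additionally prefixes the result with the `unet.` / `text_encoder.` namespace and pairs every `lora_down` weight with its `lora_up` counterpart.

```py
# Illustrative sketch: mirrors the UNet key renaming done in
# _convert_kohya_lora_to_diffusers above (not part of the vendored file).
def kohya_unet_key_to_diffusers(key: str) -> str:
    name = key.replace("lora_unet_", "").replace("_", ".")
    for old, new in [
        ("down.blocks", "down_blocks"),
        ("mid.block", "mid_block"),
        ("up.blocks", "up_blocks"),
        ("transformer.blocks", "transformer_blocks"),
        ("to.q.lora", "to_q_lora"),
        ("to.k.lora", "to_k_lora"),
        ("to.v.lora", "to_v_lora"),
        ("to.out.0.lora", "to_out_lora"),
    ]:
        name = name.replace(old, new)
    if "transformer_blocks" in name and ("attn1" in name or "attn2" in name):
        name = name.replace("attn1", "attn1.processor").replace("attn2", "attn2.processor")
    return name


# Assumed example key in the usual kohya export format:
print(kohya_unet_key_to_diffusers(
    "lora_unet_down_blocks_0_attentions_0_transformer_blocks_0_attn1_to_q.lora_down.weight"
))
# -> down_blocks.0.attentions.0.transformer_blocks.0.attn1.processor.to_q_lora.down.weight
```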
+ +from ..utils import is_flax_available, is_torch_available + + +if is_torch_available(): + from .autoencoder_kl import AutoencoderKL + from .controlnet import ControlNetModel + from .dual_transformer_2d import DualTransformer2DModel + from .modeling_utils import ModelMixin + from .prior_transformer import PriorTransformer + from .t5_film_transformer import T5FilmDecoder + from .transformer_2d import Transformer2DModel + from .unet_1d import UNet1DModel + from .unet_2d import UNet2DModel + from .unet_2d_condition import UNet2DConditionModel + from .unet_3d_condition import UNet3DConditionModel + from .vq_model import VQModel + +if is_flax_available(): + from .controlnet_flax import FlaxControlNetModel + from .unet_2d_condition_flax import FlaxUNet2DConditionModel + from .vae_flax import FlaxAutoencoderKL diff --git a/gf/sd-webui-rich-text/diffusers_official/models/__pycache__/__init__.cpython-310.pyc b/gf/sd-webui-rich-text/diffusers_official/models/__pycache__/__init__.cpython-310.pyc new file mode 100644 index 0000000000000000000000000000000000000000..f832228350c462a59b6d1e4a83f26f04ef40143a Binary files /dev/null and b/gf/sd-webui-rich-text/diffusers_official/models/__pycache__/__init__.cpython-310.pyc differ diff --git a/gf/sd-webui-rich-text/diffusers_official/models/__pycache__/activations.cpython-310.pyc b/gf/sd-webui-rich-text/diffusers_official/models/__pycache__/activations.cpython-310.pyc new file mode 100644 index 0000000000000000000000000000000000000000..e09c8f705cc3983775e0cd0350ac0df1b7344f92 Binary files /dev/null and b/gf/sd-webui-rich-text/diffusers_official/models/__pycache__/activations.cpython-310.pyc differ diff --git a/gf/sd-webui-rich-text/diffusers_official/models/__pycache__/attention.cpython-310.pyc b/gf/sd-webui-rich-text/diffusers_official/models/__pycache__/attention.cpython-310.pyc new file mode 100644 index 0000000000000000000000000000000000000000..730d0339b4ced740c1ea968e5ffc7454351d69b3 Binary files /dev/null and b/gf/sd-webui-rich-text/diffusers_official/models/__pycache__/attention.cpython-310.pyc differ diff --git a/gf/sd-webui-rich-text/diffusers_official/models/__pycache__/attention_processor.cpython-310.pyc b/gf/sd-webui-rich-text/diffusers_official/models/__pycache__/attention_processor.cpython-310.pyc new file mode 100644 index 0000000000000000000000000000000000000000..80761a00c95c7b67fe0279b941363dae262f8272 Binary files /dev/null and b/gf/sd-webui-rich-text/diffusers_official/models/__pycache__/attention_processor.cpython-310.pyc differ diff --git a/gf/sd-webui-rich-text/diffusers_official/models/__pycache__/autoencoder_kl.cpython-310.pyc b/gf/sd-webui-rich-text/diffusers_official/models/__pycache__/autoencoder_kl.cpython-310.pyc new file mode 100644 index 0000000000000000000000000000000000000000..5205f46584413de2450ce0ed97f6c6054c57f547 Binary files /dev/null and b/gf/sd-webui-rich-text/diffusers_official/models/__pycache__/autoencoder_kl.cpython-310.pyc differ diff --git a/gf/sd-webui-rich-text/diffusers_official/models/__pycache__/controlnet.cpython-310.pyc b/gf/sd-webui-rich-text/diffusers_official/models/__pycache__/controlnet.cpython-310.pyc new file mode 100644 index 0000000000000000000000000000000000000000..7def0049ffd90d36eacd1f9ed153d070d1c09c38 Binary files /dev/null and b/gf/sd-webui-rich-text/diffusers_official/models/__pycache__/controlnet.cpython-310.pyc differ diff --git a/gf/sd-webui-rich-text/diffusers_official/models/__pycache__/dual_transformer_2d.cpython-310.pyc 
b/gf/sd-webui-rich-text/diffusers_official/models/__pycache__/dual_transformer_2d.cpython-310.pyc new file mode 100644 index 0000000000000000000000000000000000000000..83d6c7820cb6eedec052d515aa7869f7e4d451f3 Binary files /dev/null and b/gf/sd-webui-rich-text/diffusers_official/models/__pycache__/dual_transformer_2d.cpython-310.pyc differ diff --git a/gf/sd-webui-rich-text/diffusers_official/models/__pycache__/embeddings.cpython-310.pyc b/gf/sd-webui-rich-text/diffusers_official/models/__pycache__/embeddings.cpython-310.pyc new file mode 100644 index 0000000000000000000000000000000000000000..984a32cf85dc0d3d99886d373709ab2c7f92f6d2 Binary files /dev/null and b/gf/sd-webui-rich-text/diffusers_official/models/__pycache__/embeddings.cpython-310.pyc differ diff --git a/gf/sd-webui-rich-text/diffusers_official/models/__pycache__/modeling_utils.cpython-310.pyc b/gf/sd-webui-rich-text/diffusers_official/models/__pycache__/modeling_utils.cpython-310.pyc new file mode 100644 index 0000000000000000000000000000000000000000..651fc9af151d0b142ccc2a6ff5500a475a4d5cea Binary files /dev/null and b/gf/sd-webui-rich-text/diffusers_official/models/__pycache__/modeling_utils.cpython-310.pyc differ diff --git a/gf/sd-webui-rich-text/diffusers_official/models/__pycache__/prior_transformer.cpython-310.pyc b/gf/sd-webui-rich-text/diffusers_official/models/__pycache__/prior_transformer.cpython-310.pyc new file mode 100644 index 0000000000000000000000000000000000000000..8549ebd5e8070f2469fb9126c6caeb0a340fdb73 Binary files /dev/null and b/gf/sd-webui-rich-text/diffusers_official/models/__pycache__/prior_transformer.cpython-310.pyc differ diff --git a/gf/sd-webui-rich-text/diffusers_official/models/__pycache__/resnet.cpython-310.pyc b/gf/sd-webui-rich-text/diffusers_official/models/__pycache__/resnet.cpython-310.pyc new file mode 100644 index 0000000000000000000000000000000000000000..2bb6400028cdef184871c0fa93845f1ffb6d9d9e Binary files /dev/null and b/gf/sd-webui-rich-text/diffusers_official/models/__pycache__/resnet.cpython-310.pyc differ diff --git a/gf/sd-webui-rich-text/diffusers_official/models/__pycache__/t5_film_transformer.cpython-310.pyc b/gf/sd-webui-rich-text/diffusers_official/models/__pycache__/t5_film_transformer.cpython-310.pyc new file mode 100644 index 0000000000000000000000000000000000000000..e08695c89d6dee5ffb7334f84fc901cd2da57198 Binary files /dev/null and b/gf/sd-webui-rich-text/diffusers_official/models/__pycache__/t5_film_transformer.cpython-310.pyc differ diff --git a/gf/sd-webui-rich-text/diffusers_official/models/__pycache__/transformer_2d.cpython-310.pyc b/gf/sd-webui-rich-text/diffusers_official/models/__pycache__/transformer_2d.cpython-310.pyc new file mode 100644 index 0000000000000000000000000000000000000000..e0f5e73b44e7687c306e951ce11adf194d36daa2 Binary files /dev/null and b/gf/sd-webui-rich-text/diffusers_official/models/__pycache__/transformer_2d.cpython-310.pyc differ diff --git a/gf/sd-webui-rich-text/diffusers_official/models/__pycache__/transformer_temporal.cpython-310.pyc b/gf/sd-webui-rich-text/diffusers_official/models/__pycache__/transformer_temporal.cpython-310.pyc new file mode 100644 index 0000000000000000000000000000000000000000..7a67aa28ecbd2da18744fe239786ade683b78217 Binary files /dev/null and b/gf/sd-webui-rich-text/diffusers_official/models/__pycache__/transformer_temporal.cpython-310.pyc differ diff --git a/gf/sd-webui-rich-text/diffusers_official/models/__pycache__/unet_1d.cpython-310.pyc 
b/gf/sd-webui-rich-text/diffusers_official/models/__pycache__/unet_1d.cpython-310.pyc new file mode 100644 index 0000000000000000000000000000000000000000..786ad9da3e35a19860a54293cfdbe559c7d2904e Binary files /dev/null and b/gf/sd-webui-rich-text/diffusers_official/models/__pycache__/unet_1d.cpython-310.pyc differ diff --git a/gf/sd-webui-rich-text/diffusers_official/models/__pycache__/unet_1d_blocks.cpython-310.pyc b/gf/sd-webui-rich-text/diffusers_official/models/__pycache__/unet_1d_blocks.cpython-310.pyc new file mode 100644 index 0000000000000000000000000000000000000000..c8178dd885718dcffbecc5da83e1a7c6ee690532 Binary files /dev/null and b/gf/sd-webui-rich-text/diffusers_official/models/__pycache__/unet_1d_blocks.cpython-310.pyc differ diff --git a/gf/sd-webui-rich-text/diffusers_official/models/__pycache__/unet_2d.cpython-310.pyc b/gf/sd-webui-rich-text/diffusers_official/models/__pycache__/unet_2d.cpython-310.pyc new file mode 100644 index 0000000000000000000000000000000000000000..d28746ddbf05daab140354b6986aad4c32520a88 Binary files /dev/null and b/gf/sd-webui-rich-text/diffusers_official/models/__pycache__/unet_2d.cpython-310.pyc differ diff --git a/gf/sd-webui-rich-text/diffusers_official/models/__pycache__/unet_2d_blocks.cpython-310.pyc b/gf/sd-webui-rich-text/diffusers_official/models/__pycache__/unet_2d_blocks.cpython-310.pyc new file mode 100644 index 0000000000000000000000000000000000000000..d41166fb4ebc37b4c6ba2d42636f6a4d2a84decb Binary files /dev/null and b/gf/sd-webui-rich-text/diffusers_official/models/__pycache__/unet_2d_blocks.cpython-310.pyc differ diff --git a/gf/sd-webui-rich-text/diffusers_official/models/__pycache__/unet_2d_condition.cpython-310.pyc b/gf/sd-webui-rich-text/diffusers_official/models/__pycache__/unet_2d_condition.cpython-310.pyc new file mode 100644 index 0000000000000000000000000000000000000000..848b24ab3878b9c69e3450924955249a1b1ec081 Binary files /dev/null and b/gf/sd-webui-rich-text/diffusers_official/models/__pycache__/unet_2d_condition.cpython-310.pyc differ diff --git a/gf/sd-webui-rich-text/diffusers_official/models/__pycache__/unet_3d_blocks.cpython-310.pyc b/gf/sd-webui-rich-text/diffusers_official/models/__pycache__/unet_3d_blocks.cpython-310.pyc new file mode 100644 index 0000000000000000000000000000000000000000..3780d514bea878203cf92f85b1e9aa98633b5f35 Binary files /dev/null and b/gf/sd-webui-rich-text/diffusers_official/models/__pycache__/unet_3d_blocks.cpython-310.pyc differ diff --git a/gf/sd-webui-rich-text/diffusers_official/models/__pycache__/unet_3d_condition.cpython-310.pyc b/gf/sd-webui-rich-text/diffusers_official/models/__pycache__/unet_3d_condition.cpython-310.pyc new file mode 100644 index 0000000000000000000000000000000000000000..1756e4eafc03120e7ae580ef5ac6ae8da8ea7bd6 Binary files /dev/null and b/gf/sd-webui-rich-text/diffusers_official/models/__pycache__/unet_3d_condition.cpython-310.pyc differ diff --git a/gf/sd-webui-rich-text/diffusers_official/models/__pycache__/vae.cpython-310.pyc b/gf/sd-webui-rich-text/diffusers_official/models/__pycache__/vae.cpython-310.pyc new file mode 100644 index 0000000000000000000000000000000000000000..26885e20984824af09214dfe67fef1c90a089c8c Binary files /dev/null and b/gf/sd-webui-rich-text/diffusers_official/models/__pycache__/vae.cpython-310.pyc differ diff --git a/gf/sd-webui-rich-text/diffusers_official/models/__pycache__/vq_model.cpython-310.pyc b/gf/sd-webui-rich-text/diffusers_official/models/__pycache__/vq_model.cpython-310.pyc new file mode 100644 index 
0000000000000000000000000000000000000000..e7eb1b5b8a6b96a2936cfca2a5a2837f99ddb699 Binary files /dev/null and b/gf/sd-webui-rich-text/diffusers_official/models/__pycache__/vq_model.cpython-310.pyc differ diff --git a/gf/sd-webui-rich-text/diffusers_official/models/activations.py b/gf/sd-webui-rich-text/diffusers_official/models/activations.py new file mode 100644 index 0000000000000000000000000000000000000000..64759b706e2f108803e51ccd50f9dff67ad49722 --- /dev/null +++ b/gf/sd-webui-rich-text/diffusers_official/models/activations.py @@ -0,0 +1,12 @@ +from torch import nn + + +def get_activation(act_fn): + if act_fn in ["swish", "silu"]: + return nn.SiLU() + elif act_fn == "mish": + return nn.Mish() + elif act_fn == "gelu": + return nn.GELU() + else: + raise ValueError(f"Unsupported activation function: {act_fn}") diff --git a/gf/sd-webui-rich-text/diffusers_official/models/attention.py b/gf/sd-webui-rich-text/diffusers_official/models/attention.py new file mode 100644 index 0000000000000000000000000000000000000000..6b05bf35e87f503df3e265bd587d1ca3f32f2bc5 --- /dev/null +++ b/gf/sd-webui-rich-text/diffusers_official/models/attention.py @@ -0,0 +1,389 @@ +# Copyright 2023 The HuggingFace Team. All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +from typing import Any, Dict, Optional + +import torch +import torch.nn.functional as F +from torch import nn + +from ..utils import maybe_allow_in_graph +from .activations import get_activation +from .attention_processor import Attention +from .embeddings import CombinedTimestepLabelEmbeddings + + +@maybe_allow_in_graph +class BasicTransformerBlock(nn.Module): + r""" + A basic Transformer block. + + Parameters: + dim (`int`): The number of channels in the input and output. + num_attention_heads (`int`): The number of heads to use for multi-head attention. + attention_head_dim (`int`): The number of channels in each head. + dropout (`float`, *optional*, defaults to 0.0): The dropout probability to use. + cross_attention_dim (`int`, *optional*): The size of the encoder_hidden_states vector for cross attention. + only_cross_attention (`bool`, *optional*): + Whether to use only cross-attention layers. In this case two cross attention layers are used. + double_self_attention (`bool`, *optional*): + Whether to use two self-attention layers. In this case no cross attention layers are used. + activation_fn (`str`, *optional*, defaults to `"geglu"`): Activation function to be used in feed-forward. + num_embeds_ada_norm (: + obj: `int`, *optional*): The number of diffusion steps used during training. See `Transformer2DModel`. + attention_bias (: + obj: `bool`, *optional*, defaults to `False`): Configure if the attentions should contain a bias parameter. 
+ """ + + def __init__( + self, + dim: int, + num_attention_heads: int, + attention_head_dim: int, + dropout=0.0, + cross_attention_dim: Optional[int] = None, + activation_fn: str = "geglu", + num_embeds_ada_norm: Optional[int] = None, + attention_bias: bool = False, + only_cross_attention: bool = False, + double_self_attention: bool = False, + upcast_attention: bool = False, + norm_elementwise_affine: bool = True, + norm_type: str = "layer_norm", + final_dropout: bool = False, + ): + super().__init__() + self.only_cross_attention = only_cross_attention + + self.use_ada_layer_norm_zero = (num_embeds_ada_norm is not None) and norm_type == "ada_norm_zero" + self.use_ada_layer_norm = (num_embeds_ada_norm is not None) and norm_type == "ada_norm" + + if norm_type in ("ada_norm", "ada_norm_zero") and num_embeds_ada_norm is None: + raise ValueError( + f"`norm_type` is set to {norm_type}, but `num_embeds_ada_norm` is not defined. Please make sure to" + f" define `num_embeds_ada_norm` if setting `norm_type` to {norm_type}." + ) + + # Define 3 blocks. Each block has its own normalization layer. + # 1. Self-Attn + if self.use_ada_layer_norm: + self.norm1 = AdaLayerNorm(dim, num_embeds_ada_norm) + elif self.use_ada_layer_norm_zero: + self.norm1 = AdaLayerNormZero(dim, num_embeds_ada_norm) + else: + self.norm1 = nn.LayerNorm(dim, elementwise_affine=norm_elementwise_affine) + self.attn1 = Attention( + query_dim=dim, + heads=num_attention_heads, + dim_head=attention_head_dim, + dropout=dropout, + bias=attention_bias, + cross_attention_dim=cross_attention_dim if only_cross_attention else None, + upcast_attention=upcast_attention, + ) + + # 2. Cross-Attn + if cross_attention_dim is not None or double_self_attention: + # We currently only use AdaLayerNormZero for self attention where there will only be one attention block. + # I.e. the number of returned modulation chunks from AdaLayerZero would not make sense if returned during + # the second cross attention block. + self.norm2 = ( + AdaLayerNorm(dim, num_embeds_ada_norm) + if self.use_ada_layer_norm + else nn.LayerNorm(dim, elementwise_affine=norm_elementwise_affine) + ) + self.attn2 = Attention( + query_dim=dim, + cross_attention_dim=cross_attention_dim if not double_self_attention else None, + heads=num_attention_heads, + dim_head=attention_head_dim, + dropout=dropout, + bias=attention_bias, + upcast_attention=upcast_attention, + ) # is self-attn if encoder_hidden_states is none + else: + self.norm2 = None + self.attn2 = None + + # 3. Feed-forward + self.norm3 = nn.LayerNorm(dim, elementwise_affine=norm_elementwise_affine) + self.ff = FeedForward(dim, dropout=dropout, activation_fn=activation_fn, final_dropout=final_dropout) + + # let chunk size default to None + self._chunk_size = None + self._chunk_dim = 0 + + def set_chunk_feed_forward(self, chunk_size: Optional[int], dim: int): + # Sets chunk feed-forward + self._chunk_size = chunk_size + self._chunk_dim = dim + + def forward( + self, + hidden_states: torch.FloatTensor, + attention_mask: Optional[torch.FloatTensor] = None, + encoder_hidden_states: Optional[torch.FloatTensor] = None, + encoder_attention_mask: Optional[torch.FloatTensor] = None, + timestep: Optional[torch.LongTensor] = None, + cross_attention_kwargs: Dict[str, Any] = None, + class_labels: Optional[torch.LongTensor] = None, + ): + # Notice that normalization is always applied before the real computation in the following blocks. + # 1. 
Self-Attention + if self.use_ada_layer_norm: + norm_hidden_states = self.norm1(hidden_states, timestep) + elif self.use_ada_layer_norm_zero: + norm_hidden_states, gate_msa, shift_mlp, scale_mlp, gate_mlp = self.norm1( + hidden_states, timestep, class_labels, hidden_dtype=hidden_states.dtype + ) + else: + norm_hidden_states = self.norm1(hidden_states) + + cross_attention_kwargs = cross_attention_kwargs if cross_attention_kwargs is not None else {} + + attn_output = self.attn1( + norm_hidden_states, + encoder_hidden_states=encoder_hidden_states if self.only_cross_attention else None, + attention_mask=attention_mask, + **cross_attention_kwargs, + ) + if self.use_ada_layer_norm_zero: + attn_output = gate_msa.unsqueeze(1) * attn_output + hidden_states = attn_output + hidden_states + + # 2. Cross-Attention + if self.attn2 is not None: + norm_hidden_states = ( + self.norm2(hidden_states, timestep) if self.use_ada_layer_norm else self.norm2(hidden_states) + ) + + attn_output = self.attn2( + norm_hidden_states, + encoder_hidden_states=encoder_hidden_states, + attention_mask=encoder_attention_mask, + **cross_attention_kwargs, + ) + hidden_states = attn_output + hidden_states + + # 3. Feed-forward + norm_hidden_states = self.norm3(hidden_states) + + if self.use_ada_layer_norm_zero: + norm_hidden_states = norm_hidden_states * (1 + scale_mlp[:, None]) + shift_mlp[:, None] + + if self._chunk_size is not None: + # "feed_forward_chunk_size" can be used to save memory + if norm_hidden_states.shape[self._chunk_dim] % self._chunk_size != 0: + raise ValueError( + f"`hidden_states` dimension to be chunked: {norm_hidden_states.shape[self._chunk_dim]} has to be divisible by chunk size: {self._chunk_size}. Make sure to set an appropriate `chunk_size` when calling `unet.enable_forward_chunking`." + ) + + num_chunks = norm_hidden_states.shape[self._chunk_dim] // self._chunk_size + ff_output = torch.cat( + [self.ff(hid_slice) for hid_slice in norm_hidden_states.chunk(num_chunks, dim=self._chunk_dim)], + dim=self._chunk_dim, + ) + else: + ff_output = self.ff(norm_hidden_states) + + if self.use_ada_layer_norm_zero: + ff_output = gate_mlp.unsqueeze(1) * ff_output + + hidden_states = ff_output + hidden_states + + return hidden_states + + +class FeedForward(nn.Module): + r""" + A feed-forward layer. + + Parameters: + dim (`int`): The number of channels in the input. + dim_out (`int`, *optional*): The number of channels in the output. If not given, defaults to `dim`. + mult (`int`, *optional*, defaults to 4): The multiplier to use for the hidden dimension. + dropout (`float`, *optional*, defaults to 0.0): The dropout probability to use. + activation_fn (`str`, *optional*, defaults to `"geglu"`): Activation function to be used in feed-forward. + final_dropout (`bool` *optional*, defaults to False): Apply a final dropout. 
+ """ + + def __init__( + self, + dim: int, + dim_out: Optional[int] = None, + mult: int = 4, + dropout: float = 0.0, + activation_fn: str = "geglu", + final_dropout: bool = False, + ): + super().__init__() + inner_dim = int(dim * mult) + dim_out = dim_out if dim_out is not None else dim + + if activation_fn == "gelu": + act_fn = GELU(dim, inner_dim) + if activation_fn == "gelu-approximate": + act_fn = GELU(dim, inner_dim, approximate="tanh") + elif activation_fn == "geglu": + act_fn = GEGLU(dim, inner_dim) + elif activation_fn == "geglu-approximate": + act_fn = ApproximateGELU(dim, inner_dim) + + self.net = nn.ModuleList([]) + # project in + self.net.append(act_fn) + # project dropout + self.net.append(nn.Dropout(dropout)) + # project out + self.net.append(nn.Linear(inner_dim, dim_out)) + # FF as used in Vision Transformer, MLP-Mixer, etc. have a final dropout + if final_dropout: + self.net.append(nn.Dropout(dropout)) + + def forward(self, hidden_states): + for module in self.net: + hidden_states = module(hidden_states) + return hidden_states + + +class GELU(nn.Module): + r""" + GELU activation function with tanh approximation support with `approximate="tanh"`. + """ + + def __init__(self, dim_in: int, dim_out: int, approximate: str = "none"): + super().__init__() + self.proj = nn.Linear(dim_in, dim_out) + self.approximate = approximate + + def gelu(self, gate): + if gate.device.type != "mps": + return F.gelu(gate, approximate=self.approximate) + # mps: gelu is not implemented for float16 + return F.gelu(gate.to(dtype=torch.float32), approximate=self.approximate).to(dtype=gate.dtype) + + def forward(self, hidden_states): + hidden_states = self.proj(hidden_states) + hidden_states = self.gelu(hidden_states) + return hidden_states + + +class GEGLU(nn.Module): + r""" + A variant of the gated linear unit activation function from https://arxiv.org/abs/2002.05202. + + Parameters: + dim_in (`int`): The number of channels in the input. + dim_out (`int`): The number of channels in the output. + """ + + def __init__(self, dim_in: int, dim_out: int): + super().__init__() + self.proj = nn.Linear(dim_in, dim_out * 2) + + def gelu(self, gate): + if gate.device.type != "mps": + return F.gelu(gate) + # mps: gelu is not implemented for float16 + return F.gelu(gate.to(dtype=torch.float32)).to(dtype=gate.dtype) + + def forward(self, hidden_states): + hidden_states, gate = self.proj(hidden_states).chunk(2, dim=-1) + return hidden_states * self.gelu(gate) + + +class ApproximateGELU(nn.Module): + """ + The approximate form of Gaussian Error Linear Unit (GELU) + + For more details, see section 2: https://arxiv.org/abs/1606.08415 + """ + + def __init__(self, dim_in: int, dim_out: int): + super().__init__() + self.proj = nn.Linear(dim_in, dim_out) + + def forward(self, x): + x = self.proj(x) + return x * torch.sigmoid(1.702 * x) + + +class AdaLayerNorm(nn.Module): + """ + Norm layer modified to incorporate timestep embeddings. + """ + + def __init__(self, embedding_dim, num_embeddings): + super().__init__() + self.emb = nn.Embedding(num_embeddings, embedding_dim) + self.silu = nn.SiLU() + self.linear = nn.Linear(embedding_dim, embedding_dim * 2) + self.norm = nn.LayerNorm(embedding_dim, elementwise_affine=False) + + def forward(self, x, timestep): + emb = self.linear(self.silu(self.emb(timestep))) + scale, shift = torch.chunk(emb, 2) + x = self.norm(x) * (1 + scale) + shift + return x + + +class AdaLayerNormZero(nn.Module): + """ + Norm layer adaptive layer norm zero (adaLN-Zero). 
+ """ + + def __init__(self, embedding_dim, num_embeddings): + super().__init__() + + self.emb = CombinedTimestepLabelEmbeddings(num_embeddings, embedding_dim) + + self.silu = nn.SiLU() + self.linear = nn.Linear(embedding_dim, 6 * embedding_dim, bias=True) + self.norm = nn.LayerNorm(embedding_dim, elementwise_affine=False, eps=1e-6) + + def forward(self, x, timestep, class_labels, hidden_dtype=None): + emb = self.linear(self.silu(self.emb(timestep, class_labels, hidden_dtype=hidden_dtype))) + shift_msa, scale_msa, gate_msa, shift_mlp, scale_mlp, gate_mlp = emb.chunk(6, dim=1) + x = self.norm(x) * (1 + scale_msa[:, None]) + shift_msa[:, None] + return x, gate_msa, shift_mlp, scale_mlp, gate_mlp + + +class AdaGroupNorm(nn.Module): + """ + GroupNorm layer modified to incorporate timestep embeddings. + """ + + def __init__( + self, embedding_dim: int, out_dim: int, num_groups: int, act_fn: Optional[str] = None, eps: float = 1e-5 + ): + super().__init__() + self.num_groups = num_groups + self.eps = eps + + if act_fn is None: + self.act = None + else: + self.act = get_activation(act_fn) + + self.linear = nn.Linear(embedding_dim, out_dim * 2) + + def forward(self, x, emb): + if self.act: + emb = self.act(emb) + emb = self.linear(emb) + emb = emb[:, :, None, None] + scale, shift = emb.chunk(2, dim=1) + + x = F.group_norm(x, self.num_groups, eps=self.eps) + x = x * (1 + scale) + shift + return x diff --git a/gf/sd-webui-rich-text/diffusers_official/models/attention_flax.py b/gf/sd-webui-rich-text/diffusers_official/models/attention_flax.py new file mode 100644 index 0000000000000000000000000000000000000000..0b160d2384311c1fb426b87c11e5fa1572584070 --- /dev/null +++ b/gf/sd-webui-rich-text/diffusers_official/models/attention_flax.py @@ -0,0 +1,446 @@ +# Copyright 2023 The HuggingFace Team. All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
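Editorial note on the chunked feed-forward path in `BasicTransformerBlock.forward` above (`set_chunk_feed_forward`): because the feed-forward MLP acts on each token independently, splitting the sequence dimension into chunks and concatenating the outputs gives the same result as a single full pass while capping the size of the intermediate activation held in memory. A minimal, self-contained sketch, not part of the vendored files, using a plain `nn.Sequential` MLP as a stand-in for the block's `FeedForward`:

```py
import torch
from torch import nn

dim, seq_len, chunk_size = 64, 128, 32
ff = nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))
hidden_states = torch.randn(2, seq_len, dim)

full = ff(hidden_states)  # one pass, intermediate activation of shape (2, 128, 256)
chunked = torch.cat(
    [ff(chunk) for chunk in hidden_states.chunk(seq_len // chunk_size, dim=1)],
    dim=1,
)  # four passes, each intermediate only (2, 32, 256)

assert torch.allclose(full, chunked, atol=1e-6)
```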
+ +import functools +import math + +import flax.linen as nn +import jax +import jax.numpy as jnp + + +def _query_chunk_attention(query, key, value, precision, key_chunk_size: int = 4096): + """Multi-head dot product attention with a limited number of queries.""" + num_kv, num_heads, k_features = key.shape[-3:] + v_features = value.shape[-1] + key_chunk_size = min(key_chunk_size, num_kv) + query = query / jnp.sqrt(k_features) + + @functools.partial(jax.checkpoint, prevent_cse=False) + def summarize_chunk(query, key, value): + attn_weights = jnp.einsum("...qhd,...khd->...qhk", query, key, precision=precision) + + max_score = jnp.max(attn_weights, axis=-1, keepdims=True) + max_score = jax.lax.stop_gradient(max_score) + exp_weights = jnp.exp(attn_weights - max_score) + + exp_values = jnp.einsum("...vhf,...qhv->...qhf", value, exp_weights, precision=precision) + max_score = jnp.einsum("...qhk->...qh", max_score) + + return (exp_values, exp_weights.sum(axis=-1), max_score) + + def chunk_scanner(chunk_idx): + # julienne key array + key_chunk = jax.lax.dynamic_slice( + operand=key, + start_indices=[0] * (key.ndim - 3) + [chunk_idx, 0, 0], # [...,k,h,d] + slice_sizes=list(key.shape[:-3]) + [key_chunk_size, num_heads, k_features], # [...,k,h,d] + ) + + # julienne value array + value_chunk = jax.lax.dynamic_slice( + operand=value, + start_indices=[0] * (value.ndim - 3) + [chunk_idx, 0, 0], # [...,v,h,d] + slice_sizes=list(value.shape[:-3]) + [key_chunk_size, num_heads, v_features], # [...,v,h,d] + ) + + return summarize_chunk(query, key_chunk, value_chunk) + + chunk_values, chunk_weights, chunk_max = jax.lax.map(f=chunk_scanner, xs=jnp.arange(0, num_kv, key_chunk_size)) + + global_max = jnp.max(chunk_max, axis=0, keepdims=True) + max_diffs = jnp.exp(chunk_max - global_max) + + chunk_values *= jnp.expand_dims(max_diffs, axis=-1) + chunk_weights *= max_diffs + + all_values = chunk_values.sum(axis=0) + all_weights = jnp.expand_dims(chunk_weights, -1).sum(axis=0) + + return all_values / all_weights + + +def jax_memory_efficient_attention( + query, key, value, precision=jax.lax.Precision.HIGHEST, query_chunk_size: int = 1024, key_chunk_size: int = 4096 +): + r""" + Flax Memory-efficient multi-head dot product attention. 
https://arxiv.org/abs/2112.05682v2 + https://github.com/AminRezaei0x443/memory-efficient-attention + + Args: + query (`jnp.ndarray`): (batch..., query_length, head, query_key_depth_per_head) + key (`jnp.ndarray`): (batch..., key_value_length, head, query_key_depth_per_head) + value (`jnp.ndarray`): (batch..., key_value_length, head, value_depth_per_head) + precision (`jax.lax.Precision`, *optional*, defaults to `jax.lax.Precision.HIGHEST`): + numerical precision for computation + query_chunk_size (`int`, *optional*, defaults to 1024): + chunk size to divide query array value must divide query_length equally without remainder + key_chunk_size (`int`, *optional*, defaults to 4096): + chunk size to divide key and value array value must divide key_value_length equally without remainder + + Returns: + (`jnp.ndarray`) with shape of (batch..., query_length, head, value_depth_per_head) + """ + num_q, num_heads, q_features = query.shape[-3:] + + def chunk_scanner(chunk_idx, _): + # julienne query array + query_chunk = jax.lax.dynamic_slice( + operand=query, + start_indices=([0] * (query.ndim - 3)) + [chunk_idx, 0, 0], # [...,q,h,d] + slice_sizes=list(query.shape[:-3]) + [min(query_chunk_size, num_q), num_heads, q_features], # [...,q,h,d] + ) + + return ( + chunk_idx + query_chunk_size, # unused ignore it + _query_chunk_attention( + query=query_chunk, key=key, value=value, precision=precision, key_chunk_size=key_chunk_size + ), + ) + + _, res = jax.lax.scan( + f=chunk_scanner, init=0, xs=None, length=math.ceil(num_q / query_chunk_size) # start counter # stop counter + ) + + return jnp.concatenate(res, axis=-3) # fuse the chunked result back + + +class FlaxAttention(nn.Module): + r""" + A Flax multi-head attention module as described in: https://arxiv.org/abs/1706.03762 + + Parameters: + query_dim (:obj:`int`): + Input hidden states dimension + heads (:obj:`int`, *optional*, defaults to 8): + Number of heads + dim_head (:obj:`int`, *optional*, defaults to 64): + Hidden states dimension inside each head + dropout (:obj:`float`, *optional*, defaults to 0.0): + Dropout rate + use_memory_efficient_attention (`bool`, *optional*, defaults to `False`): + enable memory efficient attention https://arxiv.org/abs/2112.05682 + dtype (:obj:`jnp.dtype`, *optional*, defaults to jnp.float32): + Parameters `dtype` + + """ + query_dim: int + heads: int = 8 + dim_head: int = 64 + dropout: float = 0.0 + use_memory_efficient_attention: bool = False + dtype: jnp.dtype = jnp.float32 + + def setup(self): + inner_dim = self.dim_head * self.heads + self.scale = self.dim_head**-0.5 + + # Weights were exported with old names {to_q, to_k, to_v, to_out} + self.query = nn.Dense(inner_dim, use_bias=False, dtype=self.dtype, name="to_q") + self.key = nn.Dense(inner_dim, use_bias=False, dtype=self.dtype, name="to_k") + self.value = nn.Dense(inner_dim, use_bias=False, dtype=self.dtype, name="to_v") + + self.proj_attn = nn.Dense(self.query_dim, dtype=self.dtype, name="to_out_0") + self.dropout_layer = nn.Dropout(rate=self.dropout) + + def reshape_heads_to_batch_dim(self, tensor): + batch_size, seq_len, dim = tensor.shape + head_size = self.heads + tensor = tensor.reshape(batch_size, seq_len, head_size, dim // head_size) + tensor = jnp.transpose(tensor, (0, 2, 1, 3)) + tensor = tensor.reshape(batch_size * head_size, seq_len, dim // head_size) + return tensor + + def reshape_batch_dim_to_heads(self, tensor): + batch_size, seq_len, dim = tensor.shape + head_size = self.heads + tensor = tensor.reshape(batch_size // head_size, head_size, 
seq_len, dim) + tensor = jnp.transpose(tensor, (0, 2, 1, 3)) + tensor = tensor.reshape(batch_size // head_size, seq_len, dim * head_size) + return tensor + + def __call__(self, hidden_states, context=None, deterministic=True): + context = hidden_states if context is None else context + + query_proj = self.query(hidden_states) + key_proj = self.key(context) + value_proj = self.value(context) + + query_states = self.reshape_heads_to_batch_dim(query_proj) + key_states = self.reshape_heads_to_batch_dim(key_proj) + value_states = self.reshape_heads_to_batch_dim(value_proj) + + if self.use_memory_efficient_attention: + query_states = query_states.transpose(1, 0, 2) + key_states = key_states.transpose(1, 0, 2) + value_states = value_states.transpose(1, 0, 2) + + # this if statement create a chunk size for each layer of the unet + # the chunk size is equal to the query_length dimension of the deepest layer of the unet + + flatten_latent_dim = query_states.shape[-3] + if flatten_latent_dim % 64 == 0: + query_chunk_size = int(flatten_latent_dim / 64) + elif flatten_latent_dim % 16 == 0: + query_chunk_size = int(flatten_latent_dim / 16) + elif flatten_latent_dim % 4 == 0: + query_chunk_size = int(flatten_latent_dim / 4) + else: + query_chunk_size = int(flatten_latent_dim) + + hidden_states = jax_memory_efficient_attention( + query_states, key_states, value_states, query_chunk_size=query_chunk_size, key_chunk_size=4096 * 4 + ) + + hidden_states = hidden_states.transpose(1, 0, 2) + else: + # compute attentions + attention_scores = jnp.einsum("b i d, b j d->b i j", query_states, key_states) + attention_scores = attention_scores * self.scale + attention_probs = nn.softmax(attention_scores, axis=2) + + # attend to values + hidden_states = jnp.einsum("b i j, b j d -> b i d", attention_probs, value_states) + + hidden_states = self.reshape_batch_dim_to_heads(hidden_states) + hidden_states = self.proj_attn(hidden_states) + return self.dropout_layer(hidden_states, deterministic=deterministic) + + +class FlaxBasicTransformerBlock(nn.Module): + r""" + A Flax transformer block layer with `GLU` (Gated Linear Unit) activation function as described in: + https://arxiv.org/abs/1706.03762 + + + Parameters: + dim (:obj:`int`): + Inner hidden states dimension + n_heads (:obj:`int`): + Number of heads + d_head (:obj:`int`): + Hidden states dimension inside each head + dropout (:obj:`float`, *optional*, defaults to 0.0): + Dropout rate + only_cross_attention (`bool`, defaults to `False`): + Whether to only apply cross attention. 
+ dtype (:obj:`jnp.dtype`, *optional*, defaults to jnp.float32): + Parameters `dtype` + use_memory_efficient_attention (`bool`, *optional*, defaults to `False`): + enable memory efficient attention https://arxiv.org/abs/2112.05682 + """ + dim: int + n_heads: int + d_head: int + dropout: float = 0.0 + only_cross_attention: bool = False + dtype: jnp.dtype = jnp.float32 + use_memory_efficient_attention: bool = False + + def setup(self): + # self attention (or cross_attention if only_cross_attention is True) + self.attn1 = FlaxAttention( + self.dim, self.n_heads, self.d_head, self.dropout, self.use_memory_efficient_attention, dtype=self.dtype + ) + # cross attention + self.attn2 = FlaxAttention( + self.dim, self.n_heads, self.d_head, self.dropout, self.use_memory_efficient_attention, dtype=self.dtype + ) + self.ff = FlaxFeedForward(dim=self.dim, dropout=self.dropout, dtype=self.dtype) + self.norm1 = nn.LayerNorm(epsilon=1e-5, dtype=self.dtype) + self.norm2 = nn.LayerNorm(epsilon=1e-5, dtype=self.dtype) + self.norm3 = nn.LayerNorm(epsilon=1e-5, dtype=self.dtype) + self.dropout_layer = nn.Dropout(rate=self.dropout) + + def __call__(self, hidden_states, context, deterministic=True): + # self attention + residual = hidden_states + if self.only_cross_attention: + hidden_states = self.attn1(self.norm1(hidden_states), context, deterministic=deterministic) + else: + hidden_states = self.attn1(self.norm1(hidden_states), deterministic=deterministic) + hidden_states = hidden_states + residual + + # cross attention + residual = hidden_states + hidden_states = self.attn2(self.norm2(hidden_states), context, deterministic=deterministic) + hidden_states = hidden_states + residual + + # feed forward + residual = hidden_states + hidden_states = self.ff(self.norm3(hidden_states), deterministic=deterministic) + hidden_states = hidden_states + residual + + return self.dropout_layer(hidden_states, deterministic=deterministic) + + +class FlaxTransformer2DModel(nn.Module): + r""" + A Spatial Transformer layer with Gated Linear Unit (GLU) activation function as described in: + https://arxiv.org/pdf/1506.02025.pdf + + + Parameters: + in_channels (:obj:`int`): + Input number of channels + n_heads (:obj:`int`): + Number of heads + d_head (:obj:`int`): + Hidden states dimension inside each head + depth (:obj:`int`, *optional*, defaults to 1): + Number of transformers block + dropout (:obj:`float`, *optional*, defaults to 0.0): + Dropout rate + use_linear_projection (`bool`, defaults to `False`): tbd + only_cross_attention (`bool`, defaults to `False`): tbd + dtype (:obj:`jnp.dtype`, *optional*, defaults to jnp.float32): + Parameters `dtype` + use_memory_efficient_attention (`bool`, *optional*, defaults to `False`): + enable memory efficient attention https://arxiv.org/abs/2112.05682 + """ + in_channels: int + n_heads: int + d_head: int + depth: int = 1 + dropout: float = 0.0 + use_linear_projection: bool = False + only_cross_attention: bool = False + dtype: jnp.dtype = jnp.float32 + use_memory_efficient_attention: bool = False + + def setup(self): + self.norm = nn.GroupNorm(num_groups=32, epsilon=1e-5) + + inner_dim = self.n_heads * self.d_head + if self.use_linear_projection: + self.proj_in = nn.Dense(inner_dim, dtype=self.dtype) + else: + self.proj_in = nn.Conv( + inner_dim, + kernel_size=(1, 1), + strides=(1, 1), + padding="VALID", + dtype=self.dtype, + ) + + self.transformer_blocks = [ + FlaxBasicTransformerBlock( + inner_dim, + self.n_heads, + self.d_head, + dropout=self.dropout, + 
only_cross_attention=self.only_cross_attention, + dtype=self.dtype, + use_memory_efficient_attention=self.use_memory_efficient_attention, + ) + for _ in range(self.depth) + ] + + if self.use_linear_projection: + self.proj_out = nn.Dense(inner_dim, dtype=self.dtype) + else: + self.proj_out = nn.Conv( + inner_dim, + kernel_size=(1, 1), + strides=(1, 1), + padding="VALID", + dtype=self.dtype, + ) + + self.dropout_layer = nn.Dropout(rate=self.dropout) + + def __call__(self, hidden_states, context, deterministic=True): + batch, height, width, channels = hidden_states.shape + residual = hidden_states + hidden_states = self.norm(hidden_states) + if self.use_linear_projection: + hidden_states = hidden_states.reshape(batch, height * width, channels) + hidden_states = self.proj_in(hidden_states) + else: + hidden_states = self.proj_in(hidden_states) + hidden_states = hidden_states.reshape(batch, height * width, channels) + + for transformer_block in self.transformer_blocks: + hidden_states = transformer_block(hidden_states, context, deterministic=deterministic) + + if self.use_linear_projection: + hidden_states = self.proj_out(hidden_states) + hidden_states = hidden_states.reshape(batch, height, width, channels) + else: + hidden_states = hidden_states.reshape(batch, height, width, channels) + hidden_states = self.proj_out(hidden_states) + + hidden_states = hidden_states + residual + return self.dropout_layer(hidden_states, deterministic=deterministic) + + +class FlaxFeedForward(nn.Module): + r""" + Flax module that encapsulates two Linear layers separated by a non-linearity. It is the counterpart of PyTorch's + [`FeedForward`] class, with the following simplifications: + - The activation function is currently hardcoded to a gated linear unit from: + https://arxiv.org/abs/2002.05202 + - `dim_out` is equal to `dim`. + - The number of hidden dimensions is hardcoded to `dim * 4` in [`FlaxGELU`]. + + Parameters: + dim (:obj:`int`): + Inner hidden states dimension + dropout (:obj:`float`, *optional*, defaults to 0.0): + Dropout rate + dtype (:obj:`jnp.dtype`, *optional*, defaults to jnp.float32): + Parameters `dtype` + """ + dim: int + dropout: float = 0.0 + dtype: jnp.dtype = jnp.float32 + + def setup(self): + # The second linear layer needs to be called + # net_2 for now to match the index of the Sequential layer + self.net_0 = FlaxGEGLU(self.dim, self.dropout, self.dtype) + self.net_2 = nn.Dense(self.dim, dtype=self.dtype) + + def __call__(self, hidden_states, deterministic=True): + hidden_states = self.net_0(hidden_states, deterministic=deterministic) + hidden_states = self.net_2(hidden_states) + return hidden_states + + +class FlaxGEGLU(nn.Module): + r""" + Flax implementation of a Linear layer followed by the variant of the gated linear unit activation function from + https://arxiv.org/abs/2002.05202. 
+ + Parameters: + dim (:obj:`int`): + Input hidden states dimension + dropout (:obj:`float`, *optional*, defaults to 0.0): + Dropout rate + dtype (:obj:`jnp.dtype`, *optional*, defaults to jnp.float32): + Parameters `dtype` + """ + dim: int + dropout: float = 0.0 + dtype: jnp.dtype = jnp.float32 + + def setup(self): + inner_dim = self.dim * 4 + self.proj = nn.Dense(inner_dim * 2, dtype=self.dtype) + self.dropout_layer = nn.Dropout(rate=self.dropout) + + def __call__(self, hidden_states, deterministic=True): + hidden_states = self.proj(hidden_states) + hidden_linear, hidden_gelu = jnp.split(hidden_states, 2, axis=2) + return self.dropout_layer(hidden_linear * nn.gelu(hidden_gelu), deterministic=deterministic) diff --git a/gf/sd-webui-rich-text/diffusers_official/models/attention_processor.py b/gf/sd-webui-rich-text/diffusers_official/models/attention_processor.py new file mode 100644 index 0000000000000000000000000000000000000000..5b6a161f84664a697e11e793e7ed96066d301ace --- /dev/null +++ b/gf/sd-webui-rich-text/diffusers_official/models/attention_processor.py @@ -0,0 +1,1647 @@ +# Copyright 2023 The HuggingFace Team. All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +from typing import Callable, Optional, Union + +import torch +import torch.nn.functional as F +from torch import nn + +from ..utils import deprecate, logging, maybe_allow_in_graph +from ..utils.import_utils import is_xformers_available + + +logger = logging.get_logger(__name__) # pylint: disable=invalid-name + + +if is_xformers_available(): + import xformers + import xformers.ops +else: + xformers = None + + +@maybe_allow_in_graph +class Attention(nn.Module): + r""" + A cross attention layer. + + Parameters: + query_dim (`int`): The number of channels in the query. + cross_attention_dim (`int`, *optional*): + The number of channels in the encoder_hidden_states. If not given, defaults to `query_dim`. + heads (`int`, *optional*, defaults to 8): The number of heads to use for multi-head attention. + dim_head (`int`, *optional*, defaults to 64): The number of channels in each head. + dropout (`float`, *optional*, defaults to 0.0): The dropout probability to use. + bias (`bool`, *optional*, defaults to False): + Set to `True` for the query, key, and value linear layers to contain a bias parameter. 
+ """ + + def __init__( + self, + query_dim: int, + cross_attention_dim: Optional[int] = None, + heads: int = 8, + dim_head: int = 64, + dropout: float = 0.0, + bias=False, + upcast_attention: bool = False, + upcast_softmax: bool = False, + cross_attention_norm: Optional[str] = None, + cross_attention_norm_num_groups: int = 32, + added_kv_proj_dim: Optional[int] = None, + norm_num_groups: Optional[int] = None, + spatial_norm_dim: Optional[int] = None, + out_bias: bool = True, + scale_qk: bool = True, + only_cross_attention: bool = False, + eps: float = 1e-5, + rescale_output_factor: float = 1.0, + residual_connection: bool = False, + _from_deprecated_attn_block=False, + processor: Optional["AttnProcessor"] = None, + ): + super().__init__() + inner_dim = dim_head * heads + cross_attention_dim = cross_attention_dim if cross_attention_dim is not None else query_dim + self.upcast_attention = upcast_attention + self.upcast_softmax = upcast_softmax + self.rescale_output_factor = rescale_output_factor + self.residual_connection = residual_connection + self.dropout = dropout + + # we make use of this private variable to know whether this class is loaded + # with an deprecated state dict so that we can convert it on the fly + self._from_deprecated_attn_block = _from_deprecated_attn_block + + self.scale_qk = scale_qk + self.scale = dim_head**-0.5 if self.scale_qk else 1.0 + + self.heads = heads + # for slice_size > 0 the attention score computation + # is split across the batch axis to save memory + # You can set slice_size with `set_attention_slice` + self.sliceable_head_dim = heads + + self.added_kv_proj_dim = added_kv_proj_dim + self.only_cross_attention = only_cross_attention + + if self.added_kv_proj_dim is None and self.only_cross_attention: + raise ValueError( + "`only_cross_attention` can only be set to True if `added_kv_proj_dim` is not None. Make sure to set either `only_cross_attention=False` or define `added_kv_proj_dim`." + ) + + if norm_num_groups is not None: + self.group_norm = nn.GroupNorm(num_channels=query_dim, num_groups=norm_num_groups, eps=eps, affine=True) + else: + self.group_norm = None + + if spatial_norm_dim is not None: + self.spatial_norm = SpatialNorm(f_channels=query_dim, zq_channels=spatial_norm_dim) + else: + self.spatial_norm = None + + if cross_attention_norm is None: + self.norm_cross = None + elif cross_attention_norm == "layer_norm": + self.norm_cross = nn.LayerNorm(cross_attention_dim) + elif cross_attention_norm == "group_norm": + if self.added_kv_proj_dim is not None: + # The given `encoder_hidden_states` are initially of shape + # (batch_size, seq_len, added_kv_proj_dim) before being projected + # to (batch_size, seq_len, cross_attention_dim). The norm is applied + # before the projection, so we need to use `added_kv_proj_dim` as + # the number of channels for the group norm. + norm_cross_num_channels = added_kv_proj_dim + else: + norm_cross_num_channels = cross_attention_dim + + self.norm_cross = nn.GroupNorm( + num_channels=norm_cross_num_channels, num_groups=cross_attention_norm_num_groups, eps=1e-5, affine=True + ) + else: + raise ValueError( + f"unknown cross_attention_norm: {cross_attention_norm}. 
Should be None, 'layer_norm' or 'group_norm'" + ) + + self.to_q = nn.Linear(query_dim, inner_dim, bias=bias) + + if not self.only_cross_attention: + # only relevant for the `AddedKVProcessor` classes + self.to_k = nn.Linear(cross_attention_dim, inner_dim, bias=bias) + self.to_v = nn.Linear(cross_attention_dim, inner_dim, bias=bias) + else: + self.to_k = None + self.to_v = None + + if self.added_kv_proj_dim is not None: + self.add_k_proj = nn.Linear(added_kv_proj_dim, inner_dim) + self.add_v_proj = nn.Linear(added_kv_proj_dim, inner_dim) + + self.to_out = nn.ModuleList([]) + self.to_out.append(nn.Linear(inner_dim, query_dim, bias=out_bias)) + self.to_out.append(nn.Dropout(dropout)) + + # set attention processor + # We use the AttnProcessor2_0 by default when torch 2.x is used which uses + # torch.nn.functional.scaled_dot_product_attention for native Flash/memory_efficient_attention + # but only if it has the default `scale` argument. TODO remove scale_qk check when we move to torch 2.1 + if processor is None: + processor = ( + AttnProcessor2_0() if hasattr(F, "scaled_dot_product_attention") and self.scale_qk else AttnProcessor() + ) + self.set_processor(processor) + + def set_use_memory_efficient_attention_xformers( + self, use_memory_efficient_attention_xformers: bool, attention_op: Optional[Callable] = None + ): + is_lora = hasattr(self, "processor") and isinstance( + self.processor, + (LoRAAttnProcessor, LoRAAttnProcessor2_0, LoRAXFormersAttnProcessor, LoRAAttnAddedKVProcessor), + ) + is_custom_diffusion = hasattr(self, "processor") and isinstance( + self.processor, (CustomDiffusionAttnProcessor, CustomDiffusionXFormersAttnProcessor) + ) + is_added_kv_processor = hasattr(self, "processor") and isinstance( + self.processor, + ( + AttnAddedKVProcessor, + AttnAddedKVProcessor2_0, + SlicedAttnAddedKVProcessor, + XFormersAttnAddedKVProcessor, + LoRAAttnAddedKVProcessor, + ), + ) + + if use_memory_efficient_attention_xformers: + if is_added_kv_processor and (is_lora or is_custom_diffusion): + raise NotImplementedError( + f"Memory efficient attention is currently not supported for LoRA or custom diffuson for attention processor type {self.processor}" + ) + if not is_xformers_available(): + raise ModuleNotFoundError( + ( + "Refer to https://github.com/facebookresearch/xformers for more information on how to install" + " xformers" + ), + name="xformers", + ) + elif not torch.cuda.is_available(): + raise ValueError( + "torch.cuda.is_available() should be True but is False. xformers' memory efficient attention is" + " only available for GPU " + ) + else: + try: + # Make sure we can run the memory efficient attention + _ = xformers.ops.memory_efficient_attention( + torch.randn((1, 2, 40), device="cuda"), + torch.randn((1, 2, 40), device="cuda"), + torch.randn((1, 2, 40), device="cuda"), + ) + except Exception as e: + raise e + + if is_lora: + # TODO (sayakpaul): should we throw a warning if someone wants to use the xformers + # variant when using PT 2.0 now that we have LoRAAttnProcessor2_0? 
+ processor = LoRAXFormersAttnProcessor( + hidden_size=self.processor.hidden_size, + cross_attention_dim=self.processor.cross_attention_dim, + rank=self.processor.rank, + attention_op=attention_op, + ) + processor.load_state_dict(self.processor.state_dict()) + processor.to(self.processor.to_q_lora.up.weight.device) + elif is_custom_diffusion: + processor = CustomDiffusionXFormersAttnProcessor( + train_kv=self.processor.train_kv, + train_q_out=self.processor.train_q_out, + hidden_size=self.processor.hidden_size, + cross_attention_dim=self.processor.cross_attention_dim, + attention_op=attention_op, + ) + processor.load_state_dict(self.processor.state_dict()) + if hasattr(self.processor, "to_k_custom_diffusion"): + processor.to(self.processor.to_k_custom_diffusion.weight.device) + elif is_added_kv_processor: + # TODO(Patrick, Suraj, William) - currently xformers doesn't work for UnCLIP + # which uses this type of cross attention ONLY because the attention mask of format + # [0, ..., -10.000, ..., 0, ...,] is not supported + # throw warning + logger.info( + "Memory efficient attention with `xformers` might currently not work correctly if an attention mask is required for the attention operation." + ) + processor = XFormersAttnAddedKVProcessor(attention_op=attention_op) + else: + processor = XFormersAttnProcessor(attention_op=attention_op) + else: + if is_lora: + attn_processor_class = ( + LoRAAttnProcessor2_0 if hasattr(F, "scaled_dot_product_attention") else LoRAAttnProcessor + ) + processor = attn_processor_class( + hidden_size=self.processor.hidden_size, + cross_attention_dim=self.processor.cross_attention_dim, + rank=self.processor.rank, + ) + processor.load_state_dict(self.processor.state_dict()) + processor.to(self.processor.to_q_lora.up.weight.device) + elif is_custom_diffusion: + processor = CustomDiffusionAttnProcessor( + train_kv=self.processor.train_kv, + train_q_out=self.processor.train_q_out, + hidden_size=self.processor.hidden_size, + cross_attention_dim=self.processor.cross_attention_dim, + ) + processor.load_state_dict(self.processor.state_dict()) + if hasattr(self.processor, "to_k_custom_diffusion"): + processor.to(self.processor.to_k_custom_diffusion.weight.device) + else: + # set attention processor + # We use the AttnProcessor2_0 by default when torch 2.x is used which uses + # torch.nn.functional.scaled_dot_product_attention for native Flash/memory_efficient_attention + # but only if it has the default `scale` argument. TODO remove scale_qk check when we move to torch 2.1 + processor = ( + AttnProcessor2_0() + if hasattr(F, "scaled_dot_product_attention") and self.scale_qk + else AttnProcessor() + ) + + self.set_processor(processor) + + def set_attention_slice(self, slice_size): + if slice_size is not None and slice_size > self.sliceable_head_dim: + raise ValueError(f"slice_size {slice_size} has to be smaller or equal to {self.sliceable_head_dim}.") + + if slice_size is not None and self.added_kv_proj_dim is not None: + processor = SlicedAttnAddedKVProcessor(slice_size) + elif slice_size is not None: + processor = SlicedAttnProcessor(slice_size) + elif self.added_kv_proj_dim is not None: + processor = AttnAddedKVProcessor() + else: + # set attention processor + # We use the AttnProcessor2_0 by default when torch 2.x is used which uses + # torch.nn.functional.scaled_dot_product_attention for native Flash/memory_efficient_attention + # but only if it has the default `scale` argument. 
TODO remove scale_qk check when we move to torch 2.1 + processor = ( + AttnProcessor2_0() if hasattr(F, "scaled_dot_product_attention") and self.scale_qk else AttnProcessor() + ) + + self.set_processor(processor) + + def set_processor(self, processor: "AttnProcessor"): + # if current processor is in `self._modules` and if passed `processor` is not, we need to + # pop `processor` from `self._modules` + if ( + hasattr(self, "processor") + and isinstance(self.processor, torch.nn.Module) + and not isinstance(processor, torch.nn.Module) + ): + logger.info(f"You are removing possibly trained weights of {self.processor} with {processor}") + self._modules.pop("processor") + + self.processor = processor + + def forward(self, hidden_states, encoder_hidden_states=None, attention_mask=None, **cross_attention_kwargs): + # The `Attention` class can call different attention processors / attention functions + # here we simply pass along all tensors to the selected processor class + # For standard processors that are defined here, `**cross_attention_kwargs` is empty + return self.processor( + self, + hidden_states, + encoder_hidden_states=encoder_hidden_states, + attention_mask=attention_mask, + **cross_attention_kwargs, + ) + + def batch_to_head_dim(self, tensor): + head_size = self.heads + batch_size, seq_len, dim = tensor.shape + tensor = tensor.reshape(batch_size // head_size, head_size, seq_len, dim) + tensor = tensor.permute(0, 2, 1, 3).reshape(batch_size // head_size, seq_len, dim * head_size) + return tensor + + def head_to_batch_dim(self, tensor, out_dim=3): + head_size = self.heads + batch_size, seq_len, dim = tensor.shape + tensor = tensor.reshape(batch_size, seq_len, head_size, dim // head_size) + tensor = tensor.permute(0, 2, 1, 3) + + if out_dim == 3: + tensor = tensor.reshape(batch_size * head_size, seq_len, dim // head_size) + + return tensor + + def get_attention_scores(self, query, key, attention_mask=None): + dtype = query.dtype + if self.upcast_attention: + query = query.float() + key = key.float() + + if attention_mask is None: + baddbmm_input = torch.empty( + query.shape[0], query.shape[1], key.shape[1], dtype=query.dtype, device=query.device + ) + beta = 0 + else: + baddbmm_input = attention_mask + beta = 1 + + attention_scores = torch.baddbmm( + baddbmm_input, + query, + key.transpose(-1, -2), + beta=beta, + alpha=self.scale, + ) + del baddbmm_input + + if self.upcast_softmax: + attention_scores = attention_scores.float() + + attention_probs = attention_scores.softmax(dim=-1) + del attention_scores + + attention_probs = attention_probs.to(dtype) + + return attention_probs + + def prepare_attention_mask(self, attention_mask, target_length, batch_size=None, out_dim=3): + if batch_size is None: + deprecate( + "batch_size=None", + "0.0.15", + ( + "Not passing the `batch_size` parameter to `prepare_attention_mask` can lead to incorrect" + " attention mask preparation and is deprecated behavior. Please make sure to pass `batch_size` to" + " `prepare_attention_mask` when preparing the attention_mask." + ), + ) + batch_size = 1 + + head_size = self.heads + if attention_mask is None: + return attention_mask + + current_length: int = attention_mask.shape[-1] + if current_length != target_length: + if attention_mask.device.type == "mps": + # HACK: MPS: Does not support padding by greater than dimension of input tensor. + # Instead, we can manually construct the padding tensor. 
+ padding_shape = (attention_mask.shape[0], attention_mask.shape[1], target_length) + padding = torch.zeros(padding_shape, dtype=attention_mask.dtype, device=attention_mask.device) + attention_mask = torch.cat([attention_mask, padding], dim=2) + else: + # TODO: for pipelines such as stable-diffusion, padding cross-attn mask: + # we want to instead pad by (0, remaining_length), where remaining_length is: + # remaining_length: int = target_length - current_length + # TODO: re-enable tests/models/test_models_unet_2d_condition.py#test_model_xattn_padding + attention_mask = F.pad(attention_mask, (0, target_length), value=0.0) + + if out_dim == 3: + if attention_mask.shape[0] < batch_size * head_size: + attention_mask = attention_mask.repeat_interleave(head_size, dim=0) + elif out_dim == 4: + attention_mask = attention_mask.unsqueeze(1) + attention_mask = attention_mask.repeat_interleave(head_size, dim=1) + + return attention_mask + + def norm_encoder_hidden_states(self, encoder_hidden_states): + assert self.norm_cross is not None, "self.norm_cross must be defined to call self.norm_encoder_hidden_states" + + if isinstance(self.norm_cross, nn.LayerNorm): + encoder_hidden_states = self.norm_cross(encoder_hidden_states) + elif isinstance(self.norm_cross, nn.GroupNorm): + # Group norm norms along the channels dimension and expects + # input to be in the shape of (N, C, *). In this case, we want + # to norm along the hidden dimension, so we need to move + # (batch_size, sequence_length, hidden_size) -> + # (batch_size, hidden_size, sequence_length) + encoder_hidden_states = encoder_hidden_states.transpose(1, 2) + encoder_hidden_states = self.norm_cross(encoder_hidden_states) + encoder_hidden_states = encoder_hidden_states.transpose(1, 2) + else: + assert False + + return encoder_hidden_states + + +class AttnProcessor: + r""" + Default processor for performing attention-related computations. 
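The default processor defers the score computation to `get_attention_scores` above, which fuses the additive attention mask into a single `baddbmm` call and then applies softmax. A minimal sketch of that step, with illustrative shapes:

```python
# Mirrors the baddbmm/softmax/bmm path of get_attention_scores (shapes are made up).
import torch

batch_heads, q_len, k_len, head_dim = 16, 64, 77, 40
query = torch.randn(batch_heads, q_len, head_dim)
key = torch.randn(batch_heads, k_len, head_dim)
value = torch.randn(batch_heads, k_len, head_dim)
attention_mask = torch.zeros(batch_heads, q_len, k_len)   # additive bias; 0 keeps a position

scale = head_dim ** -0.5
scores = torch.baddbmm(
    attention_mask,           # added with beta=1 when a mask is provided
    query,
    key.transpose(-1, -2),
    beta=1,
    alpha=scale,
)
probs = scores.softmax(dim=-1)
out = torch.bmm(probs, value)  # (batch * heads, q_len, head_dim)
```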
+ """ + + def __call__( + self, + attn: Attention, + hidden_states, + encoder_hidden_states=None, + attention_mask=None, + temb=None, + ): + residual = hidden_states + + if attn.spatial_norm is not None: + hidden_states = attn.spatial_norm(hidden_states, temb) + + input_ndim = hidden_states.ndim + + if input_ndim == 4: + batch_size, channel, height, width = hidden_states.shape + hidden_states = hidden_states.view(batch_size, channel, height * width).transpose(1, 2) + + batch_size, sequence_length, _ = ( + hidden_states.shape if encoder_hidden_states is None else encoder_hidden_states.shape + ) + attention_mask = attn.prepare_attention_mask(attention_mask, sequence_length, batch_size) + + if attn.group_norm is not None: + hidden_states = attn.group_norm(hidden_states.transpose(1, 2)).transpose(1, 2) + + query = attn.to_q(hidden_states) + + if encoder_hidden_states is None: + encoder_hidden_states = hidden_states + elif attn.norm_cross: + encoder_hidden_states = attn.norm_encoder_hidden_states(encoder_hidden_states) + + key = attn.to_k(encoder_hidden_states) + value = attn.to_v(encoder_hidden_states) + + query = attn.head_to_batch_dim(query) + key = attn.head_to_batch_dim(key) + value = attn.head_to_batch_dim(value) + + attention_probs = attn.get_attention_scores(query, key, attention_mask) + hidden_states = torch.bmm(attention_probs, value) + hidden_states = attn.batch_to_head_dim(hidden_states) + + # linear proj + hidden_states = attn.to_out[0](hidden_states) + # dropout + hidden_states = attn.to_out[1](hidden_states) + + if input_ndim == 4: + hidden_states = hidden_states.transpose(-1, -2).reshape(batch_size, channel, height, width) + + if attn.residual_connection: + hidden_states = hidden_states + residual + + hidden_states = hidden_states / attn.rescale_output_factor + + return hidden_states + + +class LoRALinearLayer(nn.Module): + def __init__(self, in_features, out_features, rank=4, network_alpha=None): + super().__init__() + + if rank > min(in_features, out_features): + raise ValueError(f"LoRA rank {rank} must be less or equal than {min(in_features, out_features)}") + + self.down = nn.Linear(in_features, rank, bias=False) + self.up = nn.Linear(rank, out_features, bias=False) + # This value has the same meaning as the `--network_alpha` option in the kohya-ss trainer script. + # See https://github.com/darkstorm2150/sd-scripts/blob/main/docs/train_network_README-en.md#execute-learning + self.network_alpha = network_alpha + self.rank = rank + + nn.init.normal_(self.down.weight, std=1 / rank) + nn.init.zeros_(self.up.weight) + + def forward(self, hidden_states): + orig_dtype = hidden_states.dtype + dtype = self.down.weight.dtype + + down_hidden_states = self.down(hidden_states.to(dtype)) + up_hidden_states = self.up(down_hidden_states) + + if self.network_alpha is not None: + up_hidden_states *= self.network_alpha / self.rank + + return up_hidden_states.to(orig_dtype) + + +class LoRAAttnProcessor(nn.Module): + r""" + Processor for implementing the LoRA attention mechanism. + + Args: + hidden_size (`int`, *optional*): + The hidden size of the attention layer. + cross_attention_dim (`int`, *optional*): + The number of channels in the `encoder_hidden_states`. + rank (`int`, defaults to 4): + The dimension of the LoRA update matrices. + network_alpha (`int`, *optional*): + Equivalent to `alpha` but it's usage is specific to Kohya (A1111) style LoRAs. 
+ """ + + def __init__(self, hidden_size, cross_attention_dim=None, rank=4, network_alpha=None): + super().__init__() + + self.hidden_size = hidden_size + self.cross_attention_dim = cross_attention_dim + self.rank = rank + + self.to_q_lora = LoRALinearLayer(hidden_size, hidden_size, rank, network_alpha) + self.to_k_lora = LoRALinearLayer(cross_attention_dim or hidden_size, hidden_size, rank, network_alpha) + self.to_v_lora = LoRALinearLayer(cross_attention_dim or hidden_size, hidden_size, rank, network_alpha) + self.to_out_lora = LoRALinearLayer(hidden_size, hidden_size, rank, network_alpha) + + def __call__( + self, attn: Attention, hidden_states, encoder_hidden_states=None, attention_mask=None, scale=1.0, temb=None + ): + residual = hidden_states + + if attn.spatial_norm is not None: + hidden_states = attn.spatial_norm(hidden_states, temb) + + input_ndim = hidden_states.ndim + + if input_ndim == 4: + batch_size, channel, height, width = hidden_states.shape + hidden_states = hidden_states.view(batch_size, channel, height * width).transpose(1, 2) + + batch_size, sequence_length, _ = ( + hidden_states.shape if encoder_hidden_states is None else encoder_hidden_states.shape + ) + attention_mask = attn.prepare_attention_mask(attention_mask, sequence_length, batch_size) + + if attn.group_norm is not None: + hidden_states = attn.group_norm(hidden_states.transpose(1, 2)).transpose(1, 2) + + query = attn.to_q(hidden_states) + scale * self.to_q_lora(hidden_states) + query = attn.head_to_batch_dim(query) + + if encoder_hidden_states is None: + encoder_hidden_states = hidden_states + elif attn.norm_cross: + encoder_hidden_states = attn.norm_encoder_hidden_states(encoder_hidden_states) + + key = attn.to_k(encoder_hidden_states) + scale * self.to_k_lora(encoder_hidden_states) + value = attn.to_v(encoder_hidden_states) + scale * self.to_v_lora(encoder_hidden_states) + + key = attn.head_to_batch_dim(key) + value = attn.head_to_batch_dim(value) + + attention_probs = attn.get_attention_scores(query, key, attention_mask) + hidden_states = torch.bmm(attention_probs, value) + hidden_states = attn.batch_to_head_dim(hidden_states) + + # linear proj + hidden_states = attn.to_out[0](hidden_states) + scale * self.to_out_lora(hidden_states) + # dropout + hidden_states = attn.to_out[1](hidden_states) + + if input_ndim == 4: + hidden_states = hidden_states.transpose(-1, -2).reshape(batch_size, channel, height, width) + + if attn.residual_connection: + hidden_states = hidden_states + residual + + hidden_states = hidden_states / attn.rescale_output_factor + + return hidden_states + + +class CustomDiffusionAttnProcessor(nn.Module): + r""" + Processor for implementing attention for the Custom Diffusion method. + + Args: + train_kv (`bool`, defaults to `True`): + Whether to newly train the key and value matrices corresponding to the text features. + train_q_out (`bool`, defaults to `True`): + Whether to newly train query matrices corresponding to the latent image features. + hidden_size (`int`, *optional*, defaults to `None`): + The hidden size of the attention layer. + cross_attention_dim (`int`, *optional*, defaults to `None`): + The number of channels in the `encoder_hidden_states`. + out_bias (`bool`, defaults to `True`): + Whether to include the bias parameter in `train_q_out`. + dropout (`float`, *optional*, defaults to 0.0): + The dropout probability to use. 
+ """ + + def __init__( + self, + train_kv=True, + train_q_out=True, + hidden_size=None, + cross_attention_dim=None, + out_bias=True, + dropout=0.0, + ): + super().__init__() + self.train_kv = train_kv + self.train_q_out = train_q_out + + self.hidden_size = hidden_size + self.cross_attention_dim = cross_attention_dim + + # `_custom_diffusion` id for easy serialization and loading. + if self.train_kv: + self.to_k_custom_diffusion = nn.Linear(cross_attention_dim or hidden_size, hidden_size, bias=False) + self.to_v_custom_diffusion = nn.Linear(cross_attention_dim or hidden_size, hidden_size, bias=False) + if self.train_q_out: + self.to_q_custom_diffusion = nn.Linear(hidden_size, hidden_size, bias=False) + self.to_out_custom_diffusion = nn.ModuleList([]) + self.to_out_custom_diffusion.append(nn.Linear(hidden_size, hidden_size, bias=out_bias)) + self.to_out_custom_diffusion.append(nn.Dropout(dropout)) + + def __call__(self, attn: Attention, hidden_states, encoder_hidden_states=None, attention_mask=None): + batch_size, sequence_length, _ = hidden_states.shape + attention_mask = attn.prepare_attention_mask(attention_mask, sequence_length, batch_size) + if self.train_q_out: + query = self.to_q_custom_diffusion(hidden_states) + else: + query = attn.to_q(hidden_states) + + if encoder_hidden_states is None: + crossattn = False + encoder_hidden_states = hidden_states + else: + crossattn = True + if attn.norm_cross: + encoder_hidden_states = attn.norm_encoder_hidden_states(encoder_hidden_states) + + if self.train_kv: + key = self.to_k_custom_diffusion(encoder_hidden_states) + value = self.to_v_custom_diffusion(encoder_hidden_states) + else: + key = attn.to_k(encoder_hidden_states) + value = attn.to_v(encoder_hidden_states) + + if crossattn: + detach = torch.ones_like(key) + detach[:, :1, :] = detach[:, :1, :] * 0.0 + key = detach * key + (1 - detach) * key.detach() + value = detach * value + (1 - detach) * value.detach() + + query = attn.head_to_batch_dim(query) + key = attn.head_to_batch_dim(key) + value = attn.head_to_batch_dim(value) + + attention_probs = attn.get_attention_scores(query, key, attention_mask) + hidden_states = torch.bmm(attention_probs, value) + hidden_states = attn.batch_to_head_dim(hidden_states) + + if self.train_q_out: + # linear proj + hidden_states = self.to_out_custom_diffusion[0](hidden_states) + # dropout + hidden_states = self.to_out_custom_diffusion[1](hidden_states) + else: + # linear proj + hidden_states = attn.to_out[0](hidden_states) + # dropout + hidden_states = attn.to_out[1](hidden_states) + + return hidden_states + + +class AttnAddedKVProcessor: + r""" + Processor for performing attention-related computations with extra learnable key and value matrices for the text + encoder. 
+ """ + + def __call__(self, attn: Attention, hidden_states, encoder_hidden_states=None, attention_mask=None): + residual = hidden_states + hidden_states = hidden_states.view(hidden_states.shape[0], hidden_states.shape[1], -1).transpose(1, 2) + batch_size, sequence_length, _ = hidden_states.shape + + attention_mask = attn.prepare_attention_mask(attention_mask, sequence_length, batch_size) + + if encoder_hidden_states is None: + encoder_hidden_states = hidden_states + elif attn.norm_cross: + encoder_hidden_states = attn.norm_encoder_hidden_states(encoder_hidden_states) + + hidden_states = attn.group_norm(hidden_states.transpose(1, 2)).transpose(1, 2) + + query = attn.to_q(hidden_states) + query = attn.head_to_batch_dim(query) + + encoder_hidden_states_key_proj = attn.add_k_proj(encoder_hidden_states) + encoder_hidden_states_value_proj = attn.add_v_proj(encoder_hidden_states) + encoder_hidden_states_key_proj = attn.head_to_batch_dim(encoder_hidden_states_key_proj) + encoder_hidden_states_value_proj = attn.head_to_batch_dim(encoder_hidden_states_value_proj) + + if not attn.only_cross_attention: + key = attn.to_k(hidden_states) + value = attn.to_v(hidden_states) + key = attn.head_to_batch_dim(key) + value = attn.head_to_batch_dim(value) + key = torch.cat([encoder_hidden_states_key_proj, key], dim=1) + value = torch.cat([encoder_hidden_states_value_proj, value], dim=1) + else: + key = encoder_hidden_states_key_proj + value = encoder_hidden_states_value_proj + + attention_probs = attn.get_attention_scores(query, key, attention_mask) + hidden_states = torch.bmm(attention_probs, value) + hidden_states = attn.batch_to_head_dim(hidden_states) + + # linear proj + hidden_states = attn.to_out[0](hidden_states) + # dropout + hidden_states = attn.to_out[1](hidden_states) + + hidden_states = hidden_states.transpose(-1, -2).reshape(residual.shape) + hidden_states = hidden_states + residual + + return hidden_states + + +class AttnAddedKVProcessor2_0: + r""" + Processor for performing scaled dot-product attention (enabled by default if you're using PyTorch 2.0), with extra + learnable key and value matrices for the text encoder. + """ + + def __init__(self): + if not hasattr(F, "scaled_dot_product_attention"): + raise ImportError( + "AttnAddedKVProcessor2_0 requires PyTorch 2.0, to use it, please upgrade PyTorch to 2.0." 
+ ) + + def __call__(self, attn: Attention, hidden_states, encoder_hidden_states=None, attention_mask=None): + residual = hidden_states + hidden_states = hidden_states.view(hidden_states.shape[0], hidden_states.shape[1], -1).transpose(1, 2) + batch_size, sequence_length, _ = hidden_states.shape + + attention_mask = attn.prepare_attention_mask(attention_mask, sequence_length, batch_size, out_dim=4) + + if encoder_hidden_states is None: + encoder_hidden_states = hidden_states + elif attn.norm_cross: + encoder_hidden_states = attn.norm_encoder_hidden_states(encoder_hidden_states) + + hidden_states = attn.group_norm(hidden_states.transpose(1, 2)).transpose(1, 2) + + query = attn.to_q(hidden_states) + query = attn.head_to_batch_dim(query, out_dim=4) + + encoder_hidden_states_key_proj = attn.add_k_proj(encoder_hidden_states) + encoder_hidden_states_value_proj = attn.add_v_proj(encoder_hidden_states) + encoder_hidden_states_key_proj = attn.head_to_batch_dim(encoder_hidden_states_key_proj, out_dim=4) + encoder_hidden_states_value_proj = attn.head_to_batch_dim(encoder_hidden_states_value_proj, out_dim=4) + + if not attn.only_cross_attention: + key = attn.to_k(hidden_states) + value = attn.to_v(hidden_states) + key = attn.head_to_batch_dim(key, out_dim=4) + value = attn.head_to_batch_dim(value, out_dim=4) + key = torch.cat([encoder_hidden_states_key_proj, key], dim=2) + value = torch.cat([encoder_hidden_states_value_proj, value], dim=2) + else: + key = encoder_hidden_states_key_proj + value = encoder_hidden_states_value_proj + + # the output of sdp = (batch, num_heads, seq_len, head_dim) + # TODO: add support for attn.scale when we move to Torch 2.1 + hidden_states = F.scaled_dot_product_attention( + query, key, value, attn_mask=attention_mask, dropout_p=0.0, is_causal=False + ) + hidden_states = hidden_states.transpose(1, 2).reshape(batch_size, -1, residual.shape[1]) + + # linear proj + hidden_states = attn.to_out[0](hidden_states) + # dropout + hidden_states = attn.to_out[1](hidden_states) + + hidden_states = hidden_states.transpose(-1, -2).reshape(residual.shape) + hidden_states = hidden_states + residual + + return hidden_states + + +class LoRAAttnAddedKVProcessor(nn.Module): + r""" + Processor for implementing the LoRA attention mechanism with extra learnable key and value matrices for the text + encoder. + + Args: + hidden_size (`int`, *optional*): + The hidden size of the attention layer. + cross_attention_dim (`int`, *optional*, defaults to `None`): + The number of channels in the `encoder_hidden_states`. + rank (`int`, defaults to 4): + The dimension of the LoRA update matrices. 
+ + """ + + def __init__(self, hidden_size, cross_attention_dim=None, rank=4, network_alpha=None): + super().__init__() + + self.hidden_size = hidden_size + self.cross_attention_dim = cross_attention_dim + self.rank = rank + + self.to_q_lora = LoRALinearLayer(hidden_size, hidden_size, rank, network_alpha) + self.add_k_proj_lora = LoRALinearLayer(cross_attention_dim or hidden_size, hidden_size, rank, network_alpha) + self.add_v_proj_lora = LoRALinearLayer(cross_attention_dim or hidden_size, hidden_size, rank, network_alpha) + self.to_k_lora = LoRALinearLayer(hidden_size, hidden_size, rank, network_alpha) + self.to_v_lora = LoRALinearLayer(hidden_size, hidden_size, rank, network_alpha) + self.to_out_lora = LoRALinearLayer(hidden_size, hidden_size, rank, network_alpha) + + def __call__(self, attn: Attention, hidden_states, encoder_hidden_states=None, attention_mask=None, scale=1.0): + residual = hidden_states + hidden_states = hidden_states.view(hidden_states.shape[0], hidden_states.shape[1], -1).transpose(1, 2) + batch_size, sequence_length, _ = hidden_states.shape + + attention_mask = attn.prepare_attention_mask(attention_mask, sequence_length, batch_size) + + if encoder_hidden_states is None: + encoder_hidden_states = hidden_states + elif attn.norm_cross: + encoder_hidden_states = attn.norm_encoder_hidden_states(encoder_hidden_states) + + hidden_states = attn.group_norm(hidden_states.transpose(1, 2)).transpose(1, 2) + + query = attn.to_q(hidden_states) + scale * self.to_q_lora(hidden_states) + query = attn.head_to_batch_dim(query) + + encoder_hidden_states_key_proj = attn.add_k_proj(encoder_hidden_states) + scale * self.add_k_proj_lora( + encoder_hidden_states + ) + encoder_hidden_states_value_proj = attn.add_v_proj(encoder_hidden_states) + scale * self.add_v_proj_lora( + encoder_hidden_states + ) + encoder_hidden_states_key_proj = attn.head_to_batch_dim(encoder_hidden_states_key_proj) + encoder_hidden_states_value_proj = attn.head_to_batch_dim(encoder_hidden_states_value_proj) + + if not attn.only_cross_attention: + key = attn.to_k(hidden_states) + scale * self.to_k_lora(hidden_states) + value = attn.to_v(hidden_states) + scale * self.to_v_lora(hidden_states) + key = attn.head_to_batch_dim(key) + value = attn.head_to_batch_dim(value) + key = torch.cat([encoder_hidden_states_key_proj, key], dim=1) + value = torch.cat([encoder_hidden_states_value_proj, value], dim=1) + else: + key = encoder_hidden_states_key_proj + value = encoder_hidden_states_value_proj + + attention_probs = attn.get_attention_scores(query, key, attention_mask) + hidden_states = torch.bmm(attention_probs, value) + hidden_states = attn.batch_to_head_dim(hidden_states) + + # linear proj + hidden_states = attn.to_out[0](hidden_states) + scale * self.to_out_lora(hidden_states) + # dropout + hidden_states = attn.to_out[1](hidden_states) + + hidden_states = hidden_states.transpose(-1, -2).reshape(residual.shape) + hidden_states = hidden_states + residual + + return hidden_states + + +class XFormersAttnAddedKVProcessor: + r""" + Processor for implementing memory efficient attention using xFormers. + + Args: + attention_op (`Callable`, *optional*, defaults to `None`): + The base + [operator](https://facebookresearch.github.io/xformers/components/ops.html#xformers.ops.AttentionOpBase) to + use as the attention operator. It is recommended to set to `None`, and allow xFormers to choose the best + operator. 
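A minimal sketch of opting a single layer into the xformers path via `set_use_memory_efficient_attention_xformers` defined above; it assumes xformers is installed, a CUDA device is present, and imports the class from the upstream `diffusers` package rather than this vendored copy:

```python
import torch
from diffusers.models.attention_processor import Attention  # upstream path; assumed available

attn = Attention(query_dim=320, heads=8, dim_head=40).cuda()
attn.set_use_memory_efficient_attention_xformers(True)      # swaps in XFormersAttnProcessor

hidden_states = torch.randn(2, 4096, 320, device="cuda")
with torch.no_grad():
    out = attn(hidden_states)                                # self-attention; (2, 4096, 320)
```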
+ """ + + def __init__(self, attention_op: Optional[Callable] = None): + self.attention_op = attention_op + + def __call__(self, attn: Attention, hidden_states, encoder_hidden_states=None, attention_mask=None): + residual = hidden_states + hidden_states = hidden_states.view(hidden_states.shape[0], hidden_states.shape[1], -1).transpose(1, 2) + batch_size, sequence_length, _ = hidden_states.shape + + attention_mask = attn.prepare_attention_mask(attention_mask, sequence_length, batch_size) + + if encoder_hidden_states is None: + encoder_hidden_states = hidden_states + elif attn.norm_cross: + encoder_hidden_states = attn.norm_encoder_hidden_states(encoder_hidden_states) + + hidden_states = attn.group_norm(hidden_states.transpose(1, 2)).transpose(1, 2) + + query = attn.to_q(hidden_states) + query = attn.head_to_batch_dim(query) + + encoder_hidden_states_key_proj = attn.add_k_proj(encoder_hidden_states) + encoder_hidden_states_value_proj = attn.add_v_proj(encoder_hidden_states) + encoder_hidden_states_key_proj = attn.head_to_batch_dim(encoder_hidden_states_key_proj) + encoder_hidden_states_value_proj = attn.head_to_batch_dim(encoder_hidden_states_value_proj) + + if not attn.only_cross_attention: + key = attn.to_k(hidden_states) + value = attn.to_v(hidden_states) + key = attn.head_to_batch_dim(key) + value = attn.head_to_batch_dim(value) + key = torch.cat([encoder_hidden_states_key_proj, key], dim=1) + value = torch.cat([encoder_hidden_states_value_proj, value], dim=1) + else: + key = encoder_hidden_states_key_proj + value = encoder_hidden_states_value_proj + + hidden_states = xformers.ops.memory_efficient_attention( + query, key, value, attn_bias=attention_mask, op=self.attention_op, scale=attn.scale + ) + hidden_states = hidden_states.to(query.dtype) + hidden_states = attn.batch_to_head_dim(hidden_states) + + # linear proj + hidden_states = attn.to_out[0](hidden_states) + # dropout + hidden_states = attn.to_out[1](hidden_states) + + hidden_states = hidden_states.transpose(-1, -2).reshape(residual.shape) + hidden_states = hidden_states + residual + + return hidden_states + + +class XFormersAttnProcessor: + r""" + Processor for implementing memory efficient attention using xFormers. + + Args: + attention_op (`Callable`, *optional*, defaults to `None`): + The base + [operator](https://facebookresearch.github.io/xformers/components/ops.html#xformers.ops.AttentionOpBase) to + use as the attention operator. It is recommended to set to `None`, and allow xFormers to choose the best + operator. 
+ """ + + def __init__(self, attention_op: Optional[Callable] = None): + self.attention_op = attention_op + + def __call__( + self, + attn: Attention, + hidden_states: torch.FloatTensor, + encoder_hidden_states: Optional[torch.FloatTensor] = None, + attention_mask: Optional[torch.FloatTensor] = None, + temb: Optional[torch.FloatTensor] = None, + ): + residual = hidden_states + + if attn.spatial_norm is not None: + hidden_states = attn.spatial_norm(hidden_states, temb) + + input_ndim = hidden_states.ndim + + if input_ndim == 4: + batch_size, channel, height, width = hidden_states.shape + hidden_states = hidden_states.view(batch_size, channel, height * width).transpose(1, 2) + + batch_size, key_tokens, _ = ( + hidden_states.shape if encoder_hidden_states is None else encoder_hidden_states.shape + ) + + attention_mask = attn.prepare_attention_mask(attention_mask, key_tokens, batch_size) + if attention_mask is not None: + # expand our mask's singleton query_tokens dimension: + # [batch*heads, 1, key_tokens] -> + # [batch*heads, query_tokens, key_tokens] + # so that it can be added as a bias onto the attention scores that xformers computes: + # [batch*heads, query_tokens, key_tokens] + # we do this explicitly because xformers doesn't broadcast the singleton dimension for us. + _, query_tokens, _ = hidden_states.shape + attention_mask = attention_mask.expand(-1, query_tokens, -1) + + if attn.group_norm is not None: + hidden_states = attn.group_norm(hidden_states.transpose(1, 2)).transpose(1, 2) + + query = attn.to_q(hidden_states) + + if encoder_hidden_states is None: + encoder_hidden_states = hidden_states + elif attn.norm_cross: + encoder_hidden_states = attn.norm_encoder_hidden_states(encoder_hidden_states) + + key = attn.to_k(encoder_hidden_states) + value = attn.to_v(encoder_hidden_states) + + query = attn.head_to_batch_dim(query).contiguous() + key = attn.head_to_batch_dim(key).contiguous() + value = attn.head_to_batch_dim(value).contiguous() + + hidden_states = xformers.ops.memory_efficient_attention( + query, key, value, attn_bias=attention_mask, op=self.attention_op, scale=attn.scale + ) + hidden_states = hidden_states.to(query.dtype) + hidden_states = attn.batch_to_head_dim(hidden_states) + + # linear proj + hidden_states = attn.to_out[0](hidden_states) + # dropout + hidden_states = attn.to_out[1](hidden_states) + + if input_ndim == 4: + hidden_states = hidden_states.transpose(-1, -2).reshape(batch_size, channel, height, width) + + if attn.residual_connection: + hidden_states = hidden_states + residual + + hidden_states = hidden_states / attn.rescale_output_factor + + return hidden_states + + +class AttnProcessor2_0: + r""" + Processor for implementing scaled dot-product attention (enabled by default if you're using PyTorch 2.0). 
+ """ + + def __init__(self): + if not hasattr(F, "scaled_dot_product_attention"): + raise ImportError("AttnProcessor2_0 requires PyTorch 2.0, to use it, please upgrade PyTorch to 2.0.") + + def __call__( + self, + attn: Attention, + hidden_states, + encoder_hidden_states=None, + attention_mask=None, + temb=None, + ): + residual = hidden_states + + if attn.spatial_norm is not None: + hidden_states = attn.spatial_norm(hidden_states, temb) + + input_ndim = hidden_states.ndim + + if input_ndim == 4: + batch_size, channel, height, width = hidden_states.shape + hidden_states = hidden_states.view(batch_size, channel, height * width).transpose(1, 2) + + batch_size, sequence_length, _ = ( + hidden_states.shape if encoder_hidden_states is None else encoder_hidden_states.shape + ) + inner_dim = hidden_states.shape[-1] + + if attention_mask is not None: + attention_mask = attn.prepare_attention_mask(attention_mask, sequence_length, batch_size) + # scaled_dot_product_attention expects attention_mask shape to be + # (batch, heads, source_length, target_length) + attention_mask = attention_mask.view(batch_size, attn.heads, -1, attention_mask.shape[-1]) + + if attn.group_norm is not None: + hidden_states = attn.group_norm(hidden_states.transpose(1, 2)).transpose(1, 2) + + query = attn.to_q(hidden_states) + + if encoder_hidden_states is None: + encoder_hidden_states = hidden_states + elif attn.norm_cross: + encoder_hidden_states = attn.norm_encoder_hidden_states(encoder_hidden_states) + + key = attn.to_k(encoder_hidden_states) + value = attn.to_v(encoder_hidden_states) + + head_dim = inner_dim // attn.heads + + query = query.view(batch_size, -1, attn.heads, head_dim).transpose(1, 2) + + key = key.view(batch_size, -1, attn.heads, head_dim).transpose(1, 2) + value = value.view(batch_size, -1, attn.heads, head_dim).transpose(1, 2) + + # the output of sdp = (batch, num_heads, seq_len, head_dim) + # TODO: add support for attn.scale when we move to Torch 2.1 + hidden_states = F.scaled_dot_product_attention( + query, key, value, attn_mask=attention_mask, dropout_p=0.0, is_causal=False + ) + + hidden_states = hidden_states.transpose(1, 2).reshape(batch_size, -1, attn.heads * head_dim) + hidden_states = hidden_states.to(query.dtype) + + # linear proj + hidden_states = attn.to_out[0](hidden_states) + # dropout + hidden_states = attn.to_out[1](hidden_states) + + if input_ndim == 4: + hidden_states = hidden_states.transpose(-1, -2).reshape(batch_size, channel, height, width) + + if attn.residual_connection: + hidden_states = hidden_states + residual + + hidden_states = hidden_states / attn.rescale_output_factor + + return hidden_states + + +class LoRAXFormersAttnProcessor(nn.Module): + r""" + Processor for implementing the LoRA attention mechanism with memory efficient attention using xFormers. + + Args: + hidden_size (`int`, *optional*): + The hidden size of the attention layer. + cross_attention_dim (`int`, *optional*): + The number of channels in the `encoder_hidden_states`. + rank (`int`, defaults to 4): + The dimension of the LoRA update matrices. + attention_op (`Callable`, *optional*, defaults to `None`): + The base + [operator](https://facebookresearch.github.io/xformers/components/ops.html#xformers.ops.AttentionOpBase) to + use as the attention operator. It is recommended to set to `None`, and allow xFormers to choose the best + operator. + network_alpha (`int`, *optional*): + Equivalent to `alpha` but it's usage is specific to Kohya (A1111) style LoRAs. 
+ + """ + + def __init__( + self, hidden_size, cross_attention_dim, rank=4, attention_op: Optional[Callable] = None, network_alpha=None + ): + super().__init__() + + self.hidden_size = hidden_size + self.cross_attention_dim = cross_attention_dim + self.rank = rank + self.attention_op = attention_op + + self.to_q_lora = LoRALinearLayer(hidden_size, hidden_size, rank, network_alpha) + self.to_k_lora = LoRALinearLayer(cross_attention_dim or hidden_size, hidden_size, rank, network_alpha) + self.to_v_lora = LoRALinearLayer(cross_attention_dim or hidden_size, hidden_size, rank, network_alpha) + self.to_out_lora = LoRALinearLayer(hidden_size, hidden_size, rank, network_alpha) + + def __call__( + self, attn: Attention, hidden_states, encoder_hidden_states=None, attention_mask=None, scale=1.0, temb=None + ): + residual = hidden_states + + if attn.spatial_norm is not None: + hidden_states = attn.spatial_norm(hidden_states, temb) + + input_ndim = hidden_states.ndim + + if input_ndim == 4: + batch_size, channel, height, width = hidden_states.shape + hidden_states = hidden_states.view(batch_size, channel, height * width).transpose(1, 2) + + batch_size, sequence_length, _ = ( + hidden_states.shape if encoder_hidden_states is None else encoder_hidden_states.shape + ) + attention_mask = attn.prepare_attention_mask(attention_mask, sequence_length, batch_size) + + if attn.group_norm is not None: + hidden_states = attn.group_norm(hidden_states.transpose(1, 2)).transpose(1, 2) + + query = attn.to_q(hidden_states) + scale * self.to_q_lora(hidden_states) + query = attn.head_to_batch_dim(query).contiguous() + + if encoder_hidden_states is None: + encoder_hidden_states = hidden_states + elif attn.norm_cross: + encoder_hidden_states = attn.norm_encoder_hidden_states(encoder_hidden_states) + + key = attn.to_k(encoder_hidden_states) + scale * self.to_k_lora(encoder_hidden_states) + value = attn.to_v(encoder_hidden_states) + scale * self.to_v_lora(encoder_hidden_states) + + key = attn.head_to_batch_dim(key).contiguous() + value = attn.head_to_batch_dim(value).contiguous() + + hidden_states = xformers.ops.memory_efficient_attention( + query, key, value, attn_bias=attention_mask, op=self.attention_op, scale=attn.scale + ) + hidden_states = attn.batch_to_head_dim(hidden_states) + + # linear proj + hidden_states = attn.to_out[0](hidden_states) + scale * self.to_out_lora(hidden_states) + # dropout + hidden_states = attn.to_out[1](hidden_states) + + if input_ndim == 4: + hidden_states = hidden_states.transpose(-1, -2).reshape(batch_size, channel, height, width) + + if attn.residual_connection: + hidden_states = hidden_states + residual + + hidden_states = hidden_states / attn.rescale_output_factor + + return hidden_states + + +class LoRAAttnProcessor2_0(nn.Module): + r""" + Processor for implementing the LoRA attention mechanism using PyTorch 2.0's memory-efficient scaled dot-product + attention. + + Args: + hidden_size (`int`): + The hidden size of the attention layer. + cross_attention_dim (`int`, *optional*): + The number of channels in the `encoder_hidden_states`. + rank (`int`, defaults to 4): + The dimension of the LoRA update matrices. + network_alpha (`int`, *optional*): + Equivalent to `alpha` but it's usage is specific to Kohya (A1111) style LoRAs. 
+ """ + + def __init__(self, hidden_size, cross_attention_dim=None, rank=4, network_alpha=None): + super().__init__() + if not hasattr(F, "scaled_dot_product_attention"): + raise ImportError("AttnProcessor2_0 requires PyTorch 2.0, to use it, please upgrade PyTorch to 2.0.") + + self.hidden_size = hidden_size + self.cross_attention_dim = cross_attention_dim + self.rank = rank + + self.to_q_lora = LoRALinearLayer(hidden_size, hidden_size, rank, network_alpha) + self.to_k_lora = LoRALinearLayer(cross_attention_dim or hidden_size, hidden_size, rank, network_alpha) + self.to_v_lora = LoRALinearLayer(cross_attention_dim or hidden_size, hidden_size, rank, network_alpha) + self.to_out_lora = LoRALinearLayer(hidden_size, hidden_size, rank, network_alpha) + + def __call__(self, attn: Attention, hidden_states, encoder_hidden_states=None, attention_mask=None, scale=1.0): + residual = hidden_states + + input_ndim = hidden_states.ndim + + if input_ndim == 4: + batch_size, channel, height, width = hidden_states.shape + hidden_states = hidden_states.view(batch_size, channel, height * width).transpose(1, 2) + + batch_size, sequence_length, _ = ( + hidden_states.shape if encoder_hidden_states is None else encoder_hidden_states.shape + ) + inner_dim = hidden_states.shape[-1] + + if attention_mask is not None: + attention_mask = attn.prepare_attention_mask(attention_mask, sequence_length, batch_size) + # scaled_dot_product_attention expects attention_mask shape to be + # (batch, heads, source_length, target_length) + attention_mask = attention_mask.view(batch_size, attn.heads, -1, attention_mask.shape[-1]) + + if attn.group_norm is not None: + hidden_states = attn.group_norm(hidden_states.transpose(1, 2)).transpose(1, 2) + + query = attn.to_q(hidden_states) + scale * self.to_q_lora(hidden_states) + + if encoder_hidden_states is None: + encoder_hidden_states = hidden_states + elif attn.norm_cross: + encoder_hidden_states = attn.norm_encoder_hidden_states(encoder_hidden_states) + + key = attn.to_k(encoder_hidden_states) + scale * self.to_k_lora(encoder_hidden_states) + value = attn.to_v(encoder_hidden_states) + scale * self.to_v_lora(encoder_hidden_states) + + head_dim = inner_dim // attn.heads + query = query.view(batch_size, -1, attn.heads, head_dim).transpose(1, 2) + key = key.view(batch_size, -1, attn.heads, head_dim).transpose(1, 2) + value = value.view(batch_size, -1, attn.heads, head_dim).transpose(1, 2) + + # TODO: add support for attn.scale when we move to Torch 2.1 + hidden_states = F.scaled_dot_product_attention( + query, key, value, attn_mask=attention_mask, dropout_p=0.0, is_causal=False + ) + hidden_states = hidden_states.transpose(1, 2).reshape(batch_size, -1, attn.heads * head_dim) + hidden_states = hidden_states.to(query.dtype) + + # linear proj + hidden_states = attn.to_out[0](hidden_states) + scale * self.to_out_lora(hidden_states) + # dropout + hidden_states = attn.to_out[1](hidden_states) + + if input_ndim == 4: + hidden_states = hidden_states.transpose(-1, -2).reshape(batch_size, channel, height, width) + + if attn.residual_connection: + hidden_states = hidden_states + residual + + hidden_states = hidden_states / attn.rescale_output_factor + + return hidden_states + + +class CustomDiffusionXFormersAttnProcessor(nn.Module): + r""" + Processor for implementing memory efficient attention using xFormers for the Custom Diffusion method. + + Args: + train_kv (`bool`, defaults to `True`): + Whether to newly train the key and value matrices corresponding to the text features. 
+ train_q_out (`bool`, defaults to `True`): + Whether to newly train query matrices corresponding to the latent image features. + hidden_size (`int`, *optional*, defaults to `None`): + The hidden size of the attention layer. + cross_attention_dim (`int`, *optional*, defaults to `None`): + The number of channels in the `encoder_hidden_states`. + out_bias (`bool`, defaults to `True`): + Whether to include the bias parameter in `train_q_out`. + dropout (`float`, *optional*, defaults to 0.0): + The dropout probability to use. + attention_op (`Callable`, *optional*, defaults to `None`): + The base + [operator](https://facebookresearch.github.io/xformers/components/ops.html#xformers.ops.AttentionOpBase) to use + as the attention operator. It is recommended to set to `None`, and allow xFormers to choose the best operator. + """ + + def __init__( + self, + train_kv=True, + train_q_out=False, + hidden_size=None, + cross_attention_dim=None, + out_bias=True, + dropout=0.0, + attention_op: Optional[Callable] = None, + ): + super().__init__() + self.train_kv = train_kv + self.train_q_out = train_q_out + + self.hidden_size = hidden_size + self.cross_attention_dim = cross_attention_dim + self.attention_op = attention_op + + # `_custom_diffusion` id for easy serialization and loading. + if self.train_kv: + self.to_k_custom_diffusion = nn.Linear(cross_attention_dim or hidden_size, hidden_size, bias=False) + self.to_v_custom_diffusion = nn.Linear(cross_attention_dim or hidden_size, hidden_size, bias=False) + if self.train_q_out: + self.to_q_custom_diffusion = nn.Linear(hidden_size, hidden_size, bias=False) + self.to_out_custom_diffusion = nn.ModuleList([]) + self.to_out_custom_diffusion.append(nn.Linear(hidden_size, hidden_size, bias=out_bias)) + self.to_out_custom_diffusion.append(nn.Dropout(dropout)) + + def __call__(self, attn: Attention, hidden_states, encoder_hidden_states=None, attention_mask=None): + batch_size, sequence_length, _ = ( + hidden_states.shape if encoder_hidden_states is None else encoder_hidden_states.shape + ) + + attention_mask = attn.prepare_attention_mask(attention_mask, sequence_length, batch_size) + + if self.train_q_out: + query = self.to_q_custom_diffusion(hidden_states) + else: + query = attn.to_q(hidden_states) + + if encoder_hidden_states is None: + crossattn = False + encoder_hidden_states = hidden_states + else: + crossattn = True + if attn.norm_cross: + encoder_hidden_states = attn.norm_encoder_hidden_states(encoder_hidden_states) + + if self.train_kv: + key = self.to_k_custom_diffusion(encoder_hidden_states) + value = self.to_v_custom_diffusion(encoder_hidden_states) + else: + key = attn.to_k(encoder_hidden_states) + value = attn.to_v(encoder_hidden_states) + + if crossattn: + detach = torch.ones_like(key) + detach[:, :1, :] = detach[:, :1, :] * 0.0 + key = detach * key + (1 - detach) * key.detach() + value = detach * value + (1 - detach) * value.detach() + + query = attn.head_to_batch_dim(query).contiguous() + key = attn.head_to_batch_dim(key).contiguous() + value = attn.head_to_batch_dim(value).contiguous() + + hidden_states = xformers.ops.memory_efficient_attention( + query, key, value, attn_bias=attention_mask, op=self.attention_op, scale=attn.scale + ) + hidden_states = hidden_states.to(query.dtype) + hidden_states = attn.batch_to_head_dim(hidden_states) + + if self.train_q_out: + # linear proj + hidden_states = self.to_out_custom_diffusion[0](hidden_states) + # dropout + hidden_states = self.to_out_custom_diffusion[1](hidden_states) + else: + # linear proj + 
hidden_states = attn.to_out[0](hidden_states) + # dropout + hidden_states = attn.to_out[1](hidden_states) + return hidden_states + + +class SlicedAttnProcessor: + r""" + Processor for implementing sliced attention. + + Args: + slice_size (`int`, *optional*): + The number of steps to compute attention. Uses as many slices as `attention_head_dim // slice_size`, and + `attention_head_dim` must be a multiple of the `slice_size`. + """ + + def __init__(self, slice_size): + self.slice_size = slice_size + + def __call__(self, attn: Attention, hidden_states, encoder_hidden_states=None, attention_mask=None): + residual = hidden_states + + input_ndim = hidden_states.ndim + + if input_ndim == 4: + batch_size, channel, height, width = hidden_states.shape + hidden_states = hidden_states.view(batch_size, channel, height * width).transpose(1, 2) + + batch_size, sequence_length, _ = ( + hidden_states.shape if encoder_hidden_states is None else encoder_hidden_states.shape + ) + attention_mask = attn.prepare_attention_mask(attention_mask, sequence_length, batch_size) + + if attn.group_norm is not None: + hidden_states = attn.group_norm(hidden_states.transpose(1, 2)).transpose(1, 2) + + query = attn.to_q(hidden_states) + dim = query.shape[-1] + query = attn.head_to_batch_dim(query) + + if encoder_hidden_states is None: + encoder_hidden_states = hidden_states + elif attn.norm_cross: + encoder_hidden_states = attn.norm_encoder_hidden_states(encoder_hidden_states) + + key = attn.to_k(encoder_hidden_states) + value = attn.to_v(encoder_hidden_states) + key = attn.head_to_batch_dim(key) + value = attn.head_to_batch_dim(value) + + batch_size_attention, query_tokens, _ = query.shape + hidden_states = torch.zeros( + (batch_size_attention, query_tokens, dim // attn.heads), device=query.device, dtype=query.dtype + ) + + for i in range(batch_size_attention // self.slice_size): + start_idx = i * self.slice_size + end_idx = (i + 1) * self.slice_size + + query_slice = query[start_idx:end_idx] + key_slice = key[start_idx:end_idx] + attn_mask_slice = attention_mask[start_idx:end_idx] if attention_mask is not None else None + + attn_slice = attn.get_attention_scores(query_slice, key_slice, attn_mask_slice) + + attn_slice = torch.bmm(attn_slice, value[start_idx:end_idx]) + + hidden_states[start_idx:end_idx] = attn_slice + + hidden_states = attn.batch_to_head_dim(hidden_states) + + # linear proj + hidden_states = attn.to_out[0](hidden_states) + # dropout + hidden_states = attn.to_out[1](hidden_states) + + if input_ndim == 4: + hidden_states = hidden_states.transpose(-1, -2).reshape(batch_size, channel, height, width) + + if attn.residual_connection: + hidden_states = hidden_states + residual + + hidden_states = hidden_states / attn.rescale_output_factor + + return hidden_states + + +class SlicedAttnAddedKVProcessor: + r""" + Processor for implementing sliced attention with extra learnable key and value matrices for the text encoder. + + Args: + slice_size (`int`, *optional*): + The number of steps to compute attention. Uses as many slices as `attention_head_dim // slice_size`, and + `attention_head_dim` must be a multiple of the `slice_size`. 
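A short sketch of enabling the sliced path on one layer via `set_attention_slice` defined earlier in this file; the processor then loops over the `(batch * heads)` dimension in chunks of `slice_size` to bound peak memory. The import path and sizes are assumptions:

```python
import torch
from diffusers.models.attention_processor import Attention  # upstream path; assumed available

attn = Attention(query_dim=320, heads=8, dim_head=40)
attn.set_attention_slice(4)        # slice_size must be <= attn.sliceable_head_dim (= heads = 8)

hidden_states = torch.randn(1, 1024, 320)
with torch.no_grad():
    out = attn(hidden_states)      # dispatched to SlicedAttnProcessor
```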
+ """ + + def __init__(self, slice_size): + self.slice_size = slice_size + + def __call__(self, attn: "Attention", hidden_states, encoder_hidden_states=None, attention_mask=None, temb=None): + residual = hidden_states + + if attn.spatial_norm is not None: + hidden_states = attn.spatial_norm(hidden_states, temb) + + hidden_states = hidden_states.view(hidden_states.shape[0], hidden_states.shape[1], -1).transpose(1, 2) + + batch_size, sequence_length, _ = hidden_states.shape + + attention_mask = attn.prepare_attention_mask(attention_mask, sequence_length, batch_size) + + if encoder_hidden_states is None: + encoder_hidden_states = hidden_states + elif attn.norm_cross: + encoder_hidden_states = attn.norm_encoder_hidden_states(encoder_hidden_states) + + hidden_states = attn.group_norm(hidden_states.transpose(1, 2)).transpose(1, 2) + + query = attn.to_q(hidden_states) + dim = query.shape[-1] + query = attn.head_to_batch_dim(query) + + encoder_hidden_states_key_proj = attn.add_k_proj(encoder_hidden_states) + encoder_hidden_states_value_proj = attn.add_v_proj(encoder_hidden_states) + + encoder_hidden_states_key_proj = attn.head_to_batch_dim(encoder_hidden_states_key_proj) + encoder_hidden_states_value_proj = attn.head_to_batch_dim(encoder_hidden_states_value_proj) + + if not attn.only_cross_attention: + key = attn.to_k(hidden_states) + value = attn.to_v(hidden_states) + key = attn.head_to_batch_dim(key) + value = attn.head_to_batch_dim(value) + key = torch.cat([encoder_hidden_states_key_proj, key], dim=1) + value = torch.cat([encoder_hidden_states_value_proj, value], dim=1) + else: + key = encoder_hidden_states_key_proj + value = encoder_hidden_states_value_proj + + batch_size_attention, query_tokens, _ = query.shape + hidden_states = torch.zeros( + (batch_size_attention, query_tokens, dim // attn.heads), device=query.device, dtype=query.dtype + ) + + for i in range(batch_size_attention // self.slice_size): + start_idx = i * self.slice_size + end_idx = (i + 1) * self.slice_size + + query_slice = query[start_idx:end_idx] + key_slice = key[start_idx:end_idx] + attn_mask_slice = attention_mask[start_idx:end_idx] if attention_mask is not None else None + + attn_slice = attn.get_attention_scores(query_slice, key_slice, attn_mask_slice) + + attn_slice = torch.bmm(attn_slice, value[start_idx:end_idx]) + + hidden_states[start_idx:end_idx] = attn_slice + + hidden_states = attn.batch_to_head_dim(hidden_states) + + # linear proj + hidden_states = attn.to_out[0](hidden_states) + # dropout + hidden_states = attn.to_out[1](hidden_states) + + hidden_states = hidden_states.transpose(-1, -2).reshape(residual.shape) + hidden_states = hidden_states + residual + + return hidden_states + + +AttentionProcessor = Union[ + AttnProcessor, + AttnProcessor2_0, + XFormersAttnProcessor, + SlicedAttnProcessor, + AttnAddedKVProcessor, + SlicedAttnAddedKVProcessor, + AttnAddedKVProcessor2_0, + XFormersAttnAddedKVProcessor, + LoRAAttnProcessor, + LoRAXFormersAttnProcessor, + LoRAAttnProcessor2_0, + LoRAAttnAddedKVProcessor, + CustomDiffusionAttnProcessor, + CustomDiffusionXFormersAttnProcessor, +] + + +class SpatialNorm(nn.Module): + """ + Spatially conditioned normalization as defined in https://arxiv.org/abs/2209.09002 + """ + + def __init__( + self, + f_channels, + zq_channels, + ): + super().__init__() + self.norm_layer = nn.GroupNorm(num_channels=f_channels, num_groups=32, eps=1e-6, affine=True) + self.conv_y = nn.Conv2d(zq_channels, f_channels, kernel_size=1, stride=1, padding=0) + self.conv_b = nn.Conv2d(zq_channels, 
f_channels, kernel_size=1, stride=1, padding=0) + + def forward(self, f, zq): + f_size = f.shape[-2:] + zq = F.interpolate(zq, size=f_size, mode="nearest") + norm_f = self.norm_layer(f) + new_f = norm_f * self.conv_y(zq) + self.conv_b(zq) + return new_f diff --git a/gf/sd-webui-rich-text/diffusers_official/models/autoencoder_kl.py b/gf/sd-webui-rich-text/diffusers_official/models/autoencoder_kl.py new file mode 100644 index 0000000000000000000000000000000000000000..ddb9bde0ee0af0033c3fde7ac67a10c2171fef42 --- /dev/null +++ b/gf/sd-webui-rich-text/diffusers_official/models/autoencoder_kl.py @@ -0,0 +1,411 @@ +# Copyright 2023 The HuggingFace Team. All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +from dataclasses import dataclass +from typing import Dict, Optional, Tuple, Union + +import torch +import torch.nn as nn + +from ..configuration_utils import ConfigMixin, register_to_config +from ..utils import BaseOutput, apply_forward_hook +from .attention_processor import AttentionProcessor, AttnProcessor +from .modeling_utils import ModelMixin +from .vae import Decoder, DecoderOutput, DiagonalGaussianDistribution, Encoder + + +@dataclass +class AutoencoderKLOutput(BaseOutput): + """ + Output of AutoencoderKL encoding method. + + Args: + latent_dist (`DiagonalGaussianDistribution`): + Encoded outputs of `Encoder` represented as the mean and logvar of `DiagonalGaussianDistribution`. + `DiagonalGaussianDistribution` allows for sampling latents from the distribution. + """ + + latent_dist: "DiagonalGaussianDistribution" + + +class AutoencoderKL(ModelMixin, ConfigMixin): + r""" + A VAE model with KL loss for encoding images into latents and decoding latent representations into images. + + This model inherits from [`ModelMixin`]. Check the superclass documentation for it's generic methods implemented + for all models (such as downloading or saving). + + Parameters: + in_channels (int, *optional*, defaults to 3): Number of channels in the input image. + out_channels (int, *optional*, defaults to 3): Number of channels in the output. + down_block_types (`Tuple[str]`, *optional*, defaults to `("DownEncoderBlock2D",)`): + Tuple of downsample block types. + up_block_types (`Tuple[str]`, *optional*, defaults to `("UpDecoderBlock2D",)`): + Tuple of upsample block types. + block_out_channels (`Tuple[int]`, *optional*, defaults to `(64,)`): + Tuple of block output channels. + act_fn (`str`, *optional*, defaults to `"silu"`): The activation function to use. + latent_channels (`int`, *optional*, defaults to 4): Number of channels in the latent space. + sample_size (`int`, *optional*, defaults to `32`): Sample input size. + scaling_factor (`float`, *optional*, defaults to 0.18215): + The component-wise standard deviation of the trained latent space computed using the first batch of the + training set. This is used to scale the latent space to have unit variance when training the diffusion + model. 
The latents are scaled with the formula `z = z * scaling_factor` before being passed to the + diffusion model. When decoding, the latents are scaled back to the original scale with the formula: `z = 1 + / scaling_factor * z`. For more details, refer to sections 4.3.2 and D.1 of the [High-Resolution Image + Synthesis with Latent Diffusion Models](https://arxiv.org/abs/2112.10752) paper. + """ + + _supports_gradient_checkpointing = True + + @register_to_config + def __init__( + self, + in_channels: int = 3, + out_channels: int = 3, + down_block_types: Tuple[str] = ("DownEncoderBlock2D",), + up_block_types: Tuple[str] = ("UpDecoderBlock2D",), + block_out_channels: Tuple[int] = (64,), + layers_per_block: int = 1, + act_fn: str = "silu", + latent_channels: int = 4, + norm_num_groups: int = 32, + sample_size: int = 32, + scaling_factor: float = 0.18215, + ): + super().__init__() + + # pass init params to Encoder + self.encoder = Encoder( + in_channels=in_channels, + out_channels=latent_channels, + down_block_types=down_block_types, + block_out_channels=block_out_channels, + layers_per_block=layers_per_block, + act_fn=act_fn, + norm_num_groups=norm_num_groups, + double_z=True, + ) + + # pass init params to Decoder + self.decoder = Decoder( + in_channels=latent_channels, + out_channels=out_channels, + up_block_types=up_block_types, + block_out_channels=block_out_channels, + layers_per_block=layers_per_block, + norm_num_groups=norm_num_groups, + act_fn=act_fn, + ) + + self.quant_conv = nn.Conv2d(2 * latent_channels, 2 * latent_channels, 1) + self.post_quant_conv = nn.Conv2d(latent_channels, latent_channels, 1) + + self.use_slicing = False + self.use_tiling = False + + # only relevant if vae tiling is enabled + self.tile_sample_min_size = self.config.sample_size + sample_size = ( + self.config.sample_size[0] + if isinstance(self.config.sample_size, (list, tuple)) + else self.config.sample_size + ) + self.tile_latent_min_size = int(sample_size / (2 ** (len(self.config.block_out_channels) - 1))) + self.tile_overlap_factor = 0.25 + + def _set_gradient_checkpointing(self, module, value=False): + if isinstance(module, (Encoder, Decoder)): + module.gradient_checkpointing = value + + def enable_tiling(self, use_tiling: bool = True): + r""" + Enable tiled VAE decoding. When this option is enabled, the VAE will split the input tensor into tiles to + compute decoding and encoding in several steps. This is useful for saving a large amount of memory and to allow + processing larger images. + """ + self.use_tiling = use_tiling + + def disable_tiling(self): + r""" + Disable tiled VAE decoding. If `enable_tiling` was previously enabled, this method will go back to computing + decoding in one step. + """ + self.enable_tiling(False) + + def enable_slicing(self): + r""" + Enable sliced VAE decoding. When this option is enabled, the VAE will split the input tensor in slices to + compute decoding in several steps. This is useful to save some memory and allow larger batch sizes. + """ + self.use_slicing = True + + def disable_slicing(self): + r""" + Disable sliced VAE decoding. If `enable_slicing` was previously enabled, this method will go back to computing + decoding in one step. 
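+
+        Example (a minimal, illustrative sketch; the checkpoint id is a placeholder and the import
+        path assumes the upstream `diffusers` package):
+
+        ```py
+        from diffusers import AutoencoderKL
+
+        vae = AutoencoderKL.from_pretrained("stabilityai/sd-vae-ft-mse")
+        vae.enable_slicing()   # decode a batch one sample at a time to lower peak memory
+        vae.disable_slicing()  # restore single-pass decoding
+        ```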
+ """ + self.use_slicing = False + + @property + # Copied from diffusers.models.unet_2d_condition.UNet2DConditionModel.attn_processors + def attn_processors(self) -> Dict[str, AttentionProcessor]: + r""" + Returns: + `dict` of attention processors: A dictionary containing all attention processors used in the model with + indexed by its weight name. + """ + # set recursively + processors = {} + + def fn_recursive_add_processors(name: str, module: torch.nn.Module, processors: Dict[str, AttentionProcessor]): + if hasattr(module, "set_processor"): + processors[f"{name}.processor"] = module.processor + + for sub_name, child in module.named_children(): + fn_recursive_add_processors(f"{name}.{sub_name}", child, processors) + + return processors + + for name, module in self.named_children(): + fn_recursive_add_processors(name, module, processors) + + return processors + + # Copied from diffusers.models.unet_2d_condition.UNet2DConditionModel.set_attn_processor + def set_attn_processor(self, processor: Union[AttentionProcessor, Dict[str, AttentionProcessor]]): + r""" + Sets the attention processor to use to compute attention. + + Parameters: + processor (`dict` of `AttentionProcessor` or only `AttentionProcessor`): + The instantiated processor class or a dictionary of processor classes that will be set as the processor + for **all** `Attention` layers. + + If `processor` is a dict, the key needs to define the path to the corresponding cross attention + processor. This is strongly recommended when setting trainable attention processors. + + """ + count = len(self.attn_processors.keys()) + + if isinstance(processor, dict) and len(processor) != count: + raise ValueError( + f"A dict of processors was passed, but the number of processors {len(processor)} does not match the" + f" number of attention layers: {count}. Please make sure to pass {count} processor classes." + ) + + def fn_recursive_attn_processor(name: str, module: torch.nn.Module, processor): + if hasattr(module, "set_processor"): + if not isinstance(processor, dict): + module.set_processor(processor) + else: + module.set_processor(processor.pop(f"{name}.processor")) + + for sub_name, child in module.named_children(): + fn_recursive_attn_processor(f"{name}.{sub_name}", child, processor) + + for name, module in self.named_children(): + fn_recursive_attn_processor(name, module, processor) + + # Copied from diffusers.models.unet_2d_condition.UNet2DConditionModel.set_default_attn_processor + def set_default_attn_processor(self): + """ + Disables custom attention processors and sets the default attention implementation. 
+ """ + self.set_attn_processor(AttnProcessor()) + + @apply_forward_hook + def encode(self, x: torch.FloatTensor, return_dict: bool = True) -> AutoencoderKLOutput: + if self.use_tiling and (x.shape[-1] > self.tile_sample_min_size or x.shape[-2] > self.tile_sample_min_size): + return self.tiled_encode(x, return_dict=return_dict) + + if self.use_slicing and x.shape[0] > 1: + encoded_slices = [self.encoder(x_slice) for x_slice in x.split(1)] + h = torch.cat(encoded_slices) + else: + h = self.encoder(x) + + moments = self.quant_conv(h) + posterior = DiagonalGaussianDistribution(moments) + + if not return_dict: + return (posterior,) + + return AutoencoderKLOutput(latent_dist=posterior) + + def _decode(self, z: torch.FloatTensor, return_dict: bool = True) -> Union[DecoderOutput, torch.FloatTensor]: + if self.use_tiling and (z.shape[-1] > self.tile_latent_min_size or z.shape[-2] > self.tile_latent_min_size): + return self.tiled_decode(z, return_dict=return_dict) + + z = self.post_quant_conv(z) + dec = self.decoder(z) + + if not return_dict: + return (dec,) + + return DecoderOutput(sample=dec) + + @apply_forward_hook + def decode(self, z: torch.FloatTensor, return_dict: bool = True) -> Union[DecoderOutput, torch.FloatTensor]: + if self.use_slicing and z.shape[0] > 1: + decoded_slices = [self._decode(z_slice).sample for z_slice in z.split(1)] + decoded = torch.cat(decoded_slices) + else: + decoded = self._decode(z).sample + + if not return_dict: + return (decoded,) + + return DecoderOutput(sample=decoded) + + def blend_v(self, a, b, blend_extent): + blend_extent = min(a.shape[2], b.shape[2], blend_extent) + for y in range(blend_extent): + b[:, :, y, :] = a[:, :, -blend_extent + y, :] * (1 - y / blend_extent) + b[:, :, y, :] * (y / blend_extent) + return b + + def blend_h(self, a, b, blend_extent): + blend_extent = min(a.shape[3], b.shape[3], blend_extent) + for x in range(blend_extent): + b[:, :, :, x] = a[:, :, :, -blend_extent + x] * (1 - x / blend_extent) + b[:, :, :, x] * (x / blend_extent) + return b + + def tiled_encode(self, x: torch.FloatTensor, return_dict: bool = True) -> AutoencoderKLOutput: + r"""Encode a batch of images using a tiled encoder. + + When this option is enabled, the VAE will split the input tensor into tiles to compute encoding in several + steps. This is useful to keep memory use constant regardless of image size. The end result of tiled encoding is + different from non-tiled encoding because each tile uses a different encoder. To avoid tiling artifacts, the + tiles overlap and are blended together to form a smooth output. You may still see tile-sized changes in the + output, but they should be much less noticeable. + + Args: + x (`torch.FloatTensor`): Input batch of images. + return_dict (`bool`, *optional*, defaults to `True`): + Whether or not to return a [`~models.autoencoder_kl.AutoencoderKLOutput`] instead of a plain tuple. + + Returns: + [`~models.autoencoder_kl.AutoencoderKLOutput`] or `tuple`: + If return_dict is True, a [`~models.autoencoder_kl.AutoencoderKLOutput`] is returned, otherwise a plain + `tuple` is returned. + """ + overlap_size = int(self.tile_sample_min_size * (1 - self.tile_overlap_factor)) + blend_extent = int(self.tile_latent_min_size * self.tile_overlap_factor) + row_limit = self.tile_latent_min_size - blend_extent + + # Split the image into 512x512 tiles and encode them separately. 
+ rows = [] + for i in range(0, x.shape[2], overlap_size): + row = [] + for j in range(0, x.shape[3], overlap_size): + tile = x[:, :, i : i + self.tile_sample_min_size, j : j + self.tile_sample_min_size] + tile = self.encoder(tile) + tile = self.quant_conv(tile) + row.append(tile) + rows.append(row) + result_rows = [] + for i, row in enumerate(rows): + result_row = [] + for j, tile in enumerate(row): + # blend the above tile and the left tile + # to the current tile and add the current tile to the result row + if i > 0: + tile = self.blend_v(rows[i - 1][j], tile, blend_extent) + if j > 0: + tile = self.blend_h(row[j - 1], tile, blend_extent) + result_row.append(tile[:, :, :row_limit, :row_limit]) + result_rows.append(torch.cat(result_row, dim=3)) + + moments = torch.cat(result_rows, dim=2) + posterior = DiagonalGaussianDistribution(moments) + + if not return_dict: + return (posterior,) + + return AutoencoderKLOutput(latent_dist=posterior) + + def tiled_decode(self, z: torch.FloatTensor, return_dict: bool = True) -> Union[DecoderOutput, torch.FloatTensor]: + r""" + Decode a batch of images using a tiled decoder. + + Args: + z (`torch.FloatTensor`): Input batch of latent vectors. + return_dict (`bool`, *optional*, defaults to `True`): + Whether or not to return a [`~models.vae.DecoderOutput`] instead of a plain tuple. + + Returns: + [`~models.vae.DecoderOutput`] or `tuple`: + If return_dict is True, a [`~models.vae.DecoderOutput`] is returned, otherwise a plain `tuple` is + returned. + """ + overlap_size = int(self.tile_latent_min_size * (1 - self.tile_overlap_factor)) + blend_extent = int(self.tile_sample_min_size * self.tile_overlap_factor) + row_limit = self.tile_sample_min_size - blend_extent + + # Split z into overlapping 64x64 tiles and decode them separately. + # The tiles have an overlap to avoid seams between tiles. + rows = [] + for i in range(0, z.shape[2], overlap_size): + row = [] + for j in range(0, z.shape[3], overlap_size): + tile = z[:, :, i : i + self.tile_latent_min_size, j : j + self.tile_latent_min_size] + tile = self.post_quant_conv(tile) + decoded = self.decoder(tile) + row.append(decoded) + rows.append(row) + result_rows = [] + for i, row in enumerate(rows): + result_row = [] + for j, tile in enumerate(row): + # blend the above tile and the left tile + # to the current tile and add the current tile to the result row + if i > 0: + tile = self.blend_v(rows[i - 1][j], tile, blend_extent) + if j > 0: + tile = self.blend_h(row[j - 1], tile, blend_extent) + result_row.append(tile[:, :, :row_limit, :row_limit]) + result_rows.append(torch.cat(result_row, dim=3)) + + dec = torch.cat(result_rows, dim=2) + if not return_dict: + return (dec,) + + return DecoderOutput(sample=dec) + + def forward( + self, + sample: torch.FloatTensor, + sample_posterior: bool = False, + return_dict: bool = True, + generator: Optional[torch.Generator] = None, + ) -> Union[DecoderOutput, torch.FloatTensor]: + r""" + Args: + sample (`torch.FloatTensor`): Input sample. + sample_posterior (`bool`, *optional*, defaults to `False`): + Whether to sample from the posterior. + return_dict (`bool`, *optional*, defaults to `True`): + Whether or not to return a [`DecoderOutput`] instead of a plain tuple. 
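+
+        A minimal round-trip sketch (assumes `import torch`, a loaded `vae`, and an SD-style config that
+        downsamples by 8 with 4 latent channels; shapes are illustrative):
+
+        ```py
+        x = torch.randn(1, 3, 512, 512)        # input image batch
+        posterior = vae.encode(x).latent_dist  # DiagonalGaussianDistribution
+        z = posterior.sample()                 # -> (1, 4, 64, 64) latents
+        rec = vae.decode(z).sample             # -> (1, 3, 512, 512) reconstruction
+        ```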
+ """ + x = sample + posterior = self.encode(x).latent_dist + if sample_posterior: + z = posterior.sample(generator=generator) + else: + z = posterior.mode() + dec = self.decode(z).sample + + if not return_dict: + return (dec,) + + return DecoderOutput(sample=dec) diff --git a/gf/sd-webui-rich-text/diffusers_official/models/controlnet.py b/gf/sd-webui-rich-text/diffusers_official/models/controlnet.py new file mode 100644 index 0000000000000000000000000000000000000000..b0f5660200799af51bb6fb45de0dcf119171b1e9 --- /dev/null +++ b/gf/sd-webui-rich-text/diffusers_official/models/controlnet.py @@ -0,0 +1,705 @@ +# Copyright 2023 The HuggingFace Team. All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +from dataclasses import dataclass +from typing import Any, Dict, List, Optional, Tuple, Union + +import torch +from torch import nn +from torch.nn import functional as F + +from ..configuration_utils import ConfigMixin, register_to_config +from ..utils import BaseOutput, logging +from .attention_processor import AttentionProcessor, AttnProcessor +from .embeddings import TimestepEmbedding, Timesteps +from .modeling_utils import ModelMixin +from .unet_2d_blocks import ( + CrossAttnDownBlock2D, + DownBlock2D, + UNetMidBlock2DCrossAttn, + get_down_block, +) +from .unet_2d_condition import UNet2DConditionModel + + +logger = logging.get_logger(__name__) # pylint: disable=invalid-name + + +@dataclass +class ControlNetOutput(BaseOutput): + """ + The output of [`ControlNetModel`]. + + Args: + down_block_res_samples (`tuple[torch.Tensor]`): + A tuple of downsample activations at different resolutions for each downsampling block. Each tensor should + be of shape `(batch_size, channel * resolution, height //resolution, width // resolution)`. Output can be + used to condition the original UNet's downsampling activations. + mid_down_block_re_sample (`torch.Tensor`): + The activation of the midde block (the lowest sample resolution). Each tensor should be of shape + `(batch_size, channel * lowest_resolution, height // lowest_resolution, width // lowest_resolution)`. + Output can be used to condition the original UNet's middle block activation. + """ + + down_block_res_samples: Tuple[torch.Tensor] + mid_block_res_sample: torch.Tensor + + +class ControlNetConditioningEmbedding(nn.Module): + """ + Quoting from https://arxiv.org/abs/2302.05543: "Stable Diffusion uses a pre-processing method similar to VQ-GAN + [11] to convert the entire dataset of 512 × 512 images into smaller 64 × 64 “latent images” for stabilized + training. This requires ControlNets to convert image-based conditions to 64 × 64 feature space to match the + convolution size. We use a tiny network E(·) of four convolution layers with 4 × 4 kernels and 2 × 2 strides + (activated by ReLU, channels are 16, 32, 64, 128, initialized with Gaussian weights, trained jointly with the full + model) to encode image-space conditions ... into feature maps ..." 
+ """ + + def __init__( + self, + conditioning_embedding_channels: int, + conditioning_channels: int = 3, + block_out_channels: Tuple[int] = (16, 32, 96, 256), + ): + super().__init__() + + self.conv_in = nn.Conv2d(conditioning_channels, block_out_channels[0], kernel_size=3, padding=1) + + self.blocks = nn.ModuleList([]) + + for i in range(len(block_out_channels) - 1): + channel_in = block_out_channels[i] + channel_out = block_out_channels[i + 1] + self.blocks.append(nn.Conv2d(channel_in, channel_in, kernel_size=3, padding=1)) + self.blocks.append(nn.Conv2d(channel_in, channel_out, kernel_size=3, padding=1, stride=2)) + + self.conv_out = zero_module( + nn.Conv2d(block_out_channels[-1], conditioning_embedding_channels, kernel_size=3, padding=1) + ) + + def forward(self, conditioning): + embedding = self.conv_in(conditioning) + embedding = F.silu(embedding) + + for block in self.blocks: + embedding = block(embedding) + embedding = F.silu(embedding) + + embedding = self.conv_out(embedding) + + return embedding + + +class ControlNetModel(ModelMixin, ConfigMixin): + """ + A ControlNet model. + + Args: + in_channels (`int`, defaults to 4): + The number of channels in the input sample. + flip_sin_to_cos (`bool`, defaults to `True`): + Whether to flip the sin to cos in the time embedding. + freq_shift (`int`, defaults to 0): + The frequency shift to apply to the time embedding. + down_block_types (`tuple[str]`, defaults to `("CrossAttnDownBlock2D", "CrossAttnDownBlock2D", "CrossAttnDownBlock2D", "DownBlock2D")`): + The tuple of downsample blocks to use. + only_cross_attention (`Union[bool, Tuple[bool]]`, defaults to `False`): + block_out_channels (`tuple[int]`, defaults to `(320, 640, 1280, 1280)`): + The tuple of output channels for each block. + layers_per_block (`int`, defaults to 2): + The number of layers per block. + downsample_padding (`int`, defaults to 1): + The padding to use for the downsampling convolution. + mid_block_scale_factor (`float`, defaults to 1): + The scale factor to use for the mid block. + act_fn (`str`, defaults to "silu"): + The activation function to use. + norm_num_groups (`int`, *optional*, defaults to 32): + The number of groups to use for the normalization. If None, normalization and activation layers is skipped + in post-processing. + norm_eps (`float`, defaults to 1e-5): + The epsilon to use for the normalization. + cross_attention_dim (`int`, defaults to 1280): + The dimension of the cross attention features. + attention_head_dim (`Union[int, Tuple[int]]`, defaults to 8): + The dimension of the attention heads. + use_linear_projection (`bool`, defaults to `False`): + class_embed_type (`str`, *optional*, defaults to `None`): + The type of class embedding to use which is ultimately summed with the time embeddings. Choose from None, + `"timestep"`, `"identity"`, `"projection"`, or `"simple_projection"`. + num_class_embeds (`int`, *optional*, defaults to 0): + Input dimension of the learnable embedding matrix to be projected to `time_embed_dim`, when performing + class conditioning with `class_embed_type` equal to `None`. + upcast_attention (`bool`, defaults to `False`): + resnet_time_scale_shift (`str`, defaults to `"default"`): + Time scale shift config for ResNet blocks (see `ResnetBlock2D`). Choose from `default` or `scale_shift`. + projection_class_embeddings_input_dim (`int`, *optional*, defaults to `None`): + The dimension of the `class_labels` input when `class_embed_type="projection"`. Required when + `class_embed_type="projection"`. 
+ controlnet_conditioning_channel_order (`str`, defaults to `"rgb"`): + The channel order of conditional image. Will convert to `rgb` if it's `bgr`. + conditioning_embedding_out_channels (`tuple[int]`, *optional*, defaults to `(16, 32, 96, 256)`): + The tuple of output channel for each block in the `conditioning_embedding` layer. + global_pool_conditions (`bool`, defaults to `False`): + """ + + _supports_gradient_checkpointing = True + + @register_to_config + def __init__( + self, + in_channels: int = 4, + conditioning_channels: int = 3, + flip_sin_to_cos: bool = True, + freq_shift: int = 0, + down_block_types: Tuple[str] = ( + "CrossAttnDownBlock2D", + "CrossAttnDownBlock2D", + "CrossAttnDownBlock2D", + "DownBlock2D", + ), + only_cross_attention: Union[bool, Tuple[bool]] = False, + block_out_channels: Tuple[int] = (320, 640, 1280, 1280), + layers_per_block: int = 2, + downsample_padding: int = 1, + mid_block_scale_factor: float = 1, + act_fn: str = "silu", + norm_num_groups: Optional[int] = 32, + norm_eps: float = 1e-5, + cross_attention_dim: int = 1280, + attention_head_dim: Union[int, Tuple[int]] = 8, + num_attention_heads: Optional[Union[int, Tuple[int]]] = None, + use_linear_projection: bool = False, + class_embed_type: Optional[str] = None, + num_class_embeds: Optional[int] = None, + upcast_attention: bool = False, + resnet_time_scale_shift: str = "default", + projection_class_embeddings_input_dim: Optional[int] = None, + controlnet_conditioning_channel_order: str = "rgb", + conditioning_embedding_out_channels: Optional[Tuple[int]] = (16, 32, 96, 256), + global_pool_conditions: bool = False, + ): + super().__init__() + + # If `num_attention_heads` is not defined (which is the case for most models) + # it will default to `attention_head_dim`. This looks weird upon first reading it and it is. + # The reason for this behavior is to correct for incorrectly named variables that were introduced + # when this library was created. The incorrect naming was only discovered much later in https://github.com/huggingface/diffusers/issues/2011#issuecomment-1547958131 + # Changing `attention_head_dim` to `num_attention_heads` for 40,000+ configurations is too backwards breaking + # which is why we correct for the naming here. + num_attention_heads = num_attention_heads or attention_head_dim + + # Check inputs + if len(block_out_channels) != len(down_block_types): + raise ValueError( + f"Must provide the same number of `block_out_channels` as `down_block_types`. `block_out_channels`: {block_out_channels}. `down_block_types`: {down_block_types}." + ) + + if not isinstance(only_cross_attention, bool) and len(only_cross_attention) != len(down_block_types): + raise ValueError( + f"Must provide the same number of `only_cross_attention` as `down_block_types`. `only_cross_attention`: {only_cross_attention}. `down_block_types`: {down_block_types}." + ) + + if not isinstance(num_attention_heads, int) and len(num_attention_heads) != len(down_block_types): + raise ValueError( + f"Must provide the same number of `num_attention_heads` as `down_block_types`. `num_attention_heads`: {num_attention_heads}. `down_block_types`: {down_block_types}." 
+ ) + + # input + conv_in_kernel = 3 + conv_in_padding = (conv_in_kernel - 1) // 2 + self.conv_in = nn.Conv2d( + in_channels, block_out_channels[0], kernel_size=conv_in_kernel, padding=conv_in_padding + ) + + # time + time_embed_dim = block_out_channels[0] * 4 + + self.time_proj = Timesteps(block_out_channels[0], flip_sin_to_cos, freq_shift) + timestep_input_dim = block_out_channels[0] + + self.time_embedding = TimestepEmbedding( + timestep_input_dim, + time_embed_dim, + act_fn=act_fn, + ) + + # class embedding + if class_embed_type is None and num_class_embeds is not None: + self.class_embedding = nn.Embedding(num_class_embeds, time_embed_dim) + elif class_embed_type == "timestep": + self.class_embedding = TimestepEmbedding(timestep_input_dim, time_embed_dim) + elif class_embed_type == "identity": + self.class_embedding = nn.Identity(time_embed_dim, time_embed_dim) + elif class_embed_type == "projection": + if projection_class_embeddings_input_dim is None: + raise ValueError( + "`class_embed_type`: 'projection' requires `projection_class_embeddings_input_dim` be set" + ) + # The projection `class_embed_type` is the same as the timestep `class_embed_type` except + # 1. the `class_labels` inputs are not first converted to sinusoidal embeddings + # 2. it projects from an arbitrary input dimension. + # + # Note that `TimestepEmbedding` is quite general, being mainly linear layers and activations. + # When used for embedding actual timesteps, the timesteps are first converted to sinusoidal embeddings. + # As a result, `TimestepEmbedding` can be passed arbitrary vectors. + self.class_embedding = TimestepEmbedding(projection_class_embeddings_input_dim, time_embed_dim) + else: + self.class_embedding = None + + # control net conditioning embedding + self.controlnet_cond_embedding = ControlNetConditioningEmbedding( + conditioning_embedding_channels=block_out_channels[0], + block_out_channels=conditioning_embedding_out_channels, + conditioning_channels=conditioning_channels, + ) + + self.down_blocks = nn.ModuleList([]) + self.controlnet_down_blocks = nn.ModuleList([]) + + if isinstance(only_cross_attention, bool): + only_cross_attention = [only_cross_attention] * len(down_block_types) + + if isinstance(attention_head_dim, int): + attention_head_dim = (attention_head_dim,) * len(down_block_types) + + if isinstance(num_attention_heads, int): + num_attention_heads = (num_attention_heads,) * len(down_block_types) + + # down + output_channel = block_out_channels[0] + + controlnet_block = nn.Conv2d(output_channel, output_channel, kernel_size=1) + controlnet_block = zero_module(controlnet_block) + self.controlnet_down_blocks.append(controlnet_block) + + for i, down_block_type in enumerate(down_block_types): + input_channel = output_channel + output_channel = block_out_channels[i] + is_final_block = i == len(block_out_channels) - 1 + + down_block = get_down_block( + down_block_type, + num_layers=layers_per_block, + in_channels=input_channel, + out_channels=output_channel, + temb_channels=time_embed_dim, + add_downsample=not is_final_block, + resnet_eps=norm_eps, + resnet_act_fn=act_fn, + resnet_groups=norm_num_groups, + cross_attention_dim=cross_attention_dim, + num_attention_heads=num_attention_heads[i], + attention_head_dim=attention_head_dim[i] if attention_head_dim[i] is not None else output_channel, + downsample_padding=downsample_padding, + use_linear_projection=use_linear_projection, + only_cross_attention=only_cross_attention[i], + upcast_attention=upcast_attention, + 
resnet_time_scale_shift=resnet_time_scale_shift, + ) + self.down_blocks.append(down_block) + + for _ in range(layers_per_block): + controlnet_block = nn.Conv2d(output_channel, output_channel, kernel_size=1) + controlnet_block = zero_module(controlnet_block) + self.controlnet_down_blocks.append(controlnet_block) + + if not is_final_block: + controlnet_block = nn.Conv2d(output_channel, output_channel, kernel_size=1) + controlnet_block = zero_module(controlnet_block) + self.controlnet_down_blocks.append(controlnet_block) + + # mid + mid_block_channel = block_out_channels[-1] + + controlnet_block = nn.Conv2d(mid_block_channel, mid_block_channel, kernel_size=1) + controlnet_block = zero_module(controlnet_block) + self.controlnet_mid_block = controlnet_block + + self.mid_block = UNetMidBlock2DCrossAttn( + in_channels=mid_block_channel, + temb_channels=time_embed_dim, + resnet_eps=norm_eps, + resnet_act_fn=act_fn, + output_scale_factor=mid_block_scale_factor, + resnet_time_scale_shift=resnet_time_scale_shift, + cross_attention_dim=cross_attention_dim, + num_attention_heads=num_attention_heads[-1], + resnet_groups=norm_num_groups, + use_linear_projection=use_linear_projection, + upcast_attention=upcast_attention, + ) + + @classmethod + def from_unet( + cls, + unet: UNet2DConditionModel, + controlnet_conditioning_channel_order: str = "rgb", + conditioning_embedding_out_channels: Optional[Tuple[int]] = (16, 32, 96, 256), + load_weights_from_unet: bool = True, + ): + r""" + Instantiate a [`ControlNetModel`] from [`UNet2DConditionModel`]. + + Parameters: + unet (`UNet2DConditionModel`): + The UNet model weights to copy to the [`ControlNetModel`]. All configuration options are also copied + where applicable. + """ + controlnet = cls( + in_channels=unet.config.in_channels, + flip_sin_to_cos=unet.config.flip_sin_to_cos, + freq_shift=unet.config.freq_shift, + down_block_types=unet.config.down_block_types, + only_cross_attention=unet.config.only_cross_attention, + block_out_channels=unet.config.block_out_channels, + layers_per_block=unet.config.layers_per_block, + downsample_padding=unet.config.downsample_padding, + mid_block_scale_factor=unet.config.mid_block_scale_factor, + act_fn=unet.config.act_fn, + norm_num_groups=unet.config.norm_num_groups, + norm_eps=unet.config.norm_eps, + cross_attention_dim=unet.config.cross_attention_dim, + attention_head_dim=unet.config.attention_head_dim, + num_attention_heads=unet.config.num_attention_heads, + use_linear_projection=unet.config.use_linear_projection, + class_embed_type=unet.config.class_embed_type, + num_class_embeds=unet.config.num_class_embeds, + upcast_attention=unet.config.upcast_attention, + resnet_time_scale_shift=unet.config.resnet_time_scale_shift, + projection_class_embeddings_input_dim=unet.config.projection_class_embeddings_input_dim, + controlnet_conditioning_channel_order=controlnet_conditioning_channel_order, + conditioning_embedding_out_channels=conditioning_embedding_out_channels, + ) + + if load_weights_from_unet: + controlnet.conv_in.load_state_dict(unet.conv_in.state_dict()) + controlnet.time_proj.load_state_dict(unet.time_proj.state_dict()) + controlnet.time_embedding.load_state_dict(unet.time_embedding.state_dict()) + + if controlnet.class_embedding: + controlnet.class_embedding.load_state_dict(unet.class_embedding.state_dict()) + + controlnet.down_blocks.load_state_dict(unet.down_blocks.state_dict()) + controlnet.mid_block.load_state_dict(unet.mid_block.state_dict()) + + return controlnet + + @property + # Copied from 
diffusers.models.unet_2d_condition.UNet2DConditionModel.attn_processors + def attn_processors(self) -> Dict[str, AttentionProcessor]: + r""" + Returns: + `dict` of attention processors: A dictionary containing all attention processors used in the model with + indexed by its weight name. + """ + # set recursively + processors = {} + + def fn_recursive_add_processors(name: str, module: torch.nn.Module, processors: Dict[str, AttentionProcessor]): + if hasattr(module, "set_processor"): + processors[f"{name}.processor"] = module.processor + + for sub_name, child in module.named_children(): + fn_recursive_add_processors(f"{name}.{sub_name}", child, processors) + + return processors + + for name, module in self.named_children(): + fn_recursive_add_processors(name, module, processors) + + return processors + + # Copied from diffusers.models.unet_2d_condition.UNet2DConditionModel.set_attn_processor + def set_attn_processor(self, processor: Union[AttentionProcessor, Dict[str, AttentionProcessor]]): + r""" + Sets the attention processor to use to compute attention. + + Parameters: + processor (`dict` of `AttentionProcessor` or only `AttentionProcessor`): + The instantiated processor class or a dictionary of processor classes that will be set as the processor + for **all** `Attention` layers. + + If `processor` is a dict, the key needs to define the path to the corresponding cross attention + processor. This is strongly recommended when setting trainable attention processors. + + """ + count = len(self.attn_processors.keys()) + + if isinstance(processor, dict) and len(processor) != count: + raise ValueError( + f"A dict of processors was passed, but the number of processors {len(processor)} does not match the" + f" number of attention layers: {count}. Please make sure to pass {count} processor classes." + ) + + def fn_recursive_attn_processor(name: str, module: torch.nn.Module, processor): + if hasattr(module, "set_processor"): + if not isinstance(processor, dict): + module.set_processor(processor) + else: + module.set_processor(processor.pop(f"{name}.processor")) + + for sub_name, child in module.named_children(): + fn_recursive_attn_processor(f"{name}.{sub_name}", child, processor) + + for name, module in self.named_children(): + fn_recursive_attn_processor(name, module, processor) + + # Copied from diffusers.models.unet_2d_condition.UNet2DConditionModel.set_default_attn_processor + def set_default_attn_processor(self): + """ + Disables custom attention processors and sets the default attention implementation. + """ + self.set_attn_processor(AttnProcessor()) + + # Copied from diffusers.models.unet_2d_condition.UNet2DConditionModel.set_attention_slice + def set_attention_slice(self, slice_size): + r""" + Enable sliced attention computation. + + When this option is enabled, the attention module splits the input tensor in slices to compute attention in + several steps. This is useful for saving some memory in exchange for a small decrease in speed. + + Args: + slice_size (`str` or `int` or `list(int)`, *optional*, defaults to `"auto"`): + When `"auto"`, input to the attention heads is halved, so attention is computed in two steps. If + `"max"`, maximum amount of memory is saved by running only one slice at a time. If a number is + provided, uses as many slices as `attention_head_dim // slice_size`. In this case, `attention_head_dim` + must be a multiple of `slice_size`. 
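+
+                For example (illustrative numbers): with an `attention_head_dim` of 8, passing
+                `slice_size=2` computes each attention layer in 8 // 2 = 4 sequential slices, trading a
+                little speed for lower peak memory.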
+ """ + sliceable_head_dims = [] + + def fn_recursive_retrieve_sliceable_dims(module: torch.nn.Module): + if hasattr(module, "set_attention_slice"): + sliceable_head_dims.append(module.sliceable_head_dim) + + for child in module.children(): + fn_recursive_retrieve_sliceable_dims(child) + + # retrieve number of attention layers + for module in self.children(): + fn_recursive_retrieve_sliceable_dims(module) + + num_sliceable_layers = len(sliceable_head_dims) + + if slice_size == "auto": + # half the attention head size is usually a good trade-off between + # speed and memory + slice_size = [dim // 2 for dim in sliceable_head_dims] + elif slice_size == "max": + # make smallest slice possible + slice_size = num_sliceable_layers * [1] + + slice_size = num_sliceable_layers * [slice_size] if not isinstance(slice_size, list) else slice_size + + if len(slice_size) != len(sliceable_head_dims): + raise ValueError( + f"You have provided {len(slice_size)}, but {self.config} has {len(sliceable_head_dims)} different" + f" attention layers. Make sure to match `len(slice_size)` to be {len(sliceable_head_dims)}." + ) + + for i in range(len(slice_size)): + size = slice_size[i] + dim = sliceable_head_dims[i] + if size is not None and size > dim: + raise ValueError(f"size {size} has to be smaller or equal to {dim}.") + + # Recursively walk through all the children. + # Any children which exposes the set_attention_slice method + # gets the message + def fn_recursive_set_attention_slice(module: torch.nn.Module, slice_size: List[int]): + if hasattr(module, "set_attention_slice"): + module.set_attention_slice(slice_size.pop()) + + for child in module.children(): + fn_recursive_set_attention_slice(child, slice_size) + + reversed_slice_size = list(reversed(slice_size)) + for module in self.children(): + fn_recursive_set_attention_slice(module, reversed_slice_size) + + def _set_gradient_checkpointing(self, module, value=False): + if isinstance(module, (CrossAttnDownBlock2D, DownBlock2D)): + module.gradient_checkpointing = value + + def forward( + self, + sample: torch.FloatTensor, + timestep: Union[torch.Tensor, float, int], + encoder_hidden_states: torch.Tensor, + controlnet_cond: torch.FloatTensor, + conditioning_scale: float = 1.0, + class_labels: Optional[torch.Tensor] = None, + timestep_cond: Optional[torch.Tensor] = None, + attention_mask: Optional[torch.Tensor] = None, + cross_attention_kwargs: Optional[Dict[str, Any]] = None, + guess_mode: bool = False, + return_dict: bool = True, + ) -> Union[ControlNetOutput, Tuple]: + """ + The [`ControlNetModel`] forward method. + + Args: + sample (`torch.FloatTensor`): + The noisy input tensor. + timestep (`Union[torch.Tensor, float, int]`): + The number of timesteps to denoise an input. + encoder_hidden_states (`torch.Tensor`): + The encoder hidden states. + controlnet_cond (`torch.FloatTensor`): + The conditional input tensor of shape `(batch_size, sequence_length, hidden_size)`. + conditioning_scale (`float`, defaults to `1.0`): + The scale factor for ControlNet outputs. + class_labels (`torch.Tensor`, *optional*, defaults to `None`): + Optional class labels for conditioning. Their embeddings will be summed with the timestep embeddings. + timestep_cond (`torch.Tensor`, *optional*, defaults to `None`): + attention_mask (`torch.Tensor`, *optional*, defaults to `None`): + cross_attention_kwargs(`dict[str]`, *optional*, defaults to `None`): + A kwargs dictionary that if specified is passed along to the `AttnProcessor`. 
+ guess_mode (`bool`, defaults to `False`): + In this mode, the ControlNet encoder tries its best to recognize the input content of the input even if + you remove all prompts. A `guidance_scale` between 3.0 and 5.0 is recommended. + return_dict (`bool`, defaults to `True`): + Whether or not to return a [`~models.controlnet.ControlNetOutput`] instead of a plain tuple. + + Returns: + [`~models.controlnet.ControlNetOutput`] **or** `tuple`: + If `return_dict` is `True`, a [`~models.controlnet.ControlNetOutput`] is returned, otherwise a tuple is + returned where the first element is the sample tensor. + """ + # check channel order + channel_order = self.config.controlnet_conditioning_channel_order + + if channel_order == "rgb": + # in rgb order by default + ... + elif channel_order == "bgr": + controlnet_cond = torch.flip(controlnet_cond, dims=[1]) + else: + raise ValueError(f"unknown `controlnet_conditioning_channel_order`: {channel_order}") + + # prepare attention_mask + if attention_mask is not None: + attention_mask = (1 - attention_mask.to(sample.dtype)) * -10000.0 + attention_mask = attention_mask.unsqueeze(1) + + # 1. time + timesteps = timestep + if not torch.is_tensor(timesteps): + # TODO: this requires sync between CPU and GPU. So try to pass timesteps as tensors if you can + # This would be a good case for the `match` statement (Python 3.10+) + is_mps = sample.device.type == "mps" + if isinstance(timestep, float): + dtype = torch.float32 if is_mps else torch.float64 + else: + dtype = torch.int32 if is_mps else torch.int64 + timesteps = torch.tensor([timesteps], dtype=dtype, device=sample.device) + elif len(timesteps.shape) == 0: + timesteps = timesteps[None].to(sample.device) + + # broadcast to batch dimension in a way that's compatible with ONNX/Core ML + timesteps = timesteps.expand(sample.shape[0]) + + t_emb = self.time_proj(timesteps) + + # timesteps does not contain any weights and will always return f32 tensors + # but time_embedding might actually be running in fp16. so we need to cast here. + # there might be better ways to encapsulate this. + t_emb = t_emb.to(dtype=sample.dtype) + + emb = self.time_embedding(t_emb, timestep_cond) + + if self.class_embedding is not None: + if class_labels is None: + raise ValueError("class_labels should be provided when num_class_embeds > 0") + + if self.config.class_embed_type == "timestep": + class_labels = self.time_proj(class_labels) + + class_emb = self.class_embedding(class_labels).to(dtype=self.dtype) + emb = emb + class_emb + + # 2. pre-process + sample = self.conv_in(sample) + + controlnet_cond = self.controlnet_cond_embedding(controlnet_cond) + + sample = sample + controlnet_cond + + # 3. down + down_block_res_samples = (sample,) + for downsample_block in self.down_blocks: + if hasattr(downsample_block, "has_cross_attention") and downsample_block.has_cross_attention: + sample, res_samples = downsample_block( + hidden_states=sample, + temb=emb, + encoder_hidden_states=encoder_hidden_states, + attention_mask=attention_mask, + cross_attention_kwargs=cross_attention_kwargs, + ) + else: + sample, res_samples = downsample_block(hidden_states=sample, temb=emb) + + down_block_res_samples += res_samples + + # 4. mid + if self.mid_block is not None: + sample = self.mid_block( + sample, + emb, + encoder_hidden_states=encoder_hidden_states, + attention_mask=attention_mask, + cross_attention_kwargs=cross_attention_kwargs, + ) + + # 5. 
Control net blocks + + controlnet_down_block_res_samples = () + + for down_block_res_sample, controlnet_block in zip(down_block_res_samples, self.controlnet_down_blocks): + down_block_res_sample = controlnet_block(down_block_res_sample) + controlnet_down_block_res_samples = controlnet_down_block_res_samples + (down_block_res_sample,) + + down_block_res_samples = controlnet_down_block_res_samples + + mid_block_res_sample = self.controlnet_mid_block(sample) + + # 6. scaling + if guess_mode and not self.config.global_pool_conditions: + scales = torch.logspace(-1, 0, len(down_block_res_samples) + 1, device=sample.device) # 0.1 to 1.0 + + scales = scales * conditioning_scale + down_block_res_samples = [sample * scale for sample, scale in zip(down_block_res_samples, scales)] + mid_block_res_sample = mid_block_res_sample * scales[-1] # last one + else: + down_block_res_samples = [sample * conditioning_scale for sample in down_block_res_samples] + mid_block_res_sample = mid_block_res_sample * conditioning_scale + + if self.config.global_pool_conditions: + down_block_res_samples = [ + torch.mean(sample, dim=(2, 3), keepdim=True) for sample in down_block_res_samples + ] + mid_block_res_sample = torch.mean(mid_block_res_sample, dim=(2, 3), keepdim=True) + + if not return_dict: + return (down_block_res_samples, mid_block_res_sample) + + return ControlNetOutput( + down_block_res_samples=down_block_res_samples, mid_block_res_sample=mid_block_res_sample + ) + + +def zero_module(module): + for p in module.parameters(): + nn.init.zeros_(p) + return module diff --git a/gf/sd-webui-rich-text/diffusers_official/models/controlnet_flax.py b/gf/sd-webui-rich-text/diffusers_official/models/controlnet_flax.py new file mode 100644 index 0000000000000000000000000000000000000000..a826df48e41a632454c513877ec55be7f86089f9 --- /dev/null +++ b/gf/sd-webui-rich-text/diffusers_official/models/controlnet_flax.py @@ -0,0 +1,394 @@ +# Copyright 2023 The HuggingFace Team. All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +from typing import Optional, Tuple, Union + +import flax +import flax.linen as nn +import jax +import jax.numpy as jnp +from flax.core.frozen_dict import FrozenDict + +from ..configuration_utils import ConfigMixin, flax_register_to_config +from ..utils import BaseOutput +from .embeddings_flax import FlaxTimestepEmbedding, FlaxTimesteps +from .modeling_flax_utils import FlaxModelMixin +from .unet_2d_blocks_flax import ( + FlaxCrossAttnDownBlock2D, + FlaxDownBlock2D, + FlaxUNetMidBlock2DCrossAttn, +) + + +@flax.struct.dataclass +class FlaxControlNetOutput(BaseOutput): + """ + The output of [`FlaxControlNetModel`]. 
+ + Args: + down_block_res_samples (`jnp.ndarray`): + mid_block_res_sample (`jnp.ndarray`): + """ + + down_block_res_samples: jnp.ndarray + mid_block_res_sample: jnp.ndarray + + +class FlaxControlNetConditioningEmbedding(nn.Module): + conditioning_embedding_channels: int + block_out_channels: Tuple[int] = (16, 32, 96, 256) + dtype: jnp.dtype = jnp.float32 + + def setup(self): + self.conv_in = nn.Conv( + self.block_out_channels[0], + kernel_size=(3, 3), + padding=((1, 1), (1, 1)), + dtype=self.dtype, + ) + + blocks = [] + for i in range(len(self.block_out_channels) - 1): + channel_in = self.block_out_channels[i] + channel_out = self.block_out_channels[i + 1] + conv1 = nn.Conv( + channel_in, + kernel_size=(3, 3), + padding=((1, 1), (1, 1)), + dtype=self.dtype, + ) + blocks.append(conv1) + conv2 = nn.Conv( + channel_out, + kernel_size=(3, 3), + strides=(2, 2), + padding=((1, 1), (1, 1)), + dtype=self.dtype, + ) + blocks.append(conv2) + self.blocks = blocks + + self.conv_out = nn.Conv( + self.conditioning_embedding_channels, + kernel_size=(3, 3), + padding=((1, 1), (1, 1)), + kernel_init=nn.initializers.zeros_init(), + bias_init=nn.initializers.zeros_init(), + dtype=self.dtype, + ) + + def __call__(self, conditioning): + embedding = self.conv_in(conditioning) + embedding = nn.silu(embedding) + + for block in self.blocks: + embedding = block(embedding) + embedding = nn.silu(embedding) + + embedding = self.conv_out(embedding) + + return embedding + + +@flax_register_to_config +class FlaxControlNetModel(nn.Module, FlaxModelMixin, ConfigMixin): + r""" + A ControlNet model. + + This model inherits from [`FlaxModelMixin`]. Check the superclass documentation for it’s generic methods + implemented for all models (such as downloading or saving). + + This model is also a Flax Linen [`flax.linen.Module`](https://flax.readthedocs.io/en/latest/flax.linen.html#module) + subclass. Use it as a regular Flax Linen module and refer to the Flax documentation for all matters related to its + general usage and behavior. + + Inherent JAX features such as the following are supported: + + - [Just-In-Time (JIT) compilation](https://jax.readthedocs.io/en/latest/jax.html#just-in-time-compilation-jit) + - [Automatic Differentiation](https://jax.readthedocs.io/en/latest/jax.html#automatic-differentiation) + - [Vectorization](https://jax.readthedocs.io/en/latest/jax.html#vectorization-vmap) + - [Parallelization](https://jax.readthedocs.io/en/latest/jax.html#parallelization-pmap) + + Parameters: + sample_size (`int`, *optional*): + The size of the input sample. + in_channels (`int`, *optional*, defaults to 4): + The number of channels in the input sample. + down_block_types (`Tuple[str]`, *optional*, defaults to `("FlaxCrossAttnDownBlock2D", "FlaxCrossAttnDownBlock2D", "FlaxCrossAttnDownBlock2D", "FlaxDownBlock2D")`): + The tuple of downsample blocks to use. + block_out_channels (`Tuple[int]`, *optional*, defaults to `(320, 640, 1280, 1280)`): + The tuple of output channels for each block. + layers_per_block (`int`, *optional*, defaults to 2): + The number of layers per block. + attention_head_dim (`int` or `Tuple[int]`, *optional*, defaults to 8): + The dimension of the attention heads. + num_attention_heads (`int` or `Tuple[int]`, *optional*): + The number of attention heads. + cross_attention_dim (`int`, *optional*, defaults to 768): + The dimension of the cross attention features. + dropout (`float`, *optional*, defaults to 0): + Dropout probability for down, up and bottleneck blocks. 
+ flip_sin_to_cos (`bool`, *optional*, defaults to `True`): + Whether to flip the sin to cos in the time embedding. + freq_shift (`int`, *optional*, defaults to 0): The frequency shift to apply to the time embedding. + controlnet_conditioning_channel_order (`str`, *optional*, defaults to `rgb`): + The channel order of conditional image. Will convert to `rgb` if it's `bgr`. + conditioning_embedding_out_channels (`tuple`, *optional*, defaults to `(16, 32, 96, 256)`): + The tuple of output channel for each block in the `conditioning_embedding` layer. + """ + sample_size: int = 32 + in_channels: int = 4 + down_block_types: Tuple[str] = ( + "CrossAttnDownBlock2D", + "CrossAttnDownBlock2D", + "CrossAttnDownBlock2D", + "DownBlock2D", + ) + only_cross_attention: Union[bool, Tuple[bool]] = False + block_out_channels: Tuple[int] = (320, 640, 1280, 1280) + layers_per_block: int = 2 + attention_head_dim: Union[int, Tuple[int]] = 8 + num_attention_heads: Optional[Union[int, Tuple[int]]] = None + cross_attention_dim: int = 1280 + dropout: float = 0.0 + use_linear_projection: bool = False + dtype: jnp.dtype = jnp.float32 + flip_sin_to_cos: bool = True + freq_shift: int = 0 + controlnet_conditioning_channel_order: str = "rgb" + conditioning_embedding_out_channels: Tuple[int] = (16, 32, 96, 256) + + def init_weights(self, rng: jax.random.KeyArray) -> FrozenDict: + # init input tensors + sample_shape = (1, self.in_channels, self.sample_size, self.sample_size) + sample = jnp.zeros(sample_shape, dtype=jnp.float32) + timesteps = jnp.ones((1,), dtype=jnp.int32) + encoder_hidden_states = jnp.zeros((1, 1, self.cross_attention_dim), dtype=jnp.float32) + controlnet_cond_shape = (1, 3, self.sample_size * 8, self.sample_size * 8) + controlnet_cond = jnp.zeros(controlnet_cond_shape, dtype=jnp.float32) + + params_rng, dropout_rng = jax.random.split(rng) + rngs = {"params": params_rng, "dropout": dropout_rng} + + return self.init(rngs, sample, timesteps, encoder_hidden_states, controlnet_cond)["params"] + + def setup(self): + block_out_channels = self.block_out_channels + time_embed_dim = block_out_channels[0] * 4 + + # If `num_attention_heads` is not defined (which is the case for most models) + # it will default to `attention_head_dim`. This looks weird upon first reading it and it is. + # The reason for this behavior is to correct for incorrectly named variables that were introduced + # when this library was created. The incorrect naming was only discovered much later in https://github.com/huggingface/diffusers/issues/2011#issuecomment-1547958131 + # Changing `attention_head_dim` to `num_attention_heads` for 40,000+ configurations is too backwards breaking + # which is why we correct for the naming here. 
+ num_attention_heads = self.num_attention_heads or self.attention_head_dim + + # input + self.conv_in = nn.Conv( + block_out_channels[0], + kernel_size=(3, 3), + strides=(1, 1), + padding=((1, 1), (1, 1)), + dtype=self.dtype, + ) + + # time + self.time_proj = FlaxTimesteps( + block_out_channels[0], flip_sin_to_cos=self.flip_sin_to_cos, freq_shift=self.config.freq_shift + ) + self.time_embedding = FlaxTimestepEmbedding(time_embed_dim, dtype=self.dtype) + + self.controlnet_cond_embedding = FlaxControlNetConditioningEmbedding( + conditioning_embedding_channels=block_out_channels[0], + block_out_channels=self.conditioning_embedding_out_channels, + ) + + only_cross_attention = self.only_cross_attention + if isinstance(only_cross_attention, bool): + only_cross_attention = (only_cross_attention,) * len(self.down_block_types) + + if isinstance(num_attention_heads, int): + num_attention_heads = (num_attention_heads,) * len(self.down_block_types) + + # down + down_blocks = [] + controlnet_down_blocks = [] + + output_channel = block_out_channels[0] + + controlnet_block = nn.Conv( + output_channel, + kernel_size=(1, 1), + padding="VALID", + kernel_init=nn.initializers.zeros_init(), + bias_init=nn.initializers.zeros_init(), + dtype=self.dtype, + ) + controlnet_down_blocks.append(controlnet_block) + + for i, down_block_type in enumerate(self.down_block_types): + input_channel = output_channel + output_channel = block_out_channels[i] + is_final_block = i == len(block_out_channels) - 1 + + if down_block_type == "CrossAttnDownBlock2D": + down_block = FlaxCrossAttnDownBlock2D( + in_channels=input_channel, + out_channels=output_channel, + dropout=self.dropout, + num_layers=self.layers_per_block, + num_attention_heads=num_attention_heads[i], + add_downsample=not is_final_block, + use_linear_projection=self.use_linear_projection, + only_cross_attention=only_cross_attention[i], + dtype=self.dtype, + ) + else: + down_block = FlaxDownBlock2D( + in_channels=input_channel, + out_channels=output_channel, + dropout=self.dropout, + num_layers=self.layers_per_block, + add_downsample=not is_final_block, + dtype=self.dtype, + ) + + down_blocks.append(down_block) + + for _ in range(self.layers_per_block): + controlnet_block = nn.Conv( + output_channel, + kernel_size=(1, 1), + padding="VALID", + kernel_init=nn.initializers.zeros_init(), + bias_init=nn.initializers.zeros_init(), + dtype=self.dtype, + ) + controlnet_down_blocks.append(controlnet_block) + + if not is_final_block: + controlnet_block = nn.Conv( + output_channel, + kernel_size=(1, 1), + padding="VALID", + kernel_init=nn.initializers.zeros_init(), + bias_init=nn.initializers.zeros_init(), + dtype=self.dtype, + ) + controlnet_down_blocks.append(controlnet_block) + + self.down_blocks = down_blocks + self.controlnet_down_blocks = controlnet_down_blocks + + # mid + mid_block_channel = block_out_channels[-1] + self.mid_block = FlaxUNetMidBlock2DCrossAttn( + in_channels=mid_block_channel, + dropout=self.dropout, + num_attention_heads=num_attention_heads[-1], + use_linear_projection=self.use_linear_projection, + dtype=self.dtype, + ) + + self.controlnet_mid_block = nn.Conv( + mid_block_channel, + kernel_size=(1, 1), + padding="VALID", + kernel_init=nn.initializers.zeros_init(), + bias_init=nn.initializers.zeros_init(), + dtype=self.dtype, + ) + + def __call__( + self, + sample, + timesteps, + encoder_hidden_states, + controlnet_cond, + conditioning_scale: float = 1.0, + return_dict: bool = True, + train: bool = False, + ) -> Union[FlaxControlNetOutput, Tuple]: + r""" + 
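+        Run the ControlNet on a noisy `sample` and a conditioning image, returning the per-resolution
+        residuals that are added to the corresponding UNet activations downstream.
+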
Args: + sample (`jnp.ndarray`): (batch, channel, height, width) noisy inputs tensor + timestep (`jnp.ndarray` or `float` or `int`): timesteps + encoder_hidden_states (`jnp.ndarray`): (batch_size, sequence_length, hidden_size) encoder hidden states + controlnet_cond (`jnp.ndarray`): (batch, channel, height, width) the conditional input tensor + conditioning_scale: (`float`) the scale factor for controlnet outputs + return_dict (`bool`, *optional*, defaults to `True`): + Whether or not to return a [`models.unet_2d_condition_flax.FlaxUNet2DConditionOutput`] instead of a + plain tuple. + train (`bool`, *optional*, defaults to `False`): + Use deterministic functions and disable dropout when not training. + + Returns: + [`~models.unet_2d_condition_flax.FlaxUNet2DConditionOutput`] or `tuple`: + [`~models.unet_2d_condition_flax.FlaxUNet2DConditionOutput`] if `return_dict` is True, otherwise a `tuple`. + When returning a tuple, the first element is the sample tensor. + """ + channel_order = self.controlnet_conditioning_channel_order + if channel_order == "bgr": + controlnet_cond = jnp.flip(controlnet_cond, axis=1) + + # 1. time + if not isinstance(timesteps, jnp.ndarray): + timesteps = jnp.array([timesteps], dtype=jnp.int32) + elif isinstance(timesteps, jnp.ndarray) and len(timesteps.shape) == 0: + timesteps = timesteps.astype(dtype=jnp.float32) + timesteps = jnp.expand_dims(timesteps, 0) + + t_emb = self.time_proj(timesteps) + t_emb = self.time_embedding(t_emb) + + # 2. pre-process + sample = jnp.transpose(sample, (0, 2, 3, 1)) + sample = self.conv_in(sample) + + controlnet_cond = jnp.transpose(controlnet_cond, (0, 2, 3, 1)) + controlnet_cond = self.controlnet_cond_embedding(controlnet_cond) + sample += controlnet_cond + + # 3. down + down_block_res_samples = (sample,) + for down_block in self.down_blocks: + if isinstance(down_block, FlaxCrossAttnDownBlock2D): + sample, res_samples = down_block(sample, t_emb, encoder_hidden_states, deterministic=not train) + else: + sample, res_samples = down_block(sample, t_emb, deterministic=not train) + down_block_res_samples += res_samples + + # 4. mid + sample = self.mid_block(sample, t_emb, encoder_hidden_states, deterministic=not train) + + # 5. contronet blocks + controlnet_down_block_res_samples = () + for down_block_res_sample, controlnet_block in zip(down_block_res_samples, self.controlnet_down_blocks): + down_block_res_sample = controlnet_block(down_block_res_sample) + controlnet_down_block_res_samples += (down_block_res_sample,) + + down_block_res_samples = controlnet_down_block_res_samples + + mid_block_res_sample = self.controlnet_mid_block(sample) + + # 6. scaling + down_block_res_samples = [sample * conditioning_scale for sample in down_block_res_samples] + mid_block_res_sample *= conditioning_scale + + if not return_dict: + return (down_block_res_samples, mid_block_res_sample) + + return FlaxControlNetOutput( + down_block_res_samples=down_block_res_samples, mid_block_res_sample=mid_block_res_sample + ) diff --git a/gf/sd-webui-rich-text/diffusers_official/models/cross_attention.py b/gf/sd-webui-rich-text/diffusers_official/models/cross_attention.py new file mode 100644 index 0000000000000000000000000000000000000000..44bc156b34cfa8536bdac0fee34709dfd66ae488 --- /dev/null +++ b/gf/sd-webui-rich-text/diffusers_official/models/cross_attention.py @@ -0,0 +1,94 @@ +# Copyright 2023 The HuggingFace Team. All rights reserved. 
+# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +from ..utils import deprecate +from .attention_processor import ( # noqa: F401 + Attention, + AttentionProcessor, + AttnAddedKVProcessor, + AttnProcessor2_0, + LoRAAttnProcessor, + LoRALinearLayer, + LoRAXFormersAttnProcessor, + SlicedAttnAddedKVProcessor, + SlicedAttnProcessor, + XFormersAttnProcessor, +) +from .attention_processor import AttnProcessor as AttnProcessorRename # noqa: F401 + + +deprecate( + "cross_attention", + "0.20.0", + "Importing from cross_attention is deprecated. Please import from diffusers.models.attention_processor instead.", + standard_warn=False, +) + + +AttnProcessor = AttentionProcessor + + +class CrossAttention(Attention): + def __init__(self, *args, **kwargs): + deprecation_message = f"{self.__class__.__name__} is deprecated and will be removed in `0.20.0`. Please use `from diffusers.models.attention_processor import {''.join(self.__class__.__name__.split('Cross'))} instead." + deprecate("cross_attention", "0.20.0", deprecation_message, standard_warn=False) + super().__init__(*args, **kwargs) + + +class CrossAttnProcessor(AttnProcessorRename): + def __init__(self, *args, **kwargs): + deprecation_message = f"{self.__class__.__name__} is deprecated and will be removed in `0.20.0`. Please use `from diffusers.models.attention_processor import {''.join(self.__class__.__name__.split('Cross'))} instead." + deprecate("cross_attention", "0.20.0", deprecation_message, standard_warn=False) + super().__init__(*args, **kwargs) + + +class LoRACrossAttnProcessor(LoRAAttnProcessor): + def __init__(self, *args, **kwargs): + deprecation_message = f"{self.__class__.__name__} is deprecated and will be removed in `0.20.0`. Please use `from diffusers.models.attention_processor import {''.join(self.__class__.__name__.split('Cross'))} instead." + deprecate("cross_attention", "0.20.0", deprecation_message, standard_warn=False) + super().__init__(*args, **kwargs) + + +class CrossAttnAddedKVProcessor(AttnAddedKVProcessor): + def __init__(self, *args, **kwargs): + deprecation_message = f"{self.__class__.__name__} is deprecated and will be removed in `0.20.0`. Please use `from diffusers.models.attention_processor import {''.join(self.__class__.__name__.split('Cross'))} instead." + deprecate("cross_attention", "0.20.0", deprecation_message, standard_warn=False) + super().__init__(*args, **kwargs) + + +class XFormersCrossAttnProcessor(XFormersAttnProcessor): + def __init__(self, *args, **kwargs): + deprecation_message = f"{self.__class__.__name__} is deprecated and will be removed in `0.20.0`. Please use `from diffusers.models.attention_processor import {''.join(self.__class__.__name__.split('Cross'))} instead." + deprecate("cross_attention", "0.20.0", deprecation_message, standard_warn=False) + super().__init__(*args, **kwargs) + + +class LoRAXFormersCrossAttnProcessor(LoRAXFormersAttnProcessor): + def __init__(self, *args, **kwargs): + deprecation_message = f"{self.__class__.__name__} is deprecated and will be removed in `0.20.0`. 
Please use `from diffusers.models.attention_processor import {''.join(self.__class__.__name__.split('Cross'))} instead." + deprecate("cross_attention", "0.20.0", deprecation_message, standard_warn=False) + super().__init__(*args, **kwargs) + + +class SlicedCrossAttnProcessor(SlicedAttnProcessor): + def __init__(self, *args, **kwargs): + deprecation_message = f"{self.__class__.__name__} is deprecated and will be removed in `0.20.0`. Please use `from diffusers.models.attention_processor import {''.join(self.__class__.__name__.split('Cross'))} instead." + deprecate("cross_attention", "0.20.0", deprecation_message, standard_warn=False) + super().__init__(*args, **kwargs) + + +class SlicedCrossAttnAddedKVProcessor(SlicedAttnAddedKVProcessor): + def __init__(self, *args, **kwargs): + deprecation_message = f"{self.__class__.__name__} is deprecated and will be removed in `0.20.0`. Please use `from diffusers.models.attention_processor import {''.join(self.__class__.__name__.split('Cross'))} instead." + deprecate("cross_attention", "0.20.0", deprecation_message, standard_warn=False) + super().__init__(*args, **kwargs) diff --git a/gf/sd-webui-rich-text/diffusers_official/models/dual_transformer_2d.py b/gf/sd-webui-rich-text/diffusers_official/models/dual_transformer_2d.py new file mode 100644 index 0000000000000000000000000000000000000000..3db7e73ca6afc5fa7c67c1902d79e67c1aa728bc --- /dev/null +++ b/gf/sd-webui-rich-text/diffusers_official/models/dual_transformer_2d.py @@ -0,0 +1,151 @@ +# Copyright 2023 The HuggingFace Team. All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +from typing import Optional + +from torch import nn + +from .transformer_2d import Transformer2DModel, Transformer2DModelOutput + + +class DualTransformer2DModel(nn.Module): + """ + Dual transformer wrapper that combines two `Transformer2DModel`s for mixed inference. + + Parameters: + num_attention_heads (`int`, *optional*, defaults to 16): The number of heads to use for multi-head attention. + attention_head_dim (`int`, *optional*, defaults to 88): The number of channels in each head. + in_channels (`int`, *optional*): + Pass if the input is continuous. The number of channels in the input and output. + num_layers (`int`, *optional*, defaults to 1): The number of layers of Transformer blocks to use. + dropout (`float`, *optional*, defaults to 0.1): The dropout probability to use. + cross_attention_dim (`int`, *optional*): The number of encoder_hidden_states dimensions to use. + sample_size (`int`, *optional*): Pass if the input is discrete. The width of the latent images. + Note that this is fixed at training time as it is used for learning a number of position embeddings. See + `ImagePositionalEmbeddings`. + num_vector_embeds (`int`, *optional*): + Pass if the input is discrete. The number of classes of the vector embeddings of the latent pixels. + Includes the class for the masked latent pixel. + activation_fn (`str`, *optional*, defaults to `"geglu"`): Activation function to be used in feed-forward. 
+ num_embeds_ada_norm ( `int`, *optional*): Pass if at least one of the norm_layers is `AdaLayerNorm`. + The number of diffusion steps used during training. Note that this is fixed at training time as it is used + to learn a number of embeddings that are added to the hidden states. During inference, you can denoise for + up to but not more than steps than `num_embeds_ada_norm`. + attention_bias (`bool`, *optional*): + Configure if the TransformerBlocks' attention should contain a bias parameter. + """ + + def __init__( + self, + num_attention_heads: int = 16, + attention_head_dim: int = 88, + in_channels: Optional[int] = None, + num_layers: int = 1, + dropout: float = 0.0, + norm_num_groups: int = 32, + cross_attention_dim: Optional[int] = None, + attention_bias: bool = False, + sample_size: Optional[int] = None, + num_vector_embeds: Optional[int] = None, + activation_fn: str = "geglu", + num_embeds_ada_norm: Optional[int] = None, + ): + super().__init__() + self.transformers = nn.ModuleList( + [ + Transformer2DModel( + num_attention_heads=num_attention_heads, + attention_head_dim=attention_head_dim, + in_channels=in_channels, + num_layers=num_layers, + dropout=dropout, + norm_num_groups=norm_num_groups, + cross_attention_dim=cross_attention_dim, + attention_bias=attention_bias, + sample_size=sample_size, + num_vector_embeds=num_vector_embeds, + activation_fn=activation_fn, + num_embeds_ada_norm=num_embeds_ada_norm, + ) + for _ in range(2) + ] + ) + + # Variables that can be set by a pipeline: + + # The ratio of transformer1 to transformer2's output states to be combined during inference + self.mix_ratio = 0.5 + + # The shape of `encoder_hidden_states` is expected to be + # `(batch_size, condition_lengths[0]+condition_lengths[1], num_features)` + self.condition_lengths = [77, 257] + + # Which transformer to use to encode which condition. + # E.g. `(1, 0)` means that we'll use `transformers[1](conditions[0])` and `transformers[0](conditions[1])` + self.transformer_index_for_condition = [1, 0] + + def forward( + self, + hidden_states, + encoder_hidden_states, + timestep=None, + attention_mask=None, + cross_attention_kwargs=None, + return_dict: bool = True, + ): + """ + Args: + hidden_states ( When discrete, `torch.LongTensor` of shape `(batch size, num latent pixels)`. + When continuous, `torch.FloatTensor` of shape `(batch size, channel, height, width)`): Input + hidden_states + encoder_hidden_states ( `torch.LongTensor` of shape `(batch size, encoder_hidden_states dim)`, *optional*): + Conditional embeddings for cross attention layer. If not given, cross-attention defaults to + self-attention. + timestep ( `torch.long`, *optional*): + Optional timestep to be applied as an embedding in AdaLayerNorm's. Used to indicate denoising step. + attention_mask (`torch.FloatTensor`, *optional*): + Optional attention mask to be applied in Attention + return_dict (`bool`, *optional*, defaults to `True`): + Whether or not to return a [`models.unet_2d_condition.UNet2DConditionOutput`] instead of a plain tuple. + + Returns: + [`~models.transformer_2d.Transformer2DModelOutput`] or `tuple`: + [`~models.transformer_2d.Transformer2DModelOutput`] if `return_dict` is True, otherwise a `tuple`. When + returning a tuple, the first element is the sample tensor. 
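+            Note (describing the forward pass implemented below): `encoder_hidden_states` is split according to
+            `condition_lengths`, each slice is routed to the transformer selected by
+            `transformer_index_for_condition`, and the resulting residuals are blended as
+            `mix_ratio * r_0 + (1 - mix_ratio) * r_1` before being added back to the input `hidden_states`.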
+ """ + input_states = hidden_states + + encoded_states = [] + tokens_start = 0 + # attention_mask is not used yet + for i in range(2): + # for each of the two transformers, pass the corresponding condition tokens + condition_state = encoder_hidden_states[:, tokens_start : tokens_start + self.condition_lengths[i]] + transformer_index = self.transformer_index_for_condition[i] + encoded_state = self.transformers[transformer_index]( + input_states, + encoder_hidden_states=condition_state, + timestep=timestep, + cross_attention_kwargs=cross_attention_kwargs, + return_dict=False, + )[0] + encoded_states.append(encoded_state - input_states) + tokens_start += self.condition_lengths[i] + + output_states = encoded_states[0] * self.mix_ratio + encoded_states[1] * (1 - self.mix_ratio) + output_states = output_states + input_states + + if not return_dict: + return (output_states,) + + return Transformer2DModelOutput(sample=output_states) diff --git a/gf/sd-webui-rich-text/diffusers_official/models/embeddings.py b/gf/sd-webui-rich-text/diffusers_official/models/embeddings.py new file mode 100644 index 0000000000000000000000000000000000000000..a5a0c5549ee9d282b4eaa41d496255ad26b74699 --- /dev/null +++ b/gf/sd-webui-rich-text/diffusers_official/models/embeddings.py @@ -0,0 +1,546 @@ +# Copyright 2023 The HuggingFace Team. All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +import math +from typing import Optional + +import numpy as np +import torch +from torch import nn + +from .activations import get_activation + + +def get_timestep_embedding( + timesteps: torch.Tensor, + embedding_dim: int, + flip_sin_to_cos: bool = False, + downscale_freq_shift: float = 1, + scale: float = 1, + max_period: int = 10000, +): + """ + This matches the implementation in Denoising Diffusion Probabilistic Models: Create sinusoidal timestep embeddings. + + :param timesteps: a 1-D Tensor of N indices, one per batch element. + These may be fractional. + :param embedding_dim: the dimension of the output. :param max_period: controls the minimum frequency of the + embeddings. :return: an [N x dim] Tensor of positional embeddings. 
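+    Example (illustrative): `get_timestep_embedding(torch.tensor([0, 10]), embedding_dim=320)` returns a
+    `[2, 320]` tensor whose first half along the last dimension holds sines and whose second half holds cosines
+    (the halves are swapped when `flip_sin_to_cos=True`); an odd `embedding_dim` is zero-padded by one channel.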
+ """ + assert len(timesteps.shape) == 1, "Timesteps should be a 1d-array" + + half_dim = embedding_dim // 2 + exponent = -math.log(max_period) * torch.arange( + start=0, end=half_dim, dtype=torch.float32, device=timesteps.device + ) + exponent = exponent / (half_dim - downscale_freq_shift) + + emb = torch.exp(exponent) + emb = timesteps[:, None].float() * emb[None, :] + + # scale embeddings + emb = scale * emb + + # concat sine and cosine embeddings + emb = torch.cat([torch.sin(emb), torch.cos(emb)], dim=-1) + + # flip sine and cosine embeddings + if flip_sin_to_cos: + emb = torch.cat([emb[:, half_dim:], emb[:, :half_dim]], dim=-1) + + # zero pad + if embedding_dim % 2 == 1: + emb = torch.nn.functional.pad(emb, (0, 1, 0, 0)) + return emb + + +def get_2d_sincos_pos_embed(embed_dim, grid_size, cls_token=False, extra_tokens=0): + """ + grid_size: int of the grid height and width return: pos_embed: [grid_size*grid_size, embed_dim] or + [1+grid_size*grid_size, embed_dim] (w/ or w/o cls_token) + """ + grid_h = np.arange(grid_size, dtype=np.float32) + grid_w = np.arange(grid_size, dtype=np.float32) + grid = np.meshgrid(grid_w, grid_h) # here w goes first + grid = np.stack(grid, axis=0) + + grid = grid.reshape([2, 1, grid_size, grid_size]) + pos_embed = get_2d_sincos_pos_embed_from_grid(embed_dim, grid) + if cls_token and extra_tokens > 0: + pos_embed = np.concatenate([np.zeros([extra_tokens, embed_dim]), pos_embed], axis=0) + return pos_embed + + +def get_2d_sincos_pos_embed_from_grid(embed_dim, grid): + if embed_dim % 2 != 0: + raise ValueError("embed_dim must be divisible by 2") + + # use half of dimensions to encode grid_h + emb_h = get_1d_sincos_pos_embed_from_grid(embed_dim // 2, grid[0]) # (H*W, D/2) + emb_w = get_1d_sincos_pos_embed_from_grid(embed_dim // 2, grid[1]) # (H*W, D/2) + + emb = np.concatenate([emb_h, emb_w], axis=1) # (H*W, D) + return emb + + +def get_1d_sincos_pos_embed_from_grid(embed_dim, pos): + """ + embed_dim: output dimension for each position pos: a list of positions to be encoded: size (M,) out: (M, D) + """ + if embed_dim % 2 != 0: + raise ValueError("embed_dim must be divisible by 2") + + omega = np.arange(embed_dim // 2, dtype=np.float64) + omega /= embed_dim / 2.0 + omega = 1.0 / 10000**omega # (D/2,) + + pos = pos.reshape(-1) # (M,) + out = np.einsum("m,d->md", pos, omega) # (M, D/2), outer product + + emb_sin = np.sin(out) # (M, D/2) + emb_cos = np.cos(out) # (M, D/2) + + emb = np.concatenate([emb_sin, emb_cos], axis=1) # (M, D) + return emb + + +class PatchEmbed(nn.Module): + """2D Image to Patch Embedding""" + + def __init__( + self, + height=224, + width=224, + patch_size=16, + in_channels=3, + embed_dim=768, + layer_norm=False, + flatten=True, + bias=True, + ): + super().__init__() + + num_patches = (height // patch_size) * (width // patch_size) + self.flatten = flatten + self.layer_norm = layer_norm + + self.proj = nn.Conv2d( + in_channels, embed_dim, kernel_size=(patch_size, patch_size), stride=patch_size, bias=bias + ) + if layer_norm: + self.norm = nn.LayerNorm(embed_dim, elementwise_affine=False, eps=1e-6) + else: + self.norm = None + + pos_embed = get_2d_sincos_pos_embed(embed_dim, int(num_patches**0.5)) + self.register_buffer("pos_embed", torch.from_numpy(pos_embed).float().unsqueeze(0), persistent=False) + + def forward(self, latent): + latent = self.proj(latent) + if self.flatten: + latent = latent.flatten(2).transpose(1, 2) # BCHW -> BNC + if self.layer_norm: + latent = self.norm(latent) + return latent + self.pos_embed + + +class 
TimestepEmbedding(nn.Module): + def __init__( + self, + in_channels: int, + time_embed_dim: int, + act_fn: str = "silu", + out_dim: int = None, + post_act_fn: Optional[str] = None, + cond_proj_dim=None, + ): + super().__init__() + + self.linear_1 = nn.Linear(in_channels, time_embed_dim) + + if cond_proj_dim is not None: + self.cond_proj = nn.Linear(cond_proj_dim, in_channels, bias=False) + else: + self.cond_proj = None + + self.act = get_activation(act_fn) + + if out_dim is not None: + time_embed_dim_out = out_dim + else: + time_embed_dim_out = time_embed_dim + self.linear_2 = nn.Linear(time_embed_dim, time_embed_dim_out) + + if post_act_fn is None: + self.post_act = None + else: + self.post_act = get_activation(post_act_fn) + + def forward(self, sample, condition=None): + if condition is not None: + sample = sample + self.cond_proj(condition) + sample = self.linear_1(sample) + + if self.act is not None: + sample = self.act(sample) + + sample = self.linear_2(sample) + + if self.post_act is not None: + sample = self.post_act(sample) + return sample + + +class Timesteps(nn.Module): + def __init__(self, num_channels: int, flip_sin_to_cos: bool, downscale_freq_shift: float): + super().__init__() + self.num_channels = num_channels + self.flip_sin_to_cos = flip_sin_to_cos + self.downscale_freq_shift = downscale_freq_shift + + def forward(self, timesteps): + t_emb = get_timestep_embedding( + timesteps, + self.num_channels, + flip_sin_to_cos=self.flip_sin_to_cos, + downscale_freq_shift=self.downscale_freq_shift, + ) + return t_emb + + +class GaussianFourierProjection(nn.Module): + """Gaussian Fourier embeddings for noise levels.""" + + def __init__( + self, embedding_size: int = 256, scale: float = 1.0, set_W_to_weight=True, log=True, flip_sin_to_cos=False + ): + super().__init__() + self.weight = nn.Parameter(torch.randn(embedding_size) * scale, requires_grad=False) + self.log = log + self.flip_sin_to_cos = flip_sin_to_cos + + if set_W_to_weight: + # to delete later + self.W = nn.Parameter(torch.randn(embedding_size) * scale, requires_grad=False) + + self.weight = self.W + + def forward(self, x): + if self.log: + x = torch.log(x) + + x_proj = x[:, None] * self.weight[None, :] * 2 * np.pi + + if self.flip_sin_to_cos: + out = torch.cat([torch.cos(x_proj), torch.sin(x_proj)], dim=-1) + else: + out = torch.cat([torch.sin(x_proj), torch.cos(x_proj)], dim=-1) + return out + + +class ImagePositionalEmbeddings(nn.Module): + """ + Converts latent image classes into vector embeddings. Sums the vector embeddings with positional embeddings for the + height and width of the latent space. + + For more details, see figure 10 of the dall-e paper: https://arxiv.org/abs/2102.12092 + + For VQ-diffusion: + + Output vector embeddings are used as input for the transformer. + + Note that the vector embeddings for the transformer are different than the vector embeddings from the VQVAE. + + Args: + num_embed (`int`): + Number of embeddings for the latent pixels embeddings. + height (`int`): + Height of the latent image i.e. the number of height embeddings. + width (`int`): + Width of the latent image i.e. the number of width embeddings. + embed_dim (`int`): + Dimension of the produced vector embeddings. Used for the latent pixel, height, and width embeddings. 
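+        Shape sketch (following the `forward` below): for `index` of shape `(batch, height * width)`, the output has
+        shape `(batch, height * width, embed_dim)`: the class embedding of each latent pixel plus the sum of its
+        row and column positional embeddings.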
+ """ + + def __init__( + self, + num_embed: int, + height: int, + width: int, + embed_dim: int, + ): + super().__init__() + + self.height = height + self.width = width + self.num_embed = num_embed + self.embed_dim = embed_dim + + self.emb = nn.Embedding(self.num_embed, embed_dim) + self.height_emb = nn.Embedding(self.height, embed_dim) + self.width_emb = nn.Embedding(self.width, embed_dim) + + def forward(self, index): + emb = self.emb(index) + + height_emb = self.height_emb(torch.arange(self.height, device=index.device).view(1, self.height)) + + # 1 x H x D -> 1 x H x 1 x D + height_emb = height_emb.unsqueeze(2) + + width_emb = self.width_emb(torch.arange(self.width, device=index.device).view(1, self.width)) + + # 1 x W x D -> 1 x 1 x W x D + width_emb = width_emb.unsqueeze(1) + + pos_emb = height_emb + width_emb + + # 1 x H x W x D -> 1 x L xD + pos_emb = pos_emb.view(1, self.height * self.width, -1) + + emb = emb + pos_emb[:, : emb.shape[1], :] + + return emb + + +class LabelEmbedding(nn.Module): + """ + Embeds class labels into vector representations. Also handles label dropout for classifier-free guidance. + + Args: + num_classes (`int`): The number of classes. + hidden_size (`int`): The size of the vector embeddings. + dropout_prob (`float`): The probability of dropping a label. + """ + + def __init__(self, num_classes, hidden_size, dropout_prob): + super().__init__() + use_cfg_embedding = dropout_prob > 0 + self.embedding_table = nn.Embedding(num_classes + use_cfg_embedding, hidden_size) + self.num_classes = num_classes + self.dropout_prob = dropout_prob + + def token_drop(self, labels, force_drop_ids=None): + """ + Drops labels to enable classifier-free guidance. + """ + if force_drop_ids is None: + drop_ids = torch.rand(labels.shape[0], device=labels.device) < self.dropout_prob + else: + drop_ids = torch.tensor(force_drop_ids == 1) + labels = torch.where(drop_ids, self.num_classes, labels) + return labels + + def forward(self, labels: torch.LongTensor, force_drop_ids=None): + use_dropout = self.dropout_prob > 0 + if (self.training and use_dropout) or (force_drop_ids is not None): + labels = self.token_drop(labels, force_drop_ids) + embeddings = self.embedding_table(labels) + return embeddings + + +class TextImageProjection(nn.Module): + def __init__( + self, + text_embed_dim: int = 1024, + image_embed_dim: int = 768, + cross_attention_dim: int = 768, + num_image_text_embeds: int = 10, + ): + super().__init__() + + self.num_image_text_embeds = num_image_text_embeds + self.image_embeds = nn.Linear(image_embed_dim, self.num_image_text_embeds * cross_attention_dim) + self.text_proj = nn.Linear(text_embed_dim, cross_attention_dim) + + def forward(self, text_embeds: torch.FloatTensor, image_embeds: torch.FloatTensor): + batch_size = text_embeds.shape[0] + + # image + image_text_embeds = self.image_embeds(image_embeds) + image_text_embeds = image_text_embeds.reshape(batch_size, self.num_image_text_embeds, -1) + + # text + text_embeds = self.text_proj(text_embeds) + + return torch.cat([image_text_embeds, text_embeds], dim=1) + + +class ImageProjection(nn.Module): + def __init__( + self, + image_embed_dim: int = 768, + cross_attention_dim: int = 768, + num_image_text_embeds: int = 32, + ): + super().__init__() + + self.num_image_text_embeds = num_image_text_embeds + self.image_embeds = nn.Linear(image_embed_dim, self.num_image_text_embeds * cross_attention_dim) + self.norm = nn.LayerNorm(cross_attention_dim) + + def forward(self, image_embeds: torch.FloatTensor): + batch_size = 
image_embeds.shape[0] + + # image + image_embeds = self.image_embeds(image_embeds) + image_embeds = image_embeds.reshape(batch_size, self.num_image_text_embeds, -1) + image_embeds = self.norm(image_embeds) + return image_embeds + + +class CombinedTimestepLabelEmbeddings(nn.Module): + def __init__(self, num_classes, embedding_dim, class_dropout_prob=0.1): + super().__init__() + + self.time_proj = Timesteps(num_channels=256, flip_sin_to_cos=True, downscale_freq_shift=1) + self.timestep_embedder = TimestepEmbedding(in_channels=256, time_embed_dim=embedding_dim) + self.class_embedder = LabelEmbedding(num_classes, embedding_dim, class_dropout_prob) + + def forward(self, timestep, class_labels, hidden_dtype=None): + timesteps_proj = self.time_proj(timestep) + timesteps_emb = self.timestep_embedder(timesteps_proj.to(dtype=hidden_dtype)) # (N, D) + + class_labels = self.class_embedder(class_labels) # (N, D) + + conditioning = timesteps_emb + class_labels # (N, D) + + return conditioning + + +class TextTimeEmbedding(nn.Module): + def __init__(self, encoder_dim: int, time_embed_dim: int, num_heads: int = 64): + super().__init__() + self.norm1 = nn.LayerNorm(encoder_dim) + self.pool = AttentionPooling(num_heads, encoder_dim) + self.proj = nn.Linear(encoder_dim, time_embed_dim) + self.norm2 = nn.LayerNorm(time_embed_dim) + + def forward(self, hidden_states): + hidden_states = self.norm1(hidden_states) + hidden_states = self.pool(hidden_states) + hidden_states = self.proj(hidden_states) + hidden_states = self.norm2(hidden_states) + return hidden_states + + +class TextImageTimeEmbedding(nn.Module): + def __init__(self, text_embed_dim: int = 768, image_embed_dim: int = 768, time_embed_dim: int = 1536): + super().__init__() + self.text_proj = nn.Linear(text_embed_dim, time_embed_dim) + self.text_norm = nn.LayerNorm(time_embed_dim) + self.image_proj = nn.Linear(image_embed_dim, time_embed_dim) + + def forward(self, text_embeds: torch.FloatTensor, image_embeds: torch.FloatTensor): + # text + time_text_embeds = self.text_proj(text_embeds) + time_text_embeds = self.text_norm(time_text_embeds) + + # image + time_image_embeds = self.image_proj(image_embeds) + + return time_image_embeds + time_text_embeds + + +class ImageTimeEmbedding(nn.Module): + def __init__(self, image_embed_dim: int = 768, time_embed_dim: int = 1536): + super().__init__() + self.image_proj = nn.Linear(image_embed_dim, time_embed_dim) + self.image_norm = nn.LayerNorm(time_embed_dim) + + def forward(self, image_embeds: torch.FloatTensor): + # image + time_image_embeds = self.image_proj(image_embeds) + time_image_embeds = self.image_norm(time_image_embeds) + return time_image_embeds + + +class ImageHintTimeEmbedding(nn.Module): + def __init__(self, image_embed_dim: int = 768, time_embed_dim: int = 1536): + super().__init__() + self.image_proj = nn.Linear(image_embed_dim, time_embed_dim) + self.image_norm = nn.LayerNorm(time_embed_dim) + self.input_hint_block = nn.Sequential( + nn.Conv2d(3, 16, 3, padding=1), + nn.SiLU(), + nn.Conv2d(16, 16, 3, padding=1), + nn.SiLU(), + nn.Conv2d(16, 32, 3, padding=1, stride=2), + nn.SiLU(), + nn.Conv2d(32, 32, 3, padding=1), + nn.SiLU(), + nn.Conv2d(32, 96, 3, padding=1, stride=2), + nn.SiLU(), + nn.Conv2d(96, 96, 3, padding=1), + nn.SiLU(), + nn.Conv2d(96, 256, 3, padding=1, stride=2), + nn.SiLU(), + nn.Conv2d(256, 4, 3, padding=1), + ) + + def forward(self, image_embeds: torch.FloatTensor, hint: torch.FloatTensor): + # image + time_image_embeds = self.image_proj(image_embeds) + time_image_embeds = 
self.image_norm(time_image_embeds) + hint = self.input_hint_block(hint) + return time_image_embeds, hint + + +class AttentionPooling(nn.Module): + # Copied from https://github.com/deep-floyd/IF/blob/2f91391f27dd3c468bf174be5805b4cc92980c0b/deepfloyd_if/model/nn.py#L54 + + def __init__(self, num_heads, embed_dim, dtype=None): + super().__init__() + self.dtype = dtype + self.positional_embedding = nn.Parameter(torch.randn(1, embed_dim) / embed_dim**0.5) + self.k_proj = nn.Linear(embed_dim, embed_dim, dtype=self.dtype) + self.q_proj = nn.Linear(embed_dim, embed_dim, dtype=self.dtype) + self.v_proj = nn.Linear(embed_dim, embed_dim, dtype=self.dtype) + self.num_heads = num_heads + self.dim_per_head = embed_dim // self.num_heads + + def forward(self, x): + bs, length, width = x.size() + + def shape(x): + # (bs, length, width) --> (bs, length, n_heads, dim_per_head) + x = x.view(bs, -1, self.num_heads, self.dim_per_head) + # (bs, length, n_heads, dim_per_head) --> (bs, n_heads, length, dim_per_head) + x = x.transpose(1, 2) + # (bs, n_heads, length, dim_per_head) --> (bs*n_heads, length, dim_per_head) + x = x.reshape(bs * self.num_heads, -1, self.dim_per_head) + # (bs*n_heads, length, dim_per_head) --> (bs*n_heads, dim_per_head, length) + x = x.transpose(1, 2) + return x + + class_token = x.mean(dim=1, keepdim=True) + self.positional_embedding.to(x.dtype) + x = torch.cat([class_token, x], dim=1) # (bs, length+1, width) + + # (bs*n_heads, class_token_length, dim_per_head) + q = shape(self.q_proj(class_token)) + # (bs*n_heads, length+class_token_length, dim_per_head) + k = shape(self.k_proj(x)) + v = shape(self.v_proj(x)) + + # (bs*n_heads, class_token_length, length+class_token_length): + scale = 1 / math.sqrt(math.sqrt(self.dim_per_head)) + weight = torch.einsum("bct,bcs->bts", q * scale, k * scale) # More stable with f16 than dividing afterwards + weight = torch.softmax(weight.float(), dim=-1).type(weight.dtype) + + # (bs*n_heads, dim_per_head, class_token_length) + a = torch.einsum("bts,bcs->bct", weight, v) + + # (bs, length+1, width) + a = a.reshape(bs, -1, 1).transpose(1, 2) + + return a[:, 0, :] # cls_token diff --git a/gf/sd-webui-rich-text/diffusers_official/models/embeddings_flax.py b/gf/sd-webui-rich-text/diffusers_official/models/embeddings_flax.py new file mode 100644 index 0000000000000000000000000000000000000000..88c2c45e4655b8013fa96e0b4408e3ec0a87c2c7 --- /dev/null +++ b/gf/sd-webui-rich-text/diffusers_official/models/embeddings_flax.py @@ -0,0 +1,95 @@ +# Copyright 2023 The HuggingFace Team. All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +import math + +import flax.linen as nn +import jax.numpy as jnp + + +def get_sinusoidal_embeddings( + timesteps: jnp.ndarray, + embedding_dim: int, + freq_shift: float = 1, + min_timescale: float = 1, + max_timescale: float = 1.0e4, + flip_sin_to_cos: bool = False, + scale: float = 1.0, +) -> jnp.ndarray: + """Returns the positional encoding (same as Tensor2Tensor). 
+ + Args: + timesteps: a 1-D Tensor of N indices, one per batch element. + These may be fractional. + embedding_dim: The number of output channels. + min_timescale: The smallest time unit (should probably be 0.0). + max_timescale: The largest time unit. + Returns: + a Tensor of timing signals [N, num_channels] + """ + assert timesteps.ndim == 1, "Timesteps should be a 1d-array" + assert embedding_dim % 2 == 0, f"Embedding dimension {embedding_dim} should be even" + num_timescales = float(embedding_dim // 2) + log_timescale_increment = math.log(max_timescale / min_timescale) / (num_timescales - freq_shift) + inv_timescales = min_timescale * jnp.exp(jnp.arange(num_timescales, dtype=jnp.float32) * -log_timescale_increment) + emb = jnp.expand_dims(timesteps, 1) * jnp.expand_dims(inv_timescales, 0) + + # scale embeddings + scaled_time = scale * emb + + if flip_sin_to_cos: + signal = jnp.concatenate([jnp.cos(scaled_time), jnp.sin(scaled_time)], axis=1) + else: + signal = jnp.concatenate([jnp.sin(scaled_time), jnp.cos(scaled_time)], axis=1) + signal = jnp.reshape(signal, [jnp.shape(timesteps)[0], embedding_dim]) + return signal + + +class FlaxTimestepEmbedding(nn.Module): + r""" + Time step Embedding Module. Learns embeddings for input time steps. + + Args: + time_embed_dim (`int`, *optional*, defaults to `32`): + Time step embedding dimension + dtype (:obj:`jnp.dtype`, *optional*, defaults to jnp.float32): + Parameters `dtype` + """ + time_embed_dim: int = 32 + dtype: jnp.dtype = jnp.float32 + + @nn.compact + def __call__(self, temb): + temb = nn.Dense(self.time_embed_dim, dtype=self.dtype, name="linear_1")(temb) + temb = nn.silu(temb) + temb = nn.Dense(self.time_embed_dim, dtype=self.dtype, name="linear_2")(temb) + return temb + + +class FlaxTimesteps(nn.Module): + r""" + Wrapper Module for sinusoidal Time step Embeddings as described in https://arxiv.org/abs/2006.11239 + + Args: + dim (`int`, *optional*, defaults to `32`): + Time step embedding dimension + """ + dim: int = 32 + flip_sin_to_cos: bool = False + freq_shift: float = 1 + + @nn.compact + def __call__(self, timesteps): + return get_sinusoidal_embeddings( + timesteps, embedding_dim=self.dim, flip_sin_to_cos=self.flip_sin_to_cos, freq_shift=self.freq_shift + ) diff --git a/gf/sd-webui-rich-text/diffusers_official/models/modeling_flax_pytorch_utils.py b/gf/sd-webui-rich-text/diffusers_official/models/modeling_flax_pytorch_utils.py new file mode 100644 index 0000000000000000000000000000000000000000..f9de83f87dab84d2e7fdd77b835db787cb4f1cb6 --- /dev/null +++ b/gf/sd-webui-rich-text/diffusers_official/models/modeling_flax_pytorch_utils.py @@ -0,0 +1,118 @@ +# coding=utf-8 +# Copyright 2023 The HuggingFace Inc. team. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
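+# Illustrative example of the conversion performed below: a PyTorch conv weight "down_blocks.0.conv.weight" of
+# shape (out, in, kH, kW) becomes the Flax parameter ("down_blocks_0", "conv", "kernel") with shape
+# (kH, kW, in, out); 2-D linear weights are transposed and renamed to "kernel" as well.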
+""" PyTorch - Flax general utilities.""" +import re + +import jax.numpy as jnp +from flax.traverse_util import flatten_dict, unflatten_dict +from jax.random import PRNGKey + +from ..utils import logging + + +logger = logging.get_logger(__name__) + + +def rename_key(key): + regex = r"\w+[.]\d+" + pats = re.findall(regex, key) + for pat in pats: + key = key.replace(pat, "_".join(pat.split("."))) + return key + + +##################### +# PyTorch => Flax # +##################### + + +# Adapted from https://github.com/huggingface/transformers/blob/c603c80f46881ae18b2ca50770ef65fa4033eacd/src/transformers/modeling_flax_pytorch_utils.py#L69 +# and https://github.com/patil-suraj/stable-diffusion-jax/blob/main/stable_diffusion_jax/convert_diffusers_to_jax.py +def rename_key_and_reshape_tensor(pt_tuple_key, pt_tensor, random_flax_state_dict): + """Rename PT weight names to corresponding Flax weight names and reshape tensor if necessary""" + + # conv norm or layer norm + renamed_pt_tuple_key = pt_tuple_key[:-1] + ("scale",) + if ( + any("norm" in str_ for str_ in pt_tuple_key) + and (pt_tuple_key[-1] == "bias") + and (pt_tuple_key[:-1] + ("bias",) not in random_flax_state_dict) + and (pt_tuple_key[:-1] + ("scale",) in random_flax_state_dict) + ): + renamed_pt_tuple_key = pt_tuple_key[:-1] + ("scale",) + return renamed_pt_tuple_key, pt_tensor + elif pt_tuple_key[-1] in ["weight", "gamma"] and pt_tuple_key[:-1] + ("scale",) in random_flax_state_dict: + renamed_pt_tuple_key = pt_tuple_key[:-1] + ("scale",) + return renamed_pt_tuple_key, pt_tensor + + # embedding + if pt_tuple_key[-1] == "weight" and pt_tuple_key[:-1] + ("embedding",) in random_flax_state_dict: + pt_tuple_key = pt_tuple_key[:-1] + ("embedding",) + return renamed_pt_tuple_key, pt_tensor + + # conv layer + renamed_pt_tuple_key = pt_tuple_key[:-1] + ("kernel",) + if pt_tuple_key[-1] == "weight" and pt_tensor.ndim == 4: + pt_tensor = pt_tensor.transpose(2, 3, 1, 0) + return renamed_pt_tuple_key, pt_tensor + + # linear layer + renamed_pt_tuple_key = pt_tuple_key[:-1] + ("kernel",) + if pt_tuple_key[-1] == "weight": + pt_tensor = pt_tensor.T + return renamed_pt_tuple_key, pt_tensor + + # old PyTorch layer norm weight + renamed_pt_tuple_key = pt_tuple_key[:-1] + ("weight",) + if pt_tuple_key[-1] == "gamma": + return renamed_pt_tuple_key, pt_tensor + + # old PyTorch layer norm bias + renamed_pt_tuple_key = pt_tuple_key[:-1] + ("bias",) + if pt_tuple_key[-1] == "beta": + return renamed_pt_tuple_key, pt_tensor + + return pt_tuple_key, pt_tensor + + +def convert_pytorch_state_dict_to_flax(pt_state_dict, flax_model, init_key=42): + # Step 1: Convert pytorch tensor to numpy + pt_state_dict = {k: v.numpy() for k, v in pt_state_dict.items()} + + # Step 2: Since the model is stateless, get random Flax params + random_flax_params = flax_model.init_weights(PRNGKey(init_key)) + + random_flax_state_dict = flatten_dict(random_flax_params) + flax_state_dict = {} + + # Need to change some parameters name to match Flax names + for pt_key, pt_tensor in pt_state_dict.items(): + renamed_pt_key = rename_key(pt_key) + pt_tuple_key = tuple(renamed_pt_key.split(".")) + + # Correctly rename weight parameters + flax_key, flax_tensor = rename_key_and_reshape_tensor(pt_tuple_key, pt_tensor, random_flax_state_dict) + + if flax_key in random_flax_state_dict: + if flax_tensor.shape != random_flax_state_dict[flax_key].shape: + raise ValueError( + f"PyTorch checkpoint seems to be incorrect. 
Weight {pt_key} was expected to be of shape " + f"{random_flax_state_dict[flax_key].shape}, but is {flax_tensor.shape}." + ) + + # also add unexpected weight so that warning is thrown + flax_state_dict[flax_key] = jnp.asarray(flax_tensor) + + return unflatten_dict(flax_state_dict) diff --git a/gf/sd-webui-rich-text/diffusers_official/models/modeling_flax_utils.py b/gf/sd-webui-rich-text/diffusers_official/models/modeling_flax_utils.py new file mode 100644 index 0000000000000000000000000000000000000000..9a6e1b3bba3d94e0252794cd0eda079f2c6f4183 --- /dev/null +++ b/gf/sd-webui-rich-text/diffusers_official/models/modeling_flax_utils.py @@ -0,0 +1,534 @@ +# coding=utf-8 +# Copyright 2023 The HuggingFace Inc. team. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +import os +from pickle import UnpicklingError +from typing import Any, Dict, Union + +import jax +import jax.numpy as jnp +import msgpack.exceptions +from flax.core.frozen_dict import FrozenDict, unfreeze +from flax.serialization import from_bytes, to_bytes +from flax.traverse_util import flatten_dict, unflatten_dict +from huggingface_hub import hf_hub_download +from huggingface_hub.utils import EntryNotFoundError, RepositoryNotFoundError, RevisionNotFoundError +from requests import HTTPError + +from .. import __version__, is_torch_available +from ..utils import ( + CONFIG_NAME, + DIFFUSERS_CACHE, + FLAX_WEIGHTS_NAME, + HUGGINGFACE_CO_RESOLVE_ENDPOINT, + WEIGHTS_NAME, + logging, +) +from .modeling_flax_pytorch_utils import convert_pytorch_state_dict_to_flax + + +logger = logging.get_logger(__name__) + + +class FlaxModelMixin: + r""" + Base class for all Flax models. + + [`FlaxModelMixin`] takes care of storing the model configuration and provides methods for loading, downloading and + saving models. + + - **config_name** ([`str`]) -- Filename to save a model to when calling [`~FlaxModelMixin.save_pretrained`]. + """ + config_name = CONFIG_NAME + _automatically_saved_args = ["_diffusers_version", "_class_name", "_name_or_path"] + _flax_internal_args = ["name", "parent", "dtype"] + + @classmethod + def _from_config(cls, config, **kwargs): + """ + All context managers that the model should be initialized under go here. + """ + return cls(config, **kwargs) + + def _cast_floating_to(self, params: Union[Dict, FrozenDict], dtype: jnp.dtype, mask: Any = None) -> Any: + """ + Helper method to cast floating-point values of given parameter `PyTree` to given `dtype`. 
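+        When `mask` is provided, only the leaves whose corresponding mask entry is `True` are cast; all other
+        parameters are returned unchanged.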
+ """ + + # taken from https://github.com/deepmind/jmp/blob/3a8318abc3292be38582794dbf7b094e6583b192/jmp/_src/policy.py#L27 + def conditional_cast(param): + if isinstance(param, jnp.ndarray) and jnp.issubdtype(param.dtype, jnp.floating): + param = param.astype(dtype) + return param + + if mask is None: + return jax.tree_map(conditional_cast, params) + + flat_params = flatten_dict(params) + flat_mask, _ = jax.tree_flatten(mask) + + for masked, key in zip(flat_mask, flat_params.keys()): + if masked: + param = flat_params[key] + flat_params[key] = conditional_cast(param) + + return unflatten_dict(flat_params) + + def to_bf16(self, params: Union[Dict, FrozenDict], mask: Any = None): + r""" + Cast the floating-point `params` to `jax.numpy.bfloat16`. This returns a new `params` tree and does not cast + the `params` in place. + + This method can be used on a TPU to explicitly convert the model parameters to bfloat16 precision to do full + half-precision training or to save weights in bfloat16 for inference in order to save memory and improve speed. + + Arguments: + params (`Union[Dict, FrozenDict]`): + A `PyTree` of model parameters. + mask (`Union[Dict, FrozenDict]`): + A `PyTree` with same structure as the `params` tree. The leaves should be booleans. It should be `True` + for params you want to cast, and `False` for those you want to skip. + + Examples: + + ```python + >>> from diffusers import FlaxUNet2DConditionModel + + >>> # load model + >>> model, params = FlaxUNet2DConditionModel.from_pretrained("runwayml/stable-diffusion-v1-5") + >>> # By default, the model parameters will be in fp32 precision, to cast these to bfloat16 precision + >>> params = model.to_bf16(params) + >>> # If you don't want to cast certain parameters (for example layer norm bias and scale) + >>> # then pass the mask as follows + >>> from flax import traverse_util + + >>> model, params = FlaxUNet2DConditionModel.from_pretrained("runwayml/stable-diffusion-v1-5") + >>> flat_params = traverse_util.flatten_dict(params) + >>> mask = { + ... path: (path[-2] != ("LayerNorm", "bias") and path[-2:] != ("LayerNorm", "scale")) + ... for path in flat_params + ... } + >>> mask = traverse_util.unflatten_dict(mask) + >>> params = model.to_bf16(params, mask) + ```""" + return self._cast_floating_to(params, jnp.bfloat16, mask) + + def to_fp32(self, params: Union[Dict, FrozenDict], mask: Any = None): + r""" + Cast the floating-point `params` to `jax.numpy.float32`. This method can be used to explicitly convert the + model parameters to fp32 precision. This returns a new `params` tree and does not cast the `params` in place. + + Arguments: + params (`Union[Dict, FrozenDict]`): + A `PyTree` of model parameters. + mask (`Union[Dict, FrozenDict]`): + A `PyTree` with same structure as the `params` tree. The leaves should be booleans. It should be `True` + for params you want to cast, and `False` for those you want to skip. 
+ + Examples: + + ```python + >>> from diffusers import FlaxUNet2DConditionModel + + >>> # Download model and configuration from huggingface.co + >>> model, params = FlaxUNet2DConditionModel.from_pretrained("runwayml/stable-diffusion-v1-5") + >>> # By default, the model params will be in fp32, to illustrate the use of this method, + >>> # we'll first cast to fp16 and back to fp32 + >>> params = model.to_f16(params) + >>> # now cast back to fp32 + >>> params = model.to_fp32(params) + ```""" + return self._cast_floating_to(params, jnp.float32, mask) + + def to_fp16(self, params: Union[Dict, FrozenDict], mask: Any = None): + r""" + Cast the floating-point `params` to `jax.numpy.float16`. This returns a new `params` tree and does not cast the + `params` in place. + + This method can be used on a GPU to explicitly convert the model parameters to float16 precision to do full + half-precision training or to save weights in float16 for inference in order to save memory and improve speed. + + Arguments: + params (`Union[Dict, FrozenDict]`): + A `PyTree` of model parameters. + mask (`Union[Dict, FrozenDict]`): + A `PyTree` with same structure as the `params` tree. The leaves should be booleans. It should be `True` + for params you want to cast, and `False` for those you want to skip. + + Examples: + + ```python + >>> from diffusers import FlaxUNet2DConditionModel + + >>> # load model + >>> model, params = FlaxUNet2DConditionModel.from_pretrained("runwayml/stable-diffusion-v1-5") + >>> # By default, the model params will be in fp32, to cast these to float16 + >>> params = model.to_fp16(params) + >>> # If you want don't want to cast certain parameters (for example layer norm bias and scale) + >>> # then pass the mask as follows + >>> from flax import traverse_util + + >>> model, params = FlaxUNet2DConditionModel.from_pretrained("runwayml/stable-diffusion-v1-5") + >>> flat_params = traverse_util.flatten_dict(params) + >>> mask = { + ... path: (path[-2] != ("LayerNorm", "bias") and path[-2:] != ("LayerNorm", "scale")) + ... for path in flat_params + ... } + >>> mask = traverse_util.unflatten_dict(mask) + >>> params = model.to_fp16(params, mask) + ```""" + return self._cast_floating_to(params, jnp.float16, mask) + + def init_weights(self, rng: jax.random.KeyArray) -> Dict: + raise NotImplementedError(f"init_weights method has to be implemented for {self}") + + @classmethod + def from_pretrained( + cls, + pretrained_model_name_or_path: Union[str, os.PathLike], + dtype: jnp.dtype = jnp.float32, + *model_args, + **kwargs, + ): + r""" + Instantiate a pretrained Flax model from a pretrained model configuration. + + Parameters: + pretrained_model_name_or_path (`str` or `os.PathLike`): + Can be either: + + - A string, the *model id* (for example `runwayml/stable-diffusion-v1-5`) of a pretrained model + hosted on the Hub. + - A path to a *directory* (for example `./my_model_directory`) containing the model weights saved + using [`~FlaxModelMixin.save_pretrained`]. + dtype (`jax.numpy.dtype`, *optional*, defaults to `jax.numpy.float32`): + The data type of the computation. Can be one of `jax.numpy.float32`, `jax.numpy.float16` (on GPUs) and + `jax.numpy.bfloat16` (on TPUs). + + This can be used to enable mixed-precision training or half-precision inference on GPUs or TPUs. If + specified, all the computation will be performed with the given `dtype`. + + + + This only specifies the dtype of the *computation* and does not influence the dtype of model + parameters. 
+ + If you wish to change the dtype of the model parameters, see [`~FlaxModelMixin.to_fp16`] and + [`~FlaxModelMixin.to_bf16`]. + + + + model_args (sequence of positional arguments, *optional*): + All remaining positional arguments are passed to the underlying model's `__init__` method. + cache_dir (`Union[str, os.PathLike]`, *optional*): + Path to a directory where a downloaded pretrained model configuration is cached if the standard cache + is not used. + force_download (`bool`, *optional*, defaults to `False`): + Whether or not to force the (re-)download of the model weights and configuration files, overriding the + cached versions if they exist. + resume_download (`bool`, *optional*, defaults to `False`): + Whether or not to resume downloading the model weights and configuration files. If set to `False`, any + incompletely downloaded files are deleted. + proxies (`Dict[str, str]`, *optional*): + A dictionary of proxy servers to use by protocol or endpoint, for example, `{'http': 'foo.bar:3128', + 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request. + local_files_only(`bool`, *optional*, defaults to `False`): + Whether to only load local model weights and configuration files or not. If set to `True`, the model + won't be downloaded from the Hub. + revision (`str`, *optional*, defaults to `"main"`): + The specific model version to use. It can be a branch name, a tag name, a commit id, or any identifier + allowed by Git. + from_pt (`bool`, *optional*, defaults to `False`): + Load the model weights from a PyTorch checkpoint save file. + kwargs (remaining dictionary of keyword arguments, *optional*): + Can be used to update the configuration object (after it is loaded) and initiate the model (for + example, `output_attentions=True`). Behaves differently depending on whether a `config` is provided or + automatically loaded: + + - If a configuration is provided with `config`, `kwargs` are directly passed to the underlying + model's `__init__` method (we assume all relevant updates to the configuration have already been + done). + - If a configuration is not provided, `kwargs` are first passed to the configuration class + initialization function [`~ConfigMixin.from_config`]. Each key of the `kwargs` that corresponds + to a configuration attribute is used to override said attribute with the supplied `kwargs` value. + Remaining keys that do not correspond to any configuration attribute are passed to the underlying + model's `__init__` function. + + Examples: + + ```python + >>> from diffusers import FlaxUNet2DConditionModel + + >>> # Download model and configuration from huggingface.co and cache. + >>> model, params = FlaxUNet2DConditionModel.from_pretrained("runwayml/stable-diffusion-v1-5") + >>> # Model was saved using *save_pretrained('./test/saved_model/')* (for example purposes, not runnable). + >>> model, params = FlaxUNet2DConditionModel.from_pretrained("./test/saved_model/") + ``` + + If you get the error message below, you need to finetune the weights for your downstream task: + + ```bash + Some weights of UNet2DConditionModel were not initialized from the model checkpoint at runwayml/stable-diffusion-v1-5 and are newly initialized because the shapes did not match: + - conv_in.weight: found shape torch.Size([320, 4, 3, 3]) in the checkpoint and torch.Size([320, 9, 3, 3]) in the model instantiated + You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference. 
+ ``` + """ + config = kwargs.pop("config", None) + cache_dir = kwargs.pop("cache_dir", DIFFUSERS_CACHE) + force_download = kwargs.pop("force_download", False) + from_pt = kwargs.pop("from_pt", False) + resume_download = kwargs.pop("resume_download", False) + proxies = kwargs.pop("proxies", None) + local_files_only = kwargs.pop("local_files_only", False) + use_auth_token = kwargs.pop("use_auth_token", None) + revision = kwargs.pop("revision", None) + subfolder = kwargs.pop("subfolder", None) + + user_agent = { + "diffusers": __version__, + "file_type": "model", + "framework": "flax", + } + + # Load config if we don't provide a configuration + config_path = config if config is not None else pretrained_model_name_or_path + model, model_kwargs = cls.from_config( + config_path, + cache_dir=cache_dir, + return_unused_kwargs=True, + force_download=force_download, + resume_download=resume_download, + proxies=proxies, + local_files_only=local_files_only, + use_auth_token=use_auth_token, + revision=revision, + subfolder=subfolder, + # model args + dtype=dtype, + **kwargs, + ) + + # Load model + pretrained_path_with_subfolder = ( + pretrained_model_name_or_path + if subfolder is None + else os.path.join(pretrained_model_name_or_path, subfolder) + ) + if os.path.isdir(pretrained_path_with_subfolder): + if from_pt: + if not os.path.isfile(os.path.join(pretrained_path_with_subfolder, WEIGHTS_NAME)): + raise EnvironmentError( + f"Error no file named {WEIGHTS_NAME} found in directory {pretrained_path_with_subfolder} " + ) + model_file = os.path.join(pretrained_path_with_subfolder, WEIGHTS_NAME) + elif os.path.isfile(os.path.join(pretrained_path_with_subfolder, FLAX_WEIGHTS_NAME)): + # Load from a Flax checkpoint + model_file = os.path.join(pretrained_path_with_subfolder, FLAX_WEIGHTS_NAME) + # Check if pytorch weights exist instead + elif os.path.isfile(os.path.join(pretrained_path_with_subfolder, WEIGHTS_NAME)): + raise EnvironmentError( + f"{WEIGHTS_NAME} file found in directory {pretrained_path_with_subfolder}. Please load the model" + " using `from_pt=True`." + ) + else: + raise EnvironmentError( + f"Error no file named {FLAX_WEIGHTS_NAME} or {WEIGHTS_NAME} found in directory " + f"{pretrained_path_with_subfolder}." + ) + else: + try: + model_file = hf_hub_download( + pretrained_model_name_or_path, + filename=FLAX_WEIGHTS_NAME if not from_pt else WEIGHTS_NAME, + cache_dir=cache_dir, + force_download=force_download, + proxies=proxies, + resume_download=resume_download, + local_files_only=local_files_only, + use_auth_token=use_auth_token, + user_agent=user_agent, + subfolder=subfolder, + revision=revision, + ) + + except RepositoryNotFoundError: + raise EnvironmentError( + f"{pretrained_model_name_or_path} is not a local folder and is not a valid model identifier " + "listed on 'https://huggingface.co/models'\nIf this is a private repository, make sure to pass a " + "token having permission to this repo with `use_auth_token` or log in with `huggingface-cli " + "login`." + ) + except RevisionNotFoundError: + raise EnvironmentError( + f"{revision} is not a valid git identifier (branch name, tag name or commit id) that exists for " + "this model name. Check the model page at " + f"'https://huggingface.co/{pretrained_model_name_or_path}' for available revisions." + ) + except EntryNotFoundError: + raise EnvironmentError( + f"{pretrained_model_name_or_path} does not appear to have a file named {FLAX_WEIGHTS_NAME}." 
+ ) + except HTTPError as err: + raise EnvironmentError( + f"There was a specific connection error when trying to load {pretrained_model_name_or_path}:\n" + f"{err}" + ) + except ValueError: + raise EnvironmentError( + f"We couldn't connect to '{HUGGINGFACE_CO_RESOLVE_ENDPOINT}' to load this model, couldn't find it" + f" in the cached files and it looks like {pretrained_model_name_or_path} is not the path to a" + f" directory containing a file named {FLAX_WEIGHTS_NAME} or {WEIGHTS_NAME}.\nCheckout your" + " internet connection or see how to run the library in offline mode at" + " 'https://huggingface.co/docs/transformers/installation#offline-mode'." + ) + except EnvironmentError: + raise EnvironmentError( + f"Can't load the model for '{pretrained_model_name_or_path}'. If you were trying to load it from " + "'https://huggingface.co/models', make sure you don't have a local directory with the same name. " + f"Otherwise, make sure '{pretrained_model_name_or_path}' is the correct path to a directory " + f"containing a file named {FLAX_WEIGHTS_NAME} or {WEIGHTS_NAME}." + ) + + if from_pt: + if is_torch_available(): + from .modeling_utils import load_state_dict + else: + raise EnvironmentError( + "Can't load the model in PyTorch format because PyTorch is not installed. " + "Please, install PyTorch or use native Flax weights." + ) + + # Step 1: Get the pytorch file + pytorch_model_file = load_state_dict(model_file) + + # Step 2: Convert the weights + state = convert_pytorch_state_dict_to_flax(pytorch_model_file, model) + else: + try: + with open(model_file, "rb") as state_f: + state = from_bytes(cls, state_f.read()) + except (UnpicklingError, msgpack.exceptions.ExtraData) as e: + try: + with open(model_file) as f: + if f.read().startswith("version"): + raise OSError( + "You seem to have cloned a repository without having git-lfs installed. Please" + " install git-lfs and run `git lfs install` followed by `git lfs pull` in the" + " folder you cloned." + ) + else: + raise ValueError from e + except (UnicodeDecodeError, ValueError): + raise EnvironmentError(f"Unable to convert {model_file} to Flax deserializable object. ") + # make sure all arrays are stored as jnp.ndarray + # NOTE: This is to prevent a bug this will be fixed in Flax >= v0.3.4: + # https://github.com/google/flax/issues/1261 + state = jax.tree_util.tree_map(lambda x: jax.device_put(x, jax.devices("cpu")[0]), state) + + # flatten dicts + state = flatten_dict(state) + + params_shape_tree = jax.eval_shape(model.init_weights, rng=jax.random.PRNGKey(0)) + required_params = set(flatten_dict(unfreeze(params_shape_tree)).keys()) + + shape_state = flatten_dict(unfreeze(params_shape_tree)) + + missing_keys = required_params - set(state.keys()) + unexpected_keys = set(state.keys()) - required_params + + if missing_keys: + logger.warning( + f"The checkpoint {pretrained_model_name_or_path} is missing required keys: {missing_keys}. " + "Make sure to call model.init_weights to initialize the missing weights." + ) + cls._missing_keys = missing_keys + + for key in state.keys(): + if key in shape_state and state[key].shape != shape_state[key].shape: + raise ValueError( + f"Trying to load the pretrained weight for {key} failed: checkpoint has shape " + f"{state[key].shape} which is incompatible with the model shape {shape_state[key].shape}. 
" + ) + + # remove unexpected keys to not be saved again + for unexpected_key in unexpected_keys: + del state[unexpected_key] + + if len(unexpected_keys) > 0: + logger.warning( + f"Some weights of the model checkpoint at {pretrained_model_name_or_path} were not used when" + f" initializing {model.__class__.__name__}: {unexpected_keys}\n- This IS expected if you are" + f" initializing {model.__class__.__name__} from the checkpoint of a model trained on another task or" + " with another architecture." + ) + else: + logger.info(f"All model checkpoint weights were used when initializing {model.__class__.__name__}.\n") + + if len(missing_keys) > 0: + logger.warning( + f"Some weights of {model.__class__.__name__} were not initialized from the model checkpoint at" + f" {pretrained_model_name_or_path} and are newly initialized: {missing_keys}\nYou should probably" + " TRAIN this model on a down-stream task to be able to use it for predictions and inference." + ) + else: + logger.info( + f"All the weights of {model.__class__.__name__} were initialized from the model checkpoint at" + f" {pretrained_model_name_or_path}.\nIf your task is similar to the task the model of the checkpoint" + f" was trained on, you can already use {model.__class__.__name__} for predictions without further" + " training." + ) + + return model, unflatten_dict(state) + + def save_pretrained( + self, + save_directory: Union[str, os.PathLike], + params: Union[Dict, FrozenDict], + is_main_process: bool = True, + ): + """ + Save a model and its configuration file to a directory so that it can be reloaded using the + [`~FlaxModelMixin.from_pretrained`] class method. + + Arguments: + save_directory (`str` or `os.PathLike`): + Directory to save a model and its configuration file to. Will be created if it doesn't exist. + params (`Union[Dict, FrozenDict]`): + A `PyTree` of model parameters. + is_main_process (`bool`, *optional*, defaults to `True`): + Whether the process calling this is the main process or not. Useful during distributed training and you + need to call this function on all processes. In this case, set `is_main_process=True` only on the main + process to avoid race conditions. + """ + if os.path.isfile(save_directory): + logger.error(f"Provided path ({save_directory}) should be a directory, not a file") + return + + os.makedirs(save_directory, exist_ok=True) + + model_to_save = self + + # Attach architecture to the config + # Save the config + if is_main_process: + model_to_save.save_config(save_directory) + + # save model + output_model_file = os.path.join(save_directory, FLAX_WEIGHTS_NAME) + with open(output_model_file, "wb") as f: + model_bytes = to_bytes(params) + f.write(model_bytes) + + logger.info(f"Model weights saved in {output_model_file}") diff --git a/gf/sd-webui-rich-text/diffusers_official/models/modeling_pytorch_flax_utils.py b/gf/sd-webui-rich-text/diffusers_official/models/modeling_pytorch_flax_utils.py new file mode 100644 index 0000000000000000000000000000000000000000..17b521b00145330ba99714f732112a44f4ceddc9 --- /dev/null +++ b/gf/sd-webui-rich-text/diffusers_official/models/modeling_pytorch_flax_utils.py @@ -0,0 +1,161 @@ +# coding=utf-8 +# Copyright 2023 The HuggingFace Inc. team. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. 
+# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +""" PyTorch - Flax general utilities.""" + +from pickle import UnpicklingError + +import jax +import jax.numpy as jnp +import numpy as np +from flax.serialization import from_bytes +from flax.traverse_util import flatten_dict + +from ..utils import logging + + +logger = logging.get_logger(__name__) + + +##################### +# Flax => PyTorch # +##################### + + +# from https://github.com/huggingface/transformers/blob/main/src/transformers/modeling_flax_pytorch_utils.py#L224-L352 +def load_flax_checkpoint_in_pytorch_model(pt_model, model_file): + try: + with open(model_file, "rb") as flax_state_f: + flax_state = from_bytes(None, flax_state_f.read()) + except UnpicklingError as e: + try: + with open(model_file) as f: + if f.read().startswith("version"): + raise OSError( + "You seem to have cloned a repository without having git-lfs installed. Please" + " install git-lfs and run `git lfs install` followed by `git lfs pull` in the" + " folder you cloned." + ) + else: + raise ValueError from e + except (UnicodeDecodeError, ValueError): + raise EnvironmentError(f"Unable to convert {model_file} to Flax deserializable object. ") + + return load_flax_weights_in_pytorch_model(pt_model, flax_state) + + +def load_flax_weights_in_pytorch_model(pt_model, flax_state): + """Load flax checkpoints in a PyTorch model""" + + try: + import torch # noqa: F401 + except ImportError: + logger.error( + "Loading Flax weights in PyTorch requires both PyTorch and Flax to be installed. Please see" + " https://pytorch.org/ and https://flax.readthedocs.io/en/latest/installation.html for installation" + " instructions." + ) + raise + + # check if we have bf16 weights + is_type_bf16 = flatten_dict(jax.tree_util.tree_map(lambda x: x.dtype == jnp.bfloat16, flax_state)).values() + if any(is_type_bf16): + # convert all weights to fp32 if they are bf16 since torch.from_numpy can-not handle bf16 + + # and bf16 is not fully supported in PT yet. + logger.warning( + "Found ``bfloat16`` weights in Flax model. Casting all ``bfloat16`` weights to ``float32`` " + "before loading those in PyTorch model." 
+ ) + flax_state = jax.tree_util.tree_map( + lambda params: params.astype(np.float32) if params.dtype == jnp.bfloat16 else params, flax_state + ) + + pt_model.base_model_prefix = "" + + flax_state_dict = flatten_dict(flax_state, sep=".") + pt_model_dict = pt_model.state_dict() + + # keep track of unexpected & missing keys + unexpected_keys = [] + missing_keys = set(pt_model_dict.keys()) + + for flax_key_tuple, flax_tensor in flax_state_dict.items(): + flax_key_tuple_array = flax_key_tuple.split(".") + + if flax_key_tuple_array[-1] == "kernel" and flax_tensor.ndim == 4: + flax_key_tuple_array = flax_key_tuple_array[:-1] + ["weight"] + flax_tensor = jnp.transpose(flax_tensor, (3, 2, 0, 1)) + elif flax_key_tuple_array[-1] == "kernel": + flax_key_tuple_array = flax_key_tuple_array[:-1] + ["weight"] + flax_tensor = flax_tensor.T + elif flax_key_tuple_array[-1] == "scale": + flax_key_tuple_array = flax_key_tuple_array[:-1] + ["weight"] + + if "time_embedding" not in flax_key_tuple_array: + for i, flax_key_tuple_string in enumerate(flax_key_tuple_array): + flax_key_tuple_array[i] = ( + flax_key_tuple_string.replace("_0", ".0") + .replace("_1", ".1") + .replace("_2", ".2") + .replace("_3", ".3") + .replace("_4", ".4") + .replace("_5", ".5") + .replace("_6", ".6") + .replace("_7", ".7") + .replace("_8", ".8") + .replace("_9", ".9") + ) + + flax_key = ".".join(flax_key_tuple_array) + + if flax_key in pt_model_dict: + if flax_tensor.shape != pt_model_dict[flax_key].shape: + raise ValueError( + f"Flax checkpoint seems to be incorrect. Weight {flax_key_tuple} was expected " + f"to be of shape {pt_model_dict[flax_key].shape}, but is {flax_tensor.shape}." + ) + else: + # add weight to pytorch dict + flax_tensor = np.asarray(flax_tensor) if not isinstance(flax_tensor, np.ndarray) else flax_tensor + pt_model_dict[flax_key] = torch.from_numpy(flax_tensor) + # remove from missing keys + missing_keys.remove(flax_key) + else: + # weight is not expected by PyTorch model + unexpected_keys.append(flax_key) + + pt_model.load_state_dict(pt_model_dict) + + # re-transform missing_keys to list + missing_keys = list(missing_keys) + + if len(unexpected_keys) > 0: + logger.warning( + "Some weights of the Flax model were not used when initializing the PyTorch model" + f" {pt_model.__class__.__name__}: {unexpected_keys}\n- This IS expected if you are initializing" + f" {pt_model.__class__.__name__} from a Flax model trained on another task or with another architecture" + " (e.g. initializing a BertForSequenceClassification model from a FlaxBertForPreTraining model).\n- This" + f" IS NOT expected if you are initializing {pt_model.__class__.__name__} from a Flax model that you expect" + " to be exactly identical (e.g. initializing a BertForSequenceClassification model from a" + " FlaxBertForSequenceClassification model)." + ) + if len(missing_keys) > 0: + logger.warning( + f"Some weights of {pt_model.__class__.__name__} were not initialized from the Flax model and are newly" + f" initialized: {missing_keys}\nYou should probably TRAIN this model on a down-stream task to be able to" + " use it for predictions and inference." 
+ ) + + return pt_model diff --git a/gf/sd-webui-rich-text/diffusers_official/models/modeling_utils.py b/gf/sd-webui-rich-text/diffusers_official/models/modeling_utils.py new file mode 100644 index 0000000000000000000000000000000000000000..1fa96514c0a9e39b9321550f6d85a8e11b0deb36 --- /dev/null +++ b/gf/sd-webui-rich-text/diffusers_official/models/modeling_utils.py @@ -0,0 +1,980 @@ +# coding=utf-8 +# Copyright 2023 The HuggingFace Inc. team. +# Copyright (c) 2022, NVIDIA CORPORATION. All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +import inspect +import itertools +import os +import re +from functools import partial +from typing import Any, Callable, List, Optional, Tuple, Union + +import torch +from torch import Tensor, device, nn + +from .. import __version__ +from ..utils import ( + CONFIG_NAME, + DIFFUSERS_CACHE, + FLAX_WEIGHTS_NAME, + HF_HUB_OFFLINE, + SAFETENSORS_WEIGHTS_NAME, + WEIGHTS_NAME, + _add_variant, + _get_model_file, + deprecate, + is_accelerate_available, + is_safetensors_available, + is_torch_version, + logging, +) + + +logger = logging.get_logger(__name__) + + +if is_torch_version(">=", "1.9.0"): + _LOW_CPU_MEM_USAGE_DEFAULT = True +else: + _LOW_CPU_MEM_USAGE_DEFAULT = False + + +if is_accelerate_available(): + import accelerate + from accelerate.utils import set_module_tensor_to_device + from accelerate.utils.versions import is_torch_version + +if is_safetensors_available(): + import safetensors + + +def get_parameter_device(parameter: torch.nn.Module): + try: + parameters_and_buffers = itertools.chain(parameter.parameters(), parameter.buffers()) + return next(parameters_and_buffers).device + except StopIteration: + # For torch.nn.DataParallel compatibility in PyTorch 1.5 + + def find_tensor_attributes(module: torch.nn.Module) -> List[Tuple[str, Tensor]]: + tuples = [(k, v) for k, v in module.__dict__.items() if torch.is_tensor(v)] + return tuples + + gen = parameter._named_members(get_members_fn=find_tensor_attributes) + first_tuple = next(gen) + return first_tuple[1].device + + +def get_parameter_dtype(parameter: torch.nn.Module): + try: + params = tuple(parameter.parameters()) + if len(params) > 0: + return params[0].dtype + + buffers = tuple(parameter.buffers()) + if len(buffers) > 0: + return buffers[0].dtype + + except StopIteration: + # For torch.nn.DataParallel compatibility in PyTorch 1.5 + + def find_tensor_attributes(module: torch.nn.Module) -> List[Tuple[str, Tensor]]: + tuples = [(k, v) for k, v in module.__dict__.items() if torch.is_tensor(v)] + return tuples + + gen = parameter._named_members(get_members_fn=find_tensor_attributes) + first_tuple = next(gen) + return first_tuple[1].dtype + + +def load_state_dict(checkpoint_file: Union[str, os.PathLike], variant: Optional[str] = None): + """ + Reads a checkpoint file, returning properly formatted errors if they arise. 
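+
+    Illustrative usage (a sketch added for clarity; the local checkpoint path is hypothetical):
+
+    ```py
+    # Returns a plain state dict loaded on CPU; a `.safetensors` file like this one is read via
+    # `safetensors.torch.load_file`, while a file using the standard `.bin` weights name goes through `torch.load`.
+    state_dict = load_state_dict("./my_model_directory/diffusion_pytorch_model.safetensors")
+    ```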
+ """ + try: + if os.path.basename(checkpoint_file) == _add_variant(WEIGHTS_NAME, variant): + return torch.load(checkpoint_file, map_location="cpu") + else: + return safetensors.torch.load_file(checkpoint_file, device="cpu") + except Exception as e: + try: + with open(checkpoint_file) as f: + if f.read().startswith("version"): + raise OSError( + "You seem to have cloned a repository without having git-lfs installed. Please install " + "git-lfs and run `git lfs install` followed by `git lfs pull` in the folder " + "you cloned." + ) + else: + raise ValueError( + f"Unable to locate the file {checkpoint_file} which is necessary to load this pretrained " + "model. Make sure you have saved the model properly." + ) from e + except (UnicodeDecodeError, ValueError): + raise OSError( + f"Unable to load weights from checkpoint file for '{checkpoint_file}' " + f"at '{checkpoint_file}'. " + "If you tried to load a PyTorch model from a TF 2.0 checkpoint, please set from_tf=True." + ) + + +def _load_state_dict_into_model(model_to_load, state_dict): + # Convert old format to new format if needed from a PyTorch state_dict + # copy state_dict so _load_from_state_dict can modify it + state_dict = state_dict.copy() + error_msgs = [] + + # PyTorch's `_load_from_state_dict` does not copy parameters in a module's descendants + # so we need to apply the function recursively. + def load(module: torch.nn.Module, prefix=""): + args = (state_dict, prefix, {}, True, [], [], error_msgs) + module._load_from_state_dict(*args) + + for name, child in module._modules.items(): + if child is not None: + load(child, prefix + name + ".") + + load(model_to_load) + + return error_msgs + + +class ModelMixin(torch.nn.Module): + r""" + Base class for all models. + + [`ModelMixin`] takes care of storing the model configuration and provides methods for loading, downloading and + saving models. + + - **config_name** ([`str`]) -- Filename to save a model to when calling [`~models.ModelMixin.save_pretrained`]. + """ + config_name = CONFIG_NAME + _automatically_saved_args = ["_diffusers_version", "_class_name", "_name_or_path"] + _supports_gradient_checkpointing = False + _keys_to_ignore_on_load_unexpected = None + + def __init__(self): + super().__init__() + + def __getattr__(self, name: str) -> Any: + """The only reason we overwrite `getattr` here is to gracefully deprecate accessing + config attributes directly. See https://github.com/huggingface/diffusers/pull/3129 We need to overwrite + __getattr__ here in addition so that we don't trigger `torch.nn.Module`'s __getattr__': + https://pytorch.org/docs/stable/_modules/torch/nn/modules/module.html#Module + """ + + is_in_config = "_internal_dict" in self.__dict__ and hasattr(self.__dict__["_internal_dict"], name) + is_attribute = name in self.__dict__ + + if is_in_config and not is_attribute: + deprecation_message = f"Accessing config attribute `{name}` directly via '{type(self).__name__}' object attribute is deprecated. Please access '{name}' over '{type(self).__name__}'s config object instead, e.g. 'unet.config.{name}'." + deprecate("direct config name access", "1.0.0", deprecation_message, standard_warn=False, stacklevel=3) + return self._internal_dict[name] + + # call PyTorch's https://pytorch.org/docs/stable/_modules/torch/nn/modules/module.html#Module + return super().__getattr__(name) + + @property + def is_gradient_checkpointing(self) -> bool: + """ + Whether gradient checkpointing is activated for this model or not. 
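+
+        A minimal sketch (assumes `model` is an instance of a subclass that supports gradient checkpointing):
+
+        ```py
+        model.enable_gradient_checkpointing()
+        assert model.is_gradient_checkpointing
+        ```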
+ """ + return any(hasattr(m, "gradient_checkpointing") and m.gradient_checkpointing for m in self.modules()) + + def enable_gradient_checkpointing(self): + """ + Activates gradient checkpointing for the current model (may be referred to as *activation checkpointing* or + *checkpoint activations* in other frameworks). + """ + if not self._supports_gradient_checkpointing: + raise ValueError(f"{self.__class__.__name__} does not support gradient checkpointing.") + self.apply(partial(self._set_gradient_checkpointing, value=True)) + + def disable_gradient_checkpointing(self): + """ + Deactivates gradient checkpointing for the current model (may be referred to as *activation checkpointing* or + *checkpoint activations* in other frameworks). + """ + if self._supports_gradient_checkpointing: + self.apply(partial(self._set_gradient_checkpointing, value=False)) + + def set_use_memory_efficient_attention_xformers( + self, valid: bool, attention_op: Optional[Callable] = None + ) -> None: + # Recursively walk through all the children. + # Any children which exposes the set_use_memory_efficient_attention_xformers method + # gets the message + def fn_recursive_set_mem_eff(module: torch.nn.Module): + if hasattr(module, "set_use_memory_efficient_attention_xformers"): + module.set_use_memory_efficient_attention_xformers(valid, attention_op) + + for child in module.children(): + fn_recursive_set_mem_eff(child) + + for module in self.children(): + if isinstance(module, torch.nn.Module): + fn_recursive_set_mem_eff(module) + + def enable_xformers_memory_efficient_attention(self, attention_op: Optional[Callable] = None): + r""" + Enable memory efficient attention from [xFormers](https://facebookresearch.github.io/xformers/). + + When this option is enabled, you should observe lower GPU memory usage and a potential speed up during + inference. Speed up during training is not guaranteed. + + + + ⚠️ When memory efficient attention and sliced attention are both enabled, memory efficient attention takes + precedent. + + + + Parameters: + attention_op (`Callable`, *optional*): + Override the default `None` operator for use as `op` argument to the + [`memory_efficient_attention()`](https://facebookresearch.github.io/xformers/components/ops.html#xformers.ops.memory_efficient_attention) + function of xFormers. + + Examples: + + ```py + >>> import torch + >>> from diffusers import UNet2DConditionModel + >>> from xformers.ops import MemoryEfficientAttentionFlashAttentionOp + + >>> model = UNet2DConditionModel.from_pretrained( + ... "stabilityai/stable-diffusion-2-1", subfolder="unet", torch_dtype=torch.float16 + ... ) + >>> model = model.to("cuda") + >>> model.enable_xformers_memory_efficient_attention(attention_op=MemoryEfficientAttentionFlashAttentionOp) + ``` + """ + self.set_use_memory_efficient_attention_xformers(True, attention_op) + + def disable_xformers_memory_efficient_attention(self): + r""" + Disable memory efficient attention from [xFormers](https://facebookresearch.github.io/xformers/). + """ + self.set_use_memory_efficient_attention_xformers(False) + + def save_pretrained( + self, + save_directory: Union[str, os.PathLike], + is_main_process: bool = True, + save_function: Callable = None, + safe_serialization: bool = False, + variant: Optional[str] = None, + ): + """ + Save a model and its configuration file to a directory so that it can be reloaded using the + [`~models.ModelMixin.from_pretrained`] class method. 
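+
+        Example (a brief sketch; the output directory is hypothetical, and `safetensors` must be installed
+        when `safe_serialization=True`):
+
+        ```py
+        from diffusers import UNet2DConditionModel
+
+        unet = UNet2DConditionModel.from_pretrained("runwayml/stable-diffusion-v1-5", subfolder="unet")
+        unet.save_pretrained("./my_unet", safe_serialization=True)
+        ```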
+ + Arguments: + save_directory (`str` or `os.PathLike`): + Directory to save a model and its configuration file to. Will be created if it doesn't exist. + is_main_process (`bool`, *optional*, defaults to `True`): + Whether the process calling this is the main process or not. Useful during distributed training and you + need to call this function on all processes. In this case, set `is_main_process=True` only on the main + process to avoid race conditions. + save_function (`Callable`): + The function to use to save the state dictionary. Useful during distributed training when you need to + replace `torch.save` with another method. Can be configured with the environment variable + `DIFFUSERS_SAVE_MODE`. + safe_serialization (`bool`, *optional*, defaults to `False`): + Whether to save the model using `safetensors` or the traditional PyTorch way with `pickle`. + variant (`str`, *optional*): + If specified, weights are saved in the format `pytorch_model..bin`. + """ + if safe_serialization and not is_safetensors_available(): + raise ImportError("`safe_serialization` requires the `safetensors library: `pip install safetensors`.") + + if os.path.isfile(save_directory): + logger.error(f"Provided path ({save_directory}) should be a directory, not a file") + return + + os.makedirs(save_directory, exist_ok=True) + + model_to_save = self + + # Attach architecture to the config + # Save the config + if is_main_process: + model_to_save.save_config(save_directory) + + # Save the model + state_dict = model_to_save.state_dict() + + weights_name = SAFETENSORS_WEIGHTS_NAME if safe_serialization else WEIGHTS_NAME + weights_name = _add_variant(weights_name, variant) + + # Save the model + if safe_serialization: + safetensors.torch.save_file( + state_dict, os.path.join(save_directory, weights_name), metadata={"format": "pt"} + ) + else: + torch.save(state_dict, os.path.join(save_directory, weights_name)) + + logger.info(f"Model weights saved in {os.path.join(save_directory, weights_name)}") + + @classmethod + def from_pretrained(cls, pretrained_model_name_or_path: Optional[Union[str, os.PathLike]], **kwargs): + r""" + Instantiate a pretrained PyTorch model from a pretrained model configuration. + + The model is set in evaluation mode - `model.eval()` - by default, and dropout modules are deactivated. To + train the model, set it back in training mode with `model.train()`. + + Parameters: + pretrained_model_name_or_path (`str` or `os.PathLike`, *optional*): + Can be either: + + - A string, the *model id* (for example `google/ddpm-celebahq-256`) of a pretrained model hosted on + the Hub. + - A path to a *directory* (for example `./my_model_directory`) containing the model weights saved + with [`~ModelMixin.save_pretrained`]. + + cache_dir (`Union[str, os.PathLike]`, *optional*): + Path to a directory where a downloaded pretrained model configuration is cached if the standard cache + is not used. + torch_dtype (`str` or `torch.dtype`, *optional*): + Override the default `torch.dtype` and load the model with another dtype. If `"auto"` is passed, the + dtype is automatically derived from the model's weights. + force_download (`bool`, *optional*, defaults to `False`): + Whether or not to force the (re-)download of the model weights and configuration files, overriding the + cached versions if they exist. + resume_download (`bool`, *optional*, defaults to `False`): + Whether or not to resume downloading the model weights and configuration files. If set to `False`, any + incompletely downloaded files are deleted. 
+ proxies (`Dict[str, str]`, *optional*): + A dictionary of proxy servers to use by protocol or endpoint, for example, `{'http': 'foo.bar:3128', + 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request. + output_loading_info (`bool`, *optional*, defaults to `False`): + Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages. + local_files_only(`bool`, *optional*, defaults to `False`): + Whether to only load local model weights and configuration files or not. If set to `True`, the model + won't be downloaded from the Hub. + use_auth_token (`str` or *bool*, *optional*): + The token to use as HTTP bearer authorization for remote files. If `True`, the token generated from + `diffusers-cli login` (stored in `~/.huggingface`) is used. + revision (`str`, *optional*, defaults to `"main"`): + The specific model version to use. It can be a branch name, a tag name, a commit id, or any identifier + allowed by Git. + from_flax (`bool`, *optional*, defaults to `False`): + Load the model weights from a Flax checkpoint save file. + subfolder (`str`, *optional*, defaults to `""`): + The subfolder location of a model file within a larger model repository on the Hub or locally. + mirror (`str`, *optional*): + Mirror source to resolve accessibility issues if you're downloading a model in China. We do not + guarantee the timeliness or safety of the source, and you should refer to the mirror site for more + information. + device_map (`str` or `Dict[str, Union[int, str, torch.device]]`, *optional*): + A map that specifies where each submodule should go. It doesn't need to be defined for each + parameter/buffer name; once a given module name is inside, every submodule of it will be sent to the + same device. + + Set `device_map="auto"` to have 🤗 Accelerate automatically compute the most optimized `device_map`. For + more information about each option see [designing a device + map](https://hf.co/docs/accelerate/main/en/usage_guides/big_modeling#designing-a-device-map). + max_memory (`Dict`, *optional*): + A dictionary device identifier for the maximum memory. Will default to the maximum memory available for + each GPU and the available CPU RAM if unset. + offload_folder (`str` or `os.PathLike`, *optional*): + The path to offload weights if `device_map` contains the value `"disk"`. + offload_state_dict (`bool`, *optional*): + If `True`, temporarily offloads the CPU state dict to the hard drive to avoid running out of CPU RAM if + the weight of the CPU state dict + the biggest shard of the checkpoint does not fit. Defaults to `True` + when there is some disk offload. + low_cpu_mem_usage (`bool`, *optional*, defaults to `True` if torch version >= 1.9.0 else `False`): + Speed up model loading only loading the pretrained weights and not initializing the weights. This also + tries to not use more than 1x model size in CPU memory (including peak memory) while loading the model. + Only supported for PyTorch >= 1.9.0. If you are using an older version of PyTorch, setting this + argument to `True` will raise an error. + variant (`str`, *optional*): + Load weights from a specified `variant` filename such as `"fp16"` or `"ema"`. This is ignored when + loading `from_flax`. + use_safetensors (`bool`, *optional*, defaults to `None`): + If set to `None`, the `safetensors` weights are downloaded if they're available **and** if the + `safetensors` library is installed. If set to `True`, the model is forcibly loaded from `safetensors` + weights. 
If set to `False`, `safetensors` weights are not loaded. + + + + To use private or [gated models](https://huggingface.co/docs/hub/models-gated#gated-models), log-in with + `huggingface-cli login`. You can also activate the special + ["offline-mode"](https://huggingface.co/diffusers/installation.html#offline-mode) to use this method in a + firewalled environment. + + + + Example: + + ```py + from diffusers import UNet2DConditionModel + + unet = UNet2DConditionModel.from_pretrained("runwayml/stable-diffusion-v1-5", subfolder="unet") + ``` + + If you get the error message below, you need to finetune the weights for your downstream task: + + ```bash + Some weights of UNet2DConditionModel were not initialized from the model checkpoint at runwayml/stable-diffusion-v1-5 and are newly initialized because the shapes did not match: + - conv_in.weight: found shape torch.Size([320, 4, 3, 3]) in the checkpoint and torch.Size([320, 9, 3, 3]) in the model instantiated + You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference. + ``` + """ + cache_dir = kwargs.pop("cache_dir", DIFFUSERS_CACHE) + ignore_mismatched_sizes = kwargs.pop("ignore_mismatched_sizes", False) + force_download = kwargs.pop("force_download", False) + from_flax = kwargs.pop("from_flax", False) + resume_download = kwargs.pop("resume_download", False) + proxies = kwargs.pop("proxies", None) + output_loading_info = kwargs.pop("output_loading_info", False) + local_files_only = kwargs.pop("local_files_only", HF_HUB_OFFLINE) + use_auth_token = kwargs.pop("use_auth_token", None) + revision = kwargs.pop("revision", None) + torch_dtype = kwargs.pop("torch_dtype", None) + subfolder = kwargs.pop("subfolder", None) + device_map = kwargs.pop("device_map", None) + max_memory = kwargs.pop("max_memory", None) + offload_folder = kwargs.pop("offload_folder", None) + offload_state_dict = kwargs.pop("offload_state_dict", False) + low_cpu_mem_usage = kwargs.pop("low_cpu_mem_usage", _LOW_CPU_MEM_USAGE_DEFAULT) + variant = kwargs.pop("variant", None) + use_safetensors = kwargs.pop("use_safetensors", None) + + if use_safetensors and not is_safetensors_available(): + raise ValueError( + "`use_safetensors`=True but safetensors is not installed. Please install safetensors with `pip install safetensors" + ) + + allow_pickle = False + if use_safetensors is None: + use_safetensors = is_safetensors_available() + allow_pickle = True + + if low_cpu_mem_usage and not is_accelerate_available(): + low_cpu_mem_usage = False + logger.warning( + "Cannot initialize model with low cpu memory usage because `accelerate` was not found in the" + " environment. Defaulting to `low_cpu_mem_usage=False`. It is strongly recommended to install" + " `accelerate` for faster and less memory-intense model loading. You can do so with: \n```\npip" + " install accelerate\n```\n." + ) + + if device_map is not None and not is_accelerate_available(): + raise NotImplementedError( + "Loading and dispatching requires `accelerate`. Please make sure to install accelerate or set" + " `device_map=None`. You can install accelerate with `pip install accelerate`." + ) + + # Check if we can handle device_map and dispatching the weights + if device_map is not None and not is_torch_version(">=", "1.9.0"): + raise NotImplementedError( + "Loading and dispatching requires torch >= 1.9.0. Please either update your PyTorch version or set" + " `device_map=None`." 
+ ) + + if low_cpu_mem_usage is True and not is_torch_version(">=", "1.9.0"): + raise NotImplementedError( + "Low memory initialization requires torch >= 1.9.0. Please either update your PyTorch version or set" + " `low_cpu_mem_usage=False`." + ) + + if low_cpu_mem_usage is False and device_map is not None: + raise ValueError( + f"You cannot set `low_cpu_mem_usage` to `False` while using device_map={device_map} for loading and" + " dispatching. Please make sure to set `low_cpu_mem_usage=True`." + ) + + # Load config if we don't provide a configuration + config_path = pretrained_model_name_or_path + + user_agent = { + "diffusers": __version__, + "file_type": "model", + "framework": "pytorch", + } + + # load config + config, unused_kwargs, commit_hash = cls.load_config( + config_path, + cache_dir=cache_dir, + return_unused_kwargs=True, + return_commit_hash=True, + force_download=force_download, + resume_download=resume_download, + proxies=proxies, + local_files_only=local_files_only, + use_auth_token=use_auth_token, + revision=revision, + subfolder=subfolder, + device_map=device_map, + max_memory=max_memory, + offload_folder=offload_folder, + offload_state_dict=offload_state_dict, + user_agent=user_agent, + **kwargs, + ) + + # load model + model_file = None + if from_flax: + model_file = _get_model_file( + pretrained_model_name_or_path, + weights_name=FLAX_WEIGHTS_NAME, + cache_dir=cache_dir, + force_download=force_download, + resume_download=resume_download, + proxies=proxies, + local_files_only=local_files_only, + use_auth_token=use_auth_token, + revision=revision, + subfolder=subfolder, + user_agent=user_agent, + commit_hash=commit_hash, + ) + model = cls.from_config(config, **unused_kwargs) + + # Convert the weights + from .modeling_pytorch_flax_utils import load_flax_checkpoint_in_pytorch_model + + model = load_flax_checkpoint_in_pytorch_model(model, model_file) + else: + if use_safetensors: + try: + model_file = _get_model_file( + pretrained_model_name_or_path, + weights_name=_add_variant(SAFETENSORS_WEIGHTS_NAME, variant), + cache_dir=cache_dir, + force_download=force_download, + resume_download=resume_download, + proxies=proxies, + local_files_only=local_files_only, + use_auth_token=use_auth_token, + revision=revision, + subfolder=subfolder, + user_agent=user_agent, + commit_hash=commit_hash, + ) + except IOError as e: + if not allow_pickle: + raise e + pass + if model_file is None: + model_file = _get_model_file( + pretrained_model_name_or_path, + weights_name=_add_variant(WEIGHTS_NAME, variant), + cache_dir=cache_dir, + force_download=force_download, + resume_download=resume_download, + proxies=proxies, + local_files_only=local_files_only, + use_auth_token=use_auth_token, + revision=revision, + subfolder=subfolder, + user_agent=user_agent, + commit_hash=commit_hash, + ) + + if low_cpu_mem_usage: + # Instantiate model with empty weights + with accelerate.init_empty_weights(): + model = cls.from_config(config, **unused_kwargs) + + # if device_map is None, load the state dict and move the params from meta device to the cpu + if device_map is None: + param_device = "cpu" + state_dict = load_state_dict(model_file, variant=variant) + model._convert_deprecated_attention_blocks(state_dict) + # move the params from meta device to cpu + missing_keys = set(model.state_dict().keys()) - set(state_dict.keys()) + if len(missing_keys) > 0: + raise ValueError( + f"Cannot load {cls} from {pretrained_model_name_or_path} because the following keys are" + f" missing: \n {', '.join(missing_keys)}. 
\n Please make sure to pass" + " `low_cpu_mem_usage=False` and `device_map=None` if you want to randomly initialize" + " those weights or else make sure your checkpoint file is correct." + ) + unexpected_keys = [] + + empty_state_dict = model.state_dict() + for param_name, param in state_dict.items(): + accepts_dtype = "dtype" in set( + inspect.signature(set_module_tensor_to_device).parameters.keys() + ) + + if param_name not in empty_state_dict: + unexpected_keys.append(param_name) + continue + + if empty_state_dict[param_name].shape != param.shape: + raise ValueError( + f"Cannot load {pretrained_model_name_or_path} because {param_name} expected shape {empty_state_dict[param_name]}, but got {param.shape}. If you want to instead overwrite randomly initialized weights, please make sure to pass both `low_cpu_mem_usage=False` and `ignore_mismatched_sizes=True`. For more information, see also: https://github.com/huggingface/diffusers/issues/1619#issuecomment-1345604389 as an example." + ) + + if accepts_dtype: + set_module_tensor_to_device( + model, param_name, param_device, value=param, dtype=torch_dtype + ) + else: + set_module_tensor_to_device(model, param_name, param_device, value=param) + + if cls._keys_to_ignore_on_load_unexpected is not None: + for pat in cls._keys_to_ignore_on_load_unexpected: + unexpected_keys = [k for k in unexpected_keys if re.search(pat, k) is None] + + if len(unexpected_keys) > 0: + logger.warn( + f"Some weights of the model checkpoint were not used when initializing {cls.__name__}: \n {[', '.join(unexpected_keys)]}" + ) + + else: # else let accelerate handle loading and dispatching. + # Load weights and dispatch according to the device_map + # by default the device_map is None and the weights are loaded on the CPU + try: + accelerate.load_checkpoint_and_dispatch( + model, + model_file, + device_map, + max_memory=max_memory, + offload_folder=offload_folder, + offload_state_dict=offload_state_dict, + dtype=torch_dtype, + ) + except AttributeError as e: + # When using accelerate loading, we do not have the ability to load the state + # dict and rename the weight names manually. Additionally, accelerate skips + # torch loading conventions and directly writes into `module.{_buffers, _parameters}` + # (which look like they should be private variables?), so we can't use the standard hooks + # to rename parameters on load. We need to mimic the original weight names so the correct + # attributes are available. After we have loaded the weights, we convert the deprecated + # names to the new non-deprecated names. Then we _greatly encourage_ the user to convert + # the weights so we don't have to do this again. + + if "'Attention' object has no attribute" in str(e): + logger.warn( + f"Taking `{str(e)}` while using `accelerate.load_checkpoint_and_dispatch` to mean {pretrained_model_name_or_path}" + " was saved with deprecated attention block weight names. We will load it with the deprecated attention block" + " names and convert them on the fly to the new attention block format. Please re-save the model after this conversion," + " so we don't have to do the on the fly renaming in the future. If the model is from a hub checkpoint," + " please also re-upload it or open a PR on the original repository." 
+ ) + model._temp_convert_self_to_deprecated_attention_blocks() + accelerate.load_checkpoint_and_dispatch( + model, + model_file, + device_map, + max_memory=max_memory, + offload_folder=offload_folder, + offload_state_dict=offload_state_dict, + dtype=torch_dtype, + ) + model._undo_temp_convert_self_to_deprecated_attention_blocks() + else: + raise e + + loading_info = { + "missing_keys": [], + "unexpected_keys": [], + "mismatched_keys": [], + "error_msgs": [], + } + else: + model = cls.from_config(config, **unused_kwargs) + + state_dict = load_state_dict(model_file, variant=variant) + model._convert_deprecated_attention_blocks(state_dict) + + model, missing_keys, unexpected_keys, mismatched_keys, error_msgs = cls._load_pretrained_model( + model, + state_dict, + model_file, + pretrained_model_name_or_path, + ignore_mismatched_sizes=ignore_mismatched_sizes, + ) + + loading_info = { + "missing_keys": missing_keys, + "unexpected_keys": unexpected_keys, + "mismatched_keys": mismatched_keys, + "error_msgs": error_msgs, + } + + if torch_dtype is not None and not isinstance(torch_dtype, torch.dtype): + raise ValueError( + f"{torch_dtype} needs to be of type `torch.dtype`, e.g. `torch.float16`, but is {type(torch_dtype)}." + ) + elif torch_dtype is not None: + model = model.to(torch_dtype) + + model.register_to_config(_name_or_path=pretrained_model_name_or_path) + + # Set model in evaluation mode to deactivate DropOut modules by default + model.eval() + if output_loading_info: + return model, loading_info + + return model + + @classmethod + def _load_pretrained_model( + cls, + model, + state_dict, + resolved_archive_file, + pretrained_model_name_or_path, + ignore_mismatched_sizes=False, + ): + # Retrieve missing & unexpected_keys + model_state_dict = model.state_dict() + loaded_keys = list(state_dict.keys()) + + expected_keys = list(model_state_dict.keys()) + + original_loaded_keys = loaded_keys + + missing_keys = list(set(expected_keys) - set(loaded_keys)) + unexpected_keys = list(set(loaded_keys) - set(expected_keys)) + + # Make sure we are able to load base models as well as derived models (with heads) + model_to_load = model + + def _find_mismatched_keys( + state_dict, + model_state_dict, + loaded_keys, + ignore_mismatched_sizes, + ): + mismatched_keys = [] + if ignore_mismatched_sizes: + for checkpoint_key in loaded_keys: + model_key = checkpoint_key + + if ( + model_key in model_state_dict + and state_dict[checkpoint_key].shape != model_state_dict[model_key].shape + ): + mismatched_keys.append( + (checkpoint_key, state_dict[checkpoint_key].shape, model_state_dict[model_key].shape) + ) + del state_dict[checkpoint_key] + return mismatched_keys + + if state_dict is not None: + # Whole checkpoint + mismatched_keys = _find_mismatched_keys( + state_dict, + model_state_dict, + original_loaded_keys, + ignore_mismatched_sizes, + ) + error_msgs = _load_state_dict_into_model(model_to_load, state_dict) + + if len(error_msgs) > 0: + error_msg = "\n\t".join(error_msgs) + if "size mismatch" in error_msg: + error_msg += ( + "\n\tYou may consider adding `ignore_mismatched_sizes=True` in the model `from_pretrained` method." 
+ ) + raise RuntimeError(f"Error(s) in loading state_dict for {model.__class__.__name__}:\n\t{error_msg}") + + if len(unexpected_keys) > 0: + logger.warning( + f"Some weights of the model checkpoint at {pretrained_model_name_or_path} were not used when" + f" initializing {model.__class__.__name__}: {unexpected_keys}\n- This IS expected if you are" + f" initializing {model.__class__.__name__} from the checkpoint of a model trained on another task" + " or with another architecture (e.g. initializing a BertForSequenceClassification model from a" + " BertForPreTraining model).\n- This IS NOT expected if you are initializing" + f" {model.__class__.__name__} from the checkpoint of a model that you expect to be exactly" + " identical (initializing a BertForSequenceClassification model from a" + " BertForSequenceClassification model)." + ) + else: + logger.info(f"All model checkpoint weights were used when initializing {model.__class__.__name__}.\n") + if len(missing_keys) > 0: + logger.warning( + f"Some weights of {model.__class__.__name__} were not initialized from the model checkpoint at" + f" {pretrained_model_name_or_path} and are newly initialized: {missing_keys}\nYou should probably" + " TRAIN this model on a down-stream task to be able to use it for predictions and inference." + ) + elif len(mismatched_keys) == 0: + logger.info( + f"All the weights of {model.__class__.__name__} were initialized from the model checkpoint at" + f" {pretrained_model_name_or_path}.\nIf your task is similar to the task the model of the" + f" checkpoint was trained on, you can already use {model.__class__.__name__} for predictions" + " without further training." + ) + if len(mismatched_keys) > 0: + mismatched_warning = "\n".join( + [ + f"- {key}: found shape {shape1} in the checkpoint and {shape2} in the model instantiated" + for key, shape1, shape2 in mismatched_keys + ] + ) + logger.warning( + f"Some weights of {model.__class__.__name__} were not initialized from the model checkpoint at" + f" {pretrained_model_name_or_path} and are newly initialized because the shapes did not" + f" match:\n{mismatched_warning}\nYou should probably TRAIN this model on a down-stream task to be" + " able to use it for predictions and inference." + ) + + return model, missing_keys, unexpected_keys, mismatched_keys, error_msgs + + @property + def device(self) -> device: + """ + `torch.device`: The device on which the module is (assuming that all the module parameters are on the same + device). + """ + return get_parameter_device(self) + + @property + def dtype(self) -> torch.dtype: + """ + `torch.dtype`: The dtype of the module (assuming that all the module parameters have the same dtype). + """ + return get_parameter_dtype(self) + + def num_parameters(self, only_trainable: bool = False, exclude_embeddings: bool = False) -> int: + """ + Get number of (trainable or non-embedding) parameters in the module. + + Args: + only_trainable (`bool`, *optional*, defaults to `False`): + Whether or not to return only the number of trainable parameters. + exclude_embeddings (`bool`, *optional*, defaults to `False`): + Whether or not to return only the number of non-embedding parameters. + + Returns: + `int`: The number of parameters. 
+ + Example: + + ```py + from diffusers import UNet2DConditionModel + + model_id = "runwayml/stable-diffusion-v1-5" + unet = UNet2DConditionModel.from_pretrained(model_id, subfolder="unet") + unet.num_parameters(only_trainable=True) + 859520964 + ``` + """ + + if exclude_embeddings: + embedding_param_names = [ + f"{name}.weight" + for name, module_type in self.named_modules() + if isinstance(module_type, torch.nn.Embedding) + ] + non_embedding_parameters = [ + parameter for name, parameter in self.named_parameters() if name not in embedding_param_names + ] + return sum(p.numel() for p in non_embedding_parameters if p.requires_grad or not only_trainable) + else: + return sum(p.numel() for p in self.parameters() if p.requires_grad or not only_trainable) + + def _convert_deprecated_attention_blocks(self, state_dict): + deprecated_attention_block_paths = [] + + def recursive_find_attn_block(name, module): + if hasattr(module, "_from_deprecated_attn_block") and module._from_deprecated_attn_block: + deprecated_attention_block_paths.append(name) + + for sub_name, sub_module in module.named_children(): + sub_name = sub_name if name == "" else f"{name}.{sub_name}" + recursive_find_attn_block(sub_name, sub_module) + + recursive_find_attn_block("", self) + + # NOTE: we have to check if the deprecated parameters are in the state dict + # because it is possible we are loading from a state dict that was already + # converted + + for path in deprecated_attention_block_paths: + # group_norm path stays the same + + # query -> to_q + if f"{path}.query.weight" in state_dict: + state_dict[f"{path}.to_q.weight"] = state_dict.pop(f"{path}.query.weight") + if f"{path}.query.bias" in state_dict: + state_dict[f"{path}.to_q.bias"] = state_dict.pop(f"{path}.query.bias") + + # key -> to_k + if f"{path}.key.weight" in state_dict: + state_dict[f"{path}.to_k.weight"] = state_dict.pop(f"{path}.key.weight") + if f"{path}.key.bias" in state_dict: + state_dict[f"{path}.to_k.bias"] = state_dict.pop(f"{path}.key.bias") + + # value -> to_v + if f"{path}.value.weight" in state_dict: + state_dict[f"{path}.to_v.weight"] = state_dict.pop(f"{path}.value.weight") + if f"{path}.value.bias" in state_dict: + state_dict[f"{path}.to_v.bias"] = state_dict.pop(f"{path}.value.bias") + + # proj_attn -> to_out.0 + if f"{path}.proj_attn.weight" in state_dict: + state_dict[f"{path}.to_out.0.weight"] = state_dict.pop(f"{path}.proj_attn.weight") + if f"{path}.proj_attn.bias" in state_dict: + state_dict[f"{path}.to_out.0.bias"] = state_dict.pop(f"{path}.proj_attn.bias") + + def _temp_convert_self_to_deprecated_attention_blocks(self): + deprecated_attention_block_modules = [] + + def recursive_find_attn_block(module): + if hasattr(module, "_from_deprecated_attn_block") and module._from_deprecated_attn_block: + deprecated_attention_block_modules.append(module) + + for sub_module in module.children(): + recursive_find_attn_block(sub_module) + + recursive_find_attn_block(self) + + for module in deprecated_attention_block_modules: + module.query = module.to_q + module.key = module.to_k + module.value = module.to_v + module.proj_attn = module.to_out[0] + + # We don't _have_ to delete the old attributes, but it's helpful to ensure + # that _all_ the weights are loaded into the new attributes and we're not + # making an incorrect assumption that this model should be converted when + # it really shouldn't be. 
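+            # Clarification: the attributes removed below are the *new* names (`to_q`/`to_k`/`to_v`/`to_out`);
+            # they were aliased onto the deprecated names (`query`/`key`/`value`/`proj_attn`) just above, and
+            # `_undo_temp_convert_self_to_deprecated_attention_blocks` recreates them once loading is done.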
+ del module.to_q + del module.to_k + del module.to_v + del module.to_out + + def _undo_temp_convert_self_to_deprecated_attention_blocks(self): + deprecated_attention_block_modules = [] + + def recursive_find_attn_block(module): + if hasattr(module, "_from_deprecated_attn_block") and module._from_deprecated_attn_block: + deprecated_attention_block_modules.append(module) + + for sub_module in module.children(): + recursive_find_attn_block(sub_module) + + recursive_find_attn_block(self) + + for module in deprecated_attention_block_modules: + module.to_q = module.query + module.to_k = module.key + module.to_v = module.value + module.to_out = nn.ModuleList([module.proj_attn, nn.Dropout(module.dropout)]) + + del module.query + del module.key + del module.value + del module.proj_attn diff --git a/gf/sd-webui-rich-text/diffusers_official/models/prior_transformer.py b/gf/sd-webui-rich-text/diffusers_official/models/prior_transformer.py new file mode 100644 index 0000000000000000000000000000000000000000..9f3c61dd7561742114947e3419c19fec8c2a824f --- /dev/null +++ b/gf/sd-webui-rich-text/diffusers_official/models/prior_transformer.py @@ -0,0 +1,364 @@ +from dataclasses import dataclass +from typing import Dict, Optional, Union + +import torch +import torch.nn.functional as F +from torch import nn + +from ..configuration_utils import ConfigMixin, register_to_config +from ..utils import BaseOutput +from .attention import BasicTransformerBlock +from .attention_processor import AttentionProcessor, AttnProcessor +from .embeddings import TimestepEmbedding, Timesteps +from .modeling_utils import ModelMixin + + +@dataclass +class PriorTransformerOutput(BaseOutput): + """ + The output of [`PriorTransformer`]. + + Args: + predicted_image_embedding (`torch.FloatTensor` of shape `(batch_size, embedding_dim)`): + The predicted CLIP image embedding conditioned on the CLIP text embedding input. + """ + + predicted_image_embedding: torch.FloatTensor + + +class PriorTransformer(ModelMixin, ConfigMixin): + """ + A Prior Transformer model. + + Parameters: + num_attention_heads (`int`, *optional*, defaults to 32): The number of heads to use for multi-head attention. + attention_head_dim (`int`, *optional*, defaults to 64): The number of channels in each head. + num_layers (`int`, *optional*, defaults to 20): The number of layers of Transformer blocks to use. + embedding_dim (`int`, *optional*, defaults to 768): The dimension of the model input `hidden_states` + num_embeddings (`int`, *optional*, defaults to 77): + The number of embeddings of the model input `hidden_states` + additional_embeddings (`int`, *optional*, defaults to 4): The number of additional tokens appended to the + projected `hidden_states`. The actual length of the used `hidden_states` is `num_embeddings + + additional_embeddings`. + dropout (`float`, *optional*, defaults to 0.0): The dropout probability to use. + time_embed_act_fn (`str`, *optional*, defaults to 'silu'): + The activation function to use to create timestep embeddings. + norm_in_type (`str`, *optional*, defaults to None): The normalization layer to apply on hidden states before + passing to Transformer blocks. Set it to `None` if normalization is not needed. + embedding_proj_norm_type (`str`, *optional*, defaults to None): + The normalization layer to apply on the input `proj_embedding`. Set it to `None` if normalization is not + needed. + encoder_hid_proj_type (`str`, *optional*, defaults to `linear`): + The projection layer to apply on the input `encoder_hidden_states`. 
Set it to `None` if + `encoder_hidden_states` is `None`. + added_emb_type (`str`, *optional*, defaults to `prd`): Additional embeddings to condition the model. + Choose from `prd` or `None`. if choose `prd`, it will prepend a token indicating the (quantized) dot + product between the text embedding and image embedding as proposed in the unclip paper + https://arxiv.org/abs/2204.06125 If it is `None`, no additional embeddings will be prepended. + time_embed_dim (`int, *optional*, defaults to None): The dimension of timestep embeddings. + If None, will be set to `num_attention_heads * attention_head_dim` + embedding_proj_dim (`int`, *optional*, default to None): + The dimension of `proj_embedding`. If None, will be set to `embedding_dim`. + clip_embed_dim (`int`, *optional*, default to None): + The dimension of the output. If None, will be set to `embedding_dim`. + """ + + @register_to_config + def __init__( + self, + num_attention_heads: int = 32, + attention_head_dim: int = 64, + num_layers: int = 20, + embedding_dim: int = 768, + num_embeddings=77, + additional_embeddings=4, + dropout: float = 0.0, + time_embed_act_fn: str = "silu", + norm_in_type: Optional[str] = None, # layer + embedding_proj_norm_type: Optional[str] = None, # layer + encoder_hid_proj_type: Optional[str] = "linear", # linear + added_emb_type: Optional[str] = "prd", # prd + time_embed_dim: Optional[int] = None, + embedding_proj_dim: Optional[int] = None, + clip_embed_dim: Optional[int] = None, + ): + super().__init__() + self.num_attention_heads = num_attention_heads + self.attention_head_dim = attention_head_dim + inner_dim = num_attention_heads * attention_head_dim + self.additional_embeddings = additional_embeddings + + time_embed_dim = time_embed_dim or inner_dim + embedding_proj_dim = embedding_proj_dim or embedding_dim + clip_embed_dim = clip_embed_dim or embedding_dim + + self.time_proj = Timesteps(inner_dim, True, 0) + self.time_embedding = TimestepEmbedding(inner_dim, time_embed_dim, out_dim=inner_dim, act_fn=time_embed_act_fn) + + self.proj_in = nn.Linear(embedding_dim, inner_dim) + + if embedding_proj_norm_type is None: + self.embedding_proj_norm = None + elif embedding_proj_norm_type == "layer": + self.embedding_proj_norm = nn.LayerNorm(embedding_proj_dim) + else: + raise ValueError(f"unsupported embedding_proj_norm_type: {embedding_proj_norm_type}") + + self.embedding_proj = nn.Linear(embedding_proj_dim, inner_dim) + + if encoder_hid_proj_type is None: + self.encoder_hidden_states_proj = None + elif encoder_hid_proj_type == "linear": + self.encoder_hidden_states_proj = nn.Linear(embedding_dim, inner_dim) + else: + raise ValueError(f"unsupported encoder_hid_proj_type: {encoder_hid_proj_type}") + + self.positional_embedding = nn.Parameter(torch.zeros(1, num_embeddings + additional_embeddings, inner_dim)) + + if added_emb_type == "prd": + self.prd_embedding = nn.Parameter(torch.zeros(1, 1, inner_dim)) + elif added_emb_type is None: + self.prd_embedding = None + else: + raise ValueError( + f"`added_emb_type`: {added_emb_type} is not supported. Make sure to choose one of `'prd'` or `None`." 
+ ) + + self.transformer_blocks = nn.ModuleList( + [ + BasicTransformerBlock( + inner_dim, + num_attention_heads, + attention_head_dim, + dropout=dropout, + activation_fn="gelu", + attention_bias=True, + ) + for d in range(num_layers) + ] + ) + + if norm_in_type == "layer": + self.norm_in = nn.LayerNorm(inner_dim) + elif norm_in_type is None: + self.norm_in = None + else: + raise ValueError(f"Unsupported norm_in_type: {norm_in_type}.") + + self.norm_out = nn.LayerNorm(inner_dim) + + self.proj_to_clip_embeddings = nn.Linear(inner_dim, clip_embed_dim) + + causal_attention_mask = torch.full( + [num_embeddings + additional_embeddings, num_embeddings + additional_embeddings], -10000.0 + ) + causal_attention_mask.triu_(1) + causal_attention_mask = causal_attention_mask[None, ...] + self.register_buffer("causal_attention_mask", causal_attention_mask, persistent=False) + + self.clip_mean = nn.Parameter(torch.zeros(1, clip_embed_dim)) + self.clip_std = nn.Parameter(torch.zeros(1, clip_embed_dim)) + + @property + # Copied from diffusers.models.unet_2d_condition.UNet2DConditionModel.attn_processors + def attn_processors(self) -> Dict[str, AttentionProcessor]: + r""" + Returns: + `dict` of attention processors: A dictionary containing all attention processors used in the model with + indexed by its weight name. + """ + # set recursively + processors = {} + + def fn_recursive_add_processors(name: str, module: torch.nn.Module, processors: Dict[str, AttentionProcessor]): + if hasattr(module, "set_processor"): + processors[f"{name}.processor"] = module.processor + + for sub_name, child in module.named_children(): + fn_recursive_add_processors(f"{name}.{sub_name}", child, processors) + + return processors + + for name, module in self.named_children(): + fn_recursive_add_processors(name, module, processors) + + return processors + + # Copied from diffusers.models.unet_2d_condition.UNet2DConditionModel.set_attn_processor + def set_attn_processor(self, processor: Union[AttentionProcessor, Dict[str, AttentionProcessor]]): + r""" + Sets the attention processor to use to compute attention. + + Parameters: + processor (`dict` of `AttentionProcessor` or only `AttentionProcessor`): + The instantiated processor class or a dictionary of processor classes that will be set as the processor + for **all** `Attention` layers. + + If `processor` is a dict, the key needs to define the path to the corresponding cross attention + processor. This is strongly recommended when setting trainable attention processors. + + """ + count = len(self.attn_processors.keys()) + + if isinstance(processor, dict) and len(processor) != count: + raise ValueError( + f"A dict of processors was passed, but the number of processors {len(processor)} does not match the" + f" number of attention layers: {count}. Please make sure to pass {count} processor classes." 
+ ) + + def fn_recursive_attn_processor(name: str, module: torch.nn.Module, processor): + if hasattr(module, "set_processor"): + if not isinstance(processor, dict): + module.set_processor(processor) + else: + module.set_processor(processor.pop(f"{name}.processor")) + + for sub_name, child in module.named_children(): + fn_recursive_attn_processor(f"{name}.{sub_name}", child, processor) + + for name, module in self.named_children(): + fn_recursive_attn_processor(name, module, processor) + + # Copied from diffusers.models.unet_2d_condition.UNet2DConditionModel.set_default_attn_processor + def set_default_attn_processor(self): + """ + Disables custom attention processors and sets the default attention implementation. + """ + self.set_attn_processor(AttnProcessor()) + + def forward( + self, + hidden_states, + timestep: Union[torch.Tensor, float, int], + proj_embedding: torch.FloatTensor, + encoder_hidden_states: Optional[torch.FloatTensor] = None, + attention_mask: Optional[torch.BoolTensor] = None, + return_dict: bool = True, + ): + """ + The [`PriorTransformer`] forward method. + + Args: + hidden_states (`torch.FloatTensor` of shape `(batch_size, embedding_dim)`): + The currently predicted image embeddings. + timestep (`torch.LongTensor`): + Current denoising step. + proj_embedding (`torch.FloatTensor` of shape `(batch_size, embedding_dim)`): + Projected embedding vector the denoising process is conditioned on. + encoder_hidden_states (`torch.FloatTensor` of shape `(batch_size, num_embeddings, embedding_dim)`): + Hidden states of the text embeddings the denoising process is conditioned on. + attention_mask (`torch.BoolTensor` of shape `(batch_size, num_embeddings)`): + Text mask for the text embeddings. + return_dict (`bool`, *optional*, defaults to `True`): + Whether or not to return a [`~models.prior_transformer.PriorTransformerOutput`] instead of a plain + tuple. + + Returns: + [`~models.prior_transformer.PriorTransformerOutput`] or `tuple`: + If return_dict is True, a [`~models.prior_transformer.PriorTransformerOutput`] is returned, otherwise a + tuple is returned where the first element is the sample tensor. + """ + batch_size = hidden_states.shape[0] + + timesteps = timestep + if not torch.is_tensor(timesteps): + timesteps = torch.tensor([timesteps], dtype=torch.long, device=hidden_states.device) + elif torch.is_tensor(timesteps) and len(timesteps.shape) == 0: + timesteps = timesteps[None].to(hidden_states.device) + + # broadcast to batch dimension in a way that's compatible with ONNX/Core ML + timesteps = timesteps * torch.ones(batch_size, dtype=timesteps.dtype, device=timesteps.device) + + timesteps_projected = self.time_proj(timesteps) + + # timesteps does not contain any weights and will always return f32 tensors + # but time_embedding might be fp16, so we need to cast here. 
+ timesteps_projected = timesteps_projected.to(dtype=self.dtype) + time_embeddings = self.time_embedding(timesteps_projected) + + if self.embedding_proj_norm is not None: + proj_embedding = self.embedding_proj_norm(proj_embedding) + + proj_embeddings = self.embedding_proj(proj_embedding) + if self.encoder_hidden_states_proj is not None and encoder_hidden_states is not None: + encoder_hidden_states = self.encoder_hidden_states_proj(encoder_hidden_states) + elif self.encoder_hidden_states_proj is not None and encoder_hidden_states is None: + raise ValueError("`encoder_hidden_states_proj` requires `encoder_hidden_states` to be set") + + hidden_states = self.proj_in(hidden_states) + + positional_embeddings = self.positional_embedding.to(hidden_states.dtype) + + additional_embeds = [] + additional_embeddings_len = 0 + + if encoder_hidden_states is not None: + additional_embeds.append(encoder_hidden_states) + additional_embeddings_len += encoder_hidden_states.shape[1] + + if len(proj_embeddings.shape) == 2: + proj_embeddings = proj_embeddings[:, None, :] + + if len(hidden_states.shape) == 2: + hidden_states = hidden_states[:, None, :] + + additional_embeds = additional_embeds + [ + proj_embeddings, + time_embeddings[:, None, :], + hidden_states, + ] + + if self.prd_embedding is not None: + prd_embedding = self.prd_embedding.to(hidden_states.dtype).expand(batch_size, -1, -1) + additional_embeds.append(prd_embedding) + + hidden_states = torch.cat( + additional_embeds, + dim=1, + ) + + # Allow positional_embedding to not include the `addtional_embeddings` and instead pad it with zeros for these additional tokens + additional_embeddings_len = additional_embeddings_len + proj_embeddings.shape[1] + 1 + if positional_embeddings.shape[1] < hidden_states.shape[1]: + positional_embeddings = F.pad( + positional_embeddings, + ( + 0, + 0, + additional_embeddings_len, + self.prd_embedding.shape[1] if self.prd_embedding is not None else 0, + ), + value=0.0, + ) + + hidden_states = hidden_states + positional_embeddings + + if attention_mask is not None: + attention_mask = (1 - attention_mask.to(hidden_states.dtype)) * -10000.0 + attention_mask = F.pad(attention_mask, (0, self.additional_embeddings), value=0.0) + attention_mask = (attention_mask[:, None, :] + self.causal_attention_mask).to(hidden_states.dtype) + attention_mask = attention_mask.repeat_interleave(self.config.num_attention_heads, dim=0) + + if self.norm_in is not None: + hidden_states = self.norm_in(hidden_states) + + for block in self.transformer_blocks: + hidden_states = block(hidden_states, attention_mask=attention_mask) + + hidden_states = self.norm_out(hidden_states) + + if self.prd_embedding is not None: + hidden_states = hidden_states[:, -1] + else: + hidden_states = hidden_states[:, additional_embeddings_len:] + + predicted_image_embedding = self.proj_to_clip_embeddings(hidden_states) + + if not return_dict: + return (predicted_image_embedding,) + + return PriorTransformerOutput(predicted_image_embedding=predicted_image_embedding) + + def post_process_latents(self, prior_latents): + prior_latents = (prior_latents * self.clip_std) + self.clip_mean + return prior_latents diff --git a/gf/sd-webui-rich-text/diffusers_official/models/resnet.py b/gf/sd-webui-rich-text/diffusers_official/models/resnet.py new file mode 100644 index 0000000000000000000000000000000000000000..24c3b07e7cb65447ad996b00066d42a74700dd97 --- /dev/null +++ b/gf/sd-webui-rich-text/diffusers_official/models/resnet.py @@ -0,0 +1,877 @@ +# Copyright 2023 The HuggingFace Team. 
All rights reserved. +# `TemporalConvLayer` Copyright 2023 Alibaba DAMO-VILAB, The ModelScope Team and The HuggingFace Team. All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +from functools import partial +from typing import Optional + +import torch +import torch.nn as nn +import torch.nn.functional as F + +from .activations import get_activation +from .attention import AdaGroupNorm +from .attention_processor import SpatialNorm + + +class Upsample1D(nn.Module): + """A 1D upsampling layer with an optional convolution. + + Parameters: + channels (`int`): + number of channels in the inputs and outputs. + use_conv (`bool`, default `False`): + option to use a convolution. + use_conv_transpose (`bool`, default `False`): + option to use a convolution transpose. + out_channels (`int`, optional): + number of output channels. Defaults to `channels`. + """ + + def __init__(self, channels, use_conv=False, use_conv_transpose=False, out_channels=None, name="conv"): + super().__init__() + self.channels = channels + self.out_channels = out_channels or channels + self.use_conv = use_conv + self.use_conv_transpose = use_conv_transpose + self.name = name + + self.conv = None + if use_conv_transpose: + self.conv = nn.ConvTranspose1d(channels, self.out_channels, 4, 2, 1) + elif use_conv: + self.conv = nn.Conv1d(self.channels, self.out_channels, 3, padding=1) + + def forward(self, inputs): + assert inputs.shape[1] == self.channels + if self.use_conv_transpose: + return self.conv(inputs) + + outputs = F.interpolate(inputs, scale_factor=2.0, mode="nearest") + + if self.use_conv: + outputs = self.conv(outputs) + + return outputs + + +class Downsample1D(nn.Module): + """A 1D downsampling layer with an optional convolution. + + Parameters: + channels (`int`): + number of channels in the inputs and outputs. + use_conv (`bool`, default `False`): + option to use a convolution. + out_channels (`int`, optional): + number of output channels. Defaults to `channels`. + padding (`int`, default `1`): + padding for the convolution. + """ + + def __init__(self, channels, use_conv=False, out_channels=None, padding=1, name="conv"): + super().__init__() + self.channels = channels + self.out_channels = out_channels or channels + self.use_conv = use_conv + self.padding = padding + stride = 2 + self.name = name + + if use_conv: + self.conv = nn.Conv1d(self.channels, self.out_channels, 3, stride=stride, padding=padding) + else: + assert self.channels == self.out_channels + self.conv = nn.AvgPool1d(kernel_size=stride, stride=stride) + + def forward(self, inputs): + assert inputs.shape[1] == self.channels + return self.conv(inputs) + + +class Upsample2D(nn.Module): + """A 2D upsampling layer with an optional convolution. + + Parameters: + channels (`int`): + number of channels in the inputs and outputs. + use_conv (`bool`, default `False`): + option to use a convolution. + use_conv_transpose (`bool`, default `False`): + option to use a convolution transpose. 
+ out_channels (`int`, optional): + number of output channels. Defaults to `channels`. + """ + + def __init__(self, channels, use_conv=False, use_conv_transpose=False, out_channels=None, name="conv"): + super().__init__() + self.channels = channels + self.out_channels = out_channels or channels + self.use_conv = use_conv + self.use_conv_transpose = use_conv_transpose + self.name = name + + conv = None + if use_conv_transpose: + conv = nn.ConvTranspose2d(channels, self.out_channels, 4, 2, 1) + elif use_conv: + conv = nn.Conv2d(self.channels, self.out_channels, 3, padding=1) + + # TODO(Suraj, Patrick) - clean up after weight dicts are correctly renamed + if name == "conv": + self.conv = conv + else: + self.Conv2d_0 = conv + + def forward(self, hidden_states, output_size=None): + assert hidden_states.shape[1] == self.channels + + if self.use_conv_transpose: + return self.conv(hidden_states) + + # Cast to float32 to as 'upsample_nearest2d_out_frame' op does not support bfloat16 + # TODO(Suraj): Remove this cast once the issue is fixed in PyTorch + # https://github.com/pytorch/pytorch/issues/86679 + dtype = hidden_states.dtype + if dtype == torch.bfloat16: + hidden_states = hidden_states.to(torch.float32) + + # upsample_nearest_nhwc fails with large batch sizes. see https://github.com/huggingface/diffusers/issues/984 + if hidden_states.shape[0] >= 64: + hidden_states = hidden_states.contiguous() + + # if `output_size` is passed we force the interpolation output + # size and do not make use of `scale_factor=2` + if output_size is None: + hidden_states = F.interpolate(hidden_states, scale_factor=2.0, mode="nearest") + else: + hidden_states = F.interpolate(hidden_states, size=output_size, mode="nearest") + + # If the input is bfloat16, we cast back to bfloat16 + if dtype == torch.bfloat16: + hidden_states = hidden_states.to(dtype) + + # TODO(Suraj, Patrick) - clean up after weight dicts are correctly renamed + if self.use_conv: + if self.name == "conv": + hidden_states = self.conv(hidden_states) + else: + hidden_states = self.Conv2d_0(hidden_states) + + return hidden_states + + +class Downsample2D(nn.Module): + """A 2D downsampling layer with an optional convolution. + + Parameters: + channels (`int`): + number of channels in the inputs and outputs. + use_conv (`bool`, default `False`): + option to use a convolution. + out_channels (`int`, optional): + number of output channels. Defaults to `channels`. + padding (`int`, default `1`): + padding for the convolution. 
+ """ + + def __init__(self, channels, use_conv=False, out_channels=None, padding=1, name="conv"): + super().__init__() + self.channels = channels + self.out_channels = out_channels or channels + self.use_conv = use_conv + self.padding = padding + stride = 2 + self.name = name + + if use_conv: + conv = nn.Conv2d(self.channels, self.out_channels, 3, stride=stride, padding=padding) + else: + assert self.channels == self.out_channels + conv = nn.AvgPool2d(kernel_size=stride, stride=stride) + + # TODO(Suraj, Patrick) - clean up after weight dicts are correctly renamed + if name == "conv": + self.Conv2d_0 = conv + self.conv = conv + elif name == "Conv2d_0": + self.conv = conv + else: + self.conv = conv + + def forward(self, hidden_states): + assert hidden_states.shape[1] == self.channels + if self.use_conv and self.padding == 0: + pad = (0, 1, 0, 1) + hidden_states = F.pad(hidden_states, pad, mode="constant", value=0) + + assert hidden_states.shape[1] == self.channels + hidden_states = self.conv(hidden_states) + + return hidden_states + + +class FirUpsample2D(nn.Module): + """A 2D FIR upsampling layer with an optional convolution. + + Parameters: + channels (`int`): + number of channels in the inputs and outputs. + use_conv (`bool`, default `False`): + option to use a convolution. + out_channels (`int`, optional): + number of output channels. Defaults to `channels`. + fir_kernel (`tuple`, default `(1, 3, 3, 1)`): + kernel for the FIR filter. + """ + + def __init__(self, channels=None, out_channels=None, use_conv=False, fir_kernel=(1, 3, 3, 1)): + super().__init__() + out_channels = out_channels if out_channels else channels + if use_conv: + self.Conv2d_0 = nn.Conv2d(channels, out_channels, kernel_size=3, stride=1, padding=1) + self.use_conv = use_conv + self.fir_kernel = fir_kernel + self.out_channels = out_channels + + def _upsample_2d(self, hidden_states, weight=None, kernel=None, factor=2, gain=1): + """Fused `upsample_2d()` followed by `Conv2d()`. + + Padding is performed only once at the beginning, not between the operations. The fused op is considerably more + efficient than performing the same calculation using standard TensorFlow ops. It supports gradients of + arbitrary order. + + Args: + hidden_states: Input tensor of the shape `[N, C, H, W]` or `[N, H, W, C]`. + weight: Weight tensor of the shape `[filterH, filterW, inChannels, + outChannels]`. Grouped convolution can be performed by `inChannels = x.shape[0] // numGroups`. + kernel: FIR filter of the shape `[firH, firW]` or `[firN]` + (separable). The default is `[1] * factor`, which corresponds to nearest-neighbor upsampling. + factor: Integer upsampling factor (default: 2). + gain: Scaling factor for signal magnitude (default: 1.0). + + Returns: + output: Tensor of the shape `[N, C, H * factor, W * factor]` or `[N, H * factor, W * factor, C]`, and same + datatype as `hidden_states`. + """ + + assert isinstance(factor, int) and factor >= 1 + + # Setup filter kernel. + if kernel is None: + kernel = [1] * factor + + # setup kernel + kernel = torch.tensor(kernel, dtype=torch.float32) + if kernel.ndim == 1: + kernel = torch.outer(kernel, kernel) + kernel /= torch.sum(kernel) + + kernel = kernel * (gain * (factor**2)) + + if self.use_conv: + convH = weight.shape[2] + convW = weight.shape[3] + inC = weight.shape[1] + + pad_value = (kernel.shape[0] - factor) - (convW - 1) + + stride = (factor, factor) + # Determine data dimensions. 
+ output_shape = ( + (hidden_states.shape[2] - 1) * factor + convH, + (hidden_states.shape[3] - 1) * factor + convW, + ) + output_padding = ( + output_shape[0] - (hidden_states.shape[2] - 1) * stride[0] - convH, + output_shape[1] - (hidden_states.shape[3] - 1) * stride[1] - convW, + ) + assert output_padding[0] >= 0 and output_padding[1] >= 0 + num_groups = hidden_states.shape[1] // inC + + # Transpose weights. + weight = torch.reshape(weight, (num_groups, -1, inC, convH, convW)) + weight = torch.flip(weight, dims=[3, 4]).permute(0, 2, 1, 3, 4) + weight = torch.reshape(weight, (num_groups * inC, -1, convH, convW)) + + inverse_conv = F.conv_transpose2d( + hidden_states, weight, stride=stride, output_padding=output_padding, padding=0 + ) + + output = upfirdn2d_native( + inverse_conv, + torch.tensor(kernel, device=inverse_conv.device), + pad=((pad_value + 1) // 2 + factor - 1, pad_value // 2 + 1), + ) + else: + pad_value = kernel.shape[0] - factor + output = upfirdn2d_native( + hidden_states, + torch.tensor(kernel, device=hidden_states.device), + up=factor, + pad=((pad_value + 1) // 2 + factor - 1, pad_value // 2), + ) + + return output + + def forward(self, hidden_states): + if self.use_conv: + height = self._upsample_2d(hidden_states, self.Conv2d_0.weight, kernel=self.fir_kernel) + height = height + self.Conv2d_0.bias.reshape(1, -1, 1, 1) + else: + height = self._upsample_2d(hidden_states, kernel=self.fir_kernel, factor=2) + + return height + + +class FirDownsample2D(nn.Module): + """A 2D FIR downsampling layer with an optional convolution. + + Parameters: + channels (`int`): + number of channels in the inputs and outputs. + use_conv (`bool`, default `False`): + option to use a convolution. + out_channels (`int`, optional): + number of output channels. Defaults to `channels`. + fir_kernel (`tuple`, default `(1, 3, 3, 1)`): + kernel for the FIR filter. + """ + + def __init__(self, channels=None, out_channels=None, use_conv=False, fir_kernel=(1, 3, 3, 1)): + super().__init__() + out_channels = out_channels if out_channels else channels + if use_conv: + self.Conv2d_0 = nn.Conv2d(channels, out_channels, kernel_size=3, stride=1, padding=1) + self.fir_kernel = fir_kernel + self.use_conv = use_conv + self.out_channels = out_channels + + def _downsample_2d(self, hidden_states, weight=None, kernel=None, factor=2, gain=1): + """Fused `Conv2d()` followed by `downsample_2d()`. + Padding is performed only once at the beginning, not between the operations. The fused op is considerably more + efficient than performing the same calculation using standard TensorFlow ops. It supports gradients of + arbitrary order. + + Args: + hidden_states: Input tensor of the shape `[N, C, H, W]` or `[N, H, W, C]`. + weight: + Weight tensor of the shape `[filterH, filterW, inChannels, outChannels]`. Grouped convolution can be + performed by `inChannels = x.shape[0] // numGroups`. + kernel: FIR filter of the shape `[firH, firW]` or `[firN]` (separable). The default is `[1] * + factor`, which corresponds to average pooling. + factor: Integer downsampling factor (default: 2). + gain: Scaling factor for signal magnitude (default: 1.0). + + Returns: + output: Tensor of the shape `[N, C, H // factor, W // factor]` or `[N, H // factor, W // factor, C]`, and + same datatype as `x`. 
+ """ + + assert isinstance(factor, int) and factor >= 1 + if kernel is None: + kernel = [1] * factor + + # setup kernel + kernel = torch.tensor(kernel, dtype=torch.float32) + if kernel.ndim == 1: + kernel = torch.outer(kernel, kernel) + kernel /= torch.sum(kernel) + + kernel = kernel * gain + + if self.use_conv: + _, _, convH, convW = weight.shape + pad_value = (kernel.shape[0] - factor) + (convW - 1) + stride_value = [factor, factor] + upfirdn_input = upfirdn2d_native( + hidden_states, + torch.tensor(kernel, device=hidden_states.device), + pad=((pad_value + 1) // 2, pad_value // 2), + ) + output = F.conv2d(upfirdn_input, weight, stride=stride_value, padding=0) + else: + pad_value = kernel.shape[0] - factor + output = upfirdn2d_native( + hidden_states, + torch.tensor(kernel, device=hidden_states.device), + down=factor, + pad=((pad_value + 1) // 2, pad_value // 2), + ) + + return output + + def forward(self, hidden_states): + if self.use_conv: + downsample_input = self._downsample_2d(hidden_states, weight=self.Conv2d_0.weight, kernel=self.fir_kernel) + hidden_states = downsample_input + self.Conv2d_0.bias.reshape(1, -1, 1, 1) + else: + hidden_states = self._downsample_2d(hidden_states, kernel=self.fir_kernel, factor=2) + + return hidden_states + + +# downsample/upsample layer used in k-upscaler, might be able to use FirDownsample2D/DirUpsample2D instead +class KDownsample2D(nn.Module): + def __init__(self, pad_mode="reflect"): + super().__init__() + self.pad_mode = pad_mode + kernel_1d = torch.tensor([[1 / 8, 3 / 8, 3 / 8, 1 / 8]]) + self.pad = kernel_1d.shape[1] // 2 - 1 + self.register_buffer("kernel", kernel_1d.T @ kernel_1d, persistent=False) + + def forward(self, inputs): + inputs = F.pad(inputs, (self.pad,) * 4, self.pad_mode) + weight = inputs.new_zeros([inputs.shape[1], inputs.shape[1], self.kernel.shape[0], self.kernel.shape[1]]) + indices = torch.arange(inputs.shape[1], device=inputs.device) + kernel = self.kernel.to(weight)[None, :].expand(inputs.shape[1], -1, -1) + weight[indices, indices] = kernel + return F.conv2d(inputs, weight, stride=2) + + +class KUpsample2D(nn.Module): + def __init__(self, pad_mode="reflect"): + super().__init__() + self.pad_mode = pad_mode + kernel_1d = torch.tensor([[1 / 8, 3 / 8, 3 / 8, 1 / 8]]) * 2 + self.pad = kernel_1d.shape[1] // 2 - 1 + self.register_buffer("kernel", kernel_1d.T @ kernel_1d, persistent=False) + + def forward(self, inputs): + inputs = F.pad(inputs, ((self.pad + 1) // 2,) * 4, self.pad_mode) + weight = inputs.new_zeros([inputs.shape[1], inputs.shape[1], self.kernel.shape[0], self.kernel.shape[1]]) + indices = torch.arange(inputs.shape[1], device=inputs.device) + kernel = self.kernel.to(weight)[None, :].expand(inputs.shape[1], -1, -1) + weight[indices, indices] = kernel + return F.conv_transpose2d(inputs, weight, stride=2, padding=self.pad * 2 + 1) + + +class ResnetBlock2D(nn.Module): + r""" + A Resnet block. + + Parameters: + in_channels (`int`): The number of channels in the input. + out_channels (`int`, *optional*, default to be `None`): + The number of output channels for the first conv2d layer. If None, same as `in_channels`. + dropout (`float`, *optional*, defaults to `0.0`): The dropout probability to use. + temb_channels (`int`, *optional*, default to `512`): the number of channels in timestep embedding. + groups (`int`, *optional*, default to `32`): The number of groups to use for the first normalization layer. 
+ groups_out (`int`, *optional*, default to None): + The number of groups to use for the second normalization layer. if set to None, same as `groups`. + eps (`float`, *optional*, defaults to `1e-6`): The epsilon to use for the normalization. + non_linearity (`str`, *optional*, default to `"swish"`): the activation function to use. + time_embedding_norm (`str`, *optional*, default to `"default"` ): Time scale shift config. + By default, apply timestep embedding conditioning with a simple shift mechanism. Choose "scale_shift" or + "ada_group" for a stronger conditioning with scale and shift. + kernel (`torch.FloatTensor`, optional, default to None): FIR filter, see + [`~models.resnet.FirUpsample2D`] and [`~models.resnet.FirDownsample2D`]. + output_scale_factor (`float`, *optional*, default to be `1.0`): the scale factor to use for the output. + use_in_shortcut (`bool`, *optional*, default to `True`): + If `True`, add a 1x1 nn.conv2d layer for skip-connection. + up (`bool`, *optional*, default to `False`): If `True`, add an upsample layer. + down (`bool`, *optional*, default to `False`): If `True`, add a downsample layer. + conv_shortcut_bias (`bool`, *optional*, default to `True`): If `True`, adds a learnable bias to the + `conv_shortcut` output. + conv_2d_out_channels (`int`, *optional*, default to `None`): the number of channels in the output. + If None, same as `out_channels`. + """ + + def __init__( + self, + *, + in_channels, + out_channels=None, + conv_shortcut=False, + dropout=0.0, + temb_channels=512, + groups=32, + groups_out=None, + pre_norm=True, + eps=1e-6, + non_linearity="swish", + skip_time_act=False, + time_embedding_norm="default", # default, scale_shift, ada_group, spatial + kernel=None, + output_scale_factor=1.0, + use_in_shortcut=None, + up=False, + down=False, + conv_shortcut_bias: bool = True, + conv_2d_out_channels: Optional[int] = None, + ): + super().__init__() + self.pre_norm = pre_norm + self.pre_norm = True + self.in_channels = in_channels + out_channels = in_channels if out_channels is None else out_channels + self.out_channels = out_channels + self.use_conv_shortcut = conv_shortcut + self.up = up + self.down = down + self.output_scale_factor = output_scale_factor + self.time_embedding_norm = time_embedding_norm + self.skip_time_act = skip_time_act + + if groups_out is None: + groups_out = groups + + if self.time_embedding_norm == "ada_group": + self.norm1 = AdaGroupNorm(temb_channels, in_channels, groups, eps=eps) + elif self.time_embedding_norm == "spatial": + self.norm1 = SpatialNorm(in_channels, temb_channels) + else: + self.norm1 = torch.nn.GroupNorm(num_groups=groups, num_channels=in_channels, eps=eps, affine=True) + + self.conv1 = torch.nn.Conv2d(in_channels, out_channels, kernel_size=3, stride=1, padding=1) + + if temb_channels is not None: + if self.time_embedding_norm == "default": + self.time_emb_proj = torch.nn.Linear(temb_channels, out_channels) + elif self.time_embedding_norm == "scale_shift": + self.time_emb_proj = torch.nn.Linear(temb_channels, 2 * out_channels) + elif self.time_embedding_norm == "ada_group" or self.time_embedding_norm == "spatial": + self.time_emb_proj = None + else: + raise ValueError(f"unknown time_embedding_norm : {self.time_embedding_norm} ") + else: + self.time_emb_proj = None + + if self.time_embedding_norm == "ada_group": + self.norm2 = AdaGroupNorm(temb_channels, out_channels, groups_out, eps=eps) + elif self.time_embedding_norm == "spatial": + self.norm2 = SpatialNorm(out_channels, temb_channels) + else: + self.norm2 = 
torch.nn.GroupNorm(num_groups=groups_out, num_channels=out_channels, eps=eps, affine=True) + + self.dropout = torch.nn.Dropout(dropout) + conv_2d_out_channels = conv_2d_out_channels or out_channels + self.conv2 = torch.nn.Conv2d(out_channels, conv_2d_out_channels, kernel_size=3, stride=1, padding=1) + + self.nonlinearity = get_activation(non_linearity) + + self.upsample = self.downsample = None + if self.up: + if kernel == "fir": + fir_kernel = (1, 3, 3, 1) + self.upsample = lambda x: upsample_2d(x, kernel=fir_kernel) + elif kernel == "sde_vp": + self.upsample = partial(F.interpolate, scale_factor=2.0, mode="nearest") + else: + self.upsample = Upsample2D(in_channels, use_conv=False) + elif self.down: + if kernel == "fir": + fir_kernel = (1, 3, 3, 1) + self.downsample = lambda x: downsample_2d(x, kernel=fir_kernel) + elif kernel == "sde_vp": + self.downsample = partial(F.avg_pool2d, kernel_size=2, stride=2) + else: + self.downsample = Downsample2D(in_channels, use_conv=False, padding=1, name="op") + + self.use_in_shortcut = self.in_channels != conv_2d_out_channels if use_in_shortcut is None else use_in_shortcut + + self.conv_shortcut = None + if self.use_in_shortcut: + self.conv_shortcut = torch.nn.Conv2d( + in_channels, conv_2d_out_channels, kernel_size=1, stride=1, padding=0, bias=conv_shortcut_bias + ) + + def forward(self, input_tensor, temb): + hidden_states = input_tensor + + if self.time_embedding_norm == "ada_group" or self.time_embedding_norm == "spatial": + hidden_states = self.norm1(hidden_states, temb) + else: + hidden_states = self.norm1(hidden_states) + + hidden_states = self.nonlinearity(hidden_states) + + if self.upsample is not None: + # upsample_nearest_nhwc fails with large batch sizes. see https://github.com/huggingface/diffusers/issues/984 + if hidden_states.shape[0] >= 64: + input_tensor = input_tensor.contiguous() + hidden_states = hidden_states.contiguous() + input_tensor = self.upsample(input_tensor) + hidden_states = self.upsample(hidden_states) + elif self.downsample is not None: + input_tensor = self.downsample(input_tensor) + hidden_states = self.downsample(hidden_states) + + hidden_states = self.conv1(hidden_states) + + if self.time_emb_proj is not None: + if not self.skip_time_act: + temb = self.nonlinearity(temb) + temb = self.time_emb_proj(temb)[:, :, None, None] + + if temb is not None and self.time_embedding_norm == "default": + hidden_states = hidden_states + temb + + if self.time_embedding_norm == "ada_group" or self.time_embedding_norm == "spatial": + hidden_states = self.norm2(hidden_states, temb) + else: + hidden_states = self.norm2(hidden_states) + + if temb is not None and self.time_embedding_norm == "scale_shift": + scale, shift = torch.chunk(temb, 2, dim=1) + hidden_states = hidden_states * (1 + scale) + shift + + hidden_states = self.nonlinearity(hidden_states) + + hidden_states = self.dropout(hidden_states) + hidden_states = self.conv2(hidden_states) + + if self.conv_shortcut is not None: + input_tensor = self.conv_shortcut(input_tensor) + + output_tensor = (input_tensor + hidden_states) / self.output_scale_factor + + return output_tensor + + +# unet_rl.py +def rearrange_dims(tensor): + if len(tensor.shape) == 2: + return tensor[:, :, None] + if len(tensor.shape) == 3: + return tensor[:, :, None, :] + elif len(tensor.shape) == 4: + return tensor[:, :, 0, :] + else: + raise ValueError(f"`len(tensor)`: {len(tensor)} has to be 2, 3 or 4.") + + +class Conv1dBlock(nn.Module): + """ + Conv1d --> GroupNorm --> Mish + """ + + def __init__(self, 
inp_channels, out_channels, kernel_size, n_groups=8): + super().__init__() + + self.conv1d = nn.Conv1d(inp_channels, out_channels, kernel_size, padding=kernel_size // 2) + self.group_norm = nn.GroupNorm(n_groups, out_channels) + self.mish = nn.Mish() + + def forward(self, inputs): + intermediate_repr = self.conv1d(inputs) + intermediate_repr = rearrange_dims(intermediate_repr) + intermediate_repr = self.group_norm(intermediate_repr) + intermediate_repr = rearrange_dims(intermediate_repr) + output = self.mish(intermediate_repr) + return output + + +# unet_rl.py +class ResidualTemporalBlock1D(nn.Module): + def __init__(self, inp_channels, out_channels, embed_dim, kernel_size=5): + super().__init__() + self.conv_in = Conv1dBlock(inp_channels, out_channels, kernel_size) + self.conv_out = Conv1dBlock(out_channels, out_channels, kernel_size) + + self.time_emb_act = nn.Mish() + self.time_emb = nn.Linear(embed_dim, out_channels) + + self.residual_conv = ( + nn.Conv1d(inp_channels, out_channels, 1) if inp_channels != out_channels else nn.Identity() + ) + + def forward(self, inputs, t): + """ + Args: + inputs : [ batch_size x inp_channels x horizon ] + t : [ batch_size x embed_dim ] + + returns: + out : [ batch_size x out_channels x horizon ] + """ + t = self.time_emb_act(t) + t = self.time_emb(t) + out = self.conv_in(inputs) + rearrange_dims(t) + out = self.conv_out(out) + return out + self.residual_conv(inputs) + + +def upsample_2d(hidden_states, kernel=None, factor=2, gain=1): + r"""Upsample2D a batch of 2D images with the given filter. + Accepts a batch of 2D images of the shape `[N, C, H, W]` or `[N, H, W, C]` and upsamples each image with the given + filter. The filter is normalized so that if the input pixels are constant, they will be scaled by the specified + `gain`. Pixels outside the image are assumed to be zero, and the filter is padded with zeros so that its shape is + a: multiple of the upsampling factor. + + Args: + hidden_states: Input tensor of the shape `[N, C, H, W]` or `[N, H, W, C]`. + kernel: FIR filter of the shape `[firH, firW]` or `[firN]` + (separable). The default is `[1] * factor`, which corresponds to nearest-neighbor upsampling. + factor: Integer upsampling factor (default: 2). + gain: Scaling factor for signal magnitude (default: 1.0). + + Returns: + output: Tensor of the shape `[N, C, H * factor, W * factor]` + """ + assert isinstance(factor, int) and factor >= 1 + if kernel is None: + kernel = [1] * factor + + kernel = torch.tensor(kernel, dtype=torch.float32) + if kernel.ndim == 1: + kernel = torch.outer(kernel, kernel) + kernel /= torch.sum(kernel) + + kernel = kernel * (gain * (factor**2)) + pad_value = kernel.shape[0] - factor + output = upfirdn2d_native( + hidden_states, + kernel.to(device=hidden_states.device), + up=factor, + pad=((pad_value + 1) // 2 + factor - 1, pad_value // 2), + ) + return output + + +def downsample_2d(hidden_states, kernel=None, factor=2, gain=1): + r"""Downsample2D a batch of 2D images with the given filter. + Accepts a batch of 2D images of the shape `[N, C, H, W]` or `[N, H, W, C]` and downsamples each image with the + given filter. The filter is normalized so that if the input pixels are constant, they will be scaled by the + specified `gain`. Pixels outside the image are assumed to be zero, and the filter is padded with zeros so that its + shape is a multiple of the downsampling factor. + + Args: + hidden_states: Input tensor of the shape `[N, C, H, W]` or `[N, H, W, C]`. 
+ kernel: FIR filter of the shape `[firH, firW]` or `[firN]` + (separable). The default is `[1] * factor`, which corresponds to average pooling. + factor: Integer downsampling factor (default: 2). + gain: Scaling factor for signal magnitude (default: 1.0). + + Returns: + output: Tensor of the shape `[N, C, H // factor, W // factor]` + """ + + assert isinstance(factor, int) and factor >= 1 + if kernel is None: + kernel = [1] * factor + + kernel = torch.tensor(kernel, dtype=torch.float32) + if kernel.ndim == 1: + kernel = torch.outer(kernel, kernel) + kernel /= torch.sum(kernel) + + kernel = kernel * gain + pad_value = kernel.shape[0] - factor + output = upfirdn2d_native( + hidden_states, kernel.to(device=hidden_states.device), down=factor, pad=((pad_value + 1) // 2, pad_value // 2) + ) + return output + + +def upfirdn2d_native(tensor, kernel, up=1, down=1, pad=(0, 0)): + up_x = up_y = up + down_x = down_y = down + pad_x0 = pad_y0 = pad[0] + pad_x1 = pad_y1 = pad[1] + + _, channel, in_h, in_w = tensor.shape + tensor = tensor.reshape(-1, in_h, in_w, 1) + + _, in_h, in_w, minor = tensor.shape + kernel_h, kernel_w = kernel.shape + + out = tensor.view(-1, in_h, 1, in_w, 1, minor) + out = F.pad(out, [0, 0, 0, up_x - 1, 0, 0, 0, up_y - 1]) + out = out.view(-1, in_h * up_y, in_w * up_x, minor) + + out = F.pad(out, [0, 0, max(pad_x0, 0), max(pad_x1, 0), max(pad_y0, 0), max(pad_y1, 0)]) + out = out.to(tensor.device) # Move back to mps if necessary + out = out[ + :, + max(-pad_y0, 0) : out.shape[1] - max(-pad_y1, 0), + max(-pad_x0, 0) : out.shape[2] - max(-pad_x1, 0), + :, + ] + + out = out.permute(0, 3, 1, 2) + out = out.reshape([-1, 1, in_h * up_y + pad_y0 + pad_y1, in_w * up_x + pad_x0 + pad_x1]) + w = torch.flip(kernel, [0, 1]).view(1, 1, kernel_h, kernel_w) + out = F.conv2d(out, w) + out = out.reshape( + -1, + minor, + in_h * up_y + pad_y0 + pad_y1 - kernel_h + 1, + in_w * up_x + pad_x0 + pad_x1 - kernel_w + 1, + ) + out = out.permute(0, 2, 3, 1) + out = out[:, ::down_y, ::down_x, :] + + out_h = (in_h * up_y + pad_y0 + pad_y1 - kernel_h) // down_y + 1 + out_w = (in_w * up_x + pad_x0 + pad_x1 - kernel_w) // down_x + 1 + + return out.view(-1, channel, out_h, out_w) + + +class TemporalConvLayer(nn.Module): + """ + Temporal convolutional layer that can be used for video (sequence of images) input Code mostly copied from: + https://github.com/modelscope/modelscope/blob/1509fdb973e5871f37148a4b5e5964cafd43e64d/modelscope/models/multi_modal/video_synthesis/unet_sd.py#L1016 + """ + + def __init__(self, in_dim, out_dim=None, dropout=0.0): + super().__init__() + out_dim = out_dim or in_dim + self.in_dim = in_dim + self.out_dim = out_dim + + # conv layers + self.conv1 = nn.Sequential( + nn.GroupNorm(32, in_dim), nn.SiLU(), nn.Conv3d(in_dim, out_dim, (3, 1, 1), padding=(1, 0, 0)) + ) + self.conv2 = nn.Sequential( + nn.GroupNorm(32, out_dim), + nn.SiLU(), + nn.Dropout(dropout), + nn.Conv3d(out_dim, in_dim, (3, 1, 1), padding=(1, 0, 0)), + ) + self.conv3 = nn.Sequential( + nn.GroupNorm(32, out_dim), + nn.SiLU(), + nn.Dropout(dropout), + nn.Conv3d(out_dim, in_dim, (3, 1, 1), padding=(1, 0, 0)), + ) + self.conv4 = nn.Sequential( + nn.GroupNorm(32, out_dim), + nn.SiLU(), + nn.Dropout(dropout), + nn.Conv3d(out_dim, in_dim, (3, 1, 1), padding=(1, 0, 0)), + ) + + # zero out the last layer params,so the conv block is identity + nn.init.zeros_(self.conv4[-1].weight) + nn.init.zeros_(self.conv4[-1].bias) + + def forward(self, hidden_states, num_frames=1): + hidden_states = ( + hidden_states[None, :].reshape((-1, 
num_frames) + hidden_states.shape[1:]).permute(0, 2, 1, 3, 4) + ) + + identity = hidden_states + hidden_states = self.conv1(hidden_states) + hidden_states = self.conv2(hidden_states) + hidden_states = self.conv3(hidden_states) + hidden_states = self.conv4(hidden_states) + + hidden_states = identity + hidden_states + + hidden_states = hidden_states.permute(0, 2, 1, 3, 4).reshape( + (hidden_states.shape[0] * hidden_states.shape[2], -1) + hidden_states.shape[3:] + ) + return hidden_states diff --git a/gf/sd-webui-rich-text/diffusers_official/models/resnet_flax.py b/gf/sd-webui-rich-text/diffusers_official/models/resnet_flax.py new file mode 100644 index 0000000000000000000000000000000000000000..9a391f4b947e74beda03f26e376141b2b3c21502 --- /dev/null +++ b/gf/sd-webui-rich-text/diffusers_official/models/resnet_flax.py @@ -0,0 +1,124 @@ +# Copyright 2023 The HuggingFace Team. All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +import flax.linen as nn +import jax +import jax.numpy as jnp + + +class FlaxUpsample2D(nn.Module): + out_channels: int + dtype: jnp.dtype = jnp.float32 + + def setup(self): + self.conv = nn.Conv( + self.out_channels, + kernel_size=(3, 3), + strides=(1, 1), + padding=((1, 1), (1, 1)), + dtype=self.dtype, + ) + + def __call__(self, hidden_states): + batch, height, width, channels = hidden_states.shape + hidden_states = jax.image.resize( + hidden_states, + shape=(batch, height * 2, width * 2, channels), + method="nearest", + ) + hidden_states = self.conv(hidden_states) + return hidden_states + + +class FlaxDownsample2D(nn.Module): + out_channels: int + dtype: jnp.dtype = jnp.float32 + + def setup(self): + self.conv = nn.Conv( + self.out_channels, + kernel_size=(3, 3), + strides=(2, 2), + padding=((1, 1), (1, 1)), # padding="VALID", + dtype=self.dtype, + ) + + def __call__(self, hidden_states): + # pad = ((0, 0), (0, 1), (0, 1), (0, 0)) # pad height and width dim + # hidden_states = jnp.pad(hidden_states, pad_width=pad) + hidden_states = self.conv(hidden_states) + return hidden_states + + +class FlaxResnetBlock2D(nn.Module): + in_channels: int + out_channels: int = None + dropout_prob: float = 0.0 + use_nin_shortcut: bool = None + dtype: jnp.dtype = jnp.float32 + + def setup(self): + out_channels = self.in_channels if self.out_channels is None else self.out_channels + + self.norm1 = nn.GroupNorm(num_groups=32, epsilon=1e-5) + self.conv1 = nn.Conv( + out_channels, + kernel_size=(3, 3), + strides=(1, 1), + padding=((1, 1), (1, 1)), + dtype=self.dtype, + ) + + self.time_emb_proj = nn.Dense(out_channels, dtype=self.dtype) + + self.norm2 = nn.GroupNorm(num_groups=32, epsilon=1e-5) + self.dropout = nn.Dropout(self.dropout_prob) + self.conv2 = nn.Conv( + out_channels, + kernel_size=(3, 3), + strides=(1, 1), + padding=((1, 1), (1, 1)), + dtype=self.dtype, + ) + + use_nin_shortcut = self.in_channels != out_channels if self.use_nin_shortcut is None else self.use_nin_shortcut + + self.conv_shortcut = None + if use_nin_shortcut: + self.conv_shortcut = nn.Conv( + 
out_channels, + kernel_size=(1, 1), + strides=(1, 1), + padding="VALID", + dtype=self.dtype, + ) + + def __call__(self, hidden_states, temb, deterministic=True): + residual = hidden_states + hidden_states = self.norm1(hidden_states) + hidden_states = nn.swish(hidden_states) + hidden_states = self.conv1(hidden_states) + + temb = self.time_emb_proj(nn.swish(temb)) + temb = jnp.expand_dims(jnp.expand_dims(temb, 1), 1) + hidden_states = hidden_states + temb + + hidden_states = self.norm2(hidden_states) + hidden_states = nn.swish(hidden_states) + hidden_states = self.dropout(hidden_states, deterministic) + hidden_states = self.conv2(hidden_states) + + if self.conv_shortcut is not None: + residual = self.conv_shortcut(residual) + + return hidden_states + residual diff --git a/gf/sd-webui-rich-text/diffusers_official/models/t5_film_transformer.py b/gf/sd-webui-rich-text/diffusers_official/models/t5_film_transformer.py new file mode 100644 index 0000000000000000000000000000000000000000..1c41e656a9dbe81edafd5a2958d49ff28e84fd01 --- /dev/null +++ b/gf/sd-webui-rich-text/diffusers_official/models/t5_film_transformer.py @@ -0,0 +1,321 @@ +# Copyright 2023 The HuggingFace Team. All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +import math + +import torch +from torch import nn + +from ..configuration_utils import ConfigMixin, register_to_config +from .attention_processor import Attention +from .embeddings import get_timestep_embedding +from .modeling_utils import ModelMixin + + +class T5FilmDecoder(ModelMixin, ConfigMixin): + @register_to_config + def __init__( + self, + input_dims: int = 128, + targets_length: int = 256, + max_decoder_noise_time: float = 2000.0, + d_model: int = 768, + num_layers: int = 12, + num_heads: int = 12, + d_kv: int = 64, + d_ff: int = 2048, + dropout_rate: float = 0.1, + ): + super().__init__() + + self.conditioning_emb = nn.Sequential( + nn.Linear(d_model, d_model * 4, bias=False), + nn.SiLU(), + nn.Linear(d_model * 4, d_model * 4, bias=False), + nn.SiLU(), + ) + + self.position_encoding = nn.Embedding(targets_length, d_model) + self.position_encoding.weight.requires_grad = False + + self.continuous_inputs_projection = nn.Linear(input_dims, d_model, bias=False) + + self.dropout = nn.Dropout(p=dropout_rate) + + self.decoders = nn.ModuleList() + for lyr_num in range(num_layers): + # FiLM conditional T5 decoder + lyr = DecoderLayer(d_model=d_model, d_kv=d_kv, num_heads=num_heads, d_ff=d_ff, dropout_rate=dropout_rate) + self.decoders.append(lyr) + + self.decoder_norm = T5LayerNorm(d_model) + + self.post_dropout = nn.Dropout(p=dropout_rate) + self.spec_out = nn.Linear(d_model, input_dims, bias=False) + + def encoder_decoder_mask(self, query_input, key_input): + mask = torch.mul(query_input.unsqueeze(-1), key_input.unsqueeze(-2)) + return mask.unsqueeze(-3) + + def forward(self, encodings_and_masks, decoder_input_tokens, decoder_noise_time): + batch, _, _ = decoder_input_tokens.shape + assert decoder_noise_time.shape == (batch,) + + # decoder_noise_time is in [0, 
1), so rescale to expected timing range. + time_steps = get_timestep_embedding( + decoder_noise_time * self.config.max_decoder_noise_time, + embedding_dim=self.config.d_model, + max_period=self.config.max_decoder_noise_time, + ).to(dtype=self.dtype) + + conditioning_emb = self.conditioning_emb(time_steps).unsqueeze(1) + + assert conditioning_emb.shape == (batch, 1, self.config.d_model * 4) + + seq_length = decoder_input_tokens.shape[1] + + # If we want to use relative positions for audio context, we can just offset + # this sequence by the length of encodings_and_masks. + decoder_positions = torch.broadcast_to( + torch.arange(seq_length, device=decoder_input_tokens.device), + (batch, seq_length), + ) + + position_encodings = self.position_encoding(decoder_positions) + + inputs = self.continuous_inputs_projection(decoder_input_tokens) + inputs += position_encodings + y = self.dropout(inputs) + + # decoder: No padding present. + decoder_mask = torch.ones( + decoder_input_tokens.shape[:2], device=decoder_input_tokens.device, dtype=inputs.dtype + ) + + # Translate encoding masks to encoder-decoder masks. + encodings_and_encdec_masks = [(x, self.encoder_decoder_mask(decoder_mask, y)) for x, y in encodings_and_masks] + + # cross attend style: concat encodings + encoded = torch.cat([x[0] for x in encodings_and_encdec_masks], dim=1) + encoder_decoder_mask = torch.cat([x[1] for x in encodings_and_encdec_masks], dim=-1) + + for lyr in self.decoders: + y = lyr( + y, + conditioning_emb=conditioning_emb, + encoder_hidden_states=encoded, + encoder_attention_mask=encoder_decoder_mask, + )[0] + + y = self.decoder_norm(y) + y = self.post_dropout(y) + + spec_out = self.spec_out(y) + return spec_out + + +class DecoderLayer(nn.Module): + def __init__(self, d_model, d_kv, num_heads, d_ff, dropout_rate, layer_norm_epsilon=1e-6): + super().__init__() + self.layer = nn.ModuleList() + + # cond self attention: layer 0 + self.layer.append( + T5LayerSelfAttentionCond(d_model=d_model, d_kv=d_kv, num_heads=num_heads, dropout_rate=dropout_rate) + ) + + # cross attention: layer 1 + self.layer.append( + T5LayerCrossAttention( + d_model=d_model, + d_kv=d_kv, + num_heads=num_heads, + dropout_rate=dropout_rate, + layer_norm_epsilon=layer_norm_epsilon, + ) + ) + + # Film Cond MLP + dropout: last layer + self.layer.append( + T5LayerFFCond(d_model=d_model, d_ff=d_ff, dropout_rate=dropout_rate, layer_norm_epsilon=layer_norm_epsilon) + ) + + def forward( + self, + hidden_states, + conditioning_emb=None, + attention_mask=None, + encoder_hidden_states=None, + encoder_attention_mask=None, + encoder_decoder_position_bias=None, + ): + hidden_states = self.layer[0]( + hidden_states, + conditioning_emb=conditioning_emb, + attention_mask=attention_mask, + ) + + if encoder_hidden_states is not None: + encoder_extended_attention_mask = torch.where(encoder_attention_mask > 0, 0, -1e10).to( + encoder_hidden_states.dtype + ) + + hidden_states = self.layer[1]( + hidden_states, + key_value_states=encoder_hidden_states, + attention_mask=encoder_extended_attention_mask, + ) + + # Apply Film Conditional Feed Forward layer + hidden_states = self.layer[-1](hidden_states, conditioning_emb) + + return (hidden_states,) + + +class T5LayerSelfAttentionCond(nn.Module): + def __init__(self, d_model, d_kv, num_heads, dropout_rate): + super().__init__() + self.layer_norm = T5LayerNorm(d_model) + self.FiLMLayer = T5FiLMLayer(in_features=d_model * 4, out_features=d_model) + self.attention = Attention(query_dim=d_model, heads=num_heads, dim_head=d_kv, 
out_bias=False, scale_qk=False) + self.dropout = nn.Dropout(dropout_rate) + + def forward( + self, + hidden_states, + conditioning_emb=None, + attention_mask=None, + ): + # pre_self_attention_layer_norm + normed_hidden_states = self.layer_norm(hidden_states) + + if conditioning_emb is not None: + normed_hidden_states = self.FiLMLayer(normed_hidden_states, conditioning_emb) + + # Self-attention block + attention_output = self.attention(normed_hidden_states) + + hidden_states = hidden_states + self.dropout(attention_output) + + return hidden_states + + +class T5LayerCrossAttention(nn.Module): + def __init__(self, d_model, d_kv, num_heads, dropout_rate, layer_norm_epsilon): + super().__init__() + self.attention = Attention(query_dim=d_model, heads=num_heads, dim_head=d_kv, out_bias=False, scale_qk=False) + self.layer_norm = T5LayerNorm(d_model, eps=layer_norm_epsilon) + self.dropout = nn.Dropout(dropout_rate) + + def forward( + self, + hidden_states, + key_value_states=None, + attention_mask=None, + ): + normed_hidden_states = self.layer_norm(hidden_states) + attention_output = self.attention( + normed_hidden_states, + encoder_hidden_states=key_value_states, + attention_mask=attention_mask.squeeze(1), + ) + layer_output = hidden_states + self.dropout(attention_output) + return layer_output + + +class T5LayerFFCond(nn.Module): + def __init__(self, d_model, d_ff, dropout_rate, layer_norm_epsilon): + super().__init__() + self.DenseReluDense = T5DenseGatedActDense(d_model=d_model, d_ff=d_ff, dropout_rate=dropout_rate) + self.film = T5FiLMLayer(in_features=d_model * 4, out_features=d_model) + self.layer_norm = T5LayerNorm(d_model, eps=layer_norm_epsilon) + self.dropout = nn.Dropout(dropout_rate) + + def forward(self, hidden_states, conditioning_emb=None): + forwarded_states = self.layer_norm(hidden_states) + if conditioning_emb is not None: + forwarded_states = self.film(forwarded_states, conditioning_emb) + + forwarded_states = self.DenseReluDense(forwarded_states) + hidden_states = hidden_states + self.dropout(forwarded_states) + return hidden_states + + +class T5DenseGatedActDense(nn.Module): + def __init__(self, d_model, d_ff, dropout_rate): + super().__init__() + self.wi_0 = nn.Linear(d_model, d_ff, bias=False) + self.wi_1 = nn.Linear(d_model, d_ff, bias=False) + self.wo = nn.Linear(d_ff, d_model, bias=False) + self.dropout = nn.Dropout(dropout_rate) + self.act = NewGELUActivation() + + def forward(self, hidden_states): + hidden_gelu = self.act(self.wi_0(hidden_states)) + hidden_linear = self.wi_1(hidden_states) + hidden_states = hidden_gelu * hidden_linear + hidden_states = self.dropout(hidden_states) + + hidden_states = self.wo(hidden_states) + return hidden_states + + +class T5LayerNorm(nn.Module): + def __init__(self, hidden_size, eps=1e-6): + """ + Construct a layernorm module in the T5 style. No bias and no subtraction of mean. + """ + super().__init__() + self.weight = nn.Parameter(torch.ones(hidden_size)) + self.variance_epsilon = eps + + def forward(self, hidden_states): + # T5 uses a layer_norm which only scales and doesn't shift, which is also known as Root Mean + # Square Layer Normalization https://arxiv.org/abs/1910.07467 thus variance is calculated + # w/o mean and there is no bias. 
Additionally we want to make sure that the accumulation for + # half-precision inputs is done in fp32 + + variance = hidden_states.to(torch.float32).pow(2).mean(-1, keepdim=True) + hidden_states = hidden_states * torch.rsqrt(variance + self.variance_epsilon) + + # convert into half-precision if necessary + if self.weight.dtype in [torch.float16, torch.bfloat16]: + hidden_states = hidden_states.to(self.weight.dtype) + + return self.weight * hidden_states + + +class NewGELUActivation(nn.Module): + """ + Implementation of the GELU activation function currently in Google BERT repo (identical to OpenAI GPT). Also see + the Gaussian Error Linear Units paper: https://arxiv.org/abs/1606.08415 + """ + + def forward(self, input: torch.Tensor) -> torch.Tensor: + return 0.5 * input * (1.0 + torch.tanh(math.sqrt(2.0 / math.pi) * (input + 0.044715 * torch.pow(input, 3.0)))) + + +class T5FiLMLayer(nn.Module): + """ + FiLM Layer + """ + + def __init__(self, in_features, out_features): + super().__init__() + self.scale_bias = nn.Linear(in_features, out_features * 2, bias=False) + + def forward(self, x, conditioning_emb): + emb = self.scale_bias(conditioning_emb) + scale, shift = torch.chunk(emb, 2, -1) + x = x * (1 + scale) + shift + return x diff --git a/gf/sd-webui-rich-text/diffusers_official/models/transformer_2d.py b/gf/sd-webui-rich-text/diffusers_official/models/transformer_2d.py new file mode 100644 index 0000000000000000000000000000000000000000..83da16838ae2248c31faada9cd5704d20500459c --- /dev/null +++ b/gf/sd-webui-rich-text/diffusers_official/models/transformer_2d.py @@ -0,0 +1,341 @@ +# Copyright 2023 The HuggingFace Team. All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +from dataclasses import dataclass +from typing import Any, Dict, Optional + +import torch +import torch.nn.functional as F +from torch import nn + +from ..configuration_utils import ConfigMixin, register_to_config +from ..models.embeddings import ImagePositionalEmbeddings +from ..utils import BaseOutput, deprecate +from .attention import BasicTransformerBlock +from .embeddings import PatchEmbed +from .modeling_utils import ModelMixin + + +@dataclass +class Transformer2DModelOutput(BaseOutput): + """ + The output of [`Transformer2DModel`]. + + Args: + sample (`torch.FloatTensor` of shape `(batch_size, num_channels, height, width)` or `(batch size, num_vector_embeds - 1, num_latent_pixels)` if [`Transformer2DModel`] is discrete): + The hidden states output conditioned on the `encoder_hidden_states` input. If discrete, returns probability + distributions for the unnoised latent pixels. + """ + + sample: torch.FloatTensor + + +class Transformer2DModel(ModelMixin, ConfigMixin): + """ + A 2D Transformer model for image-like data. + + Parameters: + num_attention_heads (`int`, *optional*, defaults to 16): The number of heads to use for multi-head attention. + attention_head_dim (`int`, *optional*, defaults to 88): The number of channels in each head. 
+ in_channels (`int`, *optional*): + The number of channels in the input and output (specify if the input is **continuous**). + num_layers (`int`, *optional*, defaults to 1): The number of layers of Transformer blocks to use. + dropout (`float`, *optional*, defaults to 0.0): The dropout probability to use. + cross_attention_dim (`int`, *optional*): The number of `encoder_hidden_states` dimensions to use. + sample_size (`int`, *optional*): The width of the latent images (specify if the input is **discrete**). + This is fixed during training since it is used to learn a number of position embeddings. + num_vector_embeds (`int`, *optional*): + The number of classes of the vector embeddings of the latent pixels (specify if the input is **discrete**). + Includes the class for the masked latent pixel. + activation_fn (`str`, *optional*, defaults to `"geglu"`): Activation function to use in feed-forward. + num_embeds_ada_norm ( `int`, *optional*): + The number of diffusion steps used during training. Pass if at least one of the norm_layers is + `AdaLayerNorm`. This is fixed during training since it is used to learn a number of embeddings that are + added to the hidden states. + + During inference, you can denoise for up to but not more steps than `num_embeds_ada_norm`. + attention_bias (`bool`, *optional*): + Configure if the `TransformerBlocks` attention should contain a bias parameter. + """ + + @register_to_config + def __init__( + self, + num_attention_heads: int = 16, + attention_head_dim: int = 88, + in_channels: Optional[int] = None, + out_channels: Optional[int] = None, + num_layers: int = 1, + dropout: float = 0.0, + norm_num_groups: int = 32, + cross_attention_dim: Optional[int] = None, + attention_bias: bool = False, + sample_size: Optional[int] = None, + num_vector_embeds: Optional[int] = None, + patch_size: Optional[int] = None, + activation_fn: str = "geglu", + num_embeds_ada_norm: Optional[int] = None, + use_linear_projection: bool = False, + only_cross_attention: bool = False, + upcast_attention: bool = False, + norm_type: str = "layer_norm", + norm_elementwise_affine: bool = True, + ): + super().__init__() + self.use_linear_projection = use_linear_projection + self.num_attention_heads = num_attention_heads + self.attention_head_dim = attention_head_dim + inner_dim = num_attention_heads * attention_head_dim + + # 1. Transformer2DModel can process both standard continuous images of shape `(batch_size, num_channels, width, height)` as well as quantized image embeddings of shape `(batch_size, num_image_vectors)` + # Define whether input is continuous or discrete depending on configuration + self.is_input_continuous = (in_channels is not None) and (patch_size is None) + self.is_input_vectorized = num_vector_embeds is not None + self.is_input_patches = in_channels is not None and patch_size is not None + + if norm_type == "layer_norm" and num_embeds_ada_norm is not None: + deprecation_message = ( + f"The configuration file of this model: {self.__class__} is outdated. `norm_type` is either not set or" + " incorrectly set to `'layer_norm'`.Make sure to set `norm_type` to `'ada_norm'` in the config." + " Please make sure to update the config accordingly as leaving `norm_type` might led to incorrect" + " results in future versions. 
If you have downloaded this checkpoint from the Hugging Face Hub, it" + " would be very nice if you could open a Pull request for the `transformer/config.json` file" + ) + deprecate("norm_type!=num_embeds_ada_norm", "1.0.0", deprecation_message, standard_warn=False) + norm_type = "ada_norm" + + if self.is_input_continuous and self.is_input_vectorized: + raise ValueError( + f"Cannot define both `in_channels`: {in_channels} and `num_vector_embeds`: {num_vector_embeds}. Make" + " sure that either `in_channels` or `num_vector_embeds` is None." + ) + elif self.is_input_vectorized and self.is_input_patches: + raise ValueError( + f"Cannot define both `num_vector_embeds`: {num_vector_embeds} and `patch_size`: {patch_size}. Make" + " sure that either `num_vector_embeds` or `num_patches` is None." + ) + elif not self.is_input_continuous and not self.is_input_vectorized and not self.is_input_patches: + raise ValueError( + f"Has to define `in_channels`: {in_channels}, `num_vector_embeds`: {num_vector_embeds}, or patch_size:" + f" {patch_size}. Make sure that `in_channels`, `num_vector_embeds` or `num_patches` is not None." + ) + + # 2. Define input layers + if self.is_input_continuous: + self.in_channels = in_channels + + self.norm = torch.nn.GroupNorm(num_groups=norm_num_groups, num_channels=in_channels, eps=1e-6, affine=True) + if use_linear_projection: + self.proj_in = nn.Linear(in_channels, inner_dim) + else: + self.proj_in = nn.Conv2d(in_channels, inner_dim, kernel_size=1, stride=1, padding=0) + elif self.is_input_vectorized: + assert sample_size is not None, "Transformer2DModel over discrete input must provide sample_size" + assert num_vector_embeds is not None, "Transformer2DModel over discrete input must provide num_embed" + + self.height = sample_size + self.width = sample_size + self.num_vector_embeds = num_vector_embeds + self.num_latent_pixels = self.height * self.width + + self.latent_image_embedding = ImagePositionalEmbeddings( + num_embed=num_vector_embeds, embed_dim=inner_dim, height=self.height, width=self.width + ) + elif self.is_input_patches: + assert sample_size is not None, "Transformer2DModel over patched input must provide sample_size" + + self.height = sample_size + self.width = sample_size + + self.patch_size = patch_size + self.pos_embed = PatchEmbed( + height=sample_size, + width=sample_size, + patch_size=patch_size, + in_channels=in_channels, + embed_dim=inner_dim, + ) + + # 3. Define transformers blocks + self.transformer_blocks = nn.ModuleList( + [ + BasicTransformerBlock( + inner_dim, + num_attention_heads, + attention_head_dim, + dropout=dropout, + cross_attention_dim=cross_attention_dim, + activation_fn=activation_fn, + num_embeds_ada_norm=num_embeds_ada_norm, + attention_bias=attention_bias, + only_cross_attention=only_cross_attention, + upcast_attention=upcast_attention, + norm_type=norm_type, + norm_elementwise_affine=norm_elementwise_affine, + ) + for d in range(num_layers) + ] + ) + + # 4. 
Define output layers + self.out_channels = in_channels if out_channels is None else out_channels + if self.is_input_continuous: + # TODO: should use out_channels for continuous projections + if use_linear_projection: + self.proj_out = nn.Linear(inner_dim, in_channels) + else: + self.proj_out = nn.Conv2d(inner_dim, in_channels, kernel_size=1, stride=1, padding=0) + elif self.is_input_vectorized: + self.norm_out = nn.LayerNorm(inner_dim) + self.out = nn.Linear(inner_dim, self.num_vector_embeds - 1) + elif self.is_input_patches: + self.norm_out = nn.LayerNorm(inner_dim, elementwise_affine=False, eps=1e-6) + self.proj_out_1 = nn.Linear(inner_dim, 2 * inner_dim) + self.proj_out_2 = nn.Linear(inner_dim, patch_size * patch_size * self.out_channels) + + def forward( + self, + hidden_states: torch.Tensor, + encoder_hidden_states: Optional[torch.Tensor] = None, + timestep: Optional[torch.LongTensor] = None, + class_labels: Optional[torch.LongTensor] = None, + cross_attention_kwargs: Dict[str, Any] = None, + attention_mask: Optional[torch.Tensor] = None, + encoder_attention_mask: Optional[torch.Tensor] = None, + return_dict: bool = True, + ): + """ + The [`Transformer2DModel`] forward method. + + Args: + hidden_states (`torch.LongTensor` of shape `(batch size, num latent pixels)` if discrete, `torch.FloatTensor` of shape `(batch size, channel, height, width)` if continuous): + Input `hidden_states`. + encoder_hidden_states ( `torch.FloatTensor` of shape `(batch size, sequence len, embed dims)`, *optional*): + Conditional embeddings for cross attention layer. If not given, cross-attention defaults to + self-attention. + timestep ( `torch.LongTensor`, *optional*): + Used to indicate denoising step. Optional timestep to be applied as an embedding in `AdaLayerNorm`. + class_labels ( `torch.LongTensor` of shape `(batch size, num classes)`, *optional*): + Used to indicate class labels conditioning. Optional class labels to be applied as an embedding in + `AdaLayerZeroNorm`. + encoder_attention_mask ( `torch.Tensor`, *optional*): + Cross-attention mask applied to `encoder_hidden_states`. Two formats supported: + + * Mask `(batch, sequence_length)` True = keep, False = discard. + * Bias `(batch, 1, sequence_length)` 0 = keep, -10000 = discard. + + If `ndim == 2`: will be interpreted as a mask, then converted into a bias consistent with the format + above. This bias will be added to the cross-attention scores. + return_dict (`bool`, *optional*, defaults to `True`): + Whether or not to return a [`~models.unet_2d_condition.UNet2DConditionOutput`] instead of a plain + tuple. + + Returns: + If `return_dict` is True, an [`~models.transformer_2d.Transformer2DModelOutput`] is returned, otherwise a + `tuple` where the first element is the sample tensor. + """ + # ensure attention_mask is a bias, and give it a singleton query_tokens dimension. + # we may have done this conversion already, e.g. if we came here via UNet2DConditionModel#forward. + # we can tell by counting dims; if ndim == 2: it's a mask rather than a bias. + # expects mask of shape: + # [batch, key_tokens] + # adds singleton query_tokens dimension: + # [batch, 1, key_tokens] + # this helps to broadcast it as a bias over attention scores, which will be in one of the following shapes: + # [batch, heads, query_tokens, key_tokens] (e.g. torch sdp attn) + # [batch * heads, query_tokens, key_tokens] (e.g. 
xformers or classic attn) + if attention_mask is not None and attention_mask.ndim == 2: + # assume that mask is expressed as: + # (1 = keep, 0 = discard) + # convert mask into a bias that can be added to attention scores: + # (keep = +0, discard = -10000.0) + attention_mask = (1 - attention_mask.to(hidden_states.dtype)) * -10000.0 + attention_mask = attention_mask.unsqueeze(1) + + # convert encoder_attention_mask to a bias the same way we do for attention_mask + if encoder_attention_mask is not None and encoder_attention_mask.ndim == 2: + encoder_attention_mask = (1 - encoder_attention_mask.to(hidden_states.dtype)) * -10000.0 + encoder_attention_mask = encoder_attention_mask.unsqueeze(1) + + # 1. Input + if self.is_input_continuous: + batch, _, height, width = hidden_states.shape + residual = hidden_states + + hidden_states = self.norm(hidden_states) + if not self.use_linear_projection: + hidden_states = self.proj_in(hidden_states) + inner_dim = hidden_states.shape[1] + hidden_states = hidden_states.permute(0, 2, 3, 1).reshape(batch, height * width, inner_dim) + else: + inner_dim = hidden_states.shape[1] + hidden_states = hidden_states.permute(0, 2, 3, 1).reshape(batch, height * width, inner_dim) + hidden_states = self.proj_in(hidden_states) + elif self.is_input_vectorized: + hidden_states = self.latent_image_embedding(hidden_states) + elif self.is_input_patches: + hidden_states = self.pos_embed(hidden_states) + + # 2. Blocks + for block in self.transformer_blocks: + hidden_states = block( + hidden_states, + attention_mask=attention_mask, + encoder_hidden_states=encoder_hidden_states, + encoder_attention_mask=encoder_attention_mask, + timestep=timestep, + cross_attention_kwargs=cross_attention_kwargs, + class_labels=class_labels, + ) + + # 3. Output + if self.is_input_continuous: + if not self.use_linear_projection: + hidden_states = hidden_states.reshape(batch, height, width, inner_dim).permute(0, 3, 1, 2).contiguous() + hidden_states = self.proj_out(hidden_states) + else: + hidden_states = self.proj_out(hidden_states) + hidden_states = hidden_states.reshape(batch, height, width, inner_dim).permute(0, 3, 1, 2).contiguous() + + output = hidden_states + residual + elif self.is_input_vectorized: + hidden_states = self.norm_out(hidden_states) + logits = self.out(hidden_states) + # (batch, self.num_vector_embeds - 1, self.num_latent_pixels) + logits = logits.permute(0, 2, 1) + + # log(p(x_0)) + output = F.log_softmax(logits.double(), dim=1).float() + elif self.is_input_patches: + # TODO: cleanup! 
+ conditioning = self.transformer_blocks[0].norm1.emb( + timestep, class_labels, hidden_dtype=hidden_states.dtype + ) + shift, scale = self.proj_out_1(F.silu(conditioning)).chunk(2, dim=1) + hidden_states = self.norm_out(hidden_states) * (1 + scale[:, None]) + shift[:, None] + hidden_states = self.proj_out_2(hidden_states) + + # unpatchify + height = width = int(hidden_states.shape[1] ** 0.5) + hidden_states = hidden_states.reshape( + shape=(-1, height, width, self.patch_size, self.patch_size, self.out_channels) + ) + hidden_states = torch.einsum("nhwpqc->nchpwq", hidden_states) + output = hidden_states.reshape( + shape=(-1, self.out_channels, height * self.patch_size, width * self.patch_size) + ) + + if not return_dict: + return (output,) + + return Transformer2DModelOutput(sample=output) diff --git a/gf/sd-webui-rich-text/diffusers_official/models/transformer_temporal.py b/gf/sd-webui-rich-text/diffusers_official/models/transformer_temporal.py new file mode 100644 index 0000000000000000000000000000000000000000..cfafdb055bcfedc911b0a19d1e5da8089a18b215 --- /dev/null +++ b/gf/sd-webui-rich-text/diffusers_official/models/transformer_temporal.py @@ -0,0 +1,179 @@ +# Copyright 2023 The HuggingFace Team. All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +from dataclasses import dataclass +from typing import Optional + +import torch +from torch import nn + +from ..configuration_utils import ConfigMixin, register_to_config +from ..utils import BaseOutput +from .attention import BasicTransformerBlock +from .modeling_utils import ModelMixin + + +@dataclass +class TransformerTemporalModelOutput(BaseOutput): + """ + The output of [`TransformerTemporalModel`]. + + Args: + sample (`torch.FloatTensor` of shape `(batch_size x num_frames, num_channels, height, width)`): + The hidden states output conditioned on `encoder_hidden_states` input. + """ + + sample: torch.FloatTensor + + +class TransformerTemporalModel(ModelMixin, ConfigMixin): + """ + A Transformer model for video-like data. + + Parameters: + num_attention_heads (`int`, *optional*, defaults to 16): The number of heads to use for multi-head attention. + attention_head_dim (`int`, *optional*, defaults to 88): The number of channels in each head. + in_channels (`int`, *optional*): + The number of channels in the input and output (specify if the input is **continuous**). + num_layers (`int`, *optional*, defaults to 1): The number of layers of Transformer blocks to use. + dropout (`float`, *optional*, defaults to 0.0): The dropout probability to use. + cross_attention_dim (`int`, *optional*): The number of `encoder_hidden_states` dimensions to use. + sample_size (`int`, *optional*): The width of the latent images (specify if the input is **discrete**). + This is fixed during training since it is used to learn a number of position embeddings. + activation_fn (`str`, *optional*, defaults to `"geglu"`): Activation function to use in feed-forward. 
+ attention_bias (`bool`, *optional*): + Configure if the `TransformerBlock` attention should contain a bias parameter. + double_self_attention (`bool`, *optional*): + Configure if each `TransformerBlock` should contain two self-attention layers. + """ + + @register_to_config + def __init__( + self, + num_attention_heads: int = 16, + attention_head_dim: int = 88, + in_channels: Optional[int] = None, + out_channels: Optional[int] = None, + num_layers: int = 1, + dropout: float = 0.0, + norm_num_groups: int = 32, + cross_attention_dim: Optional[int] = None, + attention_bias: bool = False, + sample_size: Optional[int] = None, + activation_fn: str = "geglu", + norm_elementwise_affine: bool = True, + double_self_attention: bool = True, + ): + super().__init__() + self.num_attention_heads = num_attention_heads + self.attention_head_dim = attention_head_dim + inner_dim = num_attention_heads * attention_head_dim + + self.in_channels = in_channels + + self.norm = torch.nn.GroupNorm(num_groups=norm_num_groups, num_channels=in_channels, eps=1e-6, affine=True) + self.proj_in = nn.Linear(in_channels, inner_dim) + + # 3. Define transformers blocks + self.transformer_blocks = nn.ModuleList( + [ + BasicTransformerBlock( + inner_dim, + num_attention_heads, + attention_head_dim, + dropout=dropout, + cross_attention_dim=cross_attention_dim, + activation_fn=activation_fn, + attention_bias=attention_bias, + double_self_attention=double_self_attention, + norm_elementwise_affine=norm_elementwise_affine, + ) + for d in range(num_layers) + ] + ) + + self.proj_out = nn.Linear(inner_dim, in_channels) + + def forward( + self, + hidden_states, + encoder_hidden_states=None, + timestep=None, + class_labels=None, + num_frames=1, + cross_attention_kwargs=None, + return_dict: bool = True, + ): + """ + The [`TransformerTemporal`] forward method. + + Args: + hidden_states (`torch.LongTensor` of shape `(batch size, num latent pixels)` if discrete, `torch.FloatTensor` of shape `(batch size, channel, height, width)` if continuous): + Input hidden_states. + encoder_hidden_states ( `torch.LongTensor` of shape `(batch size, encoder_hidden_states dim)`, *optional*): + Conditional embeddings for cross attention layer. If not given, cross-attention defaults to + self-attention. + timestep ( `torch.long`, *optional*): + Used to indicate denoising step. Optional timestep to be applied as an embedding in `AdaLayerNorm`. + class_labels ( `torch.LongTensor` of shape `(batch size, num classes)`, *optional*): + Used to indicate class labels conditioning. Optional class labels to be applied as an embedding in + `AdaLayerZeroNorm`. + return_dict (`bool`, *optional*, defaults to `True`): + Whether or not to return a [`~models.unet_2d_condition.UNet2DConditionOutput`] instead of a plain + tuple. + + Returns: + [`~models.transformer_temporal.TransformerTemporalModelOutput`] or `tuple`: + If `return_dict` is True, an [`~models.transformer_temporal.TransformerTemporalModelOutput`] is + returned, otherwise a `tuple` where the first element is the sample tensor. + """ + # 1. 
Input + batch_frames, channel, height, width = hidden_states.shape + batch_size = batch_frames // num_frames + + residual = hidden_states + + hidden_states = hidden_states[None, :].reshape(batch_size, num_frames, channel, height, width) + hidden_states = hidden_states.permute(0, 2, 1, 3, 4) + + hidden_states = self.norm(hidden_states) + hidden_states = hidden_states.permute(0, 3, 4, 2, 1).reshape(batch_size * height * width, num_frames, channel) + + hidden_states = self.proj_in(hidden_states) + + # 2. Blocks + for block in self.transformer_blocks: + hidden_states = block( + hidden_states, + encoder_hidden_states=encoder_hidden_states, + timestep=timestep, + cross_attention_kwargs=cross_attention_kwargs, + class_labels=class_labels, + ) + + # 3. Output + hidden_states = self.proj_out(hidden_states) + hidden_states = ( + hidden_states[None, None, :] + .reshape(batch_size, height, width, channel, num_frames) + .permute(0, 3, 4, 1, 2) + .contiguous() + ) + hidden_states = hidden_states.reshape(batch_frames, channel, height, width) + + output = hidden_states + residual + + if not return_dict: + return (output,) + + return TransformerTemporalModelOutput(sample=output) diff --git a/gf/sd-webui-rich-text/diffusers_official/models/unet_1d.py b/gf/sd-webui-rich-text/diffusers_official/models/unet_1d.py new file mode 100644 index 0000000000000000000000000000000000000000..9b617388f3917c97e8aef39ec0f386eb2e4a1254 --- /dev/null +++ b/gf/sd-webui-rich-text/diffusers_official/models/unet_1d.py @@ -0,0 +1,255 @@ +# Copyright 2023 The HuggingFace Team. All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +from dataclasses import dataclass +from typing import Optional, Tuple, Union + +import torch +import torch.nn as nn + +from ..configuration_utils import ConfigMixin, register_to_config +from ..utils import BaseOutput +from .embeddings import GaussianFourierProjection, TimestepEmbedding, Timesteps +from .modeling_utils import ModelMixin +from .unet_1d_blocks import get_down_block, get_mid_block, get_out_block, get_up_block + + +@dataclass +class UNet1DOutput(BaseOutput): + """ + The output of [`UNet1DModel`]. + + Args: + sample (`torch.FloatTensor` of shape `(batch_size, num_channels, sample_size)`): + The hidden states output from the last layer of the model. + """ + + sample: torch.FloatTensor + + +class UNet1DModel(ModelMixin, ConfigMixin): + r""" + A 1D UNet model that takes a noisy sample and a timestep and returns a sample shaped output. + + This model inherits from [`ModelMixin`]. Check the superclass documentation for it's generic methods implemented + for all models (such as downloading or saving). + + Parameters: + sample_size (`int`, *optional*): Default length of sample. Should be adaptable at runtime. + in_channels (`int`, *optional*, defaults to 2): Number of channels in the input sample. + out_channels (`int`, *optional*, defaults to 2): Number of channels in the output. 
+ extra_in_channels (`int`, *optional*, defaults to 0): + Number of additional channels to be added to the input of the first down block. Useful for cases where the + input data has more channels than what the model was initially designed for. + time_embedding_type (`str`, *optional*, defaults to `"fourier"`): Type of time embedding to use. + freq_shift (`float`, *optional*, defaults to 0.0): Frequency shift for Fourier time embedding. + flip_sin_to_cos (`bool`, *optional*, defaults to `False`): + Whether to flip sin to cos for Fourier time embedding. + down_block_types (`Tuple[str]`, *optional*, defaults to `("DownBlock1D", "DownBlock1DNoSkip", "AttnDownBlock1D")`): + Tuple of downsample block types. + up_block_types (`Tuple[str]`, *optional*, defaults to `("UpBlock1D", "UpBlock1DNoSkip", "AttnUpBlock1D")`): + Tuple of upsample block types. + block_out_channels (`Tuple[int]`, *optional*, defaults to `(32, 32, 64)`): + Tuple of block output channels. + mid_block_type (`str`, *optional*, defaults to `"UNetMidBlock1D"`): Block type for middle of UNet. + out_block_type (`str`, *optional*, defaults to `None`): Optional output processing block of UNet. + act_fn (`str`, *optional*, defaults to `None`): Optional activation function in UNet blocks. + norm_num_groups (`int`, *optional*, defaults to 8): The number of groups for normalization. + layers_per_block (`int`, *optional*, defaults to 1): The number of layers per block. + downsample_each_block (`int`, *optional*, defaults to `False`): + Experimental feature for using a UNet without upsampling. + """ + + @register_to_config + def __init__( + self, + sample_size: int = 65536, + sample_rate: Optional[int] = None, + in_channels: int = 2, + out_channels: int = 2, + extra_in_channels: int = 0, + time_embedding_type: str = "fourier", + flip_sin_to_cos: bool = True, + use_timestep_embedding: bool = False, + freq_shift: float = 0.0, + down_block_types: Tuple[str] = ("DownBlock1DNoSkip", "DownBlock1D", "AttnDownBlock1D"), + up_block_types: Tuple[str] = ("AttnUpBlock1D", "UpBlock1D", "UpBlock1DNoSkip"), + mid_block_type: Tuple[str] = "UNetMidBlock1D", + out_block_type: str = None, + block_out_channels: Tuple[int] = (32, 32, 64), + act_fn: str = None, + norm_num_groups: int = 8, + layers_per_block: int = 1, + downsample_each_block: bool = False, + ): + super().__init__() + self.sample_size = sample_size + + # time + if time_embedding_type == "fourier": + self.time_proj = GaussianFourierProjection( + embedding_size=8, set_W_to_weight=False, log=False, flip_sin_to_cos=flip_sin_to_cos + ) + timestep_input_dim = 2 * block_out_channels[0] + elif time_embedding_type == "positional": + self.time_proj = Timesteps( + block_out_channels[0], flip_sin_to_cos=flip_sin_to_cos, downscale_freq_shift=freq_shift + ) + timestep_input_dim = block_out_channels[0] + + if use_timestep_embedding: + time_embed_dim = block_out_channels[0] * 4 + self.time_mlp = TimestepEmbedding( + in_channels=timestep_input_dim, + time_embed_dim=time_embed_dim, + act_fn=act_fn, + out_dim=block_out_channels[0], + ) + + self.down_blocks = nn.ModuleList([]) + self.mid_block = None + self.up_blocks = nn.ModuleList([]) + self.out_block = None + + # down + output_channel = in_channels + for i, down_block_type in enumerate(down_block_types): + input_channel = output_channel + output_channel = block_out_channels[i] + + if i == 0: + input_channel += extra_in_channels + + is_final_block = i == len(block_out_channels) - 1 + + down_block = get_down_block( + down_block_type, + num_layers=layers_per_block, + 
in_channels=input_channel, + out_channels=output_channel, + temb_channels=block_out_channels[0], + add_downsample=not is_final_block or downsample_each_block, + ) + self.down_blocks.append(down_block) + + # mid + self.mid_block = get_mid_block( + mid_block_type, + in_channels=block_out_channels[-1], + mid_channels=block_out_channels[-1], + out_channels=block_out_channels[-1], + embed_dim=block_out_channels[0], + num_layers=layers_per_block, + add_downsample=downsample_each_block, + ) + + # up + reversed_block_out_channels = list(reversed(block_out_channels)) + output_channel = reversed_block_out_channels[0] + if out_block_type is None: + final_upsample_channels = out_channels + else: + final_upsample_channels = block_out_channels[0] + + for i, up_block_type in enumerate(up_block_types): + prev_output_channel = output_channel + output_channel = ( + reversed_block_out_channels[i + 1] if i < len(up_block_types) - 1 else final_upsample_channels + ) + + is_final_block = i == len(block_out_channels) - 1 + + up_block = get_up_block( + up_block_type, + num_layers=layers_per_block, + in_channels=prev_output_channel, + out_channels=output_channel, + temb_channels=block_out_channels[0], + add_upsample=not is_final_block, + ) + self.up_blocks.append(up_block) + prev_output_channel = output_channel + + # out + num_groups_out = norm_num_groups if norm_num_groups is not None else min(block_out_channels[0] // 4, 32) + self.out_block = get_out_block( + out_block_type=out_block_type, + num_groups_out=num_groups_out, + embed_dim=block_out_channels[0], + out_channels=out_channels, + act_fn=act_fn, + fc_dim=block_out_channels[-1] // 4, + ) + + def forward( + self, + sample: torch.FloatTensor, + timestep: Union[torch.Tensor, float, int], + return_dict: bool = True, + ) -> Union[UNet1DOutput, Tuple]: + r""" + The [`UNet1DModel`] forward method. + + Args: + sample (`torch.FloatTensor`): + The noisy input tensor with the following shape `(batch_size, num_channels, sample_size)`. + timestep (`torch.FloatTensor` or `float` or `int`): The number of timesteps to denoise an input. + return_dict (`bool`, *optional*, defaults to `True`): + Whether or not to return a [`~models.unet_1d.UNet1DOutput`] instead of a plain tuple. + + Returns: + [`~models.unet_1d.UNet1DOutput`] or `tuple`: + If `return_dict` is True, an [`~models.unet_1d.UNet1DOutput`] is returned, otherwise a `tuple` is + returned where the first element is the sample tensor. + """ + + # 1. time + timesteps = timestep + if not torch.is_tensor(timesteps): + timesteps = torch.tensor([timesteps], dtype=torch.long, device=sample.device) + elif torch.is_tensor(timesteps) and len(timesteps.shape) == 0: + timesteps = timesteps[None].to(sample.device) + + timestep_embed = self.time_proj(timesteps) + if self.config.use_timestep_embedding: + timestep_embed = self.time_mlp(timestep_embed) + else: + timestep_embed = timestep_embed[..., None] + timestep_embed = timestep_embed.repeat([1, 1, sample.shape[2]]).to(sample.dtype) + timestep_embed = timestep_embed.broadcast_to((sample.shape[:1] + timestep_embed.shape[1:])) + + # 2. down + down_block_res_samples = () + for downsample_block in self.down_blocks: + sample, res_samples = downsample_block(hidden_states=sample, temb=timestep_embed) + down_block_res_samples += res_samples + + # 3. mid + if self.mid_block: + sample = self.mid_block(sample, timestep_embed) + + # 4. 
up + for i, upsample_block in enumerate(self.up_blocks): + res_samples = down_block_res_samples[-1:] + down_block_res_samples = down_block_res_samples[:-1] + sample = upsample_block(sample, res_hidden_states_tuple=res_samples, temb=timestep_embed) + + # 5. post-process + if self.out_block: + sample = self.out_block(sample, timestep_embed) + + if not return_dict: + return (sample,) + + return UNet1DOutput(sample=sample) diff --git a/gf/sd-webui-rich-text/diffusers_official/models/unet_1d_blocks.py b/gf/sd-webui-rich-text/diffusers_official/models/unet_1d_blocks.py new file mode 100644 index 0000000000000000000000000000000000000000..3c04bffeeacc134c94f7fb9633b28bc48fce8e55 --- /dev/null +++ b/gf/sd-webui-rich-text/diffusers_official/models/unet_1d_blocks.py @@ -0,0 +1,656 @@ +# Copyright 2023 The HuggingFace Team. All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +import math + +import torch +import torch.nn.functional as F +from torch import nn + +from .activations import get_activation +from .resnet import Downsample1D, ResidualTemporalBlock1D, Upsample1D, rearrange_dims + + +class DownResnetBlock1D(nn.Module): + def __init__( + self, + in_channels, + out_channels=None, + num_layers=1, + conv_shortcut=False, + temb_channels=32, + groups=32, + groups_out=None, + non_linearity=None, + time_embedding_norm="default", + output_scale_factor=1.0, + add_downsample=True, + ): + super().__init__() + self.in_channels = in_channels + out_channels = in_channels if out_channels is None else out_channels + self.out_channels = out_channels + self.use_conv_shortcut = conv_shortcut + self.time_embedding_norm = time_embedding_norm + self.add_downsample = add_downsample + self.output_scale_factor = output_scale_factor + + if groups_out is None: + groups_out = groups + + # there will always be at least one resnet + resnets = [ResidualTemporalBlock1D(in_channels, out_channels, embed_dim=temb_channels)] + + for _ in range(num_layers): + resnets.append(ResidualTemporalBlock1D(out_channels, out_channels, embed_dim=temb_channels)) + + self.resnets = nn.ModuleList(resnets) + + if non_linearity is None: + self.nonlinearity = None + else: + self.nonlinearity = get_activation(non_linearity) + + self.downsample = None + if add_downsample: + self.downsample = Downsample1D(out_channels, use_conv=True, padding=1) + + def forward(self, hidden_states, temb=None): + output_states = () + + hidden_states = self.resnets[0](hidden_states, temb) + for resnet in self.resnets[1:]: + hidden_states = resnet(hidden_states, temb) + + output_states += (hidden_states,) + + if self.nonlinearity is not None: + hidden_states = self.nonlinearity(hidden_states) + + if self.downsample is not None: + hidden_states = self.downsample(hidden_states) + + return hidden_states, output_states + + +class UpResnetBlock1D(nn.Module): + def __init__( + self, + in_channels, + out_channels=None, + num_layers=1, + temb_channels=32, + groups=32, + groups_out=None, + non_linearity=None, + time_embedding_norm="default", + output_scale_factor=1.0, 
+ add_upsample=True, + ): + super().__init__() + self.in_channels = in_channels + out_channels = in_channels if out_channels is None else out_channels + self.out_channels = out_channels + self.time_embedding_norm = time_embedding_norm + self.add_upsample = add_upsample + self.output_scale_factor = output_scale_factor + + if groups_out is None: + groups_out = groups + + # there will always be at least one resnet + resnets = [ResidualTemporalBlock1D(2 * in_channels, out_channels, embed_dim=temb_channels)] + + for _ in range(num_layers): + resnets.append(ResidualTemporalBlock1D(out_channels, out_channels, embed_dim=temb_channels)) + + self.resnets = nn.ModuleList(resnets) + + if non_linearity is None: + self.nonlinearity = None + else: + self.nonlinearity = get_activation(non_linearity) + + self.upsample = None + if add_upsample: + self.upsample = Upsample1D(out_channels, use_conv_transpose=True) + + def forward(self, hidden_states, res_hidden_states_tuple=None, temb=None): + if res_hidden_states_tuple is not None: + res_hidden_states = res_hidden_states_tuple[-1] + hidden_states = torch.cat((hidden_states, res_hidden_states), dim=1) + + hidden_states = self.resnets[0](hidden_states, temb) + for resnet in self.resnets[1:]: + hidden_states = resnet(hidden_states, temb) + + if self.nonlinearity is not None: + hidden_states = self.nonlinearity(hidden_states) + + if self.upsample is not None: + hidden_states = self.upsample(hidden_states) + + return hidden_states + + +class ValueFunctionMidBlock1D(nn.Module): + def __init__(self, in_channels, out_channels, embed_dim): + super().__init__() + self.in_channels = in_channels + self.out_channels = out_channels + self.embed_dim = embed_dim + + self.res1 = ResidualTemporalBlock1D(in_channels, in_channels // 2, embed_dim=embed_dim) + self.down1 = Downsample1D(out_channels // 2, use_conv=True) + self.res2 = ResidualTemporalBlock1D(in_channels // 2, in_channels // 4, embed_dim=embed_dim) + self.down2 = Downsample1D(out_channels // 4, use_conv=True) + + def forward(self, x, temb=None): + x = self.res1(x, temb) + x = self.down1(x) + x = self.res2(x, temb) + x = self.down2(x) + return x + + +class MidResTemporalBlock1D(nn.Module): + def __init__( + self, + in_channels, + out_channels, + embed_dim, + num_layers: int = 1, + add_downsample: bool = False, + add_upsample: bool = False, + non_linearity=None, + ): + super().__init__() + self.in_channels = in_channels + self.out_channels = out_channels + self.add_downsample = add_downsample + + # there will always be at least one resnet + resnets = [ResidualTemporalBlock1D(in_channels, out_channels, embed_dim=embed_dim)] + + for _ in range(num_layers): + resnets.append(ResidualTemporalBlock1D(out_channels, out_channels, embed_dim=embed_dim)) + + self.resnets = nn.ModuleList(resnets) + + if non_linearity is None: + self.nonlinearity = None + else: + self.nonlinearity = get_activation(non_linearity) + + self.upsample = None + if add_upsample: + self.upsample = Downsample1D(out_channels, use_conv=True) + + self.downsample = None + if add_downsample: + self.downsample = Downsample1D(out_channels, use_conv=True) + + if self.upsample and self.downsample: + raise ValueError("Block cannot downsample and upsample") + + def forward(self, hidden_states, temb): + hidden_states = self.resnets[0](hidden_states, temb) + for resnet in self.resnets[1:]: + hidden_states = resnet(hidden_states, temb) + + if self.upsample: + hidden_states = self.upsample(hidden_states) + if self.downsample: + self.downsample = 
self.downsample(hidden_states) + + return hidden_states + + +class OutConv1DBlock(nn.Module): + def __init__(self, num_groups_out, out_channels, embed_dim, act_fn): + super().__init__() + self.final_conv1d_1 = nn.Conv1d(embed_dim, embed_dim, 5, padding=2) + self.final_conv1d_gn = nn.GroupNorm(num_groups_out, embed_dim) + self.final_conv1d_act = get_activation(act_fn) + self.final_conv1d_2 = nn.Conv1d(embed_dim, out_channels, 1) + + def forward(self, hidden_states, temb=None): + hidden_states = self.final_conv1d_1(hidden_states) + hidden_states = rearrange_dims(hidden_states) + hidden_states = self.final_conv1d_gn(hidden_states) + hidden_states = rearrange_dims(hidden_states) + hidden_states = self.final_conv1d_act(hidden_states) + hidden_states = self.final_conv1d_2(hidden_states) + return hidden_states + + +class OutValueFunctionBlock(nn.Module): + def __init__(self, fc_dim, embed_dim): + super().__init__() + self.final_block = nn.ModuleList( + [ + nn.Linear(fc_dim + embed_dim, fc_dim // 2), + nn.Mish(), + nn.Linear(fc_dim // 2, 1), + ] + ) + + def forward(self, hidden_states, temb): + hidden_states = hidden_states.view(hidden_states.shape[0], -1) + hidden_states = torch.cat((hidden_states, temb), dim=-1) + for layer in self.final_block: + hidden_states = layer(hidden_states) + + return hidden_states + + +_kernels = { + "linear": [1 / 8, 3 / 8, 3 / 8, 1 / 8], + "cubic": [-0.01171875, -0.03515625, 0.11328125, 0.43359375, 0.43359375, 0.11328125, -0.03515625, -0.01171875], + "lanczos3": [ + 0.003689131001010537, + 0.015056144446134567, + -0.03399861603975296, + -0.066637322306633, + 0.13550527393817902, + 0.44638532400131226, + 0.44638532400131226, + 0.13550527393817902, + -0.066637322306633, + -0.03399861603975296, + 0.015056144446134567, + 0.003689131001010537, + ], +} + + +class Downsample1d(nn.Module): + def __init__(self, kernel="linear", pad_mode="reflect"): + super().__init__() + self.pad_mode = pad_mode + kernel_1d = torch.tensor(_kernels[kernel]) + self.pad = kernel_1d.shape[0] // 2 - 1 + self.register_buffer("kernel", kernel_1d) + + def forward(self, hidden_states): + hidden_states = F.pad(hidden_states, (self.pad,) * 2, self.pad_mode) + weight = hidden_states.new_zeros([hidden_states.shape[1], hidden_states.shape[1], self.kernel.shape[0]]) + indices = torch.arange(hidden_states.shape[1], device=hidden_states.device) + kernel = self.kernel.to(weight)[None, :].expand(hidden_states.shape[1], -1) + weight[indices, indices] = kernel + return F.conv1d(hidden_states, weight, stride=2) + + +class Upsample1d(nn.Module): + def __init__(self, kernel="linear", pad_mode="reflect"): + super().__init__() + self.pad_mode = pad_mode + kernel_1d = torch.tensor(_kernels[kernel]) * 2 + self.pad = kernel_1d.shape[0] // 2 - 1 + self.register_buffer("kernel", kernel_1d) + + def forward(self, hidden_states, temb=None): + hidden_states = F.pad(hidden_states, ((self.pad + 1) // 2,) * 2, self.pad_mode) + weight = hidden_states.new_zeros([hidden_states.shape[1], hidden_states.shape[1], self.kernel.shape[0]]) + indices = torch.arange(hidden_states.shape[1], device=hidden_states.device) + kernel = self.kernel.to(weight)[None, :].expand(hidden_states.shape[1], -1) + weight[indices, indices] = kernel + return F.conv_transpose1d(hidden_states, weight, stride=2, padding=self.pad * 2 + 1) + + +class SelfAttention1d(nn.Module): + def __init__(self, in_channels, n_head=1, dropout_rate=0.0): + super().__init__() + self.channels = in_channels + self.group_norm = nn.GroupNorm(1, num_channels=in_channels) + 
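+        # a single-group GroupNorm normalizes across all channels (and positions) before the query/key/value projections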
self.num_heads = n_head + + self.query = nn.Linear(self.channels, self.channels) + self.key = nn.Linear(self.channels, self.channels) + self.value = nn.Linear(self.channels, self.channels) + + self.proj_attn = nn.Linear(self.channels, self.channels, bias=True) + + self.dropout = nn.Dropout(dropout_rate, inplace=True) + + def transpose_for_scores(self, projection: torch.Tensor) -> torch.Tensor: + new_projection_shape = projection.size()[:-1] + (self.num_heads, -1) + # move heads to 2nd position (B, T, H * D) -> (B, T, H, D) -> (B, H, T, D) + new_projection = projection.view(new_projection_shape).permute(0, 2, 1, 3) + return new_projection + + def forward(self, hidden_states): + residual = hidden_states + batch, channel_dim, seq = hidden_states.shape + + hidden_states = self.group_norm(hidden_states) + hidden_states = hidden_states.transpose(1, 2) + + query_proj = self.query(hidden_states) + key_proj = self.key(hidden_states) + value_proj = self.value(hidden_states) + + query_states = self.transpose_for_scores(query_proj) + key_states = self.transpose_for_scores(key_proj) + value_states = self.transpose_for_scores(value_proj) + + scale = 1 / math.sqrt(math.sqrt(key_states.shape[-1])) + + attention_scores = torch.matmul(query_states * scale, key_states.transpose(-1, -2) * scale) + attention_probs = torch.softmax(attention_scores, dim=-1) + + # compute attention output + hidden_states = torch.matmul(attention_probs, value_states) + + hidden_states = hidden_states.permute(0, 2, 1, 3).contiguous() + new_hidden_states_shape = hidden_states.size()[:-2] + (self.channels,) + hidden_states = hidden_states.view(new_hidden_states_shape) + + # compute next hidden_states + hidden_states = self.proj_attn(hidden_states) + hidden_states = hidden_states.transpose(1, 2) + hidden_states = self.dropout(hidden_states) + + output = hidden_states + residual + + return output + + +class ResConvBlock(nn.Module): + def __init__(self, in_channels, mid_channels, out_channels, is_last=False): + super().__init__() + self.is_last = is_last + self.has_conv_skip = in_channels != out_channels + + if self.has_conv_skip: + self.conv_skip = nn.Conv1d(in_channels, out_channels, 1, bias=False) + + self.conv_1 = nn.Conv1d(in_channels, mid_channels, 5, padding=2) + self.group_norm_1 = nn.GroupNorm(1, mid_channels) + self.gelu_1 = nn.GELU() + self.conv_2 = nn.Conv1d(mid_channels, out_channels, 5, padding=2) + + if not self.is_last: + self.group_norm_2 = nn.GroupNorm(1, out_channels) + self.gelu_2 = nn.GELU() + + def forward(self, hidden_states): + residual = self.conv_skip(hidden_states) if self.has_conv_skip else hidden_states + + hidden_states = self.conv_1(hidden_states) + hidden_states = self.group_norm_1(hidden_states) + hidden_states = self.gelu_1(hidden_states) + hidden_states = self.conv_2(hidden_states) + + if not self.is_last: + hidden_states = self.group_norm_2(hidden_states) + hidden_states = self.gelu_2(hidden_states) + + output = hidden_states + residual + return output + + +class UNetMidBlock1D(nn.Module): + def __init__(self, mid_channels, in_channels, out_channels=None): + super().__init__() + + out_channels = in_channels if out_channels is None else out_channels + + # there is always at least one resnet + self.down = Downsample1d("cubic") + resnets = [ + ResConvBlock(in_channels, mid_channels, mid_channels), + ResConvBlock(mid_channels, mid_channels, mid_channels), + ResConvBlock(mid_channels, mid_channels, mid_channels), + ResConvBlock(mid_channels, mid_channels, mid_channels), + ResConvBlock(mid_channels, 
mid_channels, mid_channels), + ResConvBlock(mid_channels, mid_channels, out_channels), + ] + attentions = [ + SelfAttention1d(mid_channels, mid_channels // 32), + SelfAttention1d(mid_channels, mid_channels // 32), + SelfAttention1d(mid_channels, mid_channels // 32), + SelfAttention1d(mid_channels, mid_channels // 32), + SelfAttention1d(mid_channels, mid_channels // 32), + SelfAttention1d(out_channels, out_channels // 32), + ] + self.up = Upsample1d(kernel="cubic") + + self.attentions = nn.ModuleList(attentions) + self.resnets = nn.ModuleList(resnets) + + def forward(self, hidden_states, temb=None): + hidden_states = self.down(hidden_states) + for attn, resnet in zip(self.attentions, self.resnets): + hidden_states = resnet(hidden_states) + hidden_states = attn(hidden_states) + + hidden_states = self.up(hidden_states) + + return hidden_states + + +class AttnDownBlock1D(nn.Module): + def __init__(self, out_channels, in_channels, mid_channels=None): + super().__init__() + mid_channels = out_channels if mid_channels is None else mid_channels + + self.down = Downsample1d("cubic") + resnets = [ + ResConvBlock(in_channels, mid_channels, mid_channels), + ResConvBlock(mid_channels, mid_channels, mid_channels), + ResConvBlock(mid_channels, mid_channels, out_channels), + ] + attentions = [ + SelfAttention1d(mid_channels, mid_channels // 32), + SelfAttention1d(mid_channels, mid_channels // 32), + SelfAttention1d(out_channels, out_channels // 32), + ] + + self.attentions = nn.ModuleList(attentions) + self.resnets = nn.ModuleList(resnets) + + def forward(self, hidden_states, temb=None): + hidden_states = self.down(hidden_states) + + for resnet, attn in zip(self.resnets, self.attentions): + hidden_states = resnet(hidden_states) + hidden_states = attn(hidden_states) + + return hidden_states, (hidden_states,) + + +class DownBlock1D(nn.Module): + def __init__(self, out_channels, in_channels, mid_channels=None): + super().__init__() + mid_channels = out_channels if mid_channels is None else mid_channels + + self.down = Downsample1d("cubic") + resnets = [ + ResConvBlock(in_channels, mid_channels, mid_channels), + ResConvBlock(mid_channels, mid_channels, mid_channels), + ResConvBlock(mid_channels, mid_channels, out_channels), + ] + + self.resnets = nn.ModuleList(resnets) + + def forward(self, hidden_states, temb=None): + hidden_states = self.down(hidden_states) + + for resnet in self.resnets: + hidden_states = resnet(hidden_states) + + return hidden_states, (hidden_states,) + + +class DownBlock1DNoSkip(nn.Module): + def __init__(self, out_channels, in_channels, mid_channels=None): + super().__init__() + mid_channels = out_channels if mid_channels is None else mid_channels + + resnets = [ + ResConvBlock(in_channels, mid_channels, mid_channels), + ResConvBlock(mid_channels, mid_channels, mid_channels), + ResConvBlock(mid_channels, mid_channels, out_channels), + ] + + self.resnets = nn.ModuleList(resnets) + + def forward(self, hidden_states, temb=None): + hidden_states = torch.cat([hidden_states, temb], dim=1) + for resnet in self.resnets: + hidden_states = resnet(hidden_states) + + return hidden_states, (hidden_states,) + + +class AttnUpBlock1D(nn.Module): + def __init__(self, in_channels, out_channels, mid_channels=None): + super().__init__() + mid_channels = out_channels if mid_channels is None else mid_channels + + resnets = [ + ResConvBlock(2 * in_channels, mid_channels, mid_channels), + ResConvBlock(mid_channels, mid_channels, mid_channels), + ResConvBlock(mid_channels, mid_channels, out_channels), + ] + 
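+        # the first ResConvBlock takes 2 * in_channels because forward() concatenates the skip tensor from the matching down block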
attentions = [ + SelfAttention1d(mid_channels, mid_channels // 32), + SelfAttention1d(mid_channels, mid_channels // 32), + SelfAttention1d(out_channels, out_channels // 32), + ] + + self.attentions = nn.ModuleList(attentions) + self.resnets = nn.ModuleList(resnets) + self.up = Upsample1d(kernel="cubic") + + def forward(self, hidden_states, res_hidden_states_tuple, temb=None): + res_hidden_states = res_hidden_states_tuple[-1] + hidden_states = torch.cat([hidden_states, res_hidden_states], dim=1) + + for resnet, attn in zip(self.resnets, self.attentions): + hidden_states = resnet(hidden_states) + hidden_states = attn(hidden_states) + + hidden_states = self.up(hidden_states) + + return hidden_states + + +class UpBlock1D(nn.Module): + def __init__(self, in_channels, out_channels, mid_channels=None): + super().__init__() + mid_channels = in_channels if mid_channels is None else mid_channels + + resnets = [ + ResConvBlock(2 * in_channels, mid_channels, mid_channels), + ResConvBlock(mid_channels, mid_channels, mid_channels), + ResConvBlock(mid_channels, mid_channels, out_channels), + ] + + self.resnets = nn.ModuleList(resnets) + self.up = Upsample1d(kernel="cubic") + + def forward(self, hidden_states, res_hidden_states_tuple, temb=None): + res_hidden_states = res_hidden_states_tuple[-1] + hidden_states = torch.cat([hidden_states, res_hidden_states], dim=1) + + for resnet in self.resnets: + hidden_states = resnet(hidden_states) + + hidden_states = self.up(hidden_states) + + return hidden_states + + +class UpBlock1DNoSkip(nn.Module): + def __init__(self, in_channels, out_channels, mid_channels=None): + super().__init__() + mid_channels = in_channels if mid_channels is None else mid_channels + + resnets = [ + ResConvBlock(2 * in_channels, mid_channels, mid_channels), + ResConvBlock(mid_channels, mid_channels, mid_channels), + ResConvBlock(mid_channels, mid_channels, out_channels, is_last=True), + ] + + self.resnets = nn.ModuleList(resnets) + + def forward(self, hidden_states, res_hidden_states_tuple, temb=None): + res_hidden_states = res_hidden_states_tuple[-1] + hidden_states = torch.cat([hidden_states, res_hidden_states], dim=1) + + for resnet in self.resnets: + hidden_states = resnet(hidden_states) + + return hidden_states + + +def get_down_block(down_block_type, num_layers, in_channels, out_channels, temb_channels, add_downsample): + if down_block_type == "DownResnetBlock1D": + return DownResnetBlock1D( + in_channels=in_channels, + num_layers=num_layers, + out_channels=out_channels, + temb_channels=temb_channels, + add_downsample=add_downsample, + ) + elif down_block_type == "DownBlock1D": + return DownBlock1D(out_channels=out_channels, in_channels=in_channels) + elif down_block_type == "AttnDownBlock1D": + return AttnDownBlock1D(out_channels=out_channels, in_channels=in_channels) + elif down_block_type == "DownBlock1DNoSkip": + return DownBlock1DNoSkip(out_channels=out_channels, in_channels=in_channels) + raise ValueError(f"{down_block_type} does not exist.") + + +def get_up_block(up_block_type, num_layers, in_channels, out_channels, temb_channels, add_upsample): + if up_block_type == "UpResnetBlock1D": + return UpResnetBlock1D( + in_channels=in_channels, + num_layers=num_layers, + out_channels=out_channels, + temb_channels=temb_channels, + add_upsample=add_upsample, + ) + elif up_block_type == "UpBlock1D": + return UpBlock1D(in_channels=in_channels, out_channels=out_channels) + elif up_block_type == "AttnUpBlock1D": + return AttnUpBlock1D(in_channels=in_channels, out_channels=out_channels) + 
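+    # UpBlock1D, AttnUpBlock1D and UpBlock1DNoSkip are configured from channel sizes alone;
+    # num_layers, temb_channels and add_upsample only apply to UpResnetBlock1D above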
elif up_block_type == "UpBlock1DNoSkip": + return UpBlock1DNoSkip(in_channels=in_channels, out_channels=out_channels) + raise ValueError(f"{up_block_type} does not exist.") + + +def get_mid_block(mid_block_type, num_layers, in_channels, mid_channels, out_channels, embed_dim, add_downsample): + if mid_block_type == "MidResTemporalBlock1D": + return MidResTemporalBlock1D( + num_layers=num_layers, + in_channels=in_channels, + out_channels=out_channels, + embed_dim=embed_dim, + add_downsample=add_downsample, + ) + elif mid_block_type == "ValueFunctionMidBlock1D": + return ValueFunctionMidBlock1D(in_channels=in_channels, out_channels=out_channels, embed_dim=embed_dim) + elif mid_block_type == "UNetMidBlock1D": + return UNetMidBlock1D(in_channels=in_channels, mid_channels=mid_channels, out_channels=out_channels) + raise ValueError(f"{mid_block_type} does not exist.") + + +def get_out_block(*, out_block_type, num_groups_out, embed_dim, out_channels, act_fn, fc_dim): + if out_block_type == "OutConv1DBlock": + return OutConv1DBlock(num_groups_out, out_channels, embed_dim, act_fn) + elif out_block_type == "ValueFunction": + return OutValueFunctionBlock(fc_dim, embed_dim) + return None diff --git a/gf/sd-webui-rich-text/diffusers_official/models/unet_2d.py b/gf/sd-webui-rich-text/diffusers_official/models/unet_2d.py new file mode 100644 index 0000000000000000000000000000000000000000..3b17acd3d829519465ec0d8daa41b16184aa70f2 --- /dev/null +++ b/gf/sd-webui-rich-text/diffusers_official/models/unet_2d.py @@ -0,0 +1,329 @@ +# Copyright 2023 The HuggingFace Team. All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +from dataclasses import dataclass +from typing import Optional, Tuple, Union + +import torch +import torch.nn as nn + +from ..configuration_utils import ConfigMixin, register_to_config +from ..utils import BaseOutput +from .embeddings import GaussianFourierProjection, TimestepEmbedding, Timesteps +from .modeling_utils import ModelMixin +from .unet_2d_blocks import UNetMidBlock2D, get_down_block, get_up_block + + +@dataclass +class UNet2DOutput(BaseOutput): + """ + The output of [`UNet2DModel`]. + + Args: + sample (`torch.FloatTensor` of shape `(batch_size, num_channels, height, width)`): + The hidden states output from the last layer of the model. + """ + + sample: torch.FloatTensor + + +class UNet2DModel(ModelMixin, ConfigMixin): + r""" + A 2D UNet model that takes a noisy sample and a timestep and returns a sample shaped output. + + This model inherits from [`ModelMixin`]. Check the superclass documentation for it's generic methods implemented + for all models (such as downloading or saving). + + Parameters: + sample_size (`int` or `Tuple[int, int]`, *optional*, defaults to `None`): + Height and width of input/output sample. Dimensions must be a multiple of `2 ** (len(block_out_channels) - + 1)`. + in_channels (`int`, *optional*, defaults to 3): Number of channels in the input sample. + out_channels (`int`, *optional*, defaults to 3): Number of channels in the output. 
+ center_input_sample (`bool`, *optional*, defaults to `False`): Whether to center the input sample. + time_embedding_type (`str`, *optional*, defaults to `"positional"`): Type of time embedding to use. + freq_shift (`int`, *optional*, defaults to 0): Frequency shift for Fourier time embedding. + flip_sin_to_cos (`bool`, *optional*, defaults to `True`): + Whether to flip sin to cos for Fourier time embedding. + down_block_types (`Tuple[str]`, *optional*, defaults to `("DownBlock2D", "AttnDownBlock2D", "AttnDownBlock2D", "AttnDownBlock2D")`): + Tuple of downsample block types. + mid_block_type (`str`, *optional*, defaults to `"UNetMidBlock2D"`): + Block type for middle of UNet, it can be either `UNetMidBlock2D` or `UnCLIPUNetMidBlock2D`. + up_block_types (`Tuple[str]`, *optional*, defaults to `("AttnUpBlock2D", "AttnUpBlock2D", "AttnUpBlock2D", "UpBlock2D")`): + Tuple of upsample block types. + block_out_channels (`Tuple[int]`, *optional*, defaults to `(224, 448, 672, 896)`): + Tuple of block output channels. + layers_per_block (`int`, *optional*, defaults to `2`): The number of layers per block. + mid_block_scale_factor (`float`, *optional*, defaults to `1`): The scale factor for the mid block. + downsample_padding (`int`, *optional*, defaults to `1`): The padding for the downsample convolution. + downsample_type (`str`, *optional*, defaults to `conv`): + The downsample type for downsampling layers. Choose between "conv" and "resnet" + upsample_type (`str`, *optional*, defaults to `conv`): + The upsample type for upsampling layers. Choose between "conv" and "resnet" + act_fn (`str`, *optional*, defaults to `"silu"`): The activation function to use. + attention_head_dim (`int`, *optional*, defaults to `8`): The attention head dimension. + norm_num_groups (`int`, *optional*, defaults to `32`): The number of groups for normalization. + norm_eps (`float`, *optional*, defaults to `1e-5`): The epsilon for normalization. + resnet_time_scale_shift (`str`, *optional*, defaults to `"default"`): Time scale shift config + for ResNet blocks (see [`~models.resnet.ResnetBlock2D`]). Choose from `default` or `scale_shift`. + class_embed_type (`str`, *optional*, defaults to `None`): + The type of class embedding to use which is ultimately summed with the time embeddings. Choose from `None`, + `"timestep"`, or `"identity"`. + num_class_embeds (`int`, *optional*, defaults to `None`): + Input dimension of the learnable embedding matrix to be projected to `time_embed_dim` when performing class + conditioning with `class_embed_type` equal to `None`. 
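+
+    Example (an illustrative sketch with the default block configuration; the sample size and timestep below are assumptions):
+
+        >>> import torch
+        >>> model = UNet2DModel(sample_size=64)
+        >>> noisy = torch.randn(1, 3, 64, 64)
+        >>> model(noisy, timestep=10).sample.shape
+        torch.Size([1, 3, 64, 64])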
+ """ + + @register_to_config + def __init__( + self, + sample_size: Optional[Union[int, Tuple[int, int]]] = None, + in_channels: int = 3, + out_channels: int = 3, + center_input_sample: bool = False, + time_embedding_type: str = "positional", + freq_shift: int = 0, + flip_sin_to_cos: bool = True, + down_block_types: Tuple[str] = ("DownBlock2D", "AttnDownBlock2D", "AttnDownBlock2D", "AttnDownBlock2D"), + up_block_types: Tuple[str] = ("AttnUpBlock2D", "AttnUpBlock2D", "AttnUpBlock2D", "UpBlock2D"), + block_out_channels: Tuple[int] = (224, 448, 672, 896), + layers_per_block: int = 2, + mid_block_scale_factor: float = 1, + downsample_padding: int = 1, + downsample_type: str = "conv", + upsample_type: str = "conv", + act_fn: str = "silu", + attention_head_dim: Optional[int] = 8, + norm_num_groups: int = 32, + norm_eps: float = 1e-5, + resnet_time_scale_shift: str = "default", + add_attention: bool = True, + class_embed_type: Optional[str] = None, + num_class_embeds: Optional[int] = None, + ): + super().__init__() + + self.sample_size = sample_size + time_embed_dim = block_out_channels[0] * 4 + + # Check inputs + if len(down_block_types) != len(up_block_types): + raise ValueError( + f"Must provide the same number of `down_block_types` as `up_block_types`. `down_block_types`: {down_block_types}. `up_block_types`: {up_block_types}." + ) + + if len(block_out_channels) != len(down_block_types): + raise ValueError( + f"Must provide the same number of `block_out_channels` as `down_block_types`. `block_out_channels`: {block_out_channels}. `down_block_types`: {down_block_types}." + ) + + # input + self.conv_in = nn.Conv2d(in_channels, block_out_channels[0], kernel_size=3, padding=(1, 1)) + + # time + if time_embedding_type == "fourier": + self.time_proj = GaussianFourierProjection(embedding_size=block_out_channels[0], scale=16) + timestep_input_dim = 2 * block_out_channels[0] + elif time_embedding_type == "positional": + self.time_proj = Timesteps(block_out_channels[0], flip_sin_to_cos, freq_shift) + timestep_input_dim = block_out_channels[0] + + self.time_embedding = TimestepEmbedding(timestep_input_dim, time_embed_dim) + + # class embedding + if class_embed_type is None and num_class_embeds is not None: + self.class_embedding = nn.Embedding(num_class_embeds, time_embed_dim) + elif class_embed_type == "timestep": + self.class_embedding = TimestepEmbedding(timestep_input_dim, time_embed_dim) + elif class_embed_type == "identity": + self.class_embedding = nn.Identity(time_embed_dim, time_embed_dim) + else: + self.class_embedding = None + + self.down_blocks = nn.ModuleList([]) + self.mid_block = None + self.up_blocks = nn.ModuleList([]) + + # down + output_channel = block_out_channels[0] + for i, down_block_type in enumerate(down_block_types): + input_channel = output_channel + output_channel = block_out_channels[i] + is_final_block = i == len(block_out_channels) - 1 + + down_block = get_down_block( + down_block_type, + num_layers=layers_per_block, + in_channels=input_channel, + out_channels=output_channel, + temb_channels=time_embed_dim, + add_downsample=not is_final_block, + resnet_eps=norm_eps, + resnet_act_fn=act_fn, + resnet_groups=norm_num_groups, + attention_head_dim=attention_head_dim if attention_head_dim is not None else output_channel, + downsample_padding=downsample_padding, + resnet_time_scale_shift=resnet_time_scale_shift, + downsample_type=downsample_type, + ) + self.down_blocks.append(down_block) + + # mid + self.mid_block = UNetMidBlock2D( + in_channels=block_out_channels[-1], + 
temb_channels=time_embed_dim, + resnet_eps=norm_eps, + resnet_act_fn=act_fn, + output_scale_factor=mid_block_scale_factor, + resnet_time_scale_shift=resnet_time_scale_shift, + attention_head_dim=attention_head_dim if attention_head_dim is not None else block_out_channels[-1], + resnet_groups=norm_num_groups, + add_attention=add_attention, + ) + + # up + reversed_block_out_channels = list(reversed(block_out_channels)) + output_channel = reversed_block_out_channels[0] + for i, up_block_type in enumerate(up_block_types): + prev_output_channel = output_channel + output_channel = reversed_block_out_channels[i] + input_channel = reversed_block_out_channels[min(i + 1, len(block_out_channels) - 1)] + + is_final_block = i == len(block_out_channels) - 1 + + up_block = get_up_block( + up_block_type, + num_layers=layers_per_block + 1, + in_channels=input_channel, + out_channels=output_channel, + prev_output_channel=prev_output_channel, + temb_channels=time_embed_dim, + add_upsample=not is_final_block, + resnet_eps=norm_eps, + resnet_act_fn=act_fn, + resnet_groups=norm_num_groups, + attention_head_dim=attention_head_dim if attention_head_dim is not None else output_channel, + resnet_time_scale_shift=resnet_time_scale_shift, + upsample_type=upsample_type, + ) + self.up_blocks.append(up_block) + prev_output_channel = output_channel + + # out + num_groups_out = norm_num_groups if norm_num_groups is not None else min(block_out_channels[0] // 4, 32) + self.conv_norm_out = nn.GroupNorm(num_channels=block_out_channels[0], num_groups=num_groups_out, eps=norm_eps) + self.conv_act = nn.SiLU() + self.conv_out = nn.Conv2d(block_out_channels[0], out_channels, kernel_size=3, padding=1) + + def forward( + self, + sample: torch.FloatTensor, + timestep: Union[torch.Tensor, float, int], + class_labels: Optional[torch.Tensor] = None, + return_dict: bool = True, + ) -> Union[UNet2DOutput, Tuple]: + r""" + The [`UNet2DModel`] forward method. + + Args: + sample (`torch.FloatTensor`): + The noisy input tensor with the following shape `(batch, channel, height, width)`. + timestep (`torch.FloatTensor` or `float` or `int`): The number of timesteps to denoise an input. + class_labels (`torch.FloatTensor`, *optional*, defaults to `None`): + Optional class labels for conditioning. Their embeddings will be summed with the timestep embeddings. + return_dict (`bool`, *optional*, defaults to `True`): + Whether or not to return a [`~models.unet_2d.UNet2DOutput`] instead of a plain tuple. + + Returns: + [`~models.unet_2d.UNet2DOutput`] or `tuple`: + If `return_dict` is True, an [`~models.unet_2d.UNet2DOutput`] is returned, otherwise a `tuple` is + returned where the first element is the sample tensor. + """ + # 0. center input if necessary + if self.config.center_input_sample: + sample = 2 * sample - 1.0 + + # 1. time + timesteps = timestep + if not torch.is_tensor(timesteps): + timesteps = torch.tensor([timesteps], dtype=torch.long, device=sample.device) + elif torch.is_tensor(timesteps) and len(timesteps.shape) == 0: + timesteps = timesteps[None].to(sample.device) + + # broadcast to batch dimension in a way that's compatible with ONNX/Core ML + timesteps = timesteps * torch.ones(sample.shape[0], dtype=timesteps.dtype, device=timesteps.device) + + t_emb = self.time_proj(timesteps) + + # timesteps does not contain any weights and will always return f32 tensors + # but time_embedding might actually be running in fp16. so we need to cast here. + # there might be better ways to encapsulate this. 
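+        # e.g. with fp16 weights self.dtype is torch.float16, so the float32 projection is cast down before the TimestepEmbedding MLP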
+ t_emb = t_emb.to(dtype=self.dtype) + emb = self.time_embedding(t_emb) + + if self.class_embedding is not None: + if class_labels is None: + raise ValueError("class_labels should be provided when doing class conditioning") + + if self.config.class_embed_type == "timestep": + class_labels = self.time_proj(class_labels) + + class_emb = self.class_embedding(class_labels).to(dtype=self.dtype) + emb = emb + class_emb + + # 2. pre-process + skip_sample = sample + sample = self.conv_in(sample) + + # 3. down + down_block_res_samples = (sample,) + for downsample_block in self.down_blocks: + if hasattr(downsample_block, "skip_conv"): + sample, res_samples, skip_sample = downsample_block( + hidden_states=sample, temb=emb, skip_sample=skip_sample + ) + else: + sample, res_samples = downsample_block(hidden_states=sample, temb=emb) + + down_block_res_samples += res_samples + + # 4. mid + sample = self.mid_block(sample, emb) + + # 5. up + skip_sample = None + for upsample_block in self.up_blocks: + res_samples = down_block_res_samples[-len(upsample_block.resnets) :] + down_block_res_samples = down_block_res_samples[: -len(upsample_block.resnets)] + + if hasattr(upsample_block, "skip_conv"): + sample, skip_sample = upsample_block(sample, res_samples, emb, skip_sample) + else: + sample = upsample_block(sample, res_samples, emb) + + # 6. post-process + sample = self.conv_norm_out(sample) + sample = self.conv_act(sample) + sample = self.conv_out(sample) + + if skip_sample is not None: + sample += skip_sample + + if self.config.time_embedding_type == "fourier": + timesteps = timesteps.reshape((sample.shape[0], *([1] * len(sample.shape[1:])))) + sample = sample / timesteps + + if not return_dict: + return (sample,) + + return UNet2DOutput(sample=sample) diff --git a/gf/sd-webui-rich-text/diffusers_official/models/unet_2d_blocks.py b/gf/sd-webui-rich-text/diffusers_official/models/unet_2d_blocks.py new file mode 100644 index 0000000000000000000000000000000000000000..cb3452f4459c32b1b508f2448bbfe7460d519a04 --- /dev/null +++ b/gf/sd-webui-rich-text/diffusers_official/models/unet_2d_blocks.py @@ -0,0 +1,3175 @@ +# Copyright 2023 The HuggingFace Team. All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
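+
+# This module defines the ResNet/attention blocks plus the get_down_block / get_up_block factories used by the
+# 2D UNet models in this package. Rough usage sketch (illustrative channel sizes, not values taken from this file):
+#     block = get_down_block(
+#         "CrossAttnDownBlock2D", num_layers=2, in_channels=320, out_channels=640, temb_channels=1280,
+#         add_downsample=True, resnet_eps=1e-5, resnet_act_fn="silu",
+#         cross_attention_dim=768, num_attention_heads=8,
+#     )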
+from typing import Any, Dict, Optional, Tuple + +import numpy as np +import torch +import torch.nn.functional as F +from torch import nn + +from ..utils import is_torch_version, logging +from .attention import AdaGroupNorm +from .attention_processor import Attention, AttnAddedKVProcessor, AttnAddedKVProcessor2_0 +from .dual_transformer_2d import DualTransformer2DModel +from .resnet import Downsample2D, FirDownsample2D, FirUpsample2D, KDownsample2D, KUpsample2D, ResnetBlock2D, Upsample2D +from .transformer_2d import Transformer2DModel + + +logger = logging.get_logger(__name__) # pylint: disable=invalid-name + + +def get_down_block( + down_block_type, + num_layers, + in_channels, + out_channels, + temb_channels, + add_downsample, + resnet_eps, + resnet_act_fn, + transformer_layers_per_block=1, + num_attention_heads=None, + resnet_groups=None, + cross_attention_dim=None, + downsample_padding=None, + dual_cross_attention=False, + use_linear_projection=False, + only_cross_attention=False, + upcast_attention=False, + resnet_time_scale_shift="default", + resnet_skip_time_act=False, + resnet_out_scale_factor=1.0, + cross_attention_norm=None, + attention_head_dim=None, + downsample_type=None, +): + # If attn head dim is not defined, we default it to the number of heads + if attention_head_dim is None: + logger.warn( + f"It is recommended to provide `attention_head_dim` when calling `get_down_block`. Defaulting `attention_head_dim` to {num_attention_heads}." + ) + attention_head_dim = num_attention_heads + + down_block_type = down_block_type[7:] if down_block_type.startswith("UNetRes") else down_block_type + if down_block_type == "DownBlock2D": + return DownBlock2D( + num_layers=num_layers, + in_channels=in_channels, + out_channels=out_channels, + temb_channels=temb_channels, + add_downsample=add_downsample, + resnet_eps=resnet_eps, + resnet_act_fn=resnet_act_fn, + resnet_groups=resnet_groups, + downsample_padding=downsample_padding, + resnet_time_scale_shift=resnet_time_scale_shift, + ) + elif down_block_type == "ResnetDownsampleBlock2D": + return ResnetDownsampleBlock2D( + num_layers=num_layers, + in_channels=in_channels, + out_channels=out_channels, + temb_channels=temb_channels, + add_downsample=add_downsample, + resnet_eps=resnet_eps, + resnet_act_fn=resnet_act_fn, + resnet_groups=resnet_groups, + resnet_time_scale_shift=resnet_time_scale_shift, + skip_time_act=resnet_skip_time_act, + output_scale_factor=resnet_out_scale_factor, + ) + elif down_block_type == "AttnDownBlock2D": + if add_downsample is False: + downsample_type = None + else: + downsample_type = downsample_type or "conv" # default to 'conv' + return AttnDownBlock2D( + num_layers=num_layers, + in_channels=in_channels, + out_channels=out_channels, + temb_channels=temb_channels, + resnet_eps=resnet_eps, + resnet_act_fn=resnet_act_fn, + resnet_groups=resnet_groups, + downsample_padding=downsample_padding, + attention_head_dim=attention_head_dim, + resnet_time_scale_shift=resnet_time_scale_shift, + downsample_type=downsample_type, + ) + elif down_block_type == "CrossAttnDownBlock2D": + if cross_attention_dim is None: + raise ValueError("cross_attention_dim must be specified for CrossAttnDownBlock2D") + return CrossAttnDownBlock2D( + num_layers=num_layers, + transformer_layers_per_block=transformer_layers_per_block, + in_channels=in_channels, + out_channels=out_channels, + temb_channels=temb_channels, + add_downsample=add_downsample, + resnet_eps=resnet_eps, + resnet_act_fn=resnet_act_fn, + resnet_groups=resnet_groups, + 
downsample_padding=downsample_padding, + cross_attention_dim=cross_attention_dim, + num_attention_heads=num_attention_heads, + dual_cross_attention=dual_cross_attention, + use_linear_projection=use_linear_projection, + only_cross_attention=only_cross_attention, + upcast_attention=upcast_attention, + resnet_time_scale_shift=resnet_time_scale_shift, + ) + elif down_block_type == "SimpleCrossAttnDownBlock2D": + if cross_attention_dim is None: + raise ValueError("cross_attention_dim must be specified for SimpleCrossAttnDownBlock2D") + return SimpleCrossAttnDownBlock2D( + num_layers=num_layers, + in_channels=in_channels, + out_channels=out_channels, + temb_channels=temb_channels, + add_downsample=add_downsample, + resnet_eps=resnet_eps, + resnet_act_fn=resnet_act_fn, + resnet_groups=resnet_groups, + cross_attention_dim=cross_attention_dim, + attention_head_dim=attention_head_dim, + resnet_time_scale_shift=resnet_time_scale_shift, + skip_time_act=resnet_skip_time_act, + output_scale_factor=resnet_out_scale_factor, + only_cross_attention=only_cross_attention, + cross_attention_norm=cross_attention_norm, + ) + elif down_block_type == "SkipDownBlock2D": + return SkipDownBlock2D( + num_layers=num_layers, + in_channels=in_channels, + out_channels=out_channels, + temb_channels=temb_channels, + add_downsample=add_downsample, + resnet_eps=resnet_eps, + resnet_act_fn=resnet_act_fn, + downsample_padding=downsample_padding, + resnet_time_scale_shift=resnet_time_scale_shift, + ) + elif down_block_type == "AttnSkipDownBlock2D": + return AttnSkipDownBlock2D( + num_layers=num_layers, + in_channels=in_channels, + out_channels=out_channels, + temb_channels=temb_channels, + add_downsample=add_downsample, + resnet_eps=resnet_eps, + resnet_act_fn=resnet_act_fn, + attention_head_dim=attention_head_dim, + resnet_time_scale_shift=resnet_time_scale_shift, + ) + elif down_block_type == "DownEncoderBlock2D": + return DownEncoderBlock2D( + num_layers=num_layers, + in_channels=in_channels, + out_channels=out_channels, + add_downsample=add_downsample, + resnet_eps=resnet_eps, + resnet_act_fn=resnet_act_fn, + resnet_groups=resnet_groups, + downsample_padding=downsample_padding, + resnet_time_scale_shift=resnet_time_scale_shift, + ) + elif down_block_type == "AttnDownEncoderBlock2D": + return AttnDownEncoderBlock2D( + num_layers=num_layers, + in_channels=in_channels, + out_channels=out_channels, + add_downsample=add_downsample, + resnet_eps=resnet_eps, + resnet_act_fn=resnet_act_fn, + resnet_groups=resnet_groups, + downsample_padding=downsample_padding, + attention_head_dim=attention_head_dim, + resnet_time_scale_shift=resnet_time_scale_shift, + ) + elif down_block_type == "KDownBlock2D": + return KDownBlock2D( + num_layers=num_layers, + in_channels=in_channels, + out_channels=out_channels, + temb_channels=temb_channels, + add_downsample=add_downsample, + resnet_eps=resnet_eps, + resnet_act_fn=resnet_act_fn, + ) + elif down_block_type == "KCrossAttnDownBlock2D": + return KCrossAttnDownBlock2D( + num_layers=num_layers, + in_channels=in_channels, + out_channels=out_channels, + temb_channels=temb_channels, + add_downsample=add_downsample, + resnet_eps=resnet_eps, + resnet_act_fn=resnet_act_fn, + cross_attention_dim=cross_attention_dim, + attention_head_dim=attention_head_dim, + add_self_attention=True if not add_downsample else False, + ) + raise ValueError(f"{down_block_type} does not exist.") + + +def get_up_block( + up_block_type, + num_layers, + in_channels, + out_channels, + prev_output_channel, + temb_channels, + 
add_upsample, + resnet_eps, + resnet_act_fn, + transformer_layers_per_block=1, + num_attention_heads=None, + resnet_groups=None, + cross_attention_dim=None, + dual_cross_attention=False, + use_linear_projection=False, + only_cross_attention=False, + upcast_attention=False, + resnet_time_scale_shift="default", + resnet_skip_time_act=False, + resnet_out_scale_factor=1.0, + cross_attention_norm=None, + attention_head_dim=None, + upsample_type=None, +): + # If attn head dim is not defined, we default it to the number of heads + if attention_head_dim is None: + logger.warn( + f"It is recommended to provide `attention_head_dim` when calling `get_up_block`. Defaulting `attention_head_dim` to {num_attention_heads}." + ) + attention_head_dim = num_attention_heads + + up_block_type = up_block_type[7:] if up_block_type.startswith("UNetRes") else up_block_type + if up_block_type == "UpBlock2D": + return UpBlock2D( + num_layers=num_layers, + in_channels=in_channels, + out_channels=out_channels, + prev_output_channel=prev_output_channel, + temb_channels=temb_channels, + add_upsample=add_upsample, + resnet_eps=resnet_eps, + resnet_act_fn=resnet_act_fn, + resnet_groups=resnet_groups, + resnet_time_scale_shift=resnet_time_scale_shift, + ) + elif up_block_type == "ResnetUpsampleBlock2D": + return ResnetUpsampleBlock2D( + num_layers=num_layers, + in_channels=in_channels, + out_channels=out_channels, + prev_output_channel=prev_output_channel, + temb_channels=temb_channels, + add_upsample=add_upsample, + resnet_eps=resnet_eps, + resnet_act_fn=resnet_act_fn, + resnet_groups=resnet_groups, + resnet_time_scale_shift=resnet_time_scale_shift, + skip_time_act=resnet_skip_time_act, + output_scale_factor=resnet_out_scale_factor, + ) + elif up_block_type == "CrossAttnUpBlock2D": + if cross_attention_dim is None: + raise ValueError("cross_attention_dim must be specified for CrossAttnUpBlock2D") + return CrossAttnUpBlock2D( + num_layers=num_layers, + transformer_layers_per_block=transformer_layers_per_block, + in_channels=in_channels, + out_channels=out_channels, + prev_output_channel=prev_output_channel, + temb_channels=temb_channels, + add_upsample=add_upsample, + resnet_eps=resnet_eps, + resnet_act_fn=resnet_act_fn, + resnet_groups=resnet_groups, + cross_attention_dim=cross_attention_dim, + num_attention_heads=num_attention_heads, + dual_cross_attention=dual_cross_attention, + use_linear_projection=use_linear_projection, + only_cross_attention=only_cross_attention, + upcast_attention=upcast_attention, + resnet_time_scale_shift=resnet_time_scale_shift, + ) + elif up_block_type == "SimpleCrossAttnUpBlock2D": + if cross_attention_dim is None: + raise ValueError("cross_attention_dim must be specified for SimpleCrossAttnUpBlock2D") + return SimpleCrossAttnUpBlock2D( + num_layers=num_layers, + in_channels=in_channels, + out_channels=out_channels, + prev_output_channel=prev_output_channel, + temb_channels=temb_channels, + add_upsample=add_upsample, + resnet_eps=resnet_eps, + resnet_act_fn=resnet_act_fn, + resnet_groups=resnet_groups, + cross_attention_dim=cross_attention_dim, + attention_head_dim=attention_head_dim, + resnet_time_scale_shift=resnet_time_scale_shift, + skip_time_act=resnet_skip_time_act, + output_scale_factor=resnet_out_scale_factor, + only_cross_attention=only_cross_attention, + cross_attention_norm=cross_attention_norm, + ) + elif up_block_type == "AttnUpBlock2D": + if add_upsample is False: + upsample_type = None + else: + upsample_type = upsample_type or "conv" # default to 'conv' + + return 
AttnUpBlock2D( + num_layers=num_layers, + in_channels=in_channels, + out_channels=out_channels, + prev_output_channel=prev_output_channel, + temb_channels=temb_channels, + resnet_eps=resnet_eps, + resnet_act_fn=resnet_act_fn, + resnet_groups=resnet_groups, + attention_head_dim=attention_head_dim, + resnet_time_scale_shift=resnet_time_scale_shift, + upsample_type=upsample_type, + ) + elif up_block_type == "SkipUpBlock2D": + return SkipUpBlock2D( + num_layers=num_layers, + in_channels=in_channels, + out_channels=out_channels, + prev_output_channel=prev_output_channel, + temb_channels=temb_channels, + add_upsample=add_upsample, + resnet_eps=resnet_eps, + resnet_act_fn=resnet_act_fn, + resnet_time_scale_shift=resnet_time_scale_shift, + ) + elif up_block_type == "AttnSkipUpBlock2D": + return AttnSkipUpBlock2D( + num_layers=num_layers, + in_channels=in_channels, + out_channels=out_channels, + prev_output_channel=prev_output_channel, + temb_channels=temb_channels, + add_upsample=add_upsample, + resnet_eps=resnet_eps, + resnet_act_fn=resnet_act_fn, + attention_head_dim=attention_head_dim, + resnet_time_scale_shift=resnet_time_scale_shift, + ) + elif up_block_type == "UpDecoderBlock2D": + return UpDecoderBlock2D( + num_layers=num_layers, + in_channels=in_channels, + out_channels=out_channels, + add_upsample=add_upsample, + resnet_eps=resnet_eps, + resnet_act_fn=resnet_act_fn, + resnet_groups=resnet_groups, + resnet_time_scale_shift=resnet_time_scale_shift, + temb_channels=temb_channels, + ) + elif up_block_type == "AttnUpDecoderBlock2D": + return AttnUpDecoderBlock2D( + num_layers=num_layers, + in_channels=in_channels, + out_channels=out_channels, + add_upsample=add_upsample, + resnet_eps=resnet_eps, + resnet_act_fn=resnet_act_fn, + resnet_groups=resnet_groups, + attention_head_dim=attention_head_dim, + resnet_time_scale_shift=resnet_time_scale_shift, + temb_channels=temb_channels, + ) + elif up_block_type == "KUpBlock2D": + return KUpBlock2D( + num_layers=num_layers, + in_channels=in_channels, + out_channels=out_channels, + temb_channels=temb_channels, + add_upsample=add_upsample, + resnet_eps=resnet_eps, + resnet_act_fn=resnet_act_fn, + ) + elif up_block_type == "KCrossAttnUpBlock2D": + return KCrossAttnUpBlock2D( + num_layers=num_layers, + in_channels=in_channels, + out_channels=out_channels, + temb_channels=temb_channels, + add_upsample=add_upsample, + resnet_eps=resnet_eps, + resnet_act_fn=resnet_act_fn, + cross_attention_dim=cross_attention_dim, + attention_head_dim=attention_head_dim, + ) + + raise ValueError(f"{up_block_type} does not exist.") + + +class UNetMidBlock2D(nn.Module): + def __init__( + self, + in_channels: int, + temb_channels: int, + dropout: float = 0.0, + num_layers: int = 1, + resnet_eps: float = 1e-6, + resnet_time_scale_shift: str = "default", # default, spatial + resnet_act_fn: str = "swish", + resnet_groups: int = 32, + resnet_pre_norm: bool = True, + add_attention: bool = True, + attention_head_dim=1, + output_scale_factor=1.0, + ): + super().__init__() + resnet_groups = resnet_groups if resnet_groups is not None else min(in_channels // 4, 32) + self.add_attention = add_attention + + # there is always at least one resnet + resnets = [ + ResnetBlock2D( + in_channels=in_channels, + out_channels=in_channels, + temb_channels=temb_channels, + eps=resnet_eps, + groups=resnet_groups, + dropout=dropout, + time_embedding_norm=resnet_time_scale_shift, + non_linearity=resnet_act_fn, + output_scale_factor=output_scale_factor, + pre_norm=resnet_pre_norm, + ) + ] + attentions = [] + 
+ if attention_head_dim is None: + logger.warn( + f"It is not recommend to pass `attention_head_dim=None`. Defaulting `attention_head_dim` to `in_channels`: {in_channels}." + ) + attention_head_dim = in_channels + + for _ in range(num_layers): + if self.add_attention: + attentions.append( + Attention( + in_channels, + heads=in_channels // attention_head_dim, + dim_head=attention_head_dim, + rescale_output_factor=output_scale_factor, + eps=resnet_eps, + norm_num_groups=resnet_groups if resnet_time_scale_shift == "default" else None, + spatial_norm_dim=temb_channels if resnet_time_scale_shift == "spatial" else None, + residual_connection=True, + bias=True, + upcast_softmax=True, + _from_deprecated_attn_block=True, + ) + ) + else: + attentions.append(None) + + resnets.append( + ResnetBlock2D( + in_channels=in_channels, + out_channels=in_channels, + temb_channels=temb_channels, + eps=resnet_eps, + groups=resnet_groups, + dropout=dropout, + time_embedding_norm=resnet_time_scale_shift, + non_linearity=resnet_act_fn, + output_scale_factor=output_scale_factor, + pre_norm=resnet_pre_norm, + ) + ) + + self.attentions = nn.ModuleList(attentions) + self.resnets = nn.ModuleList(resnets) + + def forward(self, hidden_states, temb=None): + hidden_states = self.resnets[0](hidden_states, temb) + for attn, resnet in zip(self.attentions, self.resnets[1:]): + if attn is not None: + hidden_states = attn(hidden_states, temb=temb) + hidden_states = resnet(hidden_states, temb) + + return hidden_states + + +class UNetMidBlock2DCrossAttn(nn.Module): + def __init__( + self, + in_channels: int, + temb_channels: int, + dropout: float = 0.0, + num_layers: int = 1, + transformer_layers_per_block: int = 1, + resnet_eps: float = 1e-6, + resnet_time_scale_shift: str = "default", + resnet_act_fn: str = "swish", + resnet_groups: int = 32, + resnet_pre_norm: bool = True, + num_attention_heads=1, + output_scale_factor=1.0, + cross_attention_dim=1280, + dual_cross_attention=False, + use_linear_projection=False, + upcast_attention=False, + ): + super().__init__() + + self.has_cross_attention = True + self.num_attention_heads = num_attention_heads + resnet_groups = resnet_groups if resnet_groups is not None else min(in_channels // 4, 32) + + # there is always at least one resnet + resnets = [ + ResnetBlock2D( + in_channels=in_channels, + out_channels=in_channels, + temb_channels=temb_channels, + eps=resnet_eps, + groups=resnet_groups, + dropout=dropout, + time_embedding_norm=resnet_time_scale_shift, + non_linearity=resnet_act_fn, + output_scale_factor=output_scale_factor, + pre_norm=resnet_pre_norm, + ) + ] + attentions = [] + + for _ in range(num_layers): + if not dual_cross_attention: + attentions.append( + Transformer2DModel( + num_attention_heads, + in_channels // num_attention_heads, + in_channels=in_channels, + num_layers=transformer_layers_per_block, + cross_attention_dim=cross_attention_dim, + norm_num_groups=resnet_groups, + use_linear_projection=use_linear_projection, + upcast_attention=upcast_attention, + ) + ) + else: + attentions.append( + DualTransformer2DModel( + num_attention_heads, + in_channels // num_attention_heads, + in_channels=in_channels, + num_layers=1, + cross_attention_dim=cross_attention_dim, + norm_num_groups=resnet_groups, + ) + ) + resnets.append( + ResnetBlock2D( + in_channels=in_channels, + out_channels=in_channels, + temb_channels=temb_channels, + eps=resnet_eps, + groups=resnet_groups, + dropout=dropout, + time_embedding_norm=resnet_time_scale_shift, + non_linearity=resnet_act_fn, + 
output_scale_factor=output_scale_factor, + pre_norm=resnet_pre_norm, + ) + ) + + self.attentions = nn.ModuleList(attentions) + self.resnets = nn.ModuleList(resnets) + + def forward( + self, + hidden_states: torch.FloatTensor, + temb: Optional[torch.FloatTensor] = None, + encoder_hidden_states: Optional[torch.FloatTensor] = None, + attention_mask: Optional[torch.FloatTensor] = None, + cross_attention_kwargs: Optional[Dict[str, Any]] = None, + encoder_attention_mask: Optional[torch.FloatTensor] = None, + ) -> torch.FloatTensor: + hidden_states = self.resnets[0](hidden_states, temb) + for attn, resnet in zip(self.attentions, self.resnets[1:]): + hidden_states = attn( + hidden_states, + encoder_hidden_states=encoder_hidden_states, + cross_attention_kwargs=cross_attention_kwargs, + attention_mask=attention_mask, + encoder_attention_mask=encoder_attention_mask, + return_dict=False, + )[0] + hidden_states = resnet(hidden_states, temb) + + return hidden_states + + +class UNetMidBlock2DSimpleCrossAttn(nn.Module): + def __init__( + self, + in_channels: int, + temb_channels: int, + dropout: float = 0.0, + num_layers: int = 1, + resnet_eps: float = 1e-6, + resnet_time_scale_shift: str = "default", + resnet_act_fn: str = "swish", + resnet_groups: int = 32, + resnet_pre_norm: bool = True, + attention_head_dim=1, + output_scale_factor=1.0, + cross_attention_dim=1280, + skip_time_act=False, + only_cross_attention=False, + cross_attention_norm=None, + ): + super().__init__() + + self.has_cross_attention = True + + self.attention_head_dim = attention_head_dim + resnet_groups = resnet_groups if resnet_groups is not None else min(in_channels // 4, 32) + + self.num_heads = in_channels // self.attention_head_dim + + # there is always at least one resnet + resnets = [ + ResnetBlock2D( + in_channels=in_channels, + out_channels=in_channels, + temb_channels=temb_channels, + eps=resnet_eps, + groups=resnet_groups, + dropout=dropout, + time_embedding_norm=resnet_time_scale_shift, + non_linearity=resnet_act_fn, + output_scale_factor=output_scale_factor, + pre_norm=resnet_pre_norm, + skip_time_act=skip_time_act, + ) + ] + attentions = [] + + for _ in range(num_layers): + processor = ( + AttnAddedKVProcessor2_0() if hasattr(F, "scaled_dot_product_attention") else AttnAddedKVProcessor() + ) + + attentions.append( + Attention( + query_dim=in_channels, + cross_attention_dim=in_channels, + heads=self.num_heads, + dim_head=self.attention_head_dim, + added_kv_proj_dim=cross_attention_dim, + norm_num_groups=resnet_groups, + bias=True, + upcast_softmax=True, + only_cross_attention=only_cross_attention, + cross_attention_norm=cross_attention_norm, + processor=processor, + ) + ) + resnets.append( + ResnetBlock2D( + in_channels=in_channels, + out_channels=in_channels, + temb_channels=temb_channels, + eps=resnet_eps, + groups=resnet_groups, + dropout=dropout, + time_embedding_norm=resnet_time_scale_shift, + non_linearity=resnet_act_fn, + output_scale_factor=output_scale_factor, + pre_norm=resnet_pre_norm, + skip_time_act=skip_time_act, + ) + ) + + self.attentions = nn.ModuleList(attentions) + self.resnets = nn.ModuleList(resnets) + + def forward( + self, + hidden_states: torch.FloatTensor, + temb: Optional[torch.FloatTensor] = None, + encoder_hidden_states: Optional[torch.FloatTensor] = None, + attention_mask: Optional[torch.FloatTensor] = None, + cross_attention_kwargs: Optional[Dict[str, Any]] = None, + encoder_attention_mask: Optional[torch.FloatTensor] = None, + ): + cross_attention_kwargs = cross_attention_kwargs if 
cross_attention_kwargs is not None else {} + + if attention_mask is None: + # if encoder_hidden_states is defined: we are doing cross-attn, so we should use cross-attn mask. + mask = None if encoder_hidden_states is None else encoder_attention_mask + else: + # when attention_mask is defined: we don't even check for encoder_attention_mask. + # this is to maintain compatibility with UnCLIP, which uses 'attention_mask' param for cross-attn masks. + # TODO: UnCLIP should express cross-attn mask via encoder_attention_mask param instead of via attention_mask. + # then we can simplify this whole if/else block to: + # mask = attention_mask if encoder_hidden_states is None else encoder_attention_mask + mask = attention_mask + + hidden_states = self.resnets[0](hidden_states, temb) + for attn, resnet in zip(self.attentions, self.resnets[1:]): + # attn + hidden_states = attn( + hidden_states, + encoder_hidden_states=encoder_hidden_states, + attention_mask=mask, + **cross_attention_kwargs, + ) + + # resnet + hidden_states = resnet(hidden_states, temb) + + return hidden_states + + +class AttnDownBlock2D(nn.Module): + def __init__( + self, + in_channels: int, + out_channels: int, + temb_channels: int, + dropout: float = 0.0, + num_layers: int = 1, + resnet_eps: float = 1e-6, + resnet_time_scale_shift: str = "default", + resnet_act_fn: str = "swish", + resnet_groups: int = 32, + resnet_pre_norm: bool = True, + attention_head_dim=1, + output_scale_factor=1.0, + downsample_padding=1, + downsample_type="conv", + ): + super().__init__() + resnets = [] + attentions = [] + self.downsample_type = downsample_type + + if attention_head_dim is None: + logger.warn( + f"It is not recommend to pass `attention_head_dim=None`. Defaulting `attention_head_dim` to `in_channels`: {out_channels}." 
+ ) + attention_head_dim = out_channels + + for i in range(num_layers): + in_channels = in_channels if i == 0 else out_channels + resnets.append( + ResnetBlock2D( + in_channels=in_channels, + out_channels=out_channels, + temb_channels=temb_channels, + eps=resnet_eps, + groups=resnet_groups, + dropout=dropout, + time_embedding_norm=resnet_time_scale_shift, + non_linearity=resnet_act_fn, + output_scale_factor=output_scale_factor, + pre_norm=resnet_pre_norm, + ) + ) + attentions.append( + Attention( + out_channels, + heads=out_channels // attention_head_dim, + dim_head=attention_head_dim, + rescale_output_factor=output_scale_factor, + eps=resnet_eps, + norm_num_groups=resnet_groups, + residual_connection=True, + bias=True, + upcast_softmax=True, + _from_deprecated_attn_block=True, + ) + ) + + self.attentions = nn.ModuleList(attentions) + self.resnets = nn.ModuleList(resnets) + + if downsample_type == "conv": + self.downsamplers = nn.ModuleList( + [ + Downsample2D( + out_channels, use_conv=True, out_channels=out_channels, padding=downsample_padding, name="op" + ) + ] + ) + elif downsample_type == "resnet": + self.downsamplers = nn.ModuleList( + [ + ResnetBlock2D( + in_channels=out_channels, + out_channels=out_channels, + temb_channels=temb_channels, + eps=resnet_eps, + groups=resnet_groups, + dropout=dropout, + time_embedding_norm=resnet_time_scale_shift, + non_linearity=resnet_act_fn, + output_scale_factor=output_scale_factor, + pre_norm=resnet_pre_norm, + down=True, + ) + ] + ) + else: + self.downsamplers = None + + def forward(self, hidden_states, temb=None, upsample_size=None): + output_states = () + + for resnet, attn in zip(self.resnets, self.attentions): + hidden_states = resnet(hidden_states, temb) + hidden_states = attn(hidden_states) + output_states = output_states + (hidden_states,) + + if self.downsamplers is not None: + for downsampler in self.downsamplers: + if self.downsample_type == "resnet": + hidden_states = downsampler(hidden_states, temb=temb) + else: + hidden_states = downsampler(hidden_states) + + output_states += (hidden_states,) + + return hidden_states, output_states + + +class CrossAttnDownBlock2D(nn.Module): + def __init__( + self, + in_channels: int, + out_channels: int, + temb_channels: int, + dropout: float = 0.0, + num_layers: int = 1, + transformer_layers_per_block: int = 1, + resnet_eps: float = 1e-6, + resnet_time_scale_shift: str = "default", + resnet_act_fn: str = "swish", + resnet_groups: int = 32, + resnet_pre_norm: bool = True, + num_attention_heads=1, + cross_attention_dim=1280, + output_scale_factor=1.0, + downsample_padding=1, + add_downsample=True, + dual_cross_attention=False, + use_linear_projection=False, + only_cross_attention=False, + upcast_attention=False, + ): + super().__init__() + resnets = [] + attentions = [] + + self.has_cross_attention = True + self.num_attention_heads = num_attention_heads + + for i in range(num_layers): + in_channels = in_channels if i == 0 else out_channels + resnets.append( + ResnetBlock2D( + in_channels=in_channels, + out_channels=out_channels, + temb_channels=temb_channels, + eps=resnet_eps, + groups=resnet_groups, + dropout=dropout, + time_embedding_norm=resnet_time_scale_shift, + non_linearity=resnet_act_fn, + output_scale_factor=output_scale_factor, + pre_norm=resnet_pre_norm, + ) + ) + if not dual_cross_attention: + attentions.append( + Transformer2DModel( + num_attention_heads, + out_channels // num_attention_heads, + in_channels=out_channels, + num_layers=transformer_layers_per_block, + 
cross_attention_dim=cross_attention_dim, + norm_num_groups=resnet_groups, + use_linear_projection=use_linear_projection, + only_cross_attention=only_cross_attention, + upcast_attention=upcast_attention, + ) + ) + else: + attentions.append( + DualTransformer2DModel( + num_attention_heads, + out_channels // num_attention_heads, + in_channels=out_channels, + num_layers=1, + cross_attention_dim=cross_attention_dim, + norm_num_groups=resnet_groups, + ) + ) + self.attentions = nn.ModuleList(attentions) + self.resnets = nn.ModuleList(resnets) + + if add_downsample: + self.downsamplers = nn.ModuleList( + [ + Downsample2D( + out_channels, use_conv=True, out_channels=out_channels, padding=downsample_padding, name="op" + ) + ] + ) + else: + self.downsamplers = None + + self.gradient_checkpointing = False + + def forward( + self, + hidden_states: torch.FloatTensor, + temb: Optional[torch.FloatTensor] = None, + encoder_hidden_states: Optional[torch.FloatTensor] = None, + attention_mask: Optional[torch.FloatTensor] = None, + cross_attention_kwargs: Optional[Dict[str, Any]] = None, + encoder_attention_mask: Optional[torch.FloatTensor] = None, + ): + output_states = () + + for resnet, attn in zip(self.resnets, self.attentions): + if self.training and self.gradient_checkpointing: + + def create_custom_forward(module, return_dict=None): + def custom_forward(*inputs): + if return_dict is not None: + return module(*inputs, return_dict=return_dict) + else: + return module(*inputs) + + return custom_forward + + ckpt_kwargs: Dict[str, Any] = {"use_reentrant": False} if is_torch_version(">=", "1.11.0") else {} + hidden_states = torch.utils.checkpoint.checkpoint( + create_custom_forward(resnet), + hidden_states, + temb, + **ckpt_kwargs, + ) + hidden_states = torch.utils.checkpoint.checkpoint( + create_custom_forward(attn, return_dict=False), + hidden_states, + encoder_hidden_states, + None, # timestep + None, # class_labels + cross_attention_kwargs, + attention_mask, + encoder_attention_mask, + **ckpt_kwargs, + )[0] + else: + hidden_states = resnet(hidden_states, temb) + hidden_states = attn( + hidden_states, + encoder_hidden_states=encoder_hidden_states, + cross_attention_kwargs=cross_attention_kwargs, + attention_mask=attention_mask, + encoder_attention_mask=encoder_attention_mask, + return_dict=False, + )[0] + + output_states = output_states + (hidden_states,) + + if self.downsamplers is not None: + for downsampler in self.downsamplers: + hidden_states = downsampler(hidden_states) + + output_states = output_states + (hidden_states,) + + return hidden_states, output_states + + +class DownBlock2D(nn.Module): + def __init__( + self, + in_channels: int, + out_channels: int, + temb_channels: int, + dropout: float = 0.0, + num_layers: int = 1, + resnet_eps: float = 1e-6, + resnet_time_scale_shift: str = "default", + resnet_act_fn: str = "swish", + resnet_groups: int = 32, + resnet_pre_norm: bool = True, + output_scale_factor=1.0, + add_downsample=True, + downsample_padding=1, + ): + super().__init__() + resnets = [] + + for i in range(num_layers): + in_channels = in_channels if i == 0 else out_channels + resnets.append( + ResnetBlock2D( + in_channels=in_channels, + out_channels=out_channels, + temb_channels=temb_channels, + eps=resnet_eps, + groups=resnet_groups, + dropout=dropout, + time_embedding_norm=resnet_time_scale_shift, + non_linearity=resnet_act_fn, + output_scale_factor=output_scale_factor, + pre_norm=resnet_pre_norm, + ) + ) + + self.resnets = nn.ModuleList(resnets) + + if add_downsample: + 
self.downsamplers = nn.ModuleList( + [ + Downsample2D( + out_channels, use_conv=True, out_channels=out_channels, padding=downsample_padding, name="op" + ) + ] + ) + else: + self.downsamplers = None + + self.gradient_checkpointing = False + + def forward(self, hidden_states, temb=None): + output_states = () + + for resnet in self.resnets: + if self.training and self.gradient_checkpointing: + + def create_custom_forward(module): + def custom_forward(*inputs): + return module(*inputs) + + return custom_forward + + if is_torch_version(">=", "1.11.0"): + hidden_states = torch.utils.checkpoint.checkpoint( + create_custom_forward(resnet), hidden_states, temb, use_reentrant=False + ) + else: + hidden_states = torch.utils.checkpoint.checkpoint( + create_custom_forward(resnet), hidden_states, temb + ) + else: + hidden_states = resnet(hidden_states, temb) + + output_states = output_states + (hidden_states,) + + if self.downsamplers is not None: + for downsampler in self.downsamplers: + hidden_states = downsampler(hidden_states) + + output_states = output_states + (hidden_states,) + + return hidden_states, output_states + + +class DownEncoderBlock2D(nn.Module): + def __init__( + self, + in_channels: int, + out_channels: int, + dropout: float = 0.0, + num_layers: int = 1, + resnet_eps: float = 1e-6, + resnet_time_scale_shift: str = "default", + resnet_act_fn: str = "swish", + resnet_groups: int = 32, + resnet_pre_norm: bool = True, + output_scale_factor=1.0, + add_downsample=True, + downsample_padding=1, + ): + super().__init__() + resnets = [] + + for i in range(num_layers): + in_channels = in_channels if i == 0 else out_channels + resnets.append( + ResnetBlock2D( + in_channels=in_channels, + out_channels=out_channels, + temb_channels=None, + eps=resnet_eps, + groups=resnet_groups, + dropout=dropout, + time_embedding_norm=resnet_time_scale_shift, + non_linearity=resnet_act_fn, + output_scale_factor=output_scale_factor, + pre_norm=resnet_pre_norm, + ) + ) + + self.resnets = nn.ModuleList(resnets) + + if add_downsample: + self.downsamplers = nn.ModuleList( + [ + Downsample2D( + out_channels, use_conv=True, out_channels=out_channels, padding=downsample_padding, name="op" + ) + ] + ) + else: + self.downsamplers = None + + def forward(self, hidden_states): + for resnet in self.resnets: + hidden_states = resnet(hidden_states, temb=None) + + if self.downsamplers is not None: + for downsampler in self.downsamplers: + hidden_states = downsampler(hidden_states) + + return hidden_states + + +class AttnDownEncoderBlock2D(nn.Module): + def __init__( + self, + in_channels: int, + out_channels: int, + dropout: float = 0.0, + num_layers: int = 1, + resnet_eps: float = 1e-6, + resnet_time_scale_shift: str = "default", + resnet_act_fn: str = "swish", + resnet_groups: int = 32, + resnet_pre_norm: bool = True, + attention_head_dim=1, + output_scale_factor=1.0, + add_downsample=True, + downsample_padding=1, + ): + super().__init__() + resnets = [] + attentions = [] + + if attention_head_dim is None: + logger.warn( + f"It is not recommend to pass `attention_head_dim=None`. Defaulting `attention_head_dim` to `in_channels`: {out_channels}." 
+ ) + attention_head_dim = out_channels + + for i in range(num_layers): + in_channels = in_channels if i == 0 else out_channels + resnets.append( + ResnetBlock2D( + in_channels=in_channels, + out_channels=out_channels, + temb_channels=None, + eps=resnet_eps, + groups=resnet_groups, + dropout=dropout, + time_embedding_norm=resnet_time_scale_shift, + non_linearity=resnet_act_fn, + output_scale_factor=output_scale_factor, + pre_norm=resnet_pre_norm, + ) + ) + attentions.append( + Attention( + out_channels, + heads=out_channels // attention_head_dim, + dim_head=attention_head_dim, + rescale_output_factor=output_scale_factor, + eps=resnet_eps, + norm_num_groups=resnet_groups, + residual_connection=True, + bias=True, + upcast_softmax=True, + _from_deprecated_attn_block=True, + ) + ) + + self.attentions = nn.ModuleList(attentions) + self.resnets = nn.ModuleList(resnets) + + if add_downsample: + self.downsamplers = nn.ModuleList( + [ + Downsample2D( + out_channels, use_conv=True, out_channels=out_channels, padding=downsample_padding, name="op" + ) + ] + ) + else: + self.downsamplers = None + + def forward(self, hidden_states): + for resnet, attn in zip(self.resnets, self.attentions): + hidden_states = resnet(hidden_states, temb=None) + hidden_states = attn(hidden_states) + + if self.downsamplers is not None: + for downsampler in self.downsamplers: + hidden_states = downsampler(hidden_states) + + return hidden_states + + +class AttnSkipDownBlock2D(nn.Module): + def __init__( + self, + in_channels: int, + out_channels: int, + temb_channels: int, + dropout: float = 0.0, + num_layers: int = 1, + resnet_eps: float = 1e-6, + resnet_time_scale_shift: str = "default", + resnet_act_fn: str = "swish", + resnet_pre_norm: bool = True, + attention_head_dim=1, + output_scale_factor=np.sqrt(2.0), + add_downsample=True, + ): + super().__init__() + self.attentions = nn.ModuleList([]) + self.resnets = nn.ModuleList([]) + + if attention_head_dim is None: + logger.warn( + f"It is not recommend to pass `attention_head_dim=None`. Defaulting `attention_head_dim` to `in_channels`: {out_channels}." 
+ ) + attention_head_dim = out_channels + + for i in range(num_layers): + in_channels = in_channels if i == 0 else out_channels + self.resnets.append( + ResnetBlock2D( + in_channels=in_channels, + out_channels=out_channels, + temb_channels=temb_channels, + eps=resnet_eps, + groups=min(in_channels // 4, 32), + groups_out=min(out_channels // 4, 32), + dropout=dropout, + time_embedding_norm=resnet_time_scale_shift, + non_linearity=resnet_act_fn, + output_scale_factor=output_scale_factor, + pre_norm=resnet_pre_norm, + ) + ) + self.attentions.append( + Attention( + out_channels, + heads=out_channels // attention_head_dim, + dim_head=attention_head_dim, + rescale_output_factor=output_scale_factor, + eps=resnet_eps, + norm_num_groups=32, + residual_connection=True, + bias=True, + upcast_softmax=True, + _from_deprecated_attn_block=True, + ) + ) + + if add_downsample: + self.resnet_down = ResnetBlock2D( + in_channels=out_channels, + out_channels=out_channels, + temb_channels=temb_channels, + eps=resnet_eps, + groups=min(out_channels // 4, 32), + dropout=dropout, + time_embedding_norm=resnet_time_scale_shift, + non_linearity=resnet_act_fn, + output_scale_factor=output_scale_factor, + pre_norm=resnet_pre_norm, + use_in_shortcut=True, + down=True, + kernel="fir", + ) + self.downsamplers = nn.ModuleList([FirDownsample2D(out_channels, out_channels=out_channels)]) + self.skip_conv = nn.Conv2d(3, out_channels, kernel_size=(1, 1), stride=(1, 1)) + else: + self.resnet_down = None + self.downsamplers = None + self.skip_conv = None + + def forward(self, hidden_states, temb=None, skip_sample=None): + output_states = () + + for resnet, attn in zip(self.resnets, self.attentions): + hidden_states = resnet(hidden_states, temb) + hidden_states = attn(hidden_states) + output_states += (hidden_states,) + + if self.downsamplers is not None: + hidden_states = self.resnet_down(hidden_states, temb) + for downsampler in self.downsamplers: + skip_sample = downsampler(skip_sample) + + hidden_states = self.skip_conv(skip_sample) + hidden_states + + output_states += (hidden_states,) + + return hidden_states, output_states, skip_sample + + +class SkipDownBlock2D(nn.Module): + def __init__( + self, + in_channels: int, + out_channels: int, + temb_channels: int, + dropout: float = 0.0, + num_layers: int = 1, + resnet_eps: float = 1e-6, + resnet_time_scale_shift: str = "default", + resnet_act_fn: str = "swish", + resnet_pre_norm: bool = True, + output_scale_factor=np.sqrt(2.0), + add_downsample=True, + downsample_padding=1, + ): + super().__init__() + self.resnets = nn.ModuleList([]) + + for i in range(num_layers): + in_channels = in_channels if i == 0 else out_channels + self.resnets.append( + ResnetBlock2D( + in_channels=in_channels, + out_channels=out_channels, + temb_channels=temb_channels, + eps=resnet_eps, + groups=min(in_channels // 4, 32), + groups_out=min(out_channels // 4, 32), + dropout=dropout, + time_embedding_norm=resnet_time_scale_shift, + non_linearity=resnet_act_fn, + output_scale_factor=output_scale_factor, + pre_norm=resnet_pre_norm, + ) + ) + + if add_downsample: + self.resnet_down = ResnetBlock2D( + in_channels=out_channels, + out_channels=out_channels, + temb_channels=temb_channels, + eps=resnet_eps, + groups=min(out_channels // 4, 32), + dropout=dropout, + time_embedding_norm=resnet_time_scale_shift, + non_linearity=resnet_act_fn, + output_scale_factor=output_scale_factor, + pre_norm=resnet_pre_norm, + use_in_shortcut=True, + down=True, + kernel="fir", + ) + self.downsamplers = 
nn.ModuleList([FirDownsample2D(out_channels, out_channels=out_channels)]) + self.skip_conv = nn.Conv2d(3, out_channels, kernel_size=(1, 1), stride=(1, 1)) + else: + self.resnet_down = None + self.downsamplers = None + self.skip_conv = None + + def forward(self, hidden_states, temb=None, skip_sample=None): + output_states = () + + for resnet in self.resnets: + hidden_states = resnet(hidden_states, temb) + output_states += (hidden_states,) + + if self.downsamplers is not None: + hidden_states = self.resnet_down(hidden_states, temb) + for downsampler in self.downsamplers: + skip_sample = downsampler(skip_sample) + + hidden_states = self.skip_conv(skip_sample) + hidden_states + + output_states += (hidden_states,) + + return hidden_states, output_states, skip_sample + + +class ResnetDownsampleBlock2D(nn.Module): + def __init__( + self, + in_channels: int, + out_channels: int, + temb_channels: int, + dropout: float = 0.0, + num_layers: int = 1, + resnet_eps: float = 1e-6, + resnet_time_scale_shift: str = "default", + resnet_act_fn: str = "swish", + resnet_groups: int = 32, + resnet_pre_norm: bool = True, + output_scale_factor=1.0, + add_downsample=True, + skip_time_act=False, + ): + super().__init__() + resnets = [] + + for i in range(num_layers): + in_channels = in_channels if i == 0 else out_channels + resnets.append( + ResnetBlock2D( + in_channels=in_channels, + out_channels=out_channels, + temb_channels=temb_channels, + eps=resnet_eps, + groups=resnet_groups, + dropout=dropout, + time_embedding_norm=resnet_time_scale_shift, + non_linearity=resnet_act_fn, + output_scale_factor=output_scale_factor, + pre_norm=resnet_pre_norm, + skip_time_act=skip_time_act, + ) + ) + + self.resnets = nn.ModuleList(resnets) + + if add_downsample: + self.downsamplers = nn.ModuleList( + [ + ResnetBlock2D( + in_channels=out_channels, + out_channels=out_channels, + temb_channels=temb_channels, + eps=resnet_eps, + groups=resnet_groups, + dropout=dropout, + time_embedding_norm=resnet_time_scale_shift, + non_linearity=resnet_act_fn, + output_scale_factor=output_scale_factor, + pre_norm=resnet_pre_norm, + skip_time_act=skip_time_act, + down=True, + ) + ] + ) + else: + self.downsamplers = None + + self.gradient_checkpointing = False + + def forward(self, hidden_states, temb=None): + output_states = () + + for resnet in self.resnets: + if self.training and self.gradient_checkpointing: + + def create_custom_forward(module): + def custom_forward(*inputs): + return module(*inputs) + + return custom_forward + + if is_torch_version(">=", "1.11.0"): + hidden_states = torch.utils.checkpoint.checkpoint( + create_custom_forward(resnet), hidden_states, temb, use_reentrant=False + ) + else: + hidden_states = torch.utils.checkpoint.checkpoint( + create_custom_forward(resnet), hidden_states, temb + ) + else: + hidden_states = resnet(hidden_states, temb) + + output_states = output_states + (hidden_states,) + + if self.downsamplers is not None: + for downsampler in self.downsamplers: + hidden_states = downsampler(hidden_states, temb) + + output_states = output_states + (hidden_states,) + + return hidden_states, output_states + + +class SimpleCrossAttnDownBlock2D(nn.Module): + def __init__( + self, + in_channels: int, + out_channels: int, + temb_channels: int, + dropout: float = 0.0, + num_layers: int = 1, + resnet_eps: float = 1e-6, + resnet_time_scale_shift: str = "default", + resnet_act_fn: str = "swish", + resnet_groups: int = 32, + resnet_pre_norm: bool = True, + attention_head_dim=1, + cross_attention_dim=1280, + 
output_scale_factor=1.0, + add_downsample=True, + skip_time_act=False, + only_cross_attention=False, + cross_attention_norm=None, + ): + super().__init__() + + self.has_cross_attention = True + + resnets = [] + attentions = [] + + self.attention_head_dim = attention_head_dim + self.num_heads = out_channels // self.attention_head_dim + + for i in range(num_layers): + in_channels = in_channels if i == 0 else out_channels + resnets.append( + ResnetBlock2D( + in_channels=in_channels, + out_channels=out_channels, + temb_channels=temb_channels, + eps=resnet_eps, + groups=resnet_groups, + dropout=dropout, + time_embedding_norm=resnet_time_scale_shift, + non_linearity=resnet_act_fn, + output_scale_factor=output_scale_factor, + pre_norm=resnet_pre_norm, + skip_time_act=skip_time_act, + ) + ) + + processor = ( + AttnAddedKVProcessor2_0() if hasattr(F, "scaled_dot_product_attention") else AttnAddedKVProcessor() + ) + + attentions.append( + Attention( + query_dim=out_channels, + cross_attention_dim=out_channels, + heads=self.num_heads, + dim_head=attention_head_dim, + added_kv_proj_dim=cross_attention_dim, + norm_num_groups=resnet_groups, + bias=True, + upcast_softmax=True, + only_cross_attention=only_cross_attention, + cross_attention_norm=cross_attention_norm, + processor=processor, + ) + ) + self.attentions = nn.ModuleList(attentions) + self.resnets = nn.ModuleList(resnets) + + if add_downsample: + self.downsamplers = nn.ModuleList( + [ + ResnetBlock2D( + in_channels=out_channels, + out_channels=out_channels, + temb_channels=temb_channels, + eps=resnet_eps, + groups=resnet_groups, + dropout=dropout, + time_embedding_norm=resnet_time_scale_shift, + non_linearity=resnet_act_fn, + output_scale_factor=output_scale_factor, + pre_norm=resnet_pre_norm, + skip_time_act=skip_time_act, + down=True, + ) + ] + ) + else: + self.downsamplers = None + + self.gradient_checkpointing = False + + def forward( + self, + hidden_states: torch.FloatTensor, + temb: Optional[torch.FloatTensor] = None, + encoder_hidden_states: Optional[torch.FloatTensor] = None, + attention_mask: Optional[torch.FloatTensor] = None, + cross_attention_kwargs: Optional[Dict[str, Any]] = None, + encoder_attention_mask: Optional[torch.FloatTensor] = None, + ): + output_states = () + cross_attention_kwargs = cross_attention_kwargs if cross_attention_kwargs is not None else {} + + if attention_mask is None: + # if encoder_hidden_states is defined: we are doing cross-attn, so we should use cross-attn mask. + mask = None if encoder_hidden_states is None else encoder_attention_mask + else: + # when attention_mask is defined: we don't even check for encoder_attention_mask. + # this is to maintain compatibility with UnCLIP, which uses 'attention_mask' param for cross-attn masks. + # TODO: UnCLIP should express cross-attn mask via encoder_attention_mask param instead of via attention_mask. 
+ # then we can simplify this whole if/else block to: + # mask = attention_mask if encoder_hidden_states is None else encoder_attention_mask + mask = attention_mask + + for resnet, attn in zip(self.resnets, self.attentions): + if self.training and self.gradient_checkpointing: + + def create_custom_forward(module, return_dict=None): + def custom_forward(*inputs): + if return_dict is not None: + return module(*inputs, return_dict=return_dict) + else: + return module(*inputs) + + return custom_forward + + hidden_states = torch.utils.checkpoint.checkpoint(create_custom_forward(resnet), hidden_states, temb) + hidden_states = torch.utils.checkpoint.checkpoint( + create_custom_forward(attn, return_dict=False), + hidden_states, + encoder_hidden_states, + mask, + cross_attention_kwargs, + )[0] + else: + hidden_states = resnet(hidden_states, temb) + + hidden_states = attn( + hidden_states, + encoder_hidden_states=encoder_hidden_states, + attention_mask=mask, + **cross_attention_kwargs, + ) + + output_states = output_states + (hidden_states,) + + if self.downsamplers is not None: + for downsampler in self.downsamplers: + hidden_states = downsampler(hidden_states, temb) + + output_states = output_states + (hidden_states,) + + return hidden_states, output_states + + +class KDownBlock2D(nn.Module): + def __init__( + self, + in_channels: int, + out_channels: int, + temb_channels: int, + dropout: float = 0.0, + num_layers: int = 4, + resnet_eps: float = 1e-5, + resnet_act_fn: str = "gelu", + resnet_group_size: int = 32, + add_downsample=False, + ): + super().__init__() + resnets = [] + + for i in range(num_layers): + in_channels = in_channels if i == 0 else out_channels + groups = in_channels // resnet_group_size + groups_out = out_channels // resnet_group_size + + resnets.append( + ResnetBlock2D( + in_channels=in_channels, + out_channels=out_channels, + dropout=dropout, + temb_channels=temb_channels, + groups=groups, + groups_out=groups_out, + eps=resnet_eps, + non_linearity=resnet_act_fn, + time_embedding_norm="ada_group", + conv_shortcut_bias=False, + ) + ) + + self.resnets = nn.ModuleList(resnets) + + if add_downsample: + # YiYi's comments- might be able to use FirDownsample2D, look into details later + self.downsamplers = nn.ModuleList([KDownsample2D()]) + else: + self.downsamplers = None + + self.gradient_checkpointing = False + + def forward(self, hidden_states, temb=None): + output_states = () + + for resnet in self.resnets: + if self.training and self.gradient_checkpointing: + + def create_custom_forward(module): + def custom_forward(*inputs): + return module(*inputs) + + return custom_forward + + if is_torch_version(">=", "1.11.0"): + hidden_states = torch.utils.checkpoint.checkpoint( + create_custom_forward(resnet), hidden_states, temb, use_reentrant=False + ) + else: + hidden_states = torch.utils.checkpoint.checkpoint( + create_custom_forward(resnet), hidden_states, temb + ) + else: + hidden_states = resnet(hidden_states, temb) + + output_states += (hidden_states,) + + if self.downsamplers is not None: + for downsampler in self.downsamplers: + hidden_states = downsampler(hidden_states) + + return hidden_states, output_states + + +class KCrossAttnDownBlock2D(nn.Module): + def __init__( + self, + in_channels: int, + out_channels: int, + temb_channels: int, + cross_attention_dim: int, + dropout: float = 0.0, + num_layers: int = 4, + resnet_group_size: int = 32, + add_downsample=True, + attention_head_dim: int = 64, + add_self_attention: bool = False, + resnet_eps: float = 1e-5, + resnet_act_fn: 
str = "gelu", + ): + super().__init__() + resnets = [] + attentions = [] + + self.has_cross_attention = True + + for i in range(num_layers): + in_channels = in_channels if i == 0 else out_channels + groups = in_channels // resnet_group_size + groups_out = out_channels // resnet_group_size + + resnets.append( + ResnetBlock2D( + in_channels=in_channels, + out_channels=out_channels, + dropout=dropout, + temb_channels=temb_channels, + groups=groups, + groups_out=groups_out, + eps=resnet_eps, + non_linearity=resnet_act_fn, + time_embedding_norm="ada_group", + conv_shortcut_bias=False, + ) + ) + attentions.append( + KAttentionBlock( + out_channels, + out_channels // attention_head_dim, + attention_head_dim, + cross_attention_dim=cross_attention_dim, + temb_channels=temb_channels, + attention_bias=True, + add_self_attention=add_self_attention, + cross_attention_norm="layer_norm", + group_size=resnet_group_size, + ) + ) + + self.resnets = nn.ModuleList(resnets) + self.attentions = nn.ModuleList(attentions) + + if add_downsample: + self.downsamplers = nn.ModuleList([KDownsample2D()]) + else: + self.downsamplers = None + + self.gradient_checkpointing = False + + def forward( + self, + hidden_states: torch.FloatTensor, + temb: Optional[torch.FloatTensor] = None, + encoder_hidden_states: Optional[torch.FloatTensor] = None, + attention_mask: Optional[torch.FloatTensor] = None, + cross_attention_kwargs: Optional[Dict[str, Any]] = None, + encoder_attention_mask: Optional[torch.FloatTensor] = None, + ): + output_states = () + + for resnet, attn in zip(self.resnets, self.attentions): + if self.training and self.gradient_checkpointing: + + def create_custom_forward(module, return_dict=None): + def custom_forward(*inputs): + if return_dict is not None: + return module(*inputs, return_dict=return_dict) + else: + return module(*inputs) + + return custom_forward + + ckpt_kwargs: Dict[str, Any] = {"use_reentrant": False} if is_torch_version(">=", "1.11.0") else {} + hidden_states = torch.utils.checkpoint.checkpoint( + create_custom_forward(resnet), + hidden_states, + temb, + **ckpt_kwargs, + ) + hidden_states = torch.utils.checkpoint.checkpoint( + create_custom_forward(attn, return_dict=False), + hidden_states, + encoder_hidden_states, + temb, + attention_mask, + cross_attention_kwargs, + encoder_attention_mask, + **ckpt_kwargs, + ) + else: + hidden_states = resnet(hidden_states, temb) + hidden_states = attn( + hidden_states, + encoder_hidden_states=encoder_hidden_states, + emb=temb, + attention_mask=attention_mask, + cross_attention_kwargs=cross_attention_kwargs, + encoder_attention_mask=encoder_attention_mask, + ) + + if self.downsamplers is None: + output_states += (None,) + else: + output_states += (hidden_states,) + + if self.downsamplers is not None: + for downsampler in self.downsamplers: + hidden_states = downsampler(hidden_states) + + return hidden_states, output_states + + +class AttnUpBlock2D(nn.Module): + def __init__( + self, + in_channels: int, + prev_output_channel: int, + out_channels: int, + temb_channels: int, + dropout: float = 0.0, + num_layers: int = 1, + resnet_eps: float = 1e-6, + resnet_time_scale_shift: str = "default", + resnet_act_fn: str = "swish", + resnet_groups: int = 32, + resnet_pre_norm: bool = True, + attention_head_dim=1, + output_scale_factor=1.0, + upsample_type="conv", + ): + super().__init__() + resnets = [] + attentions = [] + + self.upsample_type = upsample_type + + if attention_head_dim is None: + logger.warn( + f"It is not recommend to pass `attention_head_dim=None`. 
Defaulting `attention_head_dim` to `in_channels`: {out_channels}." + ) + attention_head_dim = out_channels + + for i in range(num_layers): + res_skip_channels = in_channels if (i == num_layers - 1) else out_channels + resnet_in_channels = prev_output_channel if i == 0 else out_channels + + resnets.append( + ResnetBlock2D( + in_channels=resnet_in_channels + res_skip_channels, + out_channels=out_channels, + temb_channels=temb_channels, + eps=resnet_eps, + groups=resnet_groups, + dropout=dropout, + time_embedding_norm=resnet_time_scale_shift, + non_linearity=resnet_act_fn, + output_scale_factor=output_scale_factor, + pre_norm=resnet_pre_norm, + ) + ) + attentions.append( + Attention( + out_channels, + heads=out_channels // attention_head_dim, + dim_head=attention_head_dim, + rescale_output_factor=output_scale_factor, + eps=resnet_eps, + norm_num_groups=resnet_groups, + residual_connection=True, + bias=True, + upcast_softmax=True, + _from_deprecated_attn_block=True, + ) + ) + + self.attentions = nn.ModuleList(attentions) + self.resnets = nn.ModuleList(resnets) + + if upsample_type == "conv": + self.upsamplers = nn.ModuleList([Upsample2D(out_channels, use_conv=True, out_channels=out_channels)]) + elif upsample_type == "resnet": + self.upsamplers = nn.ModuleList( + [ + ResnetBlock2D( + in_channels=out_channels, + out_channels=out_channels, + temb_channels=temb_channels, + eps=resnet_eps, + groups=resnet_groups, + dropout=dropout, + time_embedding_norm=resnet_time_scale_shift, + non_linearity=resnet_act_fn, + output_scale_factor=output_scale_factor, + pre_norm=resnet_pre_norm, + up=True, + ) + ] + ) + else: + self.upsamplers = None + + def forward(self, hidden_states, res_hidden_states_tuple, temb=None, upsample_size=None): + for resnet, attn in zip(self.resnets, self.attentions): + # pop res hidden states + res_hidden_states = res_hidden_states_tuple[-1] + res_hidden_states_tuple = res_hidden_states_tuple[:-1] + hidden_states = torch.cat([hidden_states, res_hidden_states], dim=1) + + hidden_states = resnet(hidden_states, temb) + hidden_states = attn(hidden_states) + + if self.upsamplers is not None: + for upsampler in self.upsamplers: + if self.upsample_type == "resnet": + hidden_states = upsampler(hidden_states, temb=temb) + else: + hidden_states = upsampler(hidden_states) + + return hidden_states + + +class CrossAttnUpBlock2D(nn.Module): + def __init__( + self, + in_channels: int, + out_channels: int, + prev_output_channel: int, + temb_channels: int, + dropout: float = 0.0, + num_layers: int = 1, + transformer_layers_per_block: int = 1, + resnet_eps: float = 1e-6, + resnet_time_scale_shift: str = "default", + resnet_act_fn: str = "swish", + resnet_groups: int = 32, + resnet_pre_norm: bool = True, + num_attention_heads=1, + cross_attention_dim=1280, + output_scale_factor=1.0, + add_upsample=True, + dual_cross_attention=False, + use_linear_projection=False, + only_cross_attention=False, + upcast_attention=False, + ): + super().__init__() + resnets = [] + attentions = [] + + self.has_cross_attention = True + self.num_attention_heads = num_attention_heads + + for i in range(num_layers): + res_skip_channels = in_channels if (i == num_layers - 1) else out_channels + resnet_in_channels = prev_output_channel if i == 0 else out_channels + + resnets.append( + ResnetBlock2D( + in_channels=resnet_in_channels + res_skip_channels, + out_channels=out_channels, + temb_channels=temb_channels, + eps=resnet_eps, + groups=resnet_groups, + dropout=dropout, + time_embedding_norm=resnet_time_scale_shift, + 
non_linearity=resnet_act_fn, + output_scale_factor=output_scale_factor, + pre_norm=resnet_pre_norm, + ) + ) + if not dual_cross_attention: + attentions.append( + Transformer2DModel( + num_attention_heads, + out_channels // num_attention_heads, + in_channels=out_channels, + num_layers=transformer_layers_per_block, + cross_attention_dim=cross_attention_dim, + norm_num_groups=resnet_groups, + use_linear_projection=use_linear_projection, + only_cross_attention=only_cross_attention, + upcast_attention=upcast_attention, + ) + ) + else: + attentions.append( + DualTransformer2DModel( + num_attention_heads, + out_channels // num_attention_heads, + in_channels=out_channels, + num_layers=1, + cross_attention_dim=cross_attention_dim, + norm_num_groups=resnet_groups, + ) + ) + self.attentions = nn.ModuleList(attentions) + self.resnets = nn.ModuleList(resnets) + + if add_upsample: + self.upsamplers = nn.ModuleList([Upsample2D(out_channels, use_conv=True, out_channels=out_channels)]) + else: + self.upsamplers = None + + self.gradient_checkpointing = False + + def forward( + self, + hidden_states: torch.FloatTensor, + res_hidden_states_tuple: Tuple[torch.FloatTensor, ...], + temb: Optional[torch.FloatTensor] = None, + encoder_hidden_states: Optional[torch.FloatTensor] = None, + cross_attention_kwargs: Optional[Dict[str, Any]] = None, + upsample_size: Optional[int] = None, + attention_mask: Optional[torch.FloatTensor] = None, + encoder_attention_mask: Optional[torch.FloatTensor] = None, + ): + for resnet, attn in zip(self.resnets, self.attentions): + # pop res hidden states + res_hidden_states = res_hidden_states_tuple[-1] + res_hidden_states_tuple = res_hidden_states_tuple[:-1] + hidden_states = torch.cat([hidden_states, res_hidden_states], dim=1) + + if self.training and self.gradient_checkpointing: + + def create_custom_forward(module, return_dict=None): + def custom_forward(*inputs): + if return_dict is not None: + return module(*inputs, return_dict=return_dict) + else: + return module(*inputs) + + return custom_forward + + ckpt_kwargs: Dict[str, Any] = {"use_reentrant": False} if is_torch_version(">=", "1.11.0") else {} + hidden_states = torch.utils.checkpoint.checkpoint( + create_custom_forward(resnet), + hidden_states, + temb, + **ckpt_kwargs, + ) + hidden_states = torch.utils.checkpoint.checkpoint( + create_custom_forward(attn, return_dict=False), + hidden_states, + encoder_hidden_states, + None, # timestep + None, # class_labels + cross_attention_kwargs, + attention_mask, + encoder_attention_mask, + **ckpt_kwargs, + )[0] + else: + hidden_states = resnet(hidden_states, temb) + hidden_states = attn( + hidden_states, + encoder_hidden_states=encoder_hidden_states, + cross_attention_kwargs=cross_attention_kwargs, + attention_mask=attention_mask, + encoder_attention_mask=encoder_attention_mask, + return_dict=False, + )[0] + + if self.upsamplers is not None: + for upsampler in self.upsamplers: + hidden_states = upsampler(hidden_states, upsample_size) + + return hidden_states + + +class UpBlock2D(nn.Module): + def __init__( + self, + in_channels: int, + prev_output_channel: int, + out_channels: int, + temb_channels: int, + dropout: float = 0.0, + num_layers: int = 1, + resnet_eps: float = 1e-6, + resnet_time_scale_shift: str = "default", + resnet_act_fn: str = "swish", + resnet_groups: int = 32, + resnet_pre_norm: bool = True, + output_scale_factor=1.0, + add_upsample=True, + ): + super().__init__() + resnets = [] + + for i in range(num_layers): + res_skip_channels = in_channels if (i == num_layers - 1) 
else out_channels + resnet_in_channels = prev_output_channel if i == 0 else out_channels + + resnets.append( + ResnetBlock2D( + in_channels=resnet_in_channels + res_skip_channels, + out_channels=out_channels, + temb_channels=temb_channels, + eps=resnet_eps, + groups=resnet_groups, + dropout=dropout, + time_embedding_norm=resnet_time_scale_shift, + non_linearity=resnet_act_fn, + output_scale_factor=output_scale_factor, + pre_norm=resnet_pre_norm, + ) + ) + + self.resnets = nn.ModuleList(resnets) + + if add_upsample: + self.upsamplers = nn.ModuleList([Upsample2D(out_channels, use_conv=True, out_channels=out_channels)]) + else: + self.upsamplers = None + + self.gradient_checkpointing = False + + def forward(self, hidden_states, res_hidden_states_tuple, temb=None, upsample_size=None): + for resnet in self.resnets: + # pop res hidden states + res_hidden_states = res_hidden_states_tuple[-1] + res_hidden_states_tuple = res_hidden_states_tuple[:-1] + hidden_states = torch.cat([hidden_states, res_hidden_states], dim=1) + + if self.training and self.gradient_checkpointing: + + def create_custom_forward(module): + def custom_forward(*inputs): + return module(*inputs) + + return custom_forward + + if is_torch_version(">=", "1.11.0"): + hidden_states = torch.utils.checkpoint.checkpoint( + create_custom_forward(resnet), hidden_states, temb, use_reentrant=False + ) + else: + hidden_states = torch.utils.checkpoint.checkpoint( + create_custom_forward(resnet), hidden_states, temb + ) + else: + hidden_states = resnet(hidden_states, temb) + + if self.upsamplers is not None: + for upsampler in self.upsamplers: + hidden_states = upsampler(hidden_states, upsample_size) + + return hidden_states + + +class UpDecoderBlock2D(nn.Module): + def __init__( + self, + in_channels: int, + out_channels: int, + dropout: float = 0.0, + num_layers: int = 1, + resnet_eps: float = 1e-6, + resnet_time_scale_shift: str = "default", # default, spatial + resnet_act_fn: str = "swish", + resnet_groups: int = 32, + resnet_pre_norm: bool = True, + output_scale_factor=1.0, + add_upsample=True, + temb_channels=None, + ): + super().__init__() + resnets = [] + + for i in range(num_layers): + input_channels = in_channels if i == 0 else out_channels + + resnets.append( + ResnetBlock2D( + in_channels=input_channels, + out_channels=out_channels, + temb_channels=temb_channels, + eps=resnet_eps, + groups=resnet_groups, + dropout=dropout, + time_embedding_norm=resnet_time_scale_shift, + non_linearity=resnet_act_fn, + output_scale_factor=output_scale_factor, + pre_norm=resnet_pre_norm, + ) + ) + + self.resnets = nn.ModuleList(resnets) + + if add_upsample: + self.upsamplers = nn.ModuleList([Upsample2D(out_channels, use_conv=True, out_channels=out_channels)]) + else: + self.upsamplers = None + + def forward(self, hidden_states, temb=None): + for resnet in self.resnets: + hidden_states = resnet(hidden_states, temb=temb) + + if self.upsamplers is not None: + for upsampler in self.upsamplers: + hidden_states = upsampler(hidden_states) + + return hidden_states + + +class AttnUpDecoderBlock2D(nn.Module): + def __init__( + self, + in_channels: int, + out_channels: int, + dropout: float = 0.0, + num_layers: int = 1, + resnet_eps: float = 1e-6, + resnet_time_scale_shift: str = "default", + resnet_act_fn: str = "swish", + resnet_groups: int = 32, + resnet_pre_norm: bool = True, + attention_head_dim=1, + output_scale_factor=1.0, + add_upsample=True, + temb_channels=None, + ): + super().__init__() + resnets = [] + attentions = [] + + if attention_head_dim is 
None: + logger.warn( + f"It is not recommend to pass `attention_head_dim=None`. Defaulting `attention_head_dim` to `out_channels`: {out_channels}." + ) + attention_head_dim = out_channels + + for i in range(num_layers): + input_channels = in_channels if i == 0 else out_channels + + resnets.append( + ResnetBlock2D( + in_channels=input_channels, + out_channels=out_channels, + temb_channels=temb_channels, + eps=resnet_eps, + groups=resnet_groups, + dropout=dropout, + time_embedding_norm=resnet_time_scale_shift, + non_linearity=resnet_act_fn, + output_scale_factor=output_scale_factor, + pre_norm=resnet_pre_norm, + ) + ) + attentions.append( + Attention( + out_channels, + heads=out_channels // attention_head_dim, + dim_head=attention_head_dim, + rescale_output_factor=output_scale_factor, + eps=resnet_eps, + norm_num_groups=resnet_groups if resnet_time_scale_shift != "spatial" else None, + spatial_norm_dim=temb_channels if resnet_time_scale_shift == "spatial" else None, + residual_connection=True, + bias=True, + upcast_softmax=True, + _from_deprecated_attn_block=True, + ) + ) + + self.attentions = nn.ModuleList(attentions) + self.resnets = nn.ModuleList(resnets) + + if add_upsample: + self.upsamplers = nn.ModuleList([Upsample2D(out_channels, use_conv=True, out_channels=out_channels)]) + else: + self.upsamplers = None + + def forward(self, hidden_states, temb=None): + for resnet, attn in zip(self.resnets, self.attentions): + hidden_states = resnet(hidden_states, temb=temb) + hidden_states = attn(hidden_states, temb=temb) + + if self.upsamplers is not None: + for upsampler in self.upsamplers: + hidden_states = upsampler(hidden_states) + + return hidden_states + + +class AttnSkipUpBlock2D(nn.Module): + def __init__( + self, + in_channels: int, + prev_output_channel: int, + out_channels: int, + temb_channels: int, + dropout: float = 0.0, + num_layers: int = 1, + resnet_eps: float = 1e-6, + resnet_time_scale_shift: str = "default", + resnet_act_fn: str = "swish", + resnet_pre_norm: bool = True, + attention_head_dim=1, + output_scale_factor=np.sqrt(2.0), + add_upsample=True, + ): + super().__init__() + self.attentions = nn.ModuleList([]) + self.resnets = nn.ModuleList([]) + + for i in range(num_layers): + res_skip_channels = in_channels if (i == num_layers - 1) else out_channels + resnet_in_channels = prev_output_channel if i == 0 else out_channels + + self.resnets.append( + ResnetBlock2D( + in_channels=resnet_in_channels + res_skip_channels, + out_channels=out_channels, + temb_channels=temb_channels, + eps=resnet_eps, + groups=min(resnet_in_channels + res_skip_channels // 4, 32), + groups_out=min(out_channels // 4, 32), + dropout=dropout, + time_embedding_norm=resnet_time_scale_shift, + non_linearity=resnet_act_fn, + output_scale_factor=output_scale_factor, + pre_norm=resnet_pre_norm, + ) + ) + + if attention_head_dim is None: + logger.warn( + f"It is not recommend to pass `attention_head_dim=None`. Defaulting `attention_head_dim` to `out_channels`: {out_channels}." 
+ ) + attention_head_dim = out_channels + + self.attentions.append( + Attention( + out_channels, + heads=out_channels // attention_head_dim, + dim_head=attention_head_dim, + rescale_output_factor=output_scale_factor, + eps=resnet_eps, + norm_num_groups=32, + residual_connection=True, + bias=True, + upcast_softmax=True, + _from_deprecated_attn_block=True, + ) + ) + + self.upsampler = FirUpsample2D(in_channels, out_channels=out_channels) + if add_upsample: + self.resnet_up = ResnetBlock2D( + in_channels=out_channels, + out_channels=out_channels, + temb_channels=temb_channels, + eps=resnet_eps, + groups=min(out_channels // 4, 32), + groups_out=min(out_channels // 4, 32), + dropout=dropout, + time_embedding_norm=resnet_time_scale_shift, + non_linearity=resnet_act_fn, + output_scale_factor=output_scale_factor, + pre_norm=resnet_pre_norm, + use_in_shortcut=True, + up=True, + kernel="fir", + ) + self.skip_conv = nn.Conv2d(out_channels, 3, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)) + self.skip_norm = torch.nn.GroupNorm( + num_groups=min(out_channels // 4, 32), num_channels=out_channels, eps=resnet_eps, affine=True + ) + self.act = nn.SiLU() + else: + self.resnet_up = None + self.skip_conv = None + self.skip_norm = None + self.act = None + + def forward(self, hidden_states, res_hidden_states_tuple, temb=None, skip_sample=None): + for resnet in self.resnets: + # pop res hidden states + res_hidden_states = res_hidden_states_tuple[-1] + res_hidden_states_tuple = res_hidden_states_tuple[:-1] + hidden_states = torch.cat([hidden_states, res_hidden_states], dim=1) + + hidden_states = resnet(hidden_states, temb) + + hidden_states = self.attentions[0](hidden_states) + + if skip_sample is not None: + skip_sample = self.upsampler(skip_sample) + else: + skip_sample = 0 + + if self.resnet_up is not None: + skip_sample_states = self.skip_norm(hidden_states) + skip_sample_states = self.act(skip_sample_states) + skip_sample_states = self.skip_conv(skip_sample_states) + + skip_sample = skip_sample + skip_sample_states + + hidden_states = self.resnet_up(hidden_states, temb) + + return hidden_states, skip_sample + + +class SkipUpBlock2D(nn.Module): + def __init__( + self, + in_channels: int, + prev_output_channel: int, + out_channels: int, + temb_channels: int, + dropout: float = 0.0, + num_layers: int = 1, + resnet_eps: float = 1e-6, + resnet_time_scale_shift: str = "default", + resnet_act_fn: str = "swish", + resnet_pre_norm: bool = True, + output_scale_factor=np.sqrt(2.0), + add_upsample=True, + upsample_padding=1, + ): + super().__init__() + self.resnets = nn.ModuleList([]) + + for i in range(num_layers): + res_skip_channels = in_channels if (i == num_layers - 1) else out_channels + resnet_in_channels = prev_output_channel if i == 0 else out_channels + + self.resnets.append( + ResnetBlock2D( + in_channels=resnet_in_channels + res_skip_channels, + out_channels=out_channels, + temb_channels=temb_channels, + eps=resnet_eps, + groups=min((resnet_in_channels + res_skip_channels) // 4, 32), + groups_out=min(out_channels // 4, 32), + dropout=dropout, + time_embedding_norm=resnet_time_scale_shift, + non_linearity=resnet_act_fn, + output_scale_factor=output_scale_factor, + pre_norm=resnet_pre_norm, + ) + ) + + self.upsampler = FirUpsample2D(in_channels, out_channels=out_channels) + if add_upsample: + self.resnet_up = ResnetBlock2D( + in_channels=out_channels, + out_channels=out_channels, + temb_channels=temb_channels, + eps=resnet_eps, + groups=min(out_channels // 4, 32), + groups_out=min(out_channels // 4, 32), + 
dropout=dropout, + time_embedding_norm=resnet_time_scale_shift, + non_linearity=resnet_act_fn, + output_scale_factor=output_scale_factor, + pre_norm=resnet_pre_norm, + use_in_shortcut=True, + up=True, + kernel="fir", + ) + self.skip_conv = nn.Conv2d(out_channels, 3, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)) + self.skip_norm = torch.nn.GroupNorm( + num_groups=min(out_channels // 4, 32), num_channels=out_channels, eps=resnet_eps, affine=True + ) + self.act = nn.SiLU() + else: + self.resnet_up = None + self.skip_conv = None + self.skip_norm = None + self.act = None + + def forward(self, hidden_states, res_hidden_states_tuple, temb=None, skip_sample=None): + for resnet in self.resnets: + # pop res hidden states + res_hidden_states = res_hidden_states_tuple[-1] + res_hidden_states_tuple = res_hidden_states_tuple[:-1] + hidden_states = torch.cat([hidden_states, res_hidden_states], dim=1) + + hidden_states = resnet(hidden_states, temb) + + if skip_sample is not None: + skip_sample = self.upsampler(skip_sample) + else: + skip_sample = 0 + + if self.resnet_up is not None: + skip_sample_states = self.skip_norm(hidden_states) + skip_sample_states = self.act(skip_sample_states) + skip_sample_states = self.skip_conv(skip_sample_states) + + skip_sample = skip_sample + skip_sample_states + + hidden_states = self.resnet_up(hidden_states, temb) + + return hidden_states, skip_sample + + +class ResnetUpsampleBlock2D(nn.Module): + def __init__( + self, + in_channels: int, + prev_output_channel: int, + out_channels: int, + temb_channels: int, + dropout: float = 0.0, + num_layers: int = 1, + resnet_eps: float = 1e-6, + resnet_time_scale_shift: str = "default", + resnet_act_fn: str = "swish", + resnet_groups: int = 32, + resnet_pre_norm: bool = True, + output_scale_factor=1.0, + add_upsample=True, + skip_time_act=False, + ): + super().__init__() + resnets = [] + + for i in range(num_layers): + res_skip_channels = in_channels if (i == num_layers - 1) else out_channels + resnet_in_channels = prev_output_channel if i == 0 else out_channels + + resnets.append( + ResnetBlock2D( + in_channels=resnet_in_channels + res_skip_channels, + out_channels=out_channels, + temb_channels=temb_channels, + eps=resnet_eps, + groups=resnet_groups, + dropout=dropout, + time_embedding_norm=resnet_time_scale_shift, + non_linearity=resnet_act_fn, + output_scale_factor=output_scale_factor, + pre_norm=resnet_pre_norm, + skip_time_act=skip_time_act, + ) + ) + + self.resnets = nn.ModuleList(resnets) + + if add_upsample: + self.upsamplers = nn.ModuleList( + [ + ResnetBlock2D( + in_channels=out_channels, + out_channels=out_channels, + temb_channels=temb_channels, + eps=resnet_eps, + groups=resnet_groups, + dropout=dropout, + time_embedding_norm=resnet_time_scale_shift, + non_linearity=resnet_act_fn, + output_scale_factor=output_scale_factor, + pre_norm=resnet_pre_norm, + skip_time_act=skip_time_act, + up=True, + ) + ] + ) + else: + self.upsamplers = None + + self.gradient_checkpointing = False + + def forward(self, hidden_states, res_hidden_states_tuple, temb=None, upsample_size=None): + for resnet in self.resnets: + # pop res hidden states + res_hidden_states = res_hidden_states_tuple[-1] + res_hidden_states_tuple = res_hidden_states_tuple[:-1] + hidden_states = torch.cat([hidden_states, res_hidden_states], dim=1) + + if self.training and self.gradient_checkpointing: + + def create_custom_forward(module): + def custom_forward(*inputs): + return module(*inputs) + + return custom_forward + + if is_torch_version(">=", "1.11.0"): + 
hidden_states = torch.utils.checkpoint.checkpoint( + create_custom_forward(resnet), hidden_states, temb, use_reentrant=False + ) + else: + hidden_states = torch.utils.checkpoint.checkpoint( + create_custom_forward(resnet), hidden_states, temb + ) + else: + hidden_states = resnet(hidden_states, temb) + + if self.upsamplers is not None: + for upsampler in self.upsamplers: + hidden_states = upsampler(hidden_states, temb) + + return hidden_states + + +class SimpleCrossAttnUpBlock2D(nn.Module): + def __init__( + self, + in_channels: int, + out_channels: int, + prev_output_channel: int, + temb_channels: int, + dropout: float = 0.0, + num_layers: int = 1, + resnet_eps: float = 1e-6, + resnet_time_scale_shift: str = "default", + resnet_act_fn: str = "swish", + resnet_groups: int = 32, + resnet_pre_norm: bool = True, + attention_head_dim=1, + cross_attention_dim=1280, + output_scale_factor=1.0, + add_upsample=True, + skip_time_act=False, + only_cross_attention=False, + cross_attention_norm=None, + ): + super().__init__() + resnets = [] + attentions = [] + + self.has_cross_attention = True + self.attention_head_dim = attention_head_dim + + self.num_heads = out_channels // self.attention_head_dim + + for i in range(num_layers): + res_skip_channels = in_channels if (i == num_layers - 1) else out_channels + resnet_in_channels = prev_output_channel if i == 0 else out_channels + + resnets.append( + ResnetBlock2D( + in_channels=resnet_in_channels + res_skip_channels, + out_channels=out_channels, + temb_channels=temb_channels, + eps=resnet_eps, + groups=resnet_groups, + dropout=dropout, + time_embedding_norm=resnet_time_scale_shift, + non_linearity=resnet_act_fn, + output_scale_factor=output_scale_factor, + pre_norm=resnet_pre_norm, + skip_time_act=skip_time_act, + ) + ) + + processor = ( + AttnAddedKVProcessor2_0() if hasattr(F, "scaled_dot_product_attention") else AttnAddedKVProcessor() + ) + + attentions.append( + Attention( + query_dim=out_channels, + cross_attention_dim=out_channels, + heads=self.num_heads, + dim_head=self.attention_head_dim, + added_kv_proj_dim=cross_attention_dim, + norm_num_groups=resnet_groups, + bias=True, + upcast_softmax=True, + only_cross_attention=only_cross_attention, + cross_attention_norm=cross_attention_norm, + processor=processor, + ) + ) + self.attentions = nn.ModuleList(attentions) + self.resnets = nn.ModuleList(resnets) + + if add_upsample: + self.upsamplers = nn.ModuleList( + [ + ResnetBlock2D( + in_channels=out_channels, + out_channels=out_channels, + temb_channels=temb_channels, + eps=resnet_eps, + groups=resnet_groups, + dropout=dropout, + time_embedding_norm=resnet_time_scale_shift, + non_linearity=resnet_act_fn, + output_scale_factor=output_scale_factor, + pre_norm=resnet_pre_norm, + skip_time_act=skip_time_act, + up=True, + ) + ] + ) + else: + self.upsamplers = None + + self.gradient_checkpointing = False + + def forward( + self, + hidden_states: torch.FloatTensor, + res_hidden_states_tuple: Tuple[torch.FloatTensor, ...], + temb: Optional[torch.FloatTensor] = None, + encoder_hidden_states: Optional[torch.FloatTensor] = None, + upsample_size: Optional[int] = None, + attention_mask: Optional[torch.FloatTensor] = None, + cross_attention_kwargs: Optional[Dict[str, Any]] = None, + encoder_attention_mask: Optional[torch.FloatTensor] = None, + ): + cross_attention_kwargs = cross_attention_kwargs if cross_attention_kwargs is not None else {} + + if attention_mask is None: + # if encoder_hidden_states is defined: we are doing cross-attn, so we should use cross-attn 
mask. + mask = None if encoder_hidden_states is None else encoder_attention_mask + else: + # when attention_mask is defined: we don't even check for encoder_attention_mask. + # this is to maintain compatibility with UnCLIP, which uses 'attention_mask' param for cross-attn masks. + # TODO: UnCLIP should express cross-attn mask via encoder_attention_mask param instead of via attention_mask. + # then we can simplify this whole if/else block to: + # mask = attention_mask if encoder_hidden_states is None else encoder_attention_mask + mask = attention_mask + + for resnet, attn in zip(self.resnets, self.attentions): + # resnet + # pop res hidden states + res_hidden_states = res_hidden_states_tuple[-1] + res_hidden_states_tuple = res_hidden_states_tuple[:-1] + hidden_states = torch.cat([hidden_states, res_hidden_states], dim=1) + + if self.training and self.gradient_checkpointing: + + def create_custom_forward(module, return_dict=None): + def custom_forward(*inputs): + if return_dict is not None: + return module(*inputs, return_dict=return_dict) + else: + return module(*inputs) + + return custom_forward + + hidden_states = torch.utils.checkpoint.checkpoint(create_custom_forward(resnet), hidden_states, temb) + hidden_states = torch.utils.checkpoint.checkpoint( + create_custom_forward(attn, return_dict=False), + hidden_states, + encoder_hidden_states, + mask, + cross_attention_kwargs, + )[0] + else: + hidden_states = resnet(hidden_states, temb) + + hidden_states = attn( + hidden_states, + encoder_hidden_states=encoder_hidden_states, + attention_mask=mask, + **cross_attention_kwargs, + ) + + if self.upsamplers is not None: + for upsampler in self.upsamplers: + hidden_states = upsampler(hidden_states, temb) + + return hidden_states + + +class KUpBlock2D(nn.Module): + def __init__( + self, + in_channels: int, + out_channels: int, + temb_channels: int, + dropout: float = 0.0, + num_layers: int = 5, + resnet_eps: float = 1e-5, + resnet_act_fn: str = "gelu", + resnet_group_size: Optional[int] = 32, + add_upsample=True, + ): + super().__init__() + resnets = [] + k_in_channels = 2 * out_channels + k_out_channels = in_channels + num_layers = num_layers - 1 + + for i in range(num_layers): + in_channels = k_in_channels if i == 0 else out_channels + groups = in_channels // resnet_group_size + groups_out = out_channels // resnet_group_size + + resnets.append( + ResnetBlock2D( + in_channels=in_channels, + out_channels=k_out_channels if (i == num_layers - 1) else out_channels, + temb_channels=temb_channels, + eps=resnet_eps, + groups=groups, + groups_out=groups_out, + dropout=dropout, + non_linearity=resnet_act_fn, + time_embedding_norm="ada_group", + conv_shortcut_bias=False, + ) + ) + + self.resnets = nn.ModuleList(resnets) + + if add_upsample: + self.upsamplers = nn.ModuleList([KUpsample2D()]) + else: + self.upsamplers = None + + self.gradient_checkpointing = False + + def forward(self, hidden_states, res_hidden_states_tuple, temb=None, upsample_size=None): + res_hidden_states_tuple = res_hidden_states_tuple[-1] + if res_hidden_states_tuple is not None: + hidden_states = torch.cat([hidden_states, res_hidden_states_tuple], dim=1) + + for resnet in self.resnets: + if self.training and self.gradient_checkpointing: + + def create_custom_forward(module): + def custom_forward(*inputs): + return module(*inputs) + + return custom_forward + + if is_torch_version(">=", "1.11.0"): + hidden_states = torch.utils.checkpoint.checkpoint( + create_custom_forward(resnet), hidden_states, temb, use_reentrant=False + ) + else: + 
hidden_states = torch.utils.checkpoint.checkpoint( + create_custom_forward(resnet), hidden_states, temb + ) + else: + hidden_states = resnet(hidden_states, temb) + + if self.upsamplers is not None: + for upsampler in self.upsamplers: + hidden_states = upsampler(hidden_states) + + return hidden_states + + +class KCrossAttnUpBlock2D(nn.Module): + def __init__( + self, + in_channels: int, + out_channels: int, + temb_channels: int, + dropout: float = 0.0, + num_layers: int = 4, + resnet_eps: float = 1e-5, + resnet_act_fn: str = "gelu", + resnet_group_size: int = 32, + attention_head_dim=1, # attention dim_head + cross_attention_dim: int = 768, + add_upsample: bool = True, + upcast_attention: bool = False, + ): + super().__init__() + resnets = [] + attentions = [] + + is_first_block = in_channels == out_channels == temb_channels + is_middle_block = in_channels != out_channels + add_self_attention = True if is_first_block else False + + self.has_cross_attention = True + self.attention_head_dim = attention_head_dim + + # in_channels, and out_channels for the block (k-unet) + k_in_channels = out_channels if is_first_block else 2 * out_channels + k_out_channels = in_channels + + num_layers = num_layers - 1 + + for i in range(num_layers): + in_channels = k_in_channels if i == 0 else out_channels + groups = in_channels // resnet_group_size + groups_out = out_channels // resnet_group_size + + if is_middle_block and (i == num_layers - 1): + conv_2d_out_channels = k_out_channels + else: + conv_2d_out_channels = None + + resnets.append( + ResnetBlock2D( + in_channels=in_channels, + out_channels=out_channels, + conv_2d_out_channels=conv_2d_out_channels, + temb_channels=temb_channels, + eps=resnet_eps, + groups=groups, + groups_out=groups_out, + dropout=dropout, + non_linearity=resnet_act_fn, + time_embedding_norm="ada_group", + conv_shortcut_bias=False, + ) + ) + attentions.append( + KAttentionBlock( + k_out_channels if (i == num_layers - 1) else out_channels, + k_out_channels // attention_head_dim + if (i == num_layers - 1) + else out_channels // attention_head_dim, + attention_head_dim, + cross_attention_dim=cross_attention_dim, + temb_channels=temb_channels, + attention_bias=True, + add_self_attention=add_self_attention, + cross_attention_norm="layer_norm", + upcast_attention=upcast_attention, + ) + ) + + self.resnets = nn.ModuleList(resnets) + self.attentions = nn.ModuleList(attentions) + + if add_upsample: + self.upsamplers = nn.ModuleList([KUpsample2D()]) + else: + self.upsamplers = None + + self.gradient_checkpointing = False + + def forward( + self, + hidden_states: torch.FloatTensor, + res_hidden_states_tuple: Tuple[torch.FloatTensor, ...], + temb: Optional[torch.FloatTensor] = None, + encoder_hidden_states: Optional[torch.FloatTensor] = None, + cross_attention_kwargs: Optional[Dict[str, Any]] = None, + upsample_size: Optional[int] = None, + attention_mask: Optional[torch.FloatTensor] = None, + encoder_attention_mask: Optional[torch.FloatTensor] = None, + ): + res_hidden_states_tuple = res_hidden_states_tuple[-1] + if res_hidden_states_tuple is not None: + hidden_states = torch.cat([hidden_states, res_hidden_states_tuple], dim=1) + + for resnet, attn in zip(self.resnets, self.attentions): + if self.training and self.gradient_checkpointing: + + def create_custom_forward(module, return_dict=None): + def custom_forward(*inputs): + if return_dict is not None: + return module(*inputs, return_dict=return_dict) + else: + return module(*inputs) + + return custom_forward + + ckpt_kwargs: Dict[str, Any] = 
{"use_reentrant": False} if is_torch_version(">=", "1.11.0") else {} + hidden_states = torch.utils.checkpoint.checkpoint( + create_custom_forward(resnet), + hidden_states, + temb, + **ckpt_kwargs, + ) + hidden_states = torch.utils.checkpoint.checkpoint( + create_custom_forward(attn, return_dict=False), + hidden_states, + encoder_hidden_states, + temb, + attention_mask, + cross_attention_kwargs, + encoder_attention_mask, + **ckpt_kwargs, + )[0] + else: + hidden_states = resnet(hidden_states, temb) + hidden_states = attn( + hidden_states, + encoder_hidden_states=encoder_hidden_states, + emb=temb, + attention_mask=attention_mask, + cross_attention_kwargs=cross_attention_kwargs, + encoder_attention_mask=encoder_attention_mask, + ) + + if self.upsamplers is not None: + for upsampler in self.upsamplers: + hidden_states = upsampler(hidden_states) + + return hidden_states + + +# can potentially later be renamed to `No-feed-forward` attention +class KAttentionBlock(nn.Module): + r""" + A basic Transformer block. + + Parameters: + dim (`int`): The number of channels in the input and output. + num_attention_heads (`int`): The number of heads to use for multi-head attention. + attention_head_dim (`int`): The number of channels in each head. + dropout (`float`, *optional*, defaults to 0.0): The dropout probability to use. + cross_attention_dim (`int`, *optional*): The size of the encoder_hidden_states vector for cross attention. + activation_fn (`str`, *optional*, defaults to `"geglu"`): Activation function to be used in feed-forward. + num_embeds_ada_norm (: + obj: `int`, *optional*): The number of diffusion steps used during training. See `Transformer2DModel`. + attention_bias (: + obj: `bool`, *optional*, defaults to `False`): Configure if the attentions should contain a bias parameter. + """ + + def __init__( + self, + dim: int, + num_attention_heads: int, + attention_head_dim: int, + dropout: float = 0.0, + cross_attention_dim: Optional[int] = None, + attention_bias: bool = False, + upcast_attention: bool = False, + temb_channels: int = 768, # for ada_group_norm + add_self_attention: bool = False, + cross_attention_norm: Optional[str] = None, + group_size: int = 32, + ): + super().__init__() + self.add_self_attention = add_self_attention + + # 1. Self-Attn + if add_self_attention: + self.norm1 = AdaGroupNorm(temb_channels, dim, max(1, dim // group_size)) + self.attn1 = Attention( + query_dim=dim, + heads=num_attention_heads, + dim_head=attention_head_dim, + dropout=dropout, + bias=attention_bias, + cross_attention_dim=None, + cross_attention_norm=None, + ) + + # 2. Cross-Attn + self.norm2 = AdaGroupNorm(temb_channels, dim, max(1, dim // group_size)) + self.attn2 = Attention( + query_dim=dim, + cross_attention_dim=cross_attention_dim, + heads=num_attention_heads, + dim_head=attention_head_dim, + dropout=dropout, + bias=attention_bias, + upcast_attention=upcast_attention, + cross_attention_norm=cross_attention_norm, + ) + + def _to_3d(self, hidden_states, height, weight): + return hidden_states.permute(0, 2, 3, 1).reshape(hidden_states.shape[0], height * weight, -1) + + def _to_4d(self, hidden_states, height, weight): + return hidden_states.permute(0, 2, 1).reshape(hidden_states.shape[0], -1, height, weight) + + def forward( + self, + hidden_states: torch.FloatTensor, + encoder_hidden_states: Optional[torch.FloatTensor] = None, + # TODO: mark emb as non-optional (self.norm2 requires it). + # requires assessing impact of change to positional param interface. 
+ emb: Optional[torch.FloatTensor] = None, + attention_mask: Optional[torch.FloatTensor] = None, + cross_attention_kwargs: Optional[Dict[str, Any]] = None, + encoder_attention_mask: Optional[torch.FloatTensor] = None, + ): + cross_attention_kwargs = cross_attention_kwargs if cross_attention_kwargs is not None else {} + + # 1. Self-Attention + if self.add_self_attention: + norm_hidden_states = self.norm1(hidden_states, emb) + + height, weight = norm_hidden_states.shape[2:] + norm_hidden_states = self._to_3d(norm_hidden_states, height, weight) + + attn_output = self.attn1( + norm_hidden_states, + encoder_hidden_states=None, + attention_mask=attention_mask, + **cross_attention_kwargs, + ) + attn_output = self._to_4d(attn_output, height, weight) + + hidden_states = attn_output + hidden_states + + # 2. Cross-Attention/None + norm_hidden_states = self.norm2(hidden_states, emb) + + height, weight = norm_hidden_states.shape[2:] + norm_hidden_states = self._to_3d(norm_hidden_states, height, weight) + attn_output = self.attn2( + norm_hidden_states, + encoder_hidden_states=encoder_hidden_states, + attention_mask=attention_mask if encoder_hidden_states is None else encoder_attention_mask, + **cross_attention_kwargs, + ) + attn_output = self._to_4d(attn_output, height, weight) + + hidden_states = attn_output + hidden_states + + return hidden_states diff --git a/gf/sd-webui-rich-text/diffusers_official/models/unet_2d_blocks_flax.py b/gf/sd-webui-rich-text/diffusers_official/models/unet_2d_blocks_flax.py new file mode 100644 index 0000000000000000000000000000000000000000..0d1447570dda34b814bdc1660dfd37874fed0125 --- /dev/null +++ b/gf/sd-webui-rich-text/diffusers_official/models/unet_2d_blocks_flax.py @@ -0,0 +1,377 @@ +# Copyright 2023 The HuggingFace Team. All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
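Editor's note: a minimal sketch (not part of the diff) of the skip-connection handling shared by the PyTorch up blocks above and the Flax up blocks defined below: the most recent residual is popped from `res_hidden_states_tuple` and concatenated with the current features along the channel axis before each resnet. Shapes are illustrative assumptions (the PyTorch blocks are NCHW; the Flax blocks below concatenate on the last axis, NHWC).

import torch

hidden_states = torch.randn(1, 640, 32, 32)      # current decoder features (NCHW)
res_hidden_states_tuple = (
    torch.randn(1, 320, 32, 32),                  # earlier skip, consumed on a later iteration
    torch.randn(1, 640, 32, 32),                  # most recent skip, consumed first
)

# pop the most recent skip and concatenate along channels, as in UpBlock2D.forward
res_hidden_states = res_hidden_states_tuple[-1]
res_hidden_states_tuple = res_hidden_states_tuple[:-1]
hidden_states = torch.cat([hidden_states, res_hidden_states], dim=1)  # -> (1, 1280, 32, 32)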
+ +import flax.linen as nn +import jax.numpy as jnp + +from .attention_flax import FlaxTransformer2DModel +from .resnet_flax import FlaxDownsample2D, FlaxResnetBlock2D, FlaxUpsample2D + + +class FlaxCrossAttnDownBlock2D(nn.Module): + r""" + Cross Attention 2D Downsizing block - original architecture from Unet transformers: + https://arxiv.org/abs/2103.06104 + + Parameters: + in_channels (:obj:`int`): + Input channels + out_channels (:obj:`int`): + Output channels + dropout (:obj:`float`, *optional*, defaults to 0.0): + Dropout rate + num_layers (:obj:`int`, *optional*, defaults to 1): + Number of attention blocks layers + num_attention_heads (:obj:`int`, *optional*, defaults to 1): + Number of attention heads of each spatial transformer block + add_downsample (:obj:`bool`, *optional*, defaults to `True`): + Whether to add downsampling layer before each final output + use_memory_efficient_attention (`bool`, *optional*, defaults to `False`): + enable memory efficient attention https://arxiv.org/abs/2112.05682 + dtype (:obj:`jnp.dtype`, *optional*, defaults to jnp.float32): + Parameters `dtype` + """ + in_channels: int + out_channels: int + dropout: float = 0.0 + num_layers: int = 1 + num_attention_heads: int = 1 + add_downsample: bool = True + use_linear_projection: bool = False + only_cross_attention: bool = False + use_memory_efficient_attention: bool = False + dtype: jnp.dtype = jnp.float32 + + def setup(self): + resnets = [] + attentions = [] + + for i in range(self.num_layers): + in_channels = self.in_channels if i == 0 else self.out_channels + + res_block = FlaxResnetBlock2D( + in_channels=in_channels, + out_channels=self.out_channels, + dropout_prob=self.dropout, + dtype=self.dtype, + ) + resnets.append(res_block) + + attn_block = FlaxTransformer2DModel( + in_channels=self.out_channels, + n_heads=self.num_attention_heads, + d_head=self.out_channels // self.num_attention_heads, + depth=1, + use_linear_projection=self.use_linear_projection, + only_cross_attention=self.only_cross_attention, + use_memory_efficient_attention=self.use_memory_efficient_attention, + dtype=self.dtype, + ) + attentions.append(attn_block) + + self.resnets = resnets + self.attentions = attentions + + if self.add_downsample: + self.downsamplers_0 = FlaxDownsample2D(self.out_channels, dtype=self.dtype) + + def __call__(self, hidden_states, temb, encoder_hidden_states, deterministic=True): + output_states = () + + for resnet, attn in zip(self.resnets, self.attentions): + hidden_states = resnet(hidden_states, temb, deterministic=deterministic) + hidden_states = attn(hidden_states, encoder_hidden_states, deterministic=deterministic) + output_states += (hidden_states,) + + if self.add_downsample: + hidden_states = self.downsamplers_0(hidden_states) + output_states += (hidden_states,) + + return hidden_states, output_states + + +class FlaxDownBlock2D(nn.Module): + r""" + Flax 2D downsizing block + + Parameters: + in_channels (:obj:`int`): + Input channels + out_channels (:obj:`int`): + Output channels + dropout (:obj:`float`, *optional*, defaults to 0.0): + Dropout rate + num_layers (:obj:`int`, *optional*, defaults to 1): + Number of attention blocks layers + add_downsample (:obj:`bool`, *optional*, defaults to `True`): + Whether to add downsampling layer before each final output + dtype (:obj:`jnp.dtype`, *optional*, defaults to jnp.float32): + Parameters `dtype` + """ + in_channels: int + out_channels: int + dropout: float = 0.0 + num_layers: int = 1 + add_downsample: bool = True + dtype: jnp.dtype = jnp.float32 + + 
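Editor's note: a hedged usage sketch (not part of the diff) for the linen blocks in this file, using the FlaxDownBlock2D class whose body continues just below. These modules are dataclass-style: hyperparameters are class attributes, sublayers are built in setup(), and parameters are created by init(). The commented import path and all shapes are assumptions; the Flax blocks operate on NHWC tensors.

import jax
import jax.numpy as jnp
# from diffusers_official.models.unet_2d_blocks_flax import FlaxDownBlock2D  # assumed path

block = FlaxDownBlock2D(in_channels=320, out_channels=320, num_layers=2)
sample = jnp.zeros((1, 32, 32, 320))   # NHWC feature map
temb = jnp.zeros((1, 1280))            # time embedding
variables = block.init(jax.random.PRNGKey(0), sample, temb)
hidden_states, output_states = block.apply(variables, sample, temb)
# output_states holds the per-layer features (plus the downsampled output) used later as skips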
def setup(self): + resnets = [] + + for i in range(self.num_layers): + in_channels = self.in_channels if i == 0 else self.out_channels + + res_block = FlaxResnetBlock2D( + in_channels=in_channels, + out_channels=self.out_channels, + dropout_prob=self.dropout, + dtype=self.dtype, + ) + resnets.append(res_block) + self.resnets = resnets + + if self.add_downsample: + self.downsamplers_0 = FlaxDownsample2D(self.out_channels, dtype=self.dtype) + + def __call__(self, hidden_states, temb, deterministic=True): + output_states = () + + for resnet in self.resnets: + hidden_states = resnet(hidden_states, temb, deterministic=deterministic) + output_states += (hidden_states,) + + if self.add_downsample: + hidden_states = self.downsamplers_0(hidden_states) + output_states += (hidden_states,) + + return hidden_states, output_states + + +class FlaxCrossAttnUpBlock2D(nn.Module): + r""" + Cross Attention 2D Upsampling block - original architecture from Unet transformers: + https://arxiv.org/abs/2103.06104 + + Parameters: + in_channels (:obj:`int`): + Input channels + out_channels (:obj:`int`): + Output channels + dropout (:obj:`float`, *optional*, defaults to 0.0): + Dropout rate + num_layers (:obj:`int`, *optional*, defaults to 1): + Number of attention blocks layers + num_attention_heads (:obj:`int`, *optional*, defaults to 1): + Number of attention heads of each spatial transformer block + add_upsample (:obj:`bool`, *optional*, defaults to `True`): + Whether to add upsampling layer before each final output + use_memory_efficient_attention (`bool`, *optional*, defaults to `False`): + enable memory efficient attention https://arxiv.org/abs/2112.05682 + dtype (:obj:`jnp.dtype`, *optional*, defaults to jnp.float32): + Parameters `dtype` + """ + in_channels: int + out_channels: int + prev_output_channel: int + dropout: float = 0.0 + num_layers: int = 1 + num_attention_heads: int = 1 + add_upsample: bool = True + use_linear_projection: bool = False + only_cross_attention: bool = False + use_memory_efficient_attention: bool = False + dtype: jnp.dtype = jnp.float32 + + def setup(self): + resnets = [] + attentions = [] + + for i in range(self.num_layers): + res_skip_channels = self.in_channels if (i == self.num_layers - 1) else self.out_channels + resnet_in_channels = self.prev_output_channel if i == 0 else self.out_channels + + res_block = FlaxResnetBlock2D( + in_channels=resnet_in_channels + res_skip_channels, + out_channels=self.out_channels, + dropout_prob=self.dropout, + dtype=self.dtype, + ) + resnets.append(res_block) + + attn_block = FlaxTransformer2DModel( + in_channels=self.out_channels, + n_heads=self.num_attention_heads, + d_head=self.out_channels // self.num_attention_heads, + depth=1, + use_linear_projection=self.use_linear_projection, + only_cross_attention=self.only_cross_attention, + use_memory_efficient_attention=self.use_memory_efficient_attention, + dtype=self.dtype, + ) + attentions.append(attn_block) + + self.resnets = resnets + self.attentions = attentions + + if self.add_upsample: + self.upsamplers_0 = FlaxUpsample2D(self.out_channels, dtype=self.dtype) + + def __call__(self, hidden_states, res_hidden_states_tuple, temb, encoder_hidden_states, deterministic=True): + for resnet, attn in zip(self.resnets, self.attentions): + # pop res hidden states + res_hidden_states = res_hidden_states_tuple[-1] + res_hidden_states_tuple = res_hidden_states_tuple[:-1] + hidden_states = jnp.concatenate((hidden_states, res_hidden_states), axis=-1) + + hidden_states = resnet(hidden_states, temb, 
deterministic=deterministic) + hidden_states = attn(hidden_states, encoder_hidden_states, deterministic=deterministic) + + if self.add_upsample: + hidden_states = self.upsamplers_0(hidden_states) + + return hidden_states + + +class FlaxUpBlock2D(nn.Module): + r""" + Flax 2D upsampling block + + Parameters: + in_channels (:obj:`int`): + Input channels + out_channels (:obj:`int`): + Output channels + prev_output_channel (:obj:`int`): + Output channels from the previous block + dropout (:obj:`float`, *optional*, defaults to 0.0): + Dropout rate + num_layers (:obj:`int`, *optional*, defaults to 1): + Number of attention blocks layers + add_downsample (:obj:`bool`, *optional*, defaults to `True`): + Whether to add downsampling layer before each final output + dtype (:obj:`jnp.dtype`, *optional*, defaults to jnp.float32): + Parameters `dtype` + """ + in_channels: int + out_channels: int + prev_output_channel: int + dropout: float = 0.0 + num_layers: int = 1 + add_upsample: bool = True + dtype: jnp.dtype = jnp.float32 + + def setup(self): + resnets = [] + + for i in range(self.num_layers): + res_skip_channels = self.in_channels if (i == self.num_layers - 1) else self.out_channels + resnet_in_channels = self.prev_output_channel if i == 0 else self.out_channels + + res_block = FlaxResnetBlock2D( + in_channels=resnet_in_channels + res_skip_channels, + out_channels=self.out_channels, + dropout_prob=self.dropout, + dtype=self.dtype, + ) + resnets.append(res_block) + + self.resnets = resnets + + if self.add_upsample: + self.upsamplers_0 = FlaxUpsample2D(self.out_channels, dtype=self.dtype) + + def __call__(self, hidden_states, res_hidden_states_tuple, temb, deterministic=True): + for resnet in self.resnets: + # pop res hidden states + res_hidden_states = res_hidden_states_tuple[-1] + res_hidden_states_tuple = res_hidden_states_tuple[:-1] + hidden_states = jnp.concatenate((hidden_states, res_hidden_states), axis=-1) + + hidden_states = resnet(hidden_states, temb, deterministic=deterministic) + + if self.add_upsample: + hidden_states = self.upsamplers_0(hidden_states) + + return hidden_states + + +class FlaxUNetMidBlock2DCrossAttn(nn.Module): + r""" + Cross Attention 2D Mid-level block - original architecture from Unet transformers: https://arxiv.org/abs/2103.06104 + + Parameters: + in_channels (:obj:`int`): + Input channels + dropout (:obj:`float`, *optional*, defaults to 0.0): + Dropout rate + num_layers (:obj:`int`, *optional*, defaults to 1): + Number of attention blocks layers + num_attention_heads (:obj:`int`, *optional*, defaults to 1): + Number of attention heads of each spatial transformer block + use_memory_efficient_attention (`bool`, *optional*, defaults to `False`): + enable memory efficient attention https://arxiv.org/abs/2112.05682 + dtype (:obj:`jnp.dtype`, *optional*, defaults to jnp.float32): + Parameters `dtype` + """ + in_channels: int + dropout: float = 0.0 + num_layers: int = 1 + num_attention_heads: int = 1 + use_linear_projection: bool = False + use_memory_efficient_attention: bool = False + dtype: jnp.dtype = jnp.float32 + + def setup(self): + # there is always at least one resnet + resnets = [ + FlaxResnetBlock2D( + in_channels=self.in_channels, + out_channels=self.in_channels, + dropout_prob=self.dropout, + dtype=self.dtype, + ) + ] + + attentions = [] + + for _ in range(self.num_layers): + attn_block = FlaxTransformer2DModel( + in_channels=self.in_channels, + n_heads=self.num_attention_heads, + d_head=self.in_channels // self.num_attention_heads, + depth=1, + 
use_linear_projection=self.use_linear_projection, + use_memory_efficient_attention=self.use_memory_efficient_attention, + dtype=self.dtype, + ) + attentions.append(attn_block) + + res_block = FlaxResnetBlock2D( + in_channels=self.in_channels, + out_channels=self.in_channels, + dropout_prob=self.dropout, + dtype=self.dtype, + ) + resnets.append(res_block) + + self.resnets = resnets + self.attentions = attentions + + def __call__(self, hidden_states, temb, encoder_hidden_states, deterministic=True): + hidden_states = self.resnets[0](hidden_states, temb) + for attn, resnet in zip(self.attentions, self.resnets[1:]): + hidden_states = attn(hidden_states, encoder_hidden_states, deterministic=deterministic) + hidden_states = resnet(hidden_states, temb, deterministic=deterministic) + + return hidden_states diff --git a/gf/sd-webui-rich-text/diffusers_official/models/unet_2d_condition.py b/gf/sd-webui-rich-text/diffusers_official/models/unet_2d_condition.py new file mode 100644 index 0000000000000000000000000000000000000000..dee71bead0f99b2314b7b14b37324aeeed4a238c --- /dev/null +++ b/gf/sd-webui-rich-text/diffusers_official/models/unet_2d_condition.py @@ -0,0 +1,980 @@ +# Copyright 2023 The HuggingFace Team. All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +from dataclasses import dataclass +from typing import Any, Dict, List, Optional, Tuple, Union + +import torch +import torch.nn as nn +import torch.utils.checkpoint + +from ..configuration_utils import ConfigMixin, register_to_config +from ..loaders import UNet2DConditionLoadersMixin +from ..utils import BaseOutput, logging +from .activations import get_activation +from .attention_processor import AttentionProcessor, AttnProcessor +from .embeddings import ( + GaussianFourierProjection, + ImageHintTimeEmbedding, + ImageProjection, + ImageTimeEmbedding, + TextImageProjection, + TextImageTimeEmbedding, + TextTimeEmbedding, + TimestepEmbedding, + Timesteps, +) +from .modeling_utils import ModelMixin +from .unet_2d_blocks import ( + CrossAttnDownBlock2D, + CrossAttnUpBlock2D, + DownBlock2D, + UNetMidBlock2DCrossAttn, + UNetMidBlock2DSimpleCrossAttn, + UpBlock2D, + get_down_block, + get_up_block, +) + + +logger = logging.get_logger(__name__) # pylint: disable=invalid-name + + +@dataclass +class UNet2DConditionOutput(BaseOutput): + """ + The output of [`UNet2DConditionModel`]. + + Args: + sample (`torch.FloatTensor` of shape `(batch_size, num_channels, height, width)`): + The hidden states output conditioned on `encoder_hidden_states` input. Output of last layer of model. + """ + + sample: torch.FloatTensor = None + + +class UNet2DConditionModel(ModelMixin, ConfigMixin, UNet2DConditionLoadersMixin): + r""" + A conditional 2D UNet model that takes a noisy sample, conditional state, and a timestep and returns a sample + shaped output. + + This model inherits from [`ModelMixin`]. Check the superclass documentation for it's generic methods implemented + for all models (such as downloading or saving). 
+ + Parameters: + sample_size (`int` or `Tuple[int, int]`, *optional*, defaults to `None`): + Height and width of input/output sample. + in_channels (`int`, *optional*, defaults to 4): Number of channels in the input sample. + out_channels (`int`, *optional*, defaults to 4): Number of channels in the output. + center_input_sample (`bool`, *optional*, defaults to `False`): Whether to center the input sample. + flip_sin_to_cos (`bool`, *optional*, defaults to `False`): + Whether to flip the sin to cos in the time embedding. + freq_shift (`int`, *optional*, defaults to 0): The frequency shift to apply to the time embedding. + down_block_types (`Tuple[str]`, *optional*, defaults to `("CrossAttnDownBlock2D", "CrossAttnDownBlock2D", "CrossAttnDownBlock2D", "DownBlock2D")`): + The tuple of downsample blocks to use. + mid_block_type (`str`, *optional*, defaults to `"UNetMidBlock2DCrossAttn"`): + Block type for middle of UNet, it can be either `UNetMidBlock2DCrossAttn` or + `UNetMidBlock2DSimpleCrossAttn`. If `None`, the mid block layer is skipped. + up_block_types (`Tuple[str]`, *optional*, defaults to `("UpBlock2D", "CrossAttnUpBlock2D", "CrossAttnUpBlock2D", "CrossAttnUpBlock2D")`): + The tuple of upsample blocks to use. + only_cross_attention(`bool` or `Tuple[bool]`, *optional*, default to `False`): + Whether to include self-attention in the basic transformer blocks, see + [`~models.attention.BasicTransformerBlock`]. + block_out_channels (`Tuple[int]`, *optional*, defaults to `(320, 640, 1280, 1280)`): + The tuple of output channels for each block. + layers_per_block (`int`, *optional*, defaults to 2): The number of layers per block. + downsample_padding (`int`, *optional*, defaults to 1): The padding to use for the downsampling convolution. + mid_block_scale_factor (`float`, *optional*, defaults to 1.0): The scale factor to use for the mid block. + act_fn (`str`, *optional*, defaults to `"silu"`): The activation function to use. + norm_num_groups (`int`, *optional*, defaults to 32): The number of groups to use for the normalization. + If `None`, normalization and activation layers is skipped in post-processing. + norm_eps (`float`, *optional*, defaults to 1e-5): The epsilon to use for the normalization. + cross_attention_dim (`int` or `Tuple[int]`, *optional*, defaults to 1280): + The dimension of the cross attention features. + transformer_layers_per_block (`int` or `Tuple[int]`, *optional*, defaults to 1): + The number of transformer blocks of type [`~models.attention.BasicTransformerBlock`]. Only relevant for + [`~models.unet_2d_blocks.CrossAttnDownBlock2D`], [`~models.unet_2d_blocks.CrossAttnUpBlock2D`], + [`~models.unet_2d_blocks.UNetMidBlock2DCrossAttn`]. + encoder_hid_dim (`int`, *optional*, defaults to None): + If `encoder_hid_dim_type` is defined, `encoder_hidden_states` will be projected from `encoder_hid_dim` + dimension to `cross_attention_dim`. + encoder_hid_dim_type (`str`, *optional*, defaults to `None`): + If given, the `encoder_hidden_states` and potentially other embeddings are down-projected to text + embeddings of dimension `cross_attention` according to `encoder_hid_dim_type`. + attention_head_dim (`int`, *optional*, defaults to 8): The dimension of the attention heads. + num_attention_heads (`int`, *optional*): + The number of attention heads. If not defined, defaults to `attention_head_dim` + resnet_time_scale_shift (`str`, *optional*, defaults to `"default"`): Time scale shift config + for ResNet blocks (see [`~models.resnet.ResnetBlock2D`]). 
Choose from `default` or `scale_shift`. + class_embed_type (`str`, *optional*, defaults to `None`): + The type of class embedding to use which is ultimately summed with the time embeddings. Choose from `None`, + `"timestep"`, `"identity"`, `"projection"`, or `"simple_projection"`. + addition_embed_type (`str`, *optional*, defaults to `None`): + Configures an optional embedding which will be summed with the time embeddings. Choose from `None` or + "text". "text" will use the `TextTimeEmbedding` layer. + addition_time_embed_dim: (`int`, *optional*, defaults to `None`): + Dimension for the timestep embeddings. + num_class_embeds (`int`, *optional*, defaults to `None`): + Input dimension of the learnable embedding matrix to be projected to `time_embed_dim`, when performing + class conditioning with `class_embed_type` equal to `None`. + time_embedding_type (`str`, *optional*, defaults to `positional`): + The type of position embedding to use for timesteps. Choose from `positional` or `fourier`. + time_embedding_dim (`int`, *optional*, defaults to `None`): + An optional override for the dimension of the projected time embedding. + time_embedding_act_fn (`str`, *optional*, defaults to `None`): + Optional activation function to use only once on the time embeddings before they are passed to the rest of + the UNet. Choose from `silu`, `mish`, `gelu`, and `swish`. + timestep_post_act (`str`, *optional*, defaults to `None`): + The second activation function to use in timestep embedding. Choose from `silu`, `mish` and `gelu`. + time_cond_proj_dim (`int`, *optional*, defaults to `None`): + The dimension of `cond_proj` layer in the timestep embedding. + conv_in_kernel (`int`, *optional*, default to `3`): The kernel size of `conv_in` layer. + conv_out_kernel (`int`, *optional*, default to `3`): The kernel size of `conv_out` layer. + projection_class_embeddings_input_dim (`int`, *optional*): The dimension of the `class_labels` input when + `class_embed_type="projection"`. Required when `class_embed_type="projection"`. + class_embeddings_concat (`bool`, *optional*, defaults to `False`): Whether to concatenate the time + embeddings with the class embeddings. + mid_block_only_cross_attention (`bool`, *optional*, defaults to `None`): + Whether to use cross attention with the mid block when using the `UNetMidBlock2DSimpleCrossAttn`. If + `only_cross_attention` is given as a single boolean and `mid_block_only_cross_attention` is `None`, the + `only_cross_attention` value is used as the value for `mid_block_only_cross_attention`. Default to `False` + otherwise. 
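Editor's note (an aside, not part of the upstream docstring): a hedged construction sketch. The defaults in the signature below roughly match the Stable Diffusion v1 layout; `cross_attention_dim=768` is passed explicitly because v1 text embeddings are 768-dimensional, and the commented import path is an assumption.

import torch
# from diffusers_official.models.unet_2d_condition import UNet2DConditionModel  # assumed path

unet = UNet2DConditionModel(
    sample_size=64,
    in_channels=4,
    out_channels=4,
    cross_attention_dim=768,
)
sample = torch.randn(1, 4, 64, 64)                # noisy latents
timestep = torch.tensor([10])
encoder_hidden_states = torch.randn(1, 77, 768)   # text-encoder hidden states
with torch.no_grad():
    out = unet(sample, timestep, encoder_hidden_states).sample   # -> (1, 4, 64, 64)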
+ """ + + _supports_gradient_checkpointing = True + + @register_to_config + def __init__( + self, + sample_size: Optional[int] = None, + in_channels: int = 4, + out_channels: int = 4, + center_input_sample: bool = False, + flip_sin_to_cos: bool = True, + freq_shift: int = 0, + down_block_types: Tuple[str] = ( + "CrossAttnDownBlock2D", + "CrossAttnDownBlock2D", + "CrossAttnDownBlock2D", + "DownBlock2D", + ), + mid_block_type: Optional[str] = "UNetMidBlock2DCrossAttn", + up_block_types: Tuple[str] = ("UpBlock2D", "CrossAttnUpBlock2D", "CrossAttnUpBlock2D", "CrossAttnUpBlock2D"), + only_cross_attention: Union[bool, Tuple[bool]] = False, + block_out_channels: Tuple[int] = (320, 640, 1280, 1280), + layers_per_block: Union[int, Tuple[int]] = 2, + downsample_padding: int = 1, + mid_block_scale_factor: float = 1, + act_fn: str = "silu", + norm_num_groups: Optional[int] = 32, + norm_eps: float = 1e-5, + cross_attention_dim: Union[int, Tuple[int]] = 1280, + transformer_layers_per_block: Union[int, Tuple[int]] = 1, + encoder_hid_dim: Optional[int] = None, + encoder_hid_dim_type: Optional[str] = None, + attention_head_dim: Union[int, Tuple[int]] = 8, + num_attention_heads: Optional[Union[int, Tuple[int]]] = None, + dual_cross_attention: bool = False, + use_linear_projection: bool = False, + class_embed_type: Optional[str] = None, + addition_embed_type: Optional[str] = None, + addition_time_embed_dim: Optional[int] = None, + num_class_embeds: Optional[int] = None, + upcast_attention: bool = False, + resnet_time_scale_shift: str = "default", + resnet_skip_time_act: bool = False, + resnet_out_scale_factor: int = 1.0, + time_embedding_type: str = "positional", + time_embedding_dim: Optional[int] = None, + time_embedding_act_fn: Optional[str] = None, + timestep_post_act: Optional[str] = None, + time_cond_proj_dim: Optional[int] = None, + conv_in_kernel: int = 3, + conv_out_kernel: int = 3, + projection_class_embeddings_input_dim: Optional[int] = None, + class_embeddings_concat: bool = False, + mid_block_only_cross_attention: Optional[bool] = None, + cross_attention_norm: Optional[str] = None, + addition_embed_type_num_heads=64, + ): + super().__init__() + + self.sample_size = sample_size + + if num_attention_heads is not None: + raise ValueError( + "At the moment it is not possible to define the number of attention heads via `num_attention_heads` because of a naming issue as described in https://github.com/huggingface/diffusers/issues/2011#issuecomment-1547958131. Passing `num_attention_heads` will only be supported in diffusers v0.19." + ) + + # If `num_attention_heads` is not defined (which is the case for most models) + # it will default to `attention_head_dim`. This looks weird upon first reading it and it is. + # The reason for this behavior is to correct for incorrectly named variables that were introduced + # when this library was created. The incorrect naming was only discovered much later in https://github.com/huggingface/diffusers/issues/2011#issuecomment-1547958131 + # Changing `attention_head_dim` to `num_attention_heads` for 40,000+ configurations is too backwards breaking + # which is why we correct for the naming here. + num_attention_heads = num_attention_heads or attention_head_dim + + # Check inputs + if len(down_block_types) != len(up_block_types): + raise ValueError( + f"Must provide the same number of `down_block_types` as `up_block_types`. `down_block_types`: {down_block_types}. `up_block_types`: {up_block_types}." 
+ ) + + if len(block_out_channels) != len(down_block_types): + raise ValueError( + f"Must provide the same number of `block_out_channels` as `down_block_types`. `block_out_channels`: {block_out_channels}. `down_block_types`: {down_block_types}." + ) + + if not isinstance(only_cross_attention, bool) and len(only_cross_attention) != len(down_block_types): + raise ValueError( + f"Must provide the same number of `only_cross_attention` as `down_block_types`. `only_cross_attention`: {only_cross_attention}. `down_block_types`: {down_block_types}." + ) + + if not isinstance(num_attention_heads, int) and len(num_attention_heads) != len(down_block_types): + raise ValueError( + f"Must provide the same number of `num_attention_heads` as `down_block_types`. `num_attention_heads`: {num_attention_heads}. `down_block_types`: {down_block_types}." + ) + + if not isinstance(attention_head_dim, int) and len(attention_head_dim) != len(down_block_types): + raise ValueError( + f"Must provide the same number of `attention_head_dim` as `down_block_types`. `attention_head_dim`: {attention_head_dim}. `down_block_types`: {down_block_types}." + ) + + if isinstance(cross_attention_dim, list) and len(cross_attention_dim) != len(down_block_types): + raise ValueError( + f"Must provide the same number of `cross_attention_dim` as `down_block_types`. `cross_attention_dim`: {cross_attention_dim}. `down_block_types`: {down_block_types}." + ) + + if not isinstance(layers_per_block, int) and len(layers_per_block) != len(down_block_types): + raise ValueError( + f"Must provide the same number of `layers_per_block` as `down_block_types`. `layers_per_block`: {layers_per_block}. `down_block_types`: {down_block_types}." + ) + + # input + conv_in_padding = (conv_in_kernel - 1) // 2 + self.conv_in = nn.Conv2d( + in_channels, block_out_channels[0], kernel_size=conv_in_kernel, padding=conv_in_padding + ) + + # time + if time_embedding_type == "fourier": + time_embed_dim = time_embedding_dim or block_out_channels[0] * 2 + if time_embed_dim % 2 != 0: + raise ValueError(f"`time_embed_dim` should be divisible by 2, but is {time_embed_dim}.") + self.time_proj = GaussianFourierProjection( + time_embed_dim // 2, set_W_to_weight=False, log=False, flip_sin_to_cos=flip_sin_to_cos + ) + timestep_input_dim = time_embed_dim + elif time_embedding_type == "positional": + time_embed_dim = time_embedding_dim or block_out_channels[0] * 4 + + self.time_proj = Timesteps(block_out_channels[0], flip_sin_to_cos, freq_shift) + timestep_input_dim = block_out_channels[0] + else: + raise ValueError( + f"{time_embedding_type} does not exist. Please make sure to use one of `fourier` or `positional`." + ) + + self.time_embedding = TimestepEmbedding( + timestep_input_dim, + time_embed_dim, + act_fn=act_fn, + post_act_fn=timestep_post_act, + cond_proj_dim=time_cond_proj_dim, + ) + + if encoder_hid_dim_type is None and encoder_hid_dim is not None: + encoder_hid_dim_type = "text_proj" + self.register_to_config(encoder_hid_dim_type=encoder_hid_dim_type) + logger.info("encoder_hid_dim_type defaults to 'text_proj' as `encoder_hid_dim` is defined.") + + if encoder_hid_dim is None and encoder_hid_dim_type is not None: + raise ValueError( + f"`encoder_hid_dim` has to be defined when `encoder_hid_dim_type` is set to {encoder_hid_dim_type}." + ) + + if encoder_hid_dim_type == "text_proj": + self.encoder_hid_proj = nn.Linear(encoder_hid_dim, cross_attention_dim) + elif encoder_hid_dim_type == "text_image_proj": + # image_embed_dim DOESN'T have to be `cross_attention_dim`. 
To not clutter the __init__ too much + # they are set to `cross_attention_dim` here as this is exactly the required dimension for the currently only use + # case when `addition_embed_type == "text_image_proj"` (Kadinsky 2.1)` + self.encoder_hid_proj = TextImageProjection( + text_embed_dim=encoder_hid_dim, + image_embed_dim=cross_attention_dim, + cross_attention_dim=cross_attention_dim, + ) + elif encoder_hid_dim_type == "image_proj": + # Kandinsky 2.2 + self.encoder_hid_proj = ImageProjection( + image_embed_dim=encoder_hid_dim, + cross_attention_dim=cross_attention_dim, + ) + elif encoder_hid_dim_type is not None: + raise ValueError( + f"encoder_hid_dim_type: {encoder_hid_dim_type} must be None, 'text_proj' or 'text_image_proj'." + ) + else: + self.encoder_hid_proj = None + + # class embedding + if class_embed_type is None and num_class_embeds is not None: + self.class_embedding = nn.Embedding(num_class_embeds, time_embed_dim) + elif class_embed_type == "timestep": + self.class_embedding = TimestepEmbedding(timestep_input_dim, time_embed_dim, act_fn=act_fn) + elif class_embed_type == "identity": + self.class_embedding = nn.Identity(time_embed_dim, time_embed_dim) + elif class_embed_type == "projection": + if projection_class_embeddings_input_dim is None: + raise ValueError( + "`class_embed_type`: 'projection' requires `projection_class_embeddings_input_dim` be set" + ) + # The projection `class_embed_type` is the same as the timestep `class_embed_type` except + # 1. the `class_labels` inputs are not first converted to sinusoidal embeddings + # 2. it projects from an arbitrary input dimension. + # + # Note that `TimestepEmbedding` is quite general, being mainly linear layers and activations. + # When used for embedding actual timesteps, the timesteps are first converted to sinusoidal embeddings. + # As a result, `TimestepEmbedding` can be passed arbitrary vectors. + self.class_embedding = TimestepEmbedding(projection_class_embeddings_input_dim, time_embed_dim) + elif class_embed_type == "simple_projection": + if projection_class_embeddings_input_dim is None: + raise ValueError( + "`class_embed_type`: 'simple_projection' requires `projection_class_embeddings_input_dim` be set" + ) + self.class_embedding = nn.Linear(projection_class_embeddings_input_dim, time_embed_dim) + else: + self.class_embedding = None + + if addition_embed_type == "text": + if encoder_hid_dim is not None: + text_time_embedding_from_dim = encoder_hid_dim + else: + text_time_embedding_from_dim = cross_attention_dim + + self.add_embedding = TextTimeEmbedding( + text_time_embedding_from_dim, time_embed_dim, num_heads=addition_embed_type_num_heads + ) + elif addition_embed_type == "text_image": + # text_embed_dim and image_embed_dim DON'T have to be `cross_attention_dim`. 
To not clutter the __init__ too much + # they are set to `cross_attention_dim` here as this is exactly the required dimension for the currently only use + # case when `addition_embed_type == "text_image"` (Kadinsky 2.1)` + self.add_embedding = TextImageTimeEmbedding( + text_embed_dim=cross_attention_dim, image_embed_dim=cross_attention_dim, time_embed_dim=time_embed_dim + ) + elif addition_embed_type == "text_time": + self.add_time_proj = Timesteps(addition_time_embed_dim, flip_sin_to_cos, freq_shift) + self.add_embedding = TimestepEmbedding(projection_class_embeddings_input_dim, time_embed_dim) + elif addition_embed_type == "image": + # Kandinsky 2.2 + self.add_embedding = ImageTimeEmbedding(image_embed_dim=encoder_hid_dim, time_embed_dim=time_embed_dim) + elif addition_embed_type == "image_hint": + # Kandinsky 2.2 ControlNet + self.add_embedding = ImageHintTimeEmbedding(image_embed_dim=encoder_hid_dim, time_embed_dim=time_embed_dim) + elif addition_embed_type is not None: + raise ValueError(f"addition_embed_type: {addition_embed_type} must be None, 'text' or 'text_image'.") + + if time_embedding_act_fn is None: + self.time_embed_act = None + else: + self.time_embed_act = get_activation(time_embedding_act_fn) + + self.down_blocks = nn.ModuleList([]) + self.up_blocks = nn.ModuleList([]) + + if isinstance(only_cross_attention, bool): + if mid_block_only_cross_attention is None: + mid_block_only_cross_attention = only_cross_attention + + only_cross_attention = [only_cross_attention] * len(down_block_types) + + if mid_block_only_cross_attention is None: + mid_block_only_cross_attention = False + + if isinstance(num_attention_heads, int): + num_attention_heads = (num_attention_heads,) * len(down_block_types) + + if isinstance(attention_head_dim, int): + attention_head_dim = (attention_head_dim,) * len(down_block_types) + + if isinstance(cross_attention_dim, int): + cross_attention_dim = (cross_attention_dim,) * len(down_block_types) + + if isinstance(layers_per_block, int): + layers_per_block = [layers_per_block] * len(down_block_types) + + if isinstance(transformer_layers_per_block, int): + transformer_layers_per_block = [transformer_layers_per_block] * len(down_block_types) + + if class_embeddings_concat: + # The time embeddings are concatenated with the class embeddings. 
The dimension of the + # time embeddings passed to the down, middle, and up blocks is twice the dimension of the + # regular time embeddings + blocks_time_embed_dim = time_embed_dim * 2 + else: + blocks_time_embed_dim = time_embed_dim + + # down + output_channel = block_out_channels[0] + for i, down_block_type in enumerate(down_block_types): + input_channel = output_channel + output_channel = block_out_channels[i] + is_final_block = i == len(block_out_channels) - 1 + + down_block = get_down_block( + down_block_type, + num_layers=layers_per_block[i], + transformer_layers_per_block=transformer_layers_per_block[i], + in_channels=input_channel, + out_channels=output_channel, + temb_channels=blocks_time_embed_dim, + add_downsample=not is_final_block, + resnet_eps=norm_eps, + resnet_act_fn=act_fn, + resnet_groups=norm_num_groups, + cross_attention_dim=cross_attention_dim[i], + num_attention_heads=num_attention_heads[i], + downsample_padding=downsample_padding, + dual_cross_attention=dual_cross_attention, + use_linear_projection=use_linear_projection, + only_cross_attention=only_cross_attention[i], + upcast_attention=upcast_attention, + resnet_time_scale_shift=resnet_time_scale_shift, + resnet_skip_time_act=resnet_skip_time_act, + resnet_out_scale_factor=resnet_out_scale_factor, + cross_attention_norm=cross_attention_norm, + attention_head_dim=attention_head_dim[i] if attention_head_dim[i] is not None else output_channel, + ) + self.down_blocks.append(down_block) + + # mid + if mid_block_type == "UNetMidBlock2DCrossAttn": + self.mid_block = UNetMidBlock2DCrossAttn( + transformer_layers_per_block=transformer_layers_per_block[-1], + in_channels=block_out_channels[-1], + temb_channels=blocks_time_embed_dim, + resnet_eps=norm_eps, + resnet_act_fn=act_fn, + output_scale_factor=mid_block_scale_factor, + resnet_time_scale_shift=resnet_time_scale_shift, + cross_attention_dim=cross_attention_dim[-1], + num_attention_heads=num_attention_heads[-1], + resnet_groups=norm_num_groups, + dual_cross_attention=dual_cross_attention, + use_linear_projection=use_linear_projection, + upcast_attention=upcast_attention, + ) + elif mid_block_type == "UNetMidBlock2DSimpleCrossAttn": + self.mid_block = UNetMidBlock2DSimpleCrossAttn( + in_channels=block_out_channels[-1], + temb_channels=blocks_time_embed_dim, + resnet_eps=norm_eps, + resnet_act_fn=act_fn, + output_scale_factor=mid_block_scale_factor, + cross_attention_dim=cross_attention_dim[-1], + attention_head_dim=attention_head_dim[-1], + resnet_groups=norm_num_groups, + resnet_time_scale_shift=resnet_time_scale_shift, + skip_time_act=resnet_skip_time_act, + only_cross_attention=mid_block_only_cross_attention, + cross_attention_norm=cross_attention_norm, + ) + elif mid_block_type is None: + self.mid_block = None + else: + raise ValueError(f"unknown mid_block_type : {mid_block_type}") + + # count how many layers upsample the images + self.num_upsamplers = 0 + + # up + reversed_block_out_channels = list(reversed(block_out_channels)) + reversed_num_attention_heads = list(reversed(num_attention_heads)) + reversed_layers_per_block = list(reversed(layers_per_block)) + reversed_cross_attention_dim = list(reversed(cross_attention_dim)) + reversed_transformer_layers_per_block = list(reversed(transformer_layers_per_block)) + only_cross_attention = list(reversed(only_cross_attention)) + + output_channel = reversed_block_out_channels[0] + for i, up_block_type in enumerate(up_block_types): + is_final_block = i == len(block_out_channels) - 1 + + prev_output_channel = output_channel + 
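# Editor's note: a worked example (not part of the diff) of the channel bookkeeping this
# loop performs for the default block_out_channels=(320, 640, 1280, 1280); the original
# loop body continues immediately below.

block_out_channels = (320, 640, 1280, 1280)
reversed_block_out_channels = list(reversed(block_out_channels))   # [1280, 1280, 640, 320]

output_channel = reversed_block_out_channels[0]
for i in range(len(block_out_channels)):
    prev_output_channel = output_channel
    output_channel = reversed_block_out_channels[i]
    input_channel = reversed_block_out_channels[min(i + 1, len(block_out_channels) - 1)]
    print(i, prev_output_channel, output_channel, input_channel)
# prints:
# 0 1280 1280 1280
# 1 1280 1280 640
# 2 1280 640  320
# 3 640  320  320   <- final block, so add_upsample is False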
output_channel = reversed_block_out_channels[i] + input_channel = reversed_block_out_channels[min(i + 1, len(block_out_channels) - 1)] + + # add upsample block for all BUT final layer + if not is_final_block: + add_upsample = True + self.num_upsamplers += 1 + else: + add_upsample = False + + up_block = get_up_block( + up_block_type, + num_layers=reversed_layers_per_block[i] + 1, + transformer_layers_per_block=reversed_transformer_layers_per_block[i], + in_channels=input_channel, + out_channels=output_channel, + prev_output_channel=prev_output_channel, + temb_channels=blocks_time_embed_dim, + add_upsample=add_upsample, + resnet_eps=norm_eps, + resnet_act_fn=act_fn, + resnet_groups=norm_num_groups, + cross_attention_dim=reversed_cross_attention_dim[i], + num_attention_heads=reversed_num_attention_heads[i], + dual_cross_attention=dual_cross_attention, + use_linear_projection=use_linear_projection, + only_cross_attention=only_cross_attention[i], + upcast_attention=upcast_attention, + resnet_time_scale_shift=resnet_time_scale_shift, + resnet_skip_time_act=resnet_skip_time_act, + resnet_out_scale_factor=resnet_out_scale_factor, + cross_attention_norm=cross_attention_norm, + attention_head_dim=attention_head_dim[i] if attention_head_dim[i] is not None else output_channel, + ) + self.up_blocks.append(up_block) + prev_output_channel = output_channel + + # out + if norm_num_groups is not None: + self.conv_norm_out = nn.GroupNorm( + num_channels=block_out_channels[0], num_groups=norm_num_groups, eps=norm_eps + ) + + self.conv_act = get_activation(act_fn) + + else: + self.conv_norm_out = None + self.conv_act = None + + conv_out_padding = (conv_out_kernel - 1) // 2 + self.conv_out = nn.Conv2d( + block_out_channels[0], out_channels, kernel_size=conv_out_kernel, padding=conv_out_padding + ) + + @property + def attn_processors(self) -> Dict[str, AttentionProcessor]: + r""" + Returns: + `dict` of attention processors: A dictionary containing all attention processors used in the model with + indexed by its weight name. + """ + # set recursively + processors = {} + + def fn_recursive_add_processors(name: str, module: torch.nn.Module, processors: Dict[str, AttentionProcessor]): + if hasattr(module, "set_processor"): + processors[f"{name}.processor"] = module.processor + + for sub_name, child in module.named_children(): + fn_recursive_add_processors(f"{name}.{sub_name}", child, processors) + + return processors + + for name, module in self.named_children(): + fn_recursive_add_processors(name, module, processors) + + return processors + + def set_attn_processor(self, processor: Union[AttentionProcessor, Dict[str, AttentionProcessor]]): + r""" + Sets the attention processor to use to compute attention. + + Parameters: + processor (`dict` of `AttentionProcessor` or only `AttentionProcessor`): + The instantiated processor class or a dictionary of processor classes that will be set as the processor + for **all** `Attention` layers. + + If `processor` is a dict, the key needs to define the path to the corresponding cross attention + processor. This is strongly recommended when setting trainable attention processors. + + """ + count = len(self.attn_processors.keys()) + + if isinstance(processor, dict) and len(processor) != count: + raise ValueError( + f"A dict of processors was passed, but the number of processors {len(processor)} does not match the" + f" number of attention layers: {count}. Please make sure to pass {count} processor classes." 
+ ) + + def fn_recursive_attn_processor(name: str, module: torch.nn.Module, processor): + if hasattr(module, "set_processor"): + if not isinstance(processor, dict): + module.set_processor(processor) + else: + module.set_processor(processor.pop(f"{name}.processor")) + + for sub_name, child in module.named_children(): + fn_recursive_attn_processor(f"{name}.{sub_name}", child, processor) + + for name, module in self.named_children(): + fn_recursive_attn_processor(name, module, processor) + + def set_default_attn_processor(self): + """ + Disables custom attention processors and sets the default attention implementation. + """ + self.set_attn_processor(AttnProcessor()) + + def set_attention_slice(self, slice_size): + r""" + Enable sliced attention computation. + + When this option is enabled, the attention module splits the input tensor in slices to compute attention in + several steps. This is useful for saving some memory in exchange for a small decrease in speed. + + Args: + slice_size (`str` or `int` or `list(int)`, *optional*, defaults to `"auto"`): + When `"auto"`, input to the attention heads is halved, so attention is computed in two steps. If + `"max"`, maximum amount of memory is saved by running only one slice at a time. If a number is + provided, uses as many slices as `attention_head_dim // slice_size`. In this case, `attention_head_dim` + must be a multiple of `slice_size`. + """ + sliceable_head_dims = [] + + def fn_recursive_retrieve_sliceable_dims(module: torch.nn.Module): + if hasattr(module, "set_attention_slice"): + sliceable_head_dims.append(module.sliceable_head_dim) + + for child in module.children(): + fn_recursive_retrieve_sliceable_dims(child) + + # retrieve number of attention layers + for module in self.children(): + fn_recursive_retrieve_sliceable_dims(module) + + num_sliceable_layers = len(sliceable_head_dims) + + if slice_size == "auto": + # half the attention head size is usually a good trade-off between + # speed and memory + slice_size = [dim // 2 for dim in sliceable_head_dims] + elif slice_size == "max": + # make smallest slice possible + slice_size = num_sliceable_layers * [1] + + slice_size = num_sliceable_layers * [slice_size] if not isinstance(slice_size, list) else slice_size + + if len(slice_size) != len(sliceable_head_dims): + raise ValueError( + f"You have provided {len(slice_size)}, but {self.config} has {len(sliceable_head_dims)} different" + f" attention layers. Make sure to match `len(slice_size)` to be {len(sliceable_head_dims)}." + ) + + for i in range(len(slice_size)): + size = slice_size[i] + dim = sliceable_head_dims[i] + if size is not None and size > dim: + raise ValueError(f"size {size} has to be smaller or equal to {dim}.") + + # Recursively walk through all the children. 
+        # Any children which expose the set_attention_slice method
+        # get the message
+        def fn_recursive_set_attention_slice(module: torch.nn.Module, slice_size: List[int]):
+            if hasattr(module, "set_attention_slice"):
+                module.set_attention_slice(slice_size.pop())
+
+            for child in module.children():
+                fn_recursive_set_attention_slice(child, slice_size)
+
+        reversed_slice_size = list(reversed(slice_size))
+        for module in self.children():
+            fn_recursive_set_attention_slice(module, reversed_slice_size)
+
+    def _set_gradient_checkpointing(self, module, value=False):
+        if isinstance(module, (CrossAttnDownBlock2D, DownBlock2D, CrossAttnUpBlock2D, UpBlock2D)):
+            module.gradient_checkpointing = value
+
+    def forward(
+        self,
+        sample: torch.FloatTensor,
+        timestep: Union[torch.Tensor, float, int],
+        encoder_hidden_states: torch.Tensor,
+        class_labels: Optional[torch.Tensor] = None,
+        timestep_cond: Optional[torch.Tensor] = None,
+        attention_mask: Optional[torch.Tensor] = None,
+        cross_attention_kwargs: Optional[Dict[str, Any]] = None,
+        added_cond_kwargs: Optional[Dict[str, torch.Tensor]] = None,
+        down_block_additional_residuals: Optional[Tuple[torch.Tensor]] = None,
+        mid_block_additional_residual: Optional[torch.Tensor] = None,
+        encoder_attention_mask: Optional[torch.Tensor] = None,
+        return_dict: bool = True,
+    ) -> Union[UNet2DConditionOutput, Tuple]:
+        r"""
+        The [`UNet2DConditionModel`] forward method.
+
+        Args:
+            sample (`torch.FloatTensor`):
+                The noisy input tensor with the following shape `(batch, channel, height, width)`.
+            timestep (`torch.FloatTensor` or `float` or `int`): The number of timesteps to denoise an input.
+            encoder_hidden_states (`torch.FloatTensor`):
+                The encoder hidden states with shape `(batch, sequence_length, feature_dim)`.
+            encoder_attention_mask (`torch.Tensor`):
+                A cross-attention mask of shape `(batch, sequence_length)` is applied to `encoder_hidden_states`. If
+                `True` the mask is kept, otherwise if `False` it is discarded. Mask will be converted into a bias,
+                which adds large negative values to the attention scores corresponding to "discard" tokens.
+            return_dict (`bool`, *optional*, defaults to `True`):
+                Whether or not to return a [`~models.unet_2d_condition.UNet2DConditionOutput`] instead of a plain
+                tuple.
+            cross_attention_kwargs (`dict`, *optional*):
+                A kwargs dictionary that if specified is passed along to the [`AttnProcessor`].
+            added_cond_kwargs (`dict`, *optional*):
+                A kwargs dictionary containing additional embeddings that if specified are added to the embeddings that
+                are passed along to the UNet blocks.
+
+        Returns:
+            [`~models.unet_2d_condition.UNet2DConditionOutput`] or `tuple`:
+                If `return_dict` is True, an [`~models.unet_2d_condition.UNet2DConditionOutput`] is returned, otherwise
+                a `tuple` is returned where the first element is the sample tensor.
+        """
+        # By default samples have to be at least a multiple of the overall upsampling factor.
+        # The overall upsampling factor is equal to 2 ** (# num of upsampling layers).
+        # However, the upsampling interpolation output size can be forced to fit any upsampling size
+        # on the fly if necessary.
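+        # Editorial illustration (not part of the upstream file): with the usual four-block SD-style
+        # config, three of the up blocks contain an upsampler, so `default_overall_up_factor` below is
+        # 2**3 = 8. A 64x64 latent divides evenly; a 60x60 latent would not, which flips
+        # `forward_upsample_size` and makes the up blocks take their output sizes from the stored
+        # skip-connection shapes instead.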
+ default_overall_up_factor = 2**self.num_upsamplers + + # upsample size should be forwarded when sample is not a multiple of `default_overall_up_factor` + forward_upsample_size = False + upsample_size = None + + if any(s % default_overall_up_factor != 0 for s in sample.shape[-2:]): + logger.info("Forward upsample size to force interpolation output size.") + forward_upsample_size = True + + # ensure attention_mask is a bias, and give it a singleton query_tokens dimension + # expects mask of shape: + # [batch, key_tokens] + # adds singleton query_tokens dimension: + # [batch, 1, key_tokens] + # this helps to broadcast it as a bias over attention scores, which will be in one of the following shapes: + # [batch, heads, query_tokens, key_tokens] (e.g. torch sdp attn) + # [batch * heads, query_tokens, key_tokens] (e.g. xformers or classic attn) + if attention_mask is not None: + # assume that mask is expressed as: + # (1 = keep, 0 = discard) + # convert mask into a bias that can be added to attention scores: + # (keep = +0, discard = -10000.0) + attention_mask = (1 - attention_mask.to(sample.dtype)) * -10000.0 + attention_mask = attention_mask.unsqueeze(1) + + # convert encoder_attention_mask to a bias the same way we do for attention_mask + if encoder_attention_mask is not None: + encoder_attention_mask = (1 - encoder_attention_mask.to(sample.dtype)) * -10000.0 + encoder_attention_mask = encoder_attention_mask.unsqueeze(1) + + # 0. center input if necessary + if self.config.center_input_sample: + sample = 2 * sample - 1.0 + + # 1. time + timesteps = timestep + if not torch.is_tensor(timesteps): + # TODO: this requires sync between CPU and GPU. So try to pass timesteps as tensors if you can + # This would be a good case for the `match` statement (Python 3.10+) + is_mps = sample.device.type == "mps" + if isinstance(timestep, float): + dtype = torch.float32 if is_mps else torch.float64 + else: + dtype = torch.int32 if is_mps else torch.int64 + timesteps = torch.tensor([timesteps], dtype=dtype, device=sample.device) + elif len(timesteps.shape) == 0: + timesteps = timesteps[None].to(sample.device) + + # broadcast to batch dimension in a way that's compatible with ONNX/Core ML + timesteps = timesteps.expand(sample.shape[0]) + + t_emb = self.time_proj(timesteps) + + # `Timesteps` does not contain any weights and will always return f32 tensors + # but time_embedding might actually be running in fp16. so we need to cast here. + # there might be better ways to encapsulate this. + t_emb = t_emb.to(dtype=sample.dtype) + + emb = self.time_embedding(t_emb, timestep_cond) + aug_emb = None + + if self.class_embedding is not None: + if class_labels is None: + raise ValueError("class_labels should be provided when num_class_embeds > 0") + + if self.config.class_embed_type == "timestep": + class_labels = self.time_proj(class_labels) + + # `Timesteps` does not contain any weights and will always return f32 tensors + # there might be better ways to encapsulate this. 
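+                # Editorial note: the cast below keeps the projected class labels in the sample's
+                # dtype, so fp16/bf16 runs do not hit a dtype mismatch inside the class embedding.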
+ class_labels = class_labels.to(dtype=sample.dtype) + + class_emb = self.class_embedding(class_labels).to(dtype=sample.dtype) + + if self.config.class_embeddings_concat: + emb = torch.cat([emb, class_emb], dim=-1) + else: + emb = emb + class_emb + + if self.config.addition_embed_type == "text": + aug_emb = self.add_embedding(encoder_hidden_states) + elif self.config.addition_embed_type == "text_image": + # Kandinsky 2.1 - style + if "image_embeds" not in added_cond_kwargs: + raise ValueError( + f"{self.__class__} has the config param `addition_embed_type` set to 'text_image' which requires the keyword argument `image_embeds` to be passed in `added_cond_kwargs`" + ) + + image_embs = added_cond_kwargs.get("image_embeds") + text_embs = added_cond_kwargs.get("text_embeds", encoder_hidden_states) + aug_emb = self.add_embedding(text_embs, image_embs) + elif self.config.addition_embed_type == "text_time": + if "text_embeds" not in added_cond_kwargs: + raise ValueError( + f"{self.__class__} has the config param `addition_embed_type` set to 'text_time' which requires the keyword argument `text_embeds` to be passed in `added_cond_kwargs`" + ) + text_embeds = added_cond_kwargs.get("text_embeds") + if "time_ids" not in added_cond_kwargs: + raise ValueError( + f"{self.__class__} has the config param `addition_embed_type` set to 'text_time' which requires the keyword argument `time_ids` to be passed in `added_cond_kwargs`" + ) + time_ids = added_cond_kwargs.get("time_ids") + time_embeds = self.add_time_proj(time_ids.flatten()) + time_embeds = time_embeds.reshape((text_embeds.shape[0], -1)) + + add_embeds = torch.concat([text_embeds, time_embeds], dim=-1) + add_embeds = add_embeds.to(emb.dtype) + aug_emb = self.add_embedding(add_embeds) + elif self.config.addition_embed_type == "image": + # Kandinsky 2.2 - style + if "image_embeds" not in added_cond_kwargs: + raise ValueError( + f"{self.__class__} has the config param `addition_embed_type` set to 'image' which requires the keyword argument `image_embeds` to be passed in `added_cond_kwargs`" + ) + image_embs = added_cond_kwargs.get("image_embeds") + aug_emb = self.add_embedding(image_embs) + elif self.config.addition_embed_type == "image_hint": + # Kandinsky 2.2 - style + if "image_embeds" not in added_cond_kwargs or "hint" not in added_cond_kwargs: + raise ValueError( + f"{self.__class__} has the config param `addition_embed_type` set to 'image_hint' which requires the keyword arguments `image_embeds` and `hint` to be passed in `added_cond_kwargs`" + ) + image_embs = added_cond_kwargs.get("image_embeds") + hint = added_cond_kwargs.get("hint") + aug_emb, hint = self.add_embedding(image_embs, hint) + sample = torch.cat([sample, hint], dim=1) + + emb = emb + aug_emb if aug_emb is not None else emb + + if self.time_embed_act is not None: + emb = self.time_embed_act(emb) + + if self.encoder_hid_proj is not None and self.config.encoder_hid_dim_type == "text_proj": + encoder_hidden_states = self.encoder_hid_proj(encoder_hidden_states) + elif self.encoder_hid_proj is not None and self.config.encoder_hid_dim_type == "text_image_proj": + # Kadinsky 2.1 - style + if "image_embeds" not in added_cond_kwargs: + raise ValueError( + f"{self.__class__} has the config param `encoder_hid_dim_type` set to 'text_image_proj' which requires the keyword argument `image_embeds` to be passed in `added_conditions`" + ) + + image_embeds = added_cond_kwargs.get("image_embeds") + encoder_hidden_states = self.encoder_hid_proj(encoder_hidden_states, image_embeds) + elif 
self.encoder_hid_proj is not None and self.config.encoder_hid_dim_type == "image_proj": + # Kandinsky 2.2 - style + if "image_embeds" not in added_cond_kwargs: + raise ValueError( + f"{self.__class__} has the config param `encoder_hid_dim_type` set to 'image_proj' which requires the keyword argument `image_embeds` to be passed in `added_conditions`" + ) + image_embeds = added_cond_kwargs.get("image_embeds") + encoder_hidden_states = self.encoder_hid_proj(image_embeds) + # 2. pre-process + sample = self.conv_in(sample) + + # 3. down + down_block_res_samples = (sample,) + for downsample_block in self.down_blocks: + if hasattr(downsample_block, "has_cross_attention") and downsample_block.has_cross_attention: + sample, res_samples = downsample_block( + hidden_states=sample, + temb=emb, + encoder_hidden_states=encoder_hidden_states, + attention_mask=attention_mask, + cross_attention_kwargs=cross_attention_kwargs, + encoder_attention_mask=encoder_attention_mask, + ) + else: + sample, res_samples = downsample_block(hidden_states=sample, temb=emb) + + down_block_res_samples += res_samples + + if down_block_additional_residuals is not None: + new_down_block_res_samples = () + + for down_block_res_sample, down_block_additional_residual in zip( + down_block_res_samples, down_block_additional_residuals + ): + down_block_res_sample = down_block_res_sample + down_block_additional_residual + new_down_block_res_samples = new_down_block_res_samples + (down_block_res_sample,) + + down_block_res_samples = new_down_block_res_samples + + # 4. mid + if self.mid_block is not None: + sample = self.mid_block( + sample, + emb, + encoder_hidden_states=encoder_hidden_states, + attention_mask=attention_mask, + cross_attention_kwargs=cross_attention_kwargs, + encoder_attention_mask=encoder_attention_mask, + ) + + if mid_block_additional_residual is not None: + sample = sample + mid_block_additional_residual + + # 5. up + for i, upsample_block in enumerate(self.up_blocks): + is_final_block = i == len(self.up_blocks) - 1 + + res_samples = down_block_res_samples[-len(upsample_block.resnets) :] + down_block_res_samples = down_block_res_samples[: -len(upsample_block.resnets)] + + # if we have not reached the final block and need to forward the + # upsample size, we do it here + if not is_final_block and forward_upsample_size: + upsample_size = down_block_res_samples[-1].shape[2:] + + if hasattr(upsample_block, "has_cross_attention") and upsample_block.has_cross_attention: + sample = upsample_block( + hidden_states=sample, + temb=emb, + res_hidden_states_tuple=res_samples, + encoder_hidden_states=encoder_hidden_states, + cross_attention_kwargs=cross_attention_kwargs, + upsample_size=upsample_size, + attention_mask=attention_mask, + encoder_attention_mask=encoder_attention_mask, + ) + else: + sample = upsample_block( + hidden_states=sample, temb=emb, res_hidden_states_tuple=res_samples, upsample_size=upsample_size + ) + + # 6. 
post-process + if self.conv_norm_out: + sample = self.conv_norm_out(sample) + sample = self.conv_act(sample) + sample = self.conv_out(sample) + + if not return_dict: + return (sample,) + + return UNet2DConditionOutput(sample=sample) diff --git a/gf/sd-webui-rich-text/diffusers_official/models/unet_2d_condition_flax.py b/gf/sd-webui-rich-text/diffusers_official/models/unet_2d_condition_flax.py new file mode 100644 index 0000000000000000000000000000000000000000..de39bc75d2e392a423c9ea09e979b9f42d818dc1 --- /dev/null +++ b/gf/sd-webui-rich-text/diffusers_official/models/unet_2d_condition_flax.py @@ -0,0 +1,357 @@ +# Copyright 2023 The HuggingFace Team. All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +from typing import Optional, Tuple, Union + +import flax +import flax.linen as nn +import jax +import jax.numpy as jnp +from flax.core.frozen_dict import FrozenDict + +from ..configuration_utils import ConfigMixin, flax_register_to_config +from ..utils import BaseOutput +from .embeddings_flax import FlaxTimestepEmbedding, FlaxTimesteps +from .modeling_flax_utils import FlaxModelMixin +from .unet_2d_blocks_flax import ( + FlaxCrossAttnDownBlock2D, + FlaxCrossAttnUpBlock2D, + FlaxDownBlock2D, + FlaxUNetMidBlock2DCrossAttn, + FlaxUpBlock2D, +) + + +@flax.struct.dataclass +class FlaxUNet2DConditionOutput(BaseOutput): + """ + The output of [`FlaxUNet2DConditionModel`]. + + Args: + sample (`jnp.ndarray` of shape `(batch_size, num_channels, height, width)`): + The hidden states output conditioned on `encoder_hidden_states` input. Output of last layer of model. + """ + + sample: jnp.ndarray + + +@flax_register_to_config +class FlaxUNet2DConditionModel(nn.Module, FlaxModelMixin, ConfigMixin): + r""" + A conditional 2D UNet model that takes a noisy sample, conditional state, and a timestep and returns a sample + shaped output. + + This model inherits from [`FlaxModelMixin`]. Check the superclass documentation for it's generic methods + implemented for all models (such as downloading or saving). + + This model is also a Flax Linen [flax.linen.Module](https://flax.readthedocs.io/en/latest/flax.linen.html#module) + subclass. Use it as a regular Flax Linen module and refer to the Flax documentation for all matters related to its + general usage and behavior. + + Inherent JAX features such as the following are supported: + - [Just-In-Time (JIT) compilation](https://jax.readthedocs.io/en/latest/jax.html#just-in-time-compilation-jit) + - [Automatic Differentiation](https://jax.readthedocs.io/en/latest/jax.html#automatic-differentiation) + - [Vectorization](https://jax.readthedocs.io/en/latest/jax.html#vectorization-vmap) + - [Parallelization](https://jax.readthedocs.io/en/latest/jax.html#parallelization-pmap) + + Parameters: + sample_size (`int`, *optional*): + The size of the input sample. + in_channels (`int`, *optional*, defaults to 4): + The number of channels in the input sample. + out_channels (`int`, *optional*, defaults to 4): + The number of channels in the output. 
+ down_block_types (`Tuple[str]`, *optional*, defaults to `("FlaxCrossAttnDownBlock2D", "FlaxCrossAttnDownBlock2D", "FlaxCrossAttnDownBlock2D", "FlaxDownBlock2D")`): + The tuple of downsample blocks to use. + up_block_types (`Tuple[str]`, *optional*, defaults to `("FlaxUpBlock2D", "FlaxCrossAttnUpBlock2D", "FlaxCrossAttnUpBlock2D", "FlaxCrossAttnUpBlock2D")`): + The tuple of upsample blocks to use. + block_out_channels (`Tuple[int]`, *optional*, defaults to `(320, 640, 1280, 1280)`): + The tuple of output channels for each block. + layers_per_block (`int`, *optional*, defaults to 2): + The number of layers per block. + attention_head_dim (`int` or `Tuple[int]`, *optional*, defaults to 8): + The dimension of the attention heads. + num_attention_heads (`int` or `Tuple[int]`, *optional*): + The number of attention heads. + cross_attention_dim (`int`, *optional*, defaults to 768): + The dimension of the cross attention features. + dropout (`float`, *optional*, defaults to 0): + Dropout probability for down, up and bottleneck blocks. + flip_sin_to_cos (`bool`, *optional*, defaults to `True`): + Whether to flip the sin to cos in the time embedding. + freq_shift (`int`, *optional*, defaults to 0): The frequency shift to apply to the time embedding. + use_memory_efficient_attention (`bool`, *optional*, defaults to `False`): + Enable memory efficient attention as described [here](https://arxiv.org/abs/2112.05682). + """ + + sample_size: int = 32 + in_channels: int = 4 + out_channels: int = 4 + down_block_types: Tuple[str] = ( + "CrossAttnDownBlock2D", + "CrossAttnDownBlock2D", + "CrossAttnDownBlock2D", + "DownBlock2D", + ) + up_block_types: Tuple[str] = ("UpBlock2D", "CrossAttnUpBlock2D", "CrossAttnUpBlock2D", "CrossAttnUpBlock2D") + only_cross_attention: Union[bool, Tuple[bool]] = False + block_out_channels: Tuple[int] = (320, 640, 1280, 1280) + layers_per_block: int = 2 + attention_head_dim: Union[int, Tuple[int]] = 8 + num_attention_heads: Optional[Union[int, Tuple[int]]] = None + cross_attention_dim: int = 1280 + dropout: float = 0.0 + use_linear_projection: bool = False + dtype: jnp.dtype = jnp.float32 + flip_sin_to_cos: bool = True + freq_shift: int = 0 + use_memory_efficient_attention: bool = False + + def init_weights(self, rng: jax.random.KeyArray) -> FrozenDict: + # init input tensors + sample_shape = (1, self.in_channels, self.sample_size, self.sample_size) + sample = jnp.zeros(sample_shape, dtype=jnp.float32) + timesteps = jnp.ones((1,), dtype=jnp.int32) + encoder_hidden_states = jnp.zeros((1, 1, self.cross_attention_dim), dtype=jnp.float32) + + params_rng, dropout_rng = jax.random.split(rng) + rngs = {"params": params_rng, "dropout": dropout_rng} + + return self.init(rngs, sample, timesteps, encoder_hidden_states)["params"] + + def setup(self): + block_out_channels = self.block_out_channels + time_embed_dim = block_out_channels[0] * 4 + + if self.num_attention_heads is not None: + raise ValueError( + "At the moment it is not possible to define the number of attention heads via `num_attention_heads` because of a naming issue as described in https://github.com/huggingface/diffusers/issues/2011#issuecomment-1547958131. Passing `num_attention_heads` will only be supported in diffusers v0.19." + ) + + # If `num_attention_heads` is not defined (which is the case for most models) + # it will default to `attention_head_dim`. This looks weird upon first reading it and it is. 
+ # The reason for this behavior is to correct for incorrectly named variables that were introduced + # when this library was created. The incorrect naming was only discovered much later in https://github.com/huggingface/diffusers/issues/2011#issuecomment-1547958131 + # Changing `attention_head_dim` to `num_attention_heads` for 40,000+ configurations is too backwards breaking + # which is why we correct for the naming here. + num_attention_heads = self.num_attention_heads or self.attention_head_dim + + # input + self.conv_in = nn.Conv( + block_out_channels[0], + kernel_size=(3, 3), + strides=(1, 1), + padding=((1, 1), (1, 1)), + dtype=self.dtype, + ) + + # time + self.time_proj = FlaxTimesteps( + block_out_channels[0], flip_sin_to_cos=self.flip_sin_to_cos, freq_shift=self.config.freq_shift + ) + self.time_embedding = FlaxTimestepEmbedding(time_embed_dim, dtype=self.dtype) + + only_cross_attention = self.only_cross_attention + if isinstance(only_cross_attention, bool): + only_cross_attention = (only_cross_attention,) * len(self.down_block_types) + + if isinstance(num_attention_heads, int): + num_attention_heads = (num_attention_heads,) * len(self.down_block_types) + + # down + down_blocks = [] + output_channel = block_out_channels[0] + for i, down_block_type in enumerate(self.down_block_types): + input_channel = output_channel + output_channel = block_out_channels[i] + is_final_block = i == len(block_out_channels) - 1 + + if down_block_type == "CrossAttnDownBlock2D": + down_block = FlaxCrossAttnDownBlock2D( + in_channels=input_channel, + out_channels=output_channel, + dropout=self.dropout, + num_layers=self.layers_per_block, + num_attention_heads=num_attention_heads[i], + add_downsample=not is_final_block, + use_linear_projection=self.use_linear_projection, + only_cross_attention=only_cross_attention[i], + use_memory_efficient_attention=self.use_memory_efficient_attention, + dtype=self.dtype, + ) + else: + down_block = FlaxDownBlock2D( + in_channels=input_channel, + out_channels=output_channel, + dropout=self.dropout, + num_layers=self.layers_per_block, + add_downsample=not is_final_block, + dtype=self.dtype, + ) + + down_blocks.append(down_block) + self.down_blocks = down_blocks + + # mid + self.mid_block = FlaxUNetMidBlock2DCrossAttn( + in_channels=block_out_channels[-1], + dropout=self.dropout, + num_attention_heads=num_attention_heads[-1], + use_linear_projection=self.use_linear_projection, + use_memory_efficient_attention=self.use_memory_efficient_attention, + dtype=self.dtype, + ) + + # up + up_blocks = [] + reversed_block_out_channels = list(reversed(block_out_channels)) + reversed_num_attention_heads = list(reversed(num_attention_heads)) + only_cross_attention = list(reversed(only_cross_attention)) + output_channel = reversed_block_out_channels[0] + for i, up_block_type in enumerate(self.up_block_types): + prev_output_channel = output_channel + output_channel = reversed_block_out_channels[i] + input_channel = reversed_block_out_channels[min(i + 1, len(block_out_channels) - 1)] + + is_final_block = i == len(block_out_channels) - 1 + + if up_block_type == "CrossAttnUpBlock2D": + up_block = FlaxCrossAttnUpBlock2D( + in_channels=input_channel, + out_channels=output_channel, + prev_output_channel=prev_output_channel, + num_layers=self.layers_per_block + 1, + num_attention_heads=reversed_num_attention_heads[i], + add_upsample=not is_final_block, + dropout=self.dropout, + use_linear_projection=self.use_linear_projection, + only_cross_attention=only_cross_attention[i], + 
use_memory_efficient_attention=self.use_memory_efficient_attention, + dtype=self.dtype, + ) + else: + up_block = FlaxUpBlock2D( + in_channels=input_channel, + out_channels=output_channel, + prev_output_channel=prev_output_channel, + num_layers=self.layers_per_block + 1, + add_upsample=not is_final_block, + dropout=self.dropout, + dtype=self.dtype, + ) + + up_blocks.append(up_block) + prev_output_channel = output_channel + self.up_blocks = up_blocks + + # out + self.conv_norm_out = nn.GroupNorm(num_groups=32, epsilon=1e-5) + self.conv_out = nn.Conv( + self.out_channels, + kernel_size=(3, 3), + strides=(1, 1), + padding=((1, 1), (1, 1)), + dtype=self.dtype, + ) + + def __call__( + self, + sample, + timesteps, + encoder_hidden_states, + down_block_additional_residuals=None, + mid_block_additional_residual=None, + return_dict: bool = True, + train: bool = False, + ) -> Union[FlaxUNet2DConditionOutput, Tuple]: + r""" + Args: + sample (`jnp.ndarray`): (batch, channel, height, width) noisy inputs tensor + timestep (`jnp.ndarray` or `float` or `int`): timesteps + encoder_hidden_states (`jnp.ndarray`): (batch_size, sequence_length, hidden_size) encoder hidden states + return_dict (`bool`, *optional*, defaults to `True`): + Whether or not to return a [`models.unet_2d_condition_flax.FlaxUNet2DConditionOutput`] instead of a + plain tuple. + train (`bool`, *optional*, defaults to `False`): + Use deterministic functions and disable dropout when not training. + + Returns: + [`~models.unet_2d_condition_flax.FlaxUNet2DConditionOutput`] or `tuple`: + [`~models.unet_2d_condition_flax.FlaxUNet2DConditionOutput`] if `return_dict` is True, otherwise a `tuple`. + When returning a tuple, the first element is the sample tensor. + """ + # 1. time + if not isinstance(timesteps, jnp.ndarray): + timesteps = jnp.array([timesteps], dtype=jnp.int32) + elif isinstance(timesteps, jnp.ndarray) and len(timesteps.shape) == 0: + timesteps = timesteps.astype(dtype=jnp.float32) + timesteps = jnp.expand_dims(timesteps, 0) + + t_emb = self.time_proj(timesteps) + t_emb = self.time_embedding(t_emb) + + # 2. pre-process + sample = jnp.transpose(sample, (0, 2, 3, 1)) + sample = self.conv_in(sample) + + # 3. down + down_block_res_samples = (sample,) + for down_block in self.down_blocks: + if isinstance(down_block, FlaxCrossAttnDownBlock2D): + sample, res_samples = down_block(sample, t_emb, encoder_hidden_states, deterministic=not train) + else: + sample, res_samples = down_block(sample, t_emb, deterministic=not train) + down_block_res_samples += res_samples + + if down_block_additional_residuals is not None: + new_down_block_res_samples = () + + for down_block_res_sample, down_block_additional_residual in zip( + down_block_res_samples, down_block_additional_residuals + ): + down_block_res_sample += down_block_additional_residual + new_down_block_res_samples += (down_block_res_sample,) + + down_block_res_samples = new_down_block_res_samples + + # 4. mid + sample = self.mid_block(sample, t_emb, encoder_hidden_states, deterministic=not train) + + if mid_block_additional_residual is not None: + sample += mid_block_additional_residual + + # 5. 
up + for up_block in self.up_blocks: + res_samples = down_block_res_samples[-(self.layers_per_block + 1) :] + down_block_res_samples = down_block_res_samples[: -(self.layers_per_block + 1)] + if isinstance(up_block, FlaxCrossAttnUpBlock2D): + sample = up_block( + sample, + temb=t_emb, + encoder_hidden_states=encoder_hidden_states, + res_hidden_states_tuple=res_samples, + deterministic=not train, + ) + else: + sample = up_block(sample, temb=t_emb, res_hidden_states_tuple=res_samples, deterministic=not train) + + # 6. post-process + sample = self.conv_norm_out(sample) + sample = nn.silu(sample) + sample = self.conv_out(sample) + sample = jnp.transpose(sample, (0, 3, 1, 2)) + + if not return_dict: + return (sample,) + + return FlaxUNet2DConditionOutput(sample=sample) diff --git a/gf/sd-webui-rich-text/diffusers_official/models/unet_3d_blocks.py b/gf/sd-webui-rich-text/diffusers_official/models/unet_3d_blocks.py new file mode 100644 index 0000000000000000000000000000000000000000..ab5c393518e2ad8edf21069dfcd417392001569d --- /dev/null +++ b/gf/sd-webui-rich-text/diffusers_official/models/unet_3d_blocks.py @@ -0,0 +1,679 @@ +# Copyright 2023 The HuggingFace Team. All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +import torch +from torch import nn + +from .resnet import Downsample2D, ResnetBlock2D, TemporalConvLayer, Upsample2D +from .transformer_2d import Transformer2DModel +from .transformer_temporal import TransformerTemporalModel + + +def get_down_block( + down_block_type, + num_layers, + in_channels, + out_channels, + temb_channels, + add_downsample, + resnet_eps, + resnet_act_fn, + num_attention_heads, + resnet_groups=None, + cross_attention_dim=None, + downsample_padding=None, + dual_cross_attention=False, + use_linear_projection=True, + only_cross_attention=False, + upcast_attention=False, + resnet_time_scale_shift="default", +): + if down_block_type == "DownBlock3D": + return DownBlock3D( + num_layers=num_layers, + in_channels=in_channels, + out_channels=out_channels, + temb_channels=temb_channels, + add_downsample=add_downsample, + resnet_eps=resnet_eps, + resnet_act_fn=resnet_act_fn, + resnet_groups=resnet_groups, + downsample_padding=downsample_padding, + resnet_time_scale_shift=resnet_time_scale_shift, + ) + elif down_block_type == "CrossAttnDownBlock3D": + if cross_attention_dim is None: + raise ValueError("cross_attention_dim must be specified for CrossAttnDownBlock3D") + return CrossAttnDownBlock3D( + num_layers=num_layers, + in_channels=in_channels, + out_channels=out_channels, + temb_channels=temb_channels, + add_downsample=add_downsample, + resnet_eps=resnet_eps, + resnet_act_fn=resnet_act_fn, + resnet_groups=resnet_groups, + downsample_padding=downsample_padding, + cross_attention_dim=cross_attention_dim, + num_attention_heads=num_attention_heads, + dual_cross_attention=dual_cross_attention, + use_linear_projection=use_linear_projection, + only_cross_attention=only_cross_attention, + upcast_attention=upcast_attention, + 
resnet_time_scale_shift=resnet_time_scale_shift, + ) + raise ValueError(f"{down_block_type} does not exist.") + + +def get_up_block( + up_block_type, + num_layers, + in_channels, + out_channels, + prev_output_channel, + temb_channels, + add_upsample, + resnet_eps, + resnet_act_fn, + num_attention_heads, + resnet_groups=None, + cross_attention_dim=None, + dual_cross_attention=False, + use_linear_projection=True, + only_cross_attention=False, + upcast_attention=False, + resnet_time_scale_shift="default", +): + if up_block_type == "UpBlock3D": + return UpBlock3D( + num_layers=num_layers, + in_channels=in_channels, + out_channels=out_channels, + prev_output_channel=prev_output_channel, + temb_channels=temb_channels, + add_upsample=add_upsample, + resnet_eps=resnet_eps, + resnet_act_fn=resnet_act_fn, + resnet_groups=resnet_groups, + resnet_time_scale_shift=resnet_time_scale_shift, + ) + elif up_block_type == "CrossAttnUpBlock3D": + if cross_attention_dim is None: + raise ValueError("cross_attention_dim must be specified for CrossAttnUpBlock3D") + return CrossAttnUpBlock3D( + num_layers=num_layers, + in_channels=in_channels, + out_channels=out_channels, + prev_output_channel=prev_output_channel, + temb_channels=temb_channels, + add_upsample=add_upsample, + resnet_eps=resnet_eps, + resnet_act_fn=resnet_act_fn, + resnet_groups=resnet_groups, + cross_attention_dim=cross_attention_dim, + num_attention_heads=num_attention_heads, + dual_cross_attention=dual_cross_attention, + use_linear_projection=use_linear_projection, + only_cross_attention=only_cross_attention, + upcast_attention=upcast_attention, + resnet_time_scale_shift=resnet_time_scale_shift, + ) + raise ValueError(f"{up_block_type} does not exist.") + + +class UNetMidBlock3DCrossAttn(nn.Module): + def __init__( + self, + in_channels: int, + temb_channels: int, + dropout: float = 0.0, + num_layers: int = 1, + resnet_eps: float = 1e-6, + resnet_time_scale_shift: str = "default", + resnet_act_fn: str = "swish", + resnet_groups: int = 32, + resnet_pre_norm: bool = True, + num_attention_heads=1, + output_scale_factor=1.0, + cross_attention_dim=1280, + dual_cross_attention=False, + use_linear_projection=True, + upcast_attention=False, + ): + super().__init__() + + self.has_cross_attention = True + self.num_attention_heads = num_attention_heads + resnet_groups = resnet_groups if resnet_groups is not None else min(in_channels // 4, 32) + + # there is always at least one resnet + resnets = [ + ResnetBlock2D( + in_channels=in_channels, + out_channels=in_channels, + temb_channels=temb_channels, + eps=resnet_eps, + groups=resnet_groups, + dropout=dropout, + time_embedding_norm=resnet_time_scale_shift, + non_linearity=resnet_act_fn, + output_scale_factor=output_scale_factor, + pre_norm=resnet_pre_norm, + ) + ] + temp_convs = [ + TemporalConvLayer( + in_channels, + in_channels, + dropout=0.1, + ) + ] + attentions = [] + temp_attentions = [] + + for _ in range(num_layers): + attentions.append( + Transformer2DModel( + in_channels // num_attention_heads, + num_attention_heads, + in_channels=in_channels, + num_layers=1, + cross_attention_dim=cross_attention_dim, + norm_num_groups=resnet_groups, + use_linear_projection=use_linear_projection, + upcast_attention=upcast_attention, + ) + ) + temp_attentions.append( + TransformerTemporalModel( + in_channels // num_attention_heads, + num_attention_heads, + in_channels=in_channels, + num_layers=1, + cross_attention_dim=cross_attention_dim, + norm_num_groups=resnet_groups, + ) + ) + resnets.append( + ResnetBlock2D( + 
in_channels=in_channels, + out_channels=in_channels, + temb_channels=temb_channels, + eps=resnet_eps, + groups=resnet_groups, + dropout=dropout, + time_embedding_norm=resnet_time_scale_shift, + non_linearity=resnet_act_fn, + output_scale_factor=output_scale_factor, + pre_norm=resnet_pre_norm, + ) + ) + temp_convs.append( + TemporalConvLayer( + in_channels, + in_channels, + dropout=0.1, + ) + ) + + self.resnets = nn.ModuleList(resnets) + self.temp_convs = nn.ModuleList(temp_convs) + self.attentions = nn.ModuleList(attentions) + self.temp_attentions = nn.ModuleList(temp_attentions) + + def forward( + self, + hidden_states, + temb=None, + encoder_hidden_states=None, + attention_mask=None, + num_frames=1, + cross_attention_kwargs=None, + ): + hidden_states = self.resnets[0](hidden_states, temb) + hidden_states = self.temp_convs[0](hidden_states, num_frames=num_frames) + for attn, temp_attn, resnet, temp_conv in zip( + self.attentions, self.temp_attentions, self.resnets[1:], self.temp_convs[1:] + ): + hidden_states = attn( + hidden_states, + encoder_hidden_states=encoder_hidden_states, + cross_attention_kwargs=cross_attention_kwargs, + return_dict=False, + )[0] + hidden_states = temp_attn( + hidden_states, num_frames=num_frames, cross_attention_kwargs=cross_attention_kwargs, return_dict=False + )[0] + hidden_states = resnet(hidden_states, temb) + hidden_states = temp_conv(hidden_states, num_frames=num_frames) + + return hidden_states + + +class CrossAttnDownBlock3D(nn.Module): + def __init__( + self, + in_channels: int, + out_channels: int, + temb_channels: int, + dropout: float = 0.0, + num_layers: int = 1, + resnet_eps: float = 1e-6, + resnet_time_scale_shift: str = "default", + resnet_act_fn: str = "swish", + resnet_groups: int = 32, + resnet_pre_norm: bool = True, + num_attention_heads=1, + cross_attention_dim=1280, + output_scale_factor=1.0, + downsample_padding=1, + add_downsample=True, + dual_cross_attention=False, + use_linear_projection=False, + only_cross_attention=False, + upcast_attention=False, + ): + super().__init__() + resnets = [] + attentions = [] + temp_attentions = [] + temp_convs = [] + + self.has_cross_attention = True + self.num_attention_heads = num_attention_heads + + for i in range(num_layers): + in_channels = in_channels if i == 0 else out_channels + resnets.append( + ResnetBlock2D( + in_channels=in_channels, + out_channels=out_channels, + temb_channels=temb_channels, + eps=resnet_eps, + groups=resnet_groups, + dropout=dropout, + time_embedding_norm=resnet_time_scale_shift, + non_linearity=resnet_act_fn, + output_scale_factor=output_scale_factor, + pre_norm=resnet_pre_norm, + ) + ) + temp_convs.append( + TemporalConvLayer( + out_channels, + out_channels, + dropout=0.1, + ) + ) + attentions.append( + Transformer2DModel( + out_channels // num_attention_heads, + num_attention_heads, + in_channels=out_channels, + num_layers=1, + cross_attention_dim=cross_attention_dim, + norm_num_groups=resnet_groups, + use_linear_projection=use_linear_projection, + only_cross_attention=only_cross_attention, + upcast_attention=upcast_attention, + ) + ) + temp_attentions.append( + TransformerTemporalModel( + out_channels // num_attention_heads, + num_attention_heads, + in_channels=out_channels, + num_layers=1, + cross_attention_dim=cross_attention_dim, + norm_num_groups=resnet_groups, + ) + ) + self.resnets = nn.ModuleList(resnets) + self.temp_convs = nn.ModuleList(temp_convs) + self.attentions = nn.ModuleList(attentions) + self.temp_attentions = nn.ModuleList(temp_attentions) + + if 
add_downsample: + self.downsamplers = nn.ModuleList( + [ + Downsample2D( + out_channels, use_conv=True, out_channels=out_channels, padding=downsample_padding, name="op" + ) + ] + ) + else: + self.downsamplers = None + + self.gradient_checkpointing = False + + def forward( + self, + hidden_states, + temb=None, + encoder_hidden_states=None, + attention_mask=None, + num_frames=1, + cross_attention_kwargs=None, + ): + # TODO(Patrick, William) - attention mask is not used + output_states = () + + for resnet, temp_conv, attn, temp_attn in zip( + self.resnets, self.temp_convs, self.attentions, self.temp_attentions + ): + hidden_states = resnet(hidden_states, temb) + hidden_states = temp_conv(hidden_states, num_frames=num_frames) + hidden_states = attn( + hidden_states, + encoder_hidden_states=encoder_hidden_states, + cross_attention_kwargs=cross_attention_kwargs, + return_dict=False, + )[0] + hidden_states = temp_attn( + hidden_states, num_frames=num_frames, cross_attention_kwargs=cross_attention_kwargs, return_dict=False + )[0] + + output_states += (hidden_states,) + + if self.downsamplers is not None: + for downsampler in self.downsamplers: + hidden_states = downsampler(hidden_states) + + output_states += (hidden_states,) + + return hidden_states, output_states + + +class DownBlock3D(nn.Module): + def __init__( + self, + in_channels: int, + out_channels: int, + temb_channels: int, + dropout: float = 0.0, + num_layers: int = 1, + resnet_eps: float = 1e-6, + resnet_time_scale_shift: str = "default", + resnet_act_fn: str = "swish", + resnet_groups: int = 32, + resnet_pre_norm: bool = True, + output_scale_factor=1.0, + add_downsample=True, + downsample_padding=1, + ): + super().__init__() + resnets = [] + temp_convs = [] + + for i in range(num_layers): + in_channels = in_channels if i == 0 else out_channels + resnets.append( + ResnetBlock2D( + in_channels=in_channels, + out_channels=out_channels, + temb_channels=temb_channels, + eps=resnet_eps, + groups=resnet_groups, + dropout=dropout, + time_embedding_norm=resnet_time_scale_shift, + non_linearity=resnet_act_fn, + output_scale_factor=output_scale_factor, + pre_norm=resnet_pre_norm, + ) + ) + temp_convs.append( + TemporalConvLayer( + out_channels, + out_channels, + dropout=0.1, + ) + ) + + self.resnets = nn.ModuleList(resnets) + self.temp_convs = nn.ModuleList(temp_convs) + + if add_downsample: + self.downsamplers = nn.ModuleList( + [ + Downsample2D( + out_channels, use_conv=True, out_channels=out_channels, padding=downsample_padding, name="op" + ) + ] + ) + else: + self.downsamplers = None + + self.gradient_checkpointing = False + + def forward(self, hidden_states, temb=None, num_frames=1): + output_states = () + + for resnet, temp_conv in zip(self.resnets, self.temp_convs): + hidden_states = resnet(hidden_states, temb) + hidden_states = temp_conv(hidden_states, num_frames=num_frames) + + output_states += (hidden_states,) + + if self.downsamplers is not None: + for downsampler in self.downsamplers: + hidden_states = downsampler(hidden_states) + + output_states += (hidden_states,) + + return hidden_states, output_states + + +class CrossAttnUpBlock3D(nn.Module): + def __init__( + self, + in_channels: int, + out_channels: int, + prev_output_channel: int, + temb_channels: int, + dropout: float = 0.0, + num_layers: int = 1, + resnet_eps: float = 1e-6, + resnet_time_scale_shift: str = "default", + resnet_act_fn: str = "swish", + resnet_groups: int = 32, + resnet_pre_norm: bool = True, + num_attention_heads=1, + cross_attention_dim=1280, + 
output_scale_factor=1.0, + add_upsample=True, + dual_cross_attention=False, + use_linear_projection=False, + only_cross_attention=False, + upcast_attention=False, + ): + super().__init__() + resnets = [] + temp_convs = [] + attentions = [] + temp_attentions = [] + + self.has_cross_attention = True + self.num_attention_heads = num_attention_heads + + for i in range(num_layers): + res_skip_channels = in_channels if (i == num_layers - 1) else out_channels + resnet_in_channels = prev_output_channel if i == 0 else out_channels + + resnets.append( + ResnetBlock2D( + in_channels=resnet_in_channels + res_skip_channels, + out_channels=out_channels, + temb_channels=temb_channels, + eps=resnet_eps, + groups=resnet_groups, + dropout=dropout, + time_embedding_norm=resnet_time_scale_shift, + non_linearity=resnet_act_fn, + output_scale_factor=output_scale_factor, + pre_norm=resnet_pre_norm, + ) + ) + temp_convs.append( + TemporalConvLayer( + out_channels, + out_channels, + dropout=0.1, + ) + ) + attentions.append( + Transformer2DModel( + out_channels // num_attention_heads, + num_attention_heads, + in_channels=out_channels, + num_layers=1, + cross_attention_dim=cross_attention_dim, + norm_num_groups=resnet_groups, + use_linear_projection=use_linear_projection, + only_cross_attention=only_cross_attention, + upcast_attention=upcast_attention, + ) + ) + temp_attentions.append( + TransformerTemporalModel( + out_channels // num_attention_heads, + num_attention_heads, + in_channels=out_channels, + num_layers=1, + cross_attention_dim=cross_attention_dim, + norm_num_groups=resnet_groups, + ) + ) + self.resnets = nn.ModuleList(resnets) + self.temp_convs = nn.ModuleList(temp_convs) + self.attentions = nn.ModuleList(attentions) + self.temp_attentions = nn.ModuleList(temp_attentions) + + if add_upsample: + self.upsamplers = nn.ModuleList([Upsample2D(out_channels, use_conv=True, out_channels=out_channels)]) + else: + self.upsamplers = None + + self.gradient_checkpointing = False + + def forward( + self, + hidden_states, + res_hidden_states_tuple, + temb=None, + encoder_hidden_states=None, + upsample_size=None, + attention_mask=None, + num_frames=1, + cross_attention_kwargs=None, + ): + # TODO(Patrick, William) - attention mask is not used + for resnet, temp_conv, attn, temp_attn in zip( + self.resnets, self.temp_convs, self.attentions, self.temp_attentions + ): + # pop res hidden states + res_hidden_states = res_hidden_states_tuple[-1] + res_hidden_states_tuple = res_hidden_states_tuple[:-1] + hidden_states = torch.cat([hidden_states, res_hidden_states], dim=1) + + hidden_states = resnet(hidden_states, temb) + hidden_states = temp_conv(hidden_states, num_frames=num_frames) + hidden_states = attn( + hidden_states, + encoder_hidden_states=encoder_hidden_states, + cross_attention_kwargs=cross_attention_kwargs, + return_dict=False, + )[0] + hidden_states = temp_attn( + hidden_states, num_frames=num_frames, cross_attention_kwargs=cross_attention_kwargs, return_dict=False + )[0] + + if self.upsamplers is not None: + for upsampler in self.upsamplers: + hidden_states = upsampler(hidden_states, upsample_size) + + return hidden_states + + +class UpBlock3D(nn.Module): + def __init__( + self, + in_channels: int, + prev_output_channel: int, + out_channels: int, + temb_channels: int, + dropout: float = 0.0, + num_layers: int = 1, + resnet_eps: float = 1e-6, + resnet_time_scale_shift: str = "default", + resnet_act_fn: str = "swish", + resnet_groups: int = 32, + resnet_pre_norm: bool = True, + output_scale_factor=1.0, + 
add_upsample=True, + ): + super().__init__() + resnets = [] + temp_convs = [] + + for i in range(num_layers): + res_skip_channels = in_channels if (i == num_layers - 1) else out_channels + resnet_in_channels = prev_output_channel if i == 0 else out_channels + + resnets.append( + ResnetBlock2D( + in_channels=resnet_in_channels + res_skip_channels, + out_channels=out_channels, + temb_channels=temb_channels, + eps=resnet_eps, + groups=resnet_groups, + dropout=dropout, + time_embedding_norm=resnet_time_scale_shift, + non_linearity=resnet_act_fn, + output_scale_factor=output_scale_factor, + pre_norm=resnet_pre_norm, + ) + ) + temp_convs.append( + TemporalConvLayer( + out_channels, + out_channels, + dropout=0.1, + ) + ) + + self.resnets = nn.ModuleList(resnets) + self.temp_convs = nn.ModuleList(temp_convs) + + if add_upsample: + self.upsamplers = nn.ModuleList([Upsample2D(out_channels, use_conv=True, out_channels=out_channels)]) + else: + self.upsamplers = None + + self.gradient_checkpointing = False + + def forward(self, hidden_states, res_hidden_states_tuple, temb=None, upsample_size=None, num_frames=1): + for resnet, temp_conv in zip(self.resnets, self.temp_convs): + # pop res hidden states + res_hidden_states = res_hidden_states_tuple[-1] + res_hidden_states_tuple = res_hidden_states_tuple[:-1] + hidden_states = torch.cat([hidden_states, res_hidden_states], dim=1) + + hidden_states = resnet(hidden_states, temb) + hidden_states = temp_conv(hidden_states, num_frames=num_frames) + + if self.upsamplers is not None: + for upsampler in self.upsamplers: + hidden_states = upsampler(hidden_states, upsample_size) + + return hidden_states diff --git a/gf/sd-webui-rich-text/diffusers_official/models/unet_3d_condition.py b/gf/sd-webui-rich-text/diffusers_official/models/unet_3d_condition.py new file mode 100644 index 0000000000000000000000000000000000000000..ff2a8f1179ef9654b5234d63528468e59e371b10 --- /dev/null +++ b/gf/sd-webui-rich-text/diffusers_official/models/unet_3d_condition.py @@ -0,0 +1,627 @@ +# Copyright 2023 Alibaba DAMO-VILAB and The HuggingFace Team. All rights reserved. +# Copyright 2023 The ModelScope Team. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
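(Editorial aside, not part of the vendored file that follows.) The next file adds the `UNet3DConditionModel` used for text-to-video generation. As a rough orientation, the sketch below shows how such a model is typically constructed and called. It assumes a recent `diffusers` release that exports `UNet3DConditionModel` at the top level, a `(batch, channels, num_frames, height, width)` input layout as in upstream diffusers, and that the deliberately tiny configuration values are arbitrary, chosen only so the snippet runs quickly on CPU.

import torch
from diffusers import UNet3DConditionModel  # assumes a recent diffusers release

# Tiny, illustration-only configuration; real checkpoints use the defaults from the file below
# (block_out_channels=(320, 640, 1280, 1280), cross_attention_dim=1024, attention_head_dim=64, ...).
unet = UNet3DConditionModel(
    sample_size=32,
    in_channels=4,
    out_channels=4,
    down_block_types=("CrossAttnDownBlock3D", "DownBlock3D"),
    up_block_types=("UpBlock3D", "CrossAttnUpBlock3D"),
    block_out_channels=(32, 64),
    layers_per_block=1,
    norm_num_groups=8,
    cross_attention_dim=32,
    attention_head_dim=8,
)

batch, frames = 1, 4
sample = torch.randn(batch, 4, frames, 32, 32)      # noisy latent "video": (B, C, F, H, W), assumed layout
timestep = torch.tensor([10])                       # diffusion timestep
encoder_hidden_states = torch.randn(batch, 77, 32)  # text-encoder features, last dim = cross_attention_dim
out = unet(sample, timestep, encoder_hidden_states).sample
print(out.shape)  # expected: torch.Size([1, 4, 4, 32, 32])

Each down/up block additionally threads the frame dimension through the `TemporalConvLayer` and `TransformerTemporalModel` layers defined in unet_3d_blocks.py above.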
+from dataclasses import dataclass
+from typing import Any, Dict, List, Optional, Tuple, Union
+
+import torch
+import torch.nn as nn
+import torch.utils.checkpoint
+
+from ..configuration_utils import ConfigMixin, register_to_config
+from ..loaders import UNet2DConditionLoadersMixin
+from ..utils import BaseOutput, logging
+from .attention_processor import AttentionProcessor, AttnProcessor
+from .embeddings import TimestepEmbedding, Timesteps
+from .modeling_utils import ModelMixin
+from .transformer_temporal import TransformerTemporalModel
+from .unet_3d_blocks import (
+    CrossAttnDownBlock3D,
+    CrossAttnUpBlock3D,
+    DownBlock3D,
+    UNetMidBlock3DCrossAttn,
+    UpBlock3D,
+    get_down_block,
+    get_up_block,
+)
+
+
+logger = logging.get_logger(__name__)  # pylint: disable=invalid-name
+
+
+@dataclass
+class UNet3DConditionOutput(BaseOutput):
+    """
+    The output of [`UNet3DConditionModel`].
+
+    Args:
+        sample (`torch.FloatTensor` of shape `(batch_size, num_frames, num_channels, height, width)`):
+            The hidden states output conditioned on `encoder_hidden_states` input. Output of last layer of model.
+    """
+
+    sample: torch.FloatTensor
+
+
+class UNet3DConditionModel(ModelMixin, ConfigMixin, UNet2DConditionLoadersMixin):
+    r"""
+    A conditional 3D UNet model that takes a noisy sample, conditional state, and a timestep and returns a sample
+    shaped output.
+
+    This model inherits from [`ModelMixin`]. Check the superclass documentation for its generic methods implemented
+    for all models (such as downloading or saving).
+
+    Parameters:
+        sample_size (`int` or `Tuple[int, int]`, *optional*, defaults to `None`):
+            Height and width of input/output sample.
+        in_channels (`int`, *optional*, defaults to 4): The number of channels in the input sample.
+        out_channels (`int`, *optional*, defaults to 4): The number of channels in the output.
+        down_block_types (`Tuple[str]`, *optional*, defaults to `("CrossAttnDownBlock3D", "CrossAttnDownBlock3D", "CrossAttnDownBlock3D", "DownBlock3D")`):
+            The tuple of downsample blocks to use.
+        up_block_types (`Tuple[str]`, *optional*, defaults to `("UpBlock3D", "CrossAttnUpBlock3D", "CrossAttnUpBlock3D", "CrossAttnUpBlock3D")`):
+            The tuple of upsample blocks to use.
+        block_out_channels (`Tuple[int]`, *optional*, defaults to `(320, 640, 1280, 1280)`):
+            The tuple of output channels for each block.
+        layers_per_block (`int`, *optional*, defaults to 2): The number of layers per block.
+        downsample_padding (`int`, *optional*, defaults to 1): The padding to use for the downsampling convolution.
+        mid_block_scale_factor (`float`, *optional*, defaults to 1.0): The scale factor to use for the mid block.
+        act_fn (`str`, *optional*, defaults to `"silu"`): The activation function to use.
+        norm_num_groups (`int`, *optional*, defaults to 32): The number of groups to use for the normalization.
+            If `None`, normalization and activation layers are skipped in post-processing.
+        norm_eps (`float`, *optional*, defaults to 1e-5): The epsilon to use for the normalization.
+        cross_attention_dim (`int`, *optional*, defaults to 1024): The dimension of the cross attention features.
+        attention_head_dim (`int`, *optional*, defaults to 64): The dimension of the attention heads.
+        num_attention_heads (`int`, *optional*): The number of attention heads.
+ """ + + _supports_gradient_checkpointing = False + + @register_to_config + def __init__( + self, + sample_size: Optional[int] = None, + in_channels: int = 4, + out_channels: int = 4, + down_block_types: Tuple[str] = ( + "CrossAttnDownBlock3D", + "CrossAttnDownBlock3D", + "CrossAttnDownBlock3D", + "DownBlock3D", + ), + up_block_types: Tuple[str] = ("UpBlock3D", "CrossAttnUpBlock3D", "CrossAttnUpBlock3D", "CrossAttnUpBlock3D"), + block_out_channels: Tuple[int] = (320, 640, 1280, 1280), + layers_per_block: int = 2, + downsample_padding: int = 1, + mid_block_scale_factor: float = 1, + act_fn: str = "silu", + norm_num_groups: Optional[int] = 32, + norm_eps: float = 1e-5, + cross_attention_dim: int = 1024, + attention_head_dim: Union[int, Tuple[int]] = 64, + num_attention_heads: Optional[Union[int, Tuple[int]]] = None, + ): + super().__init__() + + self.sample_size = sample_size + + if num_attention_heads is not None: + raise NotImplementedError( + "At the moment it is not possible to define the number of attention heads via `num_attention_heads` because of a naming issue as described in https://github.com/huggingface/diffusers/issues/2011#issuecomment-1547958131. Passing `num_attention_heads` will only be supported in diffusers v0.19." + ) + + # If `num_attention_heads` is not defined (which is the case for most models) + # it will default to `attention_head_dim`. This looks weird upon first reading it and it is. + # The reason for this behavior is to correct for incorrectly named variables that were introduced + # when this library was created. The incorrect naming was only discovered much later in https://github.com/huggingface/diffusers/issues/2011#issuecomment-1547958131 + # Changing `attention_head_dim` to `num_attention_heads` for 40,000+ configurations is too backwards breaking + # which is why we correct for the naming here. + num_attention_heads = num_attention_heads or attention_head_dim + + # Check inputs + if len(down_block_types) != len(up_block_types): + raise ValueError( + f"Must provide the same number of `down_block_types` as `up_block_types`. `down_block_types`: {down_block_types}. `up_block_types`: {up_block_types}." + ) + + if len(block_out_channels) != len(down_block_types): + raise ValueError( + f"Must provide the same number of `block_out_channels` as `down_block_types`. `block_out_channels`: {block_out_channels}. `down_block_types`: {down_block_types}." + ) + + if not isinstance(num_attention_heads, int) and len(num_attention_heads) != len(down_block_types): + raise ValueError( + f"Must provide the same number of `num_attention_heads` as `down_block_types`. `num_attention_heads`: {num_attention_heads}. `down_block_types`: {down_block_types}." 
+ ) + + # input + conv_in_kernel = 3 + conv_out_kernel = 3 + conv_in_padding = (conv_in_kernel - 1) // 2 + self.conv_in = nn.Conv2d( + in_channels, block_out_channels[0], kernel_size=conv_in_kernel, padding=conv_in_padding + ) + + # time + time_embed_dim = block_out_channels[0] * 4 + self.time_proj = Timesteps(block_out_channels[0], True, 0) + timestep_input_dim = block_out_channels[0] + + self.time_embedding = TimestepEmbedding( + timestep_input_dim, + time_embed_dim, + act_fn=act_fn, + ) + + self.transformer_in = TransformerTemporalModel( + num_attention_heads=8, + attention_head_dim=attention_head_dim, + in_channels=block_out_channels[0], + num_layers=1, + ) + + # class embedding + self.down_blocks = nn.ModuleList([]) + self.up_blocks = nn.ModuleList([]) + + if isinstance(num_attention_heads, int): + num_attention_heads = (num_attention_heads,) * len(down_block_types) + + # down + output_channel = block_out_channels[0] + for i, down_block_type in enumerate(down_block_types): + input_channel = output_channel + output_channel = block_out_channels[i] + is_final_block = i == len(block_out_channels) - 1 + + down_block = get_down_block( + down_block_type, + num_layers=layers_per_block, + in_channels=input_channel, + out_channels=output_channel, + temb_channels=time_embed_dim, + add_downsample=not is_final_block, + resnet_eps=norm_eps, + resnet_act_fn=act_fn, + resnet_groups=norm_num_groups, + cross_attention_dim=cross_attention_dim, + num_attention_heads=num_attention_heads[i], + downsample_padding=downsample_padding, + dual_cross_attention=False, + ) + self.down_blocks.append(down_block) + + # mid + self.mid_block = UNetMidBlock3DCrossAttn( + in_channels=block_out_channels[-1], + temb_channels=time_embed_dim, + resnet_eps=norm_eps, + resnet_act_fn=act_fn, + output_scale_factor=mid_block_scale_factor, + cross_attention_dim=cross_attention_dim, + num_attention_heads=num_attention_heads[-1], + resnet_groups=norm_num_groups, + dual_cross_attention=False, + ) + + # count how many layers upsample the images + self.num_upsamplers = 0 + + # up + reversed_block_out_channels = list(reversed(block_out_channels)) + reversed_num_attention_heads = list(reversed(num_attention_heads)) + + output_channel = reversed_block_out_channels[0] + for i, up_block_type in enumerate(up_block_types): + is_final_block = i == len(block_out_channels) - 1 + + prev_output_channel = output_channel + output_channel = reversed_block_out_channels[i] + input_channel = reversed_block_out_channels[min(i + 1, len(block_out_channels) - 1)] + + # add upsample block for all BUT final layer + if not is_final_block: + add_upsample = True + self.num_upsamplers += 1 + else: + add_upsample = False + + up_block = get_up_block( + up_block_type, + num_layers=layers_per_block + 1, + in_channels=input_channel, + out_channels=output_channel, + prev_output_channel=prev_output_channel, + temb_channels=time_embed_dim, + add_upsample=add_upsample, + resnet_eps=norm_eps, + resnet_act_fn=act_fn, + resnet_groups=norm_num_groups, + cross_attention_dim=cross_attention_dim, + num_attention_heads=reversed_num_attention_heads[i], + dual_cross_attention=False, + ) + self.up_blocks.append(up_block) + prev_output_channel = output_channel + + # out + if norm_num_groups is not None: + self.conv_norm_out = nn.GroupNorm( + num_channels=block_out_channels[0], num_groups=norm_num_groups, eps=norm_eps + ) + self.conv_act = nn.SiLU() + else: + self.conv_norm_out = None + self.conv_act = None + + conv_out_padding = (conv_out_kernel - 1) // 2 + self.conv_out = nn.Conv2d( 
+ block_out_channels[0], out_channels, kernel_size=conv_out_kernel, padding=conv_out_padding + ) + + @property + # Copied from diffusers.models.unet_2d_condition.UNet2DConditionModel.attn_processors + def attn_processors(self) -> Dict[str, AttentionProcessor]: + r""" + Returns: + `dict` of attention processors: A dictionary containing all attention processors used in the model with + indexed by its weight name. + """ + # set recursively + processors = {} + + def fn_recursive_add_processors(name: str, module: torch.nn.Module, processors: Dict[str, AttentionProcessor]): + if hasattr(module, "set_processor"): + processors[f"{name}.processor"] = module.processor + + for sub_name, child in module.named_children(): + fn_recursive_add_processors(f"{name}.{sub_name}", child, processors) + + return processors + + for name, module in self.named_children(): + fn_recursive_add_processors(name, module, processors) + + return processors + + # Copied from diffusers.models.unet_2d_condition.UNet2DConditionModel.set_attention_slice + def set_attention_slice(self, slice_size): + r""" + Enable sliced attention computation. + + When this option is enabled, the attention module splits the input tensor in slices to compute attention in + several steps. This is useful for saving some memory in exchange for a small decrease in speed. + + Args: + slice_size (`str` or `int` or `list(int)`, *optional*, defaults to `"auto"`): + When `"auto"`, input to the attention heads is halved, so attention is computed in two steps. If + `"max"`, maximum amount of memory is saved by running only one slice at a time. If a number is + provided, uses as many slices as `attention_head_dim // slice_size`. In this case, `attention_head_dim` + must be a multiple of `slice_size`. + """ + sliceable_head_dims = [] + + def fn_recursive_retrieve_sliceable_dims(module: torch.nn.Module): + if hasattr(module, "set_attention_slice"): + sliceable_head_dims.append(module.sliceable_head_dim) + + for child in module.children(): + fn_recursive_retrieve_sliceable_dims(child) + + # retrieve number of attention layers + for module in self.children(): + fn_recursive_retrieve_sliceable_dims(module) + + num_sliceable_layers = len(sliceable_head_dims) + + if slice_size == "auto": + # half the attention head size is usually a good trade-off between + # speed and memory + slice_size = [dim // 2 for dim in sliceable_head_dims] + elif slice_size == "max": + # make smallest slice possible + slice_size = num_sliceable_layers * [1] + + slice_size = num_sliceable_layers * [slice_size] if not isinstance(slice_size, list) else slice_size + + if len(slice_size) != len(sliceable_head_dims): + raise ValueError( + f"You have provided {len(slice_size)}, but {self.config} has {len(sliceable_head_dims)} different" + f" attention layers. Make sure to match `len(slice_size)` to be {len(sliceable_head_dims)}." + ) + + for i in range(len(slice_size)): + size = slice_size[i] + dim = sliceable_head_dims[i] + if size is not None and size > dim: + raise ValueError(f"size {size} has to be smaller or equal to {dim}.") + + # Recursively walk through all the children. 
+ # Any children which exposes the set_attention_slice method + # gets the message + def fn_recursive_set_attention_slice(module: torch.nn.Module, slice_size: List[int]): + if hasattr(module, "set_attention_slice"): + module.set_attention_slice(slice_size.pop()) + + for child in module.children(): + fn_recursive_set_attention_slice(child, slice_size) + + reversed_slice_size = list(reversed(slice_size)) + for module in self.children(): + fn_recursive_set_attention_slice(module, reversed_slice_size) + + # Copied from diffusers.models.unet_2d_condition.UNet2DConditionModel.set_attn_processor + def set_attn_processor(self, processor: Union[AttentionProcessor, Dict[str, AttentionProcessor]]): + r""" + Sets the attention processor to use to compute attention. + + Parameters: + processor (`dict` of `AttentionProcessor` or only `AttentionProcessor`): + The instantiated processor class or a dictionary of processor classes that will be set as the processor + for **all** `Attention` layers. + + If `processor` is a dict, the key needs to define the path to the corresponding cross attention + processor. This is strongly recommended when setting trainable attention processors. + + """ + count = len(self.attn_processors.keys()) + + if isinstance(processor, dict) and len(processor) != count: + raise ValueError( + f"A dict of processors was passed, but the number of processors {len(processor)} does not match the" + f" number of attention layers: {count}. Please make sure to pass {count} processor classes." + ) + + def fn_recursive_attn_processor(name: str, module: torch.nn.Module, processor): + if hasattr(module, "set_processor"): + if not isinstance(processor, dict): + module.set_processor(processor) + else: + module.set_processor(processor.pop(f"{name}.processor")) + + for sub_name, child in module.named_children(): + fn_recursive_attn_processor(f"{name}.{sub_name}", child, processor) + + for name, module in self.named_children(): + fn_recursive_attn_processor(name, module, processor) + + def enable_forward_chunking(self, chunk_size=None, dim=0): + """ + Sets the attention processor to use [feed forward + chunking](https://huggingface.co/blog/reformer#2-chunked-feed-forward-layers). + + Parameters: + chunk_size (`int`, *optional*): + The chunk size of the feed-forward layers. If not specified, will run feed-forward layer individually + over each tensor of dim=`dim`. + dim (`int`, *optional*, defaults to `0`): + The dimension over which the feed-forward computation should be chunked. Choose between dim=0 (batch) + or dim=1 (sequence length). 
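+
+        Example (illustrative sketch, assuming `unet` is an already constructed `UNet3DConditionModel`):
+
+        ```py
+        >>> unet.enable_forward_chunking(chunk_size=1, dim=1)  # chunk the feed-forward over the sequence dim
+        >>> unet.disable_forward_chunking()                    # restore the default unchunked behaviour
+        ```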
+ """ + if dim not in [0, 1]: + raise ValueError(f"Make sure to set `dim` to either 0 or 1, not {dim}") + + # By default chunk size is 1 + chunk_size = chunk_size or 1 + + def fn_recursive_feed_forward(module: torch.nn.Module, chunk_size: int, dim: int): + if hasattr(module, "set_chunk_feed_forward"): + module.set_chunk_feed_forward(chunk_size=chunk_size, dim=dim) + + for child in module.children(): + fn_recursive_feed_forward(child, chunk_size, dim) + + for module in self.children(): + fn_recursive_feed_forward(module, chunk_size, dim) + + def disable_forward_chunking(self): + def fn_recursive_feed_forward(module: torch.nn.Module, chunk_size: int, dim: int): + if hasattr(module, "set_chunk_feed_forward"): + module.set_chunk_feed_forward(chunk_size=chunk_size, dim=dim) + + for child in module.children(): + fn_recursive_feed_forward(child, chunk_size, dim) + + for module in self.children(): + fn_recursive_feed_forward(module, None, 0) + + # Copied from diffusers.models.unet_2d_condition.UNet2DConditionModel.set_default_attn_processor + def set_default_attn_processor(self): + """ + Disables custom attention processors and sets the default attention implementation. + """ + self.set_attn_processor(AttnProcessor()) + + def _set_gradient_checkpointing(self, module, value=False): + if isinstance(module, (CrossAttnDownBlock3D, DownBlock3D, CrossAttnUpBlock3D, UpBlock3D)): + module.gradient_checkpointing = value + + def forward( + self, + sample: torch.FloatTensor, + timestep: Union[torch.Tensor, float, int], + encoder_hidden_states: torch.Tensor, + class_labels: Optional[torch.Tensor] = None, + timestep_cond: Optional[torch.Tensor] = None, + attention_mask: Optional[torch.Tensor] = None, + cross_attention_kwargs: Optional[Dict[str, Any]] = None, + down_block_additional_residuals: Optional[Tuple[torch.Tensor]] = None, + mid_block_additional_residual: Optional[torch.Tensor] = None, + return_dict: bool = True, + ) -> Union[UNet3DConditionOutput, Tuple]: + r""" + The [`UNet3DConditionModel`] forward method. + + Args: + sample (`torch.FloatTensor`): + The noisy input tensor with the following shape `(batch, num_frames, channel, height, width`. + timestep (`torch.FloatTensor` or `float` or `int`): The number of timesteps to denoise an input. + encoder_hidden_states (`torch.FloatTensor`): + The encoder hidden states with shape `(batch, sequence_length, feature_dim)`. + return_dict (`bool`, *optional*, defaults to `True`): + Whether or not to return a [`~models.unet_3d_condition.UNet3DConditionOutput`] instead of a plain + tuple. + cross_attention_kwargs (`dict`, *optional*): + A kwargs dictionary that if specified is passed along to the [`AttnProcessor`]. + + Returns: + [`~models.unet_3d_condition.UNet3DConditionOutput`] or `tuple`: + If `return_dict` is True, an [`~models.unet_3d_condition.UNet3DConditionOutput`] is returned, otherwise + a `tuple` is returned where the first element is the sample tensor. + """ + # By default samples have to be AT least a multiple of the overall upsampling factor. + # The overall upsampling factor is equal to 2 ** (# num of upsampling layears). + # However, the upsampling interpolation output size can be forced to fit any upsampling size + # on the fly if necessary. 
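+        # For instance, with the default four blocks there are 3 upsampling layers, so the factor
+        # below is 2**3 = 8: a 32x32 or 64x64 sample divides evenly, while a 60x60 sample does not
+        # and will set `forward_upsample_size = True` further down.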
+ default_overall_up_factor = 2**self.num_upsamplers + + # upsample size should be forwarded when sample is not a multiple of `default_overall_up_factor` + forward_upsample_size = False + upsample_size = None + + if any(s % default_overall_up_factor != 0 for s in sample.shape[-2:]): + logger.info("Forward upsample size to force interpolation output size.") + forward_upsample_size = True + + # prepare attention_mask + if attention_mask is not None: + attention_mask = (1 - attention_mask.to(sample.dtype)) * -10000.0 + attention_mask = attention_mask.unsqueeze(1) + + # 1. time + timesteps = timestep + if not torch.is_tensor(timesteps): + # TODO: this requires sync between CPU and GPU. So try to pass timesteps as tensors if you can + # This would be a good case for the `match` statement (Python 3.10+) + is_mps = sample.device.type == "mps" + if isinstance(timestep, float): + dtype = torch.float32 if is_mps else torch.float64 + else: + dtype = torch.int32 if is_mps else torch.int64 + timesteps = torch.tensor([timesteps], dtype=dtype, device=sample.device) + elif len(timesteps.shape) == 0: + timesteps = timesteps[None].to(sample.device) + + # broadcast to batch dimension in a way that's compatible with ONNX/Core ML + num_frames = sample.shape[2] + timesteps = timesteps.expand(sample.shape[0]) + + t_emb = self.time_proj(timesteps) + + # timesteps does not contain any weights and will always return f32 tensors + # but time_embedding might actually be running in fp16. so we need to cast here. + # there might be better ways to encapsulate this. + t_emb = t_emb.to(dtype=self.dtype) + + emb = self.time_embedding(t_emb, timestep_cond) + emb = emb.repeat_interleave(repeats=num_frames, dim=0) + encoder_hidden_states = encoder_hidden_states.repeat_interleave(repeats=num_frames, dim=0) + + # 2. pre-process + sample = sample.permute(0, 2, 1, 3, 4).reshape((sample.shape[0] * num_frames, -1) + sample.shape[3:]) + sample = self.conv_in(sample) + + sample = self.transformer_in( + sample, + num_frames=num_frames, + cross_attention_kwargs=cross_attention_kwargs, + return_dict=False, + )[0] + + # 3. down + down_block_res_samples = (sample,) + for downsample_block in self.down_blocks: + if hasattr(downsample_block, "has_cross_attention") and downsample_block.has_cross_attention: + sample, res_samples = downsample_block( + hidden_states=sample, + temb=emb, + encoder_hidden_states=encoder_hidden_states, + attention_mask=attention_mask, + num_frames=num_frames, + cross_attention_kwargs=cross_attention_kwargs, + ) + else: + sample, res_samples = downsample_block(hidden_states=sample, temb=emb, num_frames=num_frames) + + down_block_res_samples += res_samples + + if down_block_additional_residuals is not None: + new_down_block_res_samples = () + + for down_block_res_sample, down_block_additional_residual in zip( + down_block_res_samples, down_block_additional_residuals + ): + down_block_res_sample = down_block_res_sample + down_block_additional_residual + new_down_block_res_samples += (down_block_res_sample,) + + down_block_res_samples = new_down_block_res_samples + + # 4. mid + if self.mid_block is not None: + sample = self.mid_block( + sample, + emb, + encoder_hidden_states=encoder_hidden_states, + attention_mask=attention_mask, + num_frames=num_frames, + cross_attention_kwargs=cross_attention_kwargs, + ) + + if mid_block_additional_residual is not None: + sample = sample + mid_block_additional_residual + + # 5. 
up + for i, upsample_block in enumerate(self.up_blocks): + is_final_block = i == len(self.up_blocks) - 1 + + res_samples = down_block_res_samples[-len(upsample_block.resnets) :] + down_block_res_samples = down_block_res_samples[: -len(upsample_block.resnets)] + + # if we have not reached the final block and need to forward the + # upsample size, we do it here + if not is_final_block and forward_upsample_size: + upsample_size = down_block_res_samples[-1].shape[2:] + + if hasattr(upsample_block, "has_cross_attention") and upsample_block.has_cross_attention: + sample = upsample_block( + hidden_states=sample, + temb=emb, + res_hidden_states_tuple=res_samples, + encoder_hidden_states=encoder_hidden_states, + upsample_size=upsample_size, + attention_mask=attention_mask, + num_frames=num_frames, + cross_attention_kwargs=cross_attention_kwargs, + ) + else: + sample = upsample_block( + hidden_states=sample, + temb=emb, + res_hidden_states_tuple=res_samples, + upsample_size=upsample_size, + num_frames=num_frames, + ) + + # 6. post-process + if self.conv_norm_out: + sample = self.conv_norm_out(sample) + sample = self.conv_act(sample) + + sample = self.conv_out(sample) + + # reshape to (batch, channel, framerate, width, height) + sample = sample[None, :].reshape((-1, num_frames) + sample.shape[1:]).permute(0, 2, 1, 3, 4) + + if not return_dict: + return (sample,) + + return UNet3DConditionOutput(sample=sample) diff --git a/gf/sd-webui-rich-text/diffusers_official/models/vae.py b/gf/sd-webui-rich-text/diffusers_official/models/vae.py new file mode 100644 index 0000000000000000000000000000000000000000..edd516dd380aa6f5888174bbd5f3df86be187feb --- /dev/null +++ b/gf/sd-webui-rich-text/diffusers_official/models/vae.py @@ -0,0 +1,441 @@ +# Copyright 2023 The HuggingFace Team. All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +from dataclasses import dataclass +from typing import Optional + +import numpy as np +import torch +import torch.nn as nn + +from ..utils import BaseOutput, is_torch_version, randn_tensor +from .attention_processor import SpatialNorm +from .unet_2d_blocks import UNetMidBlock2D, get_down_block, get_up_block + + +@dataclass +class DecoderOutput(BaseOutput): + """ + Output of decoding method. + + Args: + sample (`torch.FloatTensor` of shape `(batch_size, num_channels, height, width)`): + The decoded output sample from the last layer of the model. 
+ """ + + sample: torch.FloatTensor + + +class Encoder(nn.Module): + def __init__( + self, + in_channels=3, + out_channels=3, + down_block_types=("DownEncoderBlock2D",), + block_out_channels=(64,), + layers_per_block=2, + norm_num_groups=32, + act_fn="silu", + double_z=True, + ): + super().__init__() + self.layers_per_block = layers_per_block + + self.conv_in = torch.nn.Conv2d( + in_channels, + block_out_channels[0], + kernel_size=3, + stride=1, + padding=1, + ) + + self.mid_block = None + self.down_blocks = nn.ModuleList([]) + + # down + output_channel = block_out_channels[0] + for i, down_block_type in enumerate(down_block_types): + input_channel = output_channel + output_channel = block_out_channels[i] + is_final_block = i == len(block_out_channels) - 1 + + down_block = get_down_block( + down_block_type, + num_layers=self.layers_per_block, + in_channels=input_channel, + out_channels=output_channel, + add_downsample=not is_final_block, + resnet_eps=1e-6, + downsample_padding=0, + resnet_act_fn=act_fn, + resnet_groups=norm_num_groups, + attention_head_dim=output_channel, + temb_channels=None, + ) + self.down_blocks.append(down_block) + + # mid + self.mid_block = UNetMidBlock2D( + in_channels=block_out_channels[-1], + resnet_eps=1e-6, + resnet_act_fn=act_fn, + output_scale_factor=1, + resnet_time_scale_shift="default", + attention_head_dim=block_out_channels[-1], + resnet_groups=norm_num_groups, + temb_channels=None, + ) + + # out + self.conv_norm_out = nn.GroupNorm(num_channels=block_out_channels[-1], num_groups=norm_num_groups, eps=1e-6) + self.conv_act = nn.SiLU() + + conv_out_channels = 2 * out_channels if double_z else out_channels + self.conv_out = nn.Conv2d(block_out_channels[-1], conv_out_channels, 3, padding=1) + + self.gradient_checkpointing = False + + def forward(self, x): + sample = x + sample = self.conv_in(sample) + + if self.training and self.gradient_checkpointing: + + def create_custom_forward(module): + def custom_forward(*inputs): + return module(*inputs) + + return custom_forward + + # down + if is_torch_version(">=", "1.11.0"): + for down_block in self.down_blocks: + sample = torch.utils.checkpoint.checkpoint( + create_custom_forward(down_block), sample, use_reentrant=False + ) + # middle + sample = torch.utils.checkpoint.checkpoint( + create_custom_forward(self.mid_block), sample, use_reentrant=False + ) + else: + for down_block in self.down_blocks: + sample = torch.utils.checkpoint.checkpoint(create_custom_forward(down_block), sample) + # middle + sample = torch.utils.checkpoint.checkpoint(create_custom_forward(self.mid_block), sample) + + else: + # down + for down_block in self.down_blocks: + sample = down_block(sample) + + # middle + sample = self.mid_block(sample) + + # post-process + sample = self.conv_norm_out(sample) + sample = self.conv_act(sample) + sample = self.conv_out(sample) + + return sample + + +class Decoder(nn.Module): + def __init__( + self, + in_channels=3, + out_channels=3, + up_block_types=("UpDecoderBlock2D",), + block_out_channels=(64,), + layers_per_block=2, + norm_num_groups=32, + act_fn="silu", + norm_type="group", # group, spatial + ): + super().__init__() + self.layers_per_block = layers_per_block + + self.conv_in = nn.Conv2d( + in_channels, + block_out_channels[-1], + kernel_size=3, + stride=1, + padding=1, + ) + + self.mid_block = None + self.up_blocks = nn.ModuleList([]) + + temb_channels = in_channels if norm_type == "spatial" else None + + # mid + self.mid_block = UNetMidBlock2D( + in_channels=block_out_channels[-1], + 
resnet_eps=1e-6, + resnet_act_fn=act_fn, + output_scale_factor=1, + resnet_time_scale_shift="default" if norm_type == "group" else norm_type, + attention_head_dim=block_out_channels[-1], + resnet_groups=norm_num_groups, + temb_channels=temb_channels, + ) + + # up + reversed_block_out_channels = list(reversed(block_out_channels)) + output_channel = reversed_block_out_channels[0] + for i, up_block_type in enumerate(up_block_types): + prev_output_channel = output_channel + output_channel = reversed_block_out_channels[i] + + is_final_block = i == len(block_out_channels) - 1 + + up_block = get_up_block( + up_block_type, + num_layers=self.layers_per_block + 1, + in_channels=prev_output_channel, + out_channels=output_channel, + prev_output_channel=None, + add_upsample=not is_final_block, + resnet_eps=1e-6, + resnet_act_fn=act_fn, + resnet_groups=norm_num_groups, + attention_head_dim=output_channel, + temb_channels=temb_channels, + resnet_time_scale_shift=norm_type, + ) + self.up_blocks.append(up_block) + prev_output_channel = output_channel + + # out + if norm_type == "spatial": + self.conv_norm_out = SpatialNorm(block_out_channels[0], temb_channels) + else: + self.conv_norm_out = nn.GroupNorm(num_channels=block_out_channels[0], num_groups=norm_num_groups, eps=1e-6) + self.conv_act = nn.SiLU() + self.conv_out = nn.Conv2d(block_out_channels[0], out_channels, 3, padding=1) + + self.gradient_checkpointing = False + + def forward(self, z, latent_embeds=None): + sample = z + sample = self.conv_in(sample) + + upscale_dtype = next(iter(self.up_blocks.parameters())).dtype + if self.training and self.gradient_checkpointing: + + def create_custom_forward(module): + def custom_forward(*inputs): + return module(*inputs) + + return custom_forward + + if is_torch_version(">=", "1.11.0"): + # middle + sample = torch.utils.checkpoint.checkpoint( + create_custom_forward(self.mid_block), sample, latent_embeds, use_reentrant=False + ) + sample = sample.to(upscale_dtype) + + # up + for up_block in self.up_blocks: + sample = torch.utils.checkpoint.checkpoint( + create_custom_forward(up_block), sample, latent_embeds, use_reentrant=False + ) + else: + # middle + sample = torch.utils.checkpoint.checkpoint( + create_custom_forward(self.mid_block), sample, latent_embeds + ) + sample = sample.to(upscale_dtype) + + # up + for up_block in self.up_blocks: + sample = torch.utils.checkpoint.checkpoint(create_custom_forward(up_block), sample, latent_embeds) + else: + # middle + sample = self.mid_block(sample, latent_embeds) + sample = sample.to(upscale_dtype) + + # up + for up_block in self.up_blocks: + sample = up_block(sample, latent_embeds) + + # post-process + if latent_embeds is None: + sample = self.conv_norm_out(sample) + else: + sample = self.conv_norm_out(sample, latent_embeds) + sample = self.conv_act(sample) + sample = self.conv_out(sample) + + return sample + + +class VectorQuantizer(nn.Module): + """ + Improved version over VectorQuantizer, can be used as a drop-in replacement. Mostly avoids costly matrix + multiplications and allows for post-hoc remapping of indices. + """ + + # NOTE: due to a bug the beta term was applied to the wrong term. for + # backwards compatibility we use the buggy version by default, but you can + # specify legacy=False to fix it. 
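+    # Concretely (see `forward` below), `beta` multiplies the (z_q - z.detach()) term when
+    # legacy=True and the (z_q.detach() - z) term when legacy=False; only the weighting of the
+    # two mean-squared terms changes, not the straight-through gradient trick.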
+ def __init__( + self, n_e, vq_embed_dim, beta, remap=None, unknown_index="random", sane_index_shape=False, legacy=True + ): + super().__init__() + self.n_e = n_e + self.vq_embed_dim = vq_embed_dim + self.beta = beta + self.legacy = legacy + + self.embedding = nn.Embedding(self.n_e, self.vq_embed_dim) + self.embedding.weight.data.uniform_(-1.0 / self.n_e, 1.0 / self.n_e) + + self.remap = remap + if self.remap is not None: + self.register_buffer("used", torch.tensor(np.load(self.remap))) + self.re_embed = self.used.shape[0] + self.unknown_index = unknown_index # "random" or "extra" or integer + if self.unknown_index == "extra": + self.unknown_index = self.re_embed + self.re_embed = self.re_embed + 1 + print( + f"Remapping {self.n_e} indices to {self.re_embed} indices. " + f"Using {self.unknown_index} for unknown indices." + ) + else: + self.re_embed = n_e + + self.sane_index_shape = sane_index_shape + + def remap_to_used(self, inds): + ishape = inds.shape + assert len(ishape) > 1 + inds = inds.reshape(ishape[0], -1) + used = self.used.to(inds) + match = (inds[:, :, None] == used[None, None, ...]).long() + new = match.argmax(-1) + unknown = match.sum(2) < 1 + if self.unknown_index == "random": + new[unknown] = torch.randint(0, self.re_embed, size=new[unknown].shape).to(device=new.device) + else: + new[unknown] = self.unknown_index + return new.reshape(ishape) + + def unmap_to_all(self, inds): + ishape = inds.shape + assert len(ishape) > 1 + inds = inds.reshape(ishape[0], -1) + used = self.used.to(inds) + if self.re_embed > self.used.shape[0]: # extra token + inds[inds >= self.used.shape[0]] = 0 # simply set to zero + back = torch.gather(used[None, :][inds.shape[0] * [0], :], 1, inds) + return back.reshape(ishape) + + def forward(self, z): + # reshape z -> (batch, height, width, channel) and flatten + z = z.permute(0, 2, 3, 1).contiguous() + z_flattened = z.view(-1, self.vq_embed_dim) + + # distances from z to embeddings e_j (z - e)^2 = z^2 + e^2 - 2 e * z + min_encoding_indices = torch.argmin(torch.cdist(z_flattened, self.embedding.weight), dim=1) + + z_q = self.embedding(min_encoding_indices).view(z.shape) + perplexity = None + min_encodings = None + + # compute loss for embedding + if not self.legacy: + loss = self.beta * torch.mean((z_q.detach() - z) ** 2) + torch.mean((z_q - z.detach()) ** 2) + else: + loss = torch.mean((z_q.detach() - z) ** 2) + self.beta * torch.mean((z_q - z.detach()) ** 2) + + # preserve gradients + z_q = z + (z_q - z).detach() + + # reshape back to match original input shape + z_q = z_q.permute(0, 3, 1, 2).contiguous() + + if self.remap is not None: + min_encoding_indices = min_encoding_indices.reshape(z.shape[0], -1) # add batch axis + min_encoding_indices = self.remap_to_used(min_encoding_indices) + min_encoding_indices = min_encoding_indices.reshape(-1, 1) # flatten + + if self.sane_index_shape: + min_encoding_indices = min_encoding_indices.reshape(z_q.shape[0], z_q.shape[2], z_q.shape[3]) + + return z_q, loss, (perplexity, min_encodings, min_encoding_indices) + + def get_codebook_entry(self, indices, shape): + # shape specifying (batch, height, width, channel) + if self.remap is not None: + indices = indices.reshape(shape[0], -1) # add batch axis + indices = self.unmap_to_all(indices) + indices = indices.reshape(-1) # flatten again + + # get quantized latent vectors + z_q = self.embedding(indices) + + if shape is not None: + z_q = z_q.view(shape) + # reshape back to match original input shape + z_q = z_q.permute(0, 3, 1, 2).contiguous() + + return z_q + + 
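As a usage sketch for the `VectorQuantizer` defined above (illustrative only; the `diffusers_official` import root is an assumption based on this file's path, and the printed shapes simply follow the `forward` and `get_codebook_entry` implementations):

```python
# Illustrative sketch only; `diffusers_official` as an import root is an assumption.
import torch

from diffusers_official.models.vae import VectorQuantizer

quantizer = VectorQuantizer(
    n_e=256,          # number of codebook entries
    vq_embed_dim=4,   # channel dimension of the latents being quantized
    beta=0.25,
)

z = torch.randn(2, 4, 32, 32)  # (batch, channel, height, width)
z_q, loss, (perplexity, min_encodings, indices) = quantizer(z)

print(z_q.shape)      # torch.Size([2, 4, 32, 32]) -- straight-through quantized latents
print(loss.shape)     # torch.Size([]) -- scalar codebook/commitment loss
print(indices.shape)  # torch.Size([2048]) -- one codebook index per spatial position (2*32*32)

# Rebuild the quantized latents from stored indices; `shape` is (batch, height, width, channel).
z_q_again = quantizer.get_codebook_entry(indices, shape=(2, 32, 32, 4))
print(z_q_again.shape)  # torch.Size([2, 4, 32, 32])
```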
+class DiagonalGaussianDistribution(object): + def __init__(self, parameters, deterministic=False): + self.parameters = parameters + self.mean, self.logvar = torch.chunk(parameters, 2, dim=1) + self.logvar = torch.clamp(self.logvar, -30.0, 20.0) + self.deterministic = deterministic + self.std = torch.exp(0.5 * self.logvar) + self.var = torch.exp(self.logvar) + if self.deterministic: + self.var = self.std = torch.zeros_like( + self.mean, device=self.parameters.device, dtype=self.parameters.dtype + ) + + def sample(self, generator: Optional[torch.Generator] = None) -> torch.FloatTensor: + # make sure sample is on the same device as the parameters and has same dtype + sample = randn_tensor( + self.mean.shape, generator=generator, device=self.parameters.device, dtype=self.parameters.dtype + ) + x = self.mean + self.std * sample + return x + + def kl(self, other=None): + if self.deterministic: + return torch.Tensor([0.0]) + else: + if other is None: + return 0.5 * torch.sum(torch.pow(self.mean, 2) + self.var - 1.0 - self.logvar, dim=[1, 2, 3]) + else: + return 0.5 * torch.sum( + torch.pow(self.mean - other.mean, 2) / other.var + + self.var / other.var + - 1.0 + - self.logvar + + other.logvar, + dim=[1, 2, 3], + ) + + def nll(self, sample, dims=[1, 2, 3]): + if self.deterministic: + return torch.Tensor([0.0]) + logtwopi = np.log(2.0 * np.pi) + return 0.5 * torch.sum(logtwopi + self.logvar + torch.pow(sample - self.mean, 2) / self.var, dim=dims) + + def mode(self): + return self.mean diff --git a/gf/sd-webui-rich-text/diffusers_official/models/vae_flax.py b/gf/sd-webui-rich-text/diffusers_official/models/vae_flax.py new file mode 100644 index 0000000000000000000000000000000000000000..b8f5b1d0e399ab8e58d81d396d19b6f082192f5a --- /dev/null +++ b/gf/sd-webui-rich-text/diffusers_official/models/vae_flax.py @@ -0,0 +1,869 @@ +# Copyright 2023 The HuggingFace Team. All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +# JAX implementation of VQGAN from taming-transformers https://github.com/CompVis/taming-transformers + +import math +from functools import partial +from typing import Tuple + +import flax +import flax.linen as nn +import jax +import jax.numpy as jnp +from flax.core.frozen_dict import FrozenDict + +from ..configuration_utils import ConfigMixin, flax_register_to_config +from ..utils import BaseOutput +from .modeling_flax_utils import FlaxModelMixin + + +@flax.struct.dataclass +class FlaxDecoderOutput(BaseOutput): + """ + Output of decoding method. + + Args: + sample (`jnp.ndarray` of shape `(batch_size, num_channels, height, width)`): + The decoded output sample from the last layer of the model. + dtype (`jnp.dtype`, *optional*, defaults to `jnp.float32`): + The `dtype` of the parameters. + """ + + sample: jnp.ndarray + + +@flax.struct.dataclass +class FlaxAutoencoderKLOutput(BaseOutput): + """ + Output of AutoencoderKL encoding method. 
+ + Args: + latent_dist (`FlaxDiagonalGaussianDistribution`): + Encoded outputs of `Encoder` represented as the mean and logvar of `FlaxDiagonalGaussianDistribution`. + `FlaxDiagonalGaussianDistribution` allows for sampling latents from the distribution. + """ + + latent_dist: "FlaxDiagonalGaussianDistribution" + + +class FlaxUpsample2D(nn.Module): + """ + Flax implementation of 2D Upsample layer + + Args: + in_channels (`int`): + Input channels + dtype (:obj:`jnp.dtype`, *optional*, defaults to jnp.float32): + Parameters `dtype` + """ + + in_channels: int + dtype: jnp.dtype = jnp.float32 + + def setup(self): + self.conv = nn.Conv( + self.in_channels, + kernel_size=(3, 3), + strides=(1, 1), + padding=((1, 1), (1, 1)), + dtype=self.dtype, + ) + + def __call__(self, hidden_states): + batch, height, width, channels = hidden_states.shape + hidden_states = jax.image.resize( + hidden_states, + shape=(batch, height * 2, width * 2, channels), + method="nearest", + ) + hidden_states = self.conv(hidden_states) + return hidden_states + + +class FlaxDownsample2D(nn.Module): + """ + Flax implementation of 2D Downsample layer + + Args: + in_channels (`int`): + Input channels + dtype (:obj:`jnp.dtype`, *optional*, defaults to jnp.float32): + Parameters `dtype` + """ + + in_channels: int + dtype: jnp.dtype = jnp.float32 + + def setup(self): + self.conv = nn.Conv( + self.in_channels, + kernel_size=(3, 3), + strides=(2, 2), + padding="VALID", + dtype=self.dtype, + ) + + def __call__(self, hidden_states): + pad = ((0, 0), (0, 1), (0, 1), (0, 0)) # pad height and width dim + hidden_states = jnp.pad(hidden_states, pad_width=pad) + hidden_states = self.conv(hidden_states) + return hidden_states + + +class FlaxResnetBlock2D(nn.Module): + """ + Flax implementation of 2D Resnet Block. + + Args: + in_channels (`int`): + Input channels + out_channels (`int`): + Output channels + dropout (:obj:`float`, *optional*, defaults to 0.0): + Dropout rate + groups (:obj:`int`, *optional*, defaults to `32`): + The number of groups to use for group norm. + use_nin_shortcut (:obj:`bool`, *optional*, defaults to `None`): + Whether to use `nin_shortcut`. 
This activates a new layer inside ResNet block + dtype (:obj:`jnp.dtype`, *optional*, defaults to jnp.float32): + Parameters `dtype` + """ + + in_channels: int + out_channels: int = None + dropout: float = 0.0 + groups: int = 32 + use_nin_shortcut: bool = None + dtype: jnp.dtype = jnp.float32 + + def setup(self): + out_channels = self.in_channels if self.out_channels is None else self.out_channels + + self.norm1 = nn.GroupNorm(num_groups=self.groups, epsilon=1e-6) + self.conv1 = nn.Conv( + out_channels, + kernel_size=(3, 3), + strides=(1, 1), + padding=((1, 1), (1, 1)), + dtype=self.dtype, + ) + + self.norm2 = nn.GroupNorm(num_groups=self.groups, epsilon=1e-6) + self.dropout_layer = nn.Dropout(self.dropout) + self.conv2 = nn.Conv( + out_channels, + kernel_size=(3, 3), + strides=(1, 1), + padding=((1, 1), (1, 1)), + dtype=self.dtype, + ) + + use_nin_shortcut = self.in_channels != out_channels if self.use_nin_shortcut is None else self.use_nin_shortcut + + self.conv_shortcut = None + if use_nin_shortcut: + self.conv_shortcut = nn.Conv( + out_channels, + kernel_size=(1, 1), + strides=(1, 1), + padding="VALID", + dtype=self.dtype, + ) + + def __call__(self, hidden_states, deterministic=True): + residual = hidden_states + hidden_states = self.norm1(hidden_states) + hidden_states = nn.swish(hidden_states) + hidden_states = self.conv1(hidden_states) + + hidden_states = self.norm2(hidden_states) + hidden_states = nn.swish(hidden_states) + hidden_states = self.dropout_layer(hidden_states, deterministic) + hidden_states = self.conv2(hidden_states) + + if self.conv_shortcut is not None: + residual = self.conv_shortcut(residual) + + return hidden_states + residual + + +class FlaxAttentionBlock(nn.Module): + r""" + Flax Convolutional based multi-head attention block for diffusion-based VAE. 
+ + Parameters: + channels (:obj:`int`): + Input channels + num_head_channels (:obj:`int`, *optional*, defaults to `None`): + Number of attention heads + num_groups (:obj:`int`, *optional*, defaults to `32`): + The number of groups to use for group norm + dtype (:obj:`jnp.dtype`, *optional*, defaults to jnp.float32): + Parameters `dtype` + + """ + channels: int + num_head_channels: int = None + num_groups: int = 32 + dtype: jnp.dtype = jnp.float32 + + def setup(self): + self.num_heads = self.channels // self.num_head_channels if self.num_head_channels is not None else 1 + + dense = partial(nn.Dense, self.channels, dtype=self.dtype) + + self.group_norm = nn.GroupNorm(num_groups=self.num_groups, epsilon=1e-6) + self.query, self.key, self.value = dense(), dense(), dense() + self.proj_attn = dense() + + def transpose_for_scores(self, projection): + new_projection_shape = projection.shape[:-1] + (self.num_heads, -1) + # move heads to 2nd position (B, T, H * D) -> (B, T, H, D) + new_projection = projection.reshape(new_projection_shape) + # (B, T, H, D) -> (B, H, T, D) + new_projection = jnp.transpose(new_projection, (0, 2, 1, 3)) + return new_projection + + def __call__(self, hidden_states): + residual = hidden_states + batch, height, width, channels = hidden_states.shape + + hidden_states = self.group_norm(hidden_states) + + hidden_states = hidden_states.reshape((batch, height * width, channels)) + + query = self.query(hidden_states) + key = self.key(hidden_states) + value = self.value(hidden_states) + + # transpose + query = self.transpose_for_scores(query) + key = self.transpose_for_scores(key) + value = self.transpose_for_scores(value) + + # compute attentions + scale = 1 / math.sqrt(math.sqrt(self.channels / self.num_heads)) + attn_weights = jnp.einsum("...qc,...kc->...qk", query * scale, key * scale) + attn_weights = nn.softmax(attn_weights, axis=-1) + + # attend to values + hidden_states = jnp.einsum("...kc,...qk->...qc", value, attn_weights) + + hidden_states = jnp.transpose(hidden_states, (0, 2, 1, 3)) + new_hidden_states_shape = hidden_states.shape[:-2] + (self.channels,) + hidden_states = hidden_states.reshape(new_hidden_states_shape) + + hidden_states = self.proj_attn(hidden_states) + hidden_states = hidden_states.reshape((batch, height, width, channels)) + hidden_states = hidden_states + residual + return hidden_states + + +class FlaxDownEncoderBlock2D(nn.Module): + r""" + Flax Resnet blocks-based Encoder block for diffusion-based VAE. 
+ + Parameters: + in_channels (:obj:`int`): + Input channels + out_channels (:obj:`int`): + Output channels + dropout (:obj:`float`, *optional*, defaults to 0.0): + Dropout rate + num_layers (:obj:`int`, *optional*, defaults to 1): + Number of Resnet layer block + resnet_groups (:obj:`int`, *optional*, defaults to `32`): + The number of groups to use for the Resnet block group norm + add_downsample (:obj:`bool`, *optional*, defaults to `True`): + Whether to add downsample layer + dtype (:obj:`jnp.dtype`, *optional*, defaults to jnp.float32): + Parameters `dtype` + """ + in_channels: int + out_channels: int + dropout: float = 0.0 + num_layers: int = 1 + resnet_groups: int = 32 + add_downsample: bool = True + dtype: jnp.dtype = jnp.float32 + + def setup(self): + resnets = [] + for i in range(self.num_layers): + in_channels = self.in_channels if i == 0 else self.out_channels + + res_block = FlaxResnetBlock2D( + in_channels=in_channels, + out_channels=self.out_channels, + dropout=self.dropout, + groups=self.resnet_groups, + dtype=self.dtype, + ) + resnets.append(res_block) + self.resnets = resnets + + if self.add_downsample: + self.downsamplers_0 = FlaxDownsample2D(self.out_channels, dtype=self.dtype) + + def __call__(self, hidden_states, deterministic=True): + for resnet in self.resnets: + hidden_states = resnet(hidden_states, deterministic=deterministic) + + if self.add_downsample: + hidden_states = self.downsamplers_0(hidden_states) + + return hidden_states + + +class FlaxUpDecoderBlock2D(nn.Module): + r""" + Flax Resnet blocks-based Decoder block for diffusion-based VAE. + + Parameters: + in_channels (:obj:`int`): + Input channels + out_channels (:obj:`int`): + Output channels + dropout (:obj:`float`, *optional*, defaults to 0.0): + Dropout rate + num_layers (:obj:`int`, *optional*, defaults to 1): + Number of Resnet layer block + resnet_groups (:obj:`int`, *optional*, defaults to `32`): + The number of groups to use for the Resnet block group norm + add_upsample (:obj:`bool`, *optional*, defaults to `True`): + Whether to add upsample layer + dtype (:obj:`jnp.dtype`, *optional*, defaults to jnp.float32): + Parameters `dtype` + """ + in_channels: int + out_channels: int + dropout: float = 0.0 + num_layers: int = 1 + resnet_groups: int = 32 + add_upsample: bool = True + dtype: jnp.dtype = jnp.float32 + + def setup(self): + resnets = [] + for i in range(self.num_layers): + in_channels = self.in_channels if i == 0 else self.out_channels + res_block = FlaxResnetBlock2D( + in_channels=in_channels, + out_channels=self.out_channels, + dropout=self.dropout, + groups=self.resnet_groups, + dtype=self.dtype, + ) + resnets.append(res_block) + + self.resnets = resnets + + if self.add_upsample: + self.upsamplers_0 = FlaxUpsample2D(self.out_channels, dtype=self.dtype) + + def __call__(self, hidden_states, deterministic=True): + for resnet in self.resnets: + hidden_states = resnet(hidden_states, deterministic=deterministic) + + if self.add_upsample: + hidden_states = self.upsamplers_0(hidden_states) + + return hidden_states + + +class FlaxUNetMidBlock2D(nn.Module): + r""" + Flax Unet Mid-Block module. 
+ + Parameters: + in_channels (:obj:`int`): + Input channels + dropout (:obj:`float`, *optional*, defaults to 0.0): + Dropout rate + num_layers (:obj:`int`, *optional*, defaults to 1): + Number of Resnet layer block + resnet_groups (:obj:`int`, *optional*, defaults to `32`): + The number of groups to use for the Resnet and Attention block group norm + num_attention_heads (:obj:`int`, *optional*, defaults to `1`): + Number of attention heads for each attention block + dtype (:obj:`jnp.dtype`, *optional*, defaults to jnp.float32): + Parameters `dtype` + """ + in_channels: int + dropout: float = 0.0 + num_layers: int = 1 + resnet_groups: int = 32 + num_attention_heads: int = 1 + dtype: jnp.dtype = jnp.float32 + + def setup(self): + resnet_groups = self.resnet_groups if self.resnet_groups is not None else min(self.in_channels // 4, 32) + + # there is always at least one resnet + resnets = [ + FlaxResnetBlock2D( + in_channels=self.in_channels, + out_channels=self.in_channels, + dropout=self.dropout, + groups=resnet_groups, + dtype=self.dtype, + ) + ] + + attentions = [] + + for _ in range(self.num_layers): + attn_block = FlaxAttentionBlock( + channels=self.in_channels, + num_head_channels=self.num_attention_heads, + num_groups=resnet_groups, + dtype=self.dtype, + ) + attentions.append(attn_block) + + res_block = FlaxResnetBlock2D( + in_channels=self.in_channels, + out_channels=self.in_channels, + dropout=self.dropout, + groups=resnet_groups, + dtype=self.dtype, + ) + resnets.append(res_block) + + self.resnets = resnets + self.attentions = attentions + + def __call__(self, hidden_states, deterministic=True): + hidden_states = self.resnets[0](hidden_states, deterministic=deterministic) + for attn, resnet in zip(self.attentions, self.resnets[1:]): + hidden_states = attn(hidden_states) + hidden_states = resnet(hidden_states, deterministic=deterministic) + + return hidden_states + + +class FlaxEncoder(nn.Module): + r""" + Flax Implementation of VAE Encoder. + + This model is a Flax Linen [flax.linen.Module](https://flax.readthedocs.io/en/latest/flax.linen.html#module) + subclass. Use it as a regular Flax linen Module and refer to the Flax documentation for all matter related to + general usage and behavior. 
+ + Finally, this model supports inherent JAX features such as: + - [Just-In-Time (JIT) compilation](https://jax.readthedocs.io/en/latest/jax.html#just-in-time-compilation-jit) + - [Automatic Differentiation](https://jax.readthedocs.io/en/latest/jax.html#automatic-differentiation) + - [Vectorization](https://jax.readthedocs.io/en/latest/jax.html#vectorization-vmap) + - [Parallelization](https://jax.readthedocs.io/en/latest/jax.html#parallelization-pmap) + + Parameters: + in_channels (:obj:`int`, *optional*, defaults to 3): + Input channels + out_channels (:obj:`int`, *optional*, defaults to 3): + Output channels + down_block_types (:obj:`Tuple[str]`, *optional*, defaults to `(DownEncoderBlock2D)`): + DownEncoder block type + block_out_channels (:obj:`Tuple[str]`, *optional*, defaults to `(64,)`): + Tuple containing the number of output channels for each block + layers_per_block (:obj:`int`, *optional*, defaults to `2`): + Number of Resnet layer for each block + norm_num_groups (:obj:`int`, *optional*, defaults to `32`): + norm num group + act_fn (:obj:`str`, *optional*, defaults to `silu`): + Activation function + double_z (:obj:`bool`, *optional*, defaults to `False`): + Whether to double the last output channels + dtype (:obj:`jnp.dtype`, *optional*, defaults to jnp.float32): + Parameters `dtype` + """ + in_channels: int = 3 + out_channels: int = 3 + down_block_types: Tuple[str] = ("DownEncoderBlock2D",) + block_out_channels: Tuple[int] = (64,) + layers_per_block: int = 2 + norm_num_groups: int = 32 + act_fn: str = "silu" + double_z: bool = False + dtype: jnp.dtype = jnp.float32 + + def setup(self): + block_out_channels = self.block_out_channels + # in + self.conv_in = nn.Conv( + block_out_channels[0], + kernel_size=(3, 3), + strides=(1, 1), + padding=((1, 1), (1, 1)), + dtype=self.dtype, + ) + + # downsampling + down_blocks = [] + output_channel = block_out_channels[0] + for i, _ in enumerate(self.down_block_types): + input_channel = output_channel + output_channel = block_out_channels[i] + is_final_block = i == len(block_out_channels) - 1 + + down_block = FlaxDownEncoderBlock2D( + in_channels=input_channel, + out_channels=output_channel, + num_layers=self.layers_per_block, + resnet_groups=self.norm_num_groups, + add_downsample=not is_final_block, + dtype=self.dtype, + ) + down_blocks.append(down_block) + self.down_blocks = down_blocks + + # middle + self.mid_block = FlaxUNetMidBlock2D( + in_channels=block_out_channels[-1], + resnet_groups=self.norm_num_groups, + num_attention_heads=None, + dtype=self.dtype, + ) + + # end + conv_out_channels = 2 * self.out_channels if self.double_z else self.out_channels + self.conv_norm_out = nn.GroupNorm(num_groups=self.norm_num_groups, epsilon=1e-6) + self.conv_out = nn.Conv( + conv_out_channels, + kernel_size=(3, 3), + strides=(1, 1), + padding=((1, 1), (1, 1)), + dtype=self.dtype, + ) + + def __call__(self, sample, deterministic: bool = True): + # in + sample = self.conv_in(sample) + + # downsampling + for block in self.down_blocks: + sample = block(sample, deterministic=deterministic) + + # middle + sample = self.mid_block(sample, deterministic=deterministic) + + # end + sample = self.conv_norm_out(sample) + sample = nn.swish(sample) + sample = self.conv_out(sample) + + return sample + + +class FlaxDecoder(nn.Module): + r""" + Flax Implementation of VAE Decoder. + + This model is a Flax Linen [flax.linen.Module](https://flax.readthedocs.io/en/latest/flax.linen.html#module) + subclass. 
Use it as a regular Flax linen Module and refer to the Flax documentation for all matter related to + general usage and behavior. + + Finally, this model supports inherent JAX features such as: + - [Just-In-Time (JIT) compilation](https://jax.readthedocs.io/en/latest/jax.html#just-in-time-compilation-jit) + - [Automatic Differentiation](https://jax.readthedocs.io/en/latest/jax.html#automatic-differentiation) + - [Vectorization](https://jax.readthedocs.io/en/latest/jax.html#vectorization-vmap) + - [Parallelization](https://jax.readthedocs.io/en/latest/jax.html#parallelization-pmap) + + Parameters: + in_channels (:obj:`int`, *optional*, defaults to 3): + Input channels + out_channels (:obj:`int`, *optional*, defaults to 3): + Output channels + up_block_types (:obj:`Tuple[str]`, *optional*, defaults to `(UpDecoderBlock2D)`): + UpDecoder block type + block_out_channels (:obj:`Tuple[str]`, *optional*, defaults to `(64,)`): + Tuple containing the number of output channels for each block + layers_per_block (:obj:`int`, *optional*, defaults to `2`): + Number of Resnet layer for each block + norm_num_groups (:obj:`int`, *optional*, defaults to `32`): + norm num group + act_fn (:obj:`str`, *optional*, defaults to `silu`): + Activation function + double_z (:obj:`bool`, *optional*, defaults to `False`): + Whether to double the last output channels + dtype (:obj:`jnp.dtype`, *optional*, defaults to jnp.float32): + parameters `dtype` + """ + in_channels: int = 3 + out_channels: int = 3 + up_block_types: Tuple[str] = ("UpDecoderBlock2D",) + block_out_channels: int = (64,) + layers_per_block: int = 2 + norm_num_groups: int = 32 + act_fn: str = "silu" + dtype: jnp.dtype = jnp.float32 + + def setup(self): + block_out_channels = self.block_out_channels + + # z to block_in + self.conv_in = nn.Conv( + block_out_channels[-1], + kernel_size=(3, 3), + strides=(1, 1), + padding=((1, 1), (1, 1)), + dtype=self.dtype, + ) + + # middle + self.mid_block = FlaxUNetMidBlock2D( + in_channels=block_out_channels[-1], + resnet_groups=self.norm_num_groups, + num_attention_heads=None, + dtype=self.dtype, + ) + + # upsampling + reversed_block_out_channels = list(reversed(block_out_channels)) + output_channel = reversed_block_out_channels[0] + up_blocks = [] + for i, _ in enumerate(self.up_block_types): + prev_output_channel = output_channel + output_channel = reversed_block_out_channels[i] + + is_final_block = i == len(block_out_channels) - 1 + + up_block = FlaxUpDecoderBlock2D( + in_channels=prev_output_channel, + out_channels=output_channel, + num_layers=self.layers_per_block + 1, + resnet_groups=self.norm_num_groups, + add_upsample=not is_final_block, + dtype=self.dtype, + ) + up_blocks.append(up_block) + prev_output_channel = output_channel + + self.up_blocks = up_blocks + + # end + self.conv_norm_out = nn.GroupNorm(num_groups=self.norm_num_groups, epsilon=1e-6) + self.conv_out = nn.Conv( + self.out_channels, + kernel_size=(3, 3), + strides=(1, 1), + padding=((1, 1), (1, 1)), + dtype=self.dtype, + ) + + def __call__(self, sample, deterministic: bool = True): + # z to block_in + sample = self.conv_in(sample) + + # middle + sample = self.mid_block(sample, deterministic=deterministic) + + # upsampling + for block in self.up_blocks: + sample = block(sample, deterministic=deterministic) + + sample = self.conv_norm_out(sample) + sample = nn.swish(sample) + sample = self.conv_out(sample) + + return sample + + +class FlaxDiagonalGaussianDistribution(object): + def __init__(self, parameters, deterministic=False): + # Last axis to 
account for channels-last + self.mean, self.logvar = jnp.split(parameters, 2, axis=-1) + self.logvar = jnp.clip(self.logvar, -30.0, 20.0) + self.deterministic = deterministic + self.std = jnp.exp(0.5 * self.logvar) + self.var = jnp.exp(self.logvar) + if self.deterministic: + self.var = self.std = jnp.zeros_like(self.mean) + + def sample(self, key): + return self.mean + self.std * jax.random.normal(key, self.mean.shape) + + def kl(self, other=None): + if self.deterministic: + return jnp.array([0.0]) + + if other is None: + return 0.5 * jnp.sum(self.mean**2 + self.var - 1.0 - self.logvar, axis=[1, 2, 3]) + + return 0.5 * jnp.sum( + jnp.square(self.mean - other.mean) / other.var + self.var / other.var - 1.0 - self.logvar + other.logvar, + axis=[1, 2, 3], + ) + + def nll(self, sample, axis=[1, 2, 3]): + if self.deterministic: + return jnp.array([0.0]) + + logtwopi = jnp.log(2.0 * jnp.pi) + return 0.5 * jnp.sum(logtwopi + self.logvar + jnp.square(sample - self.mean) / self.var, axis=axis) + + def mode(self): + return self.mean + + +@flax_register_to_config +class FlaxAutoencoderKL(nn.Module, FlaxModelMixin, ConfigMixin): + r""" + Flax implementation of a VAE model with KL loss for decoding latent representations. + + This model inherits from [`FlaxModelMixin`]. Check the superclass documentation for it's generic methods + implemented for all models (such as downloading or saving). + + This model is a Flax Linen [flax.linen.Module](https://flax.readthedocs.io/en/latest/flax.linen.html#module) + subclass. Use it as a regular Flax Linen module and refer to the Flax documentation for all matter related to its + general usage and behavior. + + Inherent JAX features such as the following are supported: + + - [Just-In-Time (JIT) compilation](https://jax.readthedocs.io/en/latest/jax.html#just-in-time-compilation-jit) + - [Automatic Differentiation](https://jax.readthedocs.io/en/latest/jax.html#automatic-differentiation) + - [Vectorization](https://jax.readthedocs.io/en/latest/jax.html#vectorization-vmap) + - [Parallelization](https://jax.readthedocs.io/en/latest/jax.html#parallelization-pmap) + + Parameters: + in_channels (`int`, *optional*, defaults to 3): + Number of channels in the input image. + out_channels (`int`, *optional*, defaults to 3): + Number of channels in the output. + down_block_types (`Tuple[str]`, *optional*, defaults to `(DownEncoderBlock2D)`): + Tuple of downsample block types. + up_block_types (`Tuple[str]`, *optional*, defaults to `(UpDecoderBlock2D)`): + Tuple of upsample block types. + block_out_channels (`Tuple[str]`, *optional*, defaults to `(64,)`): + Tuple of block output channels. + layers_per_block (`int`, *optional*, defaults to `2`): + Number of ResNet layer for each block. + act_fn (`str`, *optional*, defaults to `silu`): + The activation function to use. + latent_channels (`int`, *optional*, defaults to `4`): + Number of channels in the latent space. + norm_num_groups (`int`, *optional*, defaults to `32`): + The number of groups for normalization. + sample_size (`int`, *optional*, defaults to 32): + Sample input size. + scaling_factor (`float`, *optional*, defaults to 0.18215): + The component-wise standard deviation of the trained latent space computed using the first batch of the + training set. This is used to scale the latent space to have unit variance when training the diffusion + model. The latents are scaled with the formula `z = z * scaling_factor` before being passed to the + diffusion model. 
When decoding, the latents are scaled back to the original scale with the formula: `z = 1 + / scaling_factor * z`. For more details, refer to sections 4.3.2 and D.1 of the [High-Resolution Image + Synthesis with Latent Diffusion Models](https://arxiv.org/abs/2112.10752) paper. + dtype (`jnp.dtype`, *optional*, defaults to `jnp.float32`): + The `dtype` of the parameters. + """ + in_channels: int = 3 + out_channels: int = 3 + down_block_types: Tuple[str] = ("DownEncoderBlock2D",) + up_block_types: Tuple[str] = ("UpDecoderBlock2D",) + block_out_channels: Tuple[int] = (64,) + layers_per_block: int = 1 + act_fn: str = "silu" + latent_channels: int = 4 + norm_num_groups: int = 32 + sample_size: int = 32 + scaling_factor: float = 0.18215 + dtype: jnp.dtype = jnp.float32 + + def setup(self): + self.encoder = FlaxEncoder( + in_channels=self.config.in_channels, + out_channels=self.config.latent_channels, + down_block_types=self.config.down_block_types, + block_out_channels=self.config.block_out_channels, + layers_per_block=self.config.layers_per_block, + act_fn=self.config.act_fn, + norm_num_groups=self.config.norm_num_groups, + double_z=True, + dtype=self.dtype, + ) + self.decoder = FlaxDecoder( + in_channels=self.config.latent_channels, + out_channels=self.config.out_channels, + up_block_types=self.config.up_block_types, + block_out_channels=self.config.block_out_channels, + layers_per_block=self.config.layers_per_block, + norm_num_groups=self.config.norm_num_groups, + act_fn=self.config.act_fn, + dtype=self.dtype, + ) + self.quant_conv = nn.Conv( + 2 * self.config.latent_channels, + kernel_size=(1, 1), + strides=(1, 1), + padding="VALID", + dtype=self.dtype, + ) + self.post_quant_conv = nn.Conv( + self.config.latent_channels, + kernel_size=(1, 1), + strides=(1, 1), + padding="VALID", + dtype=self.dtype, + ) + + def init_weights(self, rng: jax.random.KeyArray) -> FrozenDict: + # init input tensors + sample_shape = (1, self.in_channels, self.sample_size, self.sample_size) + sample = jnp.zeros(sample_shape, dtype=jnp.float32) + + params_rng, dropout_rng, gaussian_rng = jax.random.split(rng, 3) + rngs = {"params": params_rng, "dropout": dropout_rng, "gaussian": gaussian_rng} + + return self.init(rngs, sample)["params"] + + def encode(self, sample, deterministic: bool = True, return_dict: bool = True): + sample = jnp.transpose(sample, (0, 2, 3, 1)) + + hidden_states = self.encoder(sample, deterministic=deterministic) + moments = self.quant_conv(hidden_states) + posterior = FlaxDiagonalGaussianDistribution(moments) + + if not return_dict: + return (posterior,) + + return FlaxAutoencoderKLOutput(latent_dist=posterior) + + def decode(self, latents, deterministic: bool = True, return_dict: bool = True): + if latents.shape[-1] != self.config.latent_channels: + latents = jnp.transpose(latents, (0, 2, 3, 1)) + + hidden_states = self.post_quant_conv(latents) + hidden_states = self.decoder(hidden_states, deterministic=deterministic) + + hidden_states = jnp.transpose(hidden_states, (0, 3, 1, 2)) + + if not return_dict: + return (hidden_states,) + + return FlaxDecoderOutput(sample=hidden_states) + + def __call__(self, sample, sample_posterior=False, deterministic: bool = True, return_dict: bool = True): + posterior = self.encode(sample, deterministic=deterministic, return_dict=return_dict) + if sample_posterior: + rng = self.make_rng("gaussian") + hidden_states = posterior.latent_dist.sample(rng) + else: + hidden_states = posterior.latent_dist.mode() + + sample = self.decode(hidden_states, 
return_dict=return_dict).sample + + if not return_dict: + return (sample,) + + return FlaxDecoderOutput(sample=sample) diff --git a/gf/sd-webui-rich-text/diffusers_official/models/vq_model.py b/gf/sd-webui-rich-text/diffusers_official/models/vq_model.py new file mode 100644 index 0000000000000000000000000000000000000000..687449e8c7557473c0af994b30ef4c7dfba9718c --- /dev/null +++ b/gf/sd-webui-rich-text/diffusers_official/models/vq_model.py @@ -0,0 +1,167 @@ +# Copyright 2023 The HuggingFace Team. All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +from dataclasses import dataclass +from typing import Optional, Tuple, Union + +import torch +import torch.nn as nn + +from ..configuration_utils import ConfigMixin, register_to_config +from ..utils import BaseOutput, apply_forward_hook +from .modeling_utils import ModelMixin +from .vae import Decoder, DecoderOutput, Encoder, VectorQuantizer + + +@dataclass +class VQEncoderOutput(BaseOutput): + """ + Output of VQModel encoding method. + + Args: + latents (`torch.FloatTensor` of shape `(batch_size, num_channels, height, width)`): + The encoded output sample from the last layer of the model. + """ + + latents: torch.FloatTensor + + +class VQModel(ModelMixin, ConfigMixin): + r""" + A VQ-VAE model for decoding latent representations. + + This model inherits from [`ModelMixin`]. Check the superclass documentation for it's generic methods implemented + for all models (such as downloading or saving). + + Parameters: + in_channels (int, *optional*, defaults to 3): Number of channels in the input image. + out_channels (int, *optional*, defaults to 3): Number of channels in the output. + down_block_types (`Tuple[str]`, *optional*, defaults to `("DownEncoderBlock2D",)`): + Tuple of downsample block types. + up_block_types (`Tuple[str]`, *optional*, defaults to `("UpDecoderBlock2D",)`): + Tuple of upsample block types. + block_out_channels (`Tuple[int]`, *optional*, defaults to `(64,)`): + Tuple of block output channels. + act_fn (`str`, *optional*, defaults to `"silu"`): The activation function to use. + latent_channels (`int`, *optional*, defaults to `3`): Number of channels in the latent space. + sample_size (`int`, *optional*, defaults to `32`): Sample input size. + num_vq_embeddings (`int`, *optional*, defaults to `256`): Number of codebook vectors in the VQ-VAE. + vq_embed_dim (`int`, *optional*): Hidden dim of codebook vectors in the VQ-VAE. + scaling_factor (`float`, *optional*, defaults to `0.18215`): + The component-wise standard deviation of the trained latent space computed using the first batch of the + training set. This is used to scale the latent space to have unit variance when training the diffusion + model. The latents are scaled with the formula `z = z * scaling_factor` before being passed to the + diffusion model. When decoding, the latents are scaled back to the original scale with the formula: `z = 1 + / scaling_factor * z`. 
For more details, refer to sections 4.3.2 and D.1 of the [High-Resolution Image + Synthesis with Latent Diffusion Models](https://arxiv.org/abs/2112.10752) paper. + """ + + @register_to_config + def __init__( + self, + in_channels: int = 3, + out_channels: int = 3, + down_block_types: Tuple[str] = ("DownEncoderBlock2D",), + up_block_types: Tuple[str] = ("UpDecoderBlock2D",), + block_out_channels: Tuple[int] = (64,), + layers_per_block: int = 1, + act_fn: str = "silu", + latent_channels: int = 3, + sample_size: int = 32, + num_vq_embeddings: int = 256, + norm_num_groups: int = 32, + vq_embed_dim: Optional[int] = None, + scaling_factor: float = 0.18215, + norm_type: str = "group", # group, spatial + ): + super().__init__() + + # pass init params to Encoder + self.encoder = Encoder( + in_channels=in_channels, + out_channels=latent_channels, + down_block_types=down_block_types, + block_out_channels=block_out_channels, + layers_per_block=layers_per_block, + act_fn=act_fn, + norm_num_groups=norm_num_groups, + double_z=False, + ) + + vq_embed_dim = vq_embed_dim if vq_embed_dim is not None else latent_channels + + self.quant_conv = nn.Conv2d(latent_channels, vq_embed_dim, 1) + self.quantize = VectorQuantizer(num_vq_embeddings, vq_embed_dim, beta=0.25, remap=None, sane_index_shape=False) + self.post_quant_conv = nn.Conv2d(vq_embed_dim, latent_channels, 1) + + # pass init params to Decoder + self.decoder = Decoder( + in_channels=latent_channels, + out_channels=out_channels, + up_block_types=up_block_types, + block_out_channels=block_out_channels, + layers_per_block=layers_per_block, + act_fn=act_fn, + norm_num_groups=norm_num_groups, + norm_type=norm_type, + ) + + @apply_forward_hook + def encode(self, x: torch.FloatTensor, return_dict: bool = True) -> VQEncoderOutput: + h = self.encoder(x) + h = self.quant_conv(h) + + if not return_dict: + return (h,) + + return VQEncoderOutput(latents=h) + + @apply_forward_hook + def decode( + self, h: torch.FloatTensor, force_not_quantize: bool = False, return_dict: bool = True + ) -> Union[DecoderOutput, torch.FloatTensor]: + # also go through quantization layer + if not force_not_quantize: + quant, emb_loss, info = self.quantize(h) + else: + quant = h + quant2 = self.post_quant_conv(quant) + dec = self.decoder(quant2, quant if self.config.norm_type == "spatial" else None) + + if not return_dict: + return (dec,) + + return DecoderOutput(sample=dec) + + def forward(self, sample: torch.FloatTensor, return_dict: bool = True) -> Union[DecoderOutput, torch.FloatTensor]: + r""" + The [`VQModel`] forward method. + + Args: + sample (`torch.FloatTensor`): Input sample. + return_dict (`bool`, *optional*, defaults to `True`): + Whether or not to return a [`models.vq_model.VQEncoderOutput`] instead of a plain tuple. + + Returns: + [`~models.vq_model.VQEncoderOutput`] or `tuple`: + If return_dict is True, a [`~models.vq_model.VQEncoderOutput`] is returned, otherwise a plain `tuple` + is returned. + """ + x = sample + h = self.encode(x).latents + dec = self.decode(h).sample + + if not return_dict: + return (dec,) + + return DecoderOutput(sample=dec) diff --git a/gf/sd-webui-rich-text/diffusers_official/optimization.py b/gf/sd-webui-rich-text/diffusers_official/optimization.py new file mode 100644 index 0000000000000000000000000000000000000000..46e6125a0f5565b80ced30dfc147f8168ef35a5c --- /dev/null +++ b/gf/sd-webui-rich-text/diffusers_official/optimization.py @@ -0,0 +1,354 @@ +# coding=utf-8 +# Copyright 2023 The HuggingFace Inc. team. 
+# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +"""PyTorch optimization for diffusion models.""" + +import math +from enum import Enum +from typing import Optional, Union + +from torch.optim import Optimizer +from torch.optim.lr_scheduler import LambdaLR + +from .utils import logging + + +logger = logging.get_logger(__name__) + + +class SchedulerType(Enum): + LINEAR = "linear" + COSINE = "cosine" + COSINE_WITH_RESTARTS = "cosine_with_restarts" + POLYNOMIAL = "polynomial" + CONSTANT = "constant" + CONSTANT_WITH_WARMUP = "constant_with_warmup" + PIECEWISE_CONSTANT = "piecewise_constant" + + +def get_constant_schedule(optimizer: Optimizer, last_epoch: int = -1): + """ + Create a schedule with a constant learning rate, using the learning rate set in optimizer. + + Args: + optimizer ([`~torch.optim.Optimizer`]): + The optimizer for which to schedule the learning rate. + last_epoch (`int`, *optional*, defaults to -1): + The index of the last epoch when resuming training. + + Return: + `torch.optim.lr_scheduler.LambdaLR` with the appropriate schedule. + """ + return LambdaLR(optimizer, lambda _: 1, last_epoch=last_epoch) + + +def get_constant_schedule_with_warmup(optimizer: Optimizer, num_warmup_steps: int, last_epoch: int = -1): + """ + Create a schedule with a constant learning rate preceded by a warmup period during which the learning rate + increases linearly between 0 and the initial lr set in the optimizer. + + Args: + optimizer ([`~torch.optim.Optimizer`]): + The optimizer for which to schedule the learning rate. + num_warmup_steps (`int`): + The number of steps for the warmup phase. + last_epoch (`int`, *optional*, defaults to -1): + The index of the last epoch when resuming training. + + Return: + `torch.optim.lr_scheduler.LambdaLR` with the appropriate schedule. + """ + + def lr_lambda(current_step: int): + if current_step < num_warmup_steps: + return float(current_step) / float(max(1.0, num_warmup_steps)) + return 1.0 + + return LambdaLR(optimizer, lr_lambda, last_epoch=last_epoch) + + +def get_piecewise_constant_schedule(optimizer: Optimizer, step_rules: str, last_epoch: int = -1): + """ + Create a schedule with a constant learning rate, using the learning rate set in optimizer. + + Args: + optimizer ([`~torch.optim.Optimizer`]): + The optimizer for which to schedule the learning rate. + step_rules (`string`): + The rules for the learning rate. ex: rule_steps="1:10,0.1:20,0.01:30,0.005" it means that the learning rate + if multiple 1 for the first 10 steps, mutiple 0.1 for the next 20 steps, multiple 0.01 for the next 30 + steps and multiple 0.005 for the other steps. + last_epoch (`int`, *optional*, defaults to -1): + The index of the last epoch when resuming training. + + Return: + `torch.optim.lr_scheduler.LambdaLR` with the appropriate schedule. 
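+
+    Example showing how the multipliers in `step_rules` are applied (an illustrative sketch, not part of the
+    original module: the dummy optimizer is a placeholder and the import path assumes the upstream `diffusers`
+    package rather than this vendored copy):
+
+    ```py
+    >>> import torch
+
+    >>> from diffusers.optimization import get_piecewise_constant_schedule
+
+    >>> optimizer = torch.optim.SGD([torch.nn.Parameter(torch.zeros(1))], lr=0.1)
+    >>> # multiplier 1.0 until step 10, then 0.1 until step 20, 0.01 until step 30, and 0.005 afterwards
+    >>> scheduler = get_piecewise_constant_schedule(optimizer, step_rules="1:10,0.1:20,0.01:30,0.005")
+    >>> for _ in range(40):
+    ...     optimizer.step()
+    ...     scheduler.step()
+    ```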
+ """ + + rules_dict = {} + rule_list = step_rules.split(",") + for rule_str in rule_list[:-1]: + value_str, steps_str = rule_str.split(":") + steps = int(steps_str) + value = float(value_str) + rules_dict[steps] = value + last_lr_multiple = float(rule_list[-1]) + + def create_rules_function(rules_dict, last_lr_multiple): + def rule_func(steps: int) -> float: + sorted_steps = sorted(rules_dict.keys()) + for i, sorted_step in enumerate(sorted_steps): + if steps < sorted_step: + return rules_dict[sorted_steps[i]] + return last_lr_multiple + + return rule_func + + rules_func = create_rules_function(rules_dict, last_lr_multiple) + + return LambdaLR(optimizer, rules_func, last_epoch=last_epoch) + + +def get_linear_schedule_with_warmup(optimizer, num_warmup_steps, num_training_steps, last_epoch=-1): + """ + Create a schedule with a learning rate that decreases linearly from the initial lr set in the optimizer to 0, after + a warmup period during which it increases linearly from 0 to the initial lr set in the optimizer. + + Args: + optimizer ([`~torch.optim.Optimizer`]): + The optimizer for which to schedule the learning rate. + num_warmup_steps (`int`): + The number of steps for the warmup phase. + num_training_steps (`int`): + The total number of training steps. + last_epoch (`int`, *optional*, defaults to -1): + The index of the last epoch when resuming training. + + Return: + `torch.optim.lr_scheduler.LambdaLR` with the appropriate schedule. + """ + + def lr_lambda(current_step: int): + if current_step < num_warmup_steps: + return float(current_step) / float(max(1, num_warmup_steps)) + return max( + 0.0, float(num_training_steps - current_step) / float(max(1, num_training_steps - num_warmup_steps)) + ) + + return LambdaLR(optimizer, lr_lambda, last_epoch) + + +def get_cosine_schedule_with_warmup( + optimizer: Optimizer, num_warmup_steps: int, num_training_steps: int, num_cycles: float = 0.5, last_epoch: int = -1 +): + """ + Create a schedule with a learning rate that decreases following the values of the cosine function between the + initial lr set in the optimizer to 0, after a warmup period during which it increases linearly between 0 and the + initial lr set in the optimizer. + + Args: + optimizer ([`~torch.optim.Optimizer`]): + The optimizer for which to schedule the learning rate. + num_warmup_steps (`int`): + The number of steps for the warmup phase. + num_training_steps (`int`): + The total number of training steps. + num_periods (`float`, *optional*, defaults to 0.5): + The number of periods of the cosine function in a schedule (the default is to just decrease from the max + value to 0 following a half-cosine). + last_epoch (`int`, *optional*, defaults to -1): + The index of the last epoch when resuming training. + + Return: + `torch.optim.lr_scheduler.LambdaLR` with the appropriate schedule. 
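+
+    Example showing warmup followed by cosine decay (an illustrative sketch: the single dummy parameter is a
+    placeholder and the import path assumes the upstream `diffusers` package rather than this vendored copy):
+
+    ```py
+    >>> import torch
+
+    >>> from diffusers.optimization import get_cosine_schedule_with_warmup
+
+    >>> optimizer = torch.optim.AdamW([torch.nn.Parameter(torch.zeros(1))], lr=1e-3)
+    >>> # learning rate rises linearly for 50 steps, then follows a half-cosine down to 0 at step 500
+    >>> scheduler = get_cosine_schedule_with_warmup(optimizer, num_warmup_steps=50, num_training_steps=500)
+    >>> for _ in range(500):
+    ...     optimizer.step()
+    ...     scheduler.step()
+    ```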
+ """ + + def lr_lambda(current_step): + if current_step < num_warmup_steps: + return float(current_step) / float(max(1, num_warmup_steps)) + progress = float(current_step - num_warmup_steps) / float(max(1, num_training_steps - num_warmup_steps)) + return max(0.0, 0.5 * (1.0 + math.cos(math.pi * float(num_cycles) * 2.0 * progress))) + + return LambdaLR(optimizer, lr_lambda, last_epoch) + + +def get_cosine_with_hard_restarts_schedule_with_warmup( + optimizer: Optimizer, num_warmup_steps: int, num_training_steps: int, num_cycles: int = 1, last_epoch: int = -1 +): + """ + Create a schedule with a learning rate that decreases following the values of the cosine function between the + initial lr set in the optimizer to 0, with several hard restarts, after a warmup period during which it increases + linearly between 0 and the initial lr set in the optimizer. + + Args: + optimizer ([`~torch.optim.Optimizer`]): + The optimizer for which to schedule the learning rate. + num_warmup_steps (`int`): + The number of steps for the warmup phase. + num_training_steps (`int`): + The total number of training steps. + num_cycles (`int`, *optional*, defaults to 1): + The number of hard restarts to use. + last_epoch (`int`, *optional*, defaults to -1): + The index of the last epoch when resuming training. + + Return: + `torch.optim.lr_scheduler.LambdaLR` with the appropriate schedule. + """ + + def lr_lambda(current_step): + if current_step < num_warmup_steps: + return float(current_step) / float(max(1, num_warmup_steps)) + progress = float(current_step - num_warmup_steps) / float(max(1, num_training_steps - num_warmup_steps)) + if progress >= 1.0: + return 0.0 + return max(0.0, 0.5 * (1.0 + math.cos(math.pi * ((float(num_cycles) * progress) % 1.0)))) + + return LambdaLR(optimizer, lr_lambda, last_epoch) + + +def get_polynomial_decay_schedule_with_warmup( + optimizer, num_warmup_steps, num_training_steps, lr_end=1e-7, power=1.0, last_epoch=-1 +): + """ + Create a schedule with a learning rate that decreases as a polynomial decay from the initial lr set in the + optimizer to end lr defined by *lr_end*, after a warmup period during which it increases linearly from 0 to the + initial lr set in the optimizer. + + Args: + optimizer ([`~torch.optim.Optimizer`]): + The optimizer for which to schedule the learning rate. + num_warmup_steps (`int`): + The number of steps for the warmup phase. + num_training_steps (`int`): + The total number of training steps. + lr_end (`float`, *optional*, defaults to 1e-7): + The end LR. + power (`float`, *optional*, defaults to 1.0): + Power factor. + last_epoch (`int`, *optional*, defaults to -1): + The index of the last epoch when resuming training. + + Note: *power* defaults to 1.0 as in the fairseq implementation, which in turn is based on the original BERT + implementation at + https://github.com/google-research/bert/blob/f39e881b169b9d53bea03d2d341b31707a6c052b/optimization.py#L37 + + Return: + `torch.optim.lr_scheduler.LambdaLR` with the appropriate schedule. 
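+
+    Example showing the effect of `lr_end` and `power` (an illustrative sketch: the dummy optimizer is a
+    placeholder and the import path assumes the upstream `diffusers` package rather than this vendored copy):
+
+    ```py
+    >>> import torch
+
+    >>> from diffusers.optimization import get_polynomial_decay_schedule_with_warmup
+
+    >>> optimizer = torch.optim.AdamW([torch.nn.Parameter(torch.zeros(1))], lr=1e-3)
+    >>> # 10 warmup steps, then a quadratic (power=2.0) decay from 1e-3 down to lr_end=1e-6 at step 100
+    >>> scheduler = get_polynomial_decay_schedule_with_warmup(
+    ...     optimizer, num_warmup_steps=10, num_training_steps=100, lr_end=1e-6, power=2.0
+    ... )
+    >>> for _ in range(100):
+    ...     optimizer.step()
+    ...     scheduler.step()
+    ```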
+ + """ + + lr_init = optimizer.defaults["lr"] + if not (lr_init > lr_end): + raise ValueError(f"lr_end ({lr_end}) must be be smaller than initial lr ({lr_init})") + + def lr_lambda(current_step: int): + if current_step < num_warmup_steps: + return float(current_step) / float(max(1, num_warmup_steps)) + elif current_step > num_training_steps: + return lr_end / lr_init # as LambdaLR multiplies by lr_init + else: + lr_range = lr_init - lr_end + decay_steps = num_training_steps - num_warmup_steps + pct_remaining = 1 - (current_step - num_warmup_steps) / decay_steps + decay = lr_range * pct_remaining**power + lr_end + return decay / lr_init # as LambdaLR multiplies by lr_init + + return LambdaLR(optimizer, lr_lambda, last_epoch) + + +TYPE_TO_SCHEDULER_FUNCTION = { + SchedulerType.LINEAR: get_linear_schedule_with_warmup, + SchedulerType.COSINE: get_cosine_schedule_with_warmup, + SchedulerType.COSINE_WITH_RESTARTS: get_cosine_with_hard_restarts_schedule_with_warmup, + SchedulerType.POLYNOMIAL: get_polynomial_decay_schedule_with_warmup, + SchedulerType.CONSTANT: get_constant_schedule, + SchedulerType.CONSTANT_WITH_WARMUP: get_constant_schedule_with_warmup, + SchedulerType.PIECEWISE_CONSTANT: get_piecewise_constant_schedule, +} + + +def get_scheduler( + name: Union[str, SchedulerType], + optimizer: Optimizer, + step_rules: Optional[str] = None, + num_warmup_steps: Optional[int] = None, + num_training_steps: Optional[int] = None, + num_cycles: int = 1, + power: float = 1.0, + last_epoch: int = -1, +): + """ + Unified API to get any scheduler from its name. + + Args: + name (`str` or `SchedulerType`): + The name of the scheduler to use. + optimizer (`torch.optim.Optimizer`): + The optimizer that will be used during training. + step_rules (`str`, *optional*): + A string representing the step rules to use. This is only used by the `PIECEWISE_CONSTANT` scheduler. + num_warmup_steps (`int`, *optional*): + The number of warmup steps to do. This is not required by all schedulers (hence the argument being + optional), the function will raise an error if it's unset and the scheduler type requires it. + num_training_steps (`int``, *optional*): + The number of training steps to do. This is not required by all schedulers (hence the argument being + optional), the function will raise an error if it's unset and the scheduler type requires it. + num_cycles (`int`, *optional*): + The number of hard restarts used in `COSINE_WITH_RESTARTS` scheduler. + power (`float`, *optional*, defaults to 1.0): + Power factor. See `POLYNOMIAL` scheduler + last_epoch (`int`, *optional*, defaults to -1): + The index of the last epoch when resuming training. 
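+
+    Example showing how a scheduler is resolved by name (an illustrative sketch: the tiny linear model is a
+    placeholder and the import path assumes the upstream `diffusers` package rather than this vendored copy):
+
+    ```py
+    >>> import torch
+
+    >>> from diffusers.optimization import get_scheduler
+
+    >>> model = torch.nn.Linear(8, 8)
+    >>> optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
+    >>> # "cosine" maps to get_cosine_schedule_with_warmup via TYPE_TO_SCHEDULER_FUNCTION
+    >>> lr_scheduler = get_scheduler(
+    ...     "cosine", optimizer=optimizer, num_warmup_steps=100, num_training_steps=1000
+    ... )
+    >>> for _ in range(1000):
+    ...     optimizer.step()
+    ...     lr_scheduler.step()
+    ```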
+ """ + name = SchedulerType(name) + schedule_func = TYPE_TO_SCHEDULER_FUNCTION[name] + if name == SchedulerType.CONSTANT: + return schedule_func(optimizer, last_epoch=last_epoch) + + if name == SchedulerType.PIECEWISE_CONSTANT: + return schedule_func(optimizer, step_rules=step_rules, last_epoch=last_epoch) + + # All other schedulers require `num_warmup_steps` + if num_warmup_steps is None: + raise ValueError(f"{name} requires `num_warmup_steps`, please provide that argument.") + + if name == SchedulerType.CONSTANT_WITH_WARMUP: + return schedule_func(optimizer, num_warmup_steps=num_warmup_steps, last_epoch=last_epoch) + + # All other schedulers require `num_training_steps` + if num_training_steps is None: + raise ValueError(f"{name} requires `num_training_steps`, please provide that argument.") + + if name == SchedulerType.COSINE_WITH_RESTARTS: + return schedule_func( + optimizer, + num_warmup_steps=num_warmup_steps, + num_training_steps=num_training_steps, + num_cycles=num_cycles, + last_epoch=last_epoch, + ) + + if name == SchedulerType.POLYNOMIAL: + return schedule_func( + optimizer, + num_warmup_steps=num_warmup_steps, + num_training_steps=num_training_steps, + power=power, + last_epoch=last_epoch, + ) + + return schedule_func( + optimizer, num_warmup_steps=num_warmup_steps, num_training_steps=num_training_steps, last_epoch=last_epoch + ) diff --git a/gf/sd-webui-rich-text/diffusers_official/pipeline_utils.py b/gf/sd-webui-rich-text/diffusers_official/pipeline_utils.py new file mode 100644 index 0000000000000000000000000000000000000000..87709d5f616cdfb195ed4527e4b630a86136c29c --- /dev/null +++ b/gf/sd-webui-rich-text/diffusers_official/pipeline_utils.py @@ -0,0 +1,29 @@ +# Copyright 2023 The HuggingFace Team. All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and + +# limitations under the License. + +# NOTE: This file is deprecated and will be removed in a future version. +# It only exists so that temporarely `from diffusers.pipelines import DiffusionPipeline` works + +from .pipelines import DiffusionPipeline, ImagePipelineOutput # noqa: F401 +from .utils import deprecate + + +deprecate( + "pipelines_utils", + "0.22.0", + "Importing `DiffusionPipeline` or `ImagePipelineOutput` from diffusers.pipeline_utils is deprecated. 
Please import from diffusers.pipelines.pipeline_utils instead.", + standard_warn=False, + stacklevel=3, +) diff --git a/gf/sd-webui-rich-text/diffusers_official/pipelines/__init__.py b/gf/sd-webui-rich-text/diffusers_official/pipelines/__init__.py new file mode 100644 index 0000000000000000000000000000000000000000..d13e9823192afeb1c61a321229e203f6ab81fd8d --- /dev/null +++ b/gf/sd-webui-rich-text/diffusers_official/pipelines/__init__.py @@ -0,0 +1,83 @@ +from ..utils import ( + OptionalDependencyNotAvailable, + is_flax_available, + is_invisible_watermark_available, + is_k_diffusion_available, + is_librosa_available, + is_note_seq_available, + is_onnx_available, + is_torch_available, + is_transformers_available, +) + + +try: + if not is_torch_available(): + raise OptionalDependencyNotAvailable() +except OptionalDependencyNotAvailable: + from ..utils.dummy_pt_objects import * # noqa F403 +else: + from .pipeline_utils import AudioPipelineOutput, DiffusionPipeline, ImagePipelineOutput + +try: + if not (is_torch_available() and is_librosa_available()): + raise OptionalDependencyNotAvailable() +except OptionalDependencyNotAvailable: + from ..utils.dummy_torch_and_librosa_objects import * # noqa F403 + +try: + if not (is_torch_available() and is_transformers_available()): + raise OptionalDependencyNotAvailable() +except OptionalDependencyNotAvailable: + from ..utils.dummy_torch_and_transformers_objects import * # noqa F403 + + +try: + if not (is_torch_available() and is_transformers_available() and is_invisible_watermark_available()): + raise OptionalDependencyNotAvailable() +except OptionalDependencyNotAvailable: + from ..utils.dummy_torch_and_transformers_and_invisible_watermark_objects import * # noqa F403 +else: + from .stable_diffusion_xl import StableDiffusionXLImg2ImgPipeline, StableDiffusionXLPipeline + +try: + if not is_onnx_available(): + raise OptionalDependencyNotAvailable() +except OptionalDependencyNotAvailable: + from ..utils.dummy_onnx_objects import * # noqa F403 +else: + from .onnx_utils import OnnxRuntimeModel + +try: + if not (is_torch_available() and is_transformers_available() and is_onnx_available()): + raise OptionalDependencyNotAvailable() +except OptionalDependencyNotAvailable: + from ..utils.dummy_torch_and_transformers_and_onnx_objects import * # noqa F403 + + +try: + if not (is_torch_available() and is_transformers_available() and is_k_diffusion_available()): + raise OptionalDependencyNotAvailable() +except OptionalDependencyNotAvailable: + from ..utils.dummy_torch_and_transformers_and_k_diffusion_objects import * # noqa F403 + +try: + if not is_flax_available(): + raise OptionalDependencyNotAvailable() +except OptionalDependencyNotAvailable: + from ..utils.dummy_flax_objects import * # noqa F403 +else: + from .pipeline_flax_utils import FlaxDiffusionPipeline + + +try: + if not (is_flax_available() and is_transformers_available()): + raise OptionalDependencyNotAvailable() +except OptionalDependencyNotAvailable: + from ..utils.dummy_flax_and_transformers_objects import * # noqa F403 + +try: + if not (is_transformers_available() and is_torch_available() and is_note_seq_available()): + raise OptionalDependencyNotAvailable() +except OptionalDependencyNotAvailable: + from ..utils.dummy_transformers_and_torch_and_note_seq_objects import * # noqa F403 diff --git a/gf/sd-webui-rich-text/diffusers_official/pipelines/__pycache__/__init__.cpython-310.pyc b/gf/sd-webui-rich-text/diffusers_official/pipelines/__pycache__/__init__.cpython-310.pyc new file mode 100644 index 
0000000000000000000000000000000000000000..70fe512af3d318d57cc6fe751d2dec03430487b3 Binary files /dev/null and b/gf/sd-webui-rich-text/diffusers_official/pipelines/__pycache__/__init__.cpython-310.pyc differ diff --git a/gf/sd-webui-rich-text/diffusers_official/pipelines/__pycache__/onnx_utils.cpython-310.pyc b/gf/sd-webui-rich-text/diffusers_official/pipelines/__pycache__/onnx_utils.cpython-310.pyc new file mode 100644 index 0000000000000000000000000000000000000000..4e8b4274fc033a9e3a684f061b9c1fbdc63c49a4 Binary files /dev/null and b/gf/sd-webui-rich-text/diffusers_official/pipelines/__pycache__/onnx_utils.cpython-310.pyc differ diff --git a/gf/sd-webui-rich-text/diffusers_official/pipelines/__pycache__/pipeline_utils.cpython-310.pyc b/gf/sd-webui-rich-text/diffusers_official/pipelines/__pycache__/pipeline_utils.cpython-310.pyc new file mode 100644 index 0000000000000000000000000000000000000000..29bd5b9951923e1c6b31c6b21f0f02aede81e3c7 Binary files /dev/null and b/gf/sd-webui-rich-text/diffusers_official/pipelines/__pycache__/pipeline_utils.cpython-310.pyc differ diff --git a/gf/sd-webui-rich-text/diffusers_official/pipelines/onnx_utils.py b/gf/sd-webui-rich-text/diffusers_official/pipelines/onnx_utils.py new file mode 100644 index 0000000000000000000000000000000000000000..07c32e4e84bfee0241733a077fef9c0dec06905e --- /dev/null +++ b/gf/sd-webui-rich-text/diffusers_official/pipelines/onnx_utils.py @@ -0,0 +1,212 @@ +# coding=utf-8 +# Copyright 2023 The HuggingFace Inc. team. +# Copyright (c) 2022, NVIDIA CORPORATION. All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + + +import os +import shutil +from pathlib import Path +from typing import Optional, Union + +import numpy as np +from huggingface_hub import hf_hub_download + +from ..utils import ONNX_EXTERNAL_WEIGHTS_NAME, ONNX_WEIGHTS_NAME, is_onnx_available, logging + + +if is_onnx_available(): + import onnxruntime as ort + + +logger = logging.get_logger(__name__) + +ORT_TO_NP_TYPE = { + "tensor(bool)": np.bool_, + "tensor(int8)": np.int8, + "tensor(uint8)": np.uint8, + "tensor(int16)": np.int16, + "tensor(uint16)": np.uint16, + "tensor(int32)": np.int32, + "tensor(uint32)": np.uint32, + "tensor(int64)": np.int64, + "tensor(uint64)": np.uint64, + "tensor(float16)": np.float16, + "tensor(float)": np.float32, + "tensor(double)": np.float64, +} + + +class OnnxRuntimeModel: + def __init__(self, model=None, **kwargs): + logger.info("`diffusers.OnnxRuntimeModel` is experimental and might change in the future.") + self.model = model + self.model_save_dir = kwargs.get("model_save_dir", None) + self.latest_model_name = kwargs.get("latest_model_name", ONNX_WEIGHTS_NAME) + + def __call__(self, **kwargs): + inputs = {k: np.array(v) for k, v in kwargs.items()} + return self.model.run(None, inputs) + + @staticmethod + def load_model(path: Union[str, Path], provider=None, sess_options=None): + """ + Loads an ONNX Inference session with an ExecutionProvider. 
Default provider is `CPUExecutionProvider` + + Arguments: + path (`str` or `Path`): + Directory from which to load + provider(`str`, *optional*): + Onnxruntime execution provider to use for loading the model, defaults to `CPUExecutionProvider` + """ + if provider is None: + logger.info("No onnxruntime provider specified, using CPUExecutionProvider") + provider = "CPUExecutionProvider" + + return ort.InferenceSession(path, providers=[provider], sess_options=sess_options) + + def _save_pretrained(self, save_directory: Union[str, Path], file_name: Optional[str] = None, **kwargs): + """ + Save a model and its configuration file to a directory, so that it can be re-loaded using the + [`~optimum.onnxruntime.modeling_ort.ORTModel.from_pretrained`] class method. It will always save the + latest_model_name. + + Arguments: + save_directory (`str` or `Path`): + Directory where to save the model file. + file_name(`str`, *optional*): + Overwrites the default model file name from `"model.onnx"` to `file_name`. This allows you to save the + model with a different name. + """ + model_file_name = file_name if file_name is not None else ONNX_WEIGHTS_NAME + + src_path = self.model_save_dir.joinpath(self.latest_model_name) + dst_path = Path(save_directory).joinpath(model_file_name) + try: + shutil.copyfile(src_path, dst_path) + except shutil.SameFileError: + pass + + # copy external weights (for models >2GB) + src_path = self.model_save_dir.joinpath(ONNX_EXTERNAL_WEIGHTS_NAME) + if src_path.exists(): + dst_path = Path(save_directory).joinpath(ONNX_EXTERNAL_WEIGHTS_NAME) + try: + shutil.copyfile(src_path, dst_path) + except shutil.SameFileError: + pass + + def save_pretrained( + self, + save_directory: Union[str, os.PathLike], + **kwargs, + ): + """ + Save a model to a directory, so that it can be re-loaded using the [`~OnnxModel.from_pretrained`] class + method.: + + Arguments: + save_directory (`str` or `os.PathLike`): + Directory to which to save. Will be created if it doesn't exist. + """ + if os.path.isfile(save_directory): + logger.error(f"Provided path ({save_directory}) should be a directory, not a file") + return + + os.makedirs(save_directory, exist_ok=True) + + # saving model weights/files + self._save_pretrained(save_directory, **kwargs) + + @classmethod + def _from_pretrained( + cls, + model_id: Union[str, Path], + use_auth_token: Optional[Union[bool, str, None]] = None, + revision: Optional[Union[str, None]] = None, + force_download: bool = False, + cache_dir: Optional[str] = None, + file_name: Optional[str] = None, + provider: Optional[str] = None, + sess_options: Optional["ort.SessionOptions"] = None, + **kwargs, + ): + """ + Load a model from a directory or the HF Hub. + + Arguments: + model_id (`str` or `Path`): + Directory from which to load + use_auth_token (`str` or `bool`): + Is needed to load models from a private or gated repository + revision (`str`): + Revision is the specific model version to use. It can be a branch name, a tag name, or a commit id + cache_dir (`Union[str, Path]`, *optional*): + Path to a directory in which a downloaded pretrained model configuration should be cached if the + standard cache should not be used. + force_download (`bool`, *optional*, defaults to `False`): + Whether or not to force the (re-)download of the model weights and configuration files, overriding the + cached versions if they exist. + file_name(`str`): + Overwrites the default model file name from `"model.onnx"` to `file_name`. 
This allows you to load + different model files from the same repository or directory. + provider(`str`): + The ONNX runtime provider, e.g. `CPUExecutionProvider` or `CUDAExecutionProvider`. + kwargs (`Dict`, *optional*): + kwargs will be passed to the model during initialization + """ + model_file_name = file_name if file_name is not None else ONNX_WEIGHTS_NAME + # load model from local directory + if os.path.isdir(model_id): + model = OnnxRuntimeModel.load_model( + os.path.join(model_id, model_file_name), provider=provider, sess_options=sess_options + ) + kwargs["model_save_dir"] = Path(model_id) + # load model from hub + else: + # download model + model_cache_path = hf_hub_download( + repo_id=model_id, + filename=model_file_name, + use_auth_token=use_auth_token, + revision=revision, + cache_dir=cache_dir, + force_download=force_download, + ) + kwargs["model_save_dir"] = Path(model_cache_path).parent + kwargs["latest_model_name"] = Path(model_cache_path).name + model = OnnxRuntimeModel.load_model(model_cache_path, provider=provider, sess_options=sess_options) + return cls(model=model, **kwargs) + + @classmethod + def from_pretrained( + cls, + model_id: Union[str, Path], + force_download: bool = True, + use_auth_token: Optional[str] = None, + cache_dir: Optional[str] = None, + **model_kwargs, + ): + revision = None + if len(str(model_id).split("@")) == 2: + model_id, revision = model_id.split("@") + + return cls._from_pretrained( + model_id=model_id, + revision=revision, + cache_dir=cache_dir, + force_download=force_download, + use_auth_token=use_auth_token, + **model_kwargs, + ) diff --git a/gf/sd-webui-rich-text/diffusers_official/pipelines/pipeline_flax_utils.py b/gf/sd-webui-rich-text/diffusers_official/pipelines/pipeline_flax_utils.py new file mode 100644 index 0000000000000000000000000000000000000000..e1c4b9f53953c1f18e3cc90088dfddd612cbfa63 --- /dev/null +++ b/gf/sd-webui-rich-text/diffusers_official/pipelines/pipeline_flax_utils.py @@ -0,0 +1,568 @@ +# coding=utf-8 +# Copyright 2023 The HuggingFace Inc. team. +# Copyright (c) 2022, NVIDIA CORPORATION. All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +import importlib +import inspect +import os +from typing import Any, Dict, List, Optional, Union + +import flax +import numpy as np +import PIL +from flax.core.frozen_dict import FrozenDict +from huggingface_hub import snapshot_download +from PIL import Image +from tqdm.auto import tqdm + +from ..configuration_utils import ConfigMixin +from ..models.modeling_flax_utils import FLAX_WEIGHTS_NAME, FlaxModelMixin +from ..schedulers.scheduling_utils_flax import SCHEDULER_CONFIG_NAME, FlaxSchedulerMixin +from ..utils import CONFIG_NAME, DIFFUSERS_CACHE, BaseOutput, http_user_agent, is_transformers_available, logging + + +if is_transformers_available(): + from transformers import FlaxPreTrainedModel + +INDEX_FILE = "diffusion_flax_model.bin" + + +logger = logging.get_logger(__name__) + + +LOADABLE_CLASSES = { + "diffusers": { + "FlaxModelMixin": ["save_pretrained", "from_pretrained"], + "FlaxSchedulerMixin": ["save_pretrained", "from_pretrained"], + "FlaxDiffusionPipeline": ["save_pretrained", "from_pretrained"], + }, + "transformers": { + "PreTrainedTokenizer": ["save_pretrained", "from_pretrained"], + "PreTrainedTokenizerFast": ["save_pretrained", "from_pretrained"], + "FlaxPreTrainedModel": ["save_pretrained", "from_pretrained"], + "FeatureExtractionMixin": ["save_pretrained", "from_pretrained"], + "ProcessorMixin": ["save_pretrained", "from_pretrained"], + "ImageProcessingMixin": ["save_pretrained", "from_pretrained"], + }, +} + +ALL_IMPORTABLE_CLASSES = {} +for library in LOADABLE_CLASSES: + ALL_IMPORTABLE_CLASSES.update(LOADABLE_CLASSES[library]) + + +def import_flax_or_no_model(module, class_name): + try: + # 1. First make sure that if a Flax object is present, import this one + class_obj = getattr(module, "Flax" + class_name) + except AttributeError: + # 2. If this doesn't work, it's not a model and we don't append "Flax" + class_obj = getattr(module, class_name) + except AttributeError: + raise ValueError(f"Neither Flax{class_name} nor {class_name} exist in {module}") + + return class_obj + + +@flax.struct.dataclass +class FlaxImagePipelineOutput(BaseOutput): + """ + Output class for image pipelines. + + Args: + images (`List[PIL.Image.Image]` or `np.ndarray`) + List of denoised PIL images of length `batch_size` or NumPy array of shape `(batch_size, height, width, + num_channels)`. + """ + + images: Union[List[PIL.Image.Image], np.ndarray] + + +class FlaxDiffusionPipeline(ConfigMixin): + r""" + Base class for all models. + + [`FlaxDiffusionPipeline`] takes care of storing all components (models, schedulers, processors) for diffusion + pipelines and handles methods for loading, downloading and saving models as well as a few methods common to all + pipelines to: + + - enabling/disabling the progress bar for the denoising iteration + + Class attributes: + + - **config_name** ([`str`]) -- name of the config file that will store the class and module names of all + components of the diffusion pipeline. 
+ """ + config_name = "model_index.json" + + def register_modules(self, **kwargs): + # import it here to avoid circular import + from diffusers import pipelines + + for name, module in kwargs.items(): + if module is None: + register_dict = {name: (None, None)} + else: + # retrieve library + library = module.__module__.split(".")[0] + + # check if the module is a pipeline module + pipeline_dir = module.__module__.split(".")[-2] + path = module.__module__.split(".") + is_pipeline_module = pipeline_dir in path and hasattr(pipelines, pipeline_dir) + + # if library is not in LOADABLE_CLASSES, then it is a custom module. + # Or if it's a pipeline module, then the module is inside the pipeline + # folder so we set the library to module name. + if library not in LOADABLE_CLASSES or is_pipeline_module: + library = pipeline_dir + + # retrieve class_name + class_name = module.__class__.__name__ + + register_dict = {name: (library, class_name)} + + # save model index config + self.register_to_config(**register_dict) + + # set models + setattr(self, name, module) + + def save_pretrained(self, save_directory: Union[str, os.PathLike], params: Union[Dict, FrozenDict]): + # TODO: handle inference_state + """ + Save all variables of the pipeline that can be saved and loaded as well as the pipelines configuration file to + a directory. A pipeline variable can be saved and loaded if its class implements both a save and loading + method. The pipeline can easily be re-loaded using the `[`~FlaxDiffusionPipeline.from_pretrained`]` class + method. + + Arguments: + save_directory (`str` or `os.PathLike`): + Directory to which to save. Will be created if it doesn't exist. + """ + self.save_config(save_directory) + + model_index_dict = dict(self.config) + model_index_dict.pop("_class_name") + model_index_dict.pop("_diffusers_version") + model_index_dict.pop("_module", None) + + for pipeline_component_name in model_index_dict.keys(): + sub_model = getattr(self, pipeline_component_name) + if sub_model is None: + # edge case for saving a pipeline with safety_checker=None + continue + + model_cls = sub_model.__class__ + + save_method_name = None + # search for the model's base class in LOADABLE_CLASSES + for library_name, library_classes in LOADABLE_CLASSES.items(): + library = importlib.import_module(library_name) + for base_class, save_load_methods in library_classes.items(): + class_candidate = getattr(library, base_class, None) + if class_candidate is not None and issubclass(model_cls, class_candidate): + # if we found a suitable base class in LOADABLE_CLASSES then grab its save method + save_method_name = save_load_methods[0] + break + if save_method_name is not None: + break + + save_method = getattr(sub_model, save_method_name) + expects_params = "params" in set(inspect.signature(save_method).parameters.keys()) + + if expects_params: + save_method( + os.path.join(save_directory, pipeline_component_name), params=params[pipeline_component_name] + ) + else: + save_method(os.path.join(save_directory, pipeline_component_name)) + + @classmethod + def from_pretrained(cls, pretrained_model_name_or_path: Optional[Union[str, os.PathLike]], **kwargs): + r""" + Instantiate a Flax diffusion pipeline from pre-trained pipeline weights. + + The pipeline is set in evaluation mode by default using `model.eval()` (Dropout modules are deactivated). + + The warning *Weights from XXX not initialized from pretrained model* means that the weights of XXX do not come + pretrained with the rest of the model. 
It is up to you to train those weights with a downstream fine-tuning + task. + + The warning *Weights from XXX not used in YYY* means that the layer XXX is not used by YYY, therefore those + weights are discarded. + + Parameters: + pretrained_model_name_or_path (`str` or `os.PathLike`, *optional*): + Can be either: + + - A string, the *repo id* of a pretrained pipeline hosted inside a model repo on + https://huggingface.co/ Valid repo ids have to be located under a user or organization name, like + `CompVis/ldm-text2im-large-256`. + - A path to a *directory* containing pipeline weights saved using + [`~FlaxDiffusionPipeline.save_pretrained`], e.g., `./my_pipeline_directory/`. + dtype (`str` or `jnp.dtype`, *optional*): + Override the default `jnp.dtype` and load the model under this dtype. If `"auto"` is passed the dtype + will be automatically derived from the model's weights. + force_download (`bool`, *optional*, defaults to `False`): + Whether or not to force the (re-)download of the model weights and configuration files, overriding the + cached versions if they exist. + resume_download (`bool`, *optional*, defaults to `False`): + Whether or not to delete incompletely received files. Will attempt to resume the download if such a + file exists. + proxies (`Dict[str, str]`, *optional*): + A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', + 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request. + output_loading_info(`bool`, *optional*, defaults to `False`): + Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages. + local_files_only(`bool`, *optional*, defaults to `False`): + Whether or not to only look at local files (i.e., do not try to download the model). + use_auth_token (`str` or *bool*, *optional*): + The token to use as HTTP bearer authorization for remote files. If `True`, will use the token generated + when running `huggingface-cli login` (stored in `~/.huggingface`). + revision (`str`, *optional*, defaults to `"main"`): + The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a + git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any + identifier allowed by git. + mirror (`str`, *optional*): + Mirror source to accelerate downloads in China. If you are from China and have an accessibility + problem, you can set this option to resolve it. Note that we do not guarantee the timeliness or safety. + Please refer to the mirror site for more information. specify the folder name here. + + kwargs (remaining dictionary of keyword arguments, *optional*): + Can be used to overwrite load - and saveable variables - *i.e.* the pipeline components - of the + specific pipeline class. The overwritten components are then directly passed to the pipelines + `__init__` method. See example below for more information. + + + + It is required to be logged in (`huggingface-cli login`) when you want to use private or [gated + models](https://huggingface.co/docs/hub/models-gated#gated-models), *e.g.* `"runwayml/stable-diffusion-v1-5"` + + + + + + Activate the special ["offline-mode"](https://huggingface.co/diffusers/installation.html#offline-mode) to use + this method in a firewalled environment. + + + + Examples: + + ```py + >>> from diffusers import FlaxDiffusionPipeline + + >>> # Download pipeline from huggingface.co and cache. 
+ >>> # Requires to be logged in to Hugging Face hub, + >>> # see more in [the documentation](https://huggingface.co/docs/hub/security-tokens) + >>> pipeline, params = FlaxDiffusionPipeline.from_pretrained( + ... "runwayml/stable-diffusion-v1-5", + ... revision="bf16", + ... dtype=jnp.bfloat16, + ... ) + + >>> # Download pipeline, but use a different scheduler + >>> from diffusers import FlaxDPMSolverMultistepScheduler + + >>> model_id = "runwayml/stable-diffusion-v1-5" + >>> dpmpp, dpmpp_state = FlaxDPMSolverMultistepScheduler.from_pretrained( + ... model_id, + ... subfolder="scheduler", + ... ) + + >>> dpm_pipe, dpm_params = FlaxStableDiffusionPipeline.from_pretrained( + ... model_id, revision="bf16", dtype=jnp.bfloat16, scheduler=dpmpp + ... ) + >>> dpm_params["scheduler"] = dpmpp_state + ``` + """ + cache_dir = kwargs.pop("cache_dir", DIFFUSERS_CACHE) + resume_download = kwargs.pop("resume_download", False) + proxies = kwargs.pop("proxies", None) + local_files_only = kwargs.pop("local_files_only", False) + use_auth_token = kwargs.pop("use_auth_token", None) + revision = kwargs.pop("revision", None) + from_pt = kwargs.pop("from_pt", False) + use_memory_efficient_attention = kwargs.pop("use_memory_efficient_attention", False) + dtype = kwargs.pop("dtype", None) + + # 1. Download the checkpoints and configs + # use snapshot download here to get it working from from_pretrained + if not os.path.isdir(pretrained_model_name_or_path): + config_dict = cls.load_config( + pretrained_model_name_or_path, + cache_dir=cache_dir, + resume_download=resume_download, + proxies=proxies, + local_files_only=local_files_only, + use_auth_token=use_auth_token, + revision=revision, + ) + # make sure we only download sub-folders and `diffusers` filenames + folder_names = [k for k in config_dict.keys() if not k.startswith("_")] + allow_patterns = [os.path.join(k, "*") for k in folder_names] + allow_patterns += [FLAX_WEIGHTS_NAME, SCHEDULER_CONFIG_NAME, CONFIG_NAME, cls.config_name] + + # make sure we don't download PyTorch weights, unless when using from_pt + ignore_patterns = "*.bin" if not from_pt else [] + + if cls != FlaxDiffusionPipeline: + requested_pipeline_class = cls.__name__ + else: + requested_pipeline_class = config_dict.get("_class_name", cls.__name__) + requested_pipeline_class = ( + requested_pipeline_class + if requested_pipeline_class.startswith("Flax") + else "Flax" + requested_pipeline_class + ) + + user_agent = {"pipeline_class": requested_pipeline_class} + user_agent = http_user_agent(user_agent) + + # download all allow_patterns + cached_folder = snapshot_download( + pretrained_model_name_or_path, + cache_dir=cache_dir, + resume_download=resume_download, + proxies=proxies, + local_files_only=local_files_only, + use_auth_token=use_auth_token, + revision=revision, + allow_patterns=allow_patterns, + ignore_patterns=ignore_patterns, + user_agent=user_agent, + ) + else: + cached_folder = pretrained_model_name_or_path + + config_dict = cls.load_config(cached_folder) + + # 2. 
Load the pipeline class, if using custom module then load it from the hub + # if we load from explicit class, let's use it + if cls != FlaxDiffusionPipeline: + pipeline_class = cls + else: + diffusers_module = importlib.import_module(cls.__module__.split(".")[0]) + class_name = ( + config_dict["_class_name"] + if config_dict["_class_name"].startswith("Flax") + else "Flax" + config_dict["_class_name"] + ) + pipeline_class = getattr(diffusers_module, class_name) + + # some modules can be passed directly to the init + # in this case they are already instantiated in `kwargs` + # extract them here + expected_modules, optional_kwargs = cls._get_signature_keys(pipeline_class) + passed_class_obj = {k: kwargs.pop(k) for k in expected_modules if k in kwargs} + + init_dict, _, _ = pipeline_class.extract_init_dict(config_dict, **kwargs) + + init_kwargs = {} + + # inference_params + params = {} + + # import it here to avoid circular import + from diffusers import pipelines + + # 3. Load each module in the pipeline + for name, (library_name, class_name) in init_dict.items(): + if class_name is None: + # edge case for when the pipeline was saved with safety_checker=None + init_kwargs[name] = None + continue + + is_pipeline_module = hasattr(pipelines, library_name) + loaded_sub_model = None + sub_model_should_be_defined = True + + # if the model is in a pipeline module, then we load it from the pipeline + if name in passed_class_obj: + # 1. check that passed_class_obj has correct parent class + if not is_pipeline_module: + library = importlib.import_module(library_name) + class_obj = getattr(library, class_name) + importable_classes = LOADABLE_CLASSES[library_name] + class_candidates = {c: getattr(library, c, None) for c in importable_classes.keys()} + + expected_class_obj = None + for class_name, class_candidate in class_candidates.items(): + if class_candidate is not None and issubclass(class_obj, class_candidate): + expected_class_obj = class_candidate + + if not issubclass(passed_class_obj[name].__class__, expected_class_obj): + raise ValueError( + f"{passed_class_obj[name]} is of type: {type(passed_class_obj[name])}, but should be" + f" {expected_class_obj}" + ) + elif passed_class_obj[name] is None: + logger.warning( + f"You have passed `None` for {name} to disable its functionality in {pipeline_class}. Note" + f" that this might lead to problems when using {pipeline_class} and is not recommended." + ) + sub_model_should_be_defined = False + else: + logger.warning( + f"You have passed a non-standard module {passed_class_obj[name]}. We cannot verify whether it" + " has the correct type" + ) + + # set passed class object + loaded_sub_model = passed_class_obj[name] + elif is_pipeline_module: + pipeline_module = getattr(pipelines, library_name) + class_obj = import_flax_or_no_model(pipeline_module, class_name) + + importable_classes = ALL_IMPORTABLE_CLASSES + class_candidates = {c: class_obj for c in importable_classes.keys()} + else: + # else we just import it from the library. 
+ library = importlib.import_module(library_name) + class_obj = import_flax_or_no_model(library, class_name) + + importable_classes = LOADABLE_CLASSES[library_name] + class_candidates = {c: getattr(library, c, None) for c in importable_classes.keys()} + + if loaded_sub_model is None and sub_model_should_be_defined: + load_method_name = None + for class_name, class_candidate in class_candidates.items(): + if class_candidate is not None and issubclass(class_obj, class_candidate): + load_method_name = importable_classes[class_name][1] + + load_method = getattr(class_obj, load_method_name) + + # check if the module is in a subdirectory + if os.path.isdir(os.path.join(cached_folder, name)): + loadable_folder = os.path.join(cached_folder, name) + else: + loaded_sub_model = cached_folder + + if issubclass(class_obj, FlaxModelMixin): + loaded_sub_model, loaded_params = load_method( + loadable_folder, + from_pt=from_pt, + use_memory_efficient_attention=use_memory_efficient_attention, + dtype=dtype, + ) + params[name] = loaded_params + elif is_transformers_available() and issubclass(class_obj, FlaxPreTrainedModel): + if from_pt: + # TODO(Suraj): Fix this in Transformers. We should be able to use `_do_init=False` here + loaded_sub_model = load_method(loadable_folder, from_pt=from_pt) + loaded_params = loaded_sub_model.params + del loaded_sub_model._params + else: + loaded_sub_model, loaded_params = load_method(loadable_folder, _do_init=False) + params[name] = loaded_params + elif issubclass(class_obj, FlaxSchedulerMixin): + loaded_sub_model, scheduler_state = load_method(loadable_folder) + params[name] = scheduler_state + else: + loaded_sub_model = load_method(loadable_folder) + + init_kwargs[name] = loaded_sub_model # UNet(...), # DiffusionSchedule(...) + + # 4. Potentially add passed objects if expected + missing_modules = set(expected_modules) - set(init_kwargs.keys()) + passed_modules = list(passed_class_obj.keys()) + + if len(missing_modules) > 0 and missing_modules <= set(passed_modules): + for module in missing_modules: + init_kwargs[module] = passed_class_obj.get(module, None) + elif len(missing_modules) > 0: + passed_modules = set(list(init_kwargs.keys()) + list(passed_class_obj.keys())) - optional_kwargs + raise ValueError( + f"Pipeline {pipeline_class} expected {expected_modules}, but only {passed_modules} were passed." + ) + + model = pipeline_class(**init_kwargs, dtype=dtype) + return model, params + + @staticmethod + def _get_signature_keys(obj): + parameters = inspect.signature(obj.__init__).parameters + required_parameters = {k: v for k, v in parameters.items() if v.default == inspect._empty} + optional_parameters = set({k for k, v in parameters.items() if v.default != inspect._empty}) + expected_modules = set(required_parameters.keys()) - {"self"} + return expected_modules, optional_parameters + + @property + def components(self) -> Dict[str, Any]: + r""" + + The `self.components` property can be useful to run different pipelines with the same weights and + configurations to not have to re-allocate memory. + + Examples: + + ```py + >>> from diffusers import ( + ... FlaxStableDiffusionPipeline, + ... FlaxStableDiffusionImg2ImgPipeline, + ... ) + + >>> text2img = FlaxStableDiffusionPipeline.from_pretrained( + ... "runwayml/stable-diffusion-v1-5", revision="bf16", dtype=jnp.bfloat16 + ... ) + >>> img2img = FlaxStableDiffusionImg2ImgPipeline(**text2img.components) + ``` + + Returns: + A dictionary containing all the modules needed to initialize the pipeline. 
+ """ + expected_modules, optional_parameters = self._get_signature_keys(self) + components = { + k: getattr(self, k) for k in self.config.keys() if not k.startswith("_") and k not in optional_parameters + } + + if set(components.keys()) != expected_modules: + raise ValueError( + f"{self} has been incorrectly initialized or {self.__class__} is incorrectly implemented. Expected" + f" {expected_modules} to be defined, but {components} are defined." + ) + + return components + + @staticmethod + def numpy_to_pil(images): + """ + Convert a numpy image or a batch of images to a PIL image. + """ + if images.ndim == 3: + images = images[None, ...] + images = (images * 255).round().astype("uint8") + if images.shape[-1] == 1: + # special case for grayscale (single channel) images + pil_images = [Image.fromarray(image.squeeze(), mode="L") for image in images] + else: + pil_images = [Image.fromarray(image) for image in images] + + return pil_images + + # TODO: make it compatible with jax.lax + def progress_bar(self, iterable): + if not hasattr(self, "_progress_bar_config"): + self._progress_bar_config = {} + elif not isinstance(self._progress_bar_config, dict): + raise ValueError( + f"`self._progress_bar_config` should be of type `dict`, but is {type(self._progress_bar_config)}." + ) + + return tqdm(iterable, **self._progress_bar_config) + + def set_progress_bar_config(self, **kwargs): + self._progress_bar_config = kwargs diff --git a/gf/sd-webui-rich-text/diffusers_official/pipelines/pipeline_utils.py b/gf/sd-webui-rich-text/diffusers_official/pipelines/pipeline_utils.py new file mode 100644 index 0000000000000000000000000000000000000000..3700fe3e1b8b1cb27653734a27b847645606e17e --- /dev/null +++ b/gf/sd-webui-rich-text/diffusers_official/pipelines/pipeline_utils.py @@ -0,0 +1,1500 @@ +# coding=utf-8 +# Copyright 2023 The HuggingFace Inc. team. +# Copyright (c) 2022, NVIDIA CORPORATION. All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +import fnmatch +import importlib +import inspect +import os +import re +import sys +import warnings +from dataclasses import dataclass +from pathlib import Path +from typing import Any, Callable, Dict, List, Optional, Union + +import numpy as np +import PIL +import torch +from huggingface_hub import hf_hub_download, model_info, snapshot_download +from packaging import version +from requests.exceptions import HTTPError +from tqdm.auto import tqdm + +import diffusers + +from .. 
import __version__ +from ..configuration_utils import ConfigMixin +from ..models.modeling_utils import _LOW_CPU_MEM_USAGE_DEFAULT +from ..schedulers.scheduling_utils import SCHEDULER_CONFIG_NAME +from ..utils import ( + CONFIG_NAME, + DEPRECATED_REVISION_ARGS, + DIFFUSERS_CACHE, + HF_HUB_OFFLINE, + SAFETENSORS_WEIGHTS_NAME, + WEIGHTS_NAME, + BaseOutput, + deprecate, + get_class_from_dynamic_module, + is_accelerate_available, + is_accelerate_version, + is_compiled_module, + is_safetensors_available, + is_torch_version, + is_transformers_available, + logging, + numpy_to_pil, +) + + +if is_transformers_available(): + import transformers + from transformers import PreTrainedModel + from transformers.utils import FLAX_WEIGHTS_NAME as TRANSFORMERS_FLAX_WEIGHTS_NAME + from transformers.utils import SAFE_WEIGHTS_NAME as TRANSFORMERS_SAFE_WEIGHTS_NAME + from transformers.utils import WEIGHTS_NAME as TRANSFORMERS_WEIGHTS_NAME + +from ..utils import FLAX_WEIGHTS_NAME, ONNX_EXTERNAL_WEIGHTS_NAME, ONNX_WEIGHTS_NAME + + +if is_accelerate_available(): + import accelerate + + +INDEX_FILE = "diffusion_pytorch_model.bin" +CUSTOM_PIPELINE_FILE_NAME = "pipeline.py" +DUMMY_MODULES_FOLDER = "diffusers.utils" +TRANSFORMERS_DUMMY_MODULES_FOLDER = "transformers.utils" + + +logger = logging.get_logger(__name__) + + +LOADABLE_CLASSES = { + "diffusers": { + "ModelMixin": ["save_pretrained", "from_pretrained"], + "SchedulerMixin": ["save_pretrained", "from_pretrained"], + "DiffusionPipeline": ["save_pretrained", "from_pretrained"], + "OnnxRuntimeModel": ["save_pretrained", "from_pretrained"], + }, + "transformers": { + "PreTrainedTokenizer": ["save_pretrained", "from_pretrained"], + "PreTrainedTokenizerFast": ["save_pretrained", "from_pretrained"], + "PreTrainedModel": ["save_pretrained", "from_pretrained"], + "FeatureExtractionMixin": ["save_pretrained", "from_pretrained"], + "ProcessorMixin": ["save_pretrained", "from_pretrained"], + "ImageProcessingMixin": ["save_pretrained", "from_pretrained"], + }, + "onnxruntime.training": { + "ORTModule": ["save_pretrained", "from_pretrained"], + }, +} + +ALL_IMPORTABLE_CLASSES = {} +for library in LOADABLE_CLASSES: + ALL_IMPORTABLE_CLASSES.update(LOADABLE_CLASSES[library]) + + +@dataclass +class ImagePipelineOutput(BaseOutput): + """ + Output class for image pipelines. + + Args: + images (`List[PIL.Image.Image]` or `np.ndarray`) + List of denoised PIL images of length `batch_size` or NumPy array of shape `(batch_size, height, width, + num_channels)`. + """ + + images: Union[List[PIL.Image.Image], np.ndarray] + + +@dataclass +class AudioPipelineOutput(BaseOutput): + """ + Output class for audio pipelines. + + Args: + audios (`np.ndarray`) + List of denoised audio samples of a NumPy array of shape `(batch_size, num_channels, sample_rate)`. + """ + + audios: np.ndarray + + +def is_safetensors_compatible(filenames, variant=None, passed_components=None) -> bool: + """ + Checking for safetensors compatibility: + - By default, all models are saved with the default pytorch serialization, so we use the list of default pytorch + files to know which safetensors files are needed. + - The model is safetensors compatible only if there is a matching safetensors file for every default pytorch file. 
+
+    Converting default pytorch serialized filenames to safetensors serialized filenames:
+    - For models from the diffusers library, just replace the ".bin" extension with ".safetensors"
+    - For models from the transformers library, the filename changes from "pytorch_model" to "model", and the ".bin"
+      extension is replaced with ".safetensors"
+    """
+    pt_filenames = []
+
+    sf_filenames = set()
+
+    passed_components = passed_components or []
+
+    for filename in filenames:
+        _, extension = os.path.splitext(filename)
+
+        if len(filename.split("/")) == 2 and filename.split("/")[0] in passed_components:
+            continue
+
+        if extension == ".bin":
+            pt_filenames.append(filename)
+        elif extension == ".safetensors":
+            sf_filenames.add(filename)
+
+    for filename in pt_filenames:
+        # filename = 'foo/bar/baz.bam' -> path = 'foo/bar', filename = 'baz', extension = '.bam'
+        path, filename = os.path.split(filename)
+        filename, extension = os.path.splitext(filename)
+
+        if filename.startswith("pytorch_model"):
+            filename = filename.replace("pytorch_model", "model")
+        else:
+            filename = filename
+
+        expected_sf_filename = os.path.join(path, filename)
+        expected_sf_filename = f"{expected_sf_filename}.safetensors"
+
+        if expected_sf_filename not in sf_filenames:
+            logger.warning(f"{expected_sf_filename} not found")
+            return False
+
+    return True
+
+
+def variant_compatible_siblings(filenames, variant=None) -> Union[List[os.PathLike], str]:
+    weight_names = [
+        WEIGHTS_NAME,
+        SAFETENSORS_WEIGHTS_NAME,
+        FLAX_WEIGHTS_NAME,
+        ONNX_WEIGHTS_NAME,
+        ONNX_EXTERNAL_WEIGHTS_NAME,
+    ]
+
+    if is_transformers_available():
+        weight_names += [TRANSFORMERS_WEIGHTS_NAME, TRANSFORMERS_SAFE_WEIGHTS_NAME, TRANSFORMERS_FLAX_WEIGHTS_NAME]
+
+    # model_pytorch, diffusion_model_pytorch, ...
+    weight_prefixes = [w.split(".")[0] for w in weight_names]
+    # .bin, .safetensors, ...
+ weight_suffixs = [w.split(".")[-1] for w in weight_names] + # -00001-of-00002 + transformers_index_format = r"\d{5}-of-\d{5}" + + if variant is not None: + # `diffusion_pytorch_model.fp16.bin` as well as `model.fp16-00001-of-00002.safetensors` + variant_file_re = re.compile( + rf"({'|'.join(weight_prefixes)})\.({variant}|{variant}-{transformers_index_format})\.({'|'.join(weight_suffixs)})$" + ) + # `text_encoder/pytorch_model.bin.index.fp16.json` + variant_index_re = re.compile( + rf"({'|'.join(weight_prefixes)})\.({'|'.join(weight_suffixs)})\.index\.{variant}\.json$" + ) + + # `diffusion_pytorch_model.bin` as well as `model-00001-of-00002.safetensors` + non_variant_file_re = re.compile( + rf"({'|'.join(weight_prefixes)})(-{transformers_index_format})?\.({'|'.join(weight_suffixs)})$" + ) + # `text_encoder/pytorch_model.bin.index.json` + non_variant_index_re = re.compile(rf"({'|'.join(weight_prefixes)})\.({'|'.join(weight_suffixs)})\.index\.json") + + if variant is not None: + variant_weights = {f for f in filenames if variant_file_re.match(f.split("/")[-1]) is not None} + variant_indexes = {f for f in filenames if variant_index_re.match(f.split("/")[-1]) is not None} + variant_filenames = variant_weights | variant_indexes + else: + variant_filenames = set() + + non_variant_weights = {f for f in filenames if non_variant_file_re.match(f.split("/")[-1]) is not None} + non_variant_indexes = {f for f in filenames if non_variant_index_re.match(f.split("/")[-1]) is not None} + non_variant_filenames = non_variant_weights | non_variant_indexes + + # all variant filenames will be used by default + usable_filenames = set(variant_filenames) + + def convert_to_variant(filename): + if "index" in filename: + variant_filename = filename.replace("index", f"index.{variant}") + elif re.compile(f"^(.*?){transformers_index_format}").match(filename) is not None: + variant_filename = f"{filename.split('-')[0]}.{variant}-{'-'.join(filename.split('-')[1:])}" + else: + variant_filename = f"{filename.split('.')[0]}.{variant}.{filename.split('.')[1]}" + return variant_filename + + for f in non_variant_filenames: + variant_filename = convert_to_variant(f) + if variant_filename not in usable_filenames: + usable_filenames.add(f) + + return usable_filenames, variant_filenames + + +def warn_deprecated_model_variant(pretrained_model_name_or_path, use_auth_token, variant, revision, model_filenames): + info = model_info( + pretrained_model_name_or_path, + use_auth_token=use_auth_token, + revision=None, + ) + filenames = {sibling.rfilename for sibling in info.siblings} + comp_model_filenames, _ = variant_compatible_siblings(filenames, variant=revision) + comp_model_filenames = [".".join(f.split(".")[:1] + f.split(".")[2:]) for f in comp_model_filenames] + + if set(comp_model_filenames) == set(model_filenames): + warnings.warn( + f"You are loading the variant {revision} from {pretrained_model_name_or_path} via `revision='{revision}'` even though you can load it via `variant=`{revision}`. Loading model variants via `revision='{revision}'` is deprecated and will be removed in diffusers v1. Please use `variant='{revision}'` instead.", + FutureWarning, + ) + else: + warnings.warn( + f"You are loading the variant {revision} from {pretrained_model_name_or_path} via `revision='{revision}'`. This behavior is deprecated and will be removed in diffusers v1. One should use `variant='{revision}'` instead. However, it appears that {pretrained_model_name_or_path} currently does not have the required variant filenames in the 'main' branch. 
\n The Diffusers team and community would be very grateful if you could open an issue: https://github.com/huggingface/diffusers/issues/new with the title '{pretrained_model_name_or_path} is missing {revision} files' so that the correct variant file can be added.", + FutureWarning, + ) + + +def maybe_raise_or_warn( + library_name, library, class_name, importable_classes, passed_class_obj, name, is_pipeline_module +): + """Simple helper method to raise or warn in case incorrect module has been passed""" + if not is_pipeline_module: + library = importlib.import_module(library_name) + class_obj = getattr(library, class_name) + class_candidates = {c: getattr(library, c, None) for c in importable_classes.keys()} + + expected_class_obj = None + for class_name, class_candidate in class_candidates.items(): + if class_candidate is not None and issubclass(class_obj, class_candidate): + expected_class_obj = class_candidate + + # Dynamo wraps the original model in a private class. + # I didn't find a public API to get the original class. + sub_model = passed_class_obj[name] + model_cls = sub_model.__class__ + if is_compiled_module(sub_model): + model_cls = sub_model._orig_mod.__class__ + + if not issubclass(model_cls, expected_class_obj): + raise ValueError( + f"{passed_class_obj[name]} is of type: {model_cls}, but should be" f" {expected_class_obj}" + ) + else: + logger.warning( + f"You have passed a non-standard module {passed_class_obj[name]}. We cannot verify whether it" + " has the correct type" + ) + + +def get_class_obj_and_candidates(library_name, class_name, importable_classes, pipelines, is_pipeline_module): + """Simple helper method to retrieve class object of module as well as potential parent class objects""" + if is_pipeline_module: + pipeline_module = getattr(pipelines, library_name) + + class_obj = getattr(pipeline_module, class_name) + class_candidates = {c: class_obj for c in importable_classes.keys()} + else: + # else we just import it from the library. 
+ library = importlib.import_module(library_name) + + class_obj = getattr(library, class_name) + class_candidates = {c: getattr(library, c, None) for c in importable_classes.keys()} + + return class_obj, class_candidates + + +def _get_pipeline_class(class_obj, config, custom_pipeline=None, cache_dir=None, revision=None): + if custom_pipeline is not None: + if custom_pipeline.endswith(".py"): + path = Path(custom_pipeline) + # decompose into folder & file + file_name = path.name + custom_pipeline = path.parent.absolute() + else: + file_name = CUSTOM_PIPELINE_FILE_NAME + + return get_class_from_dynamic_module( + custom_pipeline, module_file=file_name, cache_dir=cache_dir, revision=revision + ) + + if class_obj != DiffusionPipeline: + return class_obj + + diffusers_module = importlib.import_module(class_obj.__module__.split(".")[0]) + return getattr(diffusers_module, config["_class_name"]) + + +def load_sub_model( + library_name: str, + class_name: str, + importable_classes: List[Any], + pipelines: Any, + is_pipeline_module: bool, + pipeline_class: Any, + torch_dtype: torch.dtype, + provider: Any, + sess_options: Any, + device_map: Optional[Union[Dict[str, torch.device], str]], + max_memory: Optional[Dict[Union[int, str], Union[int, str]]], + offload_folder: Optional[Union[str, os.PathLike]], + offload_state_dict: bool, + model_variants: Dict[str, str], + name: str, + from_flax: bool, + variant: str, + low_cpu_mem_usage: bool, + cached_folder: Union[str, os.PathLike], +): + """Helper method to load the module `name` from `library_name` and `class_name`""" + # retrieve class candidates + class_obj, class_candidates = get_class_obj_and_candidates( + library_name, class_name, importable_classes, pipelines, is_pipeline_module + ) + + load_method_name = None + # retrive load method name + for class_name, class_candidate in class_candidates.items(): + if class_candidate is not None and issubclass(class_obj, class_candidate): + load_method_name = importable_classes[class_name][1] + + # if load method name is None, then we have a dummy module -> raise Error + if load_method_name is None: + none_module = class_obj.__module__ + is_dummy_path = none_module.startswith(DUMMY_MODULES_FOLDER) or none_module.startswith( + TRANSFORMERS_DUMMY_MODULES_FOLDER + ) + if is_dummy_path and "dummy" in none_module: + # call class_obj for nice error message of missing requirements + class_obj() + + raise ValueError( + f"The component {class_obj} of {pipeline_class} cannot be loaded as it does not seem to have" + f" any of the loading methods defined in {ALL_IMPORTABLE_CLASSES}." + ) + + load_method = getattr(class_obj, load_method_name) + + # add kwargs to loading method + loading_kwargs = {} + if issubclass(class_obj, torch.nn.Module): + loading_kwargs["torch_dtype"] = torch_dtype + if issubclass(class_obj, diffusers.OnnxRuntimeModel): + loading_kwargs["provider"] = provider + loading_kwargs["sess_options"] = sess_options + + is_diffusers_model = issubclass(class_obj, diffusers.ModelMixin) + + if is_transformers_available(): + transformers_version = version.parse(version.parse(transformers.__version__).base_version) + else: + transformers_version = "N/A" + + is_transformers_model = ( + is_transformers_available() + and issubclass(class_obj, PreTrainedModel) + and transformers_version >= version.parse("4.20.0") + ) + + # When loading a transformers model, if the device_map is None, the weights will be initialized as opposed to diffusers. 
+ # To make default loading faster we set the `low_cpu_mem_usage=low_cpu_mem_usage` flag which is `True` by default. + # This makes sure that the weights won't be initialized which significantly speeds up loading. + if is_diffusers_model or is_transformers_model: + loading_kwargs["device_map"] = device_map + loading_kwargs["max_memory"] = max_memory + loading_kwargs["offload_folder"] = offload_folder + loading_kwargs["offload_state_dict"] = offload_state_dict + loading_kwargs["variant"] = model_variants.pop(name, None) + if from_flax: + loading_kwargs["from_flax"] = True + + # the following can be deleted once the minimum required `transformers` version + # is higher than 4.27 + if ( + is_transformers_model + and loading_kwargs["variant"] is not None + and transformers_version < version.parse("4.27.0") + ): + raise ImportError( + f"When passing `variant='{variant}'`, please make sure to upgrade your `transformers` version to at least 4.27.0.dev0" + ) + elif is_transformers_model and loading_kwargs["variant"] is None: + loading_kwargs.pop("variant") + + # if `from_flax` and model is transformer model, can currently not load with `low_cpu_mem_usage` + if not (from_flax and is_transformers_model): + loading_kwargs["low_cpu_mem_usage"] = low_cpu_mem_usage + else: + loading_kwargs["low_cpu_mem_usage"] = False + + # check if the module is in a subdirectory + if os.path.isdir(os.path.join(cached_folder, name)): + loaded_sub_model = load_method(os.path.join(cached_folder, name), **loading_kwargs) + else: + # else load from the root directory + loaded_sub_model = load_method(cached_folder, **loading_kwargs) + + return loaded_sub_model + + +class DiffusionPipeline(ConfigMixin): + r""" + Base class for all pipelines. + + [`DiffusionPipeline`] stores all components (models, schedulers, and processors) for diffusion pipelines and + provides methods for loading, downloading and saving models. It also includes methods to: + + - move all PyTorch modules to the device of your choice + - enabling/disabling the progress bar for the denoising iteration + + Class attributes: + + - **config_name** (`str`) -- The configuration filename that stores the class and module names of all the + diffusion pipeline's components. + - **_optional_components** (List[`str`]) -- List of all optional components that don't have to be passed to the + pipeline to function (should be overridden by subclasses). + """ + config_name = "model_index.json" + _optional_components = [] + + def register_modules(self, **kwargs): + # import it here to avoid circular import + from diffusers import pipelines + + for name, module in kwargs.items(): + # retrieve library + if module is None: + register_dict = {name: (None, None)} + else: + # register the config from the original module, not the dynamo compiled one + if is_compiled_module(module): + not_compiled_module = module._orig_mod + else: + not_compiled_module = module + + library = not_compiled_module.__module__.split(".")[0] + + # check if the module is a pipeline module + module_path_items = not_compiled_module.__module__.split(".") + pipeline_dir = module_path_items[-2] if len(module_path_items) > 2 else None + + path = not_compiled_module.__module__.split(".") + is_pipeline_module = pipeline_dir in path and hasattr(pipelines, pipeline_dir) + + # if library is not in LOADABLE_CLASSES, then it is a custom module. + # Or if it's a pipeline module, then the module is inside the pipeline + # folder so we set the library to module name. 
+ if is_pipeline_module: + library = pipeline_dir + elif library not in LOADABLE_CLASSES: + library = not_compiled_module.__module__ + + # retrieve class_name + class_name = not_compiled_module.__class__.__name__ + + register_dict = {name: (library, class_name)} + + # save model index config + self.register_to_config(**register_dict) + + # set models + setattr(self, name, module) + + def __setattr__(self, name: str, value: Any): + if name in self.__dict__ and hasattr(self.config, name): + # We need to overwrite the config if name exists in config + if isinstance(getattr(self.config, name), (tuple, list)): + if value is not None and self.config[name][0] is not None: + class_library_tuple = (value.__module__.split(".")[0], value.__class__.__name__) + else: + class_library_tuple = (None, None) + + self.register_to_config(**{name: class_library_tuple}) + else: + self.register_to_config(**{name: value}) + + super().__setattr__(name, value) + + def save_pretrained( + self, + save_directory: Union[str, os.PathLike], + safe_serialization: bool = False, + variant: Optional[str] = None, + ): + """ + Save all saveable variables of the pipeline to a directory. A pipeline variable can be saved and loaded if its + class implements both a save and loading method. The pipeline is easily reloaded using the + [`~DiffusionPipeline.from_pretrained`] class method. + + Arguments: + save_directory (`str` or `os.PathLike`): + Directory to save a pipeline to. Will be created if it doesn't exist. + safe_serialization (`bool`, *optional*, defaults to `False`): + Whether to save the model using `safetensors` or the traditional PyTorch way with `pickle`. + variant (`str`, *optional*): + If specified, weights are saved in the format `pytorch_model..bin`. + """ + model_index_dict = dict(self.config) + model_index_dict.pop("_class_name", None) + model_index_dict.pop("_diffusers_version", None) + model_index_dict.pop("_module", None) + + expected_modules, optional_kwargs = self._get_signature_keys(self) + + def is_saveable_module(name, value): + if name not in expected_modules: + return False + if name in self._optional_components and value[0] is None: + return False + return True + + model_index_dict = {k: v for k, v in model_index_dict.items() if is_saveable_module(k, v)} + for pipeline_component_name in model_index_dict.keys(): + sub_model = getattr(self, pipeline_component_name) + model_cls = sub_model.__class__ + + # Dynamo wraps the original model in a private class. + # I didn't find a public API to get the original class. + if is_compiled_module(sub_model): + sub_model = sub_model._orig_mod + model_cls = sub_model.__class__ + + save_method_name = None + # search for the model's base class in LOADABLE_CLASSES + for library_name, library_classes in LOADABLE_CLASSES.items(): + if library_name in sys.modules: + library = importlib.import_module(library_name) + else: + logger.info( + f"{library_name} is not installed. 
Cannot save {pipeline_component_name} as {library_classes} from {library_name}" + ) + + for base_class, save_load_methods in library_classes.items(): + class_candidate = getattr(library, base_class, None) + if class_candidate is not None and issubclass(model_cls, class_candidate): + # if we found a suitable base class in LOADABLE_CLASSES then grab its save method + save_method_name = save_load_methods[0] + break + if save_method_name is not None: + break + + if save_method_name is None: + logger.warn(f"self.{pipeline_component_name}={sub_model} of type {type(sub_model)} cannot be saved.") + # make sure that unsaveable components are not tried to be loaded afterward + self.register_to_config(**{pipeline_component_name: (None, None)}) + continue + + save_method = getattr(sub_model, save_method_name) + + # Call the save method with the argument safe_serialization only if it's supported + save_method_signature = inspect.signature(save_method) + save_method_accept_safe = "safe_serialization" in save_method_signature.parameters + save_method_accept_variant = "variant" in save_method_signature.parameters + + save_kwargs = {} + if save_method_accept_safe: + save_kwargs["safe_serialization"] = safe_serialization + if save_method_accept_variant: + save_kwargs["variant"] = variant + + save_method(os.path.join(save_directory, pipeline_component_name), **save_kwargs) + + # finally save the config + self.save_config(save_directory) + + def to( + self, + torch_device: Optional[Union[str, torch.device]] = None, + torch_dtype: Optional[torch.dtype] = None, + silence_dtype_warnings: bool = False, + ): + if torch_device is None and torch_dtype is None: + return self + + # throw warning if pipeline is in "offloaded"-mode but user tries to manually set to GPU. + def module_is_sequentially_offloaded(module): + if not is_accelerate_available() or is_accelerate_version("<", "0.14.0"): + return False + + return hasattr(module, "_hf_hook") and not isinstance( + module._hf_hook, (accelerate.hooks.CpuOffload, accelerate.hooks.AlignDevicesHook) + ) + + def module_is_offloaded(module): + if not is_accelerate_available() or is_accelerate_version("<", "0.17.0.dev0"): + return False + + return hasattr(module, "_hf_hook") and isinstance(module._hf_hook, accelerate.hooks.CpuOffload) + + # .to("cuda") would raise an error if the pipeline is sequentially offloaded, so we raise our own to make it clearer + pipeline_is_sequentially_offloaded = any( + module_is_sequentially_offloaded(module) for _, module in self.components.items() + ) + if pipeline_is_sequentially_offloaded and torch.device(torch_device).type == "cuda": + raise ValueError( + "It seems like you have activated sequential model offloading by calling `enable_sequential_cpu_offload`, but are now attempting to move the pipeline to GPU. This is not compatible with offloading. Please, move your pipeline `.to('cpu')` or consider removing the move altogether if you use sequential offloading." + ) + + # Display a warning in this case (the operation succeeds but the benefits are lost) + pipeline_is_offloaded = any(module_is_offloaded(module) for _, module in self.components.items()) + if pipeline_is_offloaded and torch.device(torch_device).type == "cuda": + logger.warning( + f"It seems like you have activated model offloading by calling `enable_model_cpu_offload`, but are now manually moving the pipeline to GPU. It is strongly recommended against doing so as memory gains from offloading are likely to be lost. 
Offloading automatically takes care of moving the individual components {', '.join(self.components.keys())} to GPU when needed. To make sure offloading works as expected, you should consider moving the pipeline back to CPU: `pipeline.to('cpu')` or removing the move altogether if you use offloading." + ) + + module_names, _ = self._get_signature_keys(self) + modules = [getattr(self, n, None) for n in module_names] + modules = [m for m in modules if isinstance(m, torch.nn.Module)] + + is_offloaded = pipeline_is_offloaded or pipeline_is_sequentially_offloaded + for module in modules: + is_loaded_in_8bit = hasattr(module, "is_loaded_in_8bit") and module.is_loaded_in_8bit + + if is_loaded_in_8bit and torch_dtype is not None: + logger.warning( + f"The module '{module.__class__.__name__}' has been loaded in 8bit and conversion to {torch_dtype} is not yet supported. Module is still in 8bit precision." + ) + + if is_loaded_in_8bit and torch_device is not None: + logger.warning( + f"The module '{module.__class__.__name__}' has been loaded in 8bit and moving it to {torch_dtype} via `.to()` is not yet supported. Module is still on {module.device}." + ) + else: + module.to(torch_device, torch_dtype) + + if ( + module.dtype == torch.float16 + and str(torch_device) in ["cpu"] + and not silence_dtype_warnings + and not is_offloaded + ): + logger.warning( + "Pipelines loaded with `torch_dtype=torch.float16` cannot run with `cpu` device. It" + " is not recommended to move them to `cpu` as running them will fail. Please make" + " sure to use an accelerator to run the pipeline in inference, due to the lack of" + " support for`float16` operations on this device in PyTorch. Please, remove the" + " `torch_dtype=torch.float16` argument, or use another device for inference." + ) + return self + + @property + def device(self) -> torch.device: + r""" + Returns: + `torch.device`: The torch device on which the pipeline is located. + """ + module_names, _ = self._get_signature_keys(self) + modules = [getattr(self, n, None) for n in module_names] + modules = [m for m in modules if isinstance(m, torch.nn.Module)] + + for module in modules: + return module.device + + return torch.device("cpu") + + @classmethod + def from_pretrained(cls, pretrained_model_name_or_path: Optional[Union[str, os.PathLike]], **kwargs): + r""" + Instantiate a PyTorch diffusion pipeline from pretrained pipeline weights. + + The pipeline is set in evaluation mode (`model.eval()`) by default. + + If you get the error message below, you need to finetune the weights for your downstream task: + + ``` + Some weights of UNet2DConditionModel were not initialized from the model checkpoint at runwayml/stable-diffusion-v1-5 and are newly initialized because the shapes did not match: + - conv_in.weight: found shape torch.Size([320, 4, 3, 3]) in the checkpoint and torch.Size([320, 9, 3, 3]) in the model instantiated + You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference. + ``` + + Parameters: + pretrained_model_name_or_path (`str` or `os.PathLike`, *optional*): + Can be either: + + - A string, the *repo id* (for example `CompVis/ldm-text2im-large-256`) of a pretrained pipeline + hosted on the Hub. + - A path to a *directory* (for example `./my_pipeline_directory/`) containing pipeline weights + saved using + [`~DiffusionPipeline.save_pretrained`]. + torch_dtype (`str` or `torch.dtype`, *optional*): + Override the default `torch.dtype` and load the model with another dtype. 
If "auto" is passed, the + dtype is automatically derived from the model's weights. + custom_pipeline (`str`, *optional*): + + + + 🧪 This is an experimental feature and may change in the future. + + + + Can be either: + + - A string, the *repo id* (for example `hf-internal-testing/diffusers-dummy-pipeline`) of a custom + pipeline hosted on the Hub. The repository must contain a file called pipeline.py that defines + the custom pipeline. + - A string, the *file name* of a community pipeline hosted on GitHub under + [Community](https://github.com/huggingface/diffusers/tree/main/examples/community). Valid file + names must match the file name and not the pipeline script (`clip_guided_stable_diffusion` + instead of `clip_guided_stable_diffusion.py`). Community pipelines are always loaded from the + current main branch of GitHub. + - A path to a directory (`./my_pipeline_directory/`) containing a custom pipeline. The directory + must contain a file called `pipeline.py` that defines the custom pipeline. + + + For more information on how to load and create custom pipelines, please have a look at [Loading and + Adding Custom + Pipelines](https://huggingface.co/docs/diffusers/using-diffusers/custom_pipeline_overview) + + force_download (`bool`, *optional*, defaults to `False`): + Whether or not to force the (re-)download of the model weights and configuration files, overriding the + cached versions if they exist. + cache_dir (`Union[str, os.PathLike]`, *optional*): + Path to a directory where a downloaded pretrained model configuration is cached if the standard cache + is not used. + resume_download (`bool`, *optional*, defaults to `False`): + Whether or not to resume downloading the model weights and configuration files. If set to `False`, any + incompletely downloaded files are deleted. + proxies (`Dict[str, str]`, *optional*): + A dictionary of proxy servers to use by protocol or endpoint, for example, `{'http': 'foo.bar:3128', + 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request. + output_loading_info(`bool`, *optional*, defaults to `False`): + Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages. + local_files_only (`bool`, *optional*, defaults to `False`): + Whether to only load local model weights and configuration files or not. If set to `True`, the model + won't be downloaded from the Hub. + use_auth_token (`str` or *bool*, *optional*): + The token to use as HTTP bearer authorization for remote files. If `True`, the token generated from + `diffusers-cli login` (stored in `~/.huggingface`) is used. + revision (`str`, *optional*, defaults to `"main"`): + The specific model version to use. It can be a branch name, a tag name, a commit id, or any identifier + allowed by Git. + custom_revision (`str`, *optional*, defaults to `"main"`): + The specific model version to use. It can be a branch name, a tag name, or a commit id similar to + `revision` when loading a custom pipeline from the Hub. It can be a 🤗 Diffusers version when loading a + custom pipeline from GitHub, otherwise it defaults to `"main"` when loading from the Hub. + mirror (`str`, *optional*): + Mirror source to resolve accessibility issues if you’re downloading a model in China. We do not + guarantee the timeliness or safety of the source, and you should refer to the mirror site for more + information. + device_map (`str` or `Dict[str, Union[int, str, torch.device]]`, *optional*): + A map that specifies where each submodule should go. 
It doesn’t need to be defined for each
+                parameter/buffer name; once a given module name is inside, every submodule of it will be sent to the
+                same device.
+
+                Set `device_map="auto"` to have 🤗 Accelerate automatically compute the most optimized `device_map`. For
+                more information about each option see [designing a device
+                map](https://hf.co/docs/accelerate/main/en/usage_guides/big_modeling#designing-a-device-map).
+            max_memory (`Dict`, *optional*):
+                A dictionary mapping device identifiers to their maximum memory. Will default to the maximum memory
+                available for each GPU and the available CPU RAM if unset.
+            offload_folder (`str` or `os.PathLike`, *optional*):
+                The path to offload weights if device_map contains the value `"disk"`.
+            offload_state_dict (`bool`, *optional*):
+                If `True`, temporarily offloads the CPU state dict to the hard drive to avoid running out of CPU RAM if
+                the weight of the CPU state dict + the biggest shard of the checkpoint does not fit. Defaults to `True`
+                when there is some disk offload.
+            low_cpu_mem_usage (`bool`, *optional*, defaults to `True` if torch version >= 1.9.0 else `False`):
+                Speed up model loading by only loading the pretrained weights and not initializing the weights. This
+                also tries to not use more than 1x model size in CPU memory (including peak memory) while loading the
+                model. Only supported for PyTorch >= 1.9.0. If you are using an older version of PyTorch, setting this
+                argument to `True` will raise an error.
+            use_safetensors (`bool`, *optional*, defaults to `None`):
+                If set to `None`, the safetensors weights are downloaded if they're available **and** if the
+                safetensors library is installed. If set to `True`, the model is forcibly loaded from safetensors
+                weights. If set to `False`, safetensors weights are not loaded.
+            kwargs (remaining dictionary of keyword arguments, *optional*):
+                Can be used to overwrite load and saveable variables (the pipeline components of the specific pipeline
+                class). The overwritten components are passed directly to the pipeline's `__init__` method. See example
+                below for more information.
+            variant (`str`, *optional*):
+                Load weights from a specified variant filename such as `"fp16"` or `"ema"`. This is ignored when
+                loading `from_flax`.
+
+
+
+        To use private or [gated](https://huggingface.co/docs/hub/models-gated#gated-models) models, log-in with
+        `huggingface-cli login`.
+
+
+
+        Examples:
+
+        ```py
+        >>> from diffusers import DiffusionPipeline
+
+        >>> # Download pipeline from huggingface.co and cache.
+ >>> pipeline = DiffusionPipeline.from_pretrained("CompVis/ldm-text2im-large-256") + + >>> # Download pipeline that requires an authorization token + >>> # For more information on access tokens, please refer to this section + >>> # of the documentation](https://huggingface.co/docs/hub/security-tokens) + >>> pipeline = DiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5") + + >>> # Use a different scheduler + >>> from diffusers import LMSDiscreteScheduler + + >>> scheduler = LMSDiscreteScheduler.from_config(pipeline.scheduler.config) + >>> pipeline.scheduler = scheduler + ``` + """ + cache_dir = kwargs.pop("cache_dir", DIFFUSERS_CACHE) + resume_download = kwargs.pop("resume_download", False) + force_download = kwargs.pop("force_download", False) + proxies = kwargs.pop("proxies", None) + local_files_only = kwargs.pop("local_files_only", HF_HUB_OFFLINE) + use_auth_token = kwargs.pop("use_auth_token", None) + revision = kwargs.pop("revision", None) + from_flax = kwargs.pop("from_flax", False) + torch_dtype = kwargs.pop("torch_dtype", None) + custom_pipeline = kwargs.pop("custom_pipeline", None) + custom_revision = kwargs.pop("custom_revision", None) + provider = kwargs.pop("provider", None) + sess_options = kwargs.pop("sess_options", None) + device_map = kwargs.pop("device_map", None) + max_memory = kwargs.pop("max_memory", None) + offload_folder = kwargs.pop("offload_folder", None) + offload_state_dict = kwargs.pop("offload_state_dict", False) + low_cpu_mem_usage = kwargs.pop("low_cpu_mem_usage", _LOW_CPU_MEM_USAGE_DEFAULT) + variant = kwargs.pop("variant", None) + use_safetensors = kwargs.pop("use_safetensors", None if is_safetensors_available() else False) + + # 1. Download the checkpoints and configs + # use snapshot download here to get it working from from_pretrained + if not os.path.isdir(pretrained_model_name_or_path): + cached_folder = cls.download( + pretrained_model_name_or_path, + cache_dir=cache_dir, + resume_download=resume_download, + force_download=force_download, + proxies=proxies, + local_files_only=local_files_only, + use_auth_token=use_auth_token, + revision=revision, + from_flax=from_flax, + use_safetensors=use_safetensors, + custom_pipeline=custom_pipeline, + custom_revision=custom_revision, + variant=variant, + **kwargs, + ) + else: + cached_folder = pretrained_model_name_or_path + + config_dict = cls.load_config(cached_folder) + + # pop out "_ignore_files" as it is only needed for download + config_dict.pop("_ignore_files", None) + + # 2. Define which model components should load variants + # We retrieve the information by matching whether variant + # model checkpoints exist in the subfolders + model_variants = {} + if variant is not None: + for folder in os.listdir(cached_folder): + folder_path = os.path.join(cached_folder, folder) + is_folder = os.path.isdir(folder_path) and folder in config_dict + variant_exists = is_folder and any( + p.split(".")[1].startswith(variant) for p in os.listdir(folder_path) + ) + if variant_exists: + model_variants[folder] = variant + + # 3. 
Load the pipeline class, if using custom module then load it from the hub + # if we load from explicit class, let's use it + pipeline_class = _get_pipeline_class( + cls, config_dict, custom_pipeline=custom_pipeline, cache_dir=cache_dir, revision=custom_revision + ) + + # DEPRECATED: To be removed in 1.0.0 + if pipeline_class.__name__ == "StableDiffusionInpaintPipeline" and version.parse( + version.parse(config_dict["_diffusers_version"]).base_version + ) <= version.parse("0.5.1"): + from diffusers import StableDiffusionInpaintPipeline, StableDiffusionInpaintPipelineLegacy + + pipeline_class = StableDiffusionInpaintPipelineLegacy + + deprecation_message = ( + "You are using a legacy checkpoint for inpainting with Stable Diffusion, therefore we are loading the" + f" {StableDiffusionInpaintPipelineLegacy} class instead of {StableDiffusionInpaintPipeline}. For" + " better inpainting results, we strongly suggest using Stable Diffusion's official inpainting" + " checkpoint: https://huggingface.co/runwayml/stable-diffusion-inpainting instead or adapting your" + f" checkpoint {pretrained_model_name_or_path} to the format of" + " https://huggingface.co/runwayml/stable-diffusion-inpainting. Note that we do not actively maintain" + " the {StableDiffusionInpaintPipelineLegacy} class and will likely remove it in version 1.0.0." + ) + deprecate("StableDiffusionInpaintPipelineLegacy", "1.0.0", deprecation_message, standard_warn=False) + + # 4. Define expected modules given pipeline signature + # and define non-None initialized modules (=`init_kwargs`) + + # some modules can be passed directly to the init + # in this case they are already instantiated in `kwargs` + # extract them here + expected_modules, optional_kwargs = cls._get_signature_keys(pipeline_class) + passed_class_obj = {k: kwargs.pop(k) for k in expected_modules if k in kwargs} + passed_pipe_kwargs = {k: kwargs.pop(k) for k in optional_kwargs if k in kwargs} + + init_dict, unused_kwargs, _ = pipeline_class.extract_init_dict(config_dict, **kwargs) + + # define init kwargs + init_kwargs = {k: init_dict.pop(k) for k in optional_kwargs if k in init_dict} + init_kwargs = {**init_kwargs, **passed_pipe_kwargs} + + # remove `null` components + def load_module(name, value): + if value[0] is None: + return False + if name in passed_class_obj and passed_class_obj[name] is None: + return False + return True + + init_dict = {k: v for k, v in init_dict.items() if load_module(k, v)} + + # Special case: safety_checker must be loaded separately when using `from_flax` + if from_flax and "safety_checker" in init_dict and "safety_checker" not in passed_class_obj: + raise NotImplementedError( + "The safety checker cannot be automatically loaded when loading weights `from_flax`." + " Please, pass `safety_checker=None` to `from_pretrained`, and load the safety checker" + " separately if you need it." + ) + + # 5. Throw nice warnings / errors for fast accelerate loading + if len(unused_kwargs) > 0: + logger.warning( + f"Keyword arguments {unused_kwargs} are not expected by {pipeline_class.__name__} and will be ignored." + ) + + if low_cpu_mem_usage and not is_accelerate_available(): + low_cpu_mem_usage = False + logger.warning( + "Cannot initialize model with low cpu memory usage because `accelerate` was not found in the" + " environment. Defaulting to `low_cpu_mem_usage=False`. It is strongly recommended to install" + " `accelerate` for faster and less memory-intense model loading. You can do so with: \n```\npip" + " install accelerate\n```\n." 
+ ) + + if device_map is not None and not is_torch_version(">=", "1.9.0"): + raise NotImplementedError( + "Loading and dispatching requires torch >= 1.9.0. Please either update your PyTorch version or set" + " `device_map=None`." + ) + + if low_cpu_mem_usage is True and not is_torch_version(">=", "1.9.0"): + raise NotImplementedError( + "Low memory initialization requires torch >= 1.9.0. Please either update your PyTorch version or set" + " `low_cpu_mem_usage=False`." + ) + + if low_cpu_mem_usage is False and device_map is not None: + raise ValueError( + f"You cannot set `low_cpu_mem_usage` to False while using device_map={device_map} for loading and" + " dispatching. Please make sure to set `low_cpu_mem_usage=True`." + ) + + # import it here to avoid circular import + from diffusers import pipelines + + # 6. Load each module in the pipeline + for name, (library_name, class_name) in init_dict.items(): + # 6.1 - now that JAX/Flax is an official framework of the library, we might load from Flax names + if class_name.startswith("Flax"): + class_name = class_name[4:] + + # 6.2 Define all importable classes + is_pipeline_module = hasattr(pipelines, library_name) + importable_classes = ALL_IMPORTABLE_CLASSES + loaded_sub_model = None + + # 6.3 Use passed sub model or load class_name from library_name + if name in passed_class_obj: + # if the model is in a pipeline module, then we load it from the pipeline + # check that passed_class_obj has correct parent class + maybe_raise_or_warn( + library_name, library, class_name, importable_classes, passed_class_obj, name, is_pipeline_module + ) + + loaded_sub_model = passed_class_obj[name] + else: + # load sub model + loaded_sub_model = load_sub_model( + library_name=library_name, + class_name=class_name, + importable_classes=importable_classes, + pipelines=pipelines, + is_pipeline_module=is_pipeline_module, + pipeline_class=pipeline_class, + torch_dtype=torch_dtype, + provider=provider, + sess_options=sess_options, + device_map=device_map, + max_memory=max_memory, + offload_folder=offload_folder, + offload_state_dict=offload_state_dict, + model_variants=model_variants, + name=name, + from_flax=from_flax, + variant=variant, + low_cpu_mem_usage=low_cpu_mem_usage, + cached_folder=cached_folder, + ) + + init_kwargs[name] = loaded_sub_model # UNet(...), # DiffusionSchedule(...) + + # 7. Potentially add passed objects if expected + missing_modules = set(expected_modules) - set(init_kwargs.keys()) + passed_modules = list(passed_class_obj.keys()) + optional_modules = pipeline_class._optional_components + if len(missing_modules) > 0 and missing_modules <= set(passed_modules + optional_modules): + for module in missing_modules: + init_kwargs[module] = passed_class_obj.get(module, None) + elif len(missing_modules) > 0: + passed_modules = set(list(init_kwargs.keys()) + list(passed_class_obj.keys())) - optional_kwargs + raise ValueError( + f"Pipeline {pipeline_class} expected {expected_modules}, but only {passed_modules} were passed." + ) + + # 8. Instantiate the pipeline + model = pipeline_class(**init_kwargs) + return model + + @classmethod + def download(cls, pretrained_model_name, **kwargs) -> Union[str, os.PathLike]: + r""" + Download and cache a PyTorch diffusion pipeline from pretrained pipeline weights. + + Parameters: + pretrained_model_name (`str` or `os.PathLike`, *optional*): + A string, the *repository id* (for example `CompVis/ldm-text2im-large-256`) of a pretrained pipeline + hosted on the Hub. 
+ custom_pipeline (`str`, *optional*): + Can be either: + + - A string, the *repository id* (for example `CompVis/ldm-text2im-large-256`) of a pretrained + pipeline hosted on the Hub. The repository must contain a file called `pipeline.py` that defines + the custom pipeline. + + - A string, the *file name* of a community pipeline hosted on GitHub under + [Community](https://github.com/huggingface/diffusers/tree/main/examples/community). Valid file + names must match the file name and not the pipeline script (`clip_guided_stable_diffusion` + instead of `clip_guided_stable_diffusion.py`). Community pipelines are always loaded from the + current `main` branch of GitHub. + + - A path to a *directory* (`./my_pipeline_directory/`) containing a custom pipeline. The directory + must contain a file called `pipeline.py` that defines the custom pipeline. + + + + 🧪 This is an experimental feature and may change in the future. + + + + For more information on how to load and create custom pipelines, take a look at [How to contribute a + community pipeline](https://huggingface.co/docs/diffusers/main/en/using-diffusers/contribute_pipeline). + + force_download (`bool`, *optional*, defaults to `False`): + Whether or not to force the (re-)download of the model weights and configuration files, overriding the + cached versions if they exist. + resume_download (`bool`, *optional*, defaults to `False`): + Whether or not to resume downloading the model weights and configuration files. If set to `False`, any + incompletely downloaded files are deleted. + proxies (`Dict[str, str]`, *optional*): + A dictionary of proxy servers to use by protocol or endpoint, for example, `{'http': 'foo.bar:3128', + 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request. + output_loading_info(`bool`, *optional*, defaults to `False`): + Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages. + local_files_only (`bool`, *optional*, defaults to `False`): + Whether to only load local model weights and configuration files or not. If set to `True`, the model + won't be downloaded from the Hub. + use_auth_token (`str` or *bool*, *optional*): + The token to use as HTTP bearer authorization for remote files. If `True`, the token generated from + `diffusers-cli login` (stored in `~/.huggingface`) is used. + revision (`str`, *optional*, defaults to `"main"`): + The specific model version to use. It can be a branch name, a tag name, a commit id, or any identifier + allowed by Git. + custom_revision (`str`, *optional*, defaults to `"main"`): + The specific model version to use. It can be a branch name, a tag name, or a commit id similar to + `revision` when loading a custom pipeline from the Hub. It can be a 🤗 Diffusers version when loading a + custom pipeline from GitHub, otherwise it defaults to `"main"` when loading from the Hub. + mirror (`str`, *optional*): + Mirror source to resolve accessibility issues if you're downloading a model in China. We do not + guarantee the timeliness or safety of the source, and you should refer to the mirror site for more + information. + variant (`str`, *optional*): + Load weights from a specified variant filename such as `"fp16"` or `"ema"`. This is ignored when + loading `from_flax`. + + Returns: + `os.PathLike`: + A path to the downloaded pipeline. + + + + To use private or [gated models](https://huggingface.co/docs/hub/models-gated#gated-models), log-in with + `huggingface-cli login`. 
+ + + + """ + cache_dir = kwargs.pop("cache_dir", DIFFUSERS_CACHE) + resume_download = kwargs.pop("resume_download", False) + force_download = kwargs.pop("force_download", False) + proxies = kwargs.pop("proxies", None) + local_files_only = kwargs.pop("local_files_only", HF_HUB_OFFLINE) + use_auth_token = kwargs.pop("use_auth_token", None) + revision = kwargs.pop("revision", None) + from_flax = kwargs.pop("from_flax", False) + custom_pipeline = kwargs.pop("custom_pipeline", None) + custom_revision = kwargs.pop("custom_revision", None) + variant = kwargs.pop("variant", None) + use_safetensors = kwargs.pop("use_safetensors", None) + + if use_safetensors and not is_safetensors_available(): + raise ValueError( + "`use_safetensors`=True but safetensors is not installed. Please install safetensors with `pip install safetensors" + ) + + allow_pickle = False + if use_safetensors is None: + use_safetensors = is_safetensors_available() + allow_pickle = True + + pipeline_is_cached = False + allow_patterns = None + ignore_patterns = None + + if not local_files_only: + try: + info = model_info( + pretrained_model_name, + use_auth_token=use_auth_token, + revision=revision, + ) + except HTTPError as e: + logger.warn(f"Couldn't connect to the Hub: {e}.\nWill try to load from local cache.") + local_files_only = True + + if not local_files_only: + config_file = hf_hub_download( + pretrained_model_name, + cls.config_name, + cache_dir=cache_dir, + revision=revision, + proxies=proxies, + force_download=force_download, + resume_download=resume_download, + use_auth_token=use_auth_token, + ) + + config_dict = cls._dict_from_json_file(config_file) + + ignore_filenames = config_dict.pop("_ignore_files", []) + + # retrieve all folder_names that contain relevant files + folder_names = [k for k, v in config_dict.items() if isinstance(v, list)] + + filenames = {sibling.rfilename for sibling in info.siblings} + model_filenames, variant_filenames = variant_compatible_siblings(filenames, variant=variant) + + if len(variant_filenames) == 0 and variant is not None: + deprecation_message = ( + f"You are trying to load the model files of the `variant={variant}`, but no such modeling files are available." + f"The default model files: {model_filenames} will be loaded instead. Make sure to not load from `variant={variant}`" + "if such variant modeling files are not available. Doing so will lead to an error in v0.22.0 as defaulting to non-variant" + "modeling files is deprecated." + ) + deprecate("no variant default", "0.22.0", deprecation_message, standard_warn=False) + + # remove ignored filenames + model_filenames = set(model_filenames) - set(ignore_filenames) + variant_filenames = set(variant_filenames) - set(ignore_filenames) + + # if the whole pipeline is cached we don't have to ping the Hub + if revision in DEPRECATED_REVISION_ARGS and version.parse( + version.parse(__version__).base_version + ) >= version.parse("0.20.0"): + warn_deprecated_model_variant( + pretrained_model_name, use_auth_token, variant, revision, model_filenames + ) + + model_folder_names = {os.path.split(f)[0] for f in model_filenames} + + # all filenames compatible with variant will be added + allow_patterns = list(model_filenames) + + # allow all patterns from non-model folders + # this enables downloading schedulers, tokenizers, ... 
+            allow_patterns += [f"{k}/*" for k in folder_names if k not in model_folder_names]
+            # also allow downloading config.json files with the model
+            allow_patterns += [os.path.join(k, "config.json") for k in model_folder_names]
+
+            allow_patterns += [
+                SCHEDULER_CONFIG_NAME,
+                CONFIG_NAME,
+                cls.config_name,
+                CUSTOM_PIPELINE_FILE_NAME,
+            ]
+
+            # retrieve passed components that should not be downloaded
+            pipeline_class = _get_pipeline_class(
+                cls, config_dict, custom_pipeline=custom_pipeline, cache_dir=cache_dir, revision=custom_revision
+            )
+            expected_components, _ = cls._get_signature_keys(pipeline_class)
+            passed_components = [k for k in expected_components if k in kwargs]
+
+            if (
+                use_safetensors
+                and not allow_pickle
+                and not is_safetensors_compatible(
+                    model_filenames, variant=variant, passed_components=passed_components
+                )
+            ):
+                raise EnvironmentError(
+                    f"Could not find the necessary `safetensors` weights in {model_filenames} (variant={variant})"
+                )
+            if from_flax:
+                ignore_patterns = ["*.bin", "*.safetensors", "*.onnx", "*.pb"]
+            elif use_safetensors and is_safetensors_compatible(
+                model_filenames, variant=variant, passed_components=passed_components
+            ):
+                ignore_patterns = ["*.bin", "*.msgpack"]
+
+                safetensors_variant_filenames = {f for f in variant_filenames if f.endswith(".safetensors")}
+                safetensors_model_filenames = {f for f in model_filenames if f.endswith(".safetensors")}
+                if (
+                    len(safetensors_variant_filenames) > 0
+                    and safetensors_model_filenames != safetensors_variant_filenames
+                ):
+                    logger.warn(
+                        f"\nA mixture of {variant} and non-{variant} filenames will be loaded.\nLoaded {variant} filenames:\n[{', '.join(safetensors_variant_filenames)}]\nLoaded non-{variant} filenames:\n[{', '.join(safetensors_model_filenames - safetensors_variant_filenames)}]\nIf this behavior is not expected, please check your folder structure."
+                    )
+            else:
+                ignore_patterns = ["*.safetensors", "*.msgpack"]
+
+                bin_variant_filenames = {f for f in variant_filenames if f.endswith(".bin")}
+                bin_model_filenames = {f for f in model_filenames if f.endswith(".bin")}
+                if len(bin_variant_filenames) > 0 and bin_model_filenames != bin_variant_filenames:
+                    logger.warn(
+                        f"\nA mixture of {variant} and non-{variant} filenames will be loaded.\nLoaded {variant} filenames:\n[{', '.join(bin_variant_filenames)}]\nLoaded non-{variant} filenames:\n[{', '.join(bin_model_filenames - bin_variant_filenames)}]\nIf this behavior is not expected, please check your folder structure."
+ ) + + # Don't download any objects that are passed + allow_patterns = [ + p for p in allow_patterns if not (len(p.split("/")) == 2 and p.split("/")[0] in passed_components) + ] + # Don't download index files of forbidden patterns either + ignore_patterns = ignore_patterns + [f"{i}.index.*json" for i in ignore_patterns] + + re_ignore_pattern = [re.compile(fnmatch.translate(p)) for p in ignore_patterns] + re_allow_pattern = [re.compile(fnmatch.translate(p)) for p in allow_patterns] + + expected_files = [f for f in filenames if not any(p.match(f) for p in re_ignore_pattern)] + expected_files = [f for f in expected_files if any(p.match(f) for p in re_allow_pattern)] + + snapshot_folder = Path(config_file).parent + pipeline_is_cached = all((snapshot_folder / f).is_file() for f in expected_files) + + if pipeline_is_cached and not force_download: + # if the pipeline is cached, we can directly return it + # else call snapshot_download + return snapshot_folder + + user_agent = {"pipeline_class": cls.__name__} + if custom_pipeline is not None and not custom_pipeline.endswith(".py"): + user_agent["custom_pipeline"] = custom_pipeline + + # download all allow_patterns - ignore_patterns + cached_folder = snapshot_download( + pretrained_model_name, + cache_dir=cache_dir, + resume_download=resume_download, + proxies=proxies, + local_files_only=local_files_only, + use_auth_token=use_auth_token, + revision=revision, + allow_patterns=allow_patterns, + ignore_patterns=ignore_patterns, + user_agent=user_agent, + ) + + return cached_folder + + @staticmethod + def _get_signature_keys(obj): + parameters = inspect.signature(obj.__init__).parameters + required_parameters = {k: v for k, v in parameters.items() if v.default == inspect._empty} + optional_parameters = set({k for k, v in parameters.items() if v.default != inspect._empty}) + expected_modules = set(required_parameters.keys()) - {"self"} + return expected_modules, optional_parameters + + @property + def components(self) -> Dict[str, Any]: + r""" + The `self.components` property can be useful to run different pipelines with the same weights and + configurations without reallocating additional memory. + + Returns (`dict`): + A dictionary containing all the modules needed to initialize the pipeline. + + Examples: + + ```py + >>> from diffusers import ( + ... StableDiffusionPipeline, + ... StableDiffusionImg2ImgPipeline, + ... StableDiffusionInpaintPipeline, + ... ) + + >>> text2img = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5") + >>> img2img = StableDiffusionImg2ImgPipeline(**text2img.components) + >>> inpaint = StableDiffusionInpaintPipeline(**text2img.components) + ``` + """ + expected_modules, optional_parameters = self._get_signature_keys(self) + components = { + k: getattr(self, k) for k in self.config.keys() if not k.startswith("_") and k not in optional_parameters + } + + if set(components.keys()) != expected_modules: + raise ValueError( + f"{self} has been incorrectly initialized or {self.__class__} is incorrectly implemented. Expected" + f" {expected_modules} to be defined, but {components.keys()} are defined." + ) + + return components + + @staticmethod + def numpy_to_pil(images): + """ + Convert a NumPy image or a batch of images to a PIL image. 
+ """ + return numpy_to_pil(images) + + def progress_bar(self, iterable=None, total=None): + if not hasattr(self, "_progress_bar_config"): + self._progress_bar_config = {} + elif not isinstance(self._progress_bar_config, dict): + raise ValueError( + f"`self._progress_bar_config` should be of type `dict`, but is {type(self._progress_bar_config)}." + ) + + if iterable is not None: + return tqdm(iterable, **self._progress_bar_config) + elif total is not None: + return tqdm(total=total, **self._progress_bar_config) + else: + raise ValueError("Either `total` or `iterable` has to be defined.") + + def set_progress_bar_config(self, **kwargs): + self._progress_bar_config = kwargs + + def enable_xformers_memory_efficient_attention(self, attention_op: Optional[Callable] = None): + r""" + Enable memory efficient attention from [xFormers](https://facebookresearch.github.io/xformers/). + + When this option is enabled, you should observe lower GPU memory usage and a potential speed up during + inference. Speed up during training is not guaranteed. + + + + ⚠️ When memory efficient attention and sliced attention are both enabled, memory efficient attention takes + precedent. + + + + Parameters: + attention_op (`Callable`, *optional*): + Override the default `None` operator for use as `op` argument to the + [`memory_efficient_attention()`](https://facebookresearch.github.io/xformers/components/ops.html#xformers.ops.memory_efficient_attention) + function of xFormers. + + Examples: + + ```py + >>> import torch + >>> from diffusers import DiffusionPipeline + >>> from xformers.ops import MemoryEfficientAttentionFlashAttentionOp + + >>> pipe = DiffusionPipeline.from_pretrained("stabilityai/stable-diffusion-2-1", torch_dtype=torch.float16) + >>> pipe = pipe.to("cuda") + >>> pipe.enable_xformers_memory_efficient_attention(attention_op=MemoryEfficientAttentionFlashAttentionOp) + >>> # Workaround for not accepting attention shape using VAE for Flash Attention + >>> pipe.vae.enable_xformers_memory_efficient_attention(attention_op=None) + ``` + """ + self.set_use_memory_efficient_attention_xformers(True, attention_op) + + def disable_xformers_memory_efficient_attention(self): + r""" + Disable memory efficient attention from [xFormers](https://facebookresearch.github.io/xformers/). + """ + self.set_use_memory_efficient_attention_xformers(False) + + def set_use_memory_efficient_attention_xformers( + self, valid: bool, attention_op: Optional[Callable] = None + ) -> None: + # Recursively walk through all the children. + # Any children which exposes the set_use_memory_efficient_attention_xformers method + # gets the message + def fn_recursive_set_mem_eff(module: torch.nn.Module): + if hasattr(module, "set_use_memory_efficient_attention_xformers"): + module.set_use_memory_efficient_attention_xformers(valid, attention_op) + + for child in module.children(): + fn_recursive_set_mem_eff(child) + + module_names, _ = self._get_signature_keys(self) + modules = [getattr(self, n, None) for n in module_names] + modules = [m for m in modules if isinstance(m, torch.nn.Module)] + + for module in modules: + fn_recursive_set_mem_eff(module) + + def enable_attention_slicing(self, slice_size: Optional[Union[str, int]] = "auto"): + r""" + Enable sliced attention computation. + + When this option is enabled, the attention module splits the input tensor in slices to compute attention in + several steps. This is useful to save some memory in exchange for a small speed decrease. 
+ + Args: + slice_size (`str` or `int`, *optional*, defaults to `"auto"`): + When `"auto"`, halves the input to the attention heads, so attention will be computed in two steps. If + `"max"`, maximum amount of memory will be saved by running only one slice at a time. If a number is + provided, uses as many slices as `attention_head_dim // slice_size`. In this case, `attention_head_dim` + must be a multiple of `slice_size`. + """ + self.set_attention_slice(slice_size) + + def disable_attention_slicing(self): + r""" + Disable sliced attention computation. If `enable_attention_slicing` was previously called, attention is + computed in one step. + """ + # set slice_size = `None` to disable `attention slicing` + self.enable_attention_slicing(None) + + def set_attention_slice(self, slice_size: Optional[int]): + module_names, _ = self._get_signature_keys(self) + modules = [getattr(self, n, None) for n in module_names] + modules = [m for m in modules if isinstance(m, torch.nn.Module) and hasattr(m, "set_attention_slice")] + + for module in modules: + module.set_attention_slice(slice_size) diff --git a/gf/sd-webui-rich-text/diffusers_official/pipelines/stable_diffusion_xl/__init__.py b/gf/sd-webui-rich-text/diffusers_official/pipelines/stable_diffusion_xl/__init__.py new file mode 100644 index 0000000000000000000000000000000000000000..d61ba9fab3a35142fbb2e75e1c92e40781357b58 --- /dev/null +++ b/gf/sd-webui-rich-text/diffusers_official/pipelines/stable_diffusion_xl/__init__.py @@ -0,0 +1,26 @@ +from dataclasses import dataclass +from typing import List, Optional, Union + +import numpy as np +import PIL + +from ...utils import BaseOutput, is_invisible_watermark_available, is_torch_available, is_transformers_available + + +@dataclass +class StableDiffusionXLPipelineOutput(BaseOutput): + """ + Output class for Stable Diffusion pipelines. + + Args: + images (`List[PIL.Image.Image]` or `np.ndarray`) + List of denoised PIL images of length `batch_size` or numpy array of shape `(batch_size, height, width, + num_channels)`. PIL images or numpy array present the denoised images of the diffusion pipeline. 
+ """ + + images: Union[List[PIL.Image.Image], np.ndarray] + + +if is_transformers_available() and is_torch_available() and is_invisible_watermark_available(): + from .pipeline_stable_diffusion_xl import StableDiffusionXLPipeline + from .pipeline_stable_diffusion_xl_img2img import StableDiffusionXLImg2ImgPipeline diff --git a/gf/sd-webui-rich-text/diffusers_official/pipelines/stable_diffusion_xl/__pycache__/__init__.cpython-310.pyc b/gf/sd-webui-rich-text/diffusers_official/pipelines/stable_diffusion_xl/__pycache__/__init__.cpython-310.pyc new file mode 100644 index 0000000000000000000000000000000000000000..5cb34db2d3e5fba4ad4d6297f71435502dbbbbd6 Binary files /dev/null and b/gf/sd-webui-rich-text/diffusers_official/pipelines/stable_diffusion_xl/__pycache__/__init__.cpython-310.pyc differ diff --git a/gf/sd-webui-rich-text/diffusers_official/pipelines/stable_diffusion_xl/__pycache__/pipeline_stable_diffusion_xl.cpython-310.pyc b/gf/sd-webui-rich-text/diffusers_official/pipelines/stable_diffusion_xl/__pycache__/pipeline_stable_diffusion_xl.cpython-310.pyc new file mode 100644 index 0000000000000000000000000000000000000000..224d14046f46dd605378d6d253114f8a9d5b0bc9 Binary files /dev/null and b/gf/sd-webui-rich-text/diffusers_official/pipelines/stable_diffusion_xl/__pycache__/pipeline_stable_diffusion_xl.cpython-310.pyc differ diff --git a/gf/sd-webui-rich-text/diffusers_official/pipelines/stable_diffusion_xl/__pycache__/pipeline_stable_diffusion_xl_img2img.cpython-310.pyc b/gf/sd-webui-rich-text/diffusers_official/pipelines/stable_diffusion_xl/__pycache__/pipeline_stable_diffusion_xl_img2img.cpython-310.pyc new file mode 100644 index 0000000000000000000000000000000000000000..7091974546723d24d0a76f3d88480aebb071f012 Binary files /dev/null and b/gf/sd-webui-rich-text/diffusers_official/pipelines/stable_diffusion_xl/__pycache__/pipeline_stable_diffusion_xl_img2img.cpython-310.pyc differ diff --git a/gf/sd-webui-rich-text/diffusers_official/pipelines/stable_diffusion_xl/__pycache__/watermark.cpython-310.pyc b/gf/sd-webui-rich-text/diffusers_official/pipelines/stable_diffusion_xl/__pycache__/watermark.cpython-310.pyc new file mode 100644 index 0000000000000000000000000000000000000000..238099ee30008b2437f72cfb0605b212f51aa3a2 Binary files /dev/null and b/gf/sd-webui-rich-text/diffusers_official/pipelines/stable_diffusion_xl/__pycache__/watermark.cpython-310.pyc differ diff --git a/gf/sd-webui-rich-text/diffusers_official/pipelines/stable_diffusion_xl/pipeline_stable_diffusion_xl.py b/gf/sd-webui-rich-text/diffusers_official/pipelines/stable_diffusion_xl/pipeline_stable_diffusion_xl.py new file mode 100644 index 0000000000000000000000000000000000000000..0da53c95c4dbdc36e4d57b6d53271625f8e301d9 --- /dev/null +++ b/gf/sd-webui-rich-text/diffusers_official/pipelines/stable_diffusion_xl/pipeline_stable_diffusion_xl.py @@ -0,0 +1,823 @@ +# Copyright 2023 The HuggingFace Team. All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +import inspect +from typing import Any, Callable, Dict, List, Optional, Tuple, Union + +import torch +from transformers import CLIPTextModel, CLIPTextModelWithProjection, CLIPTokenizer + +from ...image_processor import VaeImageProcessor +from ...loaders import FromSingleFileMixin, LoraLoaderMixin, TextualInversionLoaderMixin +from ...models import AutoencoderKL, UNet2DConditionModel +from ...models.attention_processor import ( + AttnProcessor2_0, + LoRAAttnProcessor2_0, + LoRAXFormersAttnProcessor, + XFormersAttnProcessor, +) +from ...schedulers import KarrasDiffusionSchedulers +from ...utils import ( + is_accelerate_available, + is_accelerate_version, + logging, + randn_tensor, + replace_example_docstring, +) +from ..pipeline_utils import DiffusionPipeline +from . import StableDiffusionXLPipelineOutput +from .watermark import StableDiffusionXLWatermarker + + +logger = logging.get_logger(__name__) # pylint: disable=invalid-name + +EXAMPLE_DOC_STRING = """ + Examples: + ```py + >>> import torch + >>> from diffusers import StableDiffusionXLPipeline + + >>> pipe = StableDiffusionXLPipeline.from_pretrained( + ... "stabilityai/stable-diffusion-xl-base-0.9", torch_dtype=torch.float16 + ... ) + >>> pipe = pipe.to("cuda") + + >>> prompt = "a photo of an astronaut riding a horse on mars" + >>> image = pipe(prompt).images[0] + ``` +""" + + +def rescale_noise_cfg(noise_cfg, noise_pred_text, guidance_rescale=0.0): + """ + Rescale `noise_cfg` according to `guidance_rescale`. Based on findings of [Common Diffusion Noise Schedules and + Sample Steps are Flawed](https://arxiv.org/pdf/2305.08891.pdf). See Section 3.4 + """ + std_text = noise_pred_text.std(dim=list(range(1, noise_pred_text.ndim)), keepdim=True) + std_cfg = noise_cfg.std(dim=list(range(1, noise_cfg.ndim)), keepdim=True) + # rescale the results from guidance (fixes overexposure) + noise_pred_rescaled = noise_cfg * (std_text / std_cfg) + # mix with the original results from guidance by factor guidance_rescale to avoid "plain looking" images + noise_cfg = guidance_rescale * noise_pred_rescaled + (1 - guidance_rescale) * noise_cfg + return noise_cfg + + +class StableDiffusionXLPipeline(DiffusionPipeline, FromSingleFileMixin): + r""" + Pipeline for text-to-image generation using Stable Diffusion. + + This model inherits from [`DiffusionPipeline`]. Check the superclass documentation for the generic methods the + library implements for all the pipelines (such as downloading or saving, running on a particular device, etc.) + + In addition the pipeline inherits the following loading methods: + - *Textual-Inversion*: [`loaders.TextualInversionLoaderMixin.load_textual_inversion`] + - *LoRA*: [`loaders.LoraLoaderMixin.load_lora_weights`] + - *Ckpt*: [`loaders.FromSingleFileMixin.from_single_file`] + + as well as the following saving methods: + - *LoRA*: [`loaders.LoraLoaderMixin.save_lora_weights`] + + Args: + vae ([`AutoencoderKL`]): + Variational Auto-Encoder (VAE) Model to encode and decode images to and from latent representations. + text_encoder ([`CLIPTextModel`]): + Frozen text-encoder. Stable Diffusion uses the text portion of + [CLIP](https://huggingface.co/docs/transformers/model_doc/clip#transformers.CLIPTextModel), specifically + the [clip-vit-large-patch14](https://huggingface.co/openai/clip-vit-large-patch14) variant. + tokenizer (`CLIPTokenizer`): + Tokenizer of class + [CLIPTokenizer](https://huggingface.co/docs/transformers/v4.21.0/en/model_doc/clip#transformers.CLIPTokenizer). 
+ unet ([`UNet2DConditionModel`]): Conditional U-Net architecture to denoise the encoded image latents. + scheduler ([`SchedulerMixin`]): + A scheduler to be used in combination with `unet` to denoise the encoded image latents. Can be one of + [`DDIMScheduler`], [`LMSDiscreteScheduler`], or [`PNDMScheduler`]. + """ + + def __init__( + self, + vae: AutoencoderKL, + text_encoder: CLIPTextModel, + text_encoder_2: CLIPTextModelWithProjection, + tokenizer: CLIPTokenizer, + tokenizer_2: CLIPTokenizer, + unet: UNet2DConditionModel, + scheduler: KarrasDiffusionSchedulers, + force_zeros_for_empty_prompt: bool = True, + ): + super().__init__() + + self.register_modules( + vae=vae, + text_encoder=text_encoder, + text_encoder_2=text_encoder_2, + tokenizer=tokenizer, + tokenizer_2=tokenizer_2, + unet=unet, + scheduler=scheduler, + ) + self.register_to_config(force_zeros_for_empty_prompt=force_zeros_for_empty_prompt) + self.vae_scale_factor = 2 ** (len(self.vae.config.block_out_channels) - 1) + self.image_processor = VaeImageProcessor(vae_scale_factor=self.vae_scale_factor) + self.default_sample_size = self.unet.config.sample_size + + self.watermark = StableDiffusionXLWatermarker() + + # Copied from diffusers.pipelines.stable_diffusion.pipeline_stable_diffusion.StableDiffusionPipeline.enable_vae_slicing + def enable_vae_slicing(self): + r""" + Enable sliced VAE decoding. + + When this option is enabled, the VAE will split the input tensor in slices to compute decoding in several + steps. This is useful to save some memory and allow larger batch sizes. + """ + self.vae.enable_slicing() + + # Copied from diffusers.pipelines.stable_diffusion.pipeline_stable_diffusion.StableDiffusionPipeline.disable_vae_slicing + def disable_vae_slicing(self): + r""" + Disable sliced VAE decoding. If `enable_vae_slicing` was previously invoked, this method will go back to + computing decoding in one step. + """ + self.vae.disable_slicing() + + # Copied from diffusers.pipelines.stable_diffusion.pipeline_stable_diffusion.StableDiffusionPipeline.enable_vae_tiling + def enable_vae_tiling(self): + r""" + Enable tiled VAE decoding. + + When this option is enabled, the VAE will split the input tensor into tiles to compute decoding and encoding in + several steps. This is useful to save a large amount of memory and to allow the processing of larger images. + """ + self.vae.enable_tiling() + + # Copied from diffusers.pipelines.stable_diffusion.pipeline_stable_diffusion.StableDiffusionPipeline.disable_vae_tiling + def disable_vae_tiling(self): + r""" + Disable tiled VAE decoding. If `enable_vae_tiling` was previously invoked, this method will go back to + computing decoding in one step. + """ + self.vae.disable_tiling() + + def enable_sequential_cpu_offload(self, gpu_id=0): + r""" + Offloads all models to CPU using accelerate, significantly reducing memory usage. When called, unet, + text_encoder, vae and safety checker have their state dicts saved to CPU and then are moved to a + `torch.device('meta') and loaded to GPU only when their specific submodule has its `forward` method called. + Note that offloading happens on a submodule basis. Memory savings are higher than with + `enable_model_cpu_offload`, but performance is lower. 
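+
+ A hedged usage sketch (editorial addition, not upstream documentation); it reuses the checkpoint from the
+ module-level example above and assumes a CUDA device is available:
+
+ ```py
+ >>> import torch
+ >>> from diffusers import StableDiffusionXLPipeline
+
+ >>> pipe = StableDiffusionXLPipeline.from_pretrained(
+ ...     "stabilityai/stable-diffusion-xl-base-0.9", torch_dtype=torch.float16
+ ... )
+ >>> pipe.enable_sequential_cpu_offload()  # replaces an explicit pipe.to("cuda"); submodules move on demand
+ >>> image = pipe("a photo of an astronaut riding a horse on mars").images[0]
+ ```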
+ """ + if is_accelerate_available() and is_accelerate_version(">=", "0.14.0"): + from accelerate import cpu_offload + else: + raise ImportError("`enable_sequential_cpu_offload` requires `accelerate v0.14.0` or higher") + + device = torch.device(f"cuda:{gpu_id}") + + if self.device.type != "cpu": + self.to("cpu", silence_dtype_warnings=True) + torch.cuda.empty_cache() # otherwise we don't see the memory savings (but they probably exist) + + for cpu_offloaded_model in [self.unet, self.text_encoder, self.text_encoder_2, self.vae]: + cpu_offload(cpu_offloaded_model, device) + + def enable_model_cpu_offload(self, gpu_id=0): + r""" + Offloads all models to CPU using accelerate, reducing memory usage with a low impact on performance. Compared + to `enable_sequential_cpu_offload`, this method moves one whole model at a time to the GPU when its `forward` + method is called, and the model remains in GPU until the next model runs. Memory savings are lower than with + `enable_sequential_cpu_offload`, but performance is much better due to the iterative execution of the `unet`. + """ + if is_accelerate_available() and is_accelerate_version(">=", "0.17.0.dev0"): + from accelerate import cpu_offload_with_hook + else: + raise ImportError("`enable_model_cpu_offload` requires `accelerate v0.17.0` or higher.") + + device = torch.device(f"cuda:{gpu_id}") + + if self.device.type != "cpu": + self.to("cpu", silence_dtype_warnings=True) + torch.cuda.empty_cache() # otherwise we don't see the memory savings (but they probably exist) + + model_sequence = ( + [self.text_encoder, self.text_encoder_2] if self.text_encoder is not None else [self.text_encoder_2] + ) + model_sequence.extend([self.unet, self.vae]) + + hook = None + for cpu_offloaded_model in model_sequence: + _, hook = cpu_offload_with_hook(cpu_offloaded_model, device, prev_module_hook=hook) + + # We'll offload the last model manually. + self.final_offload_hook = hook + + @property + # Copied from diffusers.pipelines.stable_diffusion.pipeline_stable_diffusion.StableDiffusionPipeline._execution_device + def _execution_device(self): + r""" + Returns the device on which the pipeline's models will be executed. After calling + `pipeline.enable_sequential_cpu_offload()` the execution device can only be inferred from Accelerate's module + hooks. + """ + if not hasattr(self.unet, "_hf_hook"): + return self.device + for module in self.unet.modules(): + if ( + hasattr(module, "_hf_hook") + and hasattr(module._hf_hook, "execution_device") + and module._hf_hook.execution_device is not None + ): + return torch.device(module._hf_hook.execution_device) + return self.device + + def encode_prompt( + self, + prompt, + device: Optional[torch.device] = None, + num_images_per_prompt: int = 1, + do_classifier_free_guidance: bool = True, + negative_prompt=None, + prompt_embeds: Optional[torch.FloatTensor] = None, + negative_prompt_embeds: Optional[torch.FloatTensor] = None, + pooled_prompt_embeds: Optional[torch.FloatTensor] = None, + negative_pooled_prompt_embeds: Optional[torch.FloatTensor] = None, + lora_scale: Optional[float] = None, + ): + r""" + Encodes the prompt into text encoder hidden states. 
+ + Args: + prompt (`str` or `List[str]`, *optional*): + prompt to be encoded + device: (`torch.device`): + torch device + num_images_per_prompt (`int`): + number of images that should be generated per prompt + do_classifier_free_guidance (`bool`): + whether to use classifier free guidance or not + negative_prompt (`str` or `List[str]`, *optional*): + The prompt or prompts not to guide the image generation. If not defined, one has to pass + `negative_prompt_embeds` instead. Ignored when not using guidance (i.e., ignored if `guidance_scale` is + less than `1`). + prompt_embeds (`torch.FloatTensor`, *optional*): + Pre-generated text embeddings. Can be used to easily tweak text inputs, *e.g.* prompt weighting. If not + provided, text embeddings will be generated from `prompt` input argument. + negative_prompt_embeds (`torch.FloatTensor`, *optional*): + Pre-generated negative text embeddings. Can be used to easily tweak text inputs, *e.g.* prompt + weighting. If not provided, negative_prompt_embeds will be generated from `negative_prompt` input + argument. + pooled_prompt_embeds (`torch.FloatTensor`, *optional*): + Pre-generated pooled text embeddings. Can be used to easily tweak text inputs, *e.g.* prompt weighting. + If not provided, pooled text embeddings will be generated from `prompt` input argument. + negative_pooled_prompt_embeds (`torch.FloatTensor`, *optional*): + Pre-generated negative pooled text embeddings. Can be used to easily tweak text inputs, *e.g.* prompt + weighting. If not provided, pooled negative_prompt_embeds will be generated from `negative_prompt` + input argument. + lora_scale (`float`, *optional*): + A lora scale that will be applied to all LoRA layers of the text encoder if LoRA layers are loaded. + """ + device = device or self._execution_device + + # set lora scale so that monkey patched LoRA + # function of text encoder can correctly access it + if lora_scale is not None and isinstance(self, LoraLoaderMixin): + self._lora_scale = lora_scale + + if prompt is not None and isinstance(prompt, str): + batch_size = 1 + elif prompt is not None and isinstance(prompt, list): + batch_size = len(prompt) + else: + batch_size = prompt_embeds.shape[0] + + # Define tokenizers and text encoders + tokenizers = [self.tokenizer, self.tokenizer_2] if self.tokenizer is not None else [self.tokenizer_2] + text_encoders = ( + [self.text_encoder, self.text_encoder_2] if self.text_encoder is not None else [self.text_encoder_2] + ) + + if prompt_embeds is None: + # textual inversion: procecss multi-vector tokens if necessary + prompt_embeds_list = [] + for tokenizer, text_encoder in zip(tokenizers, text_encoders): + if isinstance(self, TextualInversionLoaderMixin): + prompt = self.maybe_convert_prompt(prompt, tokenizer) + + text_inputs = tokenizer( + prompt, + padding="max_length", + max_length=tokenizer.model_max_length, + truncation=True, + return_tensors="pt", + ) + text_input_ids = text_inputs.input_ids + untruncated_ids = tokenizer(prompt, padding="longest", return_tensors="pt").input_ids + + if untruncated_ids.shape[-1] >= text_input_ids.shape[-1] and not torch.equal( + text_input_ids, untruncated_ids + ): + removed_text = tokenizer.batch_decode(untruncated_ids[:, tokenizer.model_max_length - 1 : -1]) + logger.warning( + "The following part of your input was truncated because CLIP can only handle sequences up to" + f" {tokenizer.model_max_length} tokens: {removed_text}" + ) + + prompt_embeds = text_encoder( + text_input_ids.to(device), + output_hidden_states=True, + ) + + # We are 
only ALWAYS interested in the pooled output of the final text encoder + pooled_prompt_embeds = prompt_embeds[0] + prompt_embeds = prompt_embeds.hidden_states[-2] + + bs_embed, seq_len, _ = prompt_embeds.shape + # duplicate text embeddings for each generation per prompt, using mps friendly method + prompt_embeds = prompt_embeds.repeat(1, num_images_per_prompt, 1) + prompt_embeds = prompt_embeds.view(bs_embed * num_images_per_prompt, seq_len, -1) + + prompt_embeds_list.append(prompt_embeds) + + prompt_embeds = torch.concat(prompt_embeds_list, dim=-1) + + # get unconditional embeddings for classifier free guidance + zero_out_negative_prompt = negative_prompt is None and self.config.force_zeros_for_empty_prompt + if do_classifier_free_guidance and negative_prompt_embeds is None and zero_out_negative_prompt: + negative_prompt_embeds = torch.zeros_like(prompt_embeds) + negative_pooled_prompt_embeds = torch.zeros_like(pooled_prompt_embeds) + elif do_classifier_free_guidance and negative_prompt_embeds is None: + negative_prompt = negative_prompt or "" + uncond_tokens: List[str] + if prompt is not None and type(prompt) is not type(negative_prompt): + raise TypeError( + f"`negative_prompt` should be the same type to `prompt`, but got {type(negative_prompt)} !=" + f" {type(prompt)}." + ) + elif isinstance(negative_prompt, str): + uncond_tokens = [negative_prompt] + elif batch_size != len(negative_prompt): + raise ValueError( + f"`negative_prompt`: {negative_prompt} has batch size {len(negative_prompt)}, but `prompt`:" + f" {prompt} has batch size {batch_size}. Please make sure that passed `negative_prompt` matches" + " the batch size of `prompt`." + ) + else: + uncond_tokens = negative_prompt + + negative_prompt_embeds_list = [] + for tokenizer, text_encoder in zip(tokenizers, text_encoders): + # textual inversion: procecss multi-vector tokens if necessary + if isinstance(self, TextualInversionLoaderMixin): + uncond_tokens = self.maybe_convert_prompt(uncond_tokens, tokenizer) + + max_length = prompt_embeds.shape[1] + uncond_input = tokenizer( + uncond_tokens, + padding="max_length", + max_length=max_length, + truncation=True, + return_tensors="pt", + ) + + negative_prompt_embeds = text_encoder( + uncond_input.input_ids.to(device), + output_hidden_states=True, + ) + # We are only ALWAYS interested in the pooled output of the final text encoder + negative_pooled_prompt_embeds = negative_prompt_embeds[0] + negative_prompt_embeds = negative_prompt_embeds.hidden_states[-2] + + if do_classifier_free_guidance: + # duplicate unconditional embeddings for each generation per prompt, using mps friendly method + seq_len = negative_prompt_embeds.shape[1] + + negative_prompt_embeds = negative_prompt_embeds.to(dtype=text_encoder.dtype, device=device) + + negative_prompt_embeds = negative_prompt_embeds.repeat(1, num_images_per_prompt, 1) + negative_prompt_embeds = negative_prompt_embeds.view( + batch_size * num_images_per_prompt, seq_len, -1 + ) + + # For classifier free guidance, we need to do two forward passes. 
+ # Here we concatenate the unconditional and text embeddings into a single batch + # to avoid doing two forward passes + + negative_prompt_embeds_list.append(negative_prompt_embeds) + + negative_prompt_embeds = torch.concat(negative_prompt_embeds_list, dim=-1) + + bs_embed = pooled_prompt_embeds.shape[0] + pooled_prompt_embeds = pooled_prompt_embeds.repeat(1, num_images_per_prompt).view( + bs_embed * num_images_per_prompt, -1 + ) + negative_pooled_prompt_embeds = negative_pooled_prompt_embeds.repeat(1, num_images_per_prompt).view( + bs_embed * num_images_per_prompt, -1 + ) + + return prompt_embeds, negative_prompt_embeds, pooled_prompt_embeds, negative_pooled_prompt_embeds + + # Copied from diffusers.pipelines.stable_diffusion.pipeline_stable_diffusion.StableDiffusionPipeline.prepare_extra_step_kwargs + def prepare_extra_step_kwargs(self, generator, eta): + # prepare extra kwargs for the scheduler step, since not all schedulers have the same signature + # eta (η) is only used with the DDIMScheduler, it will be ignored for other schedulers. + # eta corresponds to η in DDIM paper: https://arxiv.org/abs/2010.02502 + # and should be between [0, 1] + + accepts_eta = "eta" in set(inspect.signature(self.scheduler.step).parameters.keys()) + extra_step_kwargs = {} + if accepts_eta: + extra_step_kwargs["eta"] = eta + + # check if the scheduler accepts generator + accepts_generator = "generator" in set(inspect.signature(self.scheduler.step).parameters.keys()) + if accepts_generator: + extra_step_kwargs["generator"] = generator + return extra_step_kwargs + + def check_inputs( + self, + prompt, + height, + width, + callback_steps, + negative_prompt=None, + prompt_embeds=None, + negative_prompt_embeds=None, + pooled_prompt_embeds=None, + negative_pooled_prompt_embeds=None, + ): + if height % 8 != 0 or width % 8 != 0: + raise ValueError(f"`height` and `width` have to be divisible by 8 but are {height} and {width}.") + + if (callback_steps is None) or ( + callback_steps is not None and (not isinstance(callback_steps, int) or callback_steps <= 0) + ): + raise ValueError( + f"`callback_steps` has to be a positive integer but is {callback_steps} of type" + f" {type(callback_steps)}." + ) + + if prompt is not None and prompt_embeds is not None: + raise ValueError( + f"Cannot forward both `prompt`: {prompt} and `prompt_embeds`: {prompt_embeds}. Please make sure to" + " only forward one of the two." + ) + elif prompt is None and prompt_embeds is None: + raise ValueError( + "Provide either `prompt` or `prompt_embeds`. Cannot leave both `prompt` and `prompt_embeds` undefined." + ) + elif prompt is not None and (not isinstance(prompt, str) and not isinstance(prompt, list)): + raise ValueError(f"`prompt` has to be of type `str` or `list` but is {type(prompt)}") + + if negative_prompt is not None and negative_prompt_embeds is not None: + raise ValueError( + f"Cannot forward both `negative_prompt`: {negative_prompt} and `negative_prompt_embeds`:" + f" {negative_prompt_embeds}. Please make sure to only forward one of the two." + ) + + if prompt_embeds is not None and negative_prompt_embeds is not None: + if prompt_embeds.shape != negative_prompt_embeds.shape: + raise ValueError( + "`prompt_embeds` and `negative_prompt_embeds` must have the same shape when passed directly, but" + f" got: `prompt_embeds` {prompt_embeds.shape} != `negative_prompt_embeds`" + f" {negative_prompt_embeds.shape}." 
+ ) + + if prompt_embeds is not None and pooled_prompt_embeds is None: + raise ValueError( + "If `prompt_embeds` are provided, `pooled_prompt_embeds` also have to be passed. Make sure to generate `pooled_prompt_embeds` from the same text encoder that was used to generate `prompt_embeds`." + ) + + if negative_prompt_embeds is not None and negative_pooled_prompt_embeds is None: + raise ValueError( + "If `negative_prompt_embeds` are provided, `negative_pooled_prompt_embeds` also have to be passed. Make sure to generate `negative_pooled_prompt_embeds` from the same text encoder that was used to generate `negative_prompt_embeds`." + ) + + # Copied from diffusers.pipelines.stable_diffusion.pipeline_stable_diffusion.StableDiffusionPipeline.prepare_latents + def prepare_latents(self, batch_size, num_channels_latents, height, width, dtype, device, generator, latents=None): + shape = (batch_size, num_channels_latents, height // self.vae_scale_factor, width // self.vae_scale_factor) + if isinstance(generator, list) and len(generator) != batch_size: + raise ValueError( + f"You have passed a list of generators of length {len(generator)}, but requested an effective batch" + f" size of {batch_size}. Make sure the batch size matches the length of the generators." + ) + + if latents is None: + latents = randn_tensor(shape, generator=generator, device=device, dtype=dtype) + else: + latents = latents.to(device) + + # scale the initial noise by the standard deviation required by the scheduler + latents = latents * self.scheduler.init_noise_sigma + return latents + + def _get_add_time_ids(self, original_size, crops_coords_top_left, target_size, dtype): + add_time_ids = list(original_size + crops_coords_top_left + target_size) + + passed_add_embed_dim = ( + self.unet.config.addition_time_embed_dim * len(add_time_ids) + self.text_encoder_2.config.projection_dim + ) + expected_add_embed_dim = self.unet.add_embedding.linear_1.in_features + + if expected_add_embed_dim != passed_add_embed_dim: + raise ValueError( + f"Model expects an added time embedding vector of length {expected_add_embed_dim}, but a vector of {passed_add_embed_dim} was created. The model has an incorrect config. Please check `unet.config.time_embedding_type` and `text_encoder_2.config.projection_dim`." 
+ ) + + add_time_ids = torch.tensor([add_time_ids], dtype=dtype) + return add_time_ids + + @torch.no_grad() + @replace_example_docstring(EXAMPLE_DOC_STRING) + def __call__( + self, + prompt: Union[str, List[str]] = None, + height: Optional[int] = None, + width: Optional[int] = None, + num_inference_steps: int = 50, + guidance_scale: float = 5.0, + negative_prompt: Optional[Union[str, List[str]]] = None, + num_images_per_prompt: Optional[int] = 1, + eta: float = 0.0, + generator: Optional[Union[torch.Generator, List[torch.Generator]]] = None, + latents: Optional[torch.FloatTensor] = None, + prompt_embeds: Optional[torch.FloatTensor] = None, + negative_prompt_embeds: Optional[torch.FloatTensor] = None, + pooled_prompt_embeds: Optional[torch.FloatTensor] = None, + negative_pooled_prompt_embeds: Optional[torch.FloatTensor] = None, + output_type: Optional[str] = "pil", + return_dict: bool = True, + callback: Optional[Callable[[int, int, torch.FloatTensor], None]] = None, + callback_steps: int = 1, + cross_attention_kwargs: Optional[Dict[str, Any]] = None, + guidance_rescale: float = 0.0, + original_size: Optional[Tuple[int, int]] = None, + crops_coords_top_left: Tuple[int, int] = (0, 0), + target_size: Optional[Tuple[int, int]] = None, + ): + r""" + Function invoked when calling the pipeline for generation. + + Args: + prompt (`str` or `List[str]`, *optional*): + The prompt or prompts to guide the image generation. If not defined, one has to pass `prompt_embeds`. + instead. + height (`int`, *optional*, defaults to self.unet.config.sample_size * self.vae_scale_factor): + The height in pixels of the generated image. + width (`int`, *optional*, defaults to self.unet.config.sample_size * self.vae_scale_factor): + The width in pixels of the generated image. + num_inference_steps (`int`, *optional*, defaults to 50): + The number of denoising steps. More denoising steps usually lead to a higher quality image at the + expense of slower inference. + guidance_scale (`float`, *optional*, defaults to 7.5): + Guidance scale as defined in [Classifier-Free Diffusion Guidance](https://arxiv.org/abs/2207.12598). + `guidance_scale` is defined as `w` of equation 2. of [Imagen + Paper](https://arxiv.org/pdf/2205.11487.pdf). Guidance scale is enabled by setting `guidance_scale > + 1`. Higher guidance scale encourages to generate images that are closely linked to the text `prompt`, + usually at the expense of lower image quality. + negative_prompt (`str` or `List[str]`, *optional*): + The prompt or prompts not to guide the image generation. If not defined, one has to pass + `negative_prompt_embeds` instead. Ignored when not using guidance (i.e., ignored if `guidance_scale` is + less than `1`). + num_images_per_prompt (`int`, *optional*, defaults to 1): + The number of images to generate per prompt. + eta (`float`, *optional*, defaults to 0.0): + Corresponds to parameter eta (η) in the DDIM paper: https://arxiv.org/abs/2010.02502. Only applies to + [`schedulers.DDIMScheduler`], will be ignored for others. + generator (`torch.Generator` or `List[torch.Generator]`, *optional*): + One or a list of [torch generator(s)](https://pytorch.org/docs/stable/generated/torch.Generator.html) + to make generation deterministic. + latents (`torch.FloatTensor`, *optional*): + Pre-generated noisy latents, sampled from a Gaussian distribution, to be used as inputs for image + generation. Can be used to tweak the same generation with different prompts. 
If not provided, a latents + tensor will ge generated by sampling using the supplied random `generator`. + prompt_embeds (`torch.FloatTensor`, *optional*): + Pre-generated text embeddings. Can be used to easily tweak text inputs, *e.g.* prompt weighting. If not + provided, text embeddings will be generated from `prompt` input argument. + negative_prompt_embeds (`torch.FloatTensor`, *optional*): + Pre-generated negative text embeddings. Can be used to easily tweak text inputs, *e.g.* prompt + weighting. If not provided, negative_prompt_embeds will be generated from `negative_prompt` input + argument. + pooled_prompt_embeds (`torch.FloatTensor`, *optional*): + Pre-generated pooled text embeddings. Can be used to easily tweak text inputs, *e.g.* prompt weighting. + If not provided, pooled text embeddings will be generated from `prompt` input argument. + negative_pooled_prompt_embeds (`torch.FloatTensor`, *optional*): + Pre-generated negative pooled text embeddings. Can be used to easily tweak text inputs, *e.g.* prompt + weighting. If not provided, pooled negative_prompt_embeds will be generated from `negative_prompt` + input argument. + output_type (`str`, *optional*, defaults to `"pil"`): + The output format of the generate image. Choose between + [PIL](https://pillow.readthedocs.io/en/stable/): `PIL.Image.Image` or `np.array`. + return_dict (`bool`, *optional*, defaults to `True`): + Whether or not to return a [`~pipelines.stable_diffusion.StableDiffusionXLPipelineOutput`] instead of a + plain tuple. + callback (`Callable`, *optional*): + A function that will be called every `callback_steps` steps during inference. The function will be + called with the following arguments: `callback(step: int, timestep: int, latents: torch.FloatTensor)`. + callback_steps (`int`, *optional*, defaults to 1): + The frequency at which the `callback` function will be called. If not specified, the callback will be + called at every step. + cross_attention_kwargs (`dict`, *optional*): + A kwargs dictionary that if specified is passed along to the `AttentionProcessor` as defined under + `self.processor` in + [diffusers.cross_attention](https://github.com/huggingface/diffusers/blob/main/src/diffusers/models/cross_attention.py). + guidance_rescale (`float`, *optional*, defaults to 0.7): + Guidance rescale factor proposed by [Common Diffusion Noise Schedules and Sample Steps are + Flawed](https://arxiv.org/pdf/2305.08891.pdf) `guidance_scale` is defined as `φ` in equation 16. of + [Common Diffusion Noise Schedules and Sample Steps are Flawed](https://arxiv.org/pdf/2305.08891.pdf). + Guidance rescale factor should fix overexposure when using zero terminal SNR. + original_size (`Tuple[int]`, *optional*, defaults to (1024, 1024)): + TODO + crops_coords_top_left (`Tuple[int]`, *optional*, defaults to (0, 0)): + TODO + target_size (`Tuple[int]`, *optional*, defaults to (1024, 1024)): + TODO + + Examples: + + Returns: + [`~pipelines.stable_diffusion.StableDiffusionXLPipelineOutput`] or `tuple`: + [`~pipelines.stable_diffusion.StableDiffusionXLPipelineOutput`] if `return_dict` is True, otherwise a + `tuple. When returning a tuple, the first element is a list with the generated images, and the second + element is a list of `bool`s denoting whether the corresponding generated image likely represents + "not-safe-for-work" (nsfw) content, according to the `safety_checker`. + """ + # 0. 
Default height and width to unet + height = height or self.default_sample_size * self.vae_scale_factor + width = width or self.default_sample_size * self.vae_scale_factor + + original_size = original_size or (height, width) + target_size = target_size or (height, width) + + # 1. Check inputs. Raise error if not correct + self.check_inputs( + prompt, + height, + width, + callback_steps, + negative_prompt, + prompt_embeds, + negative_prompt_embeds, + pooled_prompt_embeds, + negative_pooled_prompt_embeds, + ) + + # 2. Define call parameters + if prompt is not None and isinstance(prompt, str): + batch_size = 1 + elif prompt is not None and isinstance(prompt, list): + batch_size = len(prompt) + else: + batch_size = prompt_embeds.shape[0] + + device = self._execution_device + + # here `guidance_scale` is defined analog to the guidance weight `w` of equation (2) + # of the Imagen paper: https://arxiv.org/pdf/2205.11487.pdf . `guidance_scale = 1` + # corresponds to doing no classifier free guidance. + do_classifier_free_guidance = guidance_scale > 1.0 + + # 3. Encode input prompt + text_encoder_lora_scale = ( + cross_attention_kwargs.get("scale", None) if cross_attention_kwargs is not None else None + ) + ( + prompt_embeds, + negative_prompt_embeds, + pooled_prompt_embeds, + negative_pooled_prompt_embeds, + ) = self.encode_prompt( + prompt, + device, + num_images_per_prompt, + do_classifier_free_guidance, + negative_prompt, + prompt_embeds=prompt_embeds, + negative_prompt_embeds=negative_prompt_embeds, + pooled_prompt_embeds=pooled_prompt_embeds, + negative_pooled_prompt_embeds=negative_pooled_prompt_embeds, + lora_scale=text_encoder_lora_scale, + ) + + # 4. Prepare timesteps + self.scheduler.set_timesteps(num_inference_steps, device=device) + + timesteps = self.scheduler.timesteps + + # 5. Prepare latent variables + num_channels_latents = self.unet.config.in_channels + latents = self.prepare_latents( + batch_size * num_images_per_prompt, + num_channels_latents, + height, + width, + prompt_embeds.dtype, + device, + generator, + latents, + ) + + # 6. Prepare extra step kwargs. TODO: Logic should ideally just be moved out of the pipeline + extra_step_kwargs = self.prepare_extra_step_kwargs(generator, eta) + + # 7. Prepare added time ids & embeddings + add_text_embeds = pooled_prompt_embeds + add_time_ids = self._get_add_time_ids( + original_size, crops_coords_top_left, target_size, dtype=prompt_embeds.dtype + ) + + if do_classifier_free_guidance: + prompt_embeds = torch.cat([negative_prompt_embeds, prompt_embeds], dim=0) + add_text_embeds = torch.cat([negative_pooled_prompt_embeds, add_text_embeds], dim=0) + add_time_ids = torch.cat([add_time_ids, add_time_ids], dim=0) + + prompt_embeds = prompt_embeds.to(device) + add_text_embeds = add_text_embeds.to(device) + add_time_ids = add_time_ids.to(device).repeat(batch_size * num_images_per_prompt, 1) + + # 8. 
Denoising loop + num_warmup_steps = len(timesteps) - num_inference_steps * self.scheduler.order + with self.progress_bar(total=num_inference_steps) as progress_bar: + for i, t in enumerate(timesteps): + # expand the latents if we are doing classifier free guidance + latent_model_input = torch.cat([latents] * 2) if do_classifier_free_guidance else latents + + latent_model_input = self.scheduler.scale_model_input(latent_model_input, t) + + # predict the noise residual + added_cond_kwargs = {"text_embeds": add_text_embeds, "time_ids": add_time_ids} + noise_pred = self.unet( + latent_model_input, + t, + encoder_hidden_states=prompt_embeds, + cross_attention_kwargs=cross_attention_kwargs, + added_cond_kwargs=added_cond_kwargs, + return_dict=False, + )[0] + + # perform guidance + if do_classifier_free_guidance: + noise_pred_uncond, noise_pred_text = noise_pred.chunk(2) + noise_pred = noise_pred_uncond + guidance_scale * (noise_pred_text - noise_pred_uncond) + + if do_classifier_free_guidance and guidance_rescale > 0.0: + # Based on 3.4. in https://arxiv.org/pdf/2305.08891.pdf + noise_pred = rescale_noise_cfg(noise_pred, noise_pred_text, guidance_rescale=guidance_rescale) + + # compute the previous noisy sample x_t -> x_t-1 + latents = self.scheduler.step(noise_pred, t, latents, **extra_step_kwargs, return_dict=False)[0] + + # call the callback, if provided + if i == len(timesteps) - 1 or ((i + 1) > num_warmup_steps and (i + 1) % self.scheduler.order == 0): + progress_bar.update() + if callback is not None and i % callback_steps == 0: + callback(i, t, latents) + + # make sure the VAE is in float32 mode, as it overflows in float16 + self.vae.to(dtype=torch.float32) + + use_torch_2_0_or_xformers = isinstance( + self.vae.decoder.mid_block.attentions[0].processor, + ( + AttnProcessor2_0, + XFormersAttnProcessor, + LoRAXFormersAttnProcessor, + LoRAAttnProcessor2_0, + ), + ) + # if xformers or torch_2_0 is used attention block does not need + # to be in float32 which can save lots of memory + if use_torch_2_0_or_xformers: + self.vae.post_quant_conv.to(latents.dtype) + self.vae.decoder.conv_in.to(latents.dtype) + self.vae.decoder.mid_block.to(latents.dtype) + else: + latents = latents.float() + + if not output_type == "latent": + image = self.vae.decode(latents / self.vae.config.scaling_factor, return_dict=False)[0] + else: + image = latents + return StableDiffusionXLPipelineOutput(images=image) + + image = self.watermark.apply_watermark(image) + image = self.image_processor.postprocess(image, output_type=output_type) + + # Offload last model to CPU + if hasattr(self, "final_offload_hook") and self.final_offload_hook is not None: + self.final_offload_hook.offload() + + if not return_dict: + return (image,) + + return StableDiffusionXLPipelineOutput(images=image) diff --git a/gf/sd-webui-rich-text/diffusers_official/pipelines/stable_diffusion_xl/pipeline_stable_diffusion_xl_img2img.py b/gf/sd-webui-rich-text/diffusers_official/pipelines/stable_diffusion_xl/pipeline_stable_diffusion_xl_img2img.py new file mode 100644 index 0000000000000000000000000000000000000000..f11ff330da859196293671009a23040a2dd10ee9 --- /dev/null +++ b/gf/sd-webui-rich-text/diffusers_official/pipelines/stable_diffusion_xl/pipeline_stable_diffusion_xl_img2img.py @@ -0,0 +1,896 @@ +# Copyright 2023 The HuggingFace Team. All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. 
+# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +import inspect +from typing import Any, Callable, Dict, List, Optional, Tuple, Union + +import numpy as np +import PIL.Image +import torch +from transformers import CLIPTextModel, CLIPTextModelWithProjection, CLIPTokenizer + +from ...image_processor import VaeImageProcessor +from ...loaders import FromSingleFileMixin, LoraLoaderMixin, TextualInversionLoaderMixin +from ...models import AutoencoderKL, UNet2DConditionModel +from ...models.attention_processor import ( + AttnProcessor2_0, + LoRAAttnProcessor2_0, + LoRAXFormersAttnProcessor, + XFormersAttnProcessor, +) +from ...schedulers import KarrasDiffusionSchedulers +from ...utils import ( + is_accelerate_available, + is_accelerate_version, + logging, + randn_tensor, + replace_example_docstring, +) +from ..pipeline_utils import DiffusionPipeline +from . import StableDiffusionXLPipelineOutput +from .watermark import StableDiffusionXLWatermarker + + +logger = logging.get_logger(__name__) # pylint: disable=invalid-name + +EXAMPLE_DOC_STRING = """ + Examples: + ```py + >>> import torch + >>> from diffusers import StableDiffusionXLImg2ImgPipeline + >>> from diffusers.utils import load_image + + >>> pipe = StableDiffusionXLImg2ImgPipeline.from_pretrained( + ... "stabilityai/stable-diffusion-xl-refiner-0.9", torch_dtype=torch.float16 + ... ) + >>> pipe = pipe.to("cuda") + >>> url = "https://huggingface.co/datasets/patrickvonplaten/images/resolve/main/aa_xl/000000009.png" + + >>> init_image = load_image(url).convert("RGB") + >>> prompt = "a photo of an astronaut riding a horse on mars" + >>> image = pipe(prompt, image=init_image).images[0] + ``` +""" + + +def rescale_noise_cfg(noise_cfg, noise_pred_text, guidance_rescale=0.0): + """ + Rescale `noise_cfg` according to `guidance_rescale`. Based on findings of [Common Diffusion Noise Schedules and + Sample Steps are Flawed](https://arxiv.org/pdf/2305.08891.pdf). See Section 3.4 + """ + std_text = noise_pred_text.std(dim=list(range(1, noise_pred_text.ndim)), keepdim=True) + std_cfg = noise_cfg.std(dim=list(range(1, noise_cfg.ndim)), keepdim=True) + # rescale the results from guidance (fixes overexposure) + noise_pred_rescaled = noise_cfg * (std_text / std_cfg) + # mix with the original results from guidance by factor guidance_rescale to avoid "plain looking" images + noise_cfg = guidance_rescale * noise_pred_rescaled + (1 - guidance_rescale) * noise_cfg + return noise_cfg + + +class StableDiffusionXLImg2ImgPipeline(DiffusionPipeline, FromSingleFileMixin): + r""" + Pipeline for text-to-image generation using Stable Diffusion. + + This model inherits from [`DiffusionPipeline`]. Check the superclass documentation for the generic methods the + library implements for all the pipelines (such as downloading or saving, running on a particular device, etc.) 
+ + In addition the pipeline inherits the following loading methods: + - *Textual-Inversion*: [`loaders.TextualInversionLoaderMixin.load_textual_inversion`] + - *LoRA*: [`loaders.LoraLoaderMixin.load_lora_weights`] + - *Ckpt*: [`loaders.FromSingleFileMixin.from_single_file`] + + as well as the following saving methods: + - *LoRA*: [`loaders.LoraLoaderMixin.save_lora_weights`] + + Args: + vae ([`AutoencoderKL`]): + Variational Auto-Encoder (VAE) Model to encode and decode images to and from latent representations. + text_encoder ([`CLIPTextModel`]): + Frozen text-encoder. Stable Diffusion uses the text portion of + [CLIP](https://huggingface.co/docs/transformers/model_doc/clip#transformers.CLIPTextModel), specifically + the [clip-vit-large-patch14](https://huggingface.co/openai/clip-vit-large-patch14) variant. + tokenizer (`CLIPTokenizer`): + Tokenizer of class + [CLIPTokenizer](https://huggingface.co/docs/transformers/v4.21.0/en/model_doc/clip#transformers.CLIPTokenizer). + unet ([`UNet2DConditionModel`]): Conditional U-Net architecture to denoise the encoded image latents. + scheduler ([`SchedulerMixin`]): + A scheduler to be used in combination with `unet` to denoise the encoded image latents. Can be one of + [`DDIMScheduler`], [`LMSDiscreteScheduler`], or [`PNDMScheduler`]. + """ + _optional_components = ["tokenizer", "text_encoder"] + + def __init__( + self, + vae: AutoencoderKL, + text_encoder: CLIPTextModel, + text_encoder_2: CLIPTextModelWithProjection, + tokenizer: CLIPTokenizer, + tokenizer_2: CLIPTokenizer, + unet: UNet2DConditionModel, + scheduler: KarrasDiffusionSchedulers, + requires_aesthetics_score: bool = False, + force_zeros_for_empty_prompt: bool = True, + ): + super().__init__() + + self.register_modules( + vae=vae, + text_encoder=text_encoder, + text_encoder_2=text_encoder_2, + tokenizer=tokenizer, + tokenizer_2=tokenizer_2, + unet=unet, + scheduler=scheduler, + ) + self.register_to_config(force_zeros_for_empty_prompt=force_zeros_for_empty_prompt) + self.register_to_config(requires_aesthetics_score=requires_aesthetics_score) + self.vae_scale_factor = 2 ** (len(self.vae.config.block_out_channels) - 1) + self.image_processor = VaeImageProcessor(vae_scale_factor=self.vae_scale_factor) + + self.watermark = StableDiffusionXLWatermarker() + + # Copied from diffusers.pipelines.stable_diffusion.pipeline_stable_diffusion.StableDiffusionPipeline.enable_vae_slicing + def enable_vae_slicing(self): + r""" + Enable sliced VAE decoding. + + When this option is enabled, the VAE will split the input tensor in slices to compute decoding in several + steps. This is useful to save some memory and allow larger batch sizes. + """ + self.vae.enable_slicing() + + # Copied from diffusers.pipelines.stable_diffusion.pipeline_stable_diffusion.StableDiffusionPipeline.disable_vae_slicing + def disable_vae_slicing(self): + r""" + Disable sliced VAE decoding. If `enable_vae_slicing` was previously invoked, this method will go back to + computing decoding in one step. + """ + self.vae.disable_slicing() + + # Copied from diffusers.pipelines.stable_diffusion.pipeline_stable_diffusion.StableDiffusionPipeline.enable_vae_tiling + def enable_vae_tiling(self): + r""" + Enable tiled VAE decoding. + + When this option is enabled, the VAE will split the input tensor into tiles to compute decoding and encoding in + several steps. This is useful to save a large amount of memory and to allow the processing of larger images. 
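+
+ A hedged usage sketch (editorial addition, not upstream documentation), combining tiling with VAE slicing
+ on an already-loaded pipeline `pipe` to lower memory use for large image-to-image runs:
+
+ ```py
+ >>> pipe.enable_vae_tiling()  # tile the VAE encode/decode for large images
+ >>> pipe.enable_vae_slicing()  # additionally decode the batch one image at a time
+ >>> # run pipe(prompt, image=init_image) as usual; disable_vae_tiling() restores one-step decoding
+ ```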
+ """ + self.vae.enable_tiling() + + # Copied from diffusers.pipelines.stable_diffusion.pipeline_stable_diffusion.StableDiffusionPipeline.disable_vae_tiling + def disable_vae_tiling(self): + r""" + Disable tiled VAE decoding. If `enable_vae_tiling` was previously invoked, this method will go back to + computing decoding in one step. + """ + self.vae.disable_tiling() + + # Copied from diffusers.pipelines.stable_diffusion_xl.pipeline_stable_diffusion_xl.StableDiffusionXLPipeline.enable_sequential_cpu_offload + def enable_sequential_cpu_offload(self, gpu_id=0): + r""" + Offloads all models to CPU using accelerate, significantly reducing memory usage. When called, unet, + text_encoder, vae and safety checker have their state dicts saved to CPU and then are moved to a + `torch.device('meta') and loaded to GPU only when their specific submodule has its `forward` method called. + Note that offloading happens on a submodule basis. Memory savings are higher than with + `enable_model_cpu_offload`, but performance is lower. + """ + if is_accelerate_available() and is_accelerate_version(">=", "0.14.0"): + from accelerate import cpu_offload + else: + raise ImportError("`enable_sequential_cpu_offload` requires `accelerate v0.14.0` or higher") + + device = torch.device(f"cuda:{gpu_id}") + + if self.device.type != "cpu": + self.to("cpu", silence_dtype_warnings=True) + torch.cuda.empty_cache() # otherwise we don't see the memory savings (but they probably exist) + + for cpu_offloaded_model in [self.unet, self.text_encoder, self.text_encoder_2, self.vae]: + cpu_offload(cpu_offloaded_model, device) + + # Copied from diffusers.pipelines.stable_diffusion_xl.pipeline_stable_diffusion_xl.StableDiffusionXLPipeline.enable_model_cpu_offload + def enable_model_cpu_offload(self, gpu_id=0): + r""" + Offloads all models to CPU using accelerate, reducing memory usage with a low impact on performance. Compared + to `enable_sequential_cpu_offload`, this method moves one whole model at a time to the GPU when its `forward` + method is called, and the model remains in GPU until the next model runs. Memory savings are lower than with + `enable_sequential_cpu_offload`, but performance is much better due to the iterative execution of the `unet`. + """ + if is_accelerate_available() and is_accelerate_version(">=", "0.17.0.dev0"): + from accelerate import cpu_offload_with_hook + else: + raise ImportError("`enable_model_cpu_offload` requires `accelerate v0.17.0` or higher.") + + device = torch.device(f"cuda:{gpu_id}") + + if self.device.type != "cpu": + self.to("cpu", silence_dtype_warnings=True) + torch.cuda.empty_cache() # otherwise we don't see the memory savings (but they probably exist) + + model_sequence = ( + [self.text_encoder, self.text_encoder_2] if self.text_encoder is not None else [self.text_encoder_2] + ) + model_sequence.extend([self.unet, self.vae]) + + hook = None + for cpu_offloaded_model in model_sequence: + _, hook = cpu_offload_with_hook(cpu_offloaded_model, device, prev_module_hook=hook) + + # We'll offload the last model manually. + self.final_offload_hook = hook + + @property + # Copied from diffusers.pipelines.stable_diffusion.pipeline_stable_diffusion.StableDiffusionPipeline._execution_device + def _execution_device(self): + r""" + Returns the device on which the pipeline's models will be executed. After calling + `pipeline.enable_sequential_cpu_offload()` the execution device can only be inferred from Accelerate's module + hooks. 
+ """ + if not hasattr(self.unet, "_hf_hook"): + return self.device + for module in self.unet.modules(): + if ( + hasattr(module, "_hf_hook") + and hasattr(module._hf_hook, "execution_device") + and module._hf_hook.execution_device is not None + ): + return torch.device(module._hf_hook.execution_device) + return self.device + + # Copied from diffusers.pipelines.stable_diffusion_xl.pipeline_stable_diffusion_xl.StableDiffusionXLPipeline.encode_prompt + def encode_prompt( + self, + prompt, + device: Optional[torch.device] = None, + num_images_per_prompt: int = 1, + do_classifier_free_guidance: bool = True, + negative_prompt=None, + prompt_embeds: Optional[torch.FloatTensor] = None, + negative_prompt_embeds: Optional[torch.FloatTensor] = None, + pooled_prompt_embeds: Optional[torch.FloatTensor] = None, + negative_pooled_prompt_embeds: Optional[torch.FloatTensor] = None, + lora_scale: Optional[float] = None, + ): + r""" + Encodes the prompt into text encoder hidden states. + + Args: + prompt (`str` or `List[str]`, *optional*): + prompt to be encoded + device: (`torch.device`): + torch device + num_images_per_prompt (`int`): + number of images that should be generated per prompt + do_classifier_free_guidance (`bool`): + whether to use classifier free guidance or not + negative_prompt (`str` or `List[str]`, *optional*): + The prompt or prompts not to guide the image generation. If not defined, one has to pass + `negative_prompt_embeds` instead. Ignored when not using guidance (i.e., ignored if `guidance_scale` is + less than `1`). + prompt_embeds (`torch.FloatTensor`, *optional*): + Pre-generated text embeddings. Can be used to easily tweak text inputs, *e.g.* prompt weighting. If not + provided, text embeddings will be generated from `prompt` input argument. + negative_prompt_embeds (`torch.FloatTensor`, *optional*): + Pre-generated negative text embeddings. Can be used to easily tweak text inputs, *e.g.* prompt + weighting. If not provided, negative_prompt_embeds will be generated from `negative_prompt` input + argument. + pooled_prompt_embeds (`torch.FloatTensor`, *optional*): + Pre-generated pooled text embeddings. Can be used to easily tweak text inputs, *e.g.* prompt weighting. + If not provided, pooled text embeddings will be generated from `prompt` input argument. + negative_pooled_prompt_embeds (`torch.FloatTensor`, *optional*): + Pre-generated negative pooled text embeddings. Can be used to easily tweak text inputs, *e.g.* prompt + weighting. If not provided, pooled negative_prompt_embeds will be generated from `negative_prompt` + input argument. + lora_scale (`float`, *optional*): + A lora scale that will be applied to all LoRA layers of the text encoder if LoRA layers are loaded. 
+ """ + device = device or self._execution_device + + # set lora scale so that monkey patched LoRA + # function of text encoder can correctly access it + if lora_scale is not None and isinstance(self, LoraLoaderMixin): + self._lora_scale = lora_scale + + if prompt is not None and isinstance(prompt, str): + batch_size = 1 + elif prompt is not None and isinstance(prompt, list): + batch_size = len(prompt) + else: + batch_size = prompt_embeds.shape[0] + + # Define tokenizers and text encoders + tokenizers = [self.tokenizer, self.tokenizer_2] if self.tokenizer is not None else [self.tokenizer_2] + text_encoders = ( + [self.text_encoder, self.text_encoder_2] if self.text_encoder is not None else [self.text_encoder_2] + ) + + if prompt_embeds is None: + # textual inversion: procecss multi-vector tokens if necessary + prompt_embeds_list = [] + for tokenizer, text_encoder in zip(tokenizers, text_encoders): + if isinstance(self, TextualInversionLoaderMixin): + prompt = self.maybe_convert_prompt(prompt, tokenizer) + + text_inputs = tokenizer( + prompt, + padding="max_length", + max_length=tokenizer.model_max_length, + truncation=True, + return_tensors="pt", + ) + text_input_ids = text_inputs.input_ids + untruncated_ids = tokenizer(prompt, padding="longest", return_tensors="pt").input_ids + + if untruncated_ids.shape[-1] >= text_input_ids.shape[-1] and not torch.equal( + text_input_ids, untruncated_ids + ): + removed_text = tokenizer.batch_decode(untruncated_ids[:, tokenizer.model_max_length - 1 : -1]) + logger.warning( + "The following part of your input was truncated because CLIP can only handle sequences up to" + f" {tokenizer.model_max_length} tokens: {removed_text}" + ) + + prompt_embeds = text_encoder( + text_input_ids.to(device), + output_hidden_states=True, + ) + + # We are only ALWAYS interested in the pooled output of the final text encoder + pooled_prompt_embeds = prompt_embeds[0] + prompt_embeds = prompt_embeds.hidden_states[-2] + + bs_embed, seq_len, _ = prompt_embeds.shape + # duplicate text embeddings for each generation per prompt, using mps friendly method + prompt_embeds = prompt_embeds.repeat(1, num_images_per_prompt, 1) + prompt_embeds = prompt_embeds.view(bs_embed * num_images_per_prompt, seq_len, -1) + + prompt_embeds_list.append(prompt_embeds) + + prompt_embeds = torch.concat(prompt_embeds_list, dim=-1) + + # get unconditional embeddings for classifier free guidance + zero_out_negative_prompt = negative_prompt is None and self.config.force_zeros_for_empty_prompt + if do_classifier_free_guidance and negative_prompt_embeds is None and zero_out_negative_prompt: + negative_prompt_embeds = torch.zeros_like(prompt_embeds) + negative_pooled_prompt_embeds = torch.zeros_like(pooled_prompt_embeds) + elif do_classifier_free_guidance and negative_prompt_embeds is None: + negative_prompt = negative_prompt or "" + uncond_tokens: List[str] + if prompt is not None and type(prompt) is not type(negative_prompt): + raise TypeError( + f"`negative_prompt` should be the same type to `prompt`, but got {type(negative_prompt)} !=" + f" {type(prompt)}." + ) + elif isinstance(negative_prompt, str): + uncond_tokens = [negative_prompt] + elif batch_size != len(negative_prompt): + raise ValueError( + f"`negative_prompt`: {negative_prompt} has batch size {len(negative_prompt)}, but `prompt`:" + f" {prompt} has batch size {batch_size}. Please make sure that passed `negative_prompt` matches" + " the batch size of `prompt`." 
+ ) + else: + uncond_tokens = negative_prompt + + negative_prompt_embeds_list = [] + for tokenizer, text_encoder in zip(tokenizers, text_encoders): + # textual inversion: process multi-vector tokens if necessary + if isinstance(self, TextualInversionLoaderMixin): + uncond_tokens = self.maybe_convert_prompt(uncond_tokens, tokenizer) + + max_length = prompt_embeds.shape[1] + uncond_input = tokenizer( + uncond_tokens, + padding="max_length", + max_length=max_length, + truncation=True, + return_tensors="pt", + ) + + negative_prompt_embeds = text_encoder( + uncond_input.input_ids.to(device), + output_hidden_states=True, + ) + # We are only ALWAYS interested in the pooled output of the final text encoder + negative_pooled_prompt_embeds = negative_prompt_embeds[0] + negative_prompt_embeds = negative_prompt_embeds.hidden_states[-2] + + if do_classifier_free_guidance: + # duplicate unconditional embeddings for each generation per prompt, using mps friendly method + seq_len = negative_prompt_embeds.shape[1] + + negative_prompt_embeds = negative_prompt_embeds.to(dtype=text_encoder.dtype, device=device) + + negative_prompt_embeds = negative_prompt_embeds.repeat(1, num_images_per_prompt, 1) + negative_prompt_embeds = negative_prompt_embeds.view( + batch_size * num_images_per_prompt, seq_len, -1 + ) + + # For classifier free guidance, we need to do two forward passes. + # Here we concatenate the unconditional and text embeddings into a single batch + # to avoid doing two forward passes + + negative_prompt_embeds_list.append(negative_prompt_embeds) + + negative_prompt_embeds = torch.concat(negative_prompt_embeds_list, dim=-1) + + bs_embed = pooled_prompt_embeds.shape[0] + pooled_prompt_embeds = pooled_prompt_embeds.repeat(1, num_images_per_prompt).view( + bs_embed * num_images_per_prompt, -1 + ) + negative_pooled_prompt_embeds = negative_pooled_prompt_embeds.repeat(1, num_images_per_prompt).view( + bs_embed * num_images_per_prompt, -1 + ) + + return prompt_embeds, negative_prompt_embeds, pooled_prompt_embeds, negative_pooled_prompt_embeds + + # Copied from diffusers.pipelines.stable_diffusion.pipeline_stable_diffusion.StableDiffusionPipeline.prepare_extra_step_kwargs + def prepare_extra_step_kwargs(self, generator, eta): + # prepare extra kwargs for the scheduler step, since not all schedulers have the same signature + # eta (η) is only used with the DDIMScheduler, it will be ignored for other schedulers. + # eta corresponds to η in DDIM paper: https://arxiv.org/abs/2010.02502 + # and should be between [0, 1] + + accepts_eta = "eta" in set(inspect.signature(self.scheduler.step).parameters.keys()) + extra_step_kwargs = {} + if accepts_eta: + extra_step_kwargs["eta"] = eta + + # check if the scheduler accepts generator + accepts_generator = "generator" in set(inspect.signature(self.scheduler.step).parameters.keys()) + if accepts_generator: + extra_step_kwargs["generator"] = generator + return extra_step_kwargs + + def check_inputs( + self, prompt, strength, callback_steps, negative_prompt=None, prompt_embeds=None, negative_prompt_embeds=None + ): + if strength < 0 or strength > 1: + raise ValueError(f"The value of strength should be in [0.0, 1.0] but is {strength}") + + if (callback_steps is None) or ( + callback_steps is not None and (not isinstance(callback_steps, int) or callback_steps <= 0) + ): + raise ValueError( + f"`callback_steps` has to be a positive integer but is {callback_steps} of type" + f" {type(callback_steps)}."
+ ) + + if prompt is not None and prompt_embeds is not None: + raise ValueError( + f"Cannot forward both `prompt`: {prompt} and `prompt_embeds`: {prompt_embeds}. Please make sure to" + " only forward one of the two." + ) + elif prompt is None and prompt_embeds is None: + raise ValueError( + "Provide either `prompt` or `prompt_embeds`. Cannot leave both `prompt` and `prompt_embeds` undefined." + ) + elif prompt is not None and (not isinstance(prompt, str) and not isinstance(prompt, list)): + raise ValueError(f"`prompt` has to be of type `str` or `list` but is {type(prompt)}") + + if negative_prompt is not None and negative_prompt_embeds is not None: + raise ValueError( + f"Cannot forward both `negative_prompt`: {negative_prompt} and `negative_prompt_embeds`:" + f" {negative_prompt_embeds}. Please make sure to only forward one of the two." + ) + + if prompt_embeds is not None and negative_prompt_embeds is not None: + if prompt_embeds.shape != negative_prompt_embeds.shape: + raise ValueError( + "`prompt_embeds` and `negative_prompt_embeds` must have the same shape when passed directly, but" + f" got: `prompt_embeds` {prompt_embeds.shape} != `negative_prompt_embeds`" + f" {negative_prompt_embeds.shape}." + ) + + def get_timesteps(self, num_inference_steps, strength, device): + # get the original timestep using init_timestep + init_timestep = min(int(num_inference_steps * strength), num_inference_steps) + + t_start = max(num_inference_steps - init_timestep, 0) + timesteps = self.scheduler.timesteps[t_start * self.scheduler.order :] + + return timesteps, num_inference_steps - t_start + + def prepare_latents(self, image, timestep, batch_size, num_images_per_prompt, dtype, device, generator=None): + if not isinstance(image, (torch.Tensor, PIL.Image.Image, list)): + raise ValueError( + f"`image` has to be of type `torch.Tensor`, `PIL.Image.Image` or list but is {type(image)}" + ) + + # Offload text encoder if `enable_model_cpu_offload` was enabled + if hasattr(self, "final_offload_hook") and self.final_offload_hook is not None: + self.text_encoder_2.to("cpu") + torch.cuda.empty_cache() + + image = image.to(device=device, dtype=dtype) + + batch_size = batch_size * num_images_per_prompt + + if image.shape[1] == 4: + init_latents = image + + else: + # make sure the VAE is in float32 mode, as it overflows in float16 + image = image.float() + self.vae.to(dtype=torch.float32) + + if isinstance(generator, list) and len(generator) != batch_size: + raise ValueError( + f"You have passed a list of generators of length {len(generator)}, but requested an effective batch" + f" size of {batch_size}. Make sure the batch size matches the length of the generators." 
+ ) + + elif isinstance(generator, list): + init_latents = [ + self.vae.encode(image[i : i + 1]).latent_dist.sample(generator[i]) for i in range(batch_size) + ] + init_latents = torch.cat(init_latents, dim=0) + else: + init_latents = self.vae.encode(image).latent_dist.sample(generator) + + self.vae.to(dtype) + init_latents = init_latents.to(dtype) + + init_latents = self.vae.config.scaling_factor * init_latents + + if batch_size > init_latents.shape[0] and batch_size % init_latents.shape[0] == 0: + # expand init_latents for batch_size + additional_image_per_prompt = batch_size // init_latents.shape[0] + init_latents = torch.cat([init_latents] * additional_image_per_prompt, dim=0) + elif batch_size > init_latents.shape[0] and batch_size % init_latents.shape[0] != 0: + raise ValueError( + f"Cannot duplicate `image` of batch size {init_latents.shape[0]} to {batch_size} text prompts." + ) + else: + init_latents = torch.cat([init_latents], dim=0) + + shape = init_latents.shape + noise = randn_tensor(shape, generator=generator, device=device, dtype=dtype) + + # get latents + init_latents = self.scheduler.add_noise(init_latents, noise, timestep) + latents = init_latents + + return latents + + def _get_add_time_ids( + self, original_size, crops_coords_top_left, target_size, aesthetic_score, negative_aesthetic_score, dtype + ): + if self.config.requires_aesthetics_score: + add_time_ids = list(original_size + crops_coords_top_left + (aesthetic_score,)) + add_neg_time_ids = list(original_size + crops_coords_top_left + (negative_aesthetic_score,)) + else: + add_time_ids = list(original_size + crops_coords_top_left + target_size) + add_neg_time_ids = list(original_size + crops_coords_top_left + target_size) + + passed_add_embed_dim = ( + self.unet.config.addition_time_embed_dim * len(add_time_ids) + self.text_encoder_2.config.projection_dim + ) + expected_add_embed_dim = self.unet.add_embedding.linear_1.in_features + + if ( + expected_add_embed_dim > passed_add_embed_dim + and (expected_add_embed_dim - passed_add_embed_dim) == self.unet.config.addition_time_embed_dim + ): + raise ValueError( + f"Model expects an added time embedding vector of length {expected_add_embed_dim}, but a vector of {passed_add_embed_dim} was created. Please make sure to enable `requires_aesthetics_score` with `pipe.register_to_config(requires_aesthetics_score=True)` to make sure `aesthetic_score` {aesthetic_score} and `negative_aesthetic_score` {negative_aesthetic_score} is correctly used by the model." + ) + elif ( + expected_add_embed_dim < passed_add_embed_dim + and (passed_add_embed_dim - expected_add_embed_dim) == self.unet.config.addition_time_embed_dim + ): + raise ValueError( + f"Model expects an added time embedding vector of length {expected_add_embed_dim}, but a vector of {passed_add_embed_dim} was created. Please make sure to disable `requires_aesthetics_score` with `pipe.register_to_config(requires_aesthetics_score=False)` to make sure `target_size` {target_size} is correctly used by the model." + ) + elif expected_add_embed_dim != passed_add_embed_dim: + raise ValueError( + f"Model expects an added time embedding vector of length {expected_add_embed_dim}, but a vector of {passed_add_embed_dim} was created. The model has an incorrect config. Please check `unet.config.time_embedding_type` and `text_encoder_2.config.projection_dim`." 
+ ) + + add_time_ids = torch.tensor([add_time_ids], dtype=dtype) + add_neg_time_ids = torch.tensor([add_neg_time_ids], dtype=dtype) + + return add_time_ids, add_neg_time_ids + + @torch.no_grad() + @replace_example_docstring(EXAMPLE_DOC_STRING) + def __call__( + self, + prompt: Union[str, List[str]] = None, + image: Union[ + torch.FloatTensor, + PIL.Image.Image, + np.ndarray, + List[torch.FloatTensor], + List[PIL.Image.Image], + List[np.ndarray], + ] = None, + strength: float = 0.3, + num_inference_steps: int = 50, + guidance_scale: float = 5.0, + negative_prompt: Optional[Union[str, List[str]]] = None, + num_images_per_prompt: Optional[int] = 1, + eta: float = 0.0, + generator: Optional[Union[torch.Generator, List[torch.Generator]]] = None, + latents: Optional[torch.FloatTensor] = None, + prompt_embeds: Optional[torch.FloatTensor] = None, + negative_prompt_embeds: Optional[torch.FloatTensor] = None, + pooled_prompt_embeds: Optional[torch.FloatTensor] = None, + negative_pooled_prompt_embeds: Optional[torch.FloatTensor] = None, + output_type: Optional[str] = "pil", + return_dict: bool = True, + callback: Optional[Callable[[int, int, torch.FloatTensor], None]] = None, + callback_steps: int = 1, + cross_attention_kwargs: Optional[Dict[str, Any]] = None, + guidance_rescale: float = 0.0, + original_size: Tuple[int, int] = None, + crops_coords_top_left: Tuple[int, int] = (0, 0), + target_size: Tuple[int, int] = None, + aesthetic_score: float = 6.0, + negative_aesthetic_score: float = 2.5, + ): + r""" + Function invoked when calling the pipeline for generation. + + Args: + prompt (`str` or `List[str]`, *optional*): + The prompt or prompts to guide the image generation. If not defined, one has to pass `prompt_embeds` + instead. + image (`torch.FloatTensor` or `PIL.Image.Image` or `np.ndarray` or `List[torch.FloatTensor]` or `List[PIL.Image.Image]` or `List[np.ndarray]`): + The image(s) to modify with the pipeline. + strength (`float`, *optional*, defaults to 0.3): + Conceptually, indicates how much to transform the reference `image`. Must be between 0 and 1. `image` + will be used as a starting point, adding more noise to it the larger the `strength`. The number of + denoising steps depends on the amount of noise initially added. When `strength` is 1, added noise will + be maximum and the denoising process will run for the full number of iterations specified in + `num_inference_steps`. A value of 1, therefore, essentially ignores `image`. + num_inference_steps (`int`, *optional*, defaults to 50): + The number of denoising steps. More denoising steps usually lead to a higher quality image at the + expense of slower inference. + guidance_scale (`float`, *optional*, defaults to 5.0): + Guidance scale as defined in [Classifier-Free Diffusion Guidance](https://arxiv.org/abs/2207.12598). + `guidance_scale` is defined as `w` of equation 2. of [Imagen + Paper](https://arxiv.org/pdf/2205.11487.pdf). Guidance scale is enabled by setting `guidance_scale > + 1`. Higher guidance scale encourages the model to generate images that are closely linked to the text `prompt`, + usually at the expense of lower image quality. + negative_prompt (`str` or `List[str]`, *optional*): + The prompt or prompts not to guide the image generation. If not defined, one has to pass + `negative_prompt_embeds` instead. Ignored when not using guidance (i.e., ignored if `guidance_scale` is + less than `1`). + num_images_per_prompt (`int`, *optional*, defaults to 1): + The number of images to generate per prompt.
+ eta (`float`, *optional*, defaults to 0.0): + Corresponds to parameter eta (η) in the DDIM paper: https://arxiv.org/abs/2010.02502. Only applies to + [`schedulers.DDIMScheduler`], will be ignored for others. + generator (`torch.Generator` or `List[torch.Generator]`, *optional*): + One or a list of [torch generator(s)](https://pytorch.org/docs/stable/generated/torch.Generator.html) + to make generation deterministic. + latents (`torch.FloatTensor`, *optional*): + Pre-generated noisy latents, sampled from a Gaussian distribution, to be used as inputs for image + generation. Can be used to tweak the same generation with different prompts. If not provided, a latents + tensor will be generated by sampling using the supplied random `generator`. + prompt_embeds (`torch.FloatTensor`, *optional*): + Pre-generated text embeddings. Can be used to easily tweak text inputs, *e.g.* prompt weighting. If not + provided, text embeddings will be generated from `prompt` input argument. + negative_prompt_embeds (`torch.FloatTensor`, *optional*): + Pre-generated negative text embeddings. Can be used to easily tweak text inputs, *e.g.* prompt + weighting. If not provided, negative_prompt_embeds will be generated from `negative_prompt` input + argument. + pooled_prompt_embeds (`torch.FloatTensor`, *optional*): + Pre-generated pooled text embeddings. Can be used to easily tweak text inputs, *e.g.* prompt weighting. + If not provided, pooled text embeddings will be generated from `prompt` input argument. + negative_pooled_prompt_embeds (`torch.FloatTensor`, *optional*): + Pre-generated negative pooled text embeddings. Can be used to easily tweak text inputs, *e.g.* prompt + weighting. If not provided, pooled negative_prompt_embeds will be generated from `negative_prompt` + input argument. + output_type (`str`, *optional*, defaults to `"pil"`): + The output format of the generated image. Choose between + [PIL](https://pillow.readthedocs.io/en/stable/): `PIL.Image.Image` or `np.array`. + return_dict (`bool`, *optional*, defaults to `True`): + Whether or not to return a [`~pipelines.stable_diffusion.StableDiffusionXLPipelineOutput`] instead of a + plain tuple. + callback (`Callable`, *optional*): + A function that will be called every `callback_steps` steps during inference. The function will be + called with the following arguments: `callback(step: int, timestep: int, latents: torch.FloatTensor)`. + callback_steps (`int`, *optional*, defaults to 1): + The frequency at which the `callback` function will be called. If not specified, the callback will be + called at every step. + cross_attention_kwargs (`dict`, *optional*): + A kwargs dictionary that if specified is passed along to the `AttentionProcessor` as defined under + `self.processor` in + [diffusers.cross_attention](https://github.com/huggingface/diffusers/blob/main/src/diffusers/models/cross_attention.py). + guidance_rescale (`float`, *optional*, defaults to 0.0): + Guidance rescale factor proposed by [Common Diffusion Noise Schedules and Sample Steps are + Flawed](https://arxiv.org/pdf/2305.08891.pdf) `guidance_rescale` is defined as `φ` in equation 16. of + [Common Diffusion Noise Schedules and Sample Steps are Flawed](https://arxiv.org/pdf/2305.08891.pdf). + Guidance rescale factor should fix overexposure when using zero terminal SNR.
+ original_size (`Tuple[int]`, *optional*, defaults to (1024, 1024)): + TODO + crops_coords_top_left (`Tuple[int]`, *optional*, defaults to (0, 0)): + TODO + target_size (`Tuple[int]`, *optional*, defaults to (1024, 1024)): + TODO + aesthetic_score (`float`, *optional*, defaults to 6.0): + TODO + negative_aesthetic_score (`float`, *optional*, defaults to 2.5): + TODO + + Examples: + + Returns: + [`~pipelines.stable_diffusion.StableDiffusionXLPipelineOutput`] or `tuple`: + [`~pipelines.stable_diffusion.StableDiffusionXLPipelineOutput`] if `return_dict` is True, otherwise a + `tuple`. When returning a tuple, the first element is a list with the generated images. + """ + # 1. Check inputs. Raise error if not correct + self.check_inputs(prompt, strength, callback_steps, negative_prompt, prompt_embeds, negative_prompt_embeds) + + # 2. Define call parameters + if prompt is not None and isinstance(prompt, str): + batch_size = 1 + elif prompt is not None and isinstance(prompt, list): + batch_size = len(prompt) + else: + batch_size = prompt_embeds.shape[0] + + device = self._execution_device + + # here `guidance_scale` is defined analogously to the guidance weight `w` of equation (2) + # of the Imagen paper: https://arxiv.org/pdf/2205.11487.pdf . `guidance_scale = 1` + # corresponds to doing no classifier free guidance. + do_classifier_free_guidance = guidance_scale > 1.0 + + # 3. Encode input prompt + text_encoder_lora_scale = ( + cross_attention_kwargs.get("scale", None) if cross_attention_kwargs is not None else None + ) + ( + prompt_embeds, + negative_prompt_embeds, + pooled_prompt_embeds, + negative_pooled_prompt_embeds, + ) = self.encode_prompt( + prompt, + device, + num_images_per_prompt, + do_classifier_free_guidance, + negative_prompt, + prompt_embeds=prompt_embeds, + negative_prompt_embeds=negative_prompt_embeds, + pooled_prompt_embeds=pooled_prompt_embeds, + negative_pooled_prompt_embeds=negative_pooled_prompt_embeds, + lora_scale=text_encoder_lora_scale, + ) + + # 4. Preprocess image + image = self.image_processor.preprocess(image) + + # 5. Prepare timesteps + self.scheduler.set_timesteps(num_inference_steps, device=device) + timesteps, num_inference_steps = self.get_timesteps(num_inference_steps, strength, device) + latent_timestep = timesteps[:1].repeat(batch_size * num_images_per_prompt) + + # 6. Prepare latent variables + latents = self.prepare_latents( + image, latent_timestep, batch_size, num_images_per_prompt, prompt_embeds.dtype, device, generator + ) + # 7. Prepare extra step kwargs. + extra_step_kwargs = self.prepare_extra_step_kwargs(generator, eta) + + height, width = latents.shape[-2:] + height = height * self.vae_scale_factor + width = width * self.vae_scale_factor + + original_size = original_size or (height, width) + target_size = target_size or (height, width) + + # 8.
Prepare added time ids & embeddings + add_text_embeds = pooled_prompt_embeds + add_time_ids, add_neg_time_ids = self._get_add_time_ids( + original_size, + crops_coords_top_left, + target_size, + aesthetic_score, + negative_aesthetic_score, + dtype=prompt_embeds.dtype, + ) + + if do_classifier_free_guidance: + prompt_embeds = torch.cat([negative_prompt_embeds, prompt_embeds], dim=0) + add_text_embeds = torch.cat([negative_pooled_prompt_embeds, add_text_embeds], dim=0) + add_time_ids = torch.cat([add_neg_time_ids, add_time_ids], dim=0) + + prompt_embeds = prompt_embeds.to(device) + add_text_embeds = add_text_embeds.to(device) + add_time_ids = add_time_ids.to(device).repeat(batch_size * num_images_per_prompt, 1) + + # 9. Denoising loop + num_warmup_steps = len(timesteps) - num_inference_steps * self.scheduler.order + with self.progress_bar(total=num_inference_steps) as progress_bar: + for i, t in enumerate(timesteps): + # expand the latents if we are doing classifier free guidance + latent_model_input = torch.cat([latents] * 2) if do_classifier_free_guidance else latents + + latent_model_input = self.scheduler.scale_model_input(latent_model_input, t) + + # predict the noise residual + added_cond_kwargs = {"text_embeds": add_text_embeds, "time_ids": add_time_ids} + noise_pred = self.unet( + latent_model_input, + t, + encoder_hidden_states=prompt_embeds, + cross_attention_kwargs=cross_attention_kwargs, + added_cond_kwargs=added_cond_kwargs, + return_dict=False, + )[0] + + # perform guidance + if do_classifier_free_guidance: + noise_pred_uncond, noise_pred_text = noise_pred.chunk(2) + noise_pred = noise_pred_uncond + guidance_scale * (noise_pred_text - noise_pred_uncond) + + if do_classifier_free_guidance and guidance_rescale > 0.0: + # Based on 3.4. in https://arxiv.org/pdf/2305.08891.pdf + noise_pred = rescale_noise_cfg(noise_pred, noise_pred_text, guidance_rescale=guidance_rescale) + + # compute the previous noisy sample x_t -> x_t-1 + latents = self.scheduler.step(noise_pred, t, latents, **extra_step_kwargs, return_dict=False)[0] + + # call the callback, if provided + if i == len(timesteps) - 1 or ((i + 1) > num_warmup_steps and (i + 1) % self.scheduler.order == 0): + progress_bar.update() + if callback is not None and i % callback_steps == 0: + callback(i, t, latents) + + # make sure the VAE is in float32 mode, as it overflows in float16 + self.vae.to(dtype=torch.float32) + + use_torch_2_0_or_xformers = isinstance( + self.vae.decoder.mid_block.attentions[0].processor, + ( + AttnProcessor2_0, + XFormersAttnProcessor, + LoRAXFormersAttnProcessor, + LoRAAttnProcessor2_0, + ), + ) + # if xformers or torch_2_0 is used attention block does not need + # to be in float32 which can save lots of memory + if use_torch_2_0_or_xformers: + self.vae.post_quant_conv.to(latents.dtype) + self.vae.decoder.conv_in.to(latents.dtype) + self.vae.decoder.mid_block.to(latents.dtype) + else: + latents = latents.float() + + if not output_type == "latent": + image = self.vae.decode(latents / self.vae.config.scaling_factor, return_dict=False)[0] + else: + image = latents + return StableDiffusionXLPipelineOutput(images=image) + + image = self.watermark.apply_watermark(image) + image = self.image_processor.postprocess(image, output_type=output_type) + + # Offload last model to CPU + if hasattr(self, "final_offload_hook") and self.final_offload_hook is not None: + self.final_offload_hook.offload() + + if not return_dict: + return (image,) + + return StableDiffusionXLPipelineOutput(images=image) diff --git 
a/gf/sd-webui-rich-text/diffusers_official/pipelines/stable_diffusion_xl/watermark.py b/gf/sd-webui-rich-text/diffusers_official/pipelines/stable_diffusion_xl/watermark.py new file mode 100644 index 0000000000000000000000000000000000000000..bc6c9bf649b161fbc1ae7e59b3de6ba5c22884fa --- /dev/null +++ b/gf/sd-webui-rich-text/diffusers_official/pipelines/stable_diffusion_xl/watermark.py @@ -0,0 +1,31 @@ +import numpy as np +import torch +from imwatermark import WatermarkEncoder + + +# Copied from https://github.com/Stability-AI/generative-models/blob/613af104c6b85184091d42d374fef420eddb356d/scripts/demo/streamlit_helpers.py#L66 +WATERMARK_MESSAGE = 0b101100111110110010010000011110111011000110011110 +# bin(x)[2:] gives bits of x as str, use int to convert them to 0/1 +WATERMARK_BITS = [int(bit) for bit in bin(WATERMARK_MESSAGE)[2:]] + + +class StableDiffusionXLWatermarker: + def __init__(self): + self.watermark = WATERMARK_BITS + self.encoder = WatermarkEncoder() + + self.encoder.set_watermark("bits", self.watermark) + + def apply_watermark(self, images: torch.FloatTensor): + # can't encode images that are smaller than 256 + if images.shape[-1] < 256: + return images + + images = (255 * (images / 2 + 0.5)).cpu().permute(0, 2, 3, 1).float().numpy() + + images = [self.encoder.encode(image, "dwtDct") for image in images] + + images = torch.from_numpy(np.array(images)).permute(0, 3, 1, 2) + + images = torch.clamp(2 * (images / 255 - 0.5), min=-1.0, max=1.0) + return images diff --git a/gf/sd-webui-rich-text/diffusers_official/schedulers/__init__.py b/gf/sd-webui-rich-text/diffusers_official/schedulers/__init__.py new file mode 100644 index 0000000000000000000000000000000000000000..0a07ce4baed20904b85b577aa3e4e38f6a47e945 --- /dev/null +++ b/gf/sd-webui-rich-text/diffusers_official/schedulers/__init__.py @@ -0,0 +1,92 @@ +# Copyright 2023 The HuggingFace Team. All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
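+# Scheduler imports below are grouped by optional backend: each try/except block probes one dependency (torch, flax, torch+scipy, torch+torchsde) via the is_*_available() helpers and falls back to the corresponding dummy objects when that backend is missing, so importing this package does not fail on an absent optional dependency.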
+ + +from ..utils import ( + OptionalDependencyNotAvailable, + is_flax_available, + is_scipy_available, + is_torch_available, + is_torchsde_available, +) + + +try: + if not is_torch_available(): + raise OptionalDependencyNotAvailable() +except OptionalDependencyNotAvailable: + from ..utils.dummy_pt_objects import * # noqa F403 +else: + from .scheduling_consistency_models import CMStochasticIterativeScheduler + from .scheduling_ddim import DDIMScheduler + from .scheduling_ddim_inverse import DDIMInverseScheduler + from .scheduling_ddim_parallel import DDIMParallelScheduler + from .scheduling_ddpm import DDPMScheduler + from .scheduling_ddpm_parallel import DDPMParallelScheduler + from .scheduling_deis_multistep import DEISMultistepScheduler + from .scheduling_dpmsolver_multistep import DPMSolverMultistepScheduler + from .scheduling_dpmsolver_multistep_inverse import DPMSolverMultistepInverseScheduler + from .scheduling_dpmsolver_singlestep import DPMSolverSinglestepScheduler + from .scheduling_euler_ancestral_discrete import EulerAncestralDiscreteScheduler + from .scheduling_euler_discrete import EulerDiscreteScheduler + from .scheduling_heun_discrete import HeunDiscreteScheduler + from .scheduling_ipndm import IPNDMScheduler + from .scheduling_k_dpm_2_ancestral_discrete import KDPM2AncestralDiscreteScheduler + from .scheduling_k_dpm_2_discrete import KDPM2DiscreteScheduler + from .scheduling_karras_ve import KarrasVeScheduler + from .scheduling_pndm import PNDMScheduler + from .scheduling_repaint import RePaintScheduler + from .scheduling_sde_ve import ScoreSdeVeScheduler + from .scheduling_sde_vp import ScoreSdeVpScheduler + from .scheduling_unclip import UnCLIPScheduler + from .scheduling_unipc_multistep import UniPCMultistepScheduler + from .scheduling_utils import KarrasDiffusionSchedulers, SchedulerMixin + from .scheduling_vq_diffusion import VQDiffusionScheduler + +try: + if not is_flax_available(): + raise OptionalDependencyNotAvailable() +except OptionalDependencyNotAvailable: + from ..utils.dummy_flax_objects import * # noqa F403 +else: + from .scheduling_ddim_flax import FlaxDDIMScheduler + from .scheduling_ddpm_flax import FlaxDDPMScheduler + from .scheduling_dpmsolver_multistep_flax import FlaxDPMSolverMultistepScheduler + from .scheduling_karras_ve_flax import FlaxKarrasVeScheduler + from .scheduling_lms_discrete_flax import FlaxLMSDiscreteScheduler + from .scheduling_pndm_flax import FlaxPNDMScheduler + from .scheduling_sde_ve_flax import FlaxScoreSdeVeScheduler + from .scheduling_utils_flax import ( + FlaxKarrasDiffusionSchedulers, + FlaxSchedulerMixin, + FlaxSchedulerOutput, + broadcast_to_shape_from_left, + ) + + +try: + if not (is_torch_available() and is_scipy_available()): + raise OptionalDependencyNotAvailable() +except OptionalDependencyNotAvailable: + from ..utils.dummy_torch_and_scipy_objects import * # noqa F403 +else: + from .scheduling_lms_discrete import LMSDiscreteScheduler + +try: + if not (is_torch_available() and is_torchsde_available()): + raise OptionalDependencyNotAvailable() +except OptionalDependencyNotAvailable: + from ..utils.dummy_torch_and_torchsde_objects import * # noqa F403 +else: + from .scheduling_dpmsolver_sde import DPMSolverSDEScheduler diff --git a/gf/sd-webui-rich-text/diffusers_official/schedulers/__pycache__/__init__.cpython-310.pyc b/gf/sd-webui-rich-text/diffusers_official/schedulers/__pycache__/__init__.cpython-310.pyc new file mode 100644 index 0000000000000000000000000000000000000000..adc4da64e5dd2a10221aebe9db2457e861e096eb Binary 
files /dev/null and b/gf/sd-webui-rich-text/diffusers_official/schedulers/__pycache__/__init__.cpython-310.pyc differ diff --git a/gf/sd-webui-rich-text/diffusers_official/schedulers/__pycache__/scheduling_consistency_models.cpython-310.pyc b/gf/sd-webui-rich-text/diffusers_official/schedulers/__pycache__/scheduling_consistency_models.cpython-310.pyc new file mode 100644 index 0000000000000000000000000000000000000000..7d4e20dc884efb853e9857036c4869cca31ba5f2 Binary files /dev/null and b/gf/sd-webui-rich-text/diffusers_official/schedulers/__pycache__/scheduling_consistency_models.cpython-310.pyc differ diff --git a/gf/sd-webui-rich-text/diffusers_official/schedulers/__pycache__/scheduling_ddim.cpython-310.pyc b/gf/sd-webui-rich-text/diffusers_official/schedulers/__pycache__/scheduling_ddim.cpython-310.pyc new file mode 100644 index 0000000000000000000000000000000000000000..51f0e1e06dd45f56d331d1f8ec801e5db4a643e4 Binary files /dev/null and b/gf/sd-webui-rich-text/diffusers_official/schedulers/__pycache__/scheduling_ddim.cpython-310.pyc differ diff --git a/gf/sd-webui-rich-text/diffusers_official/schedulers/__pycache__/scheduling_ddim_inverse.cpython-310.pyc b/gf/sd-webui-rich-text/diffusers_official/schedulers/__pycache__/scheduling_ddim_inverse.cpython-310.pyc new file mode 100644 index 0000000000000000000000000000000000000000..c6c1bb4853bf91e751aa5c81b73dff97d0368455 Binary files /dev/null and b/gf/sd-webui-rich-text/diffusers_official/schedulers/__pycache__/scheduling_ddim_inverse.cpython-310.pyc differ diff --git a/gf/sd-webui-rich-text/diffusers_official/schedulers/__pycache__/scheduling_ddim_parallel.cpython-310.pyc b/gf/sd-webui-rich-text/diffusers_official/schedulers/__pycache__/scheduling_ddim_parallel.cpython-310.pyc new file mode 100644 index 0000000000000000000000000000000000000000..d4f35cdd67779bc1e76414afe5a147d24c5de4df Binary files /dev/null and b/gf/sd-webui-rich-text/diffusers_official/schedulers/__pycache__/scheduling_ddim_parallel.cpython-310.pyc differ diff --git a/gf/sd-webui-rich-text/diffusers_official/schedulers/__pycache__/scheduling_ddpm.cpython-310.pyc b/gf/sd-webui-rich-text/diffusers_official/schedulers/__pycache__/scheduling_ddpm.cpython-310.pyc new file mode 100644 index 0000000000000000000000000000000000000000..070d15fee88d066a335146ff95f2591e3c78f633 Binary files /dev/null and b/gf/sd-webui-rich-text/diffusers_official/schedulers/__pycache__/scheduling_ddpm.cpython-310.pyc differ diff --git a/gf/sd-webui-rich-text/diffusers_official/schedulers/__pycache__/scheduling_ddpm_parallel.cpython-310.pyc b/gf/sd-webui-rich-text/diffusers_official/schedulers/__pycache__/scheduling_ddpm_parallel.cpython-310.pyc new file mode 100644 index 0000000000000000000000000000000000000000..c1181a9a0e86163075204cb86492d6857665d77e Binary files /dev/null and b/gf/sd-webui-rich-text/diffusers_official/schedulers/__pycache__/scheduling_ddpm_parallel.cpython-310.pyc differ diff --git a/gf/sd-webui-rich-text/diffusers_official/schedulers/__pycache__/scheduling_deis_multistep.cpython-310.pyc b/gf/sd-webui-rich-text/diffusers_official/schedulers/__pycache__/scheduling_deis_multistep.cpython-310.pyc new file mode 100644 index 0000000000000000000000000000000000000000..b5aa965255970fd2cc8ea0f448f72bf64c65f6ab Binary files /dev/null and b/gf/sd-webui-rich-text/diffusers_official/schedulers/__pycache__/scheduling_deis_multistep.cpython-310.pyc differ diff --git a/gf/sd-webui-rich-text/diffusers_official/schedulers/__pycache__/scheduling_dpmsolver_multistep.cpython-310.pyc 
b/gf/sd-webui-rich-text/diffusers_official/schedulers/__pycache__/scheduling_dpmsolver_multistep.cpython-310.pyc new file mode 100644 index 0000000000000000000000000000000000000000..04cf21dde0c2d053b47de5edd1d3c59c575d9749 Binary files /dev/null and b/gf/sd-webui-rich-text/diffusers_official/schedulers/__pycache__/scheduling_dpmsolver_multistep.cpython-310.pyc differ diff --git a/gf/sd-webui-rich-text/diffusers_official/schedulers/__pycache__/scheduling_dpmsolver_multistep_inverse.cpython-310.pyc b/gf/sd-webui-rich-text/diffusers_official/schedulers/__pycache__/scheduling_dpmsolver_multistep_inverse.cpython-310.pyc new file mode 100644 index 0000000000000000000000000000000000000000..7c25dd8cf4134360446e15602edc4f9e3ca5c9d0 Binary files /dev/null and b/gf/sd-webui-rich-text/diffusers_official/schedulers/__pycache__/scheduling_dpmsolver_multistep_inverse.cpython-310.pyc differ diff --git a/gf/sd-webui-rich-text/diffusers_official/schedulers/__pycache__/scheduling_dpmsolver_sde.cpython-310.pyc b/gf/sd-webui-rich-text/diffusers_official/schedulers/__pycache__/scheduling_dpmsolver_sde.cpython-310.pyc new file mode 100644 index 0000000000000000000000000000000000000000..af3bd9989f815bd96048d39242ccecd9a69360e4 Binary files /dev/null and b/gf/sd-webui-rich-text/diffusers_official/schedulers/__pycache__/scheduling_dpmsolver_sde.cpython-310.pyc differ diff --git a/gf/sd-webui-rich-text/diffusers_official/schedulers/__pycache__/scheduling_dpmsolver_singlestep.cpython-310.pyc b/gf/sd-webui-rich-text/diffusers_official/schedulers/__pycache__/scheduling_dpmsolver_singlestep.cpython-310.pyc new file mode 100644 index 0000000000000000000000000000000000000000..732422e3212c7f7886ad0811ae105d5d1c3546af Binary files /dev/null and b/gf/sd-webui-rich-text/diffusers_official/schedulers/__pycache__/scheduling_dpmsolver_singlestep.cpython-310.pyc differ diff --git a/gf/sd-webui-rich-text/diffusers_official/schedulers/__pycache__/scheduling_euler_ancestral_discrete.cpython-310.pyc b/gf/sd-webui-rich-text/diffusers_official/schedulers/__pycache__/scheduling_euler_ancestral_discrete.cpython-310.pyc new file mode 100644 index 0000000000000000000000000000000000000000..76f5392d08740f20a2e7f71a53561624f2136c1f Binary files /dev/null and b/gf/sd-webui-rich-text/diffusers_official/schedulers/__pycache__/scheduling_euler_ancestral_discrete.cpython-310.pyc differ diff --git a/gf/sd-webui-rich-text/diffusers_official/schedulers/__pycache__/scheduling_euler_discrete.cpython-310.pyc b/gf/sd-webui-rich-text/diffusers_official/schedulers/__pycache__/scheduling_euler_discrete.cpython-310.pyc new file mode 100644 index 0000000000000000000000000000000000000000..da2690aeebb9aab5962ee31f436077dfc4af30af Binary files /dev/null and b/gf/sd-webui-rich-text/diffusers_official/schedulers/__pycache__/scheduling_euler_discrete.cpython-310.pyc differ diff --git a/gf/sd-webui-rich-text/diffusers_official/schedulers/__pycache__/scheduling_heun_discrete.cpython-310.pyc b/gf/sd-webui-rich-text/diffusers_official/schedulers/__pycache__/scheduling_heun_discrete.cpython-310.pyc new file mode 100644 index 0000000000000000000000000000000000000000..c16437de8b12b24a09727869063a322049a226d9 Binary files /dev/null and b/gf/sd-webui-rich-text/diffusers_official/schedulers/__pycache__/scheduling_heun_discrete.cpython-310.pyc differ diff --git a/gf/sd-webui-rich-text/diffusers_official/schedulers/__pycache__/scheduling_ipndm.cpython-310.pyc b/gf/sd-webui-rich-text/diffusers_official/schedulers/__pycache__/scheduling_ipndm.cpython-310.pyc new file mode 100644 
index 0000000000000000000000000000000000000000..864d09b13dc879db196852aec1b623bd4cc485be Binary files /dev/null and b/gf/sd-webui-rich-text/diffusers_official/schedulers/__pycache__/scheduling_ipndm.cpython-310.pyc differ diff --git a/gf/sd-webui-rich-text/diffusers_official/schedulers/__pycache__/scheduling_k_dpm_2_ancestral_discrete.cpython-310.pyc b/gf/sd-webui-rich-text/diffusers_official/schedulers/__pycache__/scheduling_k_dpm_2_ancestral_discrete.cpython-310.pyc new file mode 100644 index 0000000000000000000000000000000000000000..32d637baf1bf990e0c75deef88cc8a76dbf5eca3 Binary files /dev/null and b/gf/sd-webui-rich-text/diffusers_official/schedulers/__pycache__/scheduling_k_dpm_2_ancestral_discrete.cpython-310.pyc differ diff --git a/gf/sd-webui-rich-text/diffusers_official/schedulers/__pycache__/scheduling_k_dpm_2_discrete.cpython-310.pyc b/gf/sd-webui-rich-text/diffusers_official/schedulers/__pycache__/scheduling_k_dpm_2_discrete.cpython-310.pyc new file mode 100644 index 0000000000000000000000000000000000000000..6e617389ab97b0a2fae20331774cd91091c1a574 Binary files /dev/null and b/gf/sd-webui-rich-text/diffusers_official/schedulers/__pycache__/scheduling_k_dpm_2_discrete.cpython-310.pyc differ diff --git a/gf/sd-webui-rich-text/diffusers_official/schedulers/__pycache__/scheduling_karras_ve.cpython-310.pyc b/gf/sd-webui-rich-text/diffusers_official/schedulers/__pycache__/scheduling_karras_ve.cpython-310.pyc new file mode 100644 index 0000000000000000000000000000000000000000..861e01e71a541a01c51b1a4cea4ebd616e556a04 Binary files /dev/null and b/gf/sd-webui-rich-text/diffusers_official/schedulers/__pycache__/scheduling_karras_ve.cpython-310.pyc differ diff --git a/gf/sd-webui-rich-text/diffusers_official/schedulers/__pycache__/scheduling_lms_discrete.cpython-310.pyc b/gf/sd-webui-rich-text/diffusers_official/schedulers/__pycache__/scheduling_lms_discrete.cpython-310.pyc new file mode 100644 index 0000000000000000000000000000000000000000..3beb1f380cb8042d4645e4d63c5601d4d6af0da4 Binary files /dev/null and b/gf/sd-webui-rich-text/diffusers_official/schedulers/__pycache__/scheduling_lms_discrete.cpython-310.pyc differ diff --git a/gf/sd-webui-rich-text/diffusers_official/schedulers/__pycache__/scheduling_pndm.cpython-310.pyc b/gf/sd-webui-rich-text/diffusers_official/schedulers/__pycache__/scheduling_pndm.cpython-310.pyc new file mode 100644 index 0000000000000000000000000000000000000000..081e3b44b07f057cb66c7de404f769c4c65b5048 Binary files /dev/null and b/gf/sd-webui-rich-text/diffusers_official/schedulers/__pycache__/scheduling_pndm.cpython-310.pyc differ diff --git a/gf/sd-webui-rich-text/diffusers_official/schedulers/__pycache__/scheduling_repaint.cpython-310.pyc b/gf/sd-webui-rich-text/diffusers_official/schedulers/__pycache__/scheduling_repaint.cpython-310.pyc new file mode 100644 index 0000000000000000000000000000000000000000..3582a922cf108998816ec9231e0be4dae813d804 Binary files /dev/null and b/gf/sd-webui-rich-text/diffusers_official/schedulers/__pycache__/scheduling_repaint.cpython-310.pyc differ diff --git a/gf/sd-webui-rich-text/diffusers_official/schedulers/__pycache__/scheduling_sde_ve.cpython-310.pyc b/gf/sd-webui-rich-text/diffusers_official/schedulers/__pycache__/scheduling_sde_ve.cpython-310.pyc new file mode 100644 index 0000000000000000000000000000000000000000..171e697daf01526ad69a81dd353a2940c80f3308 Binary files /dev/null and b/gf/sd-webui-rich-text/diffusers_official/schedulers/__pycache__/scheduling_sde_ve.cpython-310.pyc differ diff --git 
a/gf/sd-webui-rich-text/diffusers_official/schedulers/__pycache__/scheduling_sde_vp.cpython-310.pyc b/gf/sd-webui-rich-text/diffusers_official/schedulers/__pycache__/scheduling_sde_vp.cpython-310.pyc new file mode 100644 index 0000000000000000000000000000000000000000..d46d849028ca9ae4ca22df4dffc4e96664e48f94 Binary files /dev/null and b/gf/sd-webui-rich-text/diffusers_official/schedulers/__pycache__/scheduling_sde_vp.cpython-310.pyc differ diff --git a/gf/sd-webui-rich-text/diffusers_official/schedulers/__pycache__/scheduling_unclip.cpython-310.pyc b/gf/sd-webui-rich-text/diffusers_official/schedulers/__pycache__/scheduling_unclip.cpython-310.pyc new file mode 100644 index 0000000000000000000000000000000000000000..fba4c7bc50519afe5d0056cd124f2d46198f19a0 Binary files /dev/null and b/gf/sd-webui-rich-text/diffusers_official/schedulers/__pycache__/scheduling_unclip.cpython-310.pyc differ diff --git a/gf/sd-webui-rich-text/diffusers_official/schedulers/__pycache__/scheduling_unipc_multistep.cpython-310.pyc b/gf/sd-webui-rich-text/diffusers_official/schedulers/__pycache__/scheduling_unipc_multistep.cpython-310.pyc new file mode 100644 index 0000000000000000000000000000000000000000..55ac070f1f75fc37d8c3f60f25aae9c54b96ea88 Binary files /dev/null and b/gf/sd-webui-rich-text/diffusers_official/schedulers/__pycache__/scheduling_unipc_multistep.cpython-310.pyc differ diff --git a/gf/sd-webui-rich-text/diffusers_official/schedulers/__pycache__/scheduling_utils.cpython-310.pyc b/gf/sd-webui-rich-text/diffusers_official/schedulers/__pycache__/scheduling_utils.cpython-310.pyc new file mode 100644 index 0000000000000000000000000000000000000000..ac7f8fb22472a73cb1e440c730a0988f98a93898 Binary files /dev/null and b/gf/sd-webui-rich-text/diffusers_official/schedulers/__pycache__/scheduling_utils.cpython-310.pyc differ diff --git a/gf/sd-webui-rich-text/diffusers_official/schedulers/__pycache__/scheduling_vq_diffusion.cpython-310.pyc b/gf/sd-webui-rich-text/diffusers_official/schedulers/__pycache__/scheduling_vq_diffusion.cpython-310.pyc new file mode 100644 index 0000000000000000000000000000000000000000..d59afa170da448ca0336b1aa8d2db74177d9fae3 Binary files /dev/null and b/gf/sd-webui-rich-text/diffusers_official/schedulers/__pycache__/scheduling_vq_diffusion.cpython-310.pyc differ diff --git a/gf/sd-webui-rich-text/diffusers_official/schedulers/scheduling_consistency_models.py b/gf/sd-webui-rich-text/diffusers_official/schedulers/scheduling_consistency_models.py new file mode 100644 index 0000000000000000000000000000000000000000..fb296054d65b804af281dc99d940c8f0ba50e01b --- /dev/null +++ b/gf/sd-webui-rich-text/diffusers_official/schedulers/scheduling_consistency_models.py @@ -0,0 +1,380 @@ +# Copyright 2023 The HuggingFace Team. All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
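+# Multistep/one-step sampler for consistency models (Song et al., 2023): timesteps are mapped onto a Karras-style sigma schedule, each step denoises the sample with the boundary-condition scalings c_skip/c_out, and fresh noise scaled by s_noise is injected between steps.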
+ +from dataclasses import dataclass +from typing import List, Optional, Tuple, Union + +import numpy as np +import torch + +from ..configuration_utils import ConfigMixin, register_to_config +from ..utils import BaseOutput, logging, randn_tensor +from .scheduling_utils import SchedulerMixin + + +logger = logging.get_logger(__name__) # pylint: disable=invalid-name + + +@dataclass +class CMStochasticIterativeSchedulerOutput(BaseOutput): + """ + Output class for the scheduler's step function output. + + Args: + prev_sample (`torch.FloatTensor` of shape `(batch_size, num_channels, height, width)` for images): + Computed sample (x_{t-1}) of previous timestep. `prev_sample` should be used as next model input in the + denoising loop. + """ + + prev_sample: torch.FloatTensor + + +class CMStochasticIterativeScheduler(SchedulerMixin, ConfigMixin): + """ + Multistep and onestep sampling for consistency models from Song et al. 2023 [1]. This implements Algorithm 1 in the + paper [1]. + + [1] Song, Yang and Dhariwal, Prafulla and Chen, Mark and Sutskever, Ilya. "Consistency Models" + https://arxiv.org/pdf/2303.01469 [2] Karras, Tero, et al. "Elucidating the Design Space of Diffusion-Based + Generative Models." https://arxiv.org/abs/2206.00364 + + [`~ConfigMixin`] takes care of storing all config attributes that are passed in the scheduler's `__init__` + function, such as `num_train_timesteps`. They can be accessed via `scheduler.config.num_train_timesteps`. + [`SchedulerMixin`] provides general loading and saving functionality via the [`SchedulerMixin.save_pretrained`] and + [`~SchedulerMixin.from_pretrained`] functions. + + Args: + num_train_timesteps (`int`): number of diffusion steps used to train the model. + sigma_min (`float`): + Minimum noise magnitude in the sigma schedule. This was set to 0.002 in the original implementation. + sigma_max (`float`): + Maximum noise magnitude in the sigma schedule. This was set to 80.0 in the original implementation. + sigma_data (`float`): + The standard deviation of the data distribution, following the EDM paper [2]. This was set to 0.5 in the + original implementation, which is also the original value suggested in the EDM paper. + s_noise (`float`): + The amount of additional noise to counteract loss of detail during sampling. A reasonable range is [1.000, + 1.011]. This was set to 1.0 in the original implementation. + rho (`float`): + The rho parameter used for calculating the Karras sigma schedule, introduced in the EDM paper [2]. This was + set to 7.0 in the original implementation, which is also the original value suggested in the EDM paper. + clip_denoised (`bool`): + Whether to clip the denoised outputs to `(-1, 1)`. Defaults to `True`. + timesteps (`List` or `np.ndarray` or `torch.Tensor`, *optional*): + Optionally, an explicit timestep schedule can be specified. The timesteps are expected to be in increasing + order. 
+ """ + + order = 1 + + @register_to_config + def __init__( + self, + num_train_timesteps: int = 40, + sigma_min: float = 0.002, + sigma_max: float = 80.0, + sigma_data: float = 0.5, + s_noise: float = 1.0, + rho: float = 7.0, + clip_denoised: bool = True, + ): + # standard deviation of the initial noise distribution + self.init_noise_sigma = sigma_max + + ramp = np.linspace(0, 1, num_train_timesteps) + sigmas = self._convert_to_karras(ramp) + timesteps = self.sigma_to_t(sigmas) + + # setable values + self.num_inference_steps = None + self.sigmas = torch.from_numpy(sigmas) + self.timesteps = torch.from_numpy(timesteps) + self.custom_timesteps = False + self.is_scale_input_called = False + + def index_for_timestep(self, timestep, schedule_timesteps=None): + if schedule_timesteps is None: + schedule_timesteps = self.timesteps + + indices = (schedule_timesteps == timestep).nonzero() + return indices.item() + + def scale_model_input( + self, sample: torch.FloatTensor, timestep: Union[float, torch.FloatTensor] + ) -> torch.FloatTensor: + """ + Scales the consistency model input by `(sigma**2 + sigma_data**2) ** 0.5`, following the EDM model. + + Args: + sample (`torch.FloatTensor`): input sample + timestep (`float` or `torch.FloatTensor`): the current timestep in the diffusion chain + Returns: + `torch.FloatTensor`: scaled input sample + """ + # Get sigma corresponding to timestep + if isinstance(timestep, torch.Tensor): + timestep = timestep.to(self.timesteps.device) + step_idx = self.index_for_timestep(timestep) + sigma = self.sigmas[step_idx] + + sample = sample / ((sigma**2 + self.config.sigma_data**2) ** 0.5) + + self.is_scale_input_called = True + return sample + + def sigma_to_t(self, sigmas: Union[float, np.ndarray]): + """ + Gets scaled timesteps from the Karras sigmas, for input to the consistency model. + + Args: + sigmas (`float` or `np.ndarray`): single Karras sigma or array of Karras sigmas + Returns: + `float` or `np.ndarray`: scaled input timestep or scaled input timestep array + """ + if not isinstance(sigmas, np.ndarray): + sigmas = np.array(sigmas, dtype=np.float64) + + timesteps = 1000 * 0.25 * np.log(sigmas + 1e-44) + + return timesteps + + def set_timesteps( + self, + num_inference_steps: Optional[int] = None, + device: Union[str, torch.device] = None, + timesteps: Optional[List[int]] = None, + ): + """ + Sets the timesteps used for the diffusion chain. Supporting function to be run before inference. + + Args: + num_inference_steps (`int`): + the number of diffusion steps used when generating samples with a pre-trained model. + device (`str` or `torch.device`, optional): + the device to which the timesteps should be moved to. If `None`, the timesteps are not moved. + timesteps (`List[int]`, optional): + custom timesteps used to support arbitrary spacing between timesteps. If `None`, then the default + timestep spacing strategy of equal spacing between timesteps is used. If passed, `num_inference_steps` + must be `None`. 
+ """ + if num_inference_steps is None and timesteps is None: + raise ValueError("Exactly one of `num_inference_steps` or `timesteps` must be supplied.") + + if num_inference_steps is not None and timesteps is not None: + raise ValueError("Can only pass one of `num_inference_steps` or `timesteps`.") + + # Follow DDPMScheduler custom timesteps logic + if timesteps is not None: + for i in range(1, len(timesteps)): + if timesteps[i] >= timesteps[i - 1]: + raise ValueError("`timesteps` must be in descending order.") + + if timesteps[0] >= self.config.num_train_timesteps: + raise ValueError( + f"`timesteps` must start before `self.config.train_timesteps`:" + f" {self.config.num_train_timesteps}." + ) + + timesteps = np.array(timesteps, dtype=np.int64) + self.custom_timesteps = True + else: + if num_inference_steps > self.config.num_train_timesteps: + raise ValueError( + f"`num_inference_steps`: {num_inference_steps} cannot be larger than `self.config.train_timesteps`:" + f" {self.config.num_train_timesteps} as the unet model trained with this scheduler can only handle" + f" maximal {self.config.num_train_timesteps} timesteps." + ) + + self.num_inference_steps = num_inference_steps + + step_ratio = self.config.num_train_timesteps // self.num_inference_steps + timesteps = (np.arange(0, num_inference_steps) * step_ratio).round()[::-1].copy().astype(np.int64) + self.custom_timesteps = False + + # Map timesteps to Karras sigmas directly for multistep sampling + # See https://github.com/openai/consistency_models/blob/main/cm/karras_diffusion.py#L675 + num_train_timesteps = self.config.num_train_timesteps + ramp = timesteps[::-1].copy() + ramp = ramp / (num_train_timesteps - 1) + sigmas = self._convert_to_karras(ramp) + timesteps = self.sigma_to_t(sigmas) + + sigmas = np.concatenate([sigmas, [self.sigma_min]]).astype(np.float32) + self.sigmas = torch.from_numpy(sigmas).to(device=device) + + if str(device).startswith("mps"): + # mps does not support float64 + self.timesteps = torch.from_numpy(timesteps).to(device, dtype=torch.float32) + else: + self.timesteps = torch.from_numpy(timesteps).to(device=device) + + # Modified _convert_to_karras implementation that takes in ramp as argument + def _convert_to_karras(self, ramp): + """Constructs the noise schedule of Karras et al. (2022).""" + + sigma_min: float = self.config.sigma_min + sigma_max: float = self.config.sigma_max + + rho = self.config.rho + min_inv_rho = sigma_min ** (1 / rho) + max_inv_rho = sigma_max ** (1 / rho) + sigmas = (max_inv_rho + ramp * (min_inv_rho - max_inv_rho)) ** rho + return sigmas + + def get_scalings(self, sigma): + sigma_data = self.config.sigma_data + + c_skip = sigma_data**2 / (sigma**2 + sigma_data**2) + c_out = sigma * sigma_data / (sigma**2 + sigma_data**2) ** 0.5 + return c_skip, c_out + + def get_scalings_for_boundary_condition(self, sigma): + """ + Gets the scalings used in the consistency model parameterization, following Appendix C of the original paper. + This enforces the consistency model boundary condition. + + Note that `epsilon` in the equations for c_skip and c_out is set to sigma_min. + + Args: + sigma (`torch.FloatTensor`): + The current sigma in the Karras sigma schedule. + Returns: + `tuple`: + A two-element tuple where c_skip (which weights the current sample) is the first element and c_out + (which weights the consistency model output) is the second element. 
+ """ + sigma_min = self.config.sigma_min + sigma_data = self.config.sigma_data + + c_skip = sigma_data**2 / ((sigma - sigma_min) ** 2 + sigma_data**2) + c_out = (sigma - sigma_min) * sigma_data / (sigma**2 + sigma_data**2) ** 0.5 + return c_skip, c_out + + def step( + self, + model_output: torch.FloatTensor, + timestep: Union[float, torch.FloatTensor], + sample: torch.FloatTensor, + generator: Optional[torch.Generator] = None, + return_dict: bool = True, + ) -> Union[CMStochasticIterativeSchedulerOutput, Tuple]: + """ + Predict the sample at the previous timestep by reversing the SDE. Core function to propagate the diffusion + process from the learned model outputs (most often the predicted noise). + + Args: + model_output (`torch.FloatTensor`): direct output from learned diffusion model. + timestep (`float`): current timestep in the diffusion chain. + sample (`torch.FloatTensor`): + current instance of sample being created by diffusion process. + generator (`torch.Generator`, *optional*): Random number generator. + return_dict (`bool`): option for returning tuple rather than EulerDiscreteSchedulerOutput class + Returns: + [`~schedulers.scheduling_utils.CMStochasticIterativeSchedulerOutput`] or `tuple`: + [`~schedulers.scheduling_utils.CMStochasticIterativeSchedulerOutput`] if `return_dict` is True, otherwise a + `tuple`. When returning a tuple, the first element is the sample tensor. + """ + + if ( + isinstance(timestep, int) + or isinstance(timestep, torch.IntTensor) + or isinstance(timestep, torch.LongTensor) + ): + raise ValueError( + ( + "Passing integer indices (e.g. from `enumerate(timesteps)`) as timesteps to" + f" `{self.__class__}.step()` is not supported. Make sure to pass" + " one of the `scheduler.timesteps` as a timestep." + ), + ) + + if not self.is_scale_input_called: + logger.warning( + "The `scale_model_input` function should be called before `step` to ensure correct denoising. " + "See `StableDiffusionPipeline` for a usage example." + ) + + if isinstance(timestep, torch.Tensor): + timestep = timestep.to(self.timesteps.device) + + sigma_min = self.config.sigma_min + sigma_max = self.config.sigma_max + + step_index = self.index_for_timestep(timestep) + + # sigma_next corresponds to next_t in original implementation + sigma = self.sigmas[step_index] + if step_index + 1 < self.config.num_train_timesteps: + sigma_next = self.sigmas[step_index + 1] + else: + # Set sigma_next to sigma_min + sigma_next = self.sigmas[-1] + + # Get scalings for boundary conditions + c_skip, c_out = self.get_scalings_for_boundary_condition(sigma) + + # 1. Denoise model output using boundary conditions + denoised = c_out * model_output + c_skip * sample + if self.config.clip_denoised: + denoised = denoised.clamp(-1, 1) + + # 2. Sample z ~ N(0, s_noise^2 * I) + # Noise is not used for onestep sampling. + if len(self.timesteps) > 1: + noise = randn_tensor( + model_output.shape, dtype=model_output.dtype, device=model_output.device, generator=generator + ) + else: + noise = torch.zeros_like(model_output) + z = noise * self.config.s_noise + + sigma_hat = sigma_next.clamp(min=sigma_min, max=sigma_max) + + # 3. 
Return noisy sample + # tau = sigma_hat, eps = sigma_min + prev_sample = denoised + z * (sigma_hat**2 - sigma_min**2) ** 0.5 + + if not return_dict: + return (prev_sample,) + + return CMStochasticIterativeSchedulerOutput(prev_sample=prev_sample) + + # Copied from diffusers.schedulers.scheduling_euler_discrete.EulerDiscreteScheduler.add_noise + def add_noise( + self, + original_samples: torch.FloatTensor, + noise: torch.FloatTensor, + timesteps: torch.FloatTensor, + ) -> torch.FloatTensor: + # Make sure sigmas and timesteps have the same device and dtype as original_samples + sigmas = self.sigmas.to(device=original_samples.device, dtype=original_samples.dtype) + if original_samples.device.type == "mps" and torch.is_floating_point(timesteps): + # mps does not support float64 + schedule_timesteps = self.timesteps.to(original_samples.device, dtype=torch.float32) + timesteps = timesteps.to(original_samples.device, dtype=torch.float32) + else: + schedule_timesteps = self.timesteps.to(original_samples.device) + timesteps = timesteps.to(original_samples.device) + + step_indices = [(schedule_timesteps == t).nonzero().item() for t in timesteps] + + sigma = sigmas[step_indices].flatten() + while len(sigma.shape) < len(original_samples.shape): + sigma = sigma.unsqueeze(-1) + + noisy_samples = original_samples + noise * sigma + return noisy_samples + + def __len__(self): + return self.config.num_train_timesteps diff --git a/gf/sd-webui-rich-text/diffusers_official/schedulers/scheduling_ddim.py b/gf/sd-webui-rich-text/diffusers_official/schedulers/scheduling_ddim.py new file mode 100644 index 0000000000000000000000000000000000000000..a93255ca600ef34da1b6c1691c4c5e9f7f86c2ed --- /dev/null +++ b/gf/sd-webui-rich-text/diffusers_official/schedulers/scheduling_ddim.py @@ -0,0 +1,515 @@ +# Copyright 2023 Stanford University Team and The HuggingFace Team. All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +# DISCLAIMER: This code is strongly influenced by https://github.com/pesser/pytorch_diffusion +# and https://github.com/hojonathanho/diffusion + +import math +from dataclasses import dataclass +from typing import List, Optional, Tuple, Union + +import numpy as np +import torch + +from ..configuration_utils import ConfigMixin, register_to_config +from ..utils import BaseOutput, randn_tensor +from .scheduling_utils import KarrasDiffusionSchedulers, SchedulerMixin + + +@dataclass +# Copied from diffusers.schedulers.scheduling_ddpm.DDPMSchedulerOutput with DDPM->DDIM +class DDIMSchedulerOutput(BaseOutput): + """ + Output class for the scheduler's step function output. + + Args: + prev_sample (`torch.FloatTensor` of shape `(batch_size, num_channels, height, width)` for images): + Computed sample (x_{t-1}) of previous timestep. `prev_sample` should be used as next model input in the + denoising loop. 
+ pred_original_sample (`torch.FloatTensor` of shape `(batch_size, num_channels, height, width)` for images): + The predicted denoised sample (x_{0}) based on the model output from the current timestep. + `pred_original_sample` can be used to preview progress or for guidance. + """ + + prev_sample: torch.FloatTensor + pred_original_sample: Optional[torch.FloatTensor] = None + + +# Copied from diffusers.schedulers.scheduling_ddpm.betas_for_alpha_bar +def betas_for_alpha_bar( + num_diffusion_timesteps, + max_beta=0.999, + alpha_transform_type="cosine", +): + """ + Create a beta schedule that discretizes the given alpha_t_bar function, which defines the cumulative product of + (1-beta) over time from t = [0,1]. + + Contains a function alpha_bar that takes an argument t and transforms it to the cumulative product of (1-beta) up + to that part of the diffusion process. + + + Args: + num_diffusion_timesteps (`int`): the number of betas to produce. + max_beta (`float`): the maximum beta to use; use values lower than 1 to + prevent singularities. + alpha_transform_type (`str`, *optional*, default to `cosine`): the type of noise schedule for alpha_bar. + Choose from `cosine` or `exp` + + Returns: + betas (`np.ndarray`): the betas used by the scheduler to step the model outputs + """ + if alpha_transform_type == "cosine": + + def alpha_bar_fn(t): + return math.cos((t + 0.008) / 1.008 * math.pi / 2) ** 2 + + elif alpha_transform_type == "exp": + + def alpha_bar_fn(t): + return math.exp(t * -12.0) + + else: + raise ValueError(f"Unsupported alpha_tranform_type: {alpha_transform_type}") + + betas = [] + for i in range(num_diffusion_timesteps): + t1 = i / num_diffusion_timesteps + t2 = (i + 1) / num_diffusion_timesteps + betas.append(min(1 - alpha_bar_fn(t2) / alpha_bar_fn(t1), max_beta)) + return torch.tensor(betas, dtype=torch.float32) + + +def rescale_zero_terminal_snr(betas): + """ + Rescales betas to have zero terminal SNR Based on https://arxiv.org/pdf/2305.08891.pdf (Algorithm 1) + + + Args: + betas (`torch.FloatTensor`): + the betas that the scheduler is being initialized with. + + Returns: + `torch.FloatTensor`: rescaled betas with zero terminal SNR + """ + # Convert betas to alphas_bar_sqrt + alphas = 1.0 - betas + alphas_cumprod = torch.cumprod(alphas, dim=0) + alphas_bar_sqrt = alphas_cumprod.sqrt() + + # Store old values. + alphas_bar_sqrt_0 = alphas_bar_sqrt[0].clone() + alphas_bar_sqrt_T = alphas_bar_sqrt[-1].clone() + + # Shift so the last timestep is zero. + alphas_bar_sqrt -= alphas_bar_sqrt_T + + # Scale so the first timestep is back to the old value. + alphas_bar_sqrt *= alphas_bar_sqrt_0 / (alphas_bar_sqrt_0 - alphas_bar_sqrt_T) + + # Convert alphas_bar_sqrt to betas + alphas_bar = alphas_bar_sqrt**2 # Revert sqrt + alphas = alphas_bar[1:] / alphas_bar[:-1] # Revert cumprod + alphas = torch.cat([alphas_bar[0:1], alphas]) + betas = 1 - alphas + + return betas + + +class DDIMScheduler(SchedulerMixin, ConfigMixin): + """ + Denoising diffusion implicit models is a scheduler that extends the denoising procedure introduced in denoising + diffusion probabilistic models (DDPMs) with non-Markovian guidance. + + [`~ConfigMixin`] takes care of storing all config attributes that are passed in the scheduler's `__init__` + function, such as `num_train_timesteps`. They can be accessed via `scheduler.config.num_train_timesteps`. + [`SchedulerMixin`] provides general loading and saving functionality via the [`SchedulerMixin.save_pretrained`] and + [`~SchedulerMixin.from_pretrained`] functions. 
+ + For more details, see the original paper: https://arxiv.org/abs/2010.02502 + + Args: + num_train_timesteps (`int`): number of diffusion steps used to train the model. + beta_start (`float`): the starting `beta` value of inference. + beta_end (`float`): the final `beta` value. + beta_schedule (`str`): + the beta schedule, a mapping from a beta range to a sequence of betas for stepping the model. Choose from + `linear`, `scaled_linear`, or `squaredcos_cap_v2`. + trained_betas (`np.ndarray`, optional): + option to pass an array of betas directly to the constructor to bypass `beta_start`, `beta_end` etc. + clip_sample (`bool`, default `True`): + option to clip predicted sample for numerical stability. + clip_sample_range (`float`, default `1.0`): + the maximum magnitude for sample clipping. Valid only when `clip_sample=True`. + set_alpha_to_one (`bool`, default `True`): + each diffusion step uses the value of alphas product at that step and at the previous one. For the final + step there is no previous alpha. When this option is `True` the previous alpha product is fixed to `1`, + otherwise it uses the value of alpha at step 0. + steps_offset (`int`, default `0`): + an offset added to the inference steps. You can use a combination of `offset=1` and + `set_alpha_to_one=False`, to make the last step use step 0 for the previous alpha product, as done in + stable diffusion. + prediction_type (`str`, default `epsilon`, optional): + prediction type of the scheduler function, one of `epsilon` (predicting the noise of the diffusion + process), `sample` (directly predicting the noisy sample`) or `v_prediction` (see section 2.4 + https://imagen.research.google/video/paper.pdf) + thresholding (`bool`, default `False`): + whether to use the "dynamic thresholding" method (introduced by Imagen, https://arxiv.org/abs/2205.11487). + Note that the thresholding method is unsuitable for latent-space diffusion models (such as + stable-diffusion). + dynamic_thresholding_ratio (`float`, default `0.995`): + the ratio for the dynamic thresholding method. Default is `0.995`, the same as Imagen + (https://arxiv.org/abs/2205.11487). Valid only when `thresholding=True`. + sample_max_value (`float`, default `1.0`): + the threshold value for dynamic thresholding. Valid only when `thresholding=True`. + timestep_spacing (`str`, default `"leading"`): + The way the timesteps should be scaled. Refer to Table 2. of [Common Diffusion Noise Schedules and Sample + Steps are Flawed](https://arxiv.org/abs/2305.08891) for more information. + rescale_betas_zero_snr (`bool`, default `False`): + whether to rescale the betas to have zero terminal SNR (proposed by https://arxiv.org/pdf/2305.08891.pdf). + This can enable the model to generate very bright and dark samples instead of limiting it to samples with + medium brightness. Loosely related to + [`--offset_noise`](https://github.com/huggingface/diffusers/blob/74fd735eb073eb1d774b1ab4154a0876eb82f055/examples/dreambooth/train_dreambooth.py#L506). 
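+
+    A minimal usage sketch (illustrative only; `unet` stands in for an external denoising model that is not
+    part of this module):
+
+    >>> import torch
+    >>> scheduler = DDIMScheduler(num_train_timesteps=1000, beta_schedule="linear")
+    >>> scheduler.set_timesteps(50)
+    >>> sample = torch.randn(1, 4, 64, 64)  # start from pure noise
+    >>> for t in scheduler.timesteps:
+    ...     noise_pred = unet(sample, t).sample  # hypothetical denoiser call
+    ...     sample = scheduler.step(noise_pred, t, sample, eta=0.0).prev_sample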
+ """ + + _compatibles = [e.name for e in KarrasDiffusionSchedulers] + order = 1 + + @register_to_config + def __init__( + self, + num_train_timesteps: int = 1000, + beta_start: float = 0.0001, + beta_end: float = 0.02, + beta_schedule: str = "linear", + trained_betas: Optional[Union[np.ndarray, List[float]]] = None, + clip_sample: bool = True, + set_alpha_to_one: bool = True, + steps_offset: int = 0, + prediction_type: str = "epsilon", + thresholding: bool = False, + dynamic_thresholding_ratio: float = 0.995, + clip_sample_range: float = 1.0, + sample_max_value: float = 1.0, + timestep_spacing: str = "leading", + rescale_betas_zero_snr: bool = False, + ): + if trained_betas is not None: + self.betas = torch.tensor(trained_betas, dtype=torch.float32) + elif beta_schedule == "linear": + self.betas = torch.linspace(beta_start, beta_end, num_train_timesteps, dtype=torch.float32) + elif beta_schedule == "scaled_linear": + # this schedule is very specific to the latent diffusion model. + self.betas = ( + torch.linspace(beta_start**0.5, beta_end**0.5, num_train_timesteps, dtype=torch.float32) ** 2 + ) + elif beta_schedule == "squaredcos_cap_v2": + # Glide cosine schedule + self.betas = betas_for_alpha_bar(num_train_timesteps) + else: + raise NotImplementedError(f"{beta_schedule} does is not implemented for {self.__class__}") + + # Rescale for zero SNR + if rescale_betas_zero_snr: + self.betas = rescale_zero_terminal_snr(self.betas) + + self.alphas = 1.0 - self.betas + self.alphas_cumprod = torch.cumprod(self.alphas, dim=0) + + # At every step in ddim, we are looking into the previous alphas_cumprod + # For the final step, there is no previous alphas_cumprod because we are already at 0 + # `set_alpha_to_one` decides whether we set this parameter simply to one or + # whether we use the final alpha of the "non-previous" one. + self.final_alpha_cumprod = torch.tensor(1.0) if set_alpha_to_one else self.alphas_cumprod[0] + + # standard deviation of the initial noise distribution + self.init_noise_sigma = 1.0 + + # setable values + self.num_inference_steps = None + self.timesteps = torch.from_numpy(np.arange(0, num_train_timesteps)[::-1].copy().astype(np.int64)) + + def scale_model_input(self, sample: torch.FloatTensor, timestep: Optional[int] = None) -> torch.FloatTensor: + """ + Ensures interchangeability with schedulers that need to scale the denoising model input depending on the + current timestep. + + Args: + sample (`torch.FloatTensor`): input sample + timestep (`int`, optional): current timestep + + Returns: + `torch.FloatTensor`: scaled input sample + """ + return sample + + def _get_variance(self, timestep, prev_timestep): + alpha_prod_t = self.alphas_cumprod[timestep] + alpha_prod_t_prev = self.alphas_cumprod[prev_timestep] if prev_timestep >= 0 else self.final_alpha_cumprod + beta_prod_t = 1 - alpha_prod_t + beta_prod_t_prev = 1 - alpha_prod_t_prev + + variance = (beta_prod_t_prev / beta_prod_t) * (1 - alpha_prod_t / alpha_prod_t_prev) + + return variance + + # Copied from diffusers.schedulers.scheduling_ddpm.DDPMScheduler._threshold_sample + def _threshold_sample(self, sample: torch.FloatTensor) -> torch.FloatTensor: + """ + "Dynamic thresholding: At each sampling step we set s to a certain percentile absolute pixel value in xt0 (the + prediction of x_0 at timestep t), and if s > 1, then we threshold xt0 to the range [-s, s] and then divide by + s. Dynamic thresholding pushes saturated pixels (those near -1 and 1) inwards, thereby actively preventing + pixels from saturation at each step. 
We find that dynamic thresholding results in significantly better + photorealism as well as better image-text alignment, especially when using very large guidance weights." + + https://arxiv.org/abs/2205.11487 + """ + dtype = sample.dtype + batch_size, channels, height, width = sample.shape + + if dtype not in (torch.float32, torch.float64): + sample = sample.float() # upcast for quantile calculation, and clamp not implemented for cpu half + + # Flatten sample for doing quantile calculation along each image + sample = sample.reshape(batch_size, channels * height * width) + + abs_sample = sample.abs() # "a certain percentile absolute pixel value" + + s = torch.quantile(abs_sample, self.config.dynamic_thresholding_ratio, dim=1) + s = torch.clamp( + s, min=1, max=self.config.sample_max_value + ) # When clamped to min=1, equivalent to standard clipping to [-1, 1] + + s = s.unsqueeze(1) # (batch_size, 1) because clamp will broadcast along dim=0 + sample = torch.clamp(sample, -s, s) / s # "we threshold xt0 to the range [-s, s] and then divide by s" + + sample = sample.reshape(batch_size, channels, height, width) + sample = sample.to(dtype) + + return sample + + def set_timesteps(self, num_inference_steps: int, device: Union[str, torch.device] = None): + """ + Sets the discrete timesteps used for the diffusion chain. Supporting function to be run before inference. + + Args: + num_inference_steps (`int`): + the number of diffusion steps used when generating samples with a pre-trained model. + """ + + if num_inference_steps > self.config.num_train_timesteps: + raise ValueError( + f"`num_inference_steps`: {num_inference_steps} cannot be larger than `self.config.train_timesteps`:" + f" {self.config.num_train_timesteps} as the unet model trained with this scheduler can only handle" + f" maximal {self.config.num_train_timesteps} timesteps." + ) + + self.num_inference_steps = num_inference_steps + + # "linspace", "leading", "trailing" corresponds to annotation of Table 2. of https://arxiv.org/abs/2305.08891 + if self.config.timestep_spacing == "linspace": + timesteps = ( + np.linspace(0, self.config.num_train_timesteps - 1, num_inference_steps) + .round()[::-1] + .copy() + .astype(np.int64) + ) + elif self.config.timestep_spacing == "leading": + step_ratio = self.config.num_train_timesteps // self.num_inference_steps + # creates integer timesteps by multiplying by ratio + # casting to int to avoid issues when num_inference_step is power of 3 + timesteps = (np.arange(0, num_inference_steps) * step_ratio).round()[::-1].copy().astype(np.int64) + timesteps += self.config.steps_offset + elif self.config.timestep_spacing == "trailing": + step_ratio = self.config.num_train_timesteps / self.num_inference_steps + # creates integer timesteps by multiplying by ratio + # casting to int to avoid issues when num_inference_step is power of 3 + timesteps = np.round(np.arange(self.config.num_train_timesteps, 0, -step_ratio)).astype(np.int64) + timesteps -= 1 + else: + raise ValueError( + f"{self.config.timestep_spacing} is not supported. Please make sure to choose one of 'leading' or 'trailing'." 
+ ) + + self.timesteps = torch.from_numpy(timesteps).to(device) + + def step( + self, + model_output: torch.FloatTensor, + timestep: int, + sample: torch.FloatTensor, + eta: float = 0.0, + use_clipped_model_output: bool = False, + generator=None, + variance_noise: Optional[torch.FloatTensor] = None, + return_dict: bool = True, + ) -> Union[DDIMSchedulerOutput, Tuple]: + """ + Predict the sample at the previous timestep by reversing the SDE. Core function to propagate the diffusion + process from the learned model outputs (most often the predicted noise). + + Args: + model_output (`torch.FloatTensor`): direct output from learned diffusion model. + timestep (`int`): current discrete timestep in the diffusion chain. + sample (`torch.FloatTensor`): + current instance of sample being created by diffusion process. + eta (`float`): weight of noise for added noise in diffusion step. + use_clipped_model_output (`bool`): if `True`, compute "corrected" `model_output` from the clipped + predicted original sample. Necessary because predicted original sample is clipped to [-1, 1] when + `self.config.clip_sample` is `True`. If no clipping has happened, "corrected" `model_output` would + coincide with the one provided as input and `use_clipped_model_output` will have not effect. + generator: random number generator. + variance_noise (`torch.FloatTensor`): instead of generating noise for the variance using `generator`, we + can directly provide the noise for the variance itself. This is useful for methods such as + CycleDiffusion. (https://arxiv.org/abs/2210.05559) + return_dict (`bool`): option for returning tuple rather than DDIMSchedulerOutput class + + Returns: + [`~schedulers.scheduling_utils.DDIMSchedulerOutput`] or `tuple`: + [`~schedulers.scheduling_utils.DDIMSchedulerOutput`] if `return_dict` is True, otherwise a `tuple`. When + returning a tuple, the first element is the sample tensor. + + """ + if self.num_inference_steps is None: + raise ValueError( + "Number of inference steps is 'None', you need to run 'set_timesteps' after creating the scheduler" + ) + + # See formulas (12) and (16) of DDIM paper https://arxiv.org/pdf/2010.02502.pdf + # Ideally, read DDIM paper in-detail understanding + + # Notation ( -> + # - pred_noise_t -> e_theta(x_t, t) + # - pred_original_sample -> f_theta(x_t, t) or x_0 + # - std_dev_t -> sigma_t + # - eta -> η + # - pred_sample_direction -> "direction pointing to x_t" + # - pred_prev_sample -> "x_t-1" + + # 1. get previous step value (=t-1) + prev_timestep = timestep - self.config.num_train_timesteps // self.num_inference_steps + + # 2. compute alphas, betas + alpha_prod_t = self.alphas_cumprod[timestep] + alpha_prod_t_prev = self.alphas_cumprod[prev_timestep] if prev_timestep >= 0 else self.final_alpha_cumprod + + beta_prod_t = 1 - alpha_prod_t + + # 3. 
compute predicted original sample from predicted noise also called + # "predicted x_0" of formula (12) from https://arxiv.org/pdf/2010.02502.pdf + if self.config.prediction_type == "epsilon": + pred_original_sample = (sample - beta_prod_t ** (0.5) * model_output) / alpha_prod_t ** (0.5) + pred_epsilon = model_output + elif self.config.prediction_type == "sample": + pred_original_sample = model_output + pred_epsilon = (sample - alpha_prod_t ** (0.5) * pred_original_sample) / beta_prod_t ** (0.5) + elif self.config.prediction_type == "v_prediction": + pred_original_sample = (alpha_prod_t**0.5) * sample - (beta_prod_t**0.5) * model_output + pred_epsilon = (alpha_prod_t**0.5) * model_output + (beta_prod_t**0.5) * sample + else: + raise ValueError( + f"prediction_type given as {self.config.prediction_type} must be one of `epsilon`, `sample`, or" + " `v_prediction`" + ) + + # 4. Clip or threshold "predicted x_0" + if self.config.thresholding: + pred_original_sample = self._threshold_sample(pred_original_sample) + elif self.config.clip_sample: + pred_original_sample = pred_original_sample.clamp( + -self.config.clip_sample_range, self.config.clip_sample_range + ) + + # 5. compute variance: "sigma_t(η)" -> see formula (16) + # σ_t = sqrt((1 − α_t−1)/(1 − α_t)) * sqrt(1 − α_t/α_t−1) + variance = self._get_variance(timestep, prev_timestep) + std_dev_t = eta * variance ** (0.5) + + if use_clipped_model_output: + # the pred_epsilon is always re-derived from the clipped x_0 in Glide + pred_epsilon = (sample - alpha_prod_t ** (0.5) * pred_original_sample) / beta_prod_t ** (0.5) + + # 6. compute "direction pointing to x_t" of formula (12) from https://arxiv.org/pdf/2010.02502.pdf + pred_sample_direction = (1 - alpha_prod_t_prev - std_dev_t**2) ** (0.5) * pred_epsilon + + # 7. compute x_t without "random noise" of formula (12) from https://arxiv.org/pdf/2010.02502.pdf + prev_sample = alpha_prod_t_prev ** (0.5) * pred_original_sample + pred_sample_direction + + if eta > 0: + if variance_noise is not None and generator is not None: + raise ValueError( + "Cannot pass both generator and variance_noise. Please make sure that either `generator` or" + " `variance_noise` stays `None`." 
+ ) + + if variance_noise is None: + variance_noise = randn_tensor( + model_output.shape, generator=generator, device=model_output.device, dtype=model_output.dtype + ) + variance = std_dev_t * variance_noise + + prev_sample = prev_sample + variance + + if not return_dict: + return (prev_sample,) + + return DDIMSchedulerOutput(prev_sample=prev_sample, pred_original_sample=pred_original_sample) + + # Copied from diffusers.schedulers.scheduling_ddpm.DDPMScheduler.add_noise + def add_noise( + self, + original_samples: torch.FloatTensor, + noise: torch.FloatTensor, + timesteps: torch.IntTensor, + ) -> torch.FloatTensor: + # Make sure alphas_cumprod and timestep have same device and dtype as original_samples + alphas_cumprod = self.alphas_cumprod.to(device=original_samples.device, dtype=original_samples.dtype) + timesteps = timesteps.to(original_samples.device) + + sqrt_alpha_prod = alphas_cumprod[timesteps] ** 0.5 + sqrt_alpha_prod = sqrt_alpha_prod.flatten() + while len(sqrt_alpha_prod.shape) < len(original_samples.shape): + sqrt_alpha_prod = sqrt_alpha_prod.unsqueeze(-1) + + sqrt_one_minus_alpha_prod = (1 - alphas_cumprod[timesteps]) ** 0.5 + sqrt_one_minus_alpha_prod = sqrt_one_minus_alpha_prod.flatten() + while len(sqrt_one_minus_alpha_prod.shape) < len(original_samples.shape): + sqrt_one_minus_alpha_prod = sqrt_one_minus_alpha_prod.unsqueeze(-1) + + noisy_samples = sqrt_alpha_prod * original_samples + sqrt_one_minus_alpha_prod * noise + return noisy_samples + + # Copied from diffusers.schedulers.scheduling_ddpm.DDPMScheduler.get_velocity + def get_velocity( + self, sample: torch.FloatTensor, noise: torch.FloatTensor, timesteps: torch.IntTensor + ) -> torch.FloatTensor: + # Make sure alphas_cumprod and timestep have same device and dtype as sample + alphas_cumprod = self.alphas_cumprod.to(device=sample.device, dtype=sample.dtype) + timesteps = timesteps.to(sample.device) + + sqrt_alpha_prod = alphas_cumprod[timesteps] ** 0.5 + sqrt_alpha_prod = sqrt_alpha_prod.flatten() + while len(sqrt_alpha_prod.shape) < len(sample.shape): + sqrt_alpha_prod = sqrt_alpha_prod.unsqueeze(-1) + + sqrt_one_minus_alpha_prod = (1 - alphas_cumprod[timesteps]) ** 0.5 + sqrt_one_minus_alpha_prod = sqrt_one_minus_alpha_prod.flatten() + while len(sqrt_one_minus_alpha_prod.shape) < len(sample.shape): + sqrt_one_minus_alpha_prod = sqrt_one_minus_alpha_prod.unsqueeze(-1) + + velocity = sqrt_alpha_prod * noise - sqrt_one_minus_alpha_prod * sample + return velocity + + def __len__(self): + return self.config.num_train_timesteps diff --git a/gf/sd-webui-rich-text/diffusers_official/schedulers/scheduling_ddim_flax.py b/gf/sd-webui-rich-text/diffusers_official/schedulers/scheduling_ddim_flax.py new file mode 100644 index 0000000000000000000000000000000000000000..db248c33077bf502e31cb2ab97141744b828b514 --- /dev/null +++ b/gf/sd-webui-rich-text/diffusers_official/schedulers/scheduling_ddim_flax.py @@ -0,0 +1,305 @@ +# Copyright 2023 Stanford University Team and The HuggingFace Team. All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 
+# See the License for the specific language governing permissions and +# limitations under the License. + +# DISCLAIMER: This code is strongly influenced by https://github.com/pesser/pytorch_diffusion +# and https://github.com/hojonathanho/diffusion + +from dataclasses import dataclass +from typing import Optional, Tuple, Union + +import flax +import jax.numpy as jnp + +from ..configuration_utils import ConfigMixin, register_to_config +from .scheduling_utils_flax import ( + CommonSchedulerState, + FlaxKarrasDiffusionSchedulers, + FlaxSchedulerMixin, + FlaxSchedulerOutput, + add_noise_common, + get_velocity_common, +) + + +@flax.struct.dataclass +class DDIMSchedulerState: + common: CommonSchedulerState + final_alpha_cumprod: jnp.ndarray + + # setable values + init_noise_sigma: jnp.ndarray + timesteps: jnp.ndarray + num_inference_steps: Optional[int] = None + + @classmethod + def create( + cls, + common: CommonSchedulerState, + final_alpha_cumprod: jnp.ndarray, + init_noise_sigma: jnp.ndarray, + timesteps: jnp.ndarray, + ): + return cls( + common=common, + final_alpha_cumprod=final_alpha_cumprod, + init_noise_sigma=init_noise_sigma, + timesteps=timesteps, + ) + + +@dataclass +class FlaxDDIMSchedulerOutput(FlaxSchedulerOutput): + state: DDIMSchedulerState + + +class FlaxDDIMScheduler(FlaxSchedulerMixin, ConfigMixin): + """ + Denoising diffusion implicit models is a scheduler that extends the denoising procedure introduced in denoising + diffusion probabilistic models (DDPMs) with non-Markovian guidance. + + [`~ConfigMixin`] takes care of storing all config attributes that are passed in the scheduler's `__init__` + function, such as `num_train_timesteps`. They can be accessed via `scheduler.config.num_train_timesteps`. + [`SchedulerMixin`] provides general loading and saving functionality via the [`SchedulerMixin.save_pretrained`] and + [`~SchedulerMixin.from_pretrained`] functions. + + For more details, see the original paper: https://arxiv.org/abs/2010.02502 + + Args: + num_train_timesteps (`int`): number of diffusion steps used to train the model. + beta_start (`float`): the starting `beta` value of inference. + beta_end (`float`): the final `beta` value. + beta_schedule (`str`): + the beta schedule, a mapping from a beta range to a sequence of betas for stepping the model. Choose from + `linear`, `scaled_linear`, or `squaredcos_cap_v2`. + trained_betas (`jnp.ndarray`, optional): + option to pass an array of betas directly to the constructor to bypass `beta_start`, `beta_end` etc. + clip_sample (`bool`, default `True`): + option to clip predicted sample between -1 and 1 for numerical stability. + set_alpha_to_one (`bool`, default `True`): + each diffusion step uses the value of alphas product at that step and at the previous one. For the final + step there is no previous alpha. When this option is `True` the previous alpha product is fixed to `1`, + otherwise it uses the value of alpha at step 0. + steps_offset (`int`, default `0`): + an offset added to the inference steps. You can use a combination of `offset=1` and + `set_alpha_to_one=False`, to make the last step use step 0 for the previous alpha product, as done in + stable diffusion. + prediction_type (`str`, default `epsilon`): + indicates whether the model predicts the noise (epsilon), or the samples. One of `epsilon`, `sample`. + `v-prediction` is not supported for this scheduler. + dtype (`jnp.dtype`, *optional*, defaults to `jnp.float32`): + the `dtype` used for params and computation. 
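+
+    A minimal usage sketch (illustrative only; `unet_apply` and `params` stand in for an external Flax
+    denoising model and its parameters, which are not part of this module):
+
+    >>> import jax.numpy as jnp
+    >>> scheduler = FlaxDDIMScheduler(num_train_timesteps=1000)
+    >>> state = scheduler.create_state()
+    >>> state = scheduler.set_timesteps(state, num_inference_steps=50)
+    >>> sample = jnp.zeros((1, 4, 64, 64))  # would normally be the initial noise
+    >>> for t in state.timesteps:
+    ...     noise_pred = unet_apply(params, sample, t)  # hypothetical denoiser call
+    ...     sample, state = scheduler.step(state, noise_pred, t, sample, return_dict=False)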
+ """ + + _compatibles = [e.name for e in FlaxKarrasDiffusionSchedulers] + + dtype: jnp.dtype + + @property + def has_state(self): + return True + + @register_to_config + def __init__( + self, + num_train_timesteps: int = 1000, + beta_start: float = 0.0001, + beta_end: float = 0.02, + beta_schedule: str = "linear", + trained_betas: Optional[jnp.ndarray] = None, + set_alpha_to_one: bool = True, + steps_offset: int = 0, + prediction_type: str = "epsilon", + dtype: jnp.dtype = jnp.float32, + ): + self.dtype = dtype + + def create_state(self, common: Optional[CommonSchedulerState] = None) -> DDIMSchedulerState: + if common is None: + common = CommonSchedulerState.create(self) + + # At every step in ddim, we are looking into the previous alphas_cumprod + # For the final step, there is no previous alphas_cumprod because we are already at 0 + # `set_alpha_to_one` decides whether we set this parameter simply to one or + # whether we use the final alpha of the "non-previous" one. + final_alpha_cumprod = ( + jnp.array(1.0, dtype=self.dtype) if self.config.set_alpha_to_one else common.alphas_cumprod[0] + ) + + # standard deviation of the initial noise distribution + init_noise_sigma = jnp.array(1.0, dtype=self.dtype) + + timesteps = jnp.arange(0, self.config.num_train_timesteps).round()[::-1] + + return DDIMSchedulerState.create( + common=common, + final_alpha_cumprod=final_alpha_cumprod, + init_noise_sigma=init_noise_sigma, + timesteps=timesteps, + ) + + def scale_model_input( + self, state: DDIMSchedulerState, sample: jnp.ndarray, timestep: Optional[int] = None + ) -> jnp.ndarray: + """ + Args: + state (`PNDMSchedulerState`): the `FlaxPNDMScheduler` state data class instance. + sample (`jnp.ndarray`): input sample + timestep (`int`, optional): current timestep + + Returns: + `jnp.ndarray`: scaled input sample + """ + return sample + + def set_timesteps( + self, state: DDIMSchedulerState, num_inference_steps: int, shape: Tuple = () + ) -> DDIMSchedulerState: + """ + Sets the discrete timesteps used for the diffusion chain. Supporting function to be run before inference. + + Args: + state (`DDIMSchedulerState`): + the `FlaxDDIMScheduler` state data class instance. + num_inference_steps (`int`): + the number of diffusion steps used when generating samples with a pre-trained model. + """ + step_ratio = self.config.num_train_timesteps // num_inference_steps + # creates integer timesteps by multiplying by ratio + # rounding to avoid issues when num_inference_step is power of 3 + timesteps = (jnp.arange(0, num_inference_steps) * step_ratio).round()[::-1] + self.config.steps_offset + + return state.replace( + num_inference_steps=num_inference_steps, + timesteps=timesteps, + ) + + def _get_variance(self, state: DDIMSchedulerState, timestep, prev_timestep): + alpha_prod_t = state.common.alphas_cumprod[timestep] + alpha_prod_t_prev = jnp.where( + prev_timestep >= 0, state.common.alphas_cumprod[prev_timestep], state.final_alpha_cumprod + ) + beta_prod_t = 1 - alpha_prod_t + beta_prod_t_prev = 1 - alpha_prod_t_prev + + variance = (beta_prod_t_prev / beta_prod_t) * (1 - alpha_prod_t / alpha_prod_t_prev) + + return variance + + def step( + self, + state: DDIMSchedulerState, + model_output: jnp.ndarray, + timestep: int, + sample: jnp.ndarray, + eta: float = 0.0, + return_dict: bool = True, + ) -> Union[FlaxDDIMSchedulerOutput, Tuple]: + """ + Predict the sample at the previous timestep by reversing the SDE. 
Core function to propagate the diffusion + process from the learned model outputs (most often the predicted noise). + + Args: + state (`DDIMSchedulerState`): the `FlaxDDIMScheduler` state data class instance. + model_output (`jnp.ndarray`): direct output from learned diffusion model. + timestep (`int`): current discrete timestep in the diffusion chain. + sample (`jnp.ndarray`): + current instance of sample being created by diffusion process. + return_dict (`bool`): option for returning tuple rather than FlaxDDIMSchedulerOutput class + + Returns: + [`FlaxDDIMSchedulerOutput`] or `tuple`: [`FlaxDDIMSchedulerOutput`] if `return_dict` is True, otherwise a + `tuple`. When returning a tuple, the first element is the sample tensor. + + """ + if state.num_inference_steps is None: + raise ValueError( + "Number of inference steps is 'None', you need to run 'set_timesteps' after creating the scheduler" + ) + + # See formulas (12) and (16) of DDIM paper https://arxiv.org/pdf/2010.02502.pdf + # Ideally, read DDIM paper in-detail understanding + + # Notation ( -> + # - pred_noise_t -> e_theta(x_t, t) + # - pred_original_sample -> f_theta(x_t, t) or x_0 + # - std_dev_t -> sigma_t + # - eta -> η + # - pred_sample_direction -> "direction pointing to x_t" + # - pred_prev_sample -> "x_t-1" + + # 1. get previous step value (=t-1) + prev_timestep = timestep - self.config.num_train_timesteps // state.num_inference_steps + + alphas_cumprod = state.common.alphas_cumprod + final_alpha_cumprod = state.final_alpha_cumprod + + # 2. compute alphas, betas + alpha_prod_t = alphas_cumprod[timestep] + alpha_prod_t_prev = jnp.where(prev_timestep >= 0, alphas_cumprod[prev_timestep], final_alpha_cumprod) + + beta_prod_t = 1 - alpha_prod_t + + # 3. compute predicted original sample from predicted noise also called + # "predicted x_0" of formula (12) from https://arxiv.org/pdf/2010.02502.pdf + if self.config.prediction_type == "epsilon": + pred_original_sample = (sample - beta_prod_t ** (0.5) * model_output) / alpha_prod_t ** (0.5) + pred_epsilon = model_output + elif self.config.prediction_type == "sample": + pred_original_sample = model_output + pred_epsilon = (sample - alpha_prod_t ** (0.5) * pred_original_sample) / beta_prod_t ** (0.5) + elif self.config.prediction_type == "v_prediction": + pred_original_sample = (alpha_prod_t**0.5) * sample - (beta_prod_t**0.5) * model_output + pred_epsilon = (alpha_prod_t**0.5) * model_output + (beta_prod_t**0.5) * sample + else: + raise ValueError( + f"prediction_type given as {self.config.prediction_type} must be one of `epsilon`, `sample`, or" + " `v_prediction`" + ) + + # 4. compute variance: "sigma_t(η)" -> see formula (16) + # σ_t = sqrt((1 − α_t−1)/(1 − α_t)) * sqrt(1 − α_t/α_t−1) + variance = self._get_variance(state, timestep, prev_timestep) + std_dev_t = eta * variance ** (0.5) + + # 5. compute "direction pointing to x_t" of formula (12) from https://arxiv.org/pdf/2010.02502.pdf + pred_sample_direction = (1 - alpha_prod_t_prev - std_dev_t**2) ** (0.5) * pred_epsilon + + # 6. 
compute x_t without "random noise" of formula (12) from https://arxiv.org/pdf/2010.02502.pdf + prev_sample = alpha_prod_t_prev ** (0.5) * pred_original_sample + pred_sample_direction + + if not return_dict: + return (prev_sample, state) + + return FlaxDDIMSchedulerOutput(prev_sample=prev_sample, state=state) + + def add_noise( + self, + state: DDIMSchedulerState, + original_samples: jnp.ndarray, + noise: jnp.ndarray, + timesteps: jnp.ndarray, + ) -> jnp.ndarray: + return add_noise_common(state.common, original_samples, noise, timesteps) + + def get_velocity( + self, + state: DDIMSchedulerState, + sample: jnp.ndarray, + noise: jnp.ndarray, + timesteps: jnp.ndarray, + ) -> jnp.ndarray: + return get_velocity_common(state.common, sample, noise, timesteps) + + def __len__(self): + return self.config.num_train_timesteps diff --git a/gf/sd-webui-rich-text/diffusers_official/schedulers/scheduling_ddim_inverse.py b/gf/sd-webui-rich-text/diffusers_official/schedulers/scheduling_ddim_inverse.py new file mode 100644 index 0000000000000000000000000000000000000000..c04aabe035b57ef5e211530ad7f1a69bf6dcee54 --- /dev/null +++ b/gf/sd-webui-rich-text/diffusers_official/schedulers/scheduling_ddim_inverse.py @@ -0,0 +1,283 @@ +# Copyright 2023 The HuggingFace Team. All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +# DISCLAIMER: This code is strongly influenced by https://github.com/pesser/pytorch_diffusion +# and https://github.com/hojonathanho/diffusion +import math +from dataclasses import dataclass +from typing import List, Optional, Tuple, Union + +import numpy as np +import torch + +from diffusers.configuration_utils import ConfigMixin, register_to_config +from diffusers.schedulers.scheduling_utils import SchedulerMixin +from diffusers.utils import BaseOutput, deprecate + + +@dataclass +# Copied from diffusers.schedulers.scheduling_ddpm.DDPMSchedulerOutput with DDPM->DDIM +class DDIMSchedulerOutput(BaseOutput): + """ + Output class for the scheduler's step function output. + + Args: + prev_sample (`torch.FloatTensor` of shape `(batch_size, num_channels, height, width)` for images): + Computed sample (x_{t-1}) of previous timestep. `prev_sample` should be used as next model input in the + denoising loop. + pred_original_sample (`torch.FloatTensor` of shape `(batch_size, num_channels, height, width)` for images): + The predicted denoised sample (x_{0}) based on the model output from the current timestep. + `pred_original_sample` can be used to preview progress or for guidance. + """ + + prev_sample: torch.FloatTensor + pred_original_sample: Optional[torch.FloatTensor] = None + + +# Copied from diffusers.schedulers.scheduling_ddpm.betas_for_alpha_bar +def betas_for_alpha_bar( + num_diffusion_timesteps, + max_beta=0.999, + alpha_transform_type="cosine", +): + """ + Create a beta schedule that discretizes the given alpha_t_bar function, which defines the cumulative product of + (1-beta) over time from t = [0,1]. 
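+
+    Concretely (matching the implementation below), the default `cosine` transform uses
+    `alpha_bar(t) = cos((t + 0.008) / 1.008 * pi / 2) ** 2`, and each beta is
+    `beta_i = min(1 - alpha_bar((i + 1) / N) / alpha_bar(i / N), max_beta)` for `i = 0, ..., N - 1`.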
+ + Contains a function alpha_bar that takes an argument t and transforms it to the cumulative product of (1-beta) up + to that part of the diffusion process. + + + Args: + num_diffusion_timesteps (`int`): the number of betas to produce. + max_beta (`float`): the maximum beta to use; use values lower than 1 to + prevent singularities. + alpha_transform_type (`str`, *optional*, default to `cosine`): the type of noise schedule for alpha_bar. + Choose from `cosine` or `exp` + + Returns: + betas (`np.ndarray`): the betas used by the scheduler to step the model outputs + """ + if alpha_transform_type == "cosine": + + def alpha_bar_fn(t): + return math.cos((t + 0.008) / 1.008 * math.pi / 2) ** 2 + + elif alpha_transform_type == "exp": + + def alpha_bar_fn(t): + return math.exp(t * -12.0) + + else: + raise ValueError(f"Unsupported alpha_tranform_type: {alpha_transform_type}") + + betas = [] + for i in range(num_diffusion_timesteps): + t1 = i / num_diffusion_timesteps + t2 = (i + 1) / num_diffusion_timesteps + betas.append(min(1 - alpha_bar_fn(t2) / alpha_bar_fn(t1), max_beta)) + return torch.tensor(betas, dtype=torch.float32) + + +class DDIMInverseScheduler(SchedulerMixin, ConfigMixin): + """ + DDIMInverseScheduler is the reverse scheduler of [`DDIMScheduler`]. + + [`~ConfigMixin`] takes care of storing all config attributes that are passed in the scheduler's `__init__` + function, such as `num_train_timesteps`. They can be accessed via `scheduler.config.num_train_timesteps`. + [`SchedulerMixin`] provides general loading and saving functionality via the [`SchedulerMixin.save_pretrained`] and + [`~SchedulerMixin.from_pretrained`] functions. + + For more details, see the original paper: https://arxiv.org/abs/2010.02502 + + Args: + num_train_timesteps (`int`): number of diffusion steps used to train the model. + beta_start (`float`): the starting `beta` value of inference. + beta_end (`float`): the final `beta` value. + beta_schedule (`str`): + the beta schedule, a mapping from a beta range to a sequence of betas for stepping the model. Choose from + `linear`, `scaled_linear`, or `squaredcos_cap_v2`. + trained_betas (`np.ndarray`, optional): + option to pass an array of betas directly to the constructor to bypass `beta_start`, `beta_end` etc. + clip_sample (`bool`, default `True`): + option to clip predicted sample for numerical stability. + clip_sample_range (`float`, default `1.0`): + the maximum magnitude for sample clipping. Valid only when `clip_sample=True`. + set_alpha_to_zero (`bool`, default `True`): + each diffusion step uses the value of alphas product at that step and at the previous one. For the final + step there is no previous alpha. When this option is `True` the previous alpha product is fixed to `0`, + otherwise it uses the value of alpha at step `num_train_timesteps - 1`. + steps_offset (`int`, default `0`): + an offset added to the inference steps. You can use a combination of `offset=1` and + `set_alpha_to_zero=False`, to make the last step use step `num_train_timesteps - 1` for the previous alpha + product. 
+ prediction_type (`str`, default `epsilon`, optional): + prediction type of the scheduler function, one of `epsilon` (predicting the noise of the diffusion + process), `sample` (directly predicting the noisy sample`) or `v_prediction` (see section 2.4 + https://imagen.research.google/video/paper.pdf) + """ + + order = 1 + + @register_to_config + def __init__( + self, + num_train_timesteps: int = 1000, + beta_start: float = 0.0001, + beta_end: float = 0.02, + beta_schedule: str = "linear", + trained_betas: Optional[Union[np.ndarray, List[float]]] = None, + clip_sample: bool = True, + set_alpha_to_zero: bool = True, + steps_offset: int = 0, + prediction_type: str = "epsilon", + clip_sample_range: float = 1.0, + **kwargs, + ): + if kwargs.get("set_alpha_to_one", None) is not None: + deprecation_message = ( + "The `set_alpha_to_one` argument is deprecated. Please use `set_alpha_to_zero` instead." + ) + deprecate("set_alpha_to_one", "1.0.0", deprecation_message, standard_warn=False) + set_alpha_to_zero = kwargs["set_alpha_to_one"] + if trained_betas is not None: + self.betas = torch.tensor(trained_betas, dtype=torch.float32) + elif beta_schedule == "linear": + self.betas = torch.linspace(beta_start, beta_end, num_train_timesteps, dtype=torch.float32) + elif beta_schedule == "scaled_linear": + # this schedule is very specific to the latent diffusion model. + self.betas = ( + torch.linspace(beta_start**0.5, beta_end**0.5, num_train_timesteps, dtype=torch.float32) ** 2 + ) + elif beta_schedule == "squaredcos_cap_v2": + # Glide cosine schedule + self.betas = betas_for_alpha_bar(num_train_timesteps) + else: + raise NotImplementedError(f"{beta_schedule} does is not implemented for {self.__class__}") + + self.alphas = 1.0 - self.betas + self.alphas_cumprod = torch.cumprod(self.alphas, dim=0) + + # At every step in inverted ddim, we are looking into the next alphas_cumprod + # For the final step, there is no next alphas_cumprod, and the index is out of bounds + # `set_alpha_to_zero` decides whether we set this parameter simply to zero + # in this case, self.step() just output the predicted noise + # or whether we use the final alpha of the "non-previous" one. + self.final_alpha_cumprod = torch.tensor(0.0) if set_alpha_to_zero else self.alphas_cumprod[-1] + + # standard deviation of the initial noise distribution + self.init_noise_sigma = 1.0 + + # setable values + self.num_inference_steps = None + self.timesteps = torch.from_numpy(np.arange(0, num_train_timesteps).copy().astype(np.int64)) + + # Copied from diffusers.schedulers.scheduling_ddim.DDIMScheduler.scale_model_input + def scale_model_input(self, sample: torch.FloatTensor, timestep: Optional[int] = None) -> torch.FloatTensor: + """ + Ensures interchangeability with schedulers that need to scale the denoising model input depending on the + current timestep. + + Args: + sample (`torch.FloatTensor`): input sample + timestep (`int`, optional): current timestep + + Returns: + `torch.FloatTensor`: scaled input sample + """ + return sample + + def set_timesteps(self, num_inference_steps: int, device: Union[str, torch.device] = None): + """ + Sets the discrete timesteps used for the diffusion chain. Supporting function to be run before inference. + + Args: + num_inference_steps (`int`): + the number of diffusion steps used when generating samples with a pre-trained model. 
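+
+        For example (illustrative only; `unet` and `latents` are placeholders for an external denoiser and the
+        clean latents being inverted), with the default `num_train_timesteps=1000`, `set_timesteps(50)` yields
+        ascending timesteps `[0, 20, 40, ..., 980]`, which is the order the inversion loop runs in:
+
+        >>> inverse_scheduler = DDIMInverseScheduler(num_train_timesteps=1000)
+        >>> inverse_scheduler.set_timesteps(50)
+        >>> for t in inverse_scheduler.timesteps:
+        ...     noise_pred = unet(latents, t).sample  # hypothetical denoiser call
+        ...     latents = inverse_scheduler.step(noise_pred, t, latents).prev_sample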
+ """ + + if num_inference_steps > self.config.num_train_timesteps: + raise ValueError( + f"`num_inference_steps`: {num_inference_steps} cannot be larger than `self.config.train_timesteps`:" + f" {self.config.num_train_timesteps} as the unet model trained with this scheduler can only handle" + f" maximal {self.config.num_train_timesteps} timesteps." + ) + + self.num_inference_steps = num_inference_steps + step_ratio = self.config.num_train_timesteps // self.num_inference_steps + # creates integer timesteps by multiplying by ratio + # casting to int to avoid issues when num_inference_step is power of 3 + timesteps = (np.arange(0, num_inference_steps) * step_ratio).round().copy().astype(np.int64) + self.timesteps = torch.from_numpy(timesteps).to(device) + self.timesteps += self.config.steps_offset + + def step( + self, + model_output: torch.FloatTensor, + timestep: int, + sample: torch.FloatTensor, + eta: float = 0.0, + use_clipped_model_output: bool = False, + variance_noise: Optional[torch.FloatTensor] = None, + return_dict: bool = True, + ) -> Union[DDIMSchedulerOutput, Tuple]: + # 1. get previous step value (=t+1) + prev_timestep = timestep + self.config.num_train_timesteps // self.num_inference_steps + + # 2. compute alphas, betas + # change original implementation to exactly match noise levels for analogous forward process + alpha_prod_t = self.alphas_cumprod[timestep] + alpha_prod_t_prev = ( + self.alphas_cumprod[prev_timestep] + if prev_timestep < self.config.num_train_timesteps + else self.final_alpha_cumprod + ) + + beta_prod_t = 1 - alpha_prod_t + + # 3. compute predicted original sample from predicted noise also called + # "predicted x_0" of formula (12) from https://arxiv.org/pdf/2010.02502.pdf + if self.config.prediction_type == "epsilon": + pred_original_sample = (sample - beta_prod_t ** (0.5) * model_output) / alpha_prod_t ** (0.5) + pred_epsilon = model_output + elif self.config.prediction_type == "sample": + pred_original_sample = model_output + pred_epsilon = (sample - alpha_prod_t ** (0.5) * pred_original_sample) / beta_prod_t ** (0.5) + elif self.config.prediction_type == "v_prediction": + pred_original_sample = (alpha_prod_t**0.5) * sample - (beta_prod_t**0.5) * model_output + pred_epsilon = (alpha_prod_t**0.5) * model_output + (beta_prod_t**0.5) * sample + else: + raise ValueError( + f"prediction_type given as {self.config.prediction_type} must be one of `epsilon`, `sample`, or" + " `v_prediction`" + ) + + # 4. Clip or threshold "predicted x_0" + if self.config.clip_sample: + pred_original_sample = pred_original_sample.clamp( + -self.config.clip_sample_range, self.config.clip_sample_range + ) + + # 5. compute "direction pointing to x_t" of formula (12) from https://arxiv.org/pdf/2010.02502.pdf + pred_sample_direction = (1 - alpha_prod_t_prev) ** (0.5) * pred_epsilon + + # 6. 
compute x_t without "random noise" of formula (12) from https://arxiv.org/pdf/2010.02502.pdf + prev_sample = alpha_prod_t_prev ** (0.5) * pred_original_sample + pred_sample_direction + + if not return_dict: + return (prev_sample, pred_original_sample) + return DDIMSchedulerOutput(prev_sample=prev_sample, pred_original_sample=pred_original_sample) + + def __len__(self): + return self.config.num_train_timesteps diff --git a/gf/sd-webui-rich-text/diffusers_official/schedulers/scheduling_ddim_parallel.py b/gf/sd-webui-rich-text/diffusers_official/schedulers/scheduling_ddim_parallel.py new file mode 100644 index 0000000000000000000000000000000000000000..db3ea0e1cca55f88d0a81d0311158929516cb038 --- /dev/null +++ b/gf/sd-webui-rich-text/diffusers_official/schedulers/scheduling_ddim_parallel.py @@ -0,0 +1,642 @@ +# Copyright 2023 ParaDiGMS authors and The HuggingFace Team. All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +# DISCLAIMER: This code is strongly influenced by https://github.com/pesser/pytorch_diffusion +# and https://github.com/hojonathanho/diffusion + +import math +from dataclasses import dataclass +from typing import List, Optional, Tuple, Union + +import numpy as np +import torch + +from ..configuration_utils import ConfigMixin, register_to_config +from ..utils import BaseOutput, randn_tensor +from .scheduling_utils import KarrasDiffusionSchedulers, SchedulerMixin + + +@dataclass +# Copied from diffusers.schedulers.scheduling_ddpm.DDPMSchedulerOutput +class DDIMParallelSchedulerOutput(BaseOutput): + """ + Output class for the scheduler's step function output. + + Args: + prev_sample (`torch.FloatTensor` of shape `(batch_size, num_channels, height, width)` for images): + Computed sample (x_{t-1}) of previous timestep. `prev_sample` should be used as next model input in the + denoising loop. + pred_original_sample (`torch.FloatTensor` of shape `(batch_size, num_channels, height, width)` for images): + The predicted denoised sample (x_{0}) based on the model output from the current timestep. + `pred_original_sample` can be used to preview progress or for guidance. + """ + + prev_sample: torch.FloatTensor + pred_original_sample: Optional[torch.FloatTensor] = None + + +# Copied from diffusers.schedulers.scheduling_ddpm.betas_for_alpha_bar +def betas_for_alpha_bar( + num_diffusion_timesteps, + max_beta=0.999, + alpha_transform_type="cosine", +): + """ + Create a beta schedule that discretizes the given alpha_t_bar function, which defines the cumulative product of + (1-beta) over time from t = [0,1]. + + Contains a function alpha_bar that takes an argument t and transforms it to the cumulative product of (1-beta) up + to that part of the diffusion process. + + + Args: + num_diffusion_timesteps (`int`): the number of betas to produce. + max_beta (`float`): the maximum beta to use; use values lower than 1 to + prevent singularities. + alpha_transform_type (`str`, *optional*, default to `cosine`): the type of noise schedule for alpha_bar. 
+ Choose from `cosine` or `exp` + + Returns: + betas (`np.ndarray`): the betas used by the scheduler to step the model outputs + """ + if alpha_transform_type == "cosine": + + def alpha_bar_fn(t): + return math.cos((t + 0.008) / 1.008 * math.pi / 2) ** 2 + + elif alpha_transform_type == "exp": + + def alpha_bar_fn(t): + return math.exp(t * -12.0) + + else: + raise ValueError(f"Unsupported alpha_tranform_type: {alpha_transform_type}") + + betas = [] + for i in range(num_diffusion_timesteps): + t1 = i / num_diffusion_timesteps + t2 = (i + 1) / num_diffusion_timesteps + betas.append(min(1 - alpha_bar_fn(t2) / alpha_bar_fn(t1), max_beta)) + return torch.tensor(betas, dtype=torch.float32) + + +# Copied from diffusers.schedulers.scheduling_ddim.rescale_zero_terminal_snr +def rescale_zero_terminal_snr(betas): + """ + Rescales betas to have zero terminal SNR Based on https://arxiv.org/pdf/2305.08891.pdf (Algorithm 1) + + + Args: + betas (`torch.FloatTensor`): + the betas that the scheduler is being initialized with. + + Returns: + `torch.FloatTensor`: rescaled betas with zero terminal SNR + """ + # Convert betas to alphas_bar_sqrt + alphas = 1.0 - betas + alphas_cumprod = torch.cumprod(alphas, dim=0) + alphas_bar_sqrt = alphas_cumprod.sqrt() + + # Store old values. + alphas_bar_sqrt_0 = alphas_bar_sqrt[0].clone() + alphas_bar_sqrt_T = alphas_bar_sqrt[-1].clone() + + # Shift so the last timestep is zero. + alphas_bar_sqrt -= alphas_bar_sqrt_T + + # Scale so the first timestep is back to the old value. + alphas_bar_sqrt *= alphas_bar_sqrt_0 / (alphas_bar_sqrt_0 - alphas_bar_sqrt_T) + + # Convert alphas_bar_sqrt to betas + alphas_bar = alphas_bar_sqrt**2 # Revert sqrt + alphas = alphas_bar[1:] / alphas_bar[:-1] # Revert cumprod + alphas = torch.cat([alphas_bar[0:1], alphas]) + betas = 1 - alphas + + return betas + + +class DDIMParallelScheduler(SchedulerMixin, ConfigMixin): + """ + Denoising diffusion implicit models is a scheduler that extends the denoising procedure introduced in denoising + diffusion probabilistic models (DDPMs) with non-Markovian guidance. + + [`~ConfigMixin`] takes care of storing all config attributes that are passed in the scheduler's `__init__` + function, such as `num_train_timesteps`. They can be accessed via `scheduler.config.num_train_timesteps`. + [`SchedulerMixin`] provides general loading and saving functionality via the [`SchedulerMixin.save_pretrained`] and + [`~SchedulerMixin.from_pretrained`] functions. + + For more details, see the original paper: https://arxiv.org/abs/2010.02502 + + Args: + num_train_timesteps (`int`): number of diffusion steps used to train the model. + beta_start (`float`): the starting `beta` value of inference. + beta_end (`float`): the final `beta` value. + beta_schedule (`str`): + the beta schedule, a mapping from a beta range to a sequence of betas for stepping the model. Choose from + `linear`, `scaled_linear`, or `squaredcos_cap_v2`. + trained_betas (`np.ndarray`, optional): + option to pass an array of betas directly to the constructor to bypass `beta_start`, `beta_end` etc. + clip_sample (`bool`, default `True`): + option to clip predicted sample for numerical stability. + clip_sample_range (`float`, default `1.0`): + the maximum magnitude for sample clipping. Valid only when `clip_sample=True`. + set_alpha_to_one (`bool`, default `True`): + each diffusion step uses the value of alphas product at that step and at the previous one. For the final + step there is no previous alpha. 
When this option is `True` the previous alpha product is fixed to `1`, + otherwise it uses the value of alpha at step 0. + steps_offset (`int`, default `0`): + an offset added to the inference steps. You can use a combination of `offset=1` and + `set_alpha_to_one=False`, to make the last step use step 0 for the previous alpha product, as done in + stable diffusion. + prediction_type (`str`, default `epsilon`, optional): + prediction type of the scheduler function, one of `epsilon` (predicting the noise of the diffusion + process), `sample` (directly predicting the noisy sample`) or `v_prediction` (see section 2.4 + https://imagen.research.google/video/paper.pdf) + thresholding (`bool`, default `False`): + whether to use the "dynamic thresholding" method (introduced by Imagen, https://arxiv.org/abs/2205.11487). + Note that the thresholding method is unsuitable for latent-space diffusion models (such as + stable-diffusion). + dynamic_thresholding_ratio (`float`, default `0.995`): + the ratio for the dynamic thresholding method. Default is `0.995`, the same as Imagen + (https://arxiv.org/abs/2205.11487). Valid only when `thresholding=True`. + sample_max_value (`float`, default `1.0`): + the threshold value for dynamic thresholding. Valid only when `thresholding=True`. + timestep_spacing (`str`, default `"leading"`): + The way the timesteps should be scaled. Refer to Table 2. of [Common Diffusion Noise Schedules and Sample + Steps are Flawed](https://arxiv.org/abs/2305.08891) for more information. + rescale_betas_zero_snr (`bool`, default `False`): + whether to rescale the betas to have zero terminal SNR (proposed by https://arxiv.org/pdf/2305.08891.pdf). + This can enable the model to generate very bright and dark samples instead of limiting it to samples with + medium brightness. Loosely related to + [`--offset_noise`](https://github.com/huggingface/diffusers/blob/74fd735eb073eb1d774b1ab4154a0876eb82f055/examples/dreambooth/train_dreambooth.py#L506). + """ + + _compatibles = [e.name for e in KarrasDiffusionSchedulers] + order = 1 + _is_ode_scheduler = True + + @register_to_config + # Copied from diffusers.schedulers.scheduling_ddim.DDIMScheduler.__init__ + def __init__( + self, + num_train_timesteps: int = 1000, + beta_start: float = 0.0001, + beta_end: float = 0.02, + beta_schedule: str = "linear", + trained_betas: Optional[Union[np.ndarray, List[float]]] = None, + clip_sample: bool = True, + set_alpha_to_one: bool = True, + steps_offset: int = 0, + prediction_type: str = "epsilon", + thresholding: bool = False, + dynamic_thresholding_ratio: float = 0.995, + clip_sample_range: float = 1.0, + sample_max_value: float = 1.0, + timestep_spacing: str = "leading", + rescale_betas_zero_snr: bool = False, + ): + if trained_betas is not None: + self.betas = torch.tensor(trained_betas, dtype=torch.float32) + elif beta_schedule == "linear": + self.betas = torch.linspace(beta_start, beta_end, num_train_timesteps, dtype=torch.float32) + elif beta_schedule == "scaled_linear": + # this schedule is very specific to the latent diffusion model. 
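+            # "scaled_linear" interpolates linearly in sqrt(beta) space and then squares, i.e.
+            # beta_t = (sqrt(beta_start) + t / (T - 1) * (sqrt(beta_end) - sqrt(beta_start))) ** 2,
+            # the schedule used by the original Stable Diffusion checkpoints.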
+ self.betas = ( + torch.linspace(beta_start**0.5, beta_end**0.5, num_train_timesteps, dtype=torch.float32) ** 2 + ) + elif beta_schedule == "squaredcos_cap_v2": + # Glide cosine schedule + self.betas = betas_for_alpha_bar(num_train_timesteps) + else: + raise NotImplementedError(f"{beta_schedule} does is not implemented for {self.__class__}") + + # Rescale for zero SNR + if rescale_betas_zero_snr: + self.betas = rescale_zero_terminal_snr(self.betas) + + self.alphas = 1.0 - self.betas + self.alphas_cumprod = torch.cumprod(self.alphas, dim=0) + + # At every step in ddim, we are looking into the previous alphas_cumprod + # For the final step, there is no previous alphas_cumprod because we are already at 0 + # `set_alpha_to_one` decides whether we set this parameter simply to one or + # whether we use the final alpha of the "non-previous" one. + self.final_alpha_cumprod = torch.tensor(1.0) if set_alpha_to_one else self.alphas_cumprod[0] + + # standard deviation of the initial noise distribution + self.init_noise_sigma = 1.0 + + # setable values + self.num_inference_steps = None + self.timesteps = torch.from_numpy(np.arange(0, num_train_timesteps)[::-1].copy().astype(np.int64)) + + # Copied from diffusers.schedulers.scheduling_ddim.DDIMScheduler.scale_model_input + def scale_model_input(self, sample: torch.FloatTensor, timestep: Optional[int] = None) -> torch.FloatTensor: + """ + Ensures interchangeability with schedulers that need to scale the denoising model input depending on the + current timestep. + + Args: + sample (`torch.FloatTensor`): input sample + timestep (`int`, optional): current timestep + + Returns: + `torch.FloatTensor`: scaled input sample + """ + return sample + + def _get_variance(self, timestep, prev_timestep=None): + if prev_timestep is None: + prev_timestep = timestep - self.config.num_train_timesteps // self.num_inference_steps + + alpha_prod_t = self.alphas_cumprod[timestep] + alpha_prod_t_prev = self.alphas_cumprod[prev_timestep] if prev_timestep >= 0 else self.final_alpha_cumprod + beta_prod_t = 1 - alpha_prod_t + beta_prod_t_prev = 1 - alpha_prod_t_prev + + variance = (beta_prod_t_prev / beta_prod_t) * (1 - alpha_prod_t / alpha_prod_t_prev) + + return variance + + def _batch_get_variance(self, t, prev_t): + alpha_prod_t = self.alphas_cumprod[t] + alpha_prod_t_prev = self.alphas_cumprod[torch.clip(prev_t, min=0)] + alpha_prod_t_prev[prev_t < 0] = torch.tensor(1.0) + beta_prod_t = 1 - alpha_prod_t + beta_prod_t_prev = 1 - alpha_prod_t_prev + + variance = (beta_prod_t_prev / beta_prod_t) * (1 - alpha_prod_t / alpha_prod_t_prev) + + return variance + + # Copied from diffusers.schedulers.scheduling_ddpm.DDPMScheduler._threshold_sample + def _threshold_sample(self, sample: torch.FloatTensor) -> torch.FloatTensor: + """ + "Dynamic thresholding: At each sampling step we set s to a certain percentile absolute pixel value in xt0 (the + prediction of x_0 at timestep t), and if s > 1, then we threshold xt0 to the range [-s, s] and then divide by + s. Dynamic thresholding pushes saturated pixels (those near -1 and 1) inwards, thereby actively preventing + pixels from saturation at each step. We find that dynamic thresholding results in significantly better + photorealism as well as better image-text alignment, especially when using very large guidance weights." 
+ + https://arxiv.org/abs/2205.11487 + """ + dtype = sample.dtype + batch_size, channels, height, width = sample.shape + + if dtype not in (torch.float32, torch.float64): + sample = sample.float() # upcast for quantile calculation, and clamp not implemented for cpu half + + # Flatten sample for doing quantile calculation along each image + sample = sample.reshape(batch_size, channels * height * width) + + abs_sample = sample.abs() # "a certain percentile absolute pixel value" + + s = torch.quantile(abs_sample, self.config.dynamic_thresholding_ratio, dim=1) + s = torch.clamp( + s, min=1, max=self.config.sample_max_value + ) # When clamped to min=1, equivalent to standard clipping to [-1, 1] + + s = s.unsqueeze(1) # (batch_size, 1) because clamp will broadcast along dim=0 + sample = torch.clamp(sample, -s, s) / s # "we threshold xt0 to the range [-s, s] and then divide by s" + + sample = sample.reshape(batch_size, channels, height, width) + sample = sample.to(dtype) + + return sample + + # Copied from diffusers.schedulers.scheduling_ddim.DDIMScheduler.set_timesteps + def set_timesteps(self, num_inference_steps: int, device: Union[str, torch.device] = None): + """ + Sets the discrete timesteps used for the diffusion chain. Supporting function to be run before inference. + + Args: + num_inference_steps (`int`): + the number of diffusion steps used when generating samples with a pre-trained model. + """ + + if num_inference_steps > self.config.num_train_timesteps: + raise ValueError( + f"`num_inference_steps`: {num_inference_steps} cannot be larger than `self.config.train_timesteps`:" + f" {self.config.num_train_timesteps} as the unet model trained with this scheduler can only handle" + f" maximal {self.config.num_train_timesteps} timesteps." + ) + + self.num_inference_steps = num_inference_steps + + # "linspace", "leading", "trailing" corresponds to annotation of Table 2. of https://arxiv.org/abs/2305.08891 + if self.config.timestep_spacing == "linspace": + timesteps = ( + np.linspace(0, self.config.num_train_timesteps - 1, num_inference_steps) + .round()[::-1] + .copy() + .astype(np.int64) + ) + elif self.config.timestep_spacing == "leading": + step_ratio = self.config.num_train_timesteps // self.num_inference_steps + # creates integer timesteps by multiplying by ratio + # casting to int to avoid issues when num_inference_step is power of 3 + timesteps = (np.arange(0, num_inference_steps) * step_ratio).round()[::-1].copy().astype(np.int64) + timesteps += self.config.steps_offset + elif self.config.timestep_spacing == "trailing": + step_ratio = self.config.num_train_timesteps / self.num_inference_steps + # creates integer timesteps by multiplying by ratio + # casting to int to avoid issues when num_inference_step is power of 3 + timesteps = np.round(np.arange(self.config.num_train_timesteps, 0, -step_ratio)).astype(np.int64) + timesteps -= 1 + else: + raise ValueError( + f"{self.config.timestep_spacing} is not supported. Please make sure to choose one of 'leading' or 'trailing'." + ) + + self.timesteps = torch.from_numpy(timesteps).to(device) + + def step( + self, + model_output: torch.FloatTensor, + timestep: int, + sample: torch.FloatTensor, + eta: float = 0.0, + use_clipped_model_output: bool = False, + generator=None, + variance_noise: Optional[torch.FloatTensor] = None, + return_dict: bool = True, + ) -> Union[DDIMParallelSchedulerOutput, Tuple]: + """ + Predict the sample at the previous timestep by reversing the SDE. 
Core function to propagate the diffusion + process from the learned model outputs (most often the predicted noise). + + Args: + model_output (`torch.FloatTensor`): direct output from learned diffusion model. + timestep (`int`): current discrete timestep in the diffusion chain. + sample (`torch.FloatTensor`): + current instance of sample being created by diffusion process. + eta (`float`): weight of noise for added noise in diffusion step. + use_clipped_model_output (`bool`): if `True`, compute "corrected" `model_output` from the clipped + predicted original sample. Necessary because predicted original sample is clipped to [-1, 1] when + `self.config.clip_sample` is `True`. If no clipping has happened, "corrected" `model_output` would + coincide with the one provided as input and `use_clipped_model_output` will have not effect. + generator: random number generator. + variance_noise (`torch.FloatTensor`): instead of generating noise for the variance using `generator`, we + can directly provide the noise for the variance itself. This is useful for methods such as + CycleDiffusion. (https://arxiv.org/abs/2210.05559) + return_dict (`bool`): option for returning tuple rather than DDIMParallelSchedulerOutput class + + Returns: + [`~schedulers.scheduling_utils.DDIMParallelSchedulerOutput`] or `tuple`: + [`~schedulers.scheduling_utils.DDIMParallelSchedulerOutput`] if `return_dict` is True, otherwise a `tuple`. + When returning a tuple, the first element is the sample tensor. + + """ + if self.num_inference_steps is None: + raise ValueError( + "Number of inference steps is 'None', you need to run 'set_timesteps' after creating the scheduler" + ) + + # See formulas (12) and (16) of DDIM paper https://arxiv.org/pdf/2010.02502.pdf + # Ideally, read DDIM paper in-detail understanding + + # Notation ( -> + # - pred_noise_t -> e_theta(x_t, t) + # - pred_original_sample -> f_theta(x_t, t) or x_0 + # - std_dev_t -> sigma_t + # - eta -> η + # - pred_sample_direction -> "direction pointing to x_t" + # - pred_prev_sample -> "x_t-1" + + # 1. get previous step value (=t-1) + prev_timestep = timestep - self.config.num_train_timesteps // self.num_inference_steps + + # 2. compute alphas, betas + alpha_prod_t = self.alphas_cumprod[timestep] + alpha_prod_t_prev = self.alphas_cumprod[prev_timestep] if prev_timestep >= 0 else self.final_alpha_cumprod + + beta_prod_t = 1 - alpha_prod_t + + # 3. compute predicted original sample from predicted noise also called + # "predicted x_0" of formula (12) from https://arxiv.org/pdf/2010.02502.pdf + if self.config.prediction_type == "epsilon": + pred_original_sample = (sample - beta_prod_t ** (0.5) * model_output) / alpha_prod_t ** (0.5) + pred_epsilon = model_output + elif self.config.prediction_type == "sample": + pred_original_sample = model_output + pred_epsilon = (sample - alpha_prod_t ** (0.5) * pred_original_sample) / beta_prod_t ** (0.5) + elif self.config.prediction_type == "v_prediction": + pred_original_sample = (alpha_prod_t**0.5) * sample - (beta_prod_t**0.5) * model_output + pred_epsilon = (alpha_prod_t**0.5) * model_output + (beta_prod_t**0.5) * sample + else: + raise ValueError( + f"prediction_type given as {self.config.prediction_type} must be one of `epsilon`, `sample`, or" + " `v_prediction`" + ) + + # 4. 
Clip or threshold "predicted x_0" + if self.config.thresholding: + pred_original_sample = self._threshold_sample(pred_original_sample) + elif self.config.clip_sample: + pred_original_sample = pred_original_sample.clamp( + -self.config.clip_sample_range, self.config.clip_sample_range + ) + + # 5. compute variance: "sigma_t(η)" -> see formula (16) + # σ_t = sqrt((1 − α_t−1)/(1 − α_t)) * sqrt(1 − α_t/α_t−1) + variance = self._get_variance(timestep, prev_timestep) + std_dev_t = eta * variance ** (0.5) + + if use_clipped_model_output: + # the pred_epsilon is always re-derived from the clipped x_0 in Glide + pred_epsilon = (sample - alpha_prod_t ** (0.5) * pred_original_sample) / beta_prod_t ** (0.5) + + # 6. compute "direction pointing to x_t" of formula (12) from https://arxiv.org/pdf/2010.02502.pdf + pred_sample_direction = (1 - alpha_prod_t_prev - std_dev_t**2) ** (0.5) * pred_epsilon + + # 7. compute x_t without "random noise" of formula (12) from https://arxiv.org/pdf/2010.02502.pdf + prev_sample = alpha_prod_t_prev ** (0.5) * pred_original_sample + pred_sample_direction + + if eta > 0: + if variance_noise is not None and generator is not None: + raise ValueError( + "Cannot pass both generator and variance_noise. Please make sure that either `generator` or" + " `variance_noise` stays `None`." + ) + + if variance_noise is None: + variance_noise = randn_tensor( + model_output.shape, generator=generator, device=model_output.device, dtype=model_output.dtype + ) + variance = std_dev_t * variance_noise + + prev_sample = prev_sample + variance + + if not return_dict: + return (prev_sample,) + + return DDIMParallelSchedulerOutput(prev_sample=prev_sample, pred_original_sample=pred_original_sample) + + def batch_step_no_noise( + self, + model_output: torch.FloatTensor, + timesteps: List[int], + sample: torch.FloatTensor, + eta: float = 0.0, + use_clipped_model_output: bool = False, + ) -> torch.FloatTensor: + """ + Batched version of the `step` function, to be able to reverse the SDE for multiple samples/timesteps at once. + Also, does not add any noise to the predicted sample, which is necessary for parallel sampling where the noise + is pre-sampled by the pipeline. + + Predict the sample at the previous timestep by reversing the SDE. Core function to propagate the diffusion + process from the learned model outputs (most often the predicted noise). + + Args: + model_output (`torch.FloatTensor`): direct output from learned diffusion model. + timesteps (`List[int]`): + current discrete timesteps in the diffusion chain. This is now a list of integers. + sample (`torch.FloatTensor`): + current instance of sample being created by diffusion process. + eta (`float`): weight of noise for added noise in diffusion step. + use_clipped_model_output (`bool`): if `True`, compute "corrected" `model_output` from the clipped + predicted original sample. Necessary because predicted original sample is clipped to [-1, 1] when + `self.config.clip_sample` is `True`. If no clipping has happened, "corrected" `model_output` would + coincide with the one provided as input and `use_clipped_model_output` will have not effect. + + Returns: + `torch.FloatTensor`: sample tensor at previous timestep. 
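+
+        Example (illustrative sketch only; `scheduler`, `noise_preds`, `ts` and `latents` are
+        hypothetical names, with `noise_preds` and `latents` batched over the parallel window):
+
+            prev_latents = scheduler.batch_step_no_noise(noise_preds, ts, latents)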
+ + """ + if self.num_inference_steps is None: + raise ValueError( + "Number of inference steps is 'None', you need to run 'set_timesteps' after creating the scheduler" + ) + + assert eta == 0.0 + + # See formulas (12) and (16) of DDIM paper https://arxiv.org/pdf/2010.02502.pdf + # Ideally, read DDIM paper in-detail understanding + + # Notation ( -> + # - pred_noise_t -> e_theta(x_t, t) + # - pred_original_sample -> f_theta(x_t, t) or x_0 + # - std_dev_t -> sigma_t + # - eta -> η + # - pred_sample_direction -> "direction pointing to x_t" + # - pred_prev_sample -> "x_t-1" + + # 1. get previous step value (=t-1) + t = timesteps + prev_t = t - self.config.num_train_timesteps // self.num_inference_steps + + t = t.view(-1, *([1] * (model_output.ndim - 1))) + prev_t = prev_t.view(-1, *([1] * (model_output.ndim - 1))) + + # 1. compute alphas, betas + self.alphas_cumprod = self.alphas_cumprod.to(model_output.device) + self.final_alpha_cumprod = self.final_alpha_cumprod.to(model_output.device) + alpha_prod_t = self.alphas_cumprod[t] + alpha_prod_t_prev = self.alphas_cumprod[torch.clip(prev_t, min=0)] + alpha_prod_t_prev[prev_t < 0] = torch.tensor(1.0) + + beta_prod_t = 1 - alpha_prod_t + + # 3. compute predicted original sample from predicted noise also called + # "predicted x_0" of formula (12) from https://arxiv.org/pdf/2010.02502.pdf + if self.config.prediction_type == "epsilon": + pred_original_sample = (sample - beta_prod_t ** (0.5) * model_output) / alpha_prod_t ** (0.5) + pred_epsilon = model_output + elif self.config.prediction_type == "sample": + pred_original_sample = model_output + pred_epsilon = (sample - alpha_prod_t ** (0.5) * pred_original_sample) / beta_prod_t ** (0.5) + elif self.config.prediction_type == "v_prediction": + pred_original_sample = (alpha_prod_t**0.5) * sample - (beta_prod_t**0.5) * model_output + pred_epsilon = (alpha_prod_t**0.5) * model_output + (beta_prod_t**0.5) * sample + else: + raise ValueError( + f"prediction_type given as {self.config.prediction_type} must be one of `epsilon`, `sample`, or" + " `v_prediction`" + ) + + # 4. Clip or threshold "predicted x_0" + if self.config.thresholding: + pred_original_sample = self._threshold_sample(pred_original_sample) + elif self.config.clip_sample: + pred_original_sample = pred_original_sample.clamp( + -self.config.clip_sample_range, self.config.clip_sample_range + ) + + # 5. compute variance: "sigma_t(η)" -> see formula (16) + # σ_t = sqrt((1 − α_t−1)/(1 − α_t)) * sqrt(1 − α_t/α_t−1) + variance = self._batch_get_variance(t, prev_t).to(model_output.device).view(*alpha_prod_t_prev.shape) + std_dev_t = eta * variance ** (0.5) + + if use_clipped_model_output: + # the pred_epsilon is always re-derived from the clipped x_0 in Glide + pred_epsilon = (sample - alpha_prod_t ** (0.5) * pred_original_sample) / beta_prod_t ** (0.5) + + # 6. compute "direction pointing to x_t" of formula (12) from https://arxiv.org/pdf/2010.02502.pdf + pred_sample_direction = (1 - alpha_prod_t_prev - std_dev_t**2) ** (0.5) * pred_epsilon + + # 7. 
compute x_t without "random noise" of formula (12) from https://arxiv.org/pdf/2010.02502.pdf + prev_sample = alpha_prod_t_prev ** (0.5) * pred_original_sample + pred_sample_direction + + return prev_sample + + # Copied from diffusers.schedulers.scheduling_ddpm.DDPMScheduler.add_noise + def add_noise( + self, + original_samples: torch.FloatTensor, + noise: torch.FloatTensor, + timesteps: torch.IntTensor, + ) -> torch.FloatTensor: + # Make sure alphas_cumprod and timestep have same device and dtype as original_samples + alphas_cumprod = self.alphas_cumprod.to(device=original_samples.device, dtype=original_samples.dtype) + timesteps = timesteps.to(original_samples.device) + + sqrt_alpha_prod = alphas_cumprod[timesteps] ** 0.5 + sqrt_alpha_prod = sqrt_alpha_prod.flatten() + while len(sqrt_alpha_prod.shape) < len(original_samples.shape): + sqrt_alpha_prod = sqrt_alpha_prod.unsqueeze(-1) + + sqrt_one_minus_alpha_prod = (1 - alphas_cumprod[timesteps]) ** 0.5 + sqrt_one_minus_alpha_prod = sqrt_one_minus_alpha_prod.flatten() + while len(sqrt_one_minus_alpha_prod.shape) < len(original_samples.shape): + sqrt_one_minus_alpha_prod = sqrt_one_minus_alpha_prod.unsqueeze(-1) + + noisy_samples = sqrt_alpha_prod * original_samples + sqrt_one_minus_alpha_prod * noise + return noisy_samples + + # Copied from diffusers.schedulers.scheduling_ddpm.DDPMScheduler.get_velocity + def get_velocity( + self, sample: torch.FloatTensor, noise: torch.FloatTensor, timesteps: torch.IntTensor + ) -> torch.FloatTensor: + # Make sure alphas_cumprod and timestep have same device and dtype as sample + alphas_cumprod = self.alphas_cumprod.to(device=sample.device, dtype=sample.dtype) + timesteps = timesteps.to(sample.device) + + sqrt_alpha_prod = alphas_cumprod[timesteps] ** 0.5 + sqrt_alpha_prod = sqrt_alpha_prod.flatten() + while len(sqrt_alpha_prod.shape) < len(sample.shape): + sqrt_alpha_prod = sqrt_alpha_prod.unsqueeze(-1) + + sqrt_one_minus_alpha_prod = (1 - alphas_cumprod[timesteps]) ** 0.5 + sqrt_one_minus_alpha_prod = sqrt_one_minus_alpha_prod.flatten() + while len(sqrt_one_minus_alpha_prod.shape) < len(sample.shape): + sqrt_one_minus_alpha_prod = sqrt_one_minus_alpha_prod.unsqueeze(-1) + + velocity = sqrt_alpha_prod * noise - sqrt_one_minus_alpha_prod * sample + return velocity + + def __len__(self): + return self.config.num_train_timesteps diff --git a/gf/sd-webui-rich-text/diffusers_official/schedulers/scheduling_ddpm.py b/gf/sd-webui-rich-text/diffusers_official/schedulers/scheduling_ddpm.py new file mode 100644 index 0000000000000000000000000000000000000000..a1b7d7aaa9c22a3a768d1aed131794e810400936 --- /dev/null +++ b/gf/sd-webui-rich-text/diffusers_official/schedulers/scheduling_ddpm.py @@ -0,0 +1,513 @@ +# Copyright 2023 UC Berkeley Team and The HuggingFace Team. All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
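+
+# Editor's note: the snippet below is an illustrative usage sketch, not part of the upstream
+# diffusers source. It assumes a `model` callable that predicts the noise residual (epsilon):
+#
+#     scheduler = DDPMScheduler(num_train_timesteps=1000)
+#     scheduler.set_timesteps(50)
+#     sample = torch.randn(1, 3, 64, 64)  # start from pure Gaussian noise
+#     for t in scheduler.timesteps:
+#         noise_pred = model(sample, t)                               # hypothetical denoiser
+#         sample = scheduler.step(noise_pred, t, sample).prev_sample  # x_t -> x_{t-1}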
+ +# DISCLAIMER: This file is strongly influenced by https://github.com/ermongroup/ddim + +import math +from dataclasses import dataclass +from typing import List, Optional, Tuple, Union + +import numpy as np +import torch + +from ..configuration_utils import ConfigMixin, register_to_config +from ..utils import BaseOutput, randn_tensor +from .scheduling_utils import KarrasDiffusionSchedulers, SchedulerMixin + + +@dataclass +class DDPMSchedulerOutput(BaseOutput): + """ + Output class for the scheduler's step function output. + + Args: + prev_sample (`torch.FloatTensor` of shape `(batch_size, num_channels, height, width)` for images): + Computed sample (x_{t-1}) of previous timestep. `prev_sample` should be used as next model input in the + denoising loop. + pred_original_sample (`torch.FloatTensor` of shape `(batch_size, num_channels, height, width)` for images): + The predicted denoised sample (x_{0}) based on the model output from the current timestep. + `pred_original_sample` can be used to preview progress or for guidance. + """ + + prev_sample: torch.FloatTensor + pred_original_sample: Optional[torch.FloatTensor] = None + + +def betas_for_alpha_bar( + num_diffusion_timesteps, + max_beta=0.999, + alpha_transform_type="cosine", +): + """ + Create a beta schedule that discretizes the given alpha_t_bar function, which defines the cumulative product of + (1-beta) over time from t = [0,1]. + + Contains a function alpha_bar that takes an argument t and transforms it to the cumulative product of (1-beta) up + to that part of the diffusion process. + + + Args: + num_diffusion_timesteps (`int`): the number of betas to produce. + max_beta (`float`): the maximum beta to use; use values lower than 1 to + prevent singularities. + alpha_transform_type (`str`, *optional*, default to `cosine`): the type of noise schedule for alpha_bar. + Choose from `cosine` or `exp` + + Returns: + betas (`np.ndarray`): the betas used by the scheduler to step the model outputs + """ + if alpha_transform_type == "cosine": + + def alpha_bar_fn(t): + return math.cos((t + 0.008) / 1.008 * math.pi / 2) ** 2 + + elif alpha_transform_type == "exp": + + def alpha_bar_fn(t): + return math.exp(t * -12.0) + + else: + raise ValueError(f"Unsupported alpha_tranform_type: {alpha_transform_type}") + + betas = [] + for i in range(num_diffusion_timesteps): + t1 = i / num_diffusion_timesteps + t2 = (i + 1) / num_diffusion_timesteps + betas.append(min(1 - alpha_bar_fn(t2) / alpha_bar_fn(t1), max_beta)) + return torch.tensor(betas, dtype=torch.float32) + + +class DDPMScheduler(SchedulerMixin, ConfigMixin): + """ + Denoising diffusion probabilistic models (DDPMs) explores the connections between denoising score matching and + Langevin dynamics sampling. + + [`~ConfigMixin`] takes care of storing all config attributes that are passed in the scheduler's `__init__` + function, such as `num_train_timesteps`. They can be accessed via `scheduler.config.num_train_timesteps`. + [`SchedulerMixin`] provides general loading and saving functionality via the [`SchedulerMixin.save_pretrained`] and + [`~SchedulerMixin.from_pretrained`] functions. + + For more details, see the original paper: https://arxiv.org/abs/2006.11239 + + Args: + num_train_timesteps (`int`): number of diffusion steps used to train the model. + beta_start (`float`): the starting `beta` value of inference. + beta_end (`float`): the final `beta` value. + beta_schedule (`str`): + the beta schedule, a mapping from a beta range to a sequence of betas for stepping the model. 
Choose from + `linear`, `scaled_linear`, `squaredcos_cap_v2` or `sigmoid`. + trained_betas (`np.ndarray`, optional): + option to pass an array of betas directly to the constructor to bypass `beta_start`, `beta_end` etc. + variance_type (`str`): + options to clip the variance used when adding noise to the denoised sample. Choose from `fixed_small`, + `fixed_small_log`, `fixed_large`, `fixed_large_log`, `learned` or `learned_range`. + clip_sample (`bool`, default `True`): + option to clip predicted sample for numerical stability. + clip_sample_range (`float`, default `1.0`): + the maximum magnitude for sample clipping. Valid only when `clip_sample=True`. + prediction_type (`str`, default `epsilon`, optional): + prediction type of the scheduler function, one of `epsilon` (predicting the noise of the diffusion + process), `sample` (directly predicting the noisy sample`) or `v_prediction` (see section 2.4 + https://imagen.research.google/video/paper.pdf) + thresholding (`bool`, default `False`): + whether to use the "dynamic thresholding" method (introduced by Imagen, https://arxiv.org/abs/2205.11487). + Note that the thresholding method is unsuitable for latent-space diffusion models (such as + stable-diffusion). + dynamic_thresholding_ratio (`float`, default `0.995`): + the ratio for the dynamic thresholding method. Default is `0.995`, the same as Imagen + (https://arxiv.org/abs/2205.11487). Valid only when `thresholding=True`. + sample_max_value (`float`, default `1.0`): + the threshold value for dynamic thresholding. Valid only when `thresholding=True`. + timestep_spacing (`str`, default `"leading"`): + The way the timesteps should be scaled. Refer to Table 2. of [Common Diffusion Noise Schedules and Sample + Steps are Flawed](https://arxiv.org/abs/2305.08891) for more information. + steps_offset (`int`, default `0`): + an offset added to the inference steps. You can use a combination of `offset=1` and + `set_alpha_to_one=False`, to make the last step use step 0 for the previous alpha product, as done in + stable diffusion. + """ + + _compatibles = [e.name for e in KarrasDiffusionSchedulers] + order = 1 + + @register_to_config + def __init__( + self, + num_train_timesteps: int = 1000, + beta_start: float = 0.0001, + beta_end: float = 0.02, + beta_schedule: str = "linear", + trained_betas: Optional[Union[np.ndarray, List[float]]] = None, + variance_type: str = "fixed_small", + clip_sample: bool = True, + prediction_type: str = "epsilon", + thresholding: bool = False, + dynamic_thresholding_ratio: float = 0.995, + clip_sample_range: float = 1.0, + sample_max_value: float = 1.0, + timestep_spacing: str = "leading", + steps_offset: int = 0, + ): + if trained_betas is not None: + self.betas = torch.tensor(trained_betas, dtype=torch.float32) + elif beta_schedule == "linear": + self.betas = torch.linspace(beta_start, beta_end, num_train_timesteps, dtype=torch.float32) + elif beta_schedule == "scaled_linear": + # this schedule is very specific to the latent diffusion model. 
+ self.betas = ( + torch.linspace(beta_start**0.5, beta_end**0.5, num_train_timesteps, dtype=torch.float32) ** 2 + ) + elif beta_schedule == "squaredcos_cap_v2": + # Glide cosine schedule + self.betas = betas_for_alpha_bar(num_train_timesteps) + elif beta_schedule == "sigmoid": + # GeoDiff sigmoid schedule + betas = torch.linspace(-6, 6, num_train_timesteps) + self.betas = torch.sigmoid(betas) * (beta_end - beta_start) + beta_start + else: + raise NotImplementedError(f"{beta_schedule} does is not implemented for {self.__class__}") + + self.alphas = 1.0 - self.betas + self.alphas_cumprod = torch.cumprod(self.alphas, dim=0) + self.one = torch.tensor(1.0) + + # standard deviation of the initial noise distribution + self.init_noise_sigma = 1.0 + + # setable values + self.custom_timesteps = False + self.num_inference_steps = None + self.timesteps = torch.from_numpy(np.arange(0, num_train_timesteps)[::-1].copy()) + + self.variance_type = variance_type + + def scale_model_input(self, sample: torch.FloatTensor, timestep: Optional[int] = None) -> torch.FloatTensor: + """ + Ensures interchangeability with schedulers that need to scale the denoising model input depending on the + current timestep. + + Args: + sample (`torch.FloatTensor`): input sample + timestep (`int`, optional): current timestep + + Returns: + `torch.FloatTensor`: scaled input sample + """ + return sample + + def set_timesteps( + self, + num_inference_steps: Optional[int] = None, + device: Union[str, torch.device] = None, + timesteps: Optional[List[int]] = None, + ): + """ + Sets the discrete timesteps used for the diffusion chain. Supporting function to be run before inference. + + Args: + num_inference_steps (`Optional[int]`): + the number of diffusion steps used when generating samples with a pre-trained model. If passed, then + `timesteps` must be `None`. + device (`str` or `torch.device`, optional): + the device to which the timesteps are moved to. + custom_timesteps (`List[int]`, optional): + custom timesteps used to support arbitrary spacing between timesteps. If `None`, then the default + timestep spacing strategy of equal spacing between timesteps is used. If passed, `num_inference_steps` + must be `None`. + + """ + if num_inference_steps is not None and timesteps is not None: + raise ValueError("Can only pass one of `num_inference_steps` or `custom_timesteps`.") + + if timesteps is not None: + for i in range(1, len(timesteps)): + if timesteps[i] >= timesteps[i - 1]: + raise ValueError("`custom_timesteps` must be in descending order.") + + if timesteps[0] >= self.config.num_train_timesteps: + raise ValueError( + f"`timesteps` must start before `self.config.train_timesteps`:" + f" {self.config.num_train_timesteps}." + ) + + timesteps = np.array(timesteps, dtype=np.int64) + self.custom_timesteps = True + else: + if num_inference_steps > self.config.num_train_timesteps: + raise ValueError( + f"`num_inference_steps`: {num_inference_steps} cannot be larger than `self.config.train_timesteps`:" + f" {self.config.num_train_timesteps} as the unet model trained with this scheduler can only handle" + f" maximal {self.config.num_train_timesteps} timesteps." + ) + + self.num_inference_steps = num_inference_steps + self.custom_timesteps = False + + # "linspace", "leading", "trailing" corresponds to annotation of Table 2. 
of https://arxiv.org/abs/2305.08891 + if self.config.timestep_spacing == "linspace": + timesteps = ( + np.linspace(0, self.config.num_train_timesteps - 1, num_inference_steps) + .round()[::-1] + .copy() + .astype(np.int64) + ) + elif self.config.timestep_spacing == "leading": + step_ratio = self.config.num_train_timesteps // self.num_inference_steps + # creates integer timesteps by multiplying by ratio + # casting to int to avoid issues when num_inference_step is power of 3 + timesteps = (np.arange(0, num_inference_steps) * step_ratio).round()[::-1].copy().astype(np.int64) + timesteps += self.config.steps_offset + elif self.config.timestep_spacing == "trailing": + step_ratio = self.config.num_train_timesteps / self.num_inference_steps + # creates integer timesteps by multiplying by ratio + # casting to int to avoid issues when num_inference_step is power of 3 + timesteps = np.round(np.arange(self.config.num_train_timesteps, 0, -step_ratio)).astype(np.int64) + timesteps -= 1 + else: + raise ValueError( + f"{self.config.timestep_spacing} is not supported. Please make sure to choose one of 'linspace', 'leading' or 'trailing'." + ) + + self.timesteps = torch.from_numpy(timesteps).to(device) + + def _get_variance(self, t, predicted_variance=None, variance_type=None): + prev_t = self.previous_timestep(t) + + alpha_prod_t = self.alphas_cumprod[t] + alpha_prod_t_prev = self.alphas_cumprod[prev_t] if prev_t >= 0 else self.one + current_beta_t = 1 - alpha_prod_t / alpha_prod_t_prev + + # For t > 0, compute predicted variance βt (see formula (6) and (7) from https://arxiv.org/pdf/2006.11239.pdf) + # and sample from it to get previous sample + # x_{t-1} ~ N(pred_prev_sample, variance) == add variance to pred_sample + variance = (1 - alpha_prod_t_prev) / (1 - alpha_prod_t) * current_beta_t + + # we always take the log of variance, so clamp it to ensure it's not 0 + variance = torch.clamp(variance, min=1e-20) + + if variance_type is None: + variance_type = self.config.variance_type + + # hacks - were probably added for training stability + if variance_type == "fixed_small": + variance = variance + # for rl-diffuser https://arxiv.org/abs/2205.09991 + elif variance_type == "fixed_small_log": + variance = torch.log(variance) + variance = torch.exp(0.5 * variance) + elif variance_type == "fixed_large": + variance = current_beta_t + elif variance_type == "fixed_large_log": + # Glide max_log + variance = torch.log(current_beta_t) + elif variance_type == "learned": + return predicted_variance + elif variance_type == "learned_range": + min_log = torch.log(variance) + max_log = torch.log(current_beta_t) + frac = (predicted_variance + 1) / 2 + variance = frac * max_log + (1 - frac) * min_log + + return variance + + def _threshold_sample(self, sample: torch.FloatTensor) -> torch.FloatTensor: + """ + "Dynamic thresholding: At each sampling step we set s to a certain percentile absolute pixel value in xt0 (the + prediction of x_0 at timestep t), and if s > 1, then we threshold xt0 to the range [-s, s] and then divide by + s. Dynamic thresholding pushes saturated pixels (those near -1 and 1) inwards, thereby actively preventing + pixels from saturation at each step. We find that dynamic thresholding results in significantly better + photorealism as well as better image-text alignment, especially when using very large guidance weights." 
+ + https://arxiv.org/abs/2205.11487 + """ + dtype = sample.dtype + batch_size, channels, height, width = sample.shape + + if dtype not in (torch.float32, torch.float64): + sample = sample.float() # upcast for quantile calculation, and clamp not implemented for cpu half + + # Flatten sample for doing quantile calculation along each image + sample = sample.reshape(batch_size, channels * height * width) + + abs_sample = sample.abs() # "a certain percentile absolute pixel value" + + s = torch.quantile(abs_sample, self.config.dynamic_thresholding_ratio, dim=1) + s = torch.clamp( + s, min=1, max=self.config.sample_max_value + ) # When clamped to min=1, equivalent to standard clipping to [-1, 1] + + s = s.unsqueeze(1) # (batch_size, 1) because clamp will broadcast along dim=0 + sample = torch.clamp(sample, -s, s) / s # "we threshold xt0 to the range [-s, s] and then divide by s" + + sample = sample.reshape(batch_size, channels, height, width) + sample = sample.to(dtype) + + return sample + + def step( + self, + model_output: torch.FloatTensor, + timestep: int, + sample: torch.FloatTensor, + generator=None, + return_dict: bool = True, + ) -> Union[DDPMSchedulerOutput, Tuple]: + """ + Predict the sample at the previous timestep by reversing the SDE. Core function to propagate the diffusion + process from the learned model outputs (most often the predicted noise). + + Args: + model_output (`torch.FloatTensor`): direct output from learned diffusion model. + timestep (`int`): current discrete timestep in the diffusion chain. + sample (`torch.FloatTensor`): + current instance of sample being created by diffusion process. + generator: random number generator. + return_dict (`bool`): option for returning tuple rather than DDPMSchedulerOutput class + + Returns: + [`~schedulers.scheduling_utils.DDPMSchedulerOutput`] or `tuple`: + [`~schedulers.scheduling_utils.DDPMSchedulerOutput`] if `return_dict` is True, otherwise a `tuple`. When + returning a tuple, the first element is the sample tensor. + + """ + t = timestep + + prev_t = self.previous_timestep(t) + + if model_output.shape[1] == sample.shape[1] * 2 and self.variance_type in ["learned", "learned_range"]: + model_output, predicted_variance = torch.split(model_output, sample.shape[1], dim=1) + else: + predicted_variance = None + + # 1. compute alphas, betas + alpha_prod_t = self.alphas_cumprod[t] + alpha_prod_t_prev = self.alphas_cumprod[prev_t] if prev_t >= 0 else self.one + beta_prod_t = 1 - alpha_prod_t + beta_prod_t_prev = 1 - alpha_prod_t_prev + current_alpha_t = alpha_prod_t / alpha_prod_t_prev + current_beta_t = 1 - current_alpha_t + + # 2. compute predicted original sample from predicted noise also called + # "predicted x_0" of formula (15) from https://arxiv.org/pdf/2006.11239.pdf + if self.config.prediction_type == "epsilon": + pred_original_sample = (sample - beta_prod_t ** (0.5) * model_output) / alpha_prod_t ** (0.5) + elif self.config.prediction_type == "sample": + pred_original_sample = model_output + elif self.config.prediction_type == "v_prediction": + pred_original_sample = (alpha_prod_t**0.5) * sample - (beta_prod_t**0.5) * model_output + else: + raise ValueError( + f"prediction_type given as {self.config.prediction_type} must be one of `epsilon`, `sample` or" + " `v_prediction` for the DDPMScheduler." + ) + + # 3. 
Clip or threshold "predicted x_0" + if self.config.thresholding: + pred_original_sample = self._threshold_sample(pred_original_sample) + elif self.config.clip_sample: + pred_original_sample = pred_original_sample.clamp( + -self.config.clip_sample_range, self.config.clip_sample_range + ) + + # 4. Compute coefficients for pred_original_sample x_0 and current sample x_t + # See formula (7) from https://arxiv.org/pdf/2006.11239.pdf + pred_original_sample_coeff = (alpha_prod_t_prev ** (0.5) * current_beta_t) / beta_prod_t + current_sample_coeff = current_alpha_t ** (0.5) * beta_prod_t_prev / beta_prod_t + + # 5. Compute predicted previous sample µ_t + # See formula (7) from https://arxiv.org/pdf/2006.11239.pdf + pred_prev_sample = pred_original_sample_coeff * pred_original_sample + current_sample_coeff * sample + + # 6. Add noise + variance = 0 + if t > 0: + device = model_output.device + variance_noise = randn_tensor( + model_output.shape, generator=generator, device=device, dtype=model_output.dtype + ) + if self.variance_type == "fixed_small_log": + variance = self._get_variance(t, predicted_variance=predicted_variance) * variance_noise + elif self.variance_type == "learned_range": + variance = self._get_variance(t, predicted_variance=predicted_variance) + variance = torch.exp(0.5 * variance) * variance_noise + else: + variance = (self._get_variance(t, predicted_variance=predicted_variance) ** 0.5) * variance_noise + + pred_prev_sample = pred_prev_sample + variance + + if not return_dict: + return (pred_prev_sample,) + + return DDPMSchedulerOutput(prev_sample=pred_prev_sample, pred_original_sample=pred_original_sample) + + def add_noise( + self, + original_samples: torch.FloatTensor, + noise: torch.FloatTensor, + timesteps: torch.IntTensor, + ) -> torch.FloatTensor: + # Make sure alphas_cumprod and timestep have same device and dtype as original_samples + alphas_cumprod = self.alphas_cumprod.to(device=original_samples.device, dtype=original_samples.dtype) + timesteps = timesteps.to(original_samples.device) + + sqrt_alpha_prod = alphas_cumprod[timesteps] ** 0.5 + sqrt_alpha_prod = sqrt_alpha_prod.flatten() + while len(sqrt_alpha_prod.shape) < len(original_samples.shape): + sqrt_alpha_prod = sqrt_alpha_prod.unsqueeze(-1) + + sqrt_one_minus_alpha_prod = (1 - alphas_cumprod[timesteps]) ** 0.5 + sqrt_one_minus_alpha_prod = sqrt_one_minus_alpha_prod.flatten() + while len(sqrt_one_minus_alpha_prod.shape) < len(original_samples.shape): + sqrt_one_minus_alpha_prod = sqrt_one_minus_alpha_prod.unsqueeze(-1) + + noisy_samples = sqrt_alpha_prod * original_samples + sqrt_one_minus_alpha_prod * noise + return noisy_samples + + def get_velocity( + self, sample: torch.FloatTensor, noise: torch.FloatTensor, timesteps: torch.IntTensor + ) -> torch.FloatTensor: + # Make sure alphas_cumprod and timestep have same device and dtype as sample + alphas_cumprod = self.alphas_cumprod.to(device=sample.device, dtype=sample.dtype) + timesteps = timesteps.to(sample.device) + + sqrt_alpha_prod = alphas_cumprod[timesteps] ** 0.5 + sqrt_alpha_prod = sqrt_alpha_prod.flatten() + while len(sqrt_alpha_prod.shape) < len(sample.shape): + sqrt_alpha_prod = sqrt_alpha_prod.unsqueeze(-1) + + sqrt_one_minus_alpha_prod = (1 - alphas_cumprod[timesteps]) ** 0.5 + sqrt_one_minus_alpha_prod = sqrt_one_minus_alpha_prod.flatten() + while len(sqrt_one_minus_alpha_prod.shape) < len(sample.shape): + sqrt_one_minus_alpha_prod = sqrt_one_minus_alpha_prod.unsqueeze(-1) + + velocity = sqrt_alpha_prod * noise - sqrt_one_minus_alpha_prod * sample + 
return velocity + + def __len__(self): + return self.config.num_train_timesteps + + def previous_timestep(self, timestep): + if self.custom_timesteps: + index = (self.timesteps == timestep).nonzero(as_tuple=True)[0][0] + if index == self.timesteps.shape[0] - 1: + prev_t = torch.tensor(-1) + else: + prev_t = self.timesteps[index + 1] + else: + num_inference_steps = ( + self.num_inference_steps if self.num_inference_steps else self.config.num_train_timesteps + ) + prev_t = timestep - self.config.num_train_timesteps // num_inference_steps + + return prev_t diff --git a/gf/sd-webui-rich-text/diffusers_official/schedulers/scheduling_ddpm_flax.py b/gf/sd-webui-rich-text/diffusers_official/schedulers/scheduling_ddpm_flax.py new file mode 100644 index 0000000000000000000000000000000000000000..529d2bd03a75403e298ec7a30808689a48cf5301 --- /dev/null +++ b/gf/sd-webui-rich-text/diffusers_official/schedulers/scheduling_ddpm_flax.py @@ -0,0 +1,299 @@ +# Copyright 2023 UC Berkeley Team and The HuggingFace Team. All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +# DISCLAIMER: This file is strongly influenced by https://github.com/ermongroup/ddim + +from dataclasses import dataclass +from typing import Optional, Tuple, Union + +import flax +import jax +import jax.numpy as jnp + +from ..configuration_utils import ConfigMixin, register_to_config +from .scheduling_utils_flax import ( + CommonSchedulerState, + FlaxKarrasDiffusionSchedulers, + FlaxSchedulerMixin, + FlaxSchedulerOutput, + add_noise_common, + get_velocity_common, +) + + +@flax.struct.dataclass +class DDPMSchedulerState: + common: CommonSchedulerState + + # setable values + init_noise_sigma: jnp.ndarray + timesteps: jnp.ndarray + num_inference_steps: Optional[int] = None + + @classmethod + def create(cls, common: CommonSchedulerState, init_noise_sigma: jnp.ndarray, timesteps: jnp.ndarray): + return cls(common=common, init_noise_sigma=init_noise_sigma, timesteps=timesteps) + + +@dataclass +class FlaxDDPMSchedulerOutput(FlaxSchedulerOutput): + state: DDPMSchedulerState + + +class FlaxDDPMScheduler(FlaxSchedulerMixin, ConfigMixin): + """ + Denoising diffusion probabilistic models (DDPMs) explores the connections between denoising score matching and + Langevin dynamics sampling. + + [`~ConfigMixin`] takes care of storing all config attributes that are passed in the scheduler's `__init__` + function, such as `num_train_timesteps`. They can be accessed via `scheduler.config.num_train_timesteps`. + [`SchedulerMixin`] provides general loading and saving functionality via the [`SchedulerMixin.save_pretrained`] and + [`~SchedulerMixin.from_pretrained`] functions. + + For more details, see the original paper: https://arxiv.org/abs/2006.11239 + + Args: + num_train_timesteps (`int`): number of diffusion steps used to train the model. + beta_start (`float`): the starting `beta` value of inference. + beta_end (`float`): the final `beta` value. 
+ beta_schedule (`str`): + the beta schedule, a mapping from a beta range to a sequence of betas for stepping the model. Choose from + `linear`, `scaled_linear`, or `squaredcos_cap_v2`. + trained_betas (`np.ndarray`, optional): + option to pass an array of betas directly to the constructor to bypass `beta_start`, `beta_end` etc. + variance_type (`str`): + options to clip the variance used when adding noise to the denoised sample. Choose from `fixed_small`, + `fixed_small_log`, `fixed_large`, `fixed_large_log`, `learned` or `learned_range`. + clip_sample (`bool`, default `True`): + option to clip predicted sample between -1 and 1 for numerical stability. + prediction_type (`str`, default `epsilon`): + indicates whether the model predicts the noise (epsilon), or the samples. One of `epsilon`, `sample`. + `v-prediction` is not supported for this scheduler. + dtype (`jnp.dtype`, *optional*, defaults to `jnp.float32`): + the `dtype` used for params and computation. + """ + + _compatibles = [e.name for e in FlaxKarrasDiffusionSchedulers] + + dtype: jnp.dtype + + @property + def has_state(self): + return True + + @register_to_config + def __init__( + self, + num_train_timesteps: int = 1000, + beta_start: float = 0.0001, + beta_end: float = 0.02, + beta_schedule: str = "linear", + trained_betas: Optional[jnp.ndarray] = None, + variance_type: str = "fixed_small", + clip_sample: bool = True, + prediction_type: str = "epsilon", + dtype: jnp.dtype = jnp.float32, + ): + self.dtype = dtype + + def create_state(self, common: Optional[CommonSchedulerState] = None) -> DDPMSchedulerState: + if common is None: + common = CommonSchedulerState.create(self) + + # standard deviation of the initial noise distribution + init_noise_sigma = jnp.array(1.0, dtype=self.dtype) + + timesteps = jnp.arange(0, self.config.num_train_timesteps).round()[::-1] + + return DDPMSchedulerState.create( + common=common, + init_noise_sigma=init_noise_sigma, + timesteps=timesteps, + ) + + def scale_model_input( + self, state: DDPMSchedulerState, sample: jnp.ndarray, timestep: Optional[int] = None + ) -> jnp.ndarray: + """ + Args: + state (`PNDMSchedulerState`): the `FlaxPNDMScheduler` state data class instance. + sample (`jnp.ndarray`): input sample + timestep (`int`, optional): current timestep + + Returns: + `jnp.ndarray`: scaled input sample + """ + return sample + + def set_timesteps( + self, state: DDPMSchedulerState, num_inference_steps: int, shape: Tuple = () + ) -> DDPMSchedulerState: + """ + Sets the discrete timesteps used for the diffusion chain. Supporting function to be run before inference. + + Args: + state (`DDIMSchedulerState`): + the `FlaxDDPMScheduler` state data class instance. + num_inference_steps (`int`): + the number of diffusion steps used when generating samples with a pre-trained model. 
+ """ + + step_ratio = self.config.num_train_timesteps // num_inference_steps + # creates integer timesteps by multiplying by ratio + # rounding to avoid issues when num_inference_step is power of 3 + timesteps = (jnp.arange(0, num_inference_steps) * step_ratio).round()[::-1] + + return state.replace( + num_inference_steps=num_inference_steps, + timesteps=timesteps, + ) + + def _get_variance(self, state: DDPMSchedulerState, t, predicted_variance=None, variance_type=None): + alpha_prod_t = state.common.alphas_cumprod[t] + alpha_prod_t_prev = jnp.where(t > 0, state.common.alphas_cumprod[t - 1], jnp.array(1.0, dtype=self.dtype)) + + # For t > 0, compute predicted variance βt (see formula (6) and (7) from https://arxiv.org/pdf/2006.11239.pdf) + # and sample from it to get previous sample + # x_{t-1} ~ N(pred_prev_sample, variance) == add variance to pred_sample + variance = (1 - alpha_prod_t_prev) / (1 - alpha_prod_t) * state.common.betas[t] + + if variance_type is None: + variance_type = self.config.variance_type + + # hacks - were probably added for training stability + if variance_type == "fixed_small": + variance = jnp.clip(variance, a_min=1e-20) + # for rl-diffuser https://arxiv.org/abs/2205.09991 + elif variance_type == "fixed_small_log": + variance = jnp.log(jnp.clip(variance, a_min=1e-20)) + elif variance_type == "fixed_large": + variance = state.common.betas[t] + elif variance_type == "fixed_large_log": + # Glide max_log + variance = jnp.log(state.common.betas[t]) + elif variance_type == "learned": + return predicted_variance + elif variance_type == "learned_range": + min_log = variance + max_log = state.common.betas[t] + frac = (predicted_variance + 1) / 2 + variance = frac * max_log + (1 - frac) * min_log + + return variance + + def step( + self, + state: DDPMSchedulerState, + model_output: jnp.ndarray, + timestep: int, + sample: jnp.ndarray, + key: Optional[jax.random.KeyArray] = None, + return_dict: bool = True, + ) -> Union[FlaxDDPMSchedulerOutput, Tuple]: + """ + Predict the sample at the previous timestep by reversing the SDE. Core function to propagate the diffusion + process from the learned model outputs (most often the predicted noise). + + Args: + state (`DDPMSchedulerState`): the `FlaxDDPMScheduler` state data class instance. + model_output (`jnp.ndarray`): direct output from learned diffusion model. + timestep (`int`): current discrete timestep in the diffusion chain. + sample (`jnp.ndarray`): + current instance of sample being created by diffusion process. + key (`jax.random.KeyArray`): a PRNG key. + return_dict (`bool`): option for returning tuple rather than FlaxDDPMSchedulerOutput class + + Returns: + [`FlaxDDPMSchedulerOutput`] or `tuple`: [`FlaxDDPMSchedulerOutput`] if `return_dict` is True, otherwise a + `tuple`. When returning a tuple, the first element is the sample tensor. + + """ + t = timestep + + if key is None: + key = jax.random.PRNGKey(0) + + if model_output.shape[1] == sample.shape[1] * 2 and self.config.variance_type in ["learned", "learned_range"]: + model_output, predicted_variance = jnp.split(model_output, sample.shape[1], axis=1) + else: + predicted_variance = None + + # 1. compute alphas, betas + alpha_prod_t = state.common.alphas_cumprod[t] + alpha_prod_t_prev = jnp.where(t > 0, state.common.alphas_cumprod[t - 1], jnp.array(1.0, dtype=self.dtype)) + beta_prod_t = 1 - alpha_prod_t + beta_prod_t_prev = 1 - alpha_prod_t_prev + + # 2. 
compute predicted original sample from predicted noise also called + # "predicted x_0" of formula (15) from https://arxiv.org/pdf/2006.11239.pdf + if self.config.prediction_type == "epsilon": + pred_original_sample = (sample - beta_prod_t ** (0.5) * model_output) / alpha_prod_t ** (0.5) + elif self.config.prediction_type == "sample": + pred_original_sample = model_output + elif self.config.prediction_type == "v_prediction": + pred_original_sample = (alpha_prod_t**0.5) * sample - (beta_prod_t**0.5) * model_output + else: + raise ValueError( + f"prediction_type given as {self.config.prediction_type} must be one of `epsilon`, `sample` " + " for the FlaxDDPMScheduler." + ) + + # 3. Clip "predicted x_0" + if self.config.clip_sample: + pred_original_sample = jnp.clip(pred_original_sample, -1, 1) + + # 4. Compute coefficients for pred_original_sample x_0 and current sample x_t + # See formula (7) from https://arxiv.org/pdf/2006.11239.pdf + pred_original_sample_coeff = (alpha_prod_t_prev ** (0.5) * state.common.betas[t]) / beta_prod_t + current_sample_coeff = state.common.alphas[t] ** (0.5) * beta_prod_t_prev / beta_prod_t + + # 5. Compute predicted previous sample µ_t + # See formula (7) from https://arxiv.org/pdf/2006.11239.pdf + pred_prev_sample = pred_original_sample_coeff * pred_original_sample + current_sample_coeff * sample + + # 6. Add noise + def random_variance(): + split_key = jax.random.split(key, num=1) + noise = jax.random.normal(split_key, shape=model_output.shape, dtype=self.dtype) + return (self._get_variance(state, t, predicted_variance=predicted_variance) ** 0.5) * noise + + variance = jnp.where(t > 0, random_variance(), jnp.zeros(model_output.shape, dtype=self.dtype)) + + pred_prev_sample = pred_prev_sample + variance + + if not return_dict: + return (pred_prev_sample, state) + + return FlaxDDPMSchedulerOutput(prev_sample=pred_prev_sample, state=state) + + def add_noise( + self, + state: DDPMSchedulerState, + original_samples: jnp.ndarray, + noise: jnp.ndarray, + timesteps: jnp.ndarray, + ) -> jnp.ndarray: + return add_noise_common(state.common, original_samples, noise, timesteps) + + def get_velocity( + self, + state: DDPMSchedulerState, + sample: jnp.ndarray, + noise: jnp.ndarray, + timesteps: jnp.ndarray, + ) -> jnp.ndarray: + return get_velocity_common(state.common, sample, noise, timesteps) + + def __len__(self): + return self.config.num_train_timesteps diff --git a/gf/sd-webui-rich-text/diffusers_official/schedulers/scheduling_ddpm_parallel.py b/gf/sd-webui-rich-text/diffusers_official/schedulers/scheduling_ddpm_parallel.py new file mode 100644 index 0000000000000000000000000000000000000000..a92e175877d24057e49bf405e88185fd4297e6d2 --- /dev/null +++ b/gf/sd-webui-rich-text/diffusers_official/schedulers/scheduling_ddpm_parallel.py @@ -0,0 +1,604 @@ +# Copyright 2023 ParaDiGMS authors and The HuggingFace Team. All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
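+
+# Editor's note (illustrative, not part of the upstream diffusers source): this module mirrors
+# DDPMScheduler but additionally exposes `batch_step_no_noise`, which reverses the SDE for a
+# whole batch of timesteps at once without injecting noise, so a parallel-sampling pipeline
+# (e.g. ParaDiGMS) can pre-sample the noise itself and evaluate many denoising steps concurrently.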
+ +# DISCLAIMER: This file is strongly influenced by https://github.com/ermongroup/ddim + +import math +from dataclasses import dataclass +from typing import List, Optional, Tuple, Union + +import numpy as np +import torch + +from ..configuration_utils import ConfigMixin, register_to_config +from ..utils import BaseOutput, randn_tensor +from .scheduling_utils import KarrasDiffusionSchedulers, SchedulerMixin + + +@dataclass +# Copied from diffusers.schedulers.scheduling_ddpm.DDPMSchedulerOutput +class DDPMParallelSchedulerOutput(BaseOutput): + """ + Output class for the scheduler's step function output. + + Args: + prev_sample (`torch.FloatTensor` of shape `(batch_size, num_channels, height, width)` for images): + Computed sample (x_{t-1}) of previous timestep. `prev_sample` should be used as next model input in the + denoising loop. + pred_original_sample (`torch.FloatTensor` of shape `(batch_size, num_channels, height, width)` for images): + The predicted denoised sample (x_{0}) based on the model output from the current timestep. + `pred_original_sample` can be used to preview progress or for guidance. + """ + + prev_sample: torch.FloatTensor + pred_original_sample: Optional[torch.FloatTensor] = None + + +# Copied from diffusers.schedulers.scheduling_ddpm.betas_for_alpha_bar +def betas_for_alpha_bar( + num_diffusion_timesteps, + max_beta=0.999, + alpha_transform_type="cosine", +): + """ + Create a beta schedule that discretizes the given alpha_t_bar function, which defines the cumulative product of + (1-beta) over time from t = [0,1]. + + Contains a function alpha_bar that takes an argument t and transforms it to the cumulative product of (1-beta) up + to that part of the diffusion process. + + + Args: + num_diffusion_timesteps (`int`): the number of betas to produce. + max_beta (`float`): the maximum beta to use; use values lower than 1 to + prevent singularities. + alpha_transform_type (`str`, *optional*, default to `cosine`): the type of noise schedule for alpha_bar. + Choose from `cosine` or `exp` + + Returns: + betas (`np.ndarray`): the betas used by the scheduler to step the model outputs + """ + if alpha_transform_type == "cosine": + + def alpha_bar_fn(t): + return math.cos((t + 0.008) / 1.008 * math.pi / 2) ** 2 + + elif alpha_transform_type == "exp": + + def alpha_bar_fn(t): + return math.exp(t * -12.0) + + else: + raise ValueError(f"Unsupported alpha_tranform_type: {alpha_transform_type}") + + betas = [] + for i in range(num_diffusion_timesteps): + t1 = i / num_diffusion_timesteps + t2 = (i + 1) / num_diffusion_timesteps + betas.append(min(1 - alpha_bar_fn(t2) / alpha_bar_fn(t1), max_beta)) + return torch.tensor(betas, dtype=torch.float32) + + +class DDPMParallelScheduler(SchedulerMixin, ConfigMixin): + """ + Denoising diffusion probabilistic models (DDPMs) explores the connections between denoising score matching and + Langevin dynamics sampling. + + [`~ConfigMixin`] takes care of storing all config attributes that are passed in the scheduler's `__init__` + function, such as `num_train_timesteps`. They can be accessed via `scheduler.config.num_train_timesteps`. + [`SchedulerMixin`] provides general loading and saving functionality via the [`SchedulerMixin.save_pretrained`] and + [`~SchedulerMixin.from_pretrained`] functions. + + For more details, see the original paper: https://arxiv.org/abs/2006.11239 + + Args: + num_train_timesteps (`int`): number of diffusion steps used to train the model. + beta_start (`float`): the starting `beta` value of inference. 
+ beta_end (`float`): the final `beta` value. + beta_schedule (`str`): + the beta schedule, a mapping from a beta range to a sequence of betas for stepping the model. Choose from + `linear`, `scaled_linear`, `squaredcos_cap_v2` or `sigmoid`. + trained_betas (`np.ndarray`, optional): + option to pass an array of betas directly to the constructor to bypass `beta_start`, `beta_end` etc. + variance_type (`str`): + options to clip the variance used when adding noise to the denoised sample. Choose from `fixed_small`, + `fixed_small_log`, `fixed_large`, `fixed_large_log`, `learned` or `learned_range`. + clip_sample (`bool`, default `True`): + option to clip predicted sample for numerical stability. + clip_sample_range (`float`, default `1.0`): + the maximum magnitude for sample clipping. Valid only when `clip_sample=True`. + prediction_type (`str`, default `epsilon`, optional): + prediction type of the scheduler function, one of `epsilon` (predicting the noise of the diffusion + process), `sample` (directly predicting the noisy sample`) or `v_prediction` (see section 2.4 + https://imagen.research.google/video/paper.pdf) + thresholding (`bool`, default `False`): + whether to use the "dynamic thresholding" method (introduced by Imagen, https://arxiv.org/abs/2205.11487). + Note that the thresholding method is unsuitable for latent-space diffusion models (such as + stable-diffusion). + dynamic_thresholding_ratio (`float`, default `0.995`): + the ratio for the dynamic thresholding method. Default is `0.995`, the same as Imagen + (https://arxiv.org/abs/2205.11487). Valid only when `thresholding=True`. + sample_max_value (`float`, default `1.0`): + the threshold value for dynamic thresholding. Valid only when `thresholding=True`. + timestep_spacing (`str`, default `"leading"`): + The way the timesteps should be scaled. Refer to Table 2. of [Common Diffusion Noise Schedules and Sample + Steps are Flawed](https://arxiv.org/abs/2305.08891) for more information. + steps_offset (`int`, default `0`): + an offset added to the inference steps. You can use a combination of `offset=1` and + `set_alpha_to_one=False`, to make the last step use step 0 for the previous alpha product, as done in + stable diffusion. + """ + + _compatibles = [e.name for e in KarrasDiffusionSchedulers] + order = 1 + _is_ode_scheduler = False + + @register_to_config + # Copied from diffusers.schedulers.scheduling_ddpm.DDPMScheduler.__init__ + def __init__( + self, + num_train_timesteps: int = 1000, + beta_start: float = 0.0001, + beta_end: float = 0.02, + beta_schedule: str = "linear", + trained_betas: Optional[Union[np.ndarray, List[float]]] = None, + variance_type: str = "fixed_small", + clip_sample: bool = True, + prediction_type: str = "epsilon", + thresholding: bool = False, + dynamic_thresholding_ratio: float = 0.995, + clip_sample_range: float = 1.0, + sample_max_value: float = 1.0, + timestep_spacing: str = "leading", + steps_offset: int = 0, + ): + if trained_betas is not None: + self.betas = torch.tensor(trained_betas, dtype=torch.float32) + elif beta_schedule == "linear": + self.betas = torch.linspace(beta_start, beta_end, num_train_timesteps, dtype=torch.float32) + elif beta_schedule == "scaled_linear": + # this schedule is very specific to the latent diffusion model. 
+ self.betas = ( + torch.linspace(beta_start**0.5, beta_end**0.5, num_train_timesteps, dtype=torch.float32) ** 2 + ) + elif beta_schedule == "squaredcos_cap_v2": + # Glide cosine schedule + self.betas = betas_for_alpha_bar(num_train_timesteps) + elif beta_schedule == "sigmoid": + # GeoDiff sigmoid schedule + betas = torch.linspace(-6, 6, num_train_timesteps) + self.betas = torch.sigmoid(betas) * (beta_end - beta_start) + beta_start + else: + raise NotImplementedError(f"{beta_schedule} does is not implemented for {self.__class__}") + + self.alphas = 1.0 - self.betas + self.alphas_cumprod = torch.cumprod(self.alphas, dim=0) + self.one = torch.tensor(1.0) + + # standard deviation of the initial noise distribution + self.init_noise_sigma = 1.0 + + # setable values + self.custom_timesteps = False + self.num_inference_steps = None + self.timesteps = torch.from_numpy(np.arange(0, num_train_timesteps)[::-1].copy()) + + self.variance_type = variance_type + + # Copied from diffusers.schedulers.scheduling_ddpm.DDPMScheduler.scale_model_input + def scale_model_input(self, sample: torch.FloatTensor, timestep: Optional[int] = None) -> torch.FloatTensor: + """ + Ensures interchangeability with schedulers that need to scale the denoising model input depending on the + current timestep. + + Args: + sample (`torch.FloatTensor`): input sample + timestep (`int`, optional): current timestep + + Returns: + `torch.FloatTensor`: scaled input sample + """ + return sample + + # Copied from diffusers.schedulers.scheduling_ddpm.DDPMScheduler.set_timesteps + def set_timesteps( + self, + num_inference_steps: Optional[int] = None, + device: Union[str, torch.device] = None, + timesteps: Optional[List[int]] = None, + ): + """ + Sets the discrete timesteps used for the diffusion chain. Supporting function to be run before inference. + + Args: + num_inference_steps (`Optional[int]`): + the number of diffusion steps used when generating samples with a pre-trained model. If passed, then + `timesteps` must be `None`. + device (`str` or `torch.device`, optional): + the device to which the timesteps are moved to. + custom_timesteps (`List[int]`, optional): + custom timesteps used to support arbitrary spacing between timesteps. If `None`, then the default + timestep spacing strategy of equal spacing between timesteps is used. If passed, `num_inference_steps` + must be `None`. + + """ + if num_inference_steps is not None and timesteps is not None: + raise ValueError("Can only pass one of `num_inference_steps` or `custom_timesteps`.") + + if timesteps is not None: + for i in range(1, len(timesteps)): + if timesteps[i] >= timesteps[i - 1]: + raise ValueError("`custom_timesteps` must be in descending order.") + + if timesteps[0] >= self.config.num_train_timesteps: + raise ValueError( + f"`timesteps` must start before `self.config.train_timesteps`:" + f" {self.config.num_train_timesteps}." + ) + + timesteps = np.array(timesteps, dtype=np.int64) + self.custom_timesteps = True + else: + if num_inference_steps > self.config.num_train_timesteps: + raise ValueError( + f"`num_inference_steps`: {num_inference_steps} cannot be larger than `self.config.train_timesteps`:" + f" {self.config.num_train_timesteps} as the unet model trained with this scheduler can only handle" + f" maximal {self.config.num_train_timesteps} timesteps." + ) + + self.num_inference_steps = num_inference_steps + self.custom_timesteps = False + + # "linspace", "leading", "trailing" corresponds to annotation of Table 2. 
of https://arxiv.org/abs/2305.08891 + if self.config.timestep_spacing == "linspace": + timesteps = ( + np.linspace(0, self.config.num_train_timesteps - 1, num_inference_steps) + .round()[::-1] + .copy() + .astype(np.int64) + ) + elif self.config.timestep_spacing == "leading": + step_ratio = self.config.num_train_timesteps // self.num_inference_steps + # creates integer timesteps by multiplying by ratio + # casting to int to avoid issues when num_inference_step is power of 3 + timesteps = (np.arange(0, num_inference_steps) * step_ratio).round()[::-1].copy().astype(np.int64) + timesteps += self.config.steps_offset + elif self.config.timestep_spacing == "trailing": + step_ratio = self.config.num_train_timesteps / self.num_inference_steps + # creates integer timesteps by multiplying by ratio + # casting to int to avoid issues when num_inference_step is power of 3 + timesteps = np.round(np.arange(self.config.num_train_timesteps, 0, -step_ratio)).astype(np.int64) + timesteps -= 1 + else: + raise ValueError( + f"{self.config.timestep_spacing} is not supported. Please make sure to choose one of 'linspace', 'leading' or 'trailing'." + ) + + self.timesteps = torch.from_numpy(timesteps).to(device) + + # Copied from diffusers.schedulers.scheduling_ddpm.DDPMScheduler._get_variance + def _get_variance(self, t, predicted_variance=None, variance_type=None): + prev_t = self.previous_timestep(t) + + alpha_prod_t = self.alphas_cumprod[t] + alpha_prod_t_prev = self.alphas_cumprod[prev_t] if prev_t >= 0 else self.one + current_beta_t = 1 - alpha_prod_t / alpha_prod_t_prev + + # For t > 0, compute predicted variance βt (see formula (6) and (7) from https://arxiv.org/pdf/2006.11239.pdf) + # and sample from it to get previous sample + # x_{t-1} ~ N(pred_prev_sample, variance) == add variance to pred_sample + variance = (1 - alpha_prod_t_prev) / (1 - alpha_prod_t) * current_beta_t + + # we always take the log of variance, so clamp it to ensure it's not 0 + variance = torch.clamp(variance, min=1e-20) + + if variance_type is None: + variance_type = self.config.variance_type + + # hacks - were probably added for training stability + if variance_type == "fixed_small": + variance = variance + # for rl-diffuser https://arxiv.org/abs/2205.09991 + elif variance_type == "fixed_small_log": + variance = torch.log(variance) + variance = torch.exp(0.5 * variance) + elif variance_type == "fixed_large": + variance = current_beta_t + elif variance_type == "fixed_large_log": + # Glide max_log + variance = torch.log(current_beta_t) + elif variance_type == "learned": + return predicted_variance + elif variance_type == "learned_range": + min_log = torch.log(variance) + max_log = torch.log(current_beta_t) + frac = (predicted_variance + 1) / 2 + variance = frac * max_log + (1 - frac) * min_log + + return variance + + # Copied from diffusers.schedulers.scheduling_ddpm.DDPMScheduler._threshold_sample + def _threshold_sample(self, sample: torch.FloatTensor) -> torch.FloatTensor: + """ + "Dynamic thresholding: At each sampling step we set s to a certain percentile absolute pixel value in xt0 (the + prediction of x_0 at timestep t), and if s > 1, then we threshold xt0 to the range [-s, s] and then divide by + s. Dynamic thresholding pushes saturated pixels (those near -1 and 1) inwards, thereby actively preventing + pixels from saturation at each step. We find that dynamic thresholding results in significantly better + photorealism as well as better image-text alignment, especially when using very large guidance weights." 
+ + https://arxiv.org/abs/2205.11487 + """ + dtype = sample.dtype + batch_size, channels, height, width = sample.shape + + if dtype not in (torch.float32, torch.float64): + sample = sample.float() # upcast for quantile calculation, and clamp not implemented for cpu half + + # Flatten sample for doing quantile calculation along each image + sample = sample.reshape(batch_size, channels * height * width) + + abs_sample = sample.abs() # "a certain percentile absolute pixel value" + + s = torch.quantile(abs_sample, self.config.dynamic_thresholding_ratio, dim=1) + s = torch.clamp( + s, min=1, max=self.config.sample_max_value + ) # When clamped to min=1, equivalent to standard clipping to [-1, 1] + + s = s.unsqueeze(1) # (batch_size, 1) because clamp will broadcast along dim=0 + sample = torch.clamp(sample, -s, s) / s # "we threshold xt0 to the range [-s, s] and then divide by s" + + sample = sample.reshape(batch_size, channels, height, width) + sample = sample.to(dtype) + + return sample + + def step( + self, + model_output: torch.FloatTensor, + timestep: int, + sample: torch.FloatTensor, + generator=None, + return_dict: bool = True, + ) -> Union[DDPMParallelSchedulerOutput, Tuple]: + """ + Predict the sample at the previous timestep by reversing the SDE. Core function to propagate the diffusion + process from the learned model outputs (most often the predicted noise). + + Args: + model_output (`torch.FloatTensor`): direct output from learned diffusion model. + timestep (`int`): current discrete timestep in the diffusion chain. + sample (`torch.FloatTensor`): + current instance of sample being created by diffusion process. + generator: random number generator. + return_dict (`bool`): option for returning tuple rather than DDPMParallelSchedulerOutput class + + Returns: + [`~schedulers.scheduling_utils.DDPMParallelSchedulerOutput`] or `tuple`: + [`~schedulers.scheduling_utils.DDPMParallelSchedulerOutput`] if `return_dict` is True, otherwise a `tuple`. + When returning a tuple, the first element is the sample tensor. + + """ + t = timestep + + prev_t = self.previous_timestep(t) + + if model_output.shape[1] == sample.shape[1] * 2 and self.variance_type in ["learned", "learned_range"]: + model_output, predicted_variance = torch.split(model_output, sample.shape[1], dim=1) + else: + predicted_variance = None + + # 1. compute alphas, betas + alpha_prod_t = self.alphas_cumprod[t] + alpha_prod_t_prev = self.alphas_cumprod[prev_t] if prev_t >= 0 else self.one + beta_prod_t = 1 - alpha_prod_t + beta_prod_t_prev = 1 - alpha_prod_t_prev + current_alpha_t = alpha_prod_t / alpha_prod_t_prev + current_beta_t = 1 - current_alpha_t + + # 2. compute predicted original sample from predicted noise also called + # "predicted x_0" of formula (15) from https://arxiv.org/pdf/2006.11239.pdf + if self.config.prediction_type == "epsilon": + pred_original_sample = (sample - beta_prod_t ** (0.5) * model_output) / alpha_prod_t ** (0.5) + elif self.config.prediction_type == "sample": + pred_original_sample = model_output + elif self.config.prediction_type == "v_prediction": + pred_original_sample = (alpha_prod_t**0.5) * sample - (beta_prod_t**0.5) * model_output + else: + raise ValueError( + f"prediction_type given as {self.config.prediction_type} must be one of `epsilon`, `sample` or" + " `v_prediction` for the DDPMScheduler." + ) + + # 3. 
Clip or threshold "predicted x_0" + if self.config.thresholding: + pred_original_sample = self._threshold_sample(pred_original_sample) + elif self.config.clip_sample: + pred_original_sample = pred_original_sample.clamp( + -self.config.clip_sample_range, self.config.clip_sample_range + ) + + # 4. Compute coefficients for pred_original_sample x_0 and current sample x_t + # See formula (7) from https://arxiv.org/pdf/2006.11239.pdf + pred_original_sample_coeff = (alpha_prod_t_prev ** (0.5) * current_beta_t) / beta_prod_t + current_sample_coeff = current_alpha_t ** (0.5) * beta_prod_t_prev / beta_prod_t + + # 5. Compute predicted previous sample µ_t + # See formula (7) from https://arxiv.org/pdf/2006.11239.pdf + pred_prev_sample = pred_original_sample_coeff * pred_original_sample + current_sample_coeff * sample + + # 6. Add noise + variance = 0 + if t > 0: + device = model_output.device + variance_noise = randn_tensor( + model_output.shape, generator=generator, device=device, dtype=model_output.dtype + ) + if self.variance_type == "fixed_small_log": + variance = self._get_variance(t, predicted_variance=predicted_variance) * variance_noise + elif self.variance_type == "learned_range": + variance = self._get_variance(t, predicted_variance=predicted_variance) + variance = torch.exp(0.5 * variance) * variance_noise + else: + variance = (self._get_variance(t, predicted_variance=predicted_variance) ** 0.5) * variance_noise + + pred_prev_sample = pred_prev_sample + variance + + if not return_dict: + return (pred_prev_sample,) + + return DDPMParallelSchedulerOutput(prev_sample=pred_prev_sample, pred_original_sample=pred_original_sample) + + def batch_step_no_noise( + self, + model_output: torch.FloatTensor, + timesteps: List[int], + sample: torch.FloatTensor, + ) -> torch.FloatTensor: + """ + Batched version of the `step` function, to be able to reverse the SDE for multiple samples/timesteps at once. + Also, does not add any noise to the predicted sample, which is necessary for parallel sampling where the noise + is pre-sampled by the pipeline. + + Predict the sample at the previous timestep by reversing the SDE. Core function to propagate the diffusion + process from the learned model outputs (most often the predicted noise). + + Args: + model_output (`torch.FloatTensor`): direct output from learned diffusion model. + timesteps (`List[int]`): + current discrete timesteps in the diffusion chain. This is now a list of integers. + sample (`torch.FloatTensor`): + current instance of sample being created by diffusion process. + + Returns: + `torch.FloatTensor`: sample tensor at previous timestep. + """ + t = timesteps + num_inference_steps = self.num_inference_steps if self.num_inference_steps else self.config.num_train_timesteps + prev_t = t - self.config.num_train_timesteps // num_inference_steps + + t = t.view(-1, *([1] * (model_output.ndim - 1))) + prev_t = prev_t.view(-1, *([1] * (model_output.ndim - 1))) + + if model_output.shape[1] == sample.shape[1] * 2 and self.variance_type in ["learned", "learned_range"]: + model_output, predicted_variance = torch.split(model_output, sample.shape[1], dim=1) + else: + pass + + # 1. 
compute alphas, betas + self.alphas_cumprod = self.alphas_cumprod.to(model_output.device) + alpha_prod_t = self.alphas_cumprod[t] + alpha_prod_t_prev = self.alphas_cumprod[torch.clip(prev_t, min=0)] + alpha_prod_t_prev[prev_t < 0] = torch.tensor(1.0) + + beta_prod_t = 1 - alpha_prod_t + beta_prod_t_prev = 1 - alpha_prod_t_prev + current_alpha_t = alpha_prod_t / alpha_prod_t_prev + current_beta_t = 1 - current_alpha_t + + # 2. compute predicted original sample from predicted noise also called + # "predicted x_0" of formula (15) from https://arxiv.org/pdf/2006.11239.pdf + if self.config.prediction_type == "epsilon": + pred_original_sample = (sample - beta_prod_t ** (0.5) * model_output) / alpha_prod_t ** (0.5) + elif self.config.prediction_type == "sample": + pred_original_sample = model_output + elif self.config.prediction_type == "v_prediction": + pred_original_sample = (alpha_prod_t**0.5) * sample - (beta_prod_t**0.5) * model_output + else: + raise ValueError( + f"prediction_type given as {self.config.prediction_type} must be one of `epsilon`, `sample` or" + " `v_prediction` for the DDPMParallelScheduler." + ) + + # 3. Clip or threshold "predicted x_0" + if self.config.thresholding: + pred_original_sample = self._threshold_sample(pred_original_sample) + elif self.config.clip_sample: + pred_original_sample = pred_original_sample.clamp( + -self.config.clip_sample_range, self.config.clip_sample_range + ) + + # 4. Compute coefficients for pred_original_sample x_0 and current sample x_t + # See formula (7) from https://arxiv.org/pdf/2006.11239.pdf + pred_original_sample_coeff = (alpha_prod_t_prev ** (0.5) * current_beta_t) / beta_prod_t + current_sample_coeff = current_alpha_t ** (0.5) * beta_prod_t_prev / beta_prod_t + + # 5. Compute predicted previous sample µ_t + # See formula (7) from https://arxiv.org/pdf/2006.11239.pdf + pred_prev_sample = pred_original_sample_coeff * pred_original_sample + current_sample_coeff * sample + + return pred_prev_sample + + # Copied from diffusers.schedulers.scheduling_ddpm.DDPMScheduler.add_noise + def add_noise( + self, + original_samples: torch.FloatTensor, + noise: torch.FloatTensor, + timesteps: torch.IntTensor, + ) -> torch.FloatTensor: + # Make sure alphas_cumprod and timestep have same device and dtype as original_samples + alphas_cumprod = self.alphas_cumprod.to(device=original_samples.device, dtype=original_samples.dtype) + timesteps = timesteps.to(original_samples.device) + + sqrt_alpha_prod = alphas_cumprod[timesteps] ** 0.5 + sqrt_alpha_prod = sqrt_alpha_prod.flatten() + while len(sqrt_alpha_prod.shape) < len(original_samples.shape): + sqrt_alpha_prod = sqrt_alpha_prod.unsqueeze(-1) + + sqrt_one_minus_alpha_prod = (1 - alphas_cumprod[timesteps]) ** 0.5 + sqrt_one_minus_alpha_prod = sqrt_one_minus_alpha_prod.flatten() + while len(sqrt_one_minus_alpha_prod.shape) < len(original_samples.shape): + sqrt_one_minus_alpha_prod = sqrt_one_minus_alpha_prod.unsqueeze(-1) + + noisy_samples = sqrt_alpha_prod * original_samples + sqrt_one_minus_alpha_prod * noise + return noisy_samples + + # Copied from diffusers.schedulers.scheduling_ddpm.DDPMScheduler.get_velocity + def get_velocity( + self, sample: torch.FloatTensor, noise: torch.FloatTensor, timesteps: torch.IntTensor + ) -> torch.FloatTensor: + # Make sure alphas_cumprod and timestep have same device and dtype as sample + alphas_cumprod = self.alphas_cumprod.to(device=sample.device, dtype=sample.dtype) + timesteps = timesteps.to(sample.device) + + sqrt_alpha_prod = alphas_cumprod[timesteps] ** 0.5 + 
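`add_noise` above is the forward q-sample, and `get_velocity`, completed just below, forms the v-prediction target from the same sqrt(alpha_bar_t) and sqrt(1 - alpha_bar_t) factors with the roles of signal and noise swapped. A small self-contained sketch of both identities; the tensor shapes and names are ours, not part of the patch:

```python
# Toy illustration of add_noise / get_velocity:
#   x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * eps
#   v_t = sqrt(alpha_bar_t) * eps - sqrt(1 - alpha_bar_t) * x_0
import torch

alphas_cumprod = torch.cumprod(1.0 - torch.linspace(0.0001, 0.02, 1000), dim=0)

x0 = torch.randn(2, 4, 8, 8)      # pretend latents
eps = torch.randn_like(x0)
t = torch.tensor([10, 500])       # one timestep per batch element

a = alphas_cumprod[t].sqrt().view(-1, 1, 1, 1)
b = (1.0 - alphas_cumprod[t]).sqrt().view(-1, 1, 1, 1)

x_t = a * x0 + b * eps            # what add_noise returns
v = a * eps - b * x0              # what get_velocity returns
```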
sqrt_alpha_prod = sqrt_alpha_prod.flatten() + while len(sqrt_alpha_prod.shape) < len(sample.shape): + sqrt_alpha_prod = sqrt_alpha_prod.unsqueeze(-1) + + sqrt_one_minus_alpha_prod = (1 - alphas_cumprod[timesteps]) ** 0.5 + sqrt_one_minus_alpha_prod = sqrt_one_minus_alpha_prod.flatten() + while len(sqrt_one_minus_alpha_prod.shape) < len(sample.shape): + sqrt_one_minus_alpha_prod = sqrt_one_minus_alpha_prod.unsqueeze(-1) + + velocity = sqrt_alpha_prod * noise - sqrt_one_minus_alpha_prod * sample + return velocity + + def __len__(self): + return self.config.num_train_timesteps + + # Copied from diffusers.schedulers.scheduling_ddpm.DDPMScheduler.previous_timestep + def previous_timestep(self, timestep): + if self.custom_timesteps: + index = (self.timesteps == timestep).nonzero(as_tuple=True)[0][0] + if index == self.timesteps.shape[0] - 1: + prev_t = torch.tensor(-1) + else: + prev_t = self.timesteps[index + 1] + else: + num_inference_steps = ( + self.num_inference_steps if self.num_inference_steps else self.config.num_train_timesteps + ) + prev_t = timestep - self.config.num_train_timesteps // num_inference_steps + + return prev_t diff --git a/gf/sd-webui-rich-text/diffusers_official/schedulers/scheduling_deis_multistep.py b/gf/sd-webui-rich-text/diffusers_official/schedulers/scheduling_deis_multistep.py new file mode 100644 index 0000000000000000000000000000000000000000..36947294922b6cc0ecdc5bf7dc9c0772a056d03a --- /dev/null +++ b/gf/sd-webui-rich-text/diffusers_official/schedulers/scheduling_deis_multistep.py @@ -0,0 +1,568 @@ +# Copyright 2023 FLAIR Lab and The HuggingFace Team. All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +# DISCLAIMER: check https://arxiv.org/abs/2204.13902 and https://github.com/qsh-zh/deis for more info +# The codebase is modified based on https://github.com/huggingface/diffusers/blob/main/src/diffusers/schedulers/scheduling_dpmsolver_multistep.py + +import math +from typing import List, Optional, Tuple, Union + +import numpy as np +import torch + +from ..configuration_utils import ConfigMixin, register_to_config +from .scheduling_utils import KarrasDiffusionSchedulers, SchedulerMixin, SchedulerOutput + + +# Copied from diffusers.schedulers.scheduling_ddpm.betas_for_alpha_bar +def betas_for_alpha_bar( + num_diffusion_timesteps, + max_beta=0.999, + alpha_transform_type="cosine", +): + """ + Create a beta schedule that discretizes the given alpha_t_bar function, which defines the cumulative product of + (1-beta) over time from t = [0,1]. + + Contains a function alpha_bar that takes an argument t and transforms it to the cumulative product of (1-beta) up + to that part of the diffusion process. + + + Args: + num_diffusion_timesteps (`int`): the number of betas to produce. + max_beta (`float`): the maximum beta to use; use values lower than 1 to + prevent singularities. + alpha_transform_type (`str`, *optional*, default to `cosine`): the type of noise schedule for alpha_bar. 
+ Choose from `cosine` or `exp` + + Returns: + betas (`np.ndarray`): the betas used by the scheduler to step the model outputs + """ + if alpha_transform_type == "cosine": + + def alpha_bar_fn(t): + return math.cos((t + 0.008) / 1.008 * math.pi / 2) ** 2 + + elif alpha_transform_type == "exp": + + def alpha_bar_fn(t): + return math.exp(t * -12.0) + + else: + raise ValueError(f"Unsupported alpha_tranform_type: {alpha_transform_type}") + + betas = [] + for i in range(num_diffusion_timesteps): + t1 = i / num_diffusion_timesteps + t2 = (i + 1) / num_diffusion_timesteps + betas.append(min(1 - alpha_bar_fn(t2) / alpha_bar_fn(t1), max_beta)) + return torch.tensor(betas, dtype=torch.float32) + + +class DEISMultistepScheduler(SchedulerMixin, ConfigMixin): + """ + DEIS (https://arxiv.org/abs/2204.13902) is a fast high order solver for diffusion ODEs. We slightly modify the + polynomial fitting formula in log-rho space instead of the original linear t space in DEIS paper. The modification + enjoys closed-form coefficients for exponential multistep update instead of replying on the numerical solver. More + variants of DEIS can be found in https://github.com/qsh-zh/deis. + + Currently, we support the log-rho multistep DEIS. We recommend to use `solver_order=2 / 3` while `solver_order=1` + reduces to DDIM. + + We also support the "dynamic thresholding" method in Imagen (https://arxiv.org/abs/2205.11487). For pixel-space + diffusion models, you can set `thresholding=True` to use the dynamic thresholding. + + [`~ConfigMixin`] takes care of storing all config attributes that are passed in the scheduler's `__init__` + function, such as `num_train_timesteps`. They can be accessed via `scheduler.config.num_train_timesteps`. + [`SchedulerMixin`] provides general loading and saving functionality via the [`SchedulerMixin.save_pretrained`] and + [`~SchedulerMixin.from_pretrained`] functions. + + Args: + num_train_timesteps (`int`): number of diffusion steps used to train the model. + beta_start (`float`): the starting `beta` value of inference. + beta_end (`float`): the final `beta` value. + beta_schedule (`str`): + the beta schedule, a mapping from a beta range to a sequence of betas for stepping the model. Choose from + `linear`, `scaled_linear`, or `squaredcos_cap_v2`. + trained_betas (`np.ndarray`, optional): + option to pass an array of betas directly to the constructor to bypass `beta_start`, `beta_end` etc. + solver_order (`int`, default `2`): + the order of DEIS; can be `1` or `2` or `3`. We recommend to use `solver_order=2` for guided sampling, and + `solver_order=3` for unconditional sampling. + prediction_type (`str`, default `epsilon`): + indicates whether the model predicts the noise (epsilon), or the data / `x0`. One of `epsilon`, `sample`, + or `v-prediction`. + thresholding (`bool`, default `False`): + whether to use the "dynamic thresholding" method (introduced by Imagen, https://arxiv.org/abs/2205.11487). + Note that the thresholding method is unsuitable for latent-space diffusion models (such as + stable-diffusion). + dynamic_thresholding_ratio (`float`, default `0.995`): + the ratio for the dynamic thresholding method. Default is `0.995`, the same as Imagen + (https://arxiv.org/abs/2205.11487). + sample_max_value (`float`, default `1.0`): + the threshold value for dynamic thresholding. Valid only when `thresholding=True` + algorithm_type (`str`, default `deis`): + the algorithm type for the solver. 
current we support multistep deis, we will add other variants of DEIS in + the future + lower_order_final (`bool`, default `True`): + whether to use lower-order solvers in the final steps. Only valid for < 15 inference steps. We empirically + find this trick can stabilize the sampling of DEIS for steps < 15, especially for steps <= 10. + use_karras_sigmas (`bool`, *optional*, defaults to `False`): + This parameter controls whether to use Karras sigmas (Karras et al. (2022) scheme) for step sizes in the + noise schedule during the sampling process. If True, the sigmas will be determined according to a sequence + of noise levels {σi} as defined in Equation (5) of the paper https://arxiv.org/pdf/2206.00364.pdf. + timestep_spacing (`str`, default `"linspace"`): + The way the timesteps should be scaled. Refer to Table 2. of [Common Diffusion Noise Schedules and Sample + Steps are Flawed](https://arxiv.org/abs/2305.08891) for more information. + steps_offset (`int`, default `0`): + an offset added to the inference steps. You can use a combination of `offset=1` and + `set_alpha_to_one=False`, to make the last step use step 0 for the previous alpha product, as done in + stable diffusion. + """ + + _compatibles = [e.name for e in KarrasDiffusionSchedulers] + order = 1 + + @register_to_config + def __init__( + self, + num_train_timesteps: int = 1000, + beta_start: float = 0.0001, + beta_end: float = 0.02, + beta_schedule: str = "linear", + trained_betas: Optional[np.ndarray] = None, + solver_order: int = 2, + prediction_type: str = "epsilon", + thresholding: bool = False, + dynamic_thresholding_ratio: float = 0.995, + sample_max_value: float = 1.0, + algorithm_type: str = "deis", + solver_type: str = "logrho", + lower_order_final: bool = True, + use_karras_sigmas: Optional[bool] = False, + timestep_spacing: str = "linspace", + steps_offset: int = 0, + ): + if trained_betas is not None: + self.betas = torch.tensor(trained_betas, dtype=torch.float32) + elif beta_schedule == "linear": + self.betas = torch.linspace(beta_start, beta_end, num_train_timesteps, dtype=torch.float32) + elif beta_schedule == "scaled_linear": + # this schedule is very specific to the latent diffusion model. 
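Right after the schedule is materialized below, the constructor derives the VP parameterization that the DEIS updates are written in: alpha_t = sqrt(alpha_bar_t), sigma_t = sqrt(1 - alpha_bar_t), and the half log-SNR lambda_t = log(alpha_t) - log(sigma_t). A standalone sketch of those quantities, using illustrative defaults rather than anything from the patch:

```python
# Standalone sketch of the VP terms derived in the constructor below.
import torch

betas = torch.linspace(0.0001, 0.02, 1000)
alphas_cumprod = torch.cumprod(1.0 - betas, dim=0)

alpha_t = alphas_cumprod.sqrt()            # signal scale at each training timestep
sigma_t = (1.0 - alphas_cumprod).sqrt()    # noise scale
lambda_t = alpha_t.log() - sigma_t.log()   # half log-SNR, monotonically decreasing in t

# A first-order DEIS/DDIM-style update from timestep s to the earlier timestep t
# uses the step size h = lambda_t[t] - lambda_t[s], which is positive.
```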
+ self.betas = ( + torch.linspace(beta_start**0.5, beta_end**0.5, num_train_timesteps, dtype=torch.float32) ** 2 + ) + elif beta_schedule == "squaredcos_cap_v2": + # Glide cosine schedule + self.betas = betas_for_alpha_bar(num_train_timesteps) + else: + raise NotImplementedError(f"{beta_schedule} does is not implemented for {self.__class__}") + + self.alphas = 1.0 - self.betas + self.alphas_cumprod = torch.cumprod(self.alphas, dim=0) + # Currently we only support VP-type noise schedule + self.alpha_t = torch.sqrt(self.alphas_cumprod) + self.sigma_t = torch.sqrt(1 - self.alphas_cumprod) + self.lambda_t = torch.log(self.alpha_t) - torch.log(self.sigma_t) + + # standard deviation of the initial noise distribution + self.init_noise_sigma = 1.0 + + # settings for DEIS + if algorithm_type not in ["deis"]: + if algorithm_type in ["dpmsolver", "dpmsolver++"]: + self.register_to_config(algorithm_type="deis") + else: + raise NotImplementedError(f"{algorithm_type} does is not implemented for {self.__class__}") + + if solver_type not in ["logrho"]: + if solver_type in ["midpoint", "heun", "bh1", "bh2"]: + self.register_to_config(solver_type="logrho") + else: + raise NotImplementedError(f"solver type {solver_type} does is not implemented for {self.__class__}") + + # setable values + self.num_inference_steps = None + timesteps = np.linspace(0, num_train_timesteps - 1, num_train_timesteps, dtype=np.float32)[::-1].copy() + self.timesteps = torch.from_numpy(timesteps) + self.model_outputs = [None] * solver_order + self.lower_order_nums = 0 + + def set_timesteps(self, num_inference_steps: int, device: Union[str, torch.device] = None): + """ + Sets the timesteps used for the diffusion chain. Supporting function to be run before inference. + + Args: + num_inference_steps (`int`): + the number of diffusion steps used when generating samples with a pre-trained model. + device (`str` or `torch.device`, optional): + the device to which the timesteps should be moved to. If `None`, the timesteps are not moved. + """ + # "linspace", "leading", "trailing" corresponds to annotation of Table 2. of https://arxiv.org/abs/2305.08891 + if self.config.timestep_spacing == "linspace": + timesteps = ( + np.linspace(0, self.config.num_train_timesteps - 1, num_inference_steps + 1) + .round()[::-1][:-1] + .copy() + .astype(np.int64) + ) + elif self.config.timestep_spacing == "leading": + step_ratio = self.config.num_train_timesteps // (num_inference_steps + 1) + # creates integer timesteps by multiplying by ratio + # casting to int to avoid issues when num_inference_step is power of 3 + timesteps = (np.arange(0, num_inference_steps + 1) * step_ratio).round()[::-1][:-1].copy().astype(np.int64) + timesteps += self.config.steps_offset + elif self.config.timestep_spacing == "trailing": + step_ratio = self.config.num_train_timesteps / num_inference_steps + # creates integer timesteps by multiplying by ratio + # casting to int to avoid issues when num_inference_step is power of 3 + timesteps = np.arange(self.config.num_train_timesteps, 0, -step_ratio).round().copy().astype(np.int64) + timesteps -= 1 + else: + raise ValueError( + f"{self.config.timestep_spacing} is not supported. Please make sure to choose one of 'linspace', 'leading' or 'trailing'." 
+ ) + + sigmas = np.array(((1 - self.alphas_cumprod) / self.alphas_cumprod) ** 0.5) + if self.config.use_karras_sigmas: + log_sigmas = np.log(sigmas) + sigmas = self._convert_to_karras(in_sigmas=sigmas, num_inference_steps=num_inference_steps) + timesteps = np.array([self._sigma_to_t(sigma, log_sigmas) for sigma in sigmas]).round() + timesteps = np.flip(timesteps).copy().astype(np.int64) + + self.sigmas = torch.from_numpy(sigmas) + + # when num_inference_steps == num_train_timesteps, we can end up with + # duplicates in timesteps. + _, unique_indices = np.unique(timesteps, return_index=True) + timesteps = timesteps[np.sort(unique_indices)] + + self.timesteps = torch.from_numpy(timesteps).to(device) + + self.num_inference_steps = len(timesteps) + + self.model_outputs = [ + None, + ] * self.config.solver_order + self.lower_order_nums = 0 + + # Copied from diffusers.schedulers.scheduling_ddpm.DDPMScheduler._threshold_sample + def _threshold_sample(self, sample: torch.FloatTensor) -> torch.FloatTensor: + """ + "Dynamic thresholding: At each sampling step we set s to a certain percentile absolute pixel value in xt0 (the + prediction of x_0 at timestep t), and if s > 1, then we threshold xt0 to the range [-s, s] and then divide by + s. Dynamic thresholding pushes saturated pixels (those near -1 and 1) inwards, thereby actively preventing + pixels from saturation at each step. We find that dynamic thresholding results in significantly better + photorealism as well as better image-text alignment, especially when using very large guidance weights." + + https://arxiv.org/abs/2205.11487 + """ + dtype = sample.dtype + batch_size, channels, height, width = sample.shape + + if dtype not in (torch.float32, torch.float64): + sample = sample.float() # upcast for quantile calculation, and clamp not implemented for cpu half + + # Flatten sample for doing quantile calculation along each image + sample = sample.reshape(batch_size, channels * height * width) + + abs_sample = sample.abs() # "a certain percentile absolute pixel value" + + s = torch.quantile(abs_sample, self.config.dynamic_thresholding_ratio, dim=1) + s = torch.clamp( + s, min=1, max=self.config.sample_max_value + ) # When clamped to min=1, equivalent to standard clipping to [-1, 1] + + s = s.unsqueeze(1) # (batch_size, 1) because clamp will broadcast along dim=0 + sample = torch.clamp(sample, -s, s) / s # "we threshold xt0 to the range [-s, s] and then divide by s" + + sample = sample.reshape(batch_size, channels, height, width) + sample = sample.to(dtype) + + return sample + + def convert_model_output( + self, model_output: torch.FloatTensor, timestep: int, sample: torch.FloatTensor + ) -> torch.FloatTensor: + """ + Convert the model output to the corresponding type that the algorithm DEIS needs. + + Args: + model_output (`torch.FloatTensor`): direct output from learned diffusion model. + timestep (`int`): current discrete timestep in the diffusion chain. + sample (`torch.FloatTensor`): + current instance of sample being created by diffusion process. + + Returns: + `torch.FloatTensor`: the converted model output. 
+ """ + if self.config.prediction_type == "epsilon": + alpha_t, sigma_t = self.alpha_t[timestep], self.sigma_t[timestep] + x0_pred = (sample - sigma_t * model_output) / alpha_t + elif self.config.prediction_type == "sample": + x0_pred = model_output + elif self.config.prediction_type == "v_prediction": + alpha_t, sigma_t = self.alpha_t[timestep], self.sigma_t[timestep] + x0_pred = alpha_t * sample - sigma_t * model_output + else: + raise ValueError( + f"prediction_type given as {self.config.prediction_type} must be one of `epsilon`, `sample`, or" + " `v_prediction` for the DEISMultistepScheduler." + ) + + if self.config.thresholding: + x0_pred = self._threshold_sample(x0_pred) + + if self.config.algorithm_type == "deis": + alpha_t, sigma_t = self.alpha_t[timestep], self.sigma_t[timestep] + return (sample - alpha_t * x0_pred) / sigma_t + else: + raise NotImplementedError("only support log-rho multistep deis now") + + def deis_first_order_update( + self, + model_output: torch.FloatTensor, + timestep: int, + prev_timestep: int, + sample: torch.FloatTensor, + ) -> torch.FloatTensor: + """ + One step for the first-order DEIS (equivalent to DDIM). + + Args: + model_output (`torch.FloatTensor`): direct output from learned diffusion model. + timestep (`int`): current discrete timestep in the diffusion chain. + prev_timestep (`int`): previous discrete timestep in the diffusion chain. + sample (`torch.FloatTensor`): + current instance of sample being created by diffusion process. + + Returns: + `torch.FloatTensor`: the sample tensor at the previous timestep. + """ + lambda_t, lambda_s = self.lambda_t[prev_timestep], self.lambda_t[timestep] + alpha_t, alpha_s = self.alpha_t[prev_timestep], self.alpha_t[timestep] + sigma_t, _ = self.sigma_t[prev_timestep], self.sigma_t[timestep] + h = lambda_t - lambda_s + if self.config.algorithm_type == "deis": + x_t = (alpha_t / alpha_s) * sample - (sigma_t * (torch.exp(h) - 1.0)) * model_output + else: + raise NotImplementedError("only support log-rho multistep deis now") + return x_t + + def multistep_deis_second_order_update( + self, + model_output_list: List[torch.FloatTensor], + timestep_list: List[int], + prev_timestep: int, + sample: torch.FloatTensor, + ) -> torch.FloatTensor: + """ + One step for the second-order multistep DEIS. + + Args: + model_output_list (`List[torch.FloatTensor]`): + direct outputs from learned diffusion model at current and latter timesteps. + timestep (`int`): current and latter discrete timestep in the diffusion chain. + prev_timestep (`int`): previous discrete timestep in the diffusion chain. + sample (`torch.FloatTensor`): + current instance of sample being created by diffusion process. + + Returns: + `torch.FloatTensor`: the sample tensor at the previous timestep. 
+ """ + t, s0, s1 = prev_timestep, timestep_list[-1], timestep_list[-2] + m0, m1 = model_output_list[-1], model_output_list[-2] + alpha_t, alpha_s0, alpha_s1 = self.alpha_t[t], self.alpha_t[s0], self.alpha_t[s1] + sigma_t, sigma_s0, sigma_s1 = self.sigma_t[t], self.sigma_t[s0], self.sigma_t[s1] + + rho_t, rho_s0, rho_s1 = sigma_t / alpha_t, sigma_s0 / alpha_s0, sigma_s1 / alpha_s1 + + if self.config.algorithm_type == "deis": + + def ind_fn(t, b, c): + # Integrate[(log(t) - log(c)) / (log(b) - log(c)), {t}] + return t * (-np.log(c) + np.log(t) - 1) / (np.log(b) - np.log(c)) + + coef1 = ind_fn(rho_t, rho_s0, rho_s1) - ind_fn(rho_s0, rho_s0, rho_s1) + coef2 = ind_fn(rho_t, rho_s1, rho_s0) - ind_fn(rho_s0, rho_s1, rho_s0) + + x_t = alpha_t * (sample / alpha_s0 + coef1 * m0 + coef2 * m1) + return x_t + else: + raise NotImplementedError("only support log-rho multistep deis now") + + def multistep_deis_third_order_update( + self, + model_output_list: List[torch.FloatTensor], + timestep_list: List[int], + prev_timestep: int, + sample: torch.FloatTensor, + ) -> torch.FloatTensor: + """ + One step for the third-order multistep DEIS. + + Args: + model_output_list (`List[torch.FloatTensor]`): + direct outputs from learned diffusion model at current and latter timesteps. + timestep (`int`): current and latter discrete timestep in the diffusion chain. + prev_timestep (`int`): previous discrete timestep in the diffusion chain. + sample (`torch.FloatTensor`): + current instance of sample being created by diffusion process. + + Returns: + `torch.FloatTensor`: the sample tensor at the previous timestep. + """ + t, s0, s1, s2 = prev_timestep, timestep_list[-1], timestep_list[-2], timestep_list[-3] + m0, m1, m2 = model_output_list[-1], model_output_list[-2], model_output_list[-3] + alpha_t, alpha_s0, alpha_s1, alpha_s2 = self.alpha_t[t], self.alpha_t[s0], self.alpha_t[s1], self.alpha_t[s2] + sigma_t, sigma_s0, sigma_s1, simga_s2 = self.sigma_t[t], self.sigma_t[s0], self.sigma_t[s1], self.sigma_t[s2] + rho_t, rho_s0, rho_s1, rho_s2 = ( + sigma_t / alpha_t, + sigma_s0 / alpha_s0, + sigma_s1 / alpha_s1, + simga_s2 / alpha_s2, + ) + + if self.config.algorithm_type == "deis": + + def ind_fn(t, b, c, d): + # Integrate[(log(t) - log(c))(log(t) - log(d)) / (log(b) - log(c))(log(b) - log(d)), {t}] + numerator = t * ( + np.log(c) * (np.log(d) - np.log(t) + 1) + - np.log(d) * np.log(t) + + np.log(d) + + np.log(t) ** 2 + - 2 * np.log(t) + + 2 + ) + denominator = (np.log(b) - np.log(c)) * (np.log(b) - np.log(d)) + return numerator / denominator + + coef1 = ind_fn(rho_t, rho_s0, rho_s1, rho_s2) - ind_fn(rho_s0, rho_s0, rho_s1, rho_s2) + coef2 = ind_fn(rho_t, rho_s1, rho_s2, rho_s0) - ind_fn(rho_s0, rho_s1, rho_s2, rho_s0) + coef3 = ind_fn(rho_t, rho_s2, rho_s0, rho_s1) - ind_fn(rho_s0, rho_s2, rho_s0, rho_s1) + + x_t = alpha_t * (sample / alpha_s0 + coef1 * m0 + coef2 * m1 + coef3 * m2) + + return x_t + else: + raise NotImplementedError("only support log-rho multistep deis now") + + def step( + self, + model_output: torch.FloatTensor, + timestep: int, + sample: torch.FloatTensor, + return_dict: bool = True, + ) -> Union[SchedulerOutput, Tuple]: + """ + Step function propagating the sample with the multistep DEIS. + + Args: + model_output (`torch.FloatTensor`): direct output from learned diffusion model. + timestep (`int`): current discrete timestep in the diffusion chain. + sample (`torch.FloatTensor`): + current instance of sample being created by diffusion process. 
+ return_dict (`bool`): option for returning tuple rather than SchedulerOutput class + + Returns: + [`~scheduling_utils.SchedulerOutput`] or `tuple`: [`~scheduling_utils.SchedulerOutput`] if `return_dict` is + True, otherwise a `tuple`. When returning a tuple, the first element is the sample tensor. + + """ + if self.num_inference_steps is None: + raise ValueError( + "Number of inference steps is 'None', you need to run 'set_timesteps' after creating the scheduler" + ) + + if isinstance(timestep, torch.Tensor): + timestep = timestep.to(self.timesteps.device) + step_index = (self.timesteps == timestep).nonzero() + if len(step_index) == 0: + step_index = len(self.timesteps) - 1 + else: + step_index = step_index.item() + prev_timestep = 0 if step_index == len(self.timesteps) - 1 else self.timesteps[step_index + 1] + lower_order_final = ( + (step_index == len(self.timesteps) - 1) and self.config.lower_order_final and len(self.timesteps) < 15 + ) + lower_order_second = ( + (step_index == len(self.timesteps) - 2) and self.config.lower_order_final and len(self.timesteps) < 15 + ) + + model_output = self.convert_model_output(model_output, timestep, sample) + for i in range(self.config.solver_order - 1): + self.model_outputs[i] = self.model_outputs[i + 1] + self.model_outputs[-1] = model_output + + if self.config.solver_order == 1 or self.lower_order_nums < 1 or lower_order_final: + prev_sample = self.deis_first_order_update(model_output, timestep, prev_timestep, sample) + elif self.config.solver_order == 2 or self.lower_order_nums < 2 or lower_order_second: + timestep_list = [self.timesteps[step_index - 1], timestep] + prev_sample = self.multistep_deis_second_order_update( + self.model_outputs, timestep_list, prev_timestep, sample + ) + else: + timestep_list = [self.timesteps[step_index - 2], self.timesteps[step_index - 1], timestep] + prev_sample = self.multistep_deis_third_order_update( + self.model_outputs, timestep_list, prev_timestep, sample + ) + + if self.lower_order_nums < self.config.solver_order: + self.lower_order_nums += 1 + + if not return_dict: + return (prev_sample,) + + return SchedulerOutput(prev_sample=prev_sample) + + def scale_model_input(self, sample: torch.FloatTensor, *args, **kwargs) -> torch.FloatTensor: + """ + Ensures interchangeability with schedulers that need to scale the denoising model input depending on the + current timestep. 
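At the call site, none of the multistep bookkeeping in `step` above is visible; the scheduler is driven by the usual `set_timesteps` / `step` loop. A schematic sketch, assuming the `diffusers_official` package root is on `sys.path` and with a zero tensor standing in for a real epsilon-prediction model:

```python
# Schematic usage sketch (not part of the patch); the placeholder model output
# would normally come from a UNet conditioned on the prompt.
import torch

from diffusers_official.schedulers.scheduling_deis_multistep import DEISMultistepScheduler

scheduler = DEISMultistepScheduler()              # defaults: solver_order=2, algorithm_type="deis"
scheduler.set_timesteps(num_inference_steps=20)

sample = torch.randn(1, 4, 64, 64) * scheduler.init_noise_sigma
for t in scheduler.timesteps:
    model_output = torch.zeros_like(sample)       # placeholder for model(sample, t)
    # step() converts the output, updates its history of past model outputs, and
    # falls back to lower-order updates while warming up and on the final steps
    sample = scheduler.step(model_output, t, sample).prev_sample
```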
+ + Args: + sample (`torch.FloatTensor`): input sample + + Returns: + `torch.FloatTensor`: scaled input sample + """ + return sample + + # Copied from diffusers.schedulers.scheduling_ddpm.DDPMScheduler.add_noise + def add_noise( + self, + original_samples: torch.FloatTensor, + noise: torch.FloatTensor, + timesteps: torch.IntTensor, + ) -> torch.FloatTensor: + # Make sure alphas_cumprod and timestep have same device and dtype as original_samples + alphas_cumprod = self.alphas_cumprod.to(device=original_samples.device, dtype=original_samples.dtype) + timesteps = timesteps.to(original_samples.device) + + sqrt_alpha_prod = alphas_cumprod[timesteps] ** 0.5 + sqrt_alpha_prod = sqrt_alpha_prod.flatten() + while len(sqrt_alpha_prod.shape) < len(original_samples.shape): + sqrt_alpha_prod = sqrt_alpha_prod.unsqueeze(-1) + + sqrt_one_minus_alpha_prod = (1 - alphas_cumprod[timesteps]) ** 0.5 + sqrt_one_minus_alpha_prod = sqrt_one_minus_alpha_prod.flatten() + while len(sqrt_one_minus_alpha_prod.shape) < len(original_samples.shape): + sqrt_one_minus_alpha_prod = sqrt_one_minus_alpha_prod.unsqueeze(-1) + + noisy_samples = sqrt_alpha_prod * original_samples + sqrt_one_minus_alpha_prod * noise + return noisy_samples + + def __len__(self): + return self.config.num_train_timesteps diff --git a/gf/sd-webui-rich-text/diffusers_official/schedulers/scheduling_dpmsolver_multistep.py b/gf/sd-webui-rich-text/diffusers_official/schedulers/scheduling_dpmsolver_multistep.py new file mode 100644 index 0000000000000000000000000000000000000000..d7516fa601e17cdd5661039c181804d687a66f0e --- /dev/null +++ b/gf/sd-webui-rich-text/diffusers_official/schedulers/scheduling_dpmsolver_multistep.py @@ -0,0 +1,749 @@ +# Copyright 2023 TSAIL Team and The HuggingFace Team. All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +# DISCLAIMER: This file is strongly influenced by https://github.com/LuChengTHU/dpm-solver + +import math +from typing import List, Optional, Tuple, Union + +import numpy as np +import torch + +from ..configuration_utils import ConfigMixin, register_to_config +from ..utils import randn_tensor +from .scheduling_utils import KarrasDiffusionSchedulers, SchedulerMixin, SchedulerOutput + + +# Copied from diffusers.schedulers.scheduling_ddpm.betas_for_alpha_bar +def betas_for_alpha_bar( + num_diffusion_timesteps, + max_beta=0.999, + alpha_transform_type="cosine", +): + """ + Create a beta schedule that discretizes the given alpha_t_bar function, which defines the cumulative product of + (1-beta) over time from t = [0,1]. + + Contains a function alpha_bar that takes an argument t and transforms it to the cumulative product of (1-beta) up + to that part of the diffusion process. + + + Args: + num_diffusion_timesteps (`int`): the number of betas to produce. + max_beta (`float`): the maximum beta to use; use values lower than 1 to + prevent singularities. + alpha_transform_type (`str`, *optional*, default to `cosine`): the type of noise schedule for alpha_bar. 
+ Choose from `cosine` or `exp` + + Returns: + betas (`np.ndarray`): the betas used by the scheduler to step the model outputs + """ + if alpha_transform_type == "cosine": + + def alpha_bar_fn(t): + return math.cos((t + 0.008) / 1.008 * math.pi / 2) ** 2 + + elif alpha_transform_type == "exp": + + def alpha_bar_fn(t): + return math.exp(t * -12.0) + + else: + raise ValueError(f"Unsupported alpha_tranform_type: {alpha_transform_type}") + + betas = [] + for i in range(num_diffusion_timesteps): + t1 = i / num_diffusion_timesteps + t2 = (i + 1) / num_diffusion_timesteps + betas.append(min(1 - alpha_bar_fn(t2) / alpha_bar_fn(t1), max_beta)) + return torch.tensor(betas, dtype=torch.float32) + + +class DPMSolverMultistepScheduler(SchedulerMixin, ConfigMixin): + """ + DPM-Solver (and the improved version DPM-Solver++) is a fast dedicated high-order solver for diffusion ODEs with + the convergence order guarantee. Empirically, sampling by DPM-Solver with only 20 steps can generate high-quality + samples, and it can generate quite good samples even in only 10 steps. + + For more details, see the original paper: https://arxiv.org/abs/2206.00927 and https://arxiv.org/abs/2211.01095 + + Currently, we support the multistep DPM-Solver for both noise prediction models and data prediction models. We + recommend to use `solver_order=2` for guided sampling, and `solver_order=3` for unconditional sampling. + + We also support the "dynamic thresholding" method in Imagen (https://arxiv.org/abs/2205.11487). For pixel-space + diffusion models, you can set both `algorithm_type="dpmsolver++"` and `thresholding=True` to use the dynamic + thresholding. Note that the thresholding method is unsuitable for latent-space diffusion models (such as + stable-diffusion). + + We also support the SDE variant of DPM-Solver and DPM-Solver++, which is a fast SDE solver for the reverse + diffusion SDE. Currently we only support the first-order and second-order solvers. We recommend using the + second-order `sde-dpmsolver++`. + + [`~ConfigMixin`] takes care of storing all config attributes that are passed in the scheduler's `__init__` + function, such as `num_train_timesteps`. They can be accessed via `scheduler.config.num_train_timesteps`. + [`SchedulerMixin`] provides general loading and saving functionality via the [`SchedulerMixin.save_pretrained`] and + [`~SchedulerMixin.from_pretrained`] functions. + + Args: + num_train_timesteps (`int`): number of diffusion steps used to train the model. + beta_start (`float`): the starting `beta` value of inference. + beta_end (`float`): the final `beta` value. + beta_schedule (`str`): + the beta schedule, a mapping from a beta range to a sequence of betas for stepping the model. Choose from + `linear`, `scaled_linear`, or `squaredcos_cap_v2`. + trained_betas (`np.ndarray`, optional): + option to pass an array of betas directly to the constructor to bypass `beta_start`, `beta_end` etc. + solver_order (`int`, default `2`): + the order of DPM-Solver; can be `1` or `2` or `3`. We recommend to use `solver_order=2` for guided + sampling, and `solver_order=3` for unconditional sampling. 
+        prediction_type (`str`, default `epsilon`, optional):
+            prediction type of the scheduler function, one of `epsilon` (predicting the noise of the diffusion
+            process), `sample` (directly predicting the noisy sample) or `v_prediction` (see section 2.4 of
+            https://imagen.research.google/video/paper.pdf)
+        thresholding (`bool`, default `False`):
+            whether to use the "dynamic thresholding" method (introduced by Imagen, https://arxiv.org/abs/2205.11487).
+            For pixel-space diffusion models, you can set both `algorithm_type="dpmsolver++"` and `thresholding=True`
+            to use the dynamic thresholding. Note that the thresholding method is unsuitable for latent-space
+            diffusion models (such as stable-diffusion).
+        dynamic_thresholding_ratio (`float`, default `0.995`):
+            the ratio for the dynamic thresholding method. Default is `0.995`, the same as Imagen
+            (https://arxiv.org/abs/2205.11487).
+        sample_max_value (`float`, default `1.0`):
+            the threshold value for dynamic thresholding. Valid only when `thresholding=True` and
+            `algorithm_type="dpmsolver++"`.
+        algorithm_type (`str`, default `dpmsolver++`):
+            the algorithm type for the solver. Either `dpmsolver`, `dpmsolver++`, `sde-dpmsolver` or
+            `sde-dpmsolver++`. The `dpmsolver` type implements the algorithms in https://arxiv.org/abs/2206.00927, and
+            the `dpmsolver++` type implements the algorithms in https://arxiv.org/abs/2211.01095. We recommend using
+            `dpmsolver++` or `sde-dpmsolver++` with `solver_order=2` for guided sampling (e.g. stable-diffusion).
+        solver_type (`str`, default `midpoint`):
+            the solver type for the second-order solver. Either `midpoint` or `heun`. The solver type slightly affects
+            the sample quality, especially for a small number of steps. We empirically find that `midpoint` solvers
+            are slightly better, so we recommend the `midpoint` type.
+        lower_order_final (`bool`, default `True`):
+            whether to use lower-order solvers in the final steps. Only valid for < 15 inference steps. We empirically
+            find this trick can stabilize the sampling of DPM-Solver for steps < 15, especially for steps <= 10.
+        use_karras_sigmas (`bool`, *optional*, defaults to `False`):
+            This parameter controls whether to use Karras sigmas (Karras et al. (2022) scheme) for step sizes in the
+            noise schedule during the sampling process. If True, the sigmas will be determined according to a sequence
+            of noise levels {σi} as defined in Equation (5) of the paper https://arxiv.org/pdf/2206.00364.pdf.
+        lambda_min_clipped (`float`, default `-inf`):
+            the clipping threshold for the minimum value of lambda(t) for numerical stability. This is critical for
+            the cosine (squaredcos_cap_v2) noise schedule.
+        variance_type (`str`, *optional*):
+            Set to "learned" or "learned_range" for diffusion models that predict variance. For example, OpenAI's
+            guided-diffusion (https://github.com/openai/guided-diffusion) predicts both the mean and the variance of
+            the Gaussian distribution in the model's output. DPM-Solver only needs the "mean" output because it is
+            based on diffusion ODEs.
+        timestep_spacing (`str`, default `"linspace"`):
+            The way the timesteps should be scaled. Refer to Table 2.
of [Common Diffusion Noise Schedules and Sample + Steps are Flawed](https://arxiv.org/abs/2305.08891) for more information. + steps_offset (`int`, default `0`): + an offset added to the inference steps. You can use a combination of `offset=1` and + `set_alpha_to_one=False`, to make the last step use step 0 for the previous alpha product, as done in + stable diffusion. + """ + + _compatibles = [e.name for e in KarrasDiffusionSchedulers] + order = 1 + + @register_to_config + def __init__( + self, + num_train_timesteps: int = 1000, + beta_start: float = 0.0001, + beta_end: float = 0.02, + beta_schedule: str = "linear", + trained_betas: Optional[Union[np.ndarray, List[float]]] = None, + solver_order: int = 2, + prediction_type: str = "epsilon", + thresholding: bool = False, + dynamic_thresholding_ratio: float = 0.995, + sample_max_value: float = 1.0, + algorithm_type: str = "dpmsolver++", + solver_type: str = "midpoint", + lower_order_final: bool = True, + use_karras_sigmas: Optional[bool] = False, + lambda_min_clipped: float = -float("inf"), + variance_type: Optional[str] = None, + timestep_spacing: str = "linspace", + steps_offset: int = 0, + ): + if trained_betas is not None: + self.betas = torch.tensor(trained_betas, dtype=torch.float32) + elif beta_schedule == "linear": + self.betas = torch.linspace(beta_start, beta_end, num_train_timesteps, dtype=torch.float32) + elif beta_schedule == "scaled_linear": + # this schedule is very specific to the latent diffusion model. + self.betas = ( + torch.linspace(beta_start**0.5, beta_end**0.5, num_train_timesteps, dtype=torch.float32) ** 2 + ) + elif beta_schedule == "squaredcos_cap_v2": + # Glide cosine schedule + self.betas = betas_for_alpha_bar(num_train_timesteps) + else: + raise NotImplementedError(f"{beta_schedule} does is not implemented for {self.__class__}") + + self.alphas = 1.0 - self.betas + self.alphas_cumprod = torch.cumprod(self.alphas, dim=0) + # Currently we only support VP-type noise schedule + self.alpha_t = torch.sqrt(self.alphas_cumprod) + self.sigma_t = torch.sqrt(1 - self.alphas_cumprod) + self.lambda_t = torch.log(self.alpha_t) - torch.log(self.sigma_t) + + # standard deviation of the initial noise distribution + self.init_noise_sigma = 1.0 + + # settings for DPM-Solver + if algorithm_type not in ["dpmsolver", "dpmsolver++", "sde-dpmsolver", "sde-dpmsolver++"]: + if algorithm_type == "deis": + self.register_to_config(algorithm_type="dpmsolver++") + else: + raise NotImplementedError(f"{algorithm_type} does is not implemented for {self.__class__}") + + if solver_type not in ["midpoint", "heun"]: + if solver_type in ["logrho", "bh1", "bh2"]: + self.register_to_config(solver_type="midpoint") + else: + raise NotImplementedError(f"{solver_type} does is not implemented for {self.__class__}") + + # setable values + self.num_inference_steps = None + timesteps = np.linspace(0, num_train_timesteps - 1, num_train_timesteps, dtype=np.float32)[::-1].copy() + self.timesteps = torch.from_numpy(timesteps) + self.model_outputs = [None] * solver_order + self.lower_order_nums = 0 + + def set_timesteps(self, num_inference_steps: int = None, device: Union[str, torch.device] = None): + """ + Sets the timesteps used for the diffusion chain. Supporting function to be run before inference. + + Args: + num_inference_steps (`int`): + the number of diffusion steps used when generating samples with a pre-trained model. + device (`str` or `torch.device`, optional): + the device to which the timesteps should be moved to. 
If `None`, the timesteps are not moved. + """ + # Clipping the minimum of all lambda(t) for numerical stability. + # This is critical for cosine (squaredcos_cap_v2) noise schedule. + clipped_idx = torch.searchsorted(torch.flip(self.lambda_t, [0]), self.config.lambda_min_clipped) + last_timestep = ((self.config.num_train_timesteps - clipped_idx).numpy()).item() + + # "linspace", "leading", "trailing" corresponds to annotation of Table 2. of https://arxiv.org/abs/2305.08891 + if self.config.timestep_spacing == "linspace": + timesteps = ( + np.linspace(0, last_timestep - 1, num_inference_steps + 1).round()[::-1][:-1].copy().astype(np.int64) + ) + elif self.config.timestep_spacing == "leading": + step_ratio = last_timestep // (num_inference_steps + 1) + # creates integer timesteps by multiplying by ratio + # casting to int to avoid issues when num_inference_step is power of 3 + timesteps = (np.arange(0, num_inference_steps + 1) * step_ratio).round()[::-1][:-1].copy().astype(np.int64) + timesteps += self.config.steps_offset + elif self.config.timestep_spacing == "trailing": + step_ratio = self.config.num_train_timesteps / num_inference_steps + # creates integer timesteps by multiplying by ratio + # casting to int to avoid issues when num_inference_step is power of 3 + timesteps = np.arange(last_timestep, 0, -step_ratio).round().copy().astype(np.int64) + timesteps -= 1 + else: + raise ValueError( + f"{self.config.timestep_spacing} is not supported. Please make sure to choose one of 'linspace', 'leading' or 'trailing'." + ) + + sigmas = np.array(((1 - self.alphas_cumprod) / self.alphas_cumprod) ** 0.5) + if self.config.use_karras_sigmas: + log_sigmas = np.log(sigmas) + sigmas = self._convert_to_karras(in_sigmas=sigmas, num_inference_steps=num_inference_steps) + timesteps = np.array([self._sigma_to_t(sigma, log_sigmas) for sigma in sigmas]).round() + timesteps = np.flip(timesteps).copy().astype(np.int64) + + self.sigmas = torch.from_numpy(sigmas) + + # when num_inference_steps == num_train_timesteps, we can end up with + # duplicates in timesteps. + _, unique_indices = np.unique(timesteps, return_index=True) + timesteps = timesteps[np.sort(unique_indices)] + + self.timesteps = torch.from_numpy(timesteps).to(device) + + self.num_inference_steps = len(timesteps) + + self.model_outputs = [ + None, + ] * self.config.solver_order + self.lower_order_nums = 0 + + # Copied from diffusers.schedulers.scheduling_ddpm.DDPMScheduler._threshold_sample + def _threshold_sample(self, sample: torch.FloatTensor) -> torch.FloatTensor: + """ + "Dynamic thresholding: At each sampling step we set s to a certain percentile absolute pixel value in xt0 (the + prediction of x_0 at timestep t), and if s > 1, then we threshold xt0 to the range [-s, s] and then divide by + s. Dynamic thresholding pushes saturated pixels (those near -1 and 1) inwards, thereby actively preventing + pixels from saturation at each step. We find that dynamic thresholding results in significantly better + photorealism as well as better image-text alignment, especially when using very large guidance weights." 
+ + https://arxiv.org/abs/2205.11487 + """ + dtype = sample.dtype + batch_size, channels, height, width = sample.shape + + if dtype not in (torch.float32, torch.float64): + sample = sample.float() # upcast for quantile calculation, and clamp not implemented for cpu half + + # Flatten sample for doing quantile calculation along each image + sample = sample.reshape(batch_size, channels * height * width) + + abs_sample = sample.abs() # "a certain percentile absolute pixel value" + + s = torch.quantile(abs_sample, self.config.dynamic_thresholding_ratio, dim=1) + s = torch.clamp( + s, min=1, max=self.config.sample_max_value + ) # When clamped to min=1, equivalent to standard clipping to [-1, 1] + + s = s.unsqueeze(1) # (batch_size, 1) because clamp will broadcast along dim=0 + sample = torch.clamp(sample, -s, s) / s # "we threshold xt0 to the range [-s, s] and then divide by s" + + sample = sample.reshape(batch_size, channels, height, width) + sample = sample.to(dtype) + + return sample + + # Copied from diffusers.schedulers.scheduling_euler_discrete.EulerDiscreteScheduler._sigma_to_t + def _sigma_to_t(self, sigma, log_sigmas): + # get log sigma + log_sigma = np.log(sigma) + + # get distribution + dists = log_sigma - log_sigmas[:, np.newaxis] + + # get sigmas range + low_idx = np.cumsum((dists >= 0), axis=0).argmax(axis=0).clip(max=log_sigmas.shape[0] - 2) + high_idx = low_idx + 1 + + low = log_sigmas[low_idx] + high = log_sigmas[high_idx] + + # interpolate sigmas + w = (low - log_sigma) / (low - high) + w = np.clip(w, 0, 1) + + # transform interpolation to time range + t = (1 - w) * low_idx + w * high_idx + t = t.reshape(sigma.shape) + return t + + # Copied from diffusers.schedulers.scheduling_euler_discrete.EulerDiscreteScheduler._convert_to_karras + def _convert_to_karras(self, in_sigmas: torch.FloatTensor, num_inference_steps) -> torch.FloatTensor: + """Constructs the noise schedule of Karras et al. (2022).""" + + sigma_min: float = in_sigmas[-1].item() + sigma_max: float = in_sigmas[0].item() + + rho = 7.0 # 7.0 is the value used in the paper + ramp = np.linspace(0, 1, num_inference_steps) + min_inv_rho = sigma_min ** (1 / rho) + max_inv_rho = sigma_max ** (1 / rho) + sigmas = (max_inv_rho + ramp * (min_inv_rho - max_inv_rho)) ** rho + return sigmas + + def convert_model_output( + self, model_output: torch.FloatTensor, timestep: int, sample: torch.FloatTensor + ) -> torch.FloatTensor: + """ + Convert the model output to the corresponding type that the algorithm (DPM-Solver / DPM-Solver++) needs. + + DPM-Solver is designed to discretize an integral of the noise prediction model, and DPM-Solver++ is designed to + discretize an integral of the data prediction model. So we need to first convert the model output to the + corresponding type to match the algorithm. + + Note that the algorithm type and the model type is decoupled. That is to say, we can use either DPM-Solver or + DPM-Solver++ for both noise prediction model and data prediction model. + + Args: + model_output (`torch.FloatTensor`): direct output from learned diffusion model. + timestep (`int`): current discrete timestep in the diffusion chain. + sample (`torch.FloatTensor`): + current instance of sample being created by diffusion process. + + Returns: + `torch.FloatTensor`: the converted model output. + """ + + # DPM-Solver++ needs to solve an integral of the data prediction model. 
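The comment above marks the hand-off into data-prediction space. As a toy illustration, with our own names and schedule defaults rather than anything from the patch, of the epsilon-to-x0 conversion the `dpmsolver++` branch below performs for `prediction_type="epsilon"`, and of the inverse used on the `dpmsolver` side:

```python
# Toy round trip between noise prediction (epsilon) and data prediction (x0).
import torch

alphas_cumprod = torch.cumprod(1.0 - torch.linspace(0.0001, 0.02, 1000), dim=0)
alpha_t, sigma_t = alphas_cumprod.sqrt(), (1.0 - alphas_cumprod).sqrt()

t = 500
x_t = torch.randn(1, 4, 8, 8)
eps_pred = torch.randn_like(x_t)                       # stand-in for the model's epsilon output

x0_pred = (x_t - sigma_t[t] * eps_pred) / alpha_t[t]   # data prediction (DPM-Solver++)
eps_back = (x_t - alpha_t[t] * x0_pred) / sigma_t[t]   # noise prediction (DPM-Solver)
# eps_back matches eps_pred up to floating point error
```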
+ if self.config.algorithm_type in ["dpmsolver++", "sde-dpmsolver++"]: + if self.config.prediction_type == "epsilon": + # DPM-Solver and DPM-Solver++ only need the "mean" output. + if self.config.variance_type in ["learned", "learned_range"]: + model_output = model_output[:, :3] + alpha_t, sigma_t = self.alpha_t[timestep], self.sigma_t[timestep] + x0_pred = (sample - sigma_t * model_output) / alpha_t + elif self.config.prediction_type == "sample": + x0_pred = model_output + elif self.config.prediction_type == "v_prediction": + alpha_t, sigma_t = self.alpha_t[timestep], self.sigma_t[timestep] + x0_pred = alpha_t * sample - sigma_t * model_output + else: + raise ValueError( + f"prediction_type given as {self.config.prediction_type} must be one of `epsilon`, `sample`, or" + " `v_prediction` for the DPMSolverMultistepScheduler." + ) + + if self.config.thresholding: + x0_pred = self._threshold_sample(x0_pred) + + return x0_pred + + # DPM-Solver needs to solve an integral of the noise prediction model. + elif self.config.algorithm_type in ["dpmsolver", "sde-dpmsolver"]: + if self.config.prediction_type == "epsilon": + # DPM-Solver and DPM-Solver++ only need the "mean" output. + if self.config.variance_type in ["learned", "learned_range"]: + epsilon = model_output[:, :3] + else: + epsilon = model_output + elif self.config.prediction_type == "sample": + alpha_t, sigma_t = self.alpha_t[timestep], self.sigma_t[timestep] + epsilon = (sample - alpha_t * model_output) / sigma_t + elif self.config.prediction_type == "v_prediction": + alpha_t, sigma_t = self.alpha_t[timestep], self.sigma_t[timestep] + epsilon = alpha_t * model_output + sigma_t * sample + else: + raise ValueError( + f"prediction_type given as {self.config.prediction_type} must be one of `epsilon`, `sample`, or" + " `v_prediction` for the DPMSolverMultistepScheduler." + ) + + if self.config.thresholding: + alpha_t, sigma_t = self.alpha_t[timestep], self.sigma_t[timestep] + x0_pred = (sample - sigma_t * epsilon) / alpha_t + x0_pred = self._threshold_sample(x0_pred) + epsilon = (sample - alpha_t * x0_pred) / sigma_t + + return epsilon + + def dpm_solver_first_order_update( + self, + model_output: torch.FloatTensor, + timestep: int, + prev_timestep: int, + sample: torch.FloatTensor, + noise: Optional[torch.FloatTensor] = None, + ) -> torch.FloatTensor: + """ + One step for the first-order DPM-Solver (equivalent to DDIM). + + See https://arxiv.org/abs/2206.00927 for the detailed derivation. + + Args: + model_output (`torch.FloatTensor`): direct output from learned diffusion model. + timestep (`int`): current discrete timestep in the diffusion chain. + prev_timestep (`int`): previous discrete timestep in the diffusion chain. + sample (`torch.FloatTensor`): + current instance of sample being created by diffusion process. + + Returns: + `torch.FloatTensor`: the sample tensor at the previous timestep. 
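+
+        Informal note: with h = lambda_t - lambda_s (the log-SNR increment), the "dpmsolver++"
+        branch computes x_t = (sigma_t / sigma_s) * sample - alpha_t * (exp(-h) - 1) * model_output,
+        i.e. the exact solution of the probability-flow ODE when the data prediction is held
+        constant over the step, which is why this update is equivalent to DDIM.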
+ """ + lambda_t, lambda_s = self.lambda_t[prev_timestep], self.lambda_t[timestep] + alpha_t, alpha_s = self.alpha_t[prev_timestep], self.alpha_t[timestep] + sigma_t, sigma_s = self.sigma_t[prev_timestep], self.sigma_t[timestep] + h = lambda_t - lambda_s + if self.config.algorithm_type == "dpmsolver++": + x_t = (sigma_t / sigma_s) * sample - (alpha_t * (torch.exp(-h) - 1.0)) * model_output + elif self.config.algorithm_type == "dpmsolver": + x_t = (alpha_t / alpha_s) * sample - (sigma_t * (torch.exp(h) - 1.0)) * model_output + elif self.config.algorithm_type == "sde-dpmsolver++": + assert noise is not None + x_t = ( + (sigma_t / sigma_s * torch.exp(-h)) * sample + + (alpha_t * (1 - torch.exp(-2.0 * h))) * model_output + + sigma_t * torch.sqrt(1.0 - torch.exp(-2 * h)) * noise + ) + elif self.config.algorithm_type == "sde-dpmsolver": + assert noise is not None + x_t = ( + (alpha_t / alpha_s) * sample + - 2.0 * (sigma_t * (torch.exp(h) - 1.0)) * model_output + + sigma_t * torch.sqrt(torch.exp(2 * h) - 1.0) * noise + ) + return x_t + + def multistep_dpm_solver_second_order_update( + self, + model_output_list: List[torch.FloatTensor], + timestep_list: List[int], + prev_timestep: int, + sample: torch.FloatTensor, + noise: Optional[torch.FloatTensor] = None, + ) -> torch.FloatTensor: + """ + One step for the second-order multistep DPM-Solver. + + Args: + model_output_list (`List[torch.FloatTensor]`): + direct outputs from learned diffusion model at current and latter timesteps. + timestep (`int`): current and latter discrete timestep in the diffusion chain. + prev_timestep (`int`): previous discrete timestep in the diffusion chain. + sample (`torch.FloatTensor`): + current instance of sample being created by diffusion process. + + Returns: + `torch.FloatTensor`: the sample tensor at the previous timestep. 
+ """ + t, s0, s1 = prev_timestep, timestep_list[-1], timestep_list[-2] + m0, m1 = model_output_list[-1], model_output_list[-2] + lambda_t, lambda_s0, lambda_s1 = self.lambda_t[t], self.lambda_t[s0], self.lambda_t[s1] + alpha_t, alpha_s0 = self.alpha_t[t], self.alpha_t[s0] + sigma_t, sigma_s0 = self.sigma_t[t], self.sigma_t[s0] + h, h_0 = lambda_t - lambda_s0, lambda_s0 - lambda_s1 + r0 = h_0 / h + D0, D1 = m0, (1.0 / r0) * (m0 - m1) + if self.config.algorithm_type == "dpmsolver++": + # See https://arxiv.org/abs/2211.01095 for detailed derivations + if self.config.solver_type == "midpoint": + x_t = ( + (sigma_t / sigma_s0) * sample + - (alpha_t * (torch.exp(-h) - 1.0)) * D0 + - 0.5 * (alpha_t * (torch.exp(-h) - 1.0)) * D1 + ) + elif self.config.solver_type == "heun": + x_t = ( + (sigma_t / sigma_s0) * sample + - (alpha_t * (torch.exp(-h) - 1.0)) * D0 + + (alpha_t * ((torch.exp(-h) - 1.0) / h + 1.0)) * D1 + ) + elif self.config.algorithm_type == "dpmsolver": + # See https://arxiv.org/abs/2206.00927 for detailed derivations + if self.config.solver_type == "midpoint": + x_t = ( + (alpha_t / alpha_s0) * sample + - (sigma_t * (torch.exp(h) - 1.0)) * D0 + - 0.5 * (sigma_t * (torch.exp(h) - 1.0)) * D1 + ) + elif self.config.solver_type == "heun": + x_t = ( + (alpha_t / alpha_s0) * sample + - (sigma_t * (torch.exp(h) - 1.0)) * D0 + - (sigma_t * ((torch.exp(h) - 1.0) / h - 1.0)) * D1 + ) + elif self.config.algorithm_type == "sde-dpmsolver++": + assert noise is not None + if self.config.solver_type == "midpoint": + x_t = ( + (sigma_t / sigma_s0 * torch.exp(-h)) * sample + + (alpha_t * (1 - torch.exp(-2.0 * h))) * D0 + + 0.5 * (alpha_t * (1 - torch.exp(-2.0 * h))) * D1 + + sigma_t * torch.sqrt(1.0 - torch.exp(-2 * h)) * noise + ) + elif self.config.solver_type == "heun": + x_t = ( + (sigma_t / sigma_s0 * torch.exp(-h)) * sample + + (alpha_t * (1 - torch.exp(-2.0 * h))) * D0 + + (alpha_t * ((1.0 - torch.exp(-2.0 * h)) / (-2.0 * h) + 1.0)) * D1 + + sigma_t * torch.sqrt(1.0 - torch.exp(-2 * h)) * noise + ) + elif self.config.algorithm_type == "sde-dpmsolver": + assert noise is not None + if self.config.solver_type == "midpoint": + x_t = ( + (alpha_t / alpha_s0) * sample + - 2.0 * (sigma_t * (torch.exp(h) - 1.0)) * D0 + - (sigma_t * (torch.exp(h) - 1.0)) * D1 + + sigma_t * torch.sqrt(torch.exp(2 * h) - 1.0) * noise + ) + elif self.config.solver_type == "heun": + x_t = ( + (alpha_t / alpha_s0) * sample + - 2.0 * (sigma_t * (torch.exp(h) - 1.0)) * D0 + - 2.0 * (sigma_t * ((torch.exp(h) - 1.0) / h - 1.0)) * D1 + + sigma_t * torch.sqrt(torch.exp(2 * h) - 1.0) * noise + ) + return x_t + + def multistep_dpm_solver_third_order_update( + self, + model_output_list: List[torch.FloatTensor], + timestep_list: List[int], + prev_timestep: int, + sample: torch.FloatTensor, + ) -> torch.FloatTensor: + """ + One step for the third-order multistep DPM-Solver. + + Args: + model_output_list (`List[torch.FloatTensor]`): + direct outputs from learned diffusion model at current and latter timesteps. + timestep (`int`): current and latter discrete timestep in the diffusion chain. + prev_timestep (`int`): previous discrete timestep in the diffusion chain. + sample (`torch.FloatTensor`): + current instance of sample being created by diffusion process. + + Returns: + `torch.FloatTensor`: the sample tensor at the previous timestep. 
+ """ + t, s0, s1, s2 = prev_timestep, timestep_list[-1], timestep_list[-2], timestep_list[-3] + m0, m1, m2 = model_output_list[-1], model_output_list[-2], model_output_list[-3] + lambda_t, lambda_s0, lambda_s1, lambda_s2 = ( + self.lambda_t[t], + self.lambda_t[s0], + self.lambda_t[s1], + self.lambda_t[s2], + ) + alpha_t, alpha_s0 = self.alpha_t[t], self.alpha_t[s0] + sigma_t, sigma_s0 = self.sigma_t[t], self.sigma_t[s0] + h, h_0, h_1 = lambda_t - lambda_s0, lambda_s0 - lambda_s1, lambda_s1 - lambda_s2 + r0, r1 = h_0 / h, h_1 / h + D0 = m0 + D1_0, D1_1 = (1.0 / r0) * (m0 - m1), (1.0 / r1) * (m1 - m2) + D1 = D1_0 + (r0 / (r0 + r1)) * (D1_0 - D1_1) + D2 = (1.0 / (r0 + r1)) * (D1_0 - D1_1) + if self.config.algorithm_type == "dpmsolver++": + # See https://arxiv.org/abs/2206.00927 for detailed derivations + x_t = ( + (sigma_t / sigma_s0) * sample + - (alpha_t * (torch.exp(-h) - 1.0)) * D0 + + (alpha_t * ((torch.exp(-h) - 1.0) / h + 1.0)) * D1 + - (alpha_t * ((torch.exp(-h) - 1.0 + h) / h**2 - 0.5)) * D2 + ) + elif self.config.algorithm_type == "dpmsolver": + # See https://arxiv.org/abs/2206.00927 for detailed derivations + x_t = ( + (alpha_t / alpha_s0) * sample + - (sigma_t * (torch.exp(h) - 1.0)) * D0 + - (sigma_t * ((torch.exp(h) - 1.0) / h - 1.0)) * D1 + - (sigma_t * ((torch.exp(h) - 1.0 - h) / h**2 - 0.5)) * D2 + ) + return x_t + + def step( + self, + model_output: torch.FloatTensor, + timestep: int, + sample: torch.FloatTensor, + generator=None, + return_dict: bool = True, + ) -> Union[SchedulerOutput, Tuple]: + """ + Step function propagating the sample with the multistep DPM-Solver. + + Args: + model_output (`torch.FloatTensor`): direct output from learned diffusion model. + timestep (`int`): current discrete timestep in the diffusion chain. + sample (`torch.FloatTensor`): + current instance of sample being created by diffusion process. + return_dict (`bool`): option for returning tuple rather than SchedulerOutput class + + Returns: + [`~scheduling_utils.SchedulerOutput`] or `tuple`: [`~scheduling_utils.SchedulerOutput`] if `return_dict` is + True, otherwise a `tuple`. When returning a tuple, the first element is the sample tensor. 
+ + """ + if self.num_inference_steps is None: + raise ValueError( + "Number of inference steps is 'None', you need to run 'set_timesteps' after creating the scheduler" + ) + + if isinstance(timestep, torch.Tensor): + timestep = timestep.to(self.timesteps.device) + step_index = (self.timesteps == timestep).nonzero() + if len(step_index) == 0: + step_index = len(self.timesteps) - 1 + else: + step_index = step_index.item() + prev_timestep = 0 if step_index == len(self.timesteps) - 1 else self.timesteps[step_index + 1] + lower_order_final = ( + (step_index == len(self.timesteps) - 1) and self.config.lower_order_final and len(self.timesteps) < 15 + ) + lower_order_second = ( + (step_index == len(self.timesteps) - 2) and self.config.lower_order_final and len(self.timesteps) < 15 + ) + + model_output = self.convert_model_output(model_output, timestep, sample) + for i in range(self.config.solver_order - 1): + self.model_outputs[i] = self.model_outputs[i + 1] + self.model_outputs[-1] = model_output + + if self.config.algorithm_type in ["sde-dpmsolver", "sde-dpmsolver++"]: + noise = randn_tensor( + model_output.shape, generator=generator, device=model_output.device, dtype=model_output.dtype + ) + else: + noise = None + + if self.config.solver_order == 1 or self.lower_order_nums < 1 or lower_order_final: + prev_sample = self.dpm_solver_first_order_update( + model_output, timestep, prev_timestep, sample, noise=noise + ) + elif self.config.solver_order == 2 or self.lower_order_nums < 2 or lower_order_second: + timestep_list = [self.timesteps[step_index - 1], timestep] + prev_sample = self.multistep_dpm_solver_second_order_update( + self.model_outputs, timestep_list, prev_timestep, sample, noise=noise + ) + else: + timestep_list = [self.timesteps[step_index - 2], self.timesteps[step_index - 1], timestep] + prev_sample = self.multistep_dpm_solver_third_order_update( + self.model_outputs, timestep_list, prev_timestep, sample + ) + + if self.lower_order_nums < self.config.solver_order: + self.lower_order_nums += 1 + + if not return_dict: + return (prev_sample,) + + return SchedulerOutput(prev_sample=prev_sample) + + def scale_model_input(self, sample: torch.FloatTensor, *args, **kwargs) -> torch.FloatTensor: + """ + Ensures interchangeability with schedulers that need to scale the denoising model input depending on the + current timestep. 
+ + Args: + sample (`torch.FloatTensor`): input sample + + Returns: + `torch.FloatTensor`: scaled input sample + """ + return sample + + # Copied from diffusers.schedulers.scheduling_ddpm.DDPMScheduler.add_noise + def add_noise( + self, + original_samples: torch.FloatTensor, + noise: torch.FloatTensor, + timesteps: torch.IntTensor, + ) -> torch.FloatTensor: + # Make sure alphas_cumprod and timestep have same device and dtype as original_samples + alphas_cumprod = self.alphas_cumprod.to(device=original_samples.device, dtype=original_samples.dtype) + timesteps = timesteps.to(original_samples.device) + + sqrt_alpha_prod = alphas_cumprod[timesteps] ** 0.5 + sqrt_alpha_prod = sqrt_alpha_prod.flatten() + while len(sqrt_alpha_prod.shape) < len(original_samples.shape): + sqrt_alpha_prod = sqrt_alpha_prod.unsqueeze(-1) + + sqrt_one_minus_alpha_prod = (1 - alphas_cumprod[timesteps]) ** 0.5 + sqrt_one_minus_alpha_prod = sqrt_one_minus_alpha_prod.flatten() + while len(sqrt_one_minus_alpha_prod.shape) < len(original_samples.shape): + sqrt_one_minus_alpha_prod = sqrt_one_minus_alpha_prod.unsqueeze(-1) + + noisy_samples = sqrt_alpha_prod * original_samples + sqrt_one_minus_alpha_prod * noise + return noisy_samples + + def __len__(self): + return self.config.num_train_timesteps diff --git a/gf/sd-webui-rich-text/diffusers_official/schedulers/scheduling_dpmsolver_multistep_flax.py b/gf/sd-webui-rich-text/diffusers_official/schedulers/scheduling_dpmsolver_multistep_flax.py new file mode 100644 index 0000000000000000000000000000000000000000..9b4ee67a7f5dbf8384eaedc0ede322284a413edd --- /dev/null +++ b/gf/sd-webui-rich-text/diffusers_official/schedulers/scheduling_dpmsolver_multistep_flax.py @@ -0,0 +1,622 @@ +# Copyright 2023 TSAIL Team and The HuggingFace Team. All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +# DISCLAIMER: This file is strongly influenced by https://github.com/LuChengTHU/dpm-solver + +from dataclasses import dataclass +from typing import List, Optional, Tuple, Union + +import flax +import jax +import jax.numpy as jnp + +from ..configuration_utils import ConfigMixin, register_to_config +from .scheduling_utils_flax import ( + CommonSchedulerState, + FlaxKarrasDiffusionSchedulers, + FlaxSchedulerMixin, + FlaxSchedulerOutput, + add_noise_common, +) + + +@flax.struct.dataclass +class DPMSolverMultistepSchedulerState: + common: CommonSchedulerState + alpha_t: jnp.ndarray + sigma_t: jnp.ndarray + lambda_t: jnp.ndarray + + # setable values + init_noise_sigma: jnp.ndarray + timesteps: jnp.ndarray + num_inference_steps: Optional[int] = None + + # running values + model_outputs: Optional[jnp.ndarray] = None + lower_order_nums: Optional[jnp.int32] = None + prev_timestep: Optional[jnp.int32] = None + cur_sample: Optional[jnp.ndarray] = None + + @classmethod + def create( + cls, + common: CommonSchedulerState, + alpha_t: jnp.ndarray, + sigma_t: jnp.ndarray, + lambda_t: jnp.ndarray, + init_noise_sigma: jnp.ndarray, + timesteps: jnp.ndarray, + ): + return cls( + common=common, + alpha_t=alpha_t, + sigma_t=sigma_t, + lambda_t=lambda_t, + init_noise_sigma=init_noise_sigma, + timesteps=timesteps, + ) + + +@dataclass +class FlaxDPMSolverMultistepSchedulerOutput(FlaxSchedulerOutput): + state: DPMSolverMultistepSchedulerState + + +class FlaxDPMSolverMultistepScheduler(FlaxSchedulerMixin, ConfigMixin): + """ + DPM-Solver (and the improved version DPM-Solver++) is a fast dedicated high-order solver for diffusion ODEs with + the convergence order guarantee. Empirically, sampling by DPM-Solver with only 20 steps can generate high-quality + samples, and it can generate quite good samples even in only 10 steps. + + For more details, see the original paper: https://arxiv.org/abs/2206.00927 and https://arxiv.org/abs/2211.01095 + + Currently, we support the multistep DPM-Solver for both noise prediction models and data prediction models. We + recommend to use `solver_order=2` for guided sampling, and `solver_order=3` for unconditional sampling. + + We also support the "dynamic thresholding" method in Imagen (https://arxiv.org/abs/2205.11487). For pixel-space + diffusion models, you can set both `algorithm_type="dpmsolver++"` and `thresholding=True` to use the dynamic + thresholding. Note that the thresholding method is unsuitable for latent-space diffusion models (such as + stable-diffusion). + + [`~ConfigMixin`] takes care of storing all config attributes that are passed in the scheduler's `__init__` + function, such as `num_train_timesteps`. They can be accessed via `scheduler.config.num_train_timesteps`. + [`SchedulerMixin`] provides general loading and saving functionality via the [`SchedulerMixin.save_pretrained`] and + [`~SchedulerMixin.from_pretrained`] functions. + + For more details, see the original paper: https://arxiv.org/abs/2206.00927 and https://arxiv.org/abs/2211.01095 + + Args: + num_train_timesteps (`int`): number of diffusion steps used to train the model. + beta_start (`float`): the starting `beta` value of inference. + beta_end (`float`): the final `beta` value. + beta_schedule (`str`): + the beta schedule, a mapping from a beta range to a sequence of betas for stepping the model. Choose from + `linear`, `scaled_linear`, or `squaredcos_cap_v2`. 
+ trained_betas (`np.ndarray`, optional): + option to pass an array of betas directly to the constructor to bypass `beta_start`, `beta_end` etc. + solver_order (`int`, default `2`): + the order of DPM-Solver; can be `1` or `2` or `3`. We recommend to use `solver_order=2` for guided + sampling, and `solver_order=3` for unconditional sampling. + prediction_type (`str`, default `epsilon`): + indicates whether the model predicts the noise (epsilon), or the data / `x0`. One of `epsilon`, `sample`, + or `v-prediction`. + thresholding (`bool`, default `False`): + whether to use the "dynamic thresholding" method (introduced by Imagen, https://arxiv.org/abs/2205.11487). + For pixel-space diffusion models, you can set both `algorithm_type=dpmsolver++` and `thresholding=True` to + use the dynamic thresholding. Note that the thresholding method is unsuitable for latent-space diffusion + models (such as stable-diffusion). + dynamic_thresholding_ratio (`float`, default `0.995`): + the ratio for the dynamic thresholding method. Default is `0.995`, the same as Imagen + (https://arxiv.org/abs/2205.11487). + sample_max_value (`float`, default `1.0`): + the threshold value for dynamic thresholding. Valid only when `thresholding=True` and + `algorithm_type="dpmsolver++`. + algorithm_type (`str`, default `dpmsolver++`): + the algorithm type for the solver. Either `dpmsolver` or `dpmsolver++`. The `dpmsolver` type implements the + algorithms in https://arxiv.org/abs/2206.00927, and the `dpmsolver++` type implements the algorithms in + https://arxiv.org/abs/2211.01095. We recommend to use `dpmsolver++` with `solver_order=2` for guided + sampling (e.g. stable-diffusion). + solver_type (`str`, default `midpoint`): + the solver type for the second-order solver. Either `midpoint` or `heun`. The solver type slightly affects + the sample quality, especially for small number of steps. We empirically find that `midpoint` solvers are + slightly better, so we recommend to use the `midpoint` type. + lower_order_final (`bool`, default `True`): + whether to use lower-order solvers in the final steps. Only valid for < 15 inference steps. We empirically + find this trick can stabilize the sampling of DPM-Solver for steps < 15, especially for steps <= 10. + dtype (`jnp.dtype`, *optional*, defaults to `jnp.float32`): + the `dtype` used for params and computation. 
+ """ + + _compatibles = [e.name for e in FlaxKarrasDiffusionSchedulers] + + dtype: jnp.dtype + + @property + def has_state(self): + return True + + @register_to_config + def __init__( + self, + num_train_timesteps: int = 1000, + beta_start: float = 0.0001, + beta_end: float = 0.02, + beta_schedule: str = "linear", + trained_betas: Optional[jnp.ndarray] = None, + solver_order: int = 2, + prediction_type: str = "epsilon", + thresholding: bool = False, + dynamic_thresholding_ratio: float = 0.995, + sample_max_value: float = 1.0, + algorithm_type: str = "dpmsolver++", + solver_type: str = "midpoint", + lower_order_final: bool = True, + dtype: jnp.dtype = jnp.float32, + ): + self.dtype = dtype + + def create_state(self, common: Optional[CommonSchedulerState] = None) -> DPMSolverMultistepSchedulerState: + if common is None: + common = CommonSchedulerState.create(self) + + # Currently we only support VP-type noise schedule + alpha_t = jnp.sqrt(common.alphas_cumprod) + sigma_t = jnp.sqrt(1 - common.alphas_cumprod) + lambda_t = jnp.log(alpha_t) - jnp.log(sigma_t) + + # settings for DPM-Solver + if self.config.algorithm_type not in ["dpmsolver", "dpmsolver++"]: + raise NotImplementedError(f"{self.config.algorithm_type} does is not implemented for {self.__class__}") + if self.config.solver_type not in ["midpoint", "heun"]: + raise NotImplementedError(f"{self.config.solver_type} does is not implemented for {self.__class__}") + + # standard deviation of the initial noise distribution + init_noise_sigma = jnp.array(1.0, dtype=self.dtype) + + timesteps = jnp.arange(0, self.config.num_train_timesteps).round()[::-1] + + return DPMSolverMultistepSchedulerState.create( + common=common, + alpha_t=alpha_t, + sigma_t=sigma_t, + lambda_t=lambda_t, + init_noise_sigma=init_noise_sigma, + timesteps=timesteps, + ) + + def set_timesteps( + self, state: DPMSolverMultistepSchedulerState, num_inference_steps: int, shape: Tuple + ) -> DPMSolverMultistepSchedulerState: + """ + Sets the discrete timesteps used for the diffusion chain. Supporting function to be run before inference. + + Args: + state (`DPMSolverMultistepSchedulerState`): + the `FlaxDPMSolverMultistepScheduler` state data class instance. + num_inference_steps (`int`): + the number of diffusion steps used when generating samples with a pre-trained model. + shape (`Tuple`): + the shape of the samples to be generated. + """ + + timesteps = ( + jnp.linspace(0, self.config.num_train_timesteps - 1, num_inference_steps + 1) + .round()[::-1][:-1] + .astype(jnp.int32) + ) + + # initial running values + + model_outputs = jnp.zeros((self.config.solver_order,) + shape, dtype=self.dtype) + lower_order_nums = jnp.int32(0) + prev_timestep = jnp.int32(-1) + cur_sample = jnp.zeros(shape, dtype=self.dtype) + + return state.replace( + num_inference_steps=num_inference_steps, + timesteps=timesteps, + model_outputs=model_outputs, + lower_order_nums=lower_order_nums, + prev_timestep=prev_timestep, + cur_sample=cur_sample, + ) + + def convert_model_output( + self, + state: DPMSolverMultistepSchedulerState, + model_output: jnp.ndarray, + timestep: int, + sample: jnp.ndarray, + ) -> jnp.ndarray: + """ + Convert the model output to the corresponding type that the algorithm (DPM-Solver / DPM-Solver++) needs. + + DPM-Solver is designed to discretize an integral of the noise prediction model, and DPM-Solver++ is designed to + discretize an integral of the data prediction model. So we need to first convert the model output to the + corresponding type to match the algorithm. 
+ + Note that the algorithm type and the model type is decoupled. That is to say, we can use either DPM-Solver or + DPM-Solver++ for both noise prediction model and data prediction model. + + Args: + model_output (`jnp.ndarray`): direct output from learned diffusion model. + timestep (`int`): current discrete timestep in the diffusion chain. + sample (`jnp.ndarray`): + current instance of sample being created by diffusion process. + + Returns: + `jnp.ndarray`: the converted model output. + """ + # DPM-Solver++ needs to solve an integral of the data prediction model. + if self.config.algorithm_type == "dpmsolver++": + if self.config.prediction_type == "epsilon": + alpha_t, sigma_t = state.alpha_t[timestep], state.sigma_t[timestep] + x0_pred = (sample - sigma_t * model_output) / alpha_t + elif self.config.prediction_type == "sample": + x0_pred = model_output + elif self.config.prediction_type == "v_prediction": + alpha_t, sigma_t = state.alpha_t[timestep], state.sigma_t[timestep] + x0_pred = alpha_t * sample - sigma_t * model_output + else: + raise ValueError( + f"prediction_type given as {self.config.prediction_type} must be one of `epsilon`, `sample`, " + " or `v_prediction` for the FlaxDPMSolverMultistepScheduler." + ) + + if self.config.thresholding: + # Dynamic thresholding in https://arxiv.org/abs/2205.11487 + dynamic_max_val = jnp.percentile( + jnp.abs(x0_pred), self.config.dynamic_thresholding_ratio, axis=tuple(range(1, x0_pred.ndim)) + ) + dynamic_max_val = jnp.maximum( + dynamic_max_val, self.config.sample_max_value * jnp.ones_like(dynamic_max_val) + ) + x0_pred = jnp.clip(x0_pred, -dynamic_max_val, dynamic_max_val) / dynamic_max_val + return x0_pred + # DPM-Solver needs to solve an integral of the noise prediction model. + elif self.config.algorithm_type == "dpmsolver": + if self.config.prediction_type == "epsilon": + return model_output + elif self.config.prediction_type == "sample": + alpha_t, sigma_t = state.alpha_t[timestep], state.sigma_t[timestep] + epsilon = (sample - alpha_t * model_output) / sigma_t + return epsilon + elif self.config.prediction_type == "v_prediction": + alpha_t, sigma_t = state.alpha_t[timestep], state.sigma_t[timestep] + epsilon = alpha_t * model_output + sigma_t * sample + return epsilon + else: + raise ValueError( + f"prediction_type given as {self.config.prediction_type} must be one of `epsilon`, `sample`, " + " or `v_prediction` for the FlaxDPMSolverMultistepScheduler." + ) + + def dpm_solver_first_order_update( + self, + state: DPMSolverMultistepSchedulerState, + model_output: jnp.ndarray, + timestep: int, + prev_timestep: int, + sample: jnp.ndarray, + ) -> jnp.ndarray: + """ + One step for the first-order DPM-Solver (equivalent to DDIM). + + See https://arxiv.org/abs/2206.00927 for the detailed derivation. + + Args: + model_output (`jnp.ndarray`): direct output from learned diffusion model. + timestep (`int`): current discrete timestep in the diffusion chain. + prev_timestep (`int`): previous discrete timestep in the diffusion chain. + sample (`jnp.ndarray`): + current instance of sample being created by diffusion process. + + Returns: + `jnp.ndarray`: the sample tensor at the previous timestep. 
+ """ + t, s0 = prev_timestep, timestep + m0 = model_output + lambda_t, lambda_s = state.lambda_t[t], state.lambda_t[s0] + alpha_t, alpha_s = state.alpha_t[t], state.alpha_t[s0] + sigma_t, sigma_s = state.sigma_t[t], state.sigma_t[s0] + h = lambda_t - lambda_s + if self.config.algorithm_type == "dpmsolver++": + x_t = (sigma_t / sigma_s) * sample - (alpha_t * (jnp.exp(-h) - 1.0)) * m0 + elif self.config.algorithm_type == "dpmsolver": + x_t = (alpha_t / alpha_s) * sample - (sigma_t * (jnp.exp(h) - 1.0)) * m0 + return x_t + + def multistep_dpm_solver_second_order_update( + self, + state: DPMSolverMultistepSchedulerState, + model_output_list: jnp.ndarray, + timestep_list: List[int], + prev_timestep: int, + sample: jnp.ndarray, + ) -> jnp.ndarray: + """ + One step for the second-order multistep DPM-Solver. + + Args: + model_output_list (`List[jnp.ndarray]`): + direct outputs from learned diffusion model at current and latter timesteps. + timestep (`int`): current and latter discrete timestep in the diffusion chain. + prev_timestep (`int`): previous discrete timestep in the diffusion chain. + sample (`jnp.ndarray`): + current instance of sample being created by diffusion process. + + Returns: + `jnp.ndarray`: the sample tensor at the previous timestep. + """ + t, s0, s1 = prev_timestep, timestep_list[-1], timestep_list[-2] + m0, m1 = model_output_list[-1], model_output_list[-2] + lambda_t, lambda_s0, lambda_s1 = state.lambda_t[t], state.lambda_t[s0], state.lambda_t[s1] + alpha_t, alpha_s0 = state.alpha_t[t], state.alpha_t[s0] + sigma_t, sigma_s0 = state.sigma_t[t], state.sigma_t[s0] + h, h_0 = lambda_t - lambda_s0, lambda_s0 - lambda_s1 + r0 = h_0 / h + D0, D1 = m0, (1.0 / r0) * (m0 - m1) + if self.config.algorithm_type == "dpmsolver++": + # See https://arxiv.org/abs/2211.01095 for detailed derivations + if self.config.solver_type == "midpoint": + x_t = ( + (sigma_t / sigma_s0) * sample + - (alpha_t * (jnp.exp(-h) - 1.0)) * D0 + - 0.5 * (alpha_t * (jnp.exp(-h) - 1.0)) * D1 + ) + elif self.config.solver_type == "heun": + x_t = ( + (sigma_t / sigma_s0) * sample + - (alpha_t * (jnp.exp(-h) - 1.0)) * D0 + + (alpha_t * ((jnp.exp(-h) - 1.0) / h + 1.0)) * D1 + ) + elif self.config.algorithm_type == "dpmsolver": + # See https://arxiv.org/abs/2206.00927 for detailed derivations + if self.config.solver_type == "midpoint": + x_t = ( + (alpha_t / alpha_s0) * sample + - (sigma_t * (jnp.exp(h) - 1.0)) * D0 + - 0.5 * (sigma_t * (jnp.exp(h) - 1.0)) * D1 + ) + elif self.config.solver_type == "heun": + x_t = ( + (alpha_t / alpha_s0) * sample + - (sigma_t * (jnp.exp(h) - 1.0)) * D0 + - (sigma_t * ((jnp.exp(h) - 1.0) / h - 1.0)) * D1 + ) + return x_t + + def multistep_dpm_solver_third_order_update( + self, + state: DPMSolverMultistepSchedulerState, + model_output_list: jnp.ndarray, + timestep_list: List[int], + prev_timestep: int, + sample: jnp.ndarray, + ) -> jnp.ndarray: + """ + One step for the third-order multistep DPM-Solver. + + Args: + model_output_list (`List[jnp.ndarray]`): + direct outputs from learned diffusion model at current and latter timesteps. + timestep (`int`): current and latter discrete timestep in the diffusion chain. + prev_timestep (`int`): previous discrete timestep in the diffusion chain. + sample (`jnp.ndarray`): + current instance of sample being created by diffusion process. + + Returns: + `jnp.ndarray`: the sample tensor at the previous timestep. 
+ """ + t, s0, s1, s2 = prev_timestep, timestep_list[-1], timestep_list[-2], timestep_list[-3] + m0, m1, m2 = model_output_list[-1], model_output_list[-2], model_output_list[-3] + lambda_t, lambda_s0, lambda_s1, lambda_s2 = ( + state.lambda_t[t], + state.lambda_t[s0], + state.lambda_t[s1], + state.lambda_t[s2], + ) + alpha_t, alpha_s0 = state.alpha_t[t], state.alpha_t[s0] + sigma_t, sigma_s0 = state.sigma_t[t], state.sigma_t[s0] + h, h_0, h_1 = lambda_t - lambda_s0, lambda_s0 - lambda_s1, lambda_s1 - lambda_s2 + r0, r1 = h_0 / h, h_1 / h + D0 = m0 + D1_0, D1_1 = (1.0 / r0) * (m0 - m1), (1.0 / r1) * (m1 - m2) + D1 = D1_0 + (r0 / (r0 + r1)) * (D1_0 - D1_1) + D2 = (1.0 / (r0 + r1)) * (D1_0 - D1_1) + if self.config.algorithm_type == "dpmsolver++": + # See https://arxiv.org/abs/2206.00927 for detailed derivations + x_t = ( + (sigma_t / sigma_s0) * sample + - (alpha_t * (jnp.exp(-h) - 1.0)) * D0 + + (alpha_t * ((jnp.exp(-h) - 1.0) / h + 1.0)) * D1 + - (alpha_t * ((jnp.exp(-h) - 1.0 + h) / h**2 - 0.5)) * D2 + ) + elif self.config.algorithm_type == "dpmsolver": + # See https://arxiv.org/abs/2206.00927 for detailed derivations + x_t = ( + (alpha_t / alpha_s0) * sample + - (sigma_t * (jnp.exp(h) - 1.0)) * D0 + - (sigma_t * ((jnp.exp(h) - 1.0) / h - 1.0)) * D1 + - (sigma_t * ((jnp.exp(h) - 1.0 - h) / h**2 - 0.5)) * D2 + ) + return x_t + + def step( + self, + state: DPMSolverMultistepSchedulerState, + model_output: jnp.ndarray, + timestep: int, + sample: jnp.ndarray, + return_dict: bool = True, + ) -> Union[FlaxDPMSolverMultistepSchedulerOutput, Tuple]: + """ + Predict the sample at the previous timestep by DPM-Solver. Core function to propagate the diffusion process + from the learned model outputs (most often the predicted noise). + + Args: + state (`DPMSolverMultistepSchedulerState`): + the `FlaxDPMSolverMultistepScheduler` state data class instance. + model_output (`jnp.ndarray`): direct output from learned diffusion model. + timestep (`int`): current discrete timestep in the diffusion chain. + sample (`jnp.ndarray`): + current instance of sample being created by diffusion process. + return_dict (`bool`): option for returning tuple rather than FlaxDPMSolverMultistepSchedulerOutput class + + Returns: + [`FlaxDPMSolverMultistepSchedulerOutput`] or `tuple`: [`FlaxDPMSolverMultistepSchedulerOutput`] if + `return_dict` is True, otherwise a `tuple`. When returning a tuple, the first element is the sample tensor. 
+ + """ + if state.num_inference_steps is None: + raise ValueError( + "Number of inference steps is 'None', you need to run 'set_timesteps' after creating the scheduler" + ) + + (step_index,) = jnp.where(state.timesteps == timestep, size=1) + step_index = step_index[0] + + prev_timestep = jax.lax.select(step_index == len(state.timesteps) - 1, 0, state.timesteps[step_index + 1]) + + model_output = self.convert_model_output(state, model_output, timestep, sample) + + model_outputs_new = jnp.roll(state.model_outputs, -1, axis=0) + model_outputs_new = model_outputs_new.at[-1].set(model_output) + state = state.replace( + model_outputs=model_outputs_new, + prev_timestep=prev_timestep, + cur_sample=sample, + ) + + def step_1(state: DPMSolverMultistepSchedulerState) -> jnp.ndarray: + return self.dpm_solver_first_order_update( + state, + state.model_outputs[-1], + state.timesteps[step_index], + state.prev_timestep, + state.cur_sample, + ) + + def step_23(state: DPMSolverMultistepSchedulerState) -> jnp.ndarray: + def step_2(state: DPMSolverMultistepSchedulerState) -> jnp.ndarray: + timestep_list = jnp.array([state.timesteps[step_index - 1], state.timesteps[step_index]]) + return self.multistep_dpm_solver_second_order_update( + state, + state.model_outputs, + timestep_list, + state.prev_timestep, + state.cur_sample, + ) + + def step_3(state: DPMSolverMultistepSchedulerState) -> jnp.ndarray: + timestep_list = jnp.array( + [ + state.timesteps[step_index - 2], + state.timesteps[step_index - 1], + state.timesteps[step_index], + ] + ) + return self.multistep_dpm_solver_third_order_update( + state, + state.model_outputs, + timestep_list, + state.prev_timestep, + state.cur_sample, + ) + + step_2_output = step_2(state) + step_3_output = step_3(state) + + if self.config.solver_order == 2: + return step_2_output + elif self.config.lower_order_final and len(state.timesteps) < 15: + return jax.lax.select( + state.lower_order_nums < 2, + step_2_output, + jax.lax.select( + step_index == len(state.timesteps) - 2, + step_2_output, + step_3_output, + ), + ) + else: + return jax.lax.select( + state.lower_order_nums < 2, + step_2_output, + step_3_output, + ) + + step_1_output = step_1(state) + step_23_output = step_23(state) + + if self.config.solver_order == 1: + prev_sample = step_1_output + + elif self.config.lower_order_final and len(state.timesteps) < 15: + prev_sample = jax.lax.select( + state.lower_order_nums < 1, + step_1_output, + jax.lax.select( + step_index == len(state.timesteps) - 1, + step_1_output, + step_23_output, + ), + ) + + else: + prev_sample = jax.lax.select( + state.lower_order_nums < 1, + step_1_output, + step_23_output, + ) + + state = state.replace( + lower_order_nums=jnp.minimum(state.lower_order_nums + 1, self.config.solver_order), + ) + + if not return_dict: + return (prev_sample, state) + + return FlaxDPMSolverMultistepSchedulerOutput(prev_sample=prev_sample, state=state) + + def scale_model_input( + self, state: DPMSolverMultistepSchedulerState, sample: jnp.ndarray, timestep: Optional[int] = None + ) -> jnp.ndarray: + """ + Ensures interchangeability with schedulers that need to scale the denoising model input depending on the + current timestep. + + Args: + state (`DPMSolverMultistepSchedulerState`): + the `FlaxDPMSolverMultistepScheduler` state data class instance. 
+ sample (`jnp.ndarray`): input sample + timestep (`int`, optional): current timestep + + Returns: + `jnp.ndarray`: scaled input sample + """ + return sample + + def add_noise( + self, + state: DPMSolverMultistepSchedulerState, + original_samples: jnp.ndarray, + noise: jnp.ndarray, + timesteps: jnp.ndarray, + ) -> jnp.ndarray: + return add_noise_common(state.common, original_samples, noise, timesteps) + + def __len__(self): + return self.config.num_train_timesteps diff --git a/gf/sd-webui-rich-text/diffusers_official/schedulers/scheduling_dpmsolver_multistep_inverse.py b/gf/sd-webui-rich-text/diffusers_official/schedulers/scheduling_dpmsolver_multistep_inverse.py new file mode 100644 index 0000000000000000000000000000000000000000..a6736b3544191c689a4acaf3b0174f1e7c7305a2 --- /dev/null +++ b/gf/sd-webui-rich-text/diffusers_official/schedulers/scheduling_dpmsolver_multistep_inverse.py @@ -0,0 +1,716 @@ +# Copyright 2023 TSAIL Team and The HuggingFace Team. All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +# DISCLAIMER: This file is strongly influenced by https://github.com/LuChengTHU/dpm-solver + +import math +from typing import List, Optional, Tuple, Union + +import numpy as np +import torch + +from ..configuration_utils import ConfigMixin, register_to_config +from ..utils import randn_tensor +from .scheduling_utils import KarrasDiffusionSchedulers, SchedulerMixin, SchedulerOutput + + +# Copied from diffusers.schedulers.scheduling_ddpm.betas_for_alpha_bar +def betas_for_alpha_bar( + num_diffusion_timesteps, + max_beta=0.999, + alpha_transform_type="cosine", +): + """ + Create a beta schedule that discretizes the given alpha_t_bar function, which defines the cumulative product of + (1-beta) over time from t = [0,1]. + + Contains a function alpha_bar that takes an argument t and transforms it to the cumulative product of (1-beta) up + to that part of the diffusion process. + + + Args: + num_diffusion_timesteps (`int`): the number of betas to produce. + max_beta (`float`): the maximum beta to use; use values lower than 1 to + prevent singularities. + alpha_transform_type (`str`, *optional*, default to `cosine`): the type of noise schedule for alpha_bar. 
+ Choose from `cosine` or `exp` + + Returns: + betas (`np.ndarray`): the betas used by the scheduler to step the model outputs + """ + if alpha_transform_type == "cosine": + + def alpha_bar_fn(t): + return math.cos((t + 0.008) / 1.008 * math.pi / 2) ** 2 + + elif alpha_transform_type == "exp": + + def alpha_bar_fn(t): + return math.exp(t * -12.0) + + else: + raise ValueError(f"Unsupported alpha_tranform_type: {alpha_transform_type}") + + betas = [] + for i in range(num_diffusion_timesteps): + t1 = i / num_diffusion_timesteps + t2 = (i + 1) / num_diffusion_timesteps + betas.append(min(1 - alpha_bar_fn(t2) / alpha_bar_fn(t1), max_beta)) + return torch.tensor(betas, dtype=torch.float32) + + +class DPMSolverMultistepInverseScheduler(SchedulerMixin, ConfigMixin): + """ + DPMSolverMultistepInverseScheduler is the reverse scheduler of [`DPMSolverMultistepScheduler`]. + + We also support the "dynamic thresholding" method in Imagen (https://arxiv.org/abs/2205.11487). For pixel-space + diffusion models, you can set both `algorithm_type="dpmsolver++"` and `thresholding=True` to use the dynamic + thresholding. Note that the thresholding method is unsuitable for latent-space diffusion models (such as + stable-diffusion). + + [`~ConfigMixin`] takes care of storing all config attributes that are passed in the scheduler's `__init__` + function, such as `num_train_timesteps`. They can be accessed via `scheduler.config.num_train_timesteps`. + [`SchedulerMixin`] provides general loading and saving functionality via the [`SchedulerMixin.save_pretrained`] and + [`~SchedulerMixin.from_pretrained`] functions. + + Args: + num_train_timesteps (`int`): number of diffusion steps used to train the model. + beta_start (`float`): the starting `beta` value of inference. + beta_end (`float`): the final `beta` value. + beta_schedule (`str`): + the beta schedule, a mapping from a beta range to a sequence of betas for stepping the model. Choose from + `linear`, `scaled_linear`, or `squaredcos_cap_v2`. + trained_betas (`np.ndarray`, optional): + option to pass an array of betas directly to the constructor to bypass `beta_start`, `beta_end` etc. + solver_order (`int`, default `2`): + the order of DPM-Solver; can be `1` or `2` or `3`. We recommend to use `solver_order=2` for guided + sampling, and `solver_order=3` for unconditional sampling. + prediction_type (`str`, default `epsilon`, optional): + prediction type of the scheduler function, one of `epsilon` (predicting the noise of the diffusion + process), `sample` (directly predicting the noisy sample`) or `v_prediction` (see section 2.4 + https://imagen.research.google/video/paper.pdf) + thresholding (`bool`, default `False`): + whether to use the "dynamic thresholding" method (introduced by Imagen, https://arxiv.org/abs/2205.11487). + For pixel-space diffusion models, you can set both `algorithm_type=dpmsolver++` and `thresholding=True` to + use the dynamic thresholding. Note that the thresholding method is unsuitable for latent-space diffusion + models (such as stable-diffusion). + dynamic_thresholding_ratio (`float`, default `0.995`): + the ratio for the dynamic thresholding method. Default is `0.995`, the same as Imagen + (https://arxiv.org/abs/2205.11487). + sample_max_value (`float`, default `1.0`): + the threshold value for dynamic thresholding. Valid only when `thresholding=True` and + `algorithm_type="dpmsolver++`. + algorithm_type (`str`, default `dpmsolver++`): + the algorithm type for the solver. 
Either `dpmsolver` or `dpmsolver++` or `sde-dpmsolver` or + `sde-dpmsolver++`. The `dpmsolver` type implements the algorithms in https://arxiv.org/abs/2206.00927, and + the `dpmsolver++` type implements the algorithms in https://arxiv.org/abs/2211.01095. We recommend to use + `dpmsolver++` or `sde-dpmsolver++` with `solver_order=2` for guided sampling (e.g. stable-diffusion). + solver_type (`str`, default `midpoint`): + the solver type for the second-order solver. Either `midpoint` or `heun`. The solver type slightly affects + the sample quality, especially for small number of steps. We empirically find that `midpoint` solvers are + slightly better, so we recommend to use the `midpoint` type. + lower_order_final (`bool`, default `True`): + whether to use lower-order solvers in the final steps. Only valid for < 15 inference steps. We empirically + find this trick can stabilize the sampling of DPM-Solver for steps < 15, especially for steps <= 10. + use_karras_sigmas (`bool`, *optional*, defaults to `False`): + This parameter controls whether to use Karras sigmas (Karras et al. (2022) scheme) for step sizes in the + noise schedule during the sampling process. If True, the sigmas will be determined according to a sequence + of noise levels {σi} as defined in Equation (5) of the paper https://arxiv.org/pdf/2206.00364.pdf. + lambda_min_clipped (`float`, default `-inf`): + the clipping threshold for the minimum value of lambda(t) for numerical stability. This is critical for + cosine (squaredcos_cap_v2) noise schedule. + variance_type (`str`, *optional*): + Set to "learned" or "learned_range" for diffusion models that predict variance. For example, OpenAI's + guided-diffusion (https://github.com/openai/guided-diffusion) predicts both mean and variance of the + Gaussian distribution in the model's output. DPM-Solver only needs the "mean" output because it is based on + diffusion ODEs. whether the model's output contains the predicted Gaussian variance. For example, OpenAI's + guided-diffusion (https://github.com/openai/guided-diffusion) predicts both mean and variance of the + Gaussian distribution in the model's output. DPM-Solver only needs the "mean" output because it is based on + diffusion ODEs. + """ + + _compatibles = [e.name for e in KarrasDiffusionSchedulers] + order = 1 + + @register_to_config + def __init__( + self, + num_train_timesteps: int = 1000, + beta_start: float = 0.0001, + beta_end: float = 0.02, + beta_schedule: str = "linear", + trained_betas: Optional[Union[np.ndarray, List[float]]] = None, + solver_order: int = 2, + prediction_type: str = "epsilon", + thresholding: bool = False, + dynamic_thresholding_ratio: float = 0.995, + sample_max_value: float = 1.0, + algorithm_type: str = "dpmsolver++", + solver_type: str = "midpoint", + lower_order_final: bool = True, + use_karras_sigmas: Optional[bool] = False, + lambda_min_clipped: float = -float("inf"), + variance_type: Optional[str] = None, + ): + if trained_betas is not None: + self.betas = torch.tensor(trained_betas, dtype=torch.float32) + elif beta_schedule == "linear": + self.betas = torch.linspace(beta_start, beta_end, num_train_timesteps, dtype=torch.float32) + elif beta_schedule == "scaled_linear": + # this schedule is very specific to the latent diffusion model. 
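+            # "scaled_linear" is linear in sqrt(beta) space (the schedule used by Stable
+            # Diffusion): a linspace between beta_start**0.5 and beta_end**0.5, squared.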
+ self.betas = ( + torch.linspace(beta_start**0.5, beta_end**0.5, num_train_timesteps, dtype=torch.float32) ** 2 + ) + elif beta_schedule == "squaredcos_cap_v2": + # Glide cosine schedule + self.betas = betas_for_alpha_bar(num_train_timesteps) + else: + raise NotImplementedError(f"{beta_schedule} does is not implemented for {self.__class__}") + + self.alphas = 1.0 - self.betas + self.alphas_cumprod = torch.cumprod(self.alphas, dim=0) + # Currently we only support VP-type noise schedule + self.alpha_t = torch.sqrt(self.alphas_cumprod) + self.sigma_t = torch.sqrt(1 - self.alphas_cumprod) + self.lambda_t = torch.log(self.alpha_t) - torch.log(self.sigma_t) + + # standard deviation of the initial noise distribution + self.init_noise_sigma = 1.0 + + # settings for DPM-Solver + if algorithm_type not in ["dpmsolver", "dpmsolver++", "sde-dpmsolver", "sde-dpmsolver++"]: + if algorithm_type == "deis": + self.register_to_config(algorithm_type="dpmsolver++") + else: + raise NotImplementedError(f"{algorithm_type} does is not implemented for {self.__class__}") + + if solver_type not in ["midpoint", "heun"]: + if solver_type in ["logrho", "bh1", "bh2"]: + self.register_to_config(solver_type="midpoint") + else: + raise NotImplementedError(f"{solver_type} does is not implemented for {self.__class__}") + + # setable values + self.num_inference_steps = None + timesteps = np.linspace(0, num_train_timesteps - 1, num_train_timesteps, dtype=np.float32).copy() + self.timesteps = torch.from_numpy(timesteps) + self.model_outputs = [None] * solver_order + self.lower_order_nums = 0 + self.use_karras_sigmas = use_karras_sigmas + + def set_timesteps(self, num_inference_steps: int = None, device: Union[str, torch.device] = None): + """ + Sets the timesteps used for the diffusion chain. Supporting function to be run before inference. + + Args: + num_inference_steps (`int`): + the number of diffusion steps used when generating samples with a pre-trained model. + device (`str` or `torch.device`, optional): + the device to which the timesteps should be moved to. If `None`, the timesteps are not moved. + """ + # Clipping the minimum of all lambda(t) for numerical stability. + # This is critical for cosine (squaredcos_cap_v2) noise schedule. + clipped_idx = torch.searchsorted(torch.flip(self.lambda_t, [0]), self.lambda_min_clipped) + self.noisiest_timestep = self.config.num_train_timesteps - 1 - clipped_idx + timesteps = ( + np.linspace(0, self.noisiest_timestep, num_inference_steps + 1).round()[:-1].copy().astype(np.int64) + ) + + if self.use_karras_sigmas: + sigmas = np.array(((1 - self.alphas_cumprod) / self.alphas_cumprod) ** 0.5) + log_sigmas = np.log(sigmas) + sigmas = self._convert_to_karras(in_sigmas=sigmas, num_inference_steps=num_inference_steps) + timesteps = np.array([self._sigma_to_t(sigma, log_sigmas) for sigma in sigmas]).round() + timesteps = timesteps.copy().astype(np.int64) + + # when num_inference_steps == num_train_timesteps, we can end up with + # duplicates in timesteps. 
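+        # np.unique returns the index of the first occurrence of each value; sorting those
+        # indices preserves the original timestep ordering while dropping the duplicates.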
+ _, unique_indices = np.unique(timesteps, return_index=True) + timesteps = timesteps[np.sort(unique_indices)] + + self.timesteps = torch.from_numpy(timesteps).to(device) + + self.num_inference_steps = len(timesteps) + + self.model_outputs = [ + None, + ] * self.config.solver_order + self.lower_order_nums = 0 + + # Copied from diffusers.schedulers.scheduling_ddpm.DDPMScheduler._threshold_sample + def _threshold_sample(self, sample: torch.FloatTensor) -> torch.FloatTensor: + """ + "Dynamic thresholding: At each sampling step we set s to a certain percentile absolute pixel value in xt0 (the + prediction of x_0 at timestep t), and if s > 1, then we threshold xt0 to the range [-s, s] and then divide by + s. Dynamic thresholding pushes saturated pixels (those near -1 and 1) inwards, thereby actively preventing + pixels from saturation at each step. We find that dynamic thresholding results in significantly better + photorealism as well as better image-text alignment, especially when using very large guidance weights." + + https://arxiv.org/abs/2205.11487 + """ + dtype = sample.dtype + batch_size, channels, height, width = sample.shape + + if dtype not in (torch.float32, torch.float64): + sample = sample.float() # upcast for quantile calculation, and clamp not implemented for cpu half + + # Flatten sample for doing quantile calculation along each image + sample = sample.reshape(batch_size, channels * height * width) + + abs_sample = sample.abs() # "a certain percentile absolute pixel value" + + s = torch.quantile(abs_sample, self.config.dynamic_thresholding_ratio, dim=1) + s = torch.clamp( + s, min=1, max=self.config.sample_max_value + ) # When clamped to min=1, equivalent to standard clipping to [-1, 1] + + s = s.unsqueeze(1) # (batch_size, 1) because clamp will broadcast along dim=0 + sample = torch.clamp(sample, -s, s) / s # "we threshold xt0 to the range [-s, s] and then divide by s" + + sample = sample.reshape(batch_size, channels, height, width) + sample = sample.to(dtype) + + return sample + + # Copied from diffusers.schedulers.scheduling_euler_discrete.EulerDiscreteScheduler._sigma_to_t + def _sigma_to_t(self, sigma, log_sigmas): + # get log sigma + log_sigma = np.log(sigma) + + # get distribution + dists = log_sigma - log_sigmas[:, np.newaxis] + + # get sigmas range + low_idx = np.cumsum((dists >= 0), axis=0).argmax(axis=0).clip(max=log_sigmas.shape[0] - 2) + high_idx = low_idx + 1 + + low = log_sigmas[low_idx] + high = log_sigmas[high_idx] + + # interpolate sigmas + w = (low - log_sigma) / (low - high) + w = np.clip(w, 0, 1) + + # transform interpolation to time range + t = (1 - w) * low_idx + w * high_idx + t = t.reshape(sigma.shape) + return t + + # Copied from diffusers.schedulers.scheduling_euler_discrete.EulerDiscreteScheduler._convert_to_karras + def _convert_to_karras(self, in_sigmas: torch.FloatTensor, num_inference_steps) -> torch.FloatTensor: + """Constructs the noise schedule of Karras et al. 
(2022).""" + + sigma_min: float = in_sigmas[-1].item() + sigma_max: float = in_sigmas[0].item() + + rho = 7.0 # 7.0 is the value used in the paper + ramp = np.linspace(0, 1, num_inference_steps) + min_inv_rho = sigma_min ** (1 / rho) + max_inv_rho = sigma_max ** (1 / rho) + sigmas = (max_inv_rho + ramp * (min_inv_rho - max_inv_rho)) ** rho + return sigmas + + # Copied from diffusers.schedulers.scheduling_dpmsolver_multistep.DPMSolverMultistepScheduler.convert_model_output + def convert_model_output( + self, model_output: torch.FloatTensor, timestep: int, sample: torch.FloatTensor + ) -> torch.FloatTensor: + """ + Convert the model output to the corresponding type that the algorithm (DPM-Solver / DPM-Solver++) needs. + + DPM-Solver is designed to discretize an integral of the noise prediction model, and DPM-Solver++ is designed to + discretize an integral of the data prediction model. So we need to first convert the model output to the + corresponding type to match the algorithm. + + Note that the algorithm type and the model type is decoupled. That is to say, we can use either DPM-Solver or + DPM-Solver++ for both noise prediction model and data prediction model. + + Args: + model_output (`torch.FloatTensor`): direct output from learned diffusion model. + timestep (`int`): current discrete timestep in the diffusion chain. + sample (`torch.FloatTensor`): + current instance of sample being created by diffusion process. + + Returns: + `torch.FloatTensor`: the converted model output. + """ + + # DPM-Solver++ needs to solve an integral of the data prediction model. + if self.config.algorithm_type in ["dpmsolver++", "sde-dpmsolver++"]: + if self.config.prediction_type == "epsilon": + # DPM-Solver and DPM-Solver++ only need the "mean" output. + if self.config.variance_type in ["learned", "learned_range"]: + model_output = model_output[:, :3] + alpha_t, sigma_t = self.alpha_t[timestep], self.sigma_t[timestep] + x0_pred = (sample - sigma_t * model_output) / alpha_t + elif self.config.prediction_type == "sample": + x0_pred = model_output + elif self.config.prediction_type == "v_prediction": + alpha_t, sigma_t = self.alpha_t[timestep], self.sigma_t[timestep] + x0_pred = alpha_t * sample - sigma_t * model_output + else: + raise ValueError( + f"prediction_type given as {self.config.prediction_type} must be one of `epsilon`, `sample`, or" + " `v_prediction` for the DPMSolverMultistepScheduler." + ) + + if self.config.thresholding: + x0_pred = self._threshold_sample(x0_pred) + + return x0_pred + + # DPM-Solver needs to solve an integral of the noise prediction model. + elif self.config.algorithm_type in ["dpmsolver", "sde-dpmsolver"]: + if self.config.prediction_type == "epsilon": + # DPM-Solver and DPM-Solver++ only need the "mean" output. + if self.config.variance_type in ["learned", "learned_range"]: + epsilon = model_output[:, :3] + else: + epsilon = model_output + elif self.config.prediction_type == "sample": + alpha_t, sigma_t = self.alpha_t[timestep], self.sigma_t[timestep] + epsilon = (sample - alpha_t * model_output) / sigma_t + elif self.config.prediction_type == "v_prediction": + alpha_t, sigma_t = self.alpha_t[timestep], self.sigma_t[timestep] + epsilon = alpha_t * model_output + sigma_t * sample + else: + raise ValueError( + f"prediction_type given as {self.config.prediction_type} must be one of `epsilon`, `sample`, or" + " `v_prediction` for the DPMSolverMultistepScheduler." 
+ ) + + if self.config.thresholding: + alpha_t, sigma_t = self.alpha_t[timestep], self.sigma_t[timestep] + x0_pred = (sample - sigma_t * epsilon) / alpha_t + x0_pred = self._threshold_sample(x0_pred) + epsilon = (sample - alpha_t * x0_pred) / sigma_t + + return epsilon + + # Copied from diffusers.schedulers.scheduling_dpmsolver_multistep.DPMSolverMultistepScheduler.dpm_solver_first_order_update + def dpm_solver_first_order_update( + self, + model_output: torch.FloatTensor, + timestep: int, + prev_timestep: int, + sample: torch.FloatTensor, + noise: Optional[torch.FloatTensor] = None, + ) -> torch.FloatTensor: + """ + One step for the first-order DPM-Solver (equivalent to DDIM). + + See https://arxiv.org/abs/2206.00927 for the detailed derivation. + + Args: + model_output (`torch.FloatTensor`): direct output from learned diffusion model. + timestep (`int`): current discrete timestep in the diffusion chain. + prev_timestep (`int`): previous discrete timestep in the diffusion chain. + sample (`torch.FloatTensor`): + current instance of sample being created by diffusion process. + + Returns: + `torch.FloatTensor`: the sample tensor at the previous timestep. + """ + lambda_t, lambda_s = self.lambda_t[prev_timestep], self.lambda_t[timestep] + alpha_t, alpha_s = self.alpha_t[prev_timestep], self.alpha_t[timestep] + sigma_t, sigma_s = self.sigma_t[prev_timestep], self.sigma_t[timestep] + h = lambda_t - lambda_s + if self.config.algorithm_type == "dpmsolver++": + x_t = (sigma_t / sigma_s) * sample - (alpha_t * (torch.exp(-h) - 1.0)) * model_output + elif self.config.algorithm_type == "dpmsolver": + x_t = (alpha_t / alpha_s) * sample - (sigma_t * (torch.exp(h) - 1.0)) * model_output + elif self.config.algorithm_type == "sde-dpmsolver++": + assert noise is not None + x_t = ( + (sigma_t / sigma_s * torch.exp(-h)) * sample + + (alpha_t * (1 - torch.exp(-2.0 * h))) * model_output + + sigma_t * torch.sqrt(1.0 - torch.exp(-2 * h)) * noise + ) + elif self.config.algorithm_type == "sde-dpmsolver": + assert noise is not None + x_t = ( + (alpha_t / alpha_s) * sample + - 2.0 * (sigma_t * (torch.exp(h) - 1.0)) * model_output + + sigma_t * torch.sqrt(torch.exp(2 * h) - 1.0) * noise + ) + return x_t + + # Copied from diffusers.schedulers.scheduling_dpmsolver_multistep.DPMSolverMultistepScheduler.multistep_dpm_solver_second_order_update + def multistep_dpm_solver_second_order_update( + self, + model_output_list: List[torch.FloatTensor], + timestep_list: List[int], + prev_timestep: int, + sample: torch.FloatTensor, + noise: Optional[torch.FloatTensor] = None, + ) -> torch.FloatTensor: + """ + One step for the second-order multistep DPM-Solver. + + Args: + model_output_list (`List[torch.FloatTensor]`): + direct outputs from learned diffusion model at current and latter timesteps. + timestep (`int`): current and latter discrete timestep in the diffusion chain. + prev_timestep (`int`): previous discrete timestep in the diffusion chain. + sample (`torch.FloatTensor`): + current instance of sample being created by diffusion process. + + Returns: + `torch.FloatTensor`: the sample tensor at the previous timestep. 
+ """ + t, s0, s1 = prev_timestep, timestep_list[-1], timestep_list[-2] + m0, m1 = model_output_list[-1], model_output_list[-2] + lambda_t, lambda_s0, lambda_s1 = self.lambda_t[t], self.lambda_t[s0], self.lambda_t[s1] + alpha_t, alpha_s0 = self.alpha_t[t], self.alpha_t[s0] + sigma_t, sigma_s0 = self.sigma_t[t], self.sigma_t[s0] + h, h_0 = lambda_t - lambda_s0, lambda_s0 - lambda_s1 + r0 = h_0 / h + D0, D1 = m0, (1.0 / r0) * (m0 - m1) + if self.config.algorithm_type == "dpmsolver++": + # See https://arxiv.org/abs/2211.01095 for detailed derivations + if self.config.solver_type == "midpoint": + x_t = ( + (sigma_t / sigma_s0) * sample + - (alpha_t * (torch.exp(-h) - 1.0)) * D0 + - 0.5 * (alpha_t * (torch.exp(-h) - 1.0)) * D1 + ) + elif self.config.solver_type == "heun": + x_t = ( + (sigma_t / sigma_s0) * sample + - (alpha_t * (torch.exp(-h) - 1.0)) * D0 + + (alpha_t * ((torch.exp(-h) - 1.0) / h + 1.0)) * D1 + ) + elif self.config.algorithm_type == "dpmsolver": + # See https://arxiv.org/abs/2206.00927 for detailed derivations + if self.config.solver_type == "midpoint": + x_t = ( + (alpha_t / alpha_s0) * sample + - (sigma_t * (torch.exp(h) - 1.0)) * D0 + - 0.5 * (sigma_t * (torch.exp(h) - 1.0)) * D1 + ) + elif self.config.solver_type == "heun": + x_t = ( + (alpha_t / alpha_s0) * sample + - (sigma_t * (torch.exp(h) - 1.0)) * D0 + - (sigma_t * ((torch.exp(h) - 1.0) / h - 1.0)) * D1 + ) + elif self.config.algorithm_type == "sde-dpmsolver++": + assert noise is not None + if self.config.solver_type == "midpoint": + x_t = ( + (sigma_t / sigma_s0 * torch.exp(-h)) * sample + + (alpha_t * (1 - torch.exp(-2.0 * h))) * D0 + + 0.5 * (alpha_t * (1 - torch.exp(-2.0 * h))) * D1 + + sigma_t * torch.sqrt(1.0 - torch.exp(-2 * h)) * noise + ) + elif self.config.solver_type == "heun": + x_t = ( + (sigma_t / sigma_s0 * torch.exp(-h)) * sample + + (alpha_t * (1 - torch.exp(-2.0 * h))) * D0 + + (alpha_t * ((1.0 - torch.exp(-2.0 * h)) / (-2.0 * h) + 1.0)) * D1 + + sigma_t * torch.sqrt(1.0 - torch.exp(-2 * h)) * noise + ) + elif self.config.algorithm_type == "sde-dpmsolver": + assert noise is not None + if self.config.solver_type == "midpoint": + x_t = ( + (alpha_t / alpha_s0) * sample + - 2.0 * (sigma_t * (torch.exp(h) - 1.0)) * D0 + - (sigma_t * (torch.exp(h) - 1.0)) * D1 + + sigma_t * torch.sqrt(torch.exp(2 * h) - 1.0) * noise + ) + elif self.config.solver_type == "heun": + x_t = ( + (alpha_t / alpha_s0) * sample + - 2.0 * (sigma_t * (torch.exp(h) - 1.0)) * D0 + - 2.0 * (sigma_t * ((torch.exp(h) - 1.0) / h - 1.0)) * D1 + + sigma_t * torch.sqrt(torch.exp(2 * h) - 1.0) * noise + ) + return x_t + + # Copied from diffusers.schedulers.scheduling_dpmsolver_multistep.DPMSolverMultistepScheduler.multistep_dpm_solver_third_order_update + def multistep_dpm_solver_third_order_update( + self, + model_output_list: List[torch.FloatTensor], + timestep_list: List[int], + prev_timestep: int, + sample: torch.FloatTensor, + ) -> torch.FloatTensor: + """ + One step for the third-order multistep DPM-Solver. + + Args: + model_output_list (`List[torch.FloatTensor]`): + direct outputs from learned diffusion model at current and latter timesteps. + timestep (`int`): current and latter discrete timestep in the diffusion chain. + prev_timestep (`int`): previous discrete timestep in the diffusion chain. + sample (`torch.FloatTensor`): + current instance of sample being created by diffusion process. + + Returns: + `torch.FloatTensor`: the sample tensor at the previous timestep. 
+ """ + t, s0, s1, s2 = prev_timestep, timestep_list[-1], timestep_list[-2], timestep_list[-3] + m0, m1, m2 = model_output_list[-1], model_output_list[-2], model_output_list[-3] + lambda_t, lambda_s0, lambda_s1, lambda_s2 = ( + self.lambda_t[t], + self.lambda_t[s0], + self.lambda_t[s1], + self.lambda_t[s2], + ) + alpha_t, alpha_s0 = self.alpha_t[t], self.alpha_t[s0] + sigma_t, sigma_s0 = self.sigma_t[t], self.sigma_t[s0] + h, h_0, h_1 = lambda_t - lambda_s0, lambda_s0 - lambda_s1, lambda_s1 - lambda_s2 + r0, r1 = h_0 / h, h_1 / h + D0 = m0 + D1_0, D1_1 = (1.0 / r0) * (m0 - m1), (1.0 / r1) * (m1 - m2) + D1 = D1_0 + (r0 / (r0 + r1)) * (D1_0 - D1_1) + D2 = (1.0 / (r0 + r1)) * (D1_0 - D1_1) + if self.config.algorithm_type == "dpmsolver++": + # See https://arxiv.org/abs/2206.00927 for detailed derivations + x_t = ( + (sigma_t / sigma_s0) * sample + - (alpha_t * (torch.exp(-h) - 1.0)) * D0 + + (alpha_t * ((torch.exp(-h) - 1.0) / h + 1.0)) * D1 + - (alpha_t * ((torch.exp(-h) - 1.0 + h) / h**2 - 0.5)) * D2 + ) + elif self.config.algorithm_type == "dpmsolver": + # See https://arxiv.org/abs/2206.00927 for detailed derivations + x_t = ( + (alpha_t / alpha_s0) * sample + - (sigma_t * (torch.exp(h) - 1.0)) * D0 + - (sigma_t * ((torch.exp(h) - 1.0) / h - 1.0)) * D1 + - (sigma_t * ((torch.exp(h) - 1.0 - h) / h**2 - 0.5)) * D2 + ) + return x_t + + def step( + self, + model_output: torch.FloatTensor, + timestep: int, + sample: torch.FloatTensor, + generator=None, + return_dict: bool = True, + ) -> Union[SchedulerOutput, Tuple]: + """ + Step function propagating the sample with the multistep DPM-Solver. + + Args: + model_output (`torch.FloatTensor`): direct output from learned diffusion model. + timestep (`int`): current discrete timestep in the diffusion chain. + sample (`torch.FloatTensor`): + current instance of sample being created by diffusion process. + return_dict (`bool`): option for returning tuple rather than SchedulerOutput class + + Returns: + [`~scheduling_utils.SchedulerOutput`] or `tuple`: [`~scheduling_utils.SchedulerOutput`] if `return_dict` is + True, otherwise a `tuple`. When returning a tuple, the first element is the sample tensor. 
+ + """ + if self.num_inference_steps is None: + raise ValueError( + "Number of inference steps is 'None', you need to run 'set_timesteps' after creating the scheduler" + ) + + if isinstance(timestep, torch.Tensor): + timestep = timestep.to(self.timesteps.device) + step_index = (self.timesteps == timestep).nonzero() + if len(step_index) == 0: + step_index = len(self.timesteps) - 1 + else: + step_index = step_index.item() + prev_timestep = ( + self.noisiest_timestep if step_index == len(self.timesteps) - 1 else self.timesteps[step_index + 1] + ) + lower_order_final = ( + (step_index == len(self.timesteps) - 1) and self.config.lower_order_final and len(self.timesteps) < 15 + ) + lower_order_second = ( + (step_index == len(self.timesteps) - 2) and self.config.lower_order_final and len(self.timesteps) < 15 + ) + + model_output = self.convert_model_output(model_output, timestep, sample) + for i in range(self.config.solver_order - 1): + self.model_outputs[i] = self.model_outputs[i + 1] + self.model_outputs[-1] = model_output + + if self.config.algorithm_type in ["sde-dpmsolver", "sde-dpmsolver++"]: + noise = randn_tensor( + model_output.shape, generator=generator, device=model_output.device, dtype=model_output.dtype + ) + else: + noise = None + + if self.config.solver_order == 1 or self.lower_order_nums < 1 or lower_order_final: + prev_sample = self.dpm_solver_first_order_update( + model_output, timestep, prev_timestep, sample, noise=noise + ) + elif self.config.solver_order == 2 or self.lower_order_nums < 2 or lower_order_second: + timestep_list = [self.timesteps[step_index - 1], timestep] + prev_sample = self.multistep_dpm_solver_second_order_update( + self.model_outputs, timestep_list, prev_timestep, sample, noise=noise + ) + else: + timestep_list = [self.timesteps[step_index - 2], self.timesteps[step_index - 1], timestep] + prev_sample = self.multistep_dpm_solver_third_order_update( + self.model_outputs, timestep_list, prev_timestep, sample + ) + + if self.lower_order_nums < self.config.solver_order: + self.lower_order_nums += 1 + + if not return_dict: + return (prev_sample,) + + return SchedulerOutput(prev_sample=prev_sample) + + # Copied from diffusers.schedulers.scheduling_dpmsolver_multistep.DPMSolverMultistepScheduler.scale_model_input + def scale_model_input(self, sample: torch.FloatTensor, *args, **kwargs) -> torch.FloatTensor: + """ + Ensures interchangeability with schedulers that need to scale the denoising model input depending on the + current timestep. 
+ + Args: + sample (`torch.FloatTensor`): input sample + + Returns: + `torch.FloatTensor`: scaled input sample + """ + return sample + + # Copied from diffusers.schedulers.scheduling_ddpm.DDPMScheduler.add_noise + def add_noise( + self, + original_samples: torch.FloatTensor, + noise: torch.FloatTensor, + timesteps: torch.IntTensor, + ) -> torch.FloatTensor: + # Make sure alphas_cumprod and timestep have same device and dtype as original_samples + alphas_cumprod = self.alphas_cumprod.to(device=original_samples.device, dtype=original_samples.dtype) + timesteps = timesteps.to(original_samples.device) + + sqrt_alpha_prod = alphas_cumprod[timesteps] ** 0.5 + sqrt_alpha_prod = sqrt_alpha_prod.flatten() + while len(sqrt_alpha_prod.shape) < len(original_samples.shape): + sqrt_alpha_prod = sqrt_alpha_prod.unsqueeze(-1) + + sqrt_one_minus_alpha_prod = (1 - alphas_cumprod[timesteps]) ** 0.5 + sqrt_one_minus_alpha_prod = sqrt_one_minus_alpha_prod.flatten() + while len(sqrt_one_minus_alpha_prod.shape) < len(original_samples.shape): + sqrt_one_minus_alpha_prod = sqrt_one_minus_alpha_prod.unsqueeze(-1) + + noisy_samples = sqrt_alpha_prod * original_samples + sqrt_one_minus_alpha_prod * noise + return noisy_samples + + def __len__(self): + return self.config.num_train_timesteps diff --git a/gf/sd-webui-rich-text/diffusers_official/schedulers/scheduling_dpmsolver_sde.py b/gf/sd-webui-rich-text/diffusers_official/schedulers/scheduling_dpmsolver_sde.py new file mode 100644 index 0000000000000000000000000000000000000000..a31e97b6965169823634afe8984866a9f7d03ba3 --- /dev/null +++ b/gf/sd-webui-rich-text/diffusers_official/schedulers/scheduling_dpmsolver_sde.py @@ -0,0 +1,509 @@ +# Copyright 2023 Katherine Crowson, The HuggingFace Team and hlky. All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +import math +from collections import defaultdict +from typing import List, Optional, Tuple, Union + +import numpy as np +import torch +import torchsde + +from ..configuration_utils import ConfigMixin, register_to_config +from .scheduling_utils import KarrasDiffusionSchedulers, SchedulerMixin, SchedulerOutput + + +class BatchedBrownianTree: + """A wrapper around torchsde.BrownianTree that enables batches of entropy.""" + + def __init__(self, x, t0, t1, seed=None, **kwargs): + t0, t1, self.sign = self.sort(t0, t1) + w0 = kwargs.get("w0", torch.zeros_like(x)) + if seed is None: + seed = torch.randint(0, 2**63 - 1, []).item() + self.batched = True + try: + assert len(seed) == x.shape[0] + w0 = w0[0] + except TypeError: + seed = [seed] + self.batched = False + self.trees = [torchsde.BrownianTree(t0, w0, t1, entropy=s, **kwargs) for s in seed] + + @staticmethod + def sort(a, b): + return (a, b, 1) if a < b else (b, a, -1) + + def __call__(self, t0, t1): + t0, t1, sign = self.sort(t0, t1) + w = torch.stack([tree(t0, t1) for tree in self.trees]) * (self.sign * sign) + return w if self.batched else w[0] + + +class BrownianTreeNoiseSampler: + """A noise sampler backed by a torchsde.BrownianTree. 
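+    Calls return standardized increments `tree(t0, t1) / sqrt(|t1 - t0|)`, so for a fixed seed the injected noise is
+    drawn from a single underlying Brownian path rather than from independent draws at each step.
+
+    Example (illustrative sketch; the shape, sigma values and seed are arbitrary):
+
+    ```py
+    import torch
+
+    x = torch.randn(2, 4, 64, 64)
+    sampler = BrownianTreeNoiseSampler(x, sigma_min=0.03, sigma_max=14.6, seed=0)
+    noise = sampler(14.6, 10.0)  # tensor with the same shape as x
+    ```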
+ + Args: + x (Tensor): The tensor whose shape, device and dtype to use to generate + random samples. + sigma_min (float): The low end of the valid interval. + sigma_max (float): The high end of the valid interval. + seed (int or List[int]): The random seed. If a list of seeds is + supplied instead of a single integer, then the noise sampler will use one BrownianTree per batch item, each + with its own seed. + transform (callable): A function that maps sigma to the sampler's + internal timestep. + """ + + def __init__(self, x, sigma_min, sigma_max, seed=None, transform=lambda x: x): + self.transform = transform + t0, t1 = self.transform(torch.as_tensor(sigma_min)), self.transform(torch.as_tensor(sigma_max)) + self.tree = BatchedBrownianTree(x, t0, t1, seed) + + def __call__(self, sigma, sigma_next): + t0, t1 = self.transform(torch.as_tensor(sigma)), self.transform(torch.as_tensor(sigma_next)) + return self.tree(t0, t1) / (t1 - t0).abs().sqrt() + + +# Copied from diffusers.schedulers.scheduling_ddpm.betas_for_alpha_bar +def betas_for_alpha_bar( + num_diffusion_timesteps, + max_beta=0.999, + alpha_transform_type="cosine", +): + """ + Create a beta schedule that discretizes the given alpha_t_bar function, which defines the cumulative product of + (1-beta) over time from t = [0,1]. + + Contains a function alpha_bar that takes an argument t and transforms it to the cumulative product of (1-beta) up + to that part of the diffusion process. + + + Args: + num_diffusion_timesteps (`int`): the number of betas to produce. + max_beta (`float`): the maximum beta to use; use values lower than 1 to + prevent singularities. + alpha_transform_type (`str`, *optional*, default to `cosine`): the type of noise schedule for alpha_bar. + Choose from `cosine` or `exp` + + Returns: + betas (`np.ndarray`): the betas used by the scheduler to step the model outputs + """ + if alpha_transform_type == "cosine": + + def alpha_bar_fn(t): + return math.cos((t + 0.008) / 1.008 * math.pi / 2) ** 2 + + elif alpha_transform_type == "exp": + + def alpha_bar_fn(t): + return math.exp(t * -12.0) + + else: + raise ValueError(f"Unsupported alpha_tranform_type: {alpha_transform_type}") + + betas = [] + for i in range(num_diffusion_timesteps): + t1 = i / num_diffusion_timesteps + t2 = (i + 1) / num_diffusion_timesteps + betas.append(min(1 - alpha_bar_fn(t2) / alpha_bar_fn(t1), max_beta)) + return torch.tensor(betas, dtype=torch.float32) + + +class DPMSolverSDEScheduler(SchedulerMixin, ConfigMixin): + """ + Implements Stochastic Sampler (Algorithm 2) from Karras et al. (2022). Based on the original k-diffusion + implementation by Katherine Crowson: + https://github.com/crowsonkb/k-diffusion/blob/41b4cb6df0506694a7776af31349acf082bf6091/k_diffusion/sampling.py#L543 + + [`~ConfigMixin`] takes care of storing all config attributes that are passed in the scheduler's `__init__` + function, such as `num_train_timesteps`. They can be accessed via `scheduler.config.num_train_timesteps`. + [`SchedulerMixin`] provides general loading and saving functionality via the [`SchedulerMixin.save_pretrained`] and + [`~SchedulerMixin.from_pretrained`] functions. + + Args: + num_train_timesteps (`int`): number of diffusion steps used to train the model. beta_start (`float`): the + starting `beta` value of inference. beta_end (`float`): the final `beta` value. beta_schedule (`str`): + the beta schedule, a mapping from a beta range to a sequence of betas for stepping the model. Choose from + `linear` or `scaled_linear`. 
+ trained_betas (`np.ndarray`, optional): + option to pass an array of betas directly to the constructor to bypass `beta_start`, `beta_end` etc. + prediction_type (`str`, default `epsilon`, optional): + prediction type of the scheduler function, one of `epsilon` (predicting the noise of the diffusion + process), `sample` (directly predicting the noisy sample`) or `v_prediction` (see section 2.4 + https://imagen.research.google/video/paper.pdf) + use_karras_sigmas (`bool`, *optional*, defaults to `False`): + This parameter controls whether to use Karras sigmas (Karras et al. (2022) scheme) for step sizes in the + noise schedule during the sampling process. If True, the sigmas will be determined according to a sequence + of noise levels {σi} as defined in Equation (5) of the paper https://arxiv.org/pdf/2206.00364.pdf. + noise_sampler_seed (`int`, *optional*, defaults to `None`): + The random seed to use for the noise sampler. If `None`, a random seed will be generated. + timestep_spacing (`str`, default `"linspace"`): + The way the timesteps should be scaled. Refer to Table 2. of [Common Diffusion Noise Schedules and Sample + Steps are Flawed](https://arxiv.org/abs/2305.08891) for more information. + steps_offset (`int`, default `0`): + an offset added to the inference steps. You can use a combination of `offset=1` and + `set_alpha_to_one=False`, to make the last step use step 0 for the previous alpha product, as done in + stable diffusion. + """ + + _compatibles = [e.name for e in KarrasDiffusionSchedulers] + order = 2 + + @register_to_config + def __init__( + self, + num_train_timesteps: int = 1000, + beta_start: float = 0.00085, # sensible defaults + beta_end: float = 0.012, + beta_schedule: str = "linear", + trained_betas: Optional[Union[np.ndarray, List[float]]] = None, + prediction_type: str = "epsilon", + use_karras_sigmas: Optional[bool] = False, + noise_sampler_seed: Optional[int] = None, + timestep_spacing: str = "linspace", + steps_offset: int = 0, + ): + if trained_betas is not None: + self.betas = torch.tensor(trained_betas, dtype=torch.float32) + elif beta_schedule == "linear": + self.betas = torch.linspace(beta_start, beta_end, num_train_timesteps, dtype=torch.float32) + elif beta_schedule == "scaled_linear": + # this schedule is very specific to the latent diffusion model. + self.betas = ( + torch.linspace(beta_start**0.5, beta_end**0.5, num_train_timesteps, dtype=torch.float32) ** 2 + ) + elif beta_schedule == "squaredcos_cap_v2": + # Glide cosine schedule + self.betas = betas_for_alpha_bar(num_train_timesteps) + else: + raise NotImplementedError(f"{beta_schedule} does is not implemented for {self.__class__}") + + self.alphas = 1.0 - self.betas + self.alphas_cumprod = torch.cumprod(self.alphas, dim=0) + + # set all values + self.set_timesteps(num_train_timesteps, None, num_train_timesteps) + self.use_karras_sigmas = use_karras_sigmas + self.noise_sampler = None + self.noise_sampler_seed = noise_sampler_seed + + # Copied from diffusers.schedulers.scheduling_heun_discrete.HeunDiscreteScheduler.index_for_timestep + def index_for_timestep(self, timestep, schedule_timesteps=None): + if schedule_timesteps is None: + schedule_timesteps = self.timesteps + + indices = (schedule_timesteps == timestep).nonzero() + + # The sigma index that is taken for the **very** first `step` + # is always the second index (or the last index if there is only 1) + # This way we can ensure we don't accidentally skip a sigma in + # case we start in the middle of the denoising schedule (e.g. 
for image-to-image) + if len(self._index_counter) == 0: + pos = 1 if len(indices) > 1 else 0 + else: + timestep_int = timestep.cpu().item() if torch.is_tensor(timestep) else timestep + pos = self._index_counter[timestep_int] + + return indices[pos].item() + + @property + def init_noise_sigma(self): + # standard deviation of the initial noise distribution + if self.config.timestep_spacing in ["linspace", "trailing"]: + return self.sigmas.max() + + return (self.sigmas.max() ** 2 + 1) ** 0.5 + + def scale_model_input( + self, + sample: torch.FloatTensor, + timestep: Union[float, torch.FloatTensor], + ) -> torch.FloatTensor: + """ + Args: + Ensures interchangeability with schedulers that need to scale the denoising model input depending on the + current timestep. + sample (`torch.FloatTensor`): input sample timestep (`int`, optional): current timestep + Returns: + `torch.FloatTensor`: scaled input sample + """ + step_index = self.index_for_timestep(timestep) + + sigma = self.sigmas[step_index] + sigma_input = sigma if self.state_in_first_order else self.mid_point_sigma + sample = sample / ((sigma_input**2 + 1) ** 0.5) + return sample + + def set_timesteps( + self, + num_inference_steps: int, + device: Union[str, torch.device] = None, + num_train_timesteps: Optional[int] = None, + ): + """ + Sets the timesteps used for the diffusion chain. Supporting function to be run before inference. + + Args: + num_inference_steps (`int`): + the number of diffusion steps used when generating samples with a pre-trained model. + device (`str` or `torch.device`, optional): + the device to which the timesteps should be moved to. If `None`, the timesteps are not moved. + """ + self.num_inference_steps = num_inference_steps + + num_train_timesteps = num_train_timesteps or self.config.num_train_timesteps + + # "linspace", "leading", "trailing" corresponds to annotation of Table 2. of https://arxiv.org/abs/2305.08891 + if self.config.timestep_spacing == "linspace": + timesteps = np.linspace(0, num_train_timesteps - 1, num_inference_steps, dtype=float)[::-1].copy() + elif self.config.timestep_spacing == "leading": + step_ratio = num_train_timesteps // self.num_inference_steps + # creates integer timesteps by multiplying by ratio + # casting to int to avoid issues when num_inference_step is power of 3 + timesteps = (np.arange(0, num_inference_steps) * step_ratio).round()[::-1].copy().astype(float) + timesteps += self.config.steps_offset + elif self.config.timestep_spacing == "trailing": + step_ratio = num_train_timesteps / self.num_inference_steps + # creates integer timesteps by multiplying by ratio + # casting to int to avoid issues when num_inference_step is power of 3 + timesteps = (np.arange(num_train_timesteps, 0, -step_ratio)).round().copy().astype(float) + timesteps -= 1 + else: + raise ValueError( + f"{self.config.timestep_spacing} is not supported. Please make sure to choose one of 'linspace', 'leading' or 'trailing'." 
+ ) + + sigmas = np.array(((1 - self.alphas_cumprod) / self.alphas_cumprod) ** 0.5) + log_sigmas = np.log(sigmas) + sigmas = np.interp(timesteps, np.arange(0, len(sigmas)), sigmas) + + if self.use_karras_sigmas: + sigmas = self._convert_to_karras(in_sigmas=sigmas) + timesteps = np.array([self._sigma_to_t(sigma, log_sigmas) for sigma in sigmas]) + + second_order_timesteps = self._second_order_timesteps(sigmas, log_sigmas) + + sigmas = np.concatenate([sigmas, [0.0]]).astype(np.float32) + sigmas = torch.from_numpy(sigmas).to(device=device) + self.sigmas = torch.cat([sigmas[:1], sigmas[1:-1].repeat_interleave(2), sigmas[-1:]]) + + timesteps = torch.from_numpy(timesteps) + second_order_timesteps = torch.from_numpy(second_order_timesteps) + timesteps = torch.cat([timesteps[:1], timesteps[1:].repeat_interleave(2)]) + timesteps[1::2] = second_order_timesteps + + if str(device).startswith("mps"): + # mps does not support float64 + self.timesteps = timesteps.to(device, dtype=torch.float32) + else: + self.timesteps = timesteps.to(device=device) + + # empty first order variables + self.sample = None + self.mid_point_sigma = None + + # for exp beta schedules, such as the one for `pipeline_shap_e.py` + # we need an index counter + self._index_counter = defaultdict(int) + + def _second_order_timesteps(self, sigmas, log_sigmas): + def sigma_fn(_t): + return np.exp(-_t) + + def t_fn(_sigma): + return -np.log(_sigma) + + midpoint_ratio = 0.5 + t = t_fn(sigmas) + delta_time = np.diff(t) + t_proposed = t[:-1] + delta_time * midpoint_ratio + sig_proposed = sigma_fn(t_proposed) + timesteps = np.array([self._sigma_to_t(sigma, log_sigmas) for sigma in sig_proposed]) + return timesteps + + # copied from diffusers.schedulers.scheduling_euler_discrete._sigma_to_t + def _sigma_to_t(self, sigma, log_sigmas): + # get log sigma + log_sigma = np.log(sigma) + + # get distribution + dists = log_sigma - log_sigmas[:, np.newaxis] + + # get sigmas range + low_idx = np.cumsum((dists >= 0), axis=0).argmax(axis=0).clip(max=log_sigmas.shape[0] - 2) + high_idx = low_idx + 1 + + low = log_sigmas[low_idx] + high = log_sigmas[high_idx] + + # interpolate sigmas + w = (low - log_sigma) / (low - high) + w = np.clip(w, 0, 1) + + # transform interpolation to time range + t = (1 - w) * low_idx + w * high_idx + t = t.reshape(sigma.shape) + return t + + # copied from diffusers.schedulers.scheduling_euler_discrete._convert_to_karras + def _convert_to_karras(self, in_sigmas: torch.FloatTensor) -> torch.FloatTensor: + """Constructs the noise schedule of Karras et al. (2022).""" + + sigma_min: float = in_sigmas[-1].item() + sigma_max: float = in_sigmas[0].item() + + rho = 7.0 # 7.0 is the value used in the paper + ramp = np.linspace(0, 1, self.num_inference_steps) + min_inv_rho = sigma_min ** (1 / rho) + max_inv_rho = sigma_max ** (1 / rho) + sigmas = (max_inv_rho + ramp * (min_inv_rho - max_inv_rho)) ** rho + return sigmas + + @property + def state_in_first_order(self): + return self.sample is None + + def step( + self, + model_output: Union[torch.FloatTensor, np.ndarray], + timestep: Union[float, torch.FloatTensor], + sample: Union[torch.FloatTensor, np.ndarray], + return_dict: bool = True, + s_noise: float = 1.0, + ) -> Union[SchedulerOutput, Tuple]: + """ + Args: + Predict the sample at the previous timestep by reversing the SDE. Core function to propagate the diffusion + process from the learned model outputs (most often the predicted noise). + model_output (Union[torch.FloatTensor, np.ndarray]): Direct output from learned diffusion model. 
+ timestep (Union[float, torch.FloatTensor]): Current discrete timestep in the diffusion chain. + sample (Union[torch.FloatTensor, np.ndarray]): Current instance of sample being created by diffusion process. + return_dict (bool, optional): Option for returning tuple rather than SchedulerOutput class. Defaults to True. + s_noise (float, optional): Scaling factor for the noise added to the sample. Defaults to 1.0. + Returns: + [`~schedulers.scheduling_utils.SchedulerOutput`] or `tuple`: + [`~schedulers.scheduling_utils.SchedulerOutput`] if `return_dict` is True, otherwise a `tuple`. When + returning a tuple, the first element is the sample tensor. + """ + step_index = self.index_for_timestep(timestep) + + # advance index counter by 1 + timestep_int = timestep.cpu().item() if torch.is_tensor(timestep) else timestep + self._index_counter[timestep_int] += 1 + + # Create a noise sampler if it hasn't been created yet + if self.noise_sampler is None: + min_sigma, max_sigma = self.sigmas[self.sigmas > 0].min(), self.sigmas.max() + self.noise_sampler = BrownianTreeNoiseSampler(sample, min_sigma, max_sigma, self.noise_sampler_seed) + + # Define functions to compute sigma and t from each other + def sigma_fn(_t: torch.FloatTensor) -> torch.FloatTensor: + return _t.neg().exp() + + def t_fn(_sigma: torch.FloatTensor) -> torch.FloatTensor: + return _sigma.log().neg() + + if self.state_in_first_order: + sigma = self.sigmas[step_index] + sigma_next = self.sigmas[step_index + 1] + else: + # 2nd order + sigma = self.sigmas[step_index - 1] + sigma_next = self.sigmas[step_index] + + # Set the midpoint and step size for the current step + midpoint_ratio = 0.5 + t, t_next = t_fn(sigma), t_fn(sigma_next) + delta_time = t_next - t + t_proposed = t + delta_time * midpoint_ratio + + # 1. 
compute predicted original sample (x_0) from sigma-scaled predicted noise + if self.config.prediction_type == "epsilon": + sigma_input = sigma if self.state_in_first_order else sigma_fn(t_proposed) + pred_original_sample = sample - sigma_input * model_output + elif self.config.prediction_type == "v_prediction": + sigma_input = sigma if self.state_in_first_order else sigma_fn(t_proposed) + pred_original_sample = model_output * (-sigma_input / (sigma_input**2 + 1) ** 0.5) + ( + sample / (sigma_input**2 + 1) + ) + elif self.config.prediction_type == "sample": + raise NotImplementedError("prediction_type not implemented yet: sample") + else: + raise ValueError( + f"prediction_type given as {self.config.prediction_type} must be one of `epsilon`, or `v_prediction`" + ) + + if sigma_next == 0: + derivative = (sample - pred_original_sample) / sigma + dt = sigma_next - sigma + prev_sample = sample + derivative * dt + else: + if self.state_in_first_order: + t_next = t_proposed + else: + sample = self.sample + + sigma_from = sigma_fn(t) + sigma_to = sigma_fn(t_next) + sigma_up = min(sigma_to, (sigma_to**2 * (sigma_from**2 - sigma_to**2) / sigma_from**2) ** 0.5) + sigma_down = (sigma_to**2 - sigma_up**2) ** 0.5 + ancestral_t = t_fn(sigma_down) + prev_sample = (sigma_fn(ancestral_t) / sigma_fn(t)) * sample - ( + t - ancestral_t + ).expm1() * pred_original_sample + prev_sample = prev_sample + self.noise_sampler(sigma_fn(t), sigma_fn(t_next)) * s_noise * sigma_up + + if self.state_in_first_order: + # store for 2nd order step + self.sample = sample + self.mid_point_sigma = sigma_fn(t_next) + else: + # free for "first order mode" + self.sample = None + self.mid_point_sigma = None + + if not return_dict: + return (prev_sample,) + + return SchedulerOutput(prev_sample=prev_sample) + + # Copied from diffusers.schedulers.scheduling_heun_discrete.HeunDiscreteScheduler.add_noise + def add_noise( + self, + original_samples: torch.FloatTensor, + noise: torch.FloatTensor, + timesteps: torch.FloatTensor, + ) -> torch.FloatTensor: + # Make sure sigmas and timesteps have the same device and dtype as original_samples + sigmas = self.sigmas.to(device=original_samples.device, dtype=original_samples.dtype) + if original_samples.device.type == "mps" and torch.is_floating_point(timesteps): + # mps does not support float64 + schedule_timesteps = self.timesteps.to(original_samples.device, dtype=torch.float32) + timesteps = timesteps.to(original_samples.device, dtype=torch.float32) + else: + schedule_timesteps = self.timesteps.to(original_samples.device) + timesteps = timesteps.to(original_samples.device) + + step_indices = [self.index_for_timestep(t, schedule_timesteps) for t in timesteps] + + sigma = sigmas[step_indices].flatten() + while len(sigma.shape) < len(original_samples.shape): + sigma = sigma.unsqueeze(-1) + + noisy_samples = original_samples + noise * sigma + return noisy_samples + + def __len__(self): + return self.config.num_train_timesteps diff --git a/gf/sd-webui-rich-text/diffusers_official/schedulers/scheduling_dpmsolver_singlestep.py b/gf/sd-webui-rich-text/diffusers_official/schedulers/scheduling_dpmsolver_singlestep.py new file mode 100644 index 0000000000000000000000000000000000000000..93975a27fc6e3899c009b5576ed74753ea62abbb --- /dev/null +++ b/gf/sd-webui-rich-text/diffusers_official/schedulers/scheduling_dpmsolver_singlestep.py @@ -0,0 +1,737 @@ +# Copyright 2023 TSAIL Team and The HuggingFace Team. All rights reserved. 
+# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +# DISCLAIMER: This file is strongly influenced by https://github.com/LuChengTHU/dpm-solver + +import math +from typing import List, Optional, Tuple, Union + +import numpy as np +import torch + +from ..configuration_utils import ConfigMixin, register_to_config +from ..utils import logging +from .scheduling_utils import KarrasDiffusionSchedulers, SchedulerMixin, SchedulerOutput + + +logger = logging.get_logger(__name__) # pylint: disable=invalid-name + + +# Copied from diffusers.schedulers.scheduling_ddpm.betas_for_alpha_bar +def betas_for_alpha_bar( + num_diffusion_timesteps, + max_beta=0.999, + alpha_transform_type="cosine", +): + """ + Create a beta schedule that discretizes the given alpha_t_bar function, which defines the cumulative product of + (1-beta) over time from t = [0,1]. + + Contains a function alpha_bar that takes an argument t and transforms it to the cumulative product of (1-beta) up + to that part of the diffusion process. + + + Args: + num_diffusion_timesteps (`int`): the number of betas to produce. + max_beta (`float`): the maximum beta to use; use values lower than 1 to + prevent singularities. + alpha_transform_type (`str`, *optional*, default to `cosine`): the type of noise schedule for alpha_bar. + Choose from `cosine` or `exp` + + Returns: + betas (`np.ndarray`): the betas used by the scheduler to step the model outputs + """ + if alpha_transform_type == "cosine": + + def alpha_bar_fn(t): + return math.cos((t + 0.008) / 1.008 * math.pi / 2) ** 2 + + elif alpha_transform_type == "exp": + + def alpha_bar_fn(t): + return math.exp(t * -12.0) + + else: + raise ValueError(f"Unsupported alpha_tranform_type: {alpha_transform_type}") + + betas = [] + for i in range(num_diffusion_timesteps): + t1 = i / num_diffusion_timesteps + t2 = (i + 1) / num_diffusion_timesteps + betas.append(min(1 - alpha_bar_fn(t2) / alpha_bar_fn(t1), max_beta)) + return torch.tensor(betas, dtype=torch.float32) + + +class DPMSolverSinglestepScheduler(SchedulerMixin, ConfigMixin): + """ + DPM-Solver (and the improved version DPM-Solver++) is a fast dedicated high-order solver for diffusion ODEs with + the convergence order guarantee. Empirically, sampling by DPM-Solver with only 20 steps can generate high-quality + samples, and it can generate quite good samples even in only 10 steps. + + For more details, see the original paper: https://arxiv.org/abs/2206.00927 and https://arxiv.org/abs/2211.01095 + + Currently, we support the singlestep DPM-Solver for both noise prediction models and data prediction models. We + recommend to use `solver_order=2` for guided sampling, and `solver_order=3` for unconditional sampling. + + We also support the "dynamic thresholding" method in Imagen (https://arxiv.org/abs/2205.11487). For pixel-space + diffusion models, you can set both `algorithm_type="dpmsolver++"` and `thresholding=True` to use the dynamic + thresholding. 
Note that the thresholding method is unsuitable for latent-space diffusion models (such as + stable-diffusion). + + [`~ConfigMixin`] takes care of storing all config attributes that are passed in the scheduler's `__init__` + function, such as `num_train_timesteps`. They can be accessed via `scheduler.config.num_train_timesteps`. + [`SchedulerMixin`] provides general loading and saving functionality via the [`SchedulerMixin.save_pretrained`] and + [`~SchedulerMixin.from_pretrained`] functions. + + Args: + num_train_timesteps (`int`): number of diffusion steps used to train the model. + beta_start (`float`): the starting `beta` value of inference. + beta_end (`float`): the final `beta` value. + beta_schedule (`str`): + the beta schedule, a mapping from a beta range to a sequence of betas for stepping the model. Choose from + `linear`, `scaled_linear`, or `squaredcos_cap_v2`. + trained_betas (`np.ndarray`, optional): + option to pass an array of betas directly to the constructor to bypass `beta_start`, `beta_end` etc. + solver_order (`int`, default `2`): + the order of DPM-Solver; can be `1` or `2` or `3`. We recommend to use `solver_order=2` for guided + sampling, and `solver_order=3` for unconditional sampling. + prediction_type (`str`, default `epsilon`): + indicates whether the model predicts the noise (epsilon), or the data / `x0`. One of `epsilon`, `sample`, + or `v-prediction`. + thresholding (`bool`, default `False`): + whether to use the "dynamic thresholding" method (introduced by Imagen, https://arxiv.org/abs/2205.11487). + For pixel-space diffusion models, you can set both `algorithm_type=dpmsolver++` and `thresholding=True` to + use the dynamic thresholding. Note that the thresholding method is unsuitable for latent-space diffusion + models (such as stable-diffusion). + dynamic_thresholding_ratio (`float`, default `0.995`): + the ratio for the dynamic thresholding method. Default is `0.995`, the same as Imagen + (https://arxiv.org/abs/2205.11487). + sample_max_value (`float`, default `1.0`): + the threshold value for dynamic thresholding. Valid only when `thresholding=True` and + `algorithm_type="dpmsolver++`. + algorithm_type (`str`, default `dpmsolver++`): + the algorithm type for the solver. Either `dpmsolver` or `dpmsolver++`. The `dpmsolver` type implements the + algorithms in https://arxiv.org/abs/2206.00927, and the `dpmsolver++` type implements the algorithms in + https://arxiv.org/abs/2211.01095. We recommend to use `dpmsolver++` with `solver_order=2` for guided + sampling (e.g. stable-diffusion). + solver_type (`str`, default `midpoint`): + the solver type for the second-order solver. Either `midpoint` or `heun`. The solver type slightly affects + the sample quality, especially for small number of steps. We empirically find that `midpoint` solvers are + slightly better, so we recommend to use the `midpoint` type. + lower_order_final (`bool`, default `True`): + whether to use lower-order solvers in the final steps. For singlestep schedulers, we recommend to enable + this to use up all the function evaluations. + use_karras_sigmas (`bool`, *optional*, defaults to `False`): + This parameter controls whether to use Karras sigmas (Karras et al. (2022) scheme) for step sizes in the + noise schedule during the sampling process. If True, the sigmas will be determined according to a sequence + of noise levels {σi} as defined in Equation (5) of the paper https://arxiv.org/pdf/2206.00364.pdf. 
+ lambda_min_clipped (`float`, default `-inf`): + the clipping threshold for the minimum value of lambda(t) for numerical stability. This is critical for + cosine (squaredcos_cap_v2) noise schedule. + variance_type (`str`, *optional*): + Set to "learned" or "learned_range" for diffusion models that predict variance. For example, OpenAI's + guided-diffusion (https://github.com/openai/guided-diffusion) predicts both mean and variance of the + Gaussian distribution in the model's output. DPM-Solver only needs the "mean" output because it is based on + diffusion ODEs. whether the model's output contains the predicted Gaussian variance. For example, OpenAI's + guided-diffusion (https://github.com/openai/guided-diffusion) predicts both mean and variance of the + Gaussian distribution in the model's output. DPM-Solver only needs the "mean" output because it is based on + diffusion ODEs. + + """ + + _compatibles = [e.name for e in KarrasDiffusionSchedulers] + order = 1 + + @register_to_config + def __init__( + self, + num_train_timesteps: int = 1000, + beta_start: float = 0.0001, + beta_end: float = 0.02, + beta_schedule: str = "linear", + trained_betas: Optional[np.ndarray] = None, + solver_order: int = 2, + prediction_type: str = "epsilon", + thresholding: bool = False, + dynamic_thresholding_ratio: float = 0.995, + sample_max_value: float = 1.0, + algorithm_type: str = "dpmsolver++", + solver_type: str = "midpoint", + lower_order_final: bool = True, + use_karras_sigmas: Optional[bool] = False, + lambda_min_clipped: float = -float("inf"), + variance_type: Optional[str] = None, + ): + if trained_betas is not None: + self.betas = torch.tensor(trained_betas, dtype=torch.float32) + elif beta_schedule == "linear": + self.betas = torch.linspace(beta_start, beta_end, num_train_timesteps, dtype=torch.float32) + elif beta_schedule == "scaled_linear": + # this schedule is very specific to the latent diffusion model. 
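+            # Interpolating linearly in sqrt(beta) and then squaring makes the betas grow quadratically,
+            # i.e. the schedule is denser near `beta_start` than a plain linear schedule would be.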
+ self.betas = ( + torch.linspace(beta_start**0.5, beta_end**0.5, num_train_timesteps, dtype=torch.float32) ** 2 + ) + elif beta_schedule == "squaredcos_cap_v2": + # Glide cosine schedule + self.betas = betas_for_alpha_bar(num_train_timesteps) + else: + raise NotImplementedError(f"{beta_schedule} does is not implemented for {self.__class__}") + + self.alphas = 1.0 - self.betas + self.alphas_cumprod = torch.cumprod(self.alphas, dim=0) + # Currently we only support VP-type noise schedule + self.alpha_t = torch.sqrt(self.alphas_cumprod) + self.sigma_t = torch.sqrt(1 - self.alphas_cumprod) + self.lambda_t = torch.log(self.alpha_t) - torch.log(self.sigma_t) + + # standard deviation of the initial noise distribution + self.init_noise_sigma = 1.0 + + # settings for DPM-Solver + if algorithm_type not in ["dpmsolver", "dpmsolver++"]: + if algorithm_type == "deis": + self.register_to_config(algorithm_type="dpmsolver++") + else: + raise NotImplementedError(f"{algorithm_type} does is not implemented for {self.__class__}") + if solver_type not in ["midpoint", "heun"]: + if solver_type in ["logrho", "bh1", "bh2"]: + self.register_to_config(solver_type="midpoint") + else: + raise NotImplementedError(f"{solver_type} does is not implemented for {self.__class__}") + + # setable values + self.num_inference_steps = None + timesteps = np.linspace(0, num_train_timesteps - 1, num_train_timesteps, dtype=np.float32)[::-1].copy() + self.timesteps = torch.from_numpy(timesteps) + self.model_outputs = [None] * solver_order + self.sample = None + self.order_list = self.get_order_list(num_train_timesteps) + + def get_order_list(self, num_inference_steps: int) -> List[int]: + """ + Computes the solver order at each time step. + + Args: + num_inference_steps (`int`): + the number of diffusion steps used when generating samples with a pre-trained model. + """ + steps = num_inference_steps + order = self.config.solver_order + if self.config.lower_order_final: + if order == 3: + if steps % 3 == 0: + orders = [1, 2, 3] * (steps // 3 - 1) + [1, 2] + [1] + elif steps % 3 == 1: + orders = [1, 2, 3] * (steps // 3) + [1] + else: + orders = [1, 2, 3] * (steps // 3) + [1, 2] + elif order == 2: + if steps % 2 == 0: + orders = [1, 2] * (steps // 2) + else: + orders = [1, 2] * (steps // 2) + [1] + elif order == 1: + orders = [1] * steps + else: + if order == 3: + orders = [1, 2, 3] * (steps // 3) + elif order == 2: + orders = [1, 2] * (steps // 2) + elif order == 1: + orders = [1] * steps + return orders + + def set_timesteps(self, num_inference_steps: int, device: Union[str, torch.device] = None): + """ + Sets the timesteps used for the diffusion chain. Supporting function to be run before inference. + + Args: + num_inference_steps (`int`): + the number of diffusion steps used when generating samples with a pre-trained model. + device (`str` or `torch.device`, optional): + the device to which the timesteps should be moved to. If `None`, the timesteps are not moved. + """ + self.num_inference_steps = num_inference_steps + # Clipping the minimum of all lambda(t) for numerical stability. + # This is critical for cosine (squaredcos_cap_v2) noise schedule. 
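+        # `lambda_t` decreases with t, so flipping it gives an ascending sequence; `searchsorted` then counts how
+        # many of the noisiest timesteps have lambda below `lambda_min_clipped`, and the linspace below excludes them.
+        # With the default `lambda_min_clipped = -inf`, `clipped_idx` is 0 and nothing is clipped.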
+ clipped_idx = torch.searchsorted(torch.flip(self.lambda_t, [0]), self.config.lambda_min_clipped) + timesteps = ( + np.linspace(0, self.config.num_train_timesteps - 1 - clipped_idx, num_inference_steps + 1) + .round()[::-1][:-1] + .copy() + .astype(np.int64) + ) + + sigmas = np.array(((1 - self.alphas_cumprod) / self.alphas_cumprod) ** 0.5) + if self.config.use_karras_sigmas: + log_sigmas = np.log(sigmas) + sigmas = self._convert_to_karras(in_sigmas=sigmas, num_inference_steps=num_inference_steps) + timesteps = np.array([self._sigma_to_t(sigma, log_sigmas) for sigma in sigmas]).round() + timesteps = np.flip(timesteps).copy().astype(np.int64) + + self.sigmas = torch.from_numpy(sigmas) + + self.timesteps = torch.from_numpy(timesteps).to(device) + self.model_outputs = [None] * self.config.solver_order + self.sample = None + + if not self.config.lower_order_final and num_inference_steps % self.config.solver_order != 0: + logger.warn( + "Changing scheduler {self.config} to have `lower_order_final` set to True to handle uneven amount of inference steps. Please make sure to always use an even number of `num_inference steps when using `lower_order_final=True`." + ) + self.register_to_config(lower_order_final=True) + + self.order_list = self.get_order_list(num_inference_steps) + + # Copied from diffusers.schedulers.scheduling_ddpm.DDPMScheduler._threshold_sample + def _threshold_sample(self, sample: torch.FloatTensor) -> torch.FloatTensor: + """ + "Dynamic thresholding: At each sampling step we set s to a certain percentile absolute pixel value in xt0 (the + prediction of x_0 at timestep t), and if s > 1, then we threshold xt0 to the range [-s, s] and then divide by + s. Dynamic thresholding pushes saturated pixels (those near -1 and 1) inwards, thereby actively preventing + pixels from saturation at each step. We find that dynamic thresholding results in significantly better + photorealism as well as better image-text alignment, especially when using very large guidance weights." 
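+
+        For example, if the `dynamic_thresholding_ratio` percentile of |x0| is s = 1.4 (and `sample_max_value`
+        allows values above 1), x0 is clamped to [-1.4, 1.4] and then divided by 1.4, so the result lies in [-1, 1].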
+ + https://arxiv.org/abs/2205.11487 + """ + dtype = sample.dtype + batch_size, channels, height, width = sample.shape + + if dtype not in (torch.float32, torch.float64): + sample = sample.float() # upcast for quantile calculation, and clamp not implemented for cpu half + + # Flatten sample for doing quantile calculation along each image + sample = sample.reshape(batch_size, channels * height * width) + + abs_sample = sample.abs() # "a certain percentile absolute pixel value" + + s = torch.quantile(abs_sample, self.config.dynamic_thresholding_ratio, dim=1) + s = torch.clamp( + s, min=1, max=self.config.sample_max_value + ) # When clamped to min=1, equivalent to standard clipping to [-1, 1] + + s = s.unsqueeze(1) # (batch_size, 1) because clamp will broadcast along dim=0 + sample = torch.clamp(sample, -s, s) / s # "we threshold xt0 to the range [-s, s] and then divide by s" + + sample = sample.reshape(batch_size, channels, height, width) + sample = sample.to(dtype) + + return sample + + # Copied from diffusers.schedulers.scheduling_euler_discrete.EulerDiscreteScheduler._sigma_to_t + def _sigma_to_t(self, sigma, log_sigmas): + # get log sigma + log_sigma = np.log(sigma) + + # get distribution + dists = log_sigma - log_sigmas[:, np.newaxis] + + # get sigmas range + low_idx = np.cumsum((dists >= 0), axis=0).argmax(axis=0).clip(max=log_sigmas.shape[0] - 2) + high_idx = low_idx + 1 + + low = log_sigmas[low_idx] + high = log_sigmas[high_idx] + + # interpolate sigmas + w = (low - log_sigma) / (low - high) + w = np.clip(w, 0, 1) + + # transform interpolation to time range + t = (1 - w) * low_idx + w * high_idx + t = t.reshape(sigma.shape) + return t + + # Copied from diffusers.schedulers.scheduling_euler_discrete.EulerDiscreteScheduler._convert_to_karras + def _convert_to_karras(self, in_sigmas: torch.FloatTensor, num_inference_steps) -> torch.FloatTensor: + """Constructs the noise schedule of Karras et al. (2022).""" + + sigma_min: float = in_sigmas[-1].item() + sigma_max: float = in_sigmas[0].item() + + rho = 7.0 # 7.0 is the value used in the paper + ramp = np.linspace(0, 1, num_inference_steps) + min_inv_rho = sigma_min ** (1 / rho) + max_inv_rho = sigma_max ** (1 / rho) + sigmas = (max_inv_rho + ramp * (min_inv_rho - max_inv_rho)) ** rho + return sigmas + + def convert_model_output( + self, model_output: torch.FloatTensor, timestep: int, sample: torch.FloatTensor + ) -> torch.FloatTensor: + """ + Convert the model output to the corresponding type that the algorithm (DPM-Solver / DPM-Solver++) needs. + + DPM-Solver is designed to discretize an integral of the noise prediction model, and DPM-Solver++ is designed to + discretize an integral of the data prediction model. So we need to first convert the model output to the + corresponding type to match the algorithm. + + Note that the algorithm type and the model type is decoupled. That is to say, we can use either DPM-Solver or + DPM-Solver++ for both noise prediction model and data prediction model. + + Args: + model_output (`torch.FloatTensor`): direct output from learned diffusion model. + timestep (`int`): current discrete timestep in the diffusion chain. + sample (`torch.FloatTensor`): + current instance of sample being created by diffusion process. + + Returns: + `torch.FloatTensor`: the converted model output. + """ + # DPM-Solver++ needs to solve an integral of the data prediction model. 
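+        # Under the VP parameterization set up in `__init__` (x_t = alpha_t * x0 + sigma_t * eps), an epsilon
+        # prediction is converted to x0 via x0 = (x_t - sigma_t * eps) / alpha_t and a v prediction via
+        # x0 = alpha_t * x_t - sigma_t * v, branch by branch below.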
+ if self.config.algorithm_type == "dpmsolver++": + if self.config.prediction_type == "epsilon": + # DPM-Solver and DPM-Solver++ only need the "mean" output. + if self.config.variance_type in ["learned_range"]: + model_output = model_output[:, :3] + alpha_t, sigma_t = self.alpha_t[timestep], self.sigma_t[timestep] + x0_pred = (sample - sigma_t * model_output) / alpha_t + elif self.config.prediction_type == "sample": + x0_pred = model_output + elif self.config.prediction_type == "v_prediction": + alpha_t, sigma_t = self.alpha_t[timestep], self.sigma_t[timestep] + x0_pred = alpha_t * sample - sigma_t * model_output + else: + raise ValueError( + f"prediction_type given as {self.config.prediction_type} must be one of `epsilon`, `sample`, or" + " `v_prediction` for the DPMSolverSinglestepScheduler." + ) + + if self.config.thresholding: + x0_pred = self._threshold_sample(x0_pred) + + return x0_pred + # DPM-Solver needs to solve an integral of the noise prediction model. + elif self.config.algorithm_type == "dpmsolver": + if self.config.prediction_type == "epsilon": + # DPM-Solver and DPM-Solver++ only need the "mean" output. + if self.config.variance_type in ["learned_range"]: + model_output = model_output[:, :3] + return model_output + elif self.config.prediction_type == "sample": + alpha_t, sigma_t = self.alpha_t[timestep], self.sigma_t[timestep] + epsilon = (sample - alpha_t * model_output) / sigma_t + return epsilon + elif self.config.prediction_type == "v_prediction": + alpha_t, sigma_t = self.alpha_t[timestep], self.sigma_t[timestep] + epsilon = alpha_t * model_output + sigma_t * sample + return epsilon + else: + raise ValueError( + f"prediction_type given as {self.config.prediction_type} must be one of `epsilon`, `sample`, or" + " `v_prediction` for the DPMSolverSinglestepScheduler." + ) + + def dpm_solver_first_order_update( + self, + model_output: torch.FloatTensor, + timestep: int, + prev_timestep: int, + sample: torch.FloatTensor, + ) -> torch.FloatTensor: + """ + One step for the first-order DPM-Solver (equivalent to DDIM). + + See https://arxiv.org/abs/2206.00927 for the detailed derivation. + + Args: + model_output (`torch.FloatTensor`): direct output from learned diffusion model. + timestep (`int`): current discrete timestep in the diffusion chain. + prev_timestep (`int`): previous discrete timestep in the diffusion chain. + sample (`torch.FloatTensor`): + current instance of sample being created by diffusion process. + + Returns: + `torch.FloatTensor`: the sample tensor at the previous timestep. + """ + lambda_t, lambda_s = self.lambda_t[prev_timestep], self.lambda_t[timestep] + alpha_t, alpha_s = self.alpha_t[prev_timestep], self.alpha_t[timestep] + sigma_t, sigma_s = self.sigma_t[prev_timestep], self.sigma_t[timestep] + h = lambda_t - lambda_s + if self.config.algorithm_type == "dpmsolver++": + x_t = (sigma_t / sigma_s) * sample - (alpha_t * (torch.exp(-h) - 1.0)) * model_output + elif self.config.algorithm_type == "dpmsolver": + x_t = (alpha_t / alpha_s) * sample - (sigma_t * (torch.exp(h) - 1.0)) * model_output + return x_t + + def singlestep_dpm_solver_second_order_update( + self, + model_output_list: List[torch.FloatTensor], + timestep_list: List[int], + prev_timestep: int, + sample: torch.FloatTensor, + ) -> torch.FloatTensor: + """ + One step for the second-order singlestep DPM-Solver. + + It computes the solution at time `prev_timestep` from the time `timestep_list[-2]`. 
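+        Unlike the multistep variant, the base point is the older timestep `timestep_list[-2]`: the step size is
+        h = lambda_t - lambda_s1, and the extrapolation uses D0 = m1 with the correction D1 = (m0 - m1) * h / h_0.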
+ + Args: + model_output_list (`List[torch.FloatTensor]`): + direct outputs from learned diffusion model at current and latter timesteps. + timestep (`int`): current and latter discrete timestep in the diffusion chain. + prev_timestep (`int`): previous discrete timestep in the diffusion chain. + sample (`torch.FloatTensor`): + current instance of sample being created by diffusion process. + + Returns: + `torch.FloatTensor`: the sample tensor at the previous timestep. + """ + t, s0, s1 = prev_timestep, timestep_list[-1], timestep_list[-2] + m0, m1 = model_output_list[-1], model_output_list[-2] + lambda_t, lambda_s0, lambda_s1 = self.lambda_t[t], self.lambda_t[s0], self.lambda_t[s1] + alpha_t, alpha_s1 = self.alpha_t[t], self.alpha_t[s1] + sigma_t, sigma_s1 = self.sigma_t[t], self.sigma_t[s1] + h, h_0 = lambda_t - lambda_s1, lambda_s0 - lambda_s1 + r0 = h_0 / h + D0, D1 = m1, (1.0 / r0) * (m0 - m1) + if self.config.algorithm_type == "dpmsolver++": + # See https://arxiv.org/abs/2211.01095 for detailed derivations + if self.config.solver_type == "midpoint": + x_t = ( + (sigma_t / sigma_s1) * sample + - (alpha_t * (torch.exp(-h) - 1.0)) * D0 + - 0.5 * (alpha_t * (torch.exp(-h) - 1.0)) * D1 + ) + elif self.config.solver_type == "heun": + x_t = ( + (sigma_t / sigma_s1) * sample + - (alpha_t * (torch.exp(-h) - 1.0)) * D0 + + (alpha_t * ((torch.exp(-h) - 1.0) / h + 1.0)) * D1 + ) + elif self.config.algorithm_type == "dpmsolver": + # See https://arxiv.org/abs/2206.00927 for detailed derivations + if self.config.solver_type == "midpoint": + x_t = ( + (alpha_t / alpha_s1) * sample + - (sigma_t * (torch.exp(h) - 1.0)) * D0 + - 0.5 * (sigma_t * (torch.exp(h) - 1.0)) * D1 + ) + elif self.config.solver_type == "heun": + x_t = ( + (alpha_t / alpha_s1) * sample + - (sigma_t * (torch.exp(h) - 1.0)) * D0 + - (sigma_t * ((torch.exp(h) - 1.0) / h - 1.0)) * D1 + ) + return x_t + + def singlestep_dpm_solver_third_order_update( + self, + model_output_list: List[torch.FloatTensor], + timestep_list: List[int], + prev_timestep: int, + sample: torch.FloatTensor, + ) -> torch.FloatTensor: + """ + One step for the third-order singlestep DPM-Solver. + + It computes the solution at time `prev_timestep` from the time `timestep_list[-3]`. + + Args: + model_output_list (`List[torch.FloatTensor]`): + direct outputs from learned diffusion model at current and latter timesteps. + timestep (`int`): current and latter discrete timestep in the diffusion chain. + prev_timestep (`int`): previous discrete timestep in the diffusion chain. + sample (`torch.FloatTensor`): + current instance of sample being created by diffusion process. + + Returns: + `torch.FloatTensor`: the sample tensor at the previous timestep. 
+ """ + t, s0, s1, s2 = prev_timestep, timestep_list[-1], timestep_list[-2], timestep_list[-3] + m0, m1, m2 = model_output_list[-1], model_output_list[-2], model_output_list[-3] + lambda_t, lambda_s0, lambda_s1, lambda_s2 = ( + self.lambda_t[t], + self.lambda_t[s0], + self.lambda_t[s1], + self.lambda_t[s2], + ) + alpha_t, alpha_s2 = self.alpha_t[t], self.alpha_t[s2] + sigma_t, sigma_s2 = self.sigma_t[t], self.sigma_t[s2] + h, h_0, h_1 = lambda_t - lambda_s2, lambda_s0 - lambda_s2, lambda_s1 - lambda_s2 + r0, r1 = h_0 / h, h_1 / h + D0 = m2 + D1_0, D1_1 = (1.0 / r1) * (m1 - m2), (1.0 / r0) * (m0 - m2) + D1 = (r0 * D1_0 - r1 * D1_1) / (r0 - r1) + D2 = 2.0 * (D1_1 - D1_0) / (r0 - r1) + if self.config.algorithm_type == "dpmsolver++": + # See https://arxiv.org/abs/2206.00927 for detailed derivations + if self.config.solver_type == "midpoint": + x_t = ( + (sigma_t / sigma_s2) * sample + - (alpha_t * (torch.exp(-h) - 1.0)) * D0 + + (alpha_t * ((torch.exp(-h) - 1.0) / h + 1.0)) * D1_1 + ) + elif self.config.solver_type == "heun": + x_t = ( + (sigma_t / sigma_s2) * sample + - (alpha_t * (torch.exp(-h) - 1.0)) * D0 + + (alpha_t * ((torch.exp(-h) - 1.0) / h + 1.0)) * D1 + - (alpha_t * ((torch.exp(-h) - 1.0 + h) / h**2 - 0.5)) * D2 + ) + elif self.config.algorithm_type == "dpmsolver": + # See https://arxiv.org/abs/2206.00927 for detailed derivations + if self.config.solver_type == "midpoint": + x_t = ( + (alpha_t / alpha_s2) * sample + - (sigma_t * (torch.exp(h) - 1.0)) * D0 + - (sigma_t * ((torch.exp(h) - 1.0) / h - 1.0)) * D1_1 + ) + elif self.config.solver_type == "heun": + x_t = ( + (alpha_t / alpha_s2) * sample + - (sigma_t * (torch.exp(h) - 1.0)) * D0 + - (sigma_t * ((torch.exp(h) - 1.0) / h - 1.0)) * D1 + - (sigma_t * ((torch.exp(h) - 1.0 - h) / h**2 - 0.5)) * D2 + ) + return x_t + + def singlestep_dpm_solver_update( + self, + model_output_list: List[torch.FloatTensor], + timestep_list: List[int], + prev_timestep: int, + sample: torch.FloatTensor, + order: int, + ) -> torch.FloatTensor: + """ + One step for the singlestep DPM-Solver. + + Args: + model_output_list (`List[torch.FloatTensor]`): + direct outputs from learned diffusion model at current and latter timesteps. + timestep (`int`): current and latter discrete timestep in the diffusion chain. + prev_timestep (`int`): previous discrete timestep in the diffusion chain. + sample (`torch.FloatTensor`): + current instance of sample being created by diffusion process. + order (`int`): + the solver order at this step. + + Returns: + `torch.FloatTensor`: the sample tensor at the previous timestep. + """ + if order == 1: + return self.dpm_solver_first_order_update(model_output_list[-1], timestep_list[-1], prev_timestep, sample) + elif order == 2: + return self.singlestep_dpm_solver_second_order_update( + model_output_list, timestep_list, prev_timestep, sample + ) + elif order == 3: + return self.singlestep_dpm_solver_third_order_update( + model_output_list, timestep_list, prev_timestep, sample + ) + else: + raise ValueError(f"Order must be 1, 2, 3, got {order}") + + def step( + self, + model_output: torch.FloatTensor, + timestep: int, + sample: torch.FloatTensor, + return_dict: bool = True, + ) -> Union[SchedulerOutput, Tuple]: + """ + Step function propagating the sample with the singlestep DPM-Solver. + + Args: + model_output (`torch.FloatTensor`): direct output from learned diffusion model. + timestep (`int`): current discrete timestep in the diffusion chain. 
+ sample (`torch.FloatTensor`): + current instance of sample being created by diffusion process. + return_dict (`bool`): option for returning tuple rather than SchedulerOutput class + + Returns: + [`~scheduling_utils.SchedulerOutput`] or `tuple`: [`~scheduling_utils.SchedulerOutput`] if `return_dict` is + True, otherwise a `tuple`. When returning a tuple, the first element is the sample tensor. + + """ + if self.num_inference_steps is None: + raise ValueError( + "Number of inference steps is 'None', you need to run 'set_timesteps' after creating the scheduler" + ) + + if isinstance(timestep, torch.Tensor): + timestep = timestep.to(self.timesteps.device) + step_index = (self.timesteps == timestep).nonzero() + if len(step_index) == 0: + step_index = len(self.timesteps) - 1 + else: + step_index = step_index.item() + prev_timestep = 0 if step_index == len(self.timesteps) - 1 else self.timesteps[step_index + 1] + + model_output = self.convert_model_output(model_output, timestep, sample) + for i in range(self.config.solver_order - 1): + self.model_outputs[i] = self.model_outputs[i + 1] + self.model_outputs[-1] = model_output + + order = self.order_list[step_index] + + # For img2img denoising might start with order>1 which is not possible + # In this case make sure that the first two steps are both order=1 + while self.model_outputs[-order] is None: + order -= 1 + + # For single-step solvers, we use the initial value at each time with order = 1. + if order == 1: + self.sample = sample + + timestep_list = [self.timesteps[step_index - i] for i in range(order - 1, 0, -1)] + [timestep] + prev_sample = self.singlestep_dpm_solver_update( + self.model_outputs, timestep_list, prev_timestep, self.sample, order + ) + + if not return_dict: + return (prev_sample,) + + return SchedulerOutput(prev_sample=prev_sample) + + def scale_model_input(self, sample: torch.FloatTensor, *args, **kwargs) -> torch.FloatTensor: + """ + Ensures interchangeability with schedulers that need to scale the denoising model input depending on the + current timestep. 
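+        For this scheduler no input scaling is required, so the sample is returned unchanged.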
+ + Args: + sample (`torch.FloatTensor`): input sample + + Returns: + `torch.FloatTensor`: scaled input sample + """ + return sample + + # Copied from diffusers.schedulers.scheduling_ddpm.DDPMScheduler.add_noise + def add_noise( + self, + original_samples: torch.FloatTensor, + noise: torch.FloatTensor, + timesteps: torch.IntTensor, + ) -> torch.FloatTensor: + # Make sure alphas_cumprod and timestep have same device and dtype as original_samples + alphas_cumprod = self.alphas_cumprod.to(device=original_samples.device, dtype=original_samples.dtype) + timesteps = timesteps.to(original_samples.device) + + sqrt_alpha_prod = alphas_cumprod[timesteps] ** 0.5 + sqrt_alpha_prod = sqrt_alpha_prod.flatten() + while len(sqrt_alpha_prod.shape) < len(original_samples.shape): + sqrt_alpha_prod = sqrt_alpha_prod.unsqueeze(-1) + + sqrt_one_minus_alpha_prod = (1 - alphas_cumprod[timesteps]) ** 0.5 + sqrt_one_minus_alpha_prod = sqrt_one_minus_alpha_prod.flatten() + while len(sqrt_one_minus_alpha_prod.shape) < len(original_samples.shape): + sqrt_one_minus_alpha_prod = sqrt_one_minus_alpha_prod.unsqueeze(-1) + + noisy_samples = sqrt_alpha_prod * original_samples + sqrt_one_minus_alpha_prod * noise + return noisy_samples + + def __len__(self): + return self.config.num_train_timesteps diff --git a/gf/sd-webui-rich-text/diffusers_official/schedulers/scheduling_euler_ancestral_discrete.py b/gf/sd-webui-rich-text/diffusers_official/schedulers/scheduling_euler_ancestral_discrete.py new file mode 100644 index 0000000000000000000000000000000000000000..065f657032e6ef21bd022f938a3b1e7ada334436 --- /dev/null +++ b/gf/sd-webui-rich-text/diffusers_official/schedulers/scheduling_euler_ancestral_discrete.py @@ -0,0 +1,358 @@ +# Copyright 2023 Katherine Crowson and The HuggingFace Team. All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +import math +from dataclasses import dataclass +from typing import List, Optional, Tuple, Union + +import numpy as np +import torch + +from ..configuration_utils import ConfigMixin, register_to_config +from ..utils import BaseOutput, logging, randn_tensor +from .scheduling_utils import KarrasDiffusionSchedulers, SchedulerMixin + + +logger = logging.get_logger(__name__) # pylint: disable=invalid-name + + +@dataclass +# Copied from diffusers.schedulers.scheduling_ddpm.DDPMSchedulerOutput with DDPM->EulerAncestralDiscrete +class EulerAncestralDiscreteSchedulerOutput(BaseOutput): + """ + Output class for the scheduler's step function output. + + Args: + prev_sample (`torch.FloatTensor` of shape `(batch_size, num_channels, height, width)` for images): + Computed sample (x_{t-1}) of previous timestep. `prev_sample` should be used as next model input in the + denoising loop. + pred_original_sample (`torch.FloatTensor` of shape `(batch_size, num_channels, height, width)` for images): + The predicted denoised sample (x_{0}) based on the model output from the current timestep. + `pred_original_sample` can be used to preview progress or for guidance. 
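+            For this ancestral sampler it is the x_0 estimate computed in `step` before the ancestral noise is
+            re-injected (note added for clarity; not part of the upstream docstring).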
+ """ + + prev_sample: torch.FloatTensor + pred_original_sample: Optional[torch.FloatTensor] = None + + +# Copied from diffusers.schedulers.scheduling_ddpm.betas_for_alpha_bar +def betas_for_alpha_bar( + num_diffusion_timesteps, + max_beta=0.999, + alpha_transform_type="cosine", +): + """ + Create a beta schedule that discretizes the given alpha_t_bar function, which defines the cumulative product of + (1-beta) over time from t = [0,1]. + + Contains a function alpha_bar that takes an argument t and transforms it to the cumulative product of (1-beta) up + to that part of the diffusion process. + + + Args: + num_diffusion_timesteps (`int`): the number of betas to produce. + max_beta (`float`): the maximum beta to use; use values lower than 1 to + prevent singularities. + alpha_transform_type (`str`, *optional*, default to `cosine`): the type of noise schedule for alpha_bar. + Choose from `cosine` or `exp` + + Returns: + betas (`np.ndarray`): the betas used by the scheduler to step the model outputs + """ + if alpha_transform_type == "cosine": + + def alpha_bar_fn(t): + return math.cos((t + 0.008) / 1.008 * math.pi / 2) ** 2 + + elif alpha_transform_type == "exp": + + def alpha_bar_fn(t): + return math.exp(t * -12.0) + + else: + raise ValueError(f"Unsupported alpha_tranform_type: {alpha_transform_type}") + + betas = [] + for i in range(num_diffusion_timesteps): + t1 = i / num_diffusion_timesteps + t2 = (i + 1) / num_diffusion_timesteps + betas.append(min(1 - alpha_bar_fn(t2) / alpha_bar_fn(t1), max_beta)) + return torch.tensor(betas, dtype=torch.float32) + + +class EulerAncestralDiscreteScheduler(SchedulerMixin, ConfigMixin): + """ + Ancestral sampling with Euler method steps. Based on the original k-diffusion implementation by Katherine Crowson: + https://github.com/crowsonkb/k-diffusion/blob/481677d114f6ea445aa009cf5bd7a9cdee909e47/k_diffusion/sampling.py#L72 + + [`~ConfigMixin`] takes care of storing all config attributes that are passed in the scheduler's `__init__` + function, such as `num_train_timesteps`. They can be accessed via `scheduler.config.num_train_timesteps`. + [`SchedulerMixin`] provides general loading and saving functionality via the [`SchedulerMixin.save_pretrained`] and + [`~SchedulerMixin.from_pretrained`] functions. + + Args: + num_train_timesteps (`int`): number of diffusion steps used to train the model. + beta_start (`float`): the starting `beta` value of inference. + beta_end (`float`): the final `beta` value. + beta_schedule (`str`): + the beta schedule, a mapping from a beta range to a sequence of betas for stepping the model. Choose from + `linear` or `scaled_linear`. + trained_betas (`np.ndarray`, optional): + option to pass an array of betas directly to the constructor to bypass `beta_start`, `beta_end` etc. + prediction_type (`str`, default `epsilon`, optional): + prediction type of the scheduler function, one of `epsilon` (predicting the noise of the diffusion + process), `sample` (directly predicting the noisy sample`) or `v_prediction` (see section 2.4 + https://imagen.research.google/video/paper.pdf) + timestep_spacing (`str`, default `"linspace"`): + The way the timesteps should be scaled. Refer to Table 2. of [Common Diffusion Noise Schedules and Sample + Steps are Flawed](https://arxiv.org/abs/2305.08891) for more information. + steps_offset (`int`, default `0`): + an offset added to the inference steps. 
You can use a combination of `offset=1` and + `set_alpha_to_one=False`, to make the last step use step 0 for the previous alpha product, as done in + stable diffusion. + """ + + _compatibles = [e.name for e in KarrasDiffusionSchedulers] + order = 1 + + @register_to_config + def __init__( + self, + num_train_timesteps: int = 1000, + beta_start: float = 0.0001, + beta_end: float = 0.02, + beta_schedule: str = "linear", + trained_betas: Optional[Union[np.ndarray, List[float]]] = None, + prediction_type: str = "epsilon", + timestep_spacing: str = "linspace", + steps_offset: int = 0, + ): + if trained_betas is not None: + self.betas = torch.tensor(trained_betas, dtype=torch.float32) + elif beta_schedule == "linear": + self.betas = torch.linspace(beta_start, beta_end, num_train_timesteps, dtype=torch.float32) + elif beta_schedule == "scaled_linear": + # this schedule is very specific to the latent diffusion model. + self.betas = ( + torch.linspace(beta_start**0.5, beta_end**0.5, num_train_timesteps, dtype=torch.float32) ** 2 + ) + elif beta_schedule == "squaredcos_cap_v2": + # Glide cosine schedule + self.betas = betas_for_alpha_bar(num_train_timesteps) + else: + raise NotImplementedError(f"{beta_schedule} does is not implemented for {self.__class__}") + + self.alphas = 1.0 - self.betas + self.alphas_cumprod = torch.cumprod(self.alphas, dim=0) + + sigmas = np.array(((1 - self.alphas_cumprod) / self.alphas_cumprod) ** 0.5) + sigmas = np.concatenate([sigmas[::-1], [0.0]]).astype(np.float32) + self.sigmas = torch.from_numpy(sigmas) + + # setable values + self.num_inference_steps = None + timesteps = np.linspace(0, num_train_timesteps - 1, num_train_timesteps, dtype=float)[::-1].copy() + self.timesteps = torch.from_numpy(timesteps) + self.is_scale_input_called = False + + @property + def init_noise_sigma(self): + # standard deviation of the initial noise distribution + if self.config.timestep_spacing in ["linspace", "trailing"]: + return self.sigmas.max() + + return (self.sigmas.max() ** 2 + 1) ** 0.5 + + def scale_model_input( + self, sample: torch.FloatTensor, timestep: Union[float, torch.FloatTensor] + ) -> torch.FloatTensor: + """ + Scales the denoising model input by `(sigma**2 + 1) ** 0.5` to match the Euler algorithm. + + Args: + sample (`torch.FloatTensor`): input sample + timestep (`float` or `torch.FloatTensor`): the current timestep in the diffusion chain + + Returns: + `torch.FloatTensor`: scaled input sample + """ + if isinstance(timestep, torch.Tensor): + timestep = timestep.to(self.timesteps.device) + step_index = (self.timesteps == timestep).nonzero().item() + sigma = self.sigmas[step_index] + sample = sample / ((sigma**2 + 1) ** 0.5) + self.is_scale_input_called = True + return sample + + def set_timesteps(self, num_inference_steps: int, device: Union[str, torch.device] = None): + """ + Sets the timesteps used for the diffusion chain. Supporting function to be run before inference. + + Args: + num_inference_steps (`int`): + the number of diffusion steps used when generating samples with a pre-trained model. + device (`str` or `torch.device`, optional): + the device to which the timesteps should be moved to. If `None`, the timesteps are not moved. + """ + self.num_inference_steps = num_inference_steps + + # "linspace", "leading", "trailing" corresponds to annotation of Table 2. 
of https://arxiv.org/abs/2305.08891 + if self.config.timestep_spacing == "linspace": + timesteps = np.linspace(0, self.config.num_train_timesteps - 1, num_inference_steps, dtype=float)[ + ::-1 + ].copy() + elif self.config.timestep_spacing == "leading": + step_ratio = self.config.num_train_timesteps // self.num_inference_steps + # creates integer timesteps by multiplying by ratio + # casting to int to avoid issues when num_inference_step is power of 3 + timesteps = (np.arange(0, num_inference_steps) * step_ratio).round()[::-1].copy().astype(float) + timesteps += self.config.steps_offset + elif self.config.timestep_spacing == "trailing": + step_ratio = self.config.num_train_timesteps / self.num_inference_steps + # creates integer timesteps by multiplying by ratio + # casting to int to avoid issues when num_inference_step is power of 3 + timesteps = (np.arange(self.config.num_train_timesteps, 0, -step_ratio)).round().copy().astype(float) + timesteps -= 1 + else: + raise ValueError( + f"{self.config.timestep_spacing} is not supported. Please make sure to choose one of 'linspace', 'leading' or 'trailing'." + ) + + sigmas = np.array(((1 - self.alphas_cumprod) / self.alphas_cumprod) ** 0.5) + sigmas = np.interp(timesteps, np.arange(0, len(sigmas)), sigmas) + sigmas = np.concatenate([sigmas, [0.0]]).astype(np.float32) + self.sigmas = torch.from_numpy(sigmas).to(device=device) + if str(device).startswith("mps"): + # mps does not support float64 + self.timesteps = torch.from_numpy(timesteps).to(device, dtype=torch.float32) + else: + self.timesteps = torch.from_numpy(timesteps).to(device=device) + + def step( + self, + model_output: torch.FloatTensor, + timestep: Union[float, torch.FloatTensor], + sample: torch.FloatTensor, + generator: Optional[torch.Generator] = None, + return_dict: bool = True, + ) -> Union[EulerAncestralDiscreteSchedulerOutput, Tuple]: + """ + Predict the sample at the previous timestep by reversing the SDE. Core function to propagate the diffusion + process from the learned model outputs (most often the predicted noise). + + Args: + model_output (`torch.FloatTensor`): direct output from learned diffusion model. + timestep (`float`): current timestep in the diffusion chain. + sample (`torch.FloatTensor`): + current instance of sample being created by diffusion process. + generator (`torch.Generator`, optional): Random number generator. + return_dict (`bool`): option for returning tuple rather than EulerAncestralDiscreteSchedulerOutput class + + Returns: + [`~schedulers.scheduling_utils.EulerAncestralDiscreteSchedulerOutput`] or `tuple`: + [`~schedulers.scheduling_utils.EulerAncestralDiscreteSchedulerOutput`] if `return_dict` is True, otherwise + a `tuple`. When returning a tuple, the first element is the sample tensor. + + """ + + if ( + isinstance(timestep, int) + or isinstance(timestep, torch.IntTensor) + or isinstance(timestep, torch.LongTensor) + ): + raise ValueError( + ( + "Passing integer indices (e.g. from `enumerate(timesteps)`) as timesteps to" + " `EulerDiscreteScheduler.step()` is not supported. Make sure to pass" + " one of the `scheduler.timesteps` as a timestep." + ), + ) + + if not self.is_scale_input_called: + logger.warning( + "The `scale_model_input` function should be called before `step` to ensure correct denoising. " + "See `StableDiffusionPipeline` for a usage example." 
+ ) + + if isinstance(timestep, torch.Tensor): + timestep = timestep.to(self.timesteps.device) + + step_index = (self.timesteps == timestep).nonzero().item() + sigma = self.sigmas[step_index] + + # 1. compute predicted original sample (x_0) from sigma-scaled predicted noise + if self.config.prediction_type == "epsilon": + pred_original_sample = sample - sigma * model_output + elif self.config.prediction_type == "v_prediction": + # * c_out + input * c_skip + pred_original_sample = model_output * (-sigma / (sigma**2 + 1) ** 0.5) + (sample / (sigma**2 + 1)) + elif self.config.prediction_type == "sample": + raise NotImplementedError("prediction_type not implemented yet: sample") + else: + raise ValueError( + f"prediction_type given as {self.config.prediction_type} must be one of `epsilon`, or `v_prediction`" + ) + + sigma_from = self.sigmas[step_index] + sigma_to = self.sigmas[step_index + 1] + sigma_up = (sigma_to**2 * (sigma_from**2 - sigma_to**2) / sigma_from**2) ** 0.5 + sigma_down = (sigma_to**2 - sigma_up**2) ** 0.5 + + # 2. Convert to an ODE derivative + derivative = (sample - pred_original_sample) / sigma + + dt = sigma_down - sigma + + prev_sample = sample + derivative * dt + + device = model_output.device + noise = randn_tensor(model_output.shape, dtype=model_output.dtype, device=device, generator=generator) + + prev_sample = prev_sample + noise * sigma_up + + if not return_dict: + return (prev_sample,) + + return EulerAncestralDiscreteSchedulerOutput( + prev_sample=prev_sample, pred_original_sample=pred_original_sample + ) + + # Copied from diffusers.schedulers.scheduling_euler_discrete.EulerDiscreteScheduler.add_noise + def add_noise( + self, + original_samples: torch.FloatTensor, + noise: torch.FloatTensor, + timesteps: torch.FloatTensor, + ) -> torch.FloatTensor: + # Make sure sigmas and timesteps have the same device and dtype as original_samples + sigmas = self.sigmas.to(device=original_samples.device, dtype=original_samples.dtype) + if original_samples.device.type == "mps" and torch.is_floating_point(timesteps): + # mps does not support float64 + schedule_timesteps = self.timesteps.to(original_samples.device, dtype=torch.float32) + timesteps = timesteps.to(original_samples.device, dtype=torch.float32) + else: + schedule_timesteps = self.timesteps.to(original_samples.device) + timesteps = timesteps.to(original_samples.device) + + step_indices = [(schedule_timesteps == t).nonzero().item() for t in timesteps] + + sigma = sigmas[step_indices].flatten() + while len(sigma.shape) < len(original_samples.shape): + sigma = sigma.unsqueeze(-1) + + noisy_samples = original_samples + noise * sigma + return noisy_samples + + def __len__(self): + return self.config.num_train_timesteps diff --git a/gf/sd-webui-rich-text/diffusers_official/schedulers/scheduling_euler_discrete.py b/gf/sd-webui-rich-text/diffusers_official/schedulers/scheduling_euler_discrete.py new file mode 100644 index 0000000000000000000000000000000000000000..cb126d4b953cd28e23d048c4f1e2cf8ed90cdac0 --- /dev/null +++ b/gf/sd-webui-rich-text/diffusers_official/schedulers/scheduling_euler_discrete.py @@ -0,0 +1,432 @@ +# Copyright 2023 Katherine Crowson and The HuggingFace Team. All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. 
+# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +import math +from dataclasses import dataclass +from typing import List, Optional, Tuple, Union + +import numpy as np +import torch + +from ..configuration_utils import ConfigMixin, register_to_config +from ..utils import BaseOutput, logging, randn_tensor +from .scheduling_utils import KarrasDiffusionSchedulers, SchedulerMixin + + +logger = logging.get_logger(__name__) # pylint: disable=invalid-name + + +@dataclass +# Copied from diffusers.schedulers.scheduling_ddpm.DDPMSchedulerOutput with DDPM->EulerDiscrete +class EulerDiscreteSchedulerOutput(BaseOutput): + """ + Output class for the scheduler's step function output. + + Args: + prev_sample (`torch.FloatTensor` of shape `(batch_size, num_channels, height, width)` for images): + Computed sample (x_{t-1}) of previous timestep. `prev_sample` should be used as next model input in the + denoising loop. + pred_original_sample (`torch.FloatTensor` of shape `(batch_size, num_channels, height, width)` for images): + The predicted denoised sample (x_{0}) based on the model output from the current timestep. + `pred_original_sample` can be used to preview progress or for guidance. + """ + + prev_sample: torch.FloatTensor + pred_original_sample: Optional[torch.FloatTensor] = None + + +# Copied from diffusers.schedulers.scheduling_ddpm.betas_for_alpha_bar +def betas_for_alpha_bar( + num_diffusion_timesteps, + max_beta=0.999, + alpha_transform_type="cosine", +): + """ + Create a beta schedule that discretizes the given alpha_t_bar function, which defines the cumulative product of + (1-beta) over time from t = [0,1]. + + Contains a function alpha_bar that takes an argument t and transforms it to the cumulative product of (1-beta) up + to that part of the diffusion process. + + + Args: + num_diffusion_timesteps (`int`): the number of betas to produce. + max_beta (`float`): the maximum beta to use; use values lower than 1 to + prevent singularities. + alpha_transform_type (`str`, *optional*, default to `cosine`): the type of noise schedule for alpha_bar. + Choose from `cosine` or `exp` + + Returns: + betas (`np.ndarray`): the betas used by the scheduler to step the model outputs + """ + if alpha_transform_type == "cosine": + + def alpha_bar_fn(t): + return math.cos((t + 0.008) / 1.008 * math.pi / 2) ** 2 + + elif alpha_transform_type == "exp": + + def alpha_bar_fn(t): + return math.exp(t * -12.0) + + else: + raise ValueError(f"Unsupported alpha_tranform_type: {alpha_transform_type}") + + betas = [] + for i in range(num_diffusion_timesteps): + t1 = i / num_diffusion_timesteps + t2 = (i + 1) / num_diffusion_timesteps + betas.append(min(1 - alpha_bar_fn(t2) / alpha_bar_fn(t1), max_beta)) + return torch.tensor(betas, dtype=torch.float32) + + +class EulerDiscreteScheduler(SchedulerMixin, ConfigMixin): + """ + Euler scheduler (Algorithm 2) from Karras et al. (2022) https://arxiv.org/abs/2206.00364. . 
Based on the original + k-diffusion implementation by Katherine Crowson: + https://github.com/crowsonkb/k-diffusion/blob/481677d114f6ea445aa009cf5bd7a9cdee909e47/k_diffusion/sampling.py#L51 + + [`~ConfigMixin`] takes care of storing all config attributes that are passed in the scheduler's `__init__` + function, such as `num_train_timesteps`. They can be accessed via `scheduler.config.num_train_timesteps`. + [`SchedulerMixin`] provides general loading and saving functionality via the [`SchedulerMixin.save_pretrained`] and + [`~SchedulerMixin.from_pretrained`] functions. + + Args: + num_train_timesteps (`int`): number of diffusion steps used to train the model. + beta_start (`float`): the starting `beta` value of inference. + beta_end (`float`): the final `beta` value. + beta_schedule (`str`): + the beta schedule, a mapping from a beta range to a sequence of betas for stepping the model. Choose from + `linear` or `scaled_linear`. + trained_betas (`np.ndarray`, optional): + option to pass an array of betas directly to the constructor to bypass `beta_start`, `beta_end` etc. + prediction_type (`str`, default `"epsilon"`, optional): + prediction type of the scheduler function, one of `epsilon` (predicting the noise of the diffusion + process), `sample` (directly predicting the noisy sample`) or `v_prediction` (see section 2.4 + https://imagen.research.google/video/paper.pdf) + interpolation_type (`str`, default `"linear"`, optional): + interpolation type to compute intermediate sigmas for the scheduler denoising steps. Should be one of + [`"linear"`, `"log_linear"`]. + use_karras_sigmas (`bool`, *optional*, defaults to `False`): + This parameter controls whether to use Karras sigmas (Karras et al. (2022) scheme) for step sizes in the + noise schedule during the sampling process. If True, the sigmas will be determined according to a sequence + of noise levels {σi} as defined in Equation (5) of the paper https://arxiv.org/pdf/2206.00364.pdf. + timestep_spacing (`str`, default `"linspace"`): + The way the timesteps should be scaled. Refer to Table 2. of [Common Diffusion Noise Schedules and Sample + Steps are Flawed](https://arxiv.org/abs/2305.08891) for more information. + steps_offset (`int`, default `0`): + an offset added to the inference steps. You can use a combination of `offset=1` and + `set_alpha_to_one=False`, to make the last step use step 0 for the previous alpha product, as done in + stable diffusion. + """ + + _compatibles = [e.name for e in KarrasDiffusionSchedulers] + order = 1 + + @register_to_config + def __init__( + self, + num_train_timesteps: int = 1000, + beta_start: float = 0.0001, + beta_end: float = 0.02, + beta_schedule: str = "linear", + trained_betas: Optional[Union[np.ndarray, List[float]]] = None, + prediction_type: str = "epsilon", + interpolation_type: str = "linear", + use_karras_sigmas: Optional[bool] = False, + timestep_spacing: str = "linspace", + steps_offset: int = 0, + ): + if trained_betas is not None: + self.betas = torch.tensor(trained_betas, dtype=torch.float32) + elif beta_schedule == "linear": + self.betas = torch.linspace(beta_start, beta_end, num_train_timesteps, dtype=torch.float32) + elif beta_schedule == "scaled_linear": + # this schedule is very specific to the latent diffusion model. 
+ self.betas = ( + torch.linspace(beta_start**0.5, beta_end**0.5, num_train_timesteps, dtype=torch.float32) ** 2 + ) + elif beta_schedule == "squaredcos_cap_v2": + # Glide cosine schedule + self.betas = betas_for_alpha_bar(num_train_timesteps) + else: + raise NotImplementedError(f"{beta_schedule} does is not implemented for {self.__class__}") + + self.alphas = 1.0 - self.betas + self.alphas_cumprod = torch.cumprod(self.alphas, dim=0) + + sigmas = np.array(((1 - self.alphas_cumprod) / self.alphas_cumprod) ** 0.5) + sigmas = np.concatenate([sigmas[::-1], [0.0]]).astype(np.float32) + self.sigmas = torch.from_numpy(sigmas) + + # setable values + self.num_inference_steps = None + timesteps = np.linspace(0, num_train_timesteps - 1, num_train_timesteps, dtype=float)[::-1].copy() + self.timesteps = torch.from_numpy(timesteps) + self.is_scale_input_called = False + self.use_karras_sigmas = use_karras_sigmas + + @property + def init_noise_sigma(self): + # standard deviation of the initial noise distribution + if self.config.timestep_spacing in ["linspace", "trailing"]: + return self.sigmas.max() + + return (self.sigmas.max() ** 2 + 1) ** 0.5 + + def scale_model_input( + self, sample: torch.FloatTensor, timestep: Union[float, torch.FloatTensor] + ) -> torch.FloatTensor: + """ + Scales the denoising model input by `(sigma**2 + 1) ** 0.5` to match the Euler algorithm. + + Args: + sample (`torch.FloatTensor`): input sample + timestep (`float` or `torch.FloatTensor`): the current timestep in the diffusion chain + + Returns: + `torch.FloatTensor`: scaled input sample + """ + if isinstance(timestep, torch.Tensor): + timestep = timestep.to(self.timesteps.device) + step_index = (self.timesteps == timestep).nonzero().item() + sigma = self.sigmas[step_index] + + sample = sample / ((sigma**2 + 1) ** 0.5) + + self.is_scale_input_called = True + return sample + + def set_timesteps(self, num_inference_steps: int, device: Union[str, torch.device] = None): + """ + Sets the timesteps used for the diffusion chain. Supporting function to be run before inference. + + Args: + num_inference_steps (`int`): + the number of diffusion steps used when generating samples with a pre-trained model. + device (`str` or `torch.device`, optional): + the device to which the timesteps should be moved to. If `None`, the timesteps are not moved. + """ + self.num_inference_steps = num_inference_steps + + # "linspace", "leading", "trailing" corresponds to annotation of Table 2. of https://arxiv.org/abs/2305.08891 + if self.config.timestep_spacing == "linspace": + timesteps = np.linspace(0, self.config.num_train_timesteps - 1, num_inference_steps, dtype=float)[ + ::-1 + ].copy() + elif self.config.timestep_spacing == "leading": + step_ratio = self.config.num_train_timesteps // self.num_inference_steps + # creates integer timesteps by multiplying by ratio + # casting to int to avoid issues when num_inference_step is power of 3 + timesteps = (np.arange(0, num_inference_steps) * step_ratio).round()[::-1].copy().astype(float) + timesteps += self.config.steps_offset + elif self.config.timestep_spacing == "trailing": + step_ratio = self.config.num_train_timesteps / self.num_inference_steps + # creates integer timesteps by multiplying by ratio + # casting to int to avoid issues when num_inference_step is power of 3 + timesteps = (np.arange(self.config.num_train_timesteps, 0, -step_ratio)).round().copy().astype(float) + timesteps -= 1 + else: + raise ValueError( + f"{self.config.timestep_spacing} is not supported. 
Please make sure to choose one of 'linspace', 'leading' or 'trailing'." + ) + + sigmas = np.array(((1 - self.alphas_cumprod) / self.alphas_cumprod) ** 0.5) + log_sigmas = np.log(sigmas) + + if self.config.interpolation_type == "linear": + sigmas = np.interp(timesteps, np.arange(0, len(sigmas)), sigmas) + elif self.config.interpolation_type == "log_linear": + sigmas = torch.linspace(np.log(sigmas[-1]), np.log(sigmas[0]), num_inference_steps + 1).exp() + else: + raise ValueError( + f"{self.config.interpolation_type} is not implemented. Please specify interpolation_type to either" + " 'linear' or 'log_linear'" + ) + + if self.use_karras_sigmas: + sigmas = self._convert_to_karras(in_sigmas=sigmas, num_inference_steps=self.num_inference_steps) + timesteps = np.array([self._sigma_to_t(sigma, log_sigmas) for sigma in sigmas]) + + sigmas = np.concatenate([sigmas, [0.0]]).astype(np.float32) + self.sigmas = torch.from_numpy(sigmas).to(device=device) + if str(device).startswith("mps"): + # mps does not support float64 + self.timesteps = torch.from_numpy(timesteps).to(device, dtype=torch.float32) + else: + self.timesteps = torch.from_numpy(timesteps).to(device=device) + + def _sigma_to_t(self, sigma, log_sigmas): + # get log sigma + log_sigma = np.log(sigma) + + # get distribution + dists = log_sigma - log_sigmas[:, np.newaxis] + + # get sigmas range + low_idx = np.cumsum((dists >= 0), axis=0).argmax(axis=0).clip(max=log_sigmas.shape[0] - 2) + high_idx = low_idx + 1 + + low = log_sigmas[low_idx] + high = log_sigmas[high_idx] + + # interpolate sigmas + w = (low - log_sigma) / (low - high) + w = np.clip(w, 0, 1) + + # transform interpolation to time range + t = (1 - w) * low_idx + w * high_idx + t = t.reshape(sigma.shape) + return t + + # Copied from https://github.com/crowsonkb/k-diffusion/blob/686dbad0f39640ea25c8a8c6a6e56bb40eacefa2/k_diffusion/sampling.py#L17 + def _convert_to_karras(self, in_sigmas: torch.FloatTensor, num_inference_steps) -> torch.FloatTensor: + """Constructs the noise schedule of Karras et al. (2022).""" + + sigma_min: float = in_sigmas[-1].item() + sigma_max: float = in_sigmas[0].item() + + rho = 7.0 # 7.0 is the value used in the paper + ramp = np.linspace(0, 1, num_inference_steps) + min_inv_rho = sigma_min ** (1 / rho) + max_inv_rho = sigma_max ** (1 / rho) + sigmas = (max_inv_rho + ramp * (min_inv_rho - max_inv_rho)) ** rho + return sigmas + + def step( + self, + model_output: torch.FloatTensor, + timestep: Union[float, torch.FloatTensor], + sample: torch.FloatTensor, + s_churn: float = 0.0, + s_tmin: float = 0.0, + s_tmax: float = float("inf"), + s_noise: float = 1.0, + generator: Optional[torch.Generator] = None, + return_dict: bool = True, + ) -> Union[EulerDiscreteSchedulerOutput, Tuple]: + """ + Predict the sample at the previous timestep by reversing the SDE. Core function to propagate the diffusion + process from the learned model outputs (most often the predicted noise). + + Args: + model_output (`torch.FloatTensor`): direct output from learned diffusion model. + timestep (`float`): current timestep in the diffusion chain. + sample (`torch.FloatTensor`): + current instance of sample being created by diffusion process. + s_churn (`float`) + s_tmin (`float`) + s_tmax (`float`) + s_noise (`float`) + generator (`torch.Generator`, optional): Random number generator. 
+ return_dict (`bool`): option for returning tuple rather than EulerDiscreteSchedulerOutput class + + Returns: + [`~schedulers.scheduling_utils.EulerDiscreteSchedulerOutput`] or `tuple`: + [`~schedulers.scheduling_utils.EulerDiscreteSchedulerOutput`] if `return_dict` is True, otherwise a + `tuple`. When returning a tuple, the first element is the sample tensor. + + """ + + if ( + isinstance(timestep, int) + or isinstance(timestep, torch.IntTensor) + or isinstance(timestep, torch.LongTensor) + ): + raise ValueError( + ( + "Passing integer indices (e.g. from `enumerate(timesteps)`) as timesteps to" + " `EulerDiscreteScheduler.step()` is not supported. Make sure to pass" + " one of the `scheduler.timesteps` as a timestep." + ), + ) + + if not self.is_scale_input_called: + logger.warning( + "The `scale_model_input` function should be called before `step` to ensure correct denoising. " + "See `StableDiffusionPipeline` for a usage example." + ) + + if isinstance(timestep, torch.Tensor): + timestep = timestep.to(self.timesteps.device) + + step_index = (self.timesteps == timestep).nonzero().item() + sigma = self.sigmas[step_index] + + gamma = min(s_churn / (len(self.sigmas) - 1), 2**0.5 - 1) if s_tmin <= sigma <= s_tmax else 0.0 + + noise = randn_tensor( + model_output.shape, dtype=model_output.dtype, device=model_output.device, generator=generator + ) + + eps = noise * s_noise + sigma_hat = sigma * (gamma + 1) + + if gamma > 0: + sample = sample + eps * (sigma_hat**2 - sigma**2) ** 0.5 + + # 1. compute predicted original sample (x_0) from sigma-scaled predicted noise + # NOTE: "original_sample" should not be an expected prediction_type but is left in for + # backwards compatibility + if self.config.prediction_type == "original_sample" or self.config.prediction_type == "sample": + pred_original_sample = model_output + elif self.config.prediction_type == "epsilon": + pred_original_sample = sample - sigma_hat * model_output + elif self.config.prediction_type == "v_prediction": + # * c_out + input * c_skip + pred_original_sample = model_output * (-sigma / (sigma**2 + 1) ** 0.5) + (sample / (sigma**2 + 1)) + else: + raise ValueError( + f"prediction_type given as {self.config.prediction_type} must be one of `epsilon`, or `v_prediction`" + ) + + # 2. 
Convert to an ODE derivative + derivative = (sample - pred_original_sample) / sigma_hat + + dt = self.sigmas[step_index + 1] - sigma_hat + + prev_sample = sample + derivative * dt + + if not return_dict: + return (prev_sample,) + + return EulerDiscreteSchedulerOutput(prev_sample=prev_sample, pred_original_sample=pred_original_sample) + + def add_noise( + self, + original_samples: torch.FloatTensor, + noise: torch.FloatTensor, + timesteps: torch.FloatTensor, + ) -> torch.FloatTensor: + # Make sure sigmas and timesteps have the same device and dtype as original_samples + sigmas = self.sigmas.to(device=original_samples.device, dtype=original_samples.dtype) + if original_samples.device.type == "mps" and torch.is_floating_point(timesteps): + # mps does not support float64 + schedule_timesteps = self.timesteps.to(original_samples.device, dtype=torch.float32) + timesteps = timesteps.to(original_samples.device, dtype=torch.float32) + else: + schedule_timesteps = self.timesteps.to(original_samples.device) + timesteps = timesteps.to(original_samples.device) + + step_indices = [(schedule_timesteps == t).nonzero().item() for t in timesteps] + + sigma = sigmas[step_indices].flatten() + while len(sigma.shape) < len(original_samples.shape): + sigma = sigma.unsqueeze(-1) + + noisy_samples = original_samples + noise * sigma + return noisy_samples + + def __len__(self): + return self.config.num_train_timesteps diff --git a/gf/sd-webui-rich-text/diffusers_official/schedulers/scheduling_heun_discrete.py b/gf/sd-webui-rich-text/diffusers_official/schedulers/scheduling_heun_discrete.py new file mode 100644 index 0000000000000000000000000000000000000000..5f694fd60fc9f7f596f0d28d19cc231a26712fd1 --- /dev/null +++ b/gf/sd-webui-rich-text/diffusers_official/schedulers/scheduling_heun_discrete.py @@ -0,0 +1,426 @@ +# Copyright 2023 Katherine Crowson, The HuggingFace Team and hlky. All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +import math +from collections import defaultdict +from typing import List, Optional, Tuple, Union + +import numpy as np +import torch + +from ..configuration_utils import ConfigMixin, register_to_config +from .scheduling_utils import KarrasDiffusionSchedulers, SchedulerMixin, SchedulerOutput + + +# Copied from diffusers.schedulers.scheduling_ddpm.betas_for_alpha_bar +def betas_for_alpha_bar( + num_diffusion_timesteps, + max_beta=0.999, + alpha_transform_type="cosine", +): + """ + Create a beta schedule that discretizes the given alpha_t_bar function, which defines the cumulative product of + (1-beta) over time from t = [0,1]. + + Contains a function alpha_bar that takes an argument t and transforms it to the cumulative product of (1-beta) up + to that part of the diffusion process. + + + Args: + num_diffusion_timesteps (`int`): the number of betas to produce. + max_beta (`float`): the maximum beta to use; use values lower than 1 to + prevent singularities. + alpha_transform_type (`str`, *optional*, default to `cosine`): the type of noise schedule for alpha_bar. 
+ Choose from `cosine` or `exp` + + Returns: + betas (`np.ndarray`): the betas used by the scheduler to step the model outputs + """ + if alpha_transform_type == "cosine": + + def alpha_bar_fn(t): + return math.cos((t + 0.008) / 1.008 * math.pi / 2) ** 2 + + elif alpha_transform_type == "exp": + + def alpha_bar_fn(t): + return math.exp(t * -12.0) + + else: + raise ValueError(f"Unsupported alpha_tranform_type: {alpha_transform_type}") + + betas = [] + for i in range(num_diffusion_timesteps): + t1 = i / num_diffusion_timesteps + t2 = (i + 1) / num_diffusion_timesteps + betas.append(min(1 - alpha_bar_fn(t2) / alpha_bar_fn(t1), max_beta)) + return torch.tensor(betas, dtype=torch.float32) + + +class HeunDiscreteScheduler(SchedulerMixin, ConfigMixin): + """ + Implements Algorithm 2 (Heun steps) from Karras et al. (2022). for discrete beta schedules. Based on the original + k-diffusion implementation by Katherine Crowson: + https://github.com/crowsonkb/k-diffusion/blob/481677d114f6ea445aa009cf5bd7a9cdee909e47/k_diffusion/sampling.py#L90 + + [`~ConfigMixin`] takes care of storing all config attributes that are passed in the scheduler's `__init__` + function, such as `num_train_timesteps`. They can be accessed via `scheduler.config.num_train_timesteps`. + [`SchedulerMixin`] provides general loading and saving functionality via the [`SchedulerMixin.save_pretrained`] and + [`~SchedulerMixin.from_pretrained`] functions. + + Args: + num_train_timesteps (`int`): number of diffusion steps used to train the model. beta_start (`float`): the + starting `beta` value of inference. beta_end (`float`): the final `beta` value. beta_schedule (`str`): + the beta schedule, a mapping from a beta range to a sequence of betas for stepping the model. Choose from + `linear` or `scaled_linear`. + trained_betas (`np.ndarray`, optional): + option to pass an array of betas directly to the constructor to bypass `beta_start`, `beta_end` etc. + prediction_type (`str`, default `epsilon`, optional): + prediction type of the scheduler function, one of `epsilon` (predicting the noise of the diffusion + process), `sample` (directly predicting the noisy sample`) or `v_prediction` (see section 2.4 + https://imagen.research.google/video/paper.pdf). + clip_sample (`bool`, default `True`): + option to clip predicted sample for numerical stability. + clip_sample_range (`float`, default `1.0`): + the maximum magnitude for sample clipping. Valid only when `clip_sample=True`. + use_karras_sigmas (`bool`, *optional*, defaults to `False`): + This parameter controls whether to use Karras sigmas (Karras et al. (2022) scheme) for step sizes in the + noise schedule during the sampling process. If True, the sigmas will be determined according to a sequence + of noise levels {σi} as defined in Equation (5) of the paper https://arxiv.org/pdf/2206.00364.pdf. + timestep_spacing (`str`, default `"linspace"`): + The way the timesteps should be scaled. Refer to Table 2. of [Common Diffusion Noise Schedules and Sample + Steps are Flawed](https://arxiv.org/abs/2305.08891) for more information. + steps_offset (`int`, default `0`): + an offset added to the inference steps. You can use a combination of `offset=1` and + `set_alpha_to_one=False`, to make the last step use step 0 for the previous alpha product, as done in + stable diffusion. 
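+
+    A minimal usage sketch (illustrative, added for clarity and not part of the upstream docstring; `unet`
+    stands in for any denoising model called as `unet(sample, t)`):
+
+        >>> scheduler = HeunDiscreteScheduler(beta_start=0.00085, beta_end=0.012, beta_schedule="scaled_linear")
+        >>> scheduler.set_timesteps(num_inference_steps=25)
+        >>> sample = torch.randn(1, 4, 64, 64) * scheduler.init_noise_sigma
+        >>> for t in scheduler.timesteps:
+        ...     model_input = scheduler.scale_model_input(sample, t)
+        ...     noise_pred = unet(model_input, t)  # hypothetical epsilon-prediction model
+        ...     sample = scheduler.step(noise_pred, t, sample).prev_sample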
+ """ + + _compatibles = [e.name for e in KarrasDiffusionSchedulers] + order = 2 + + @register_to_config + def __init__( + self, + num_train_timesteps: int = 1000, + beta_start: float = 0.00085, # sensible defaults + beta_end: float = 0.012, + beta_schedule: str = "linear", + trained_betas: Optional[Union[np.ndarray, List[float]]] = None, + prediction_type: str = "epsilon", + use_karras_sigmas: Optional[bool] = False, + clip_sample: Optional[bool] = False, + clip_sample_range: float = 1.0, + timestep_spacing: str = "linspace", + steps_offset: int = 0, + ): + if trained_betas is not None: + self.betas = torch.tensor(trained_betas, dtype=torch.float32) + elif beta_schedule == "linear": + self.betas = torch.linspace(beta_start, beta_end, num_train_timesteps, dtype=torch.float32) + elif beta_schedule == "scaled_linear": + # this schedule is very specific to the latent diffusion model. + self.betas = ( + torch.linspace(beta_start**0.5, beta_end**0.5, num_train_timesteps, dtype=torch.float32) ** 2 + ) + elif beta_schedule == "squaredcos_cap_v2": + # Glide cosine schedule + self.betas = betas_for_alpha_bar(num_train_timesteps, alpha_transform_type="cosine") + elif beta_schedule == "exp": + self.betas = betas_for_alpha_bar(num_train_timesteps, alpha_transform_type="exp") + else: + raise NotImplementedError(f"{beta_schedule} does is not implemented for {self.__class__}") + + self.alphas = 1.0 - self.betas + self.alphas_cumprod = torch.cumprod(self.alphas, dim=0) + + # set all values + self.set_timesteps(num_train_timesteps, None, num_train_timesteps) + self.use_karras_sigmas = use_karras_sigmas + + def index_for_timestep(self, timestep, schedule_timesteps=None): + if schedule_timesteps is None: + schedule_timesteps = self.timesteps + + indices = (schedule_timesteps == timestep).nonzero() + + # The sigma index that is taken for the **very** first `step` + # is always the second index (or the last index if there is only 1) + # This way we can ensure we don't accidentally skip a sigma in + # case we start in the middle of the denoising schedule (e.g. for image-to-image) + if len(self._index_counter) == 0: + pos = 1 if len(indices) > 1 else 0 + else: + timestep_int = timestep.cpu().item() if torch.is_tensor(timestep) else timestep + pos = self._index_counter[timestep_int] + + return indices[pos].item() + + @property + def init_noise_sigma(self): + # standard deviation of the initial noise distribution + if self.config.timestep_spacing in ["linspace", "trailing"]: + return self.sigmas.max() + + return (self.sigmas.max() ** 2 + 1) ** 0.5 + + def scale_model_input( + self, + sample: torch.FloatTensor, + timestep: Union[float, torch.FloatTensor], + ) -> torch.FloatTensor: + """ + Args: + Ensures interchangeability with schedulers that need to scale the denoising model input depending on the + current timestep. + sample (`torch.FloatTensor`): input sample timestep (`int`, optional): current timestep + Returns: + `torch.FloatTensor`: scaled input sample + """ + step_index = self.index_for_timestep(timestep) + + sigma = self.sigmas[step_index] + sample = sample / ((sigma**2 + 1) ** 0.5) + return sample + + def set_timesteps( + self, + num_inference_steps: int, + device: Union[str, torch.device] = None, + num_train_timesteps: Optional[int] = None, + ): + """ + Sets the timesteps used for the diffusion chain. Supporting function to be run before inference. + + Args: + num_inference_steps (`int`): + the number of diffusion steps used when generating samples with a pre-trained model. 
+ device (`str` or `torch.device`, optional): + the device to which the timesteps should be moved to. If `None`, the timesteps are not moved. + """ + self.num_inference_steps = num_inference_steps + + num_train_timesteps = num_train_timesteps or self.config.num_train_timesteps + + # "linspace", "leading", "trailing" corresponds to annotation of Table 2. of https://arxiv.org/abs/2305.08891 + if self.config.timestep_spacing == "linspace": + timesteps = np.linspace(0, num_train_timesteps - 1, num_inference_steps, dtype=float)[::-1].copy() + elif self.config.timestep_spacing == "leading": + step_ratio = num_train_timesteps // self.num_inference_steps + # creates integer timesteps by multiplying by ratio + # casting to int to avoid issues when num_inference_step is power of 3 + timesteps = (np.arange(0, num_inference_steps) * step_ratio).round()[::-1].copy().astype(float) + timesteps += self.config.steps_offset + elif self.config.timestep_spacing == "trailing": + step_ratio = num_train_timesteps / self.num_inference_steps + # creates integer timesteps by multiplying by ratio + # casting to int to avoid issues when num_inference_step is power of 3 + timesteps = (np.arange(num_train_timesteps, 0, -step_ratio)).round().copy().astype(float) + timesteps -= 1 + else: + raise ValueError( + f"{self.config.timestep_spacing} is not supported. Please make sure to choose one of 'linspace', 'leading' or 'trailing'." + ) + + sigmas = np.array(((1 - self.alphas_cumprod) / self.alphas_cumprod) ** 0.5) + log_sigmas = np.log(sigmas) + sigmas = np.interp(timesteps, np.arange(0, len(sigmas)), sigmas) + + if self.config.use_karras_sigmas: + sigmas = self._convert_to_karras(in_sigmas=sigmas, num_inference_steps=self.num_inference_steps) + timesteps = np.array([self._sigma_to_t(sigma, log_sigmas) for sigma in sigmas]) + + sigmas = np.concatenate([sigmas, [0.0]]).astype(np.float32) + sigmas = torch.from_numpy(sigmas).to(device=device) + self.sigmas = torch.cat([sigmas[:1], sigmas[1:-1].repeat_interleave(2), sigmas[-1:]]) + + timesteps = torch.from_numpy(timesteps) + timesteps = torch.cat([timesteps[:1], timesteps[1:].repeat_interleave(2)]) + + if str(device).startswith("mps"): + # mps does not support float64 + self.timesteps = timesteps.to(device, dtype=torch.float32) + else: + self.timesteps = timesteps.to(device=device) + + # empty dt and derivative + self.prev_derivative = None + self.dt = None + + # for exp beta schedules, such as the one for `pipeline_shap_e.py` + # we need an index counter + self._index_counter = defaultdict(int) + + # Copied from diffusers.schedulers.scheduling_euler_discrete.EulerDiscreteScheduler._sigma_to_t + def _sigma_to_t(self, sigma, log_sigmas): + # get log sigma + log_sigma = np.log(sigma) + + # get distribution + dists = log_sigma - log_sigmas[:, np.newaxis] + + # get sigmas range + low_idx = np.cumsum((dists >= 0), axis=0).argmax(axis=0).clip(max=log_sigmas.shape[0] - 2) + high_idx = low_idx + 1 + + low = log_sigmas[low_idx] + high = log_sigmas[high_idx] + + # interpolate sigmas + w = (low - log_sigma) / (low - high) + w = np.clip(w, 0, 1) + + # transform interpolation to time range + t = (1 - w) * low_idx + w * high_idx + t = t.reshape(sigma.shape) + return t + + # Copied from diffusers.schedulers.scheduling_euler_discrete.EulerDiscreteScheduler._convert_to_karras + def _convert_to_karras(self, in_sigmas: torch.FloatTensor, num_inference_steps) -> torch.FloatTensor: + """Constructs the noise schedule of Karras et al. 
(2022).""" + + sigma_min: float = in_sigmas[-1].item() + sigma_max: float = in_sigmas[0].item() + + rho = 7.0 # 7.0 is the value used in the paper + ramp = np.linspace(0, 1, num_inference_steps) + min_inv_rho = sigma_min ** (1 / rho) + max_inv_rho = sigma_max ** (1 / rho) + sigmas = (max_inv_rho + ramp * (min_inv_rho - max_inv_rho)) ** rho + return sigmas + + @property + def state_in_first_order(self): + return self.dt is None + + def step( + self, + model_output: Union[torch.FloatTensor, np.ndarray], + timestep: Union[float, torch.FloatTensor], + sample: Union[torch.FloatTensor, np.ndarray], + return_dict: bool = True, + ) -> Union[SchedulerOutput, Tuple]: + """ + Args: + Predict the sample at the previous timestep by reversing the SDE. Core function to propagate the diffusion + process from the learned model outputs (most often the predicted noise). + model_output (`torch.FloatTensor` or `np.ndarray`): direct output from learned diffusion model. timestep + (`int`): current discrete timestep in the diffusion chain. sample (`torch.FloatTensor` or `np.ndarray`): + current instance of sample being created by diffusion process. + return_dict (`bool`): option for returning tuple rather than SchedulerOutput class + Returns: + [`~schedulers.scheduling_utils.SchedulerOutput`] or `tuple`: + [`~schedulers.scheduling_utils.SchedulerOutput`] if `return_dict` is True, otherwise a `tuple`. When + returning a tuple, the first element is the sample tensor. + """ + step_index = self.index_for_timestep(timestep) + + # advance index counter by 1 + timestep_int = timestep.cpu().item() if torch.is_tensor(timestep) else timestep + self._index_counter[timestep_int] += 1 + + if self.state_in_first_order: + sigma = self.sigmas[step_index] + sigma_next = self.sigmas[step_index + 1] + else: + # 2nd order / Heun's method + sigma = self.sigmas[step_index - 1] + sigma_next = self.sigmas[step_index] + + # currently only gamma=0 is supported. This usually works best anyways. + # We can support gamma in the future but then need to scale the timestep before + # passing it to the model which requires a change in API + gamma = 0 + sigma_hat = sigma * (gamma + 1) # Note: sigma_hat == sigma for now + + # 1. compute predicted original sample (x_0) from sigma-scaled predicted noise + if self.config.prediction_type == "epsilon": + sigma_input = sigma_hat if self.state_in_first_order else sigma_next + pred_original_sample = sample - sigma_input * model_output + elif self.config.prediction_type == "v_prediction": + sigma_input = sigma_hat if self.state_in_first_order else sigma_next + pred_original_sample = model_output * (-sigma_input / (sigma_input**2 + 1) ** 0.5) + ( + sample / (sigma_input**2 + 1) + ) + elif self.config.prediction_type == "sample": + pred_original_sample = model_output + else: + raise ValueError( + f"prediction_type given as {self.config.prediction_type} must be one of `epsilon`, or `v_prediction`" + ) + + if self.config.clip_sample: + pred_original_sample = pred_original_sample.clamp( + -self.config.clip_sample_range, self.config.clip_sample_range + ) + + if self.state_in_first_order: + # 2. Convert to an ODE derivative for 1st order + derivative = (sample - pred_original_sample) / sigma_hat + # 3. delta timestep + dt = sigma_next - sigma_hat + + # store for 2nd order step + self.prev_derivative = derivative + self.dt = dt + self.sample = sample + else: + # 2. 2nd order / Heun's method + derivative = (sample - pred_original_sample) / sigma_next + derivative = (self.prev_derivative + derivative) / 2 + + # 3. 
take prev timestep & sample + dt = self.dt + sample = self.sample + + # free dt and derivative + # Note, this puts the scheduler in "first order mode" + self.prev_derivative = None + self.dt = None + self.sample = None + + prev_sample = sample + derivative * dt + + if not return_dict: + return (prev_sample,) + + return SchedulerOutput(prev_sample=prev_sample) + + def add_noise( + self, + original_samples: torch.FloatTensor, + noise: torch.FloatTensor, + timesteps: torch.FloatTensor, + ) -> torch.FloatTensor: + # Make sure sigmas and timesteps have the same device and dtype as original_samples + sigmas = self.sigmas.to(device=original_samples.device, dtype=original_samples.dtype) + if original_samples.device.type == "mps" and torch.is_floating_point(timesteps): + # mps does not support float64 + schedule_timesteps = self.timesteps.to(original_samples.device, dtype=torch.float32) + timesteps = timesteps.to(original_samples.device, dtype=torch.float32) + else: + schedule_timesteps = self.timesteps.to(original_samples.device) + timesteps = timesteps.to(original_samples.device) + + step_indices = [self.index_for_timestep(t, schedule_timesteps) for t in timesteps] + + sigma = sigmas[step_indices].flatten() + while len(sigma.shape) < len(original_samples.shape): + sigma = sigma.unsqueeze(-1) + + noisy_samples = original_samples + noise * sigma + return noisy_samples + + def __len__(self): + return self.config.num_train_timesteps diff --git a/gf/sd-webui-rich-text/diffusers_official/schedulers/scheduling_ipndm.py b/gf/sd-webui-rich-text/diffusers_official/schedulers/scheduling_ipndm.py new file mode 100644 index 0000000000000000000000000000000000000000..80e521590782de6bc14e9b8c29642c7595fafc93 --- /dev/null +++ b/gf/sd-webui-rich-text/diffusers_official/schedulers/scheduling_ipndm.py @@ -0,0 +1,161 @@ +# Copyright 2023 Zhejiang University Team and The HuggingFace Team. All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +import math +from typing import List, Optional, Tuple, Union + +import numpy as np +import torch + +from ..configuration_utils import ConfigMixin, register_to_config +from .scheduling_utils import SchedulerMixin, SchedulerOutput + + +class IPNDMScheduler(SchedulerMixin, ConfigMixin): + """ + Improved Pseudo numerical methods for diffusion models (iPNDM) ported from @crowsonkb's amazing k-diffusion + [library](https://github.com/crowsonkb/v-diffusion-pytorch/blob/987f8985e38208345c1959b0ea767a625831cc9b/diffusion/sampling.py#L296) + + [`~ConfigMixin`] takes care of storing all config attributes that are passed in the scheduler's `__init__` + function, such as `num_train_timesteps`. They can be accessed via `scheduler.config.num_train_timesteps`. + [`SchedulerMixin`] provides general loading and saving functionality via the [`SchedulerMixin.save_pretrained`] and + [`~SchedulerMixin.from_pretrained`] functions. 
+ + For more details, see the original paper: https://arxiv.org/abs/2202.09778 + + Args: + num_train_timesteps (`int`): number of diffusion steps used to train the model. + """ + + order = 1 + + @register_to_config + def __init__( + self, num_train_timesteps: int = 1000, trained_betas: Optional[Union[np.ndarray, List[float]]] = None + ): + # set `betas`, `alphas`, `timesteps` + self.set_timesteps(num_train_timesteps) + + # standard deviation of the initial noise distribution + self.init_noise_sigma = 1.0 + + # For now we only support F-PNDM, i.e. the runge-kutta method + # For more information on the algorithm please take a look at the paper: https://arxiv.org/pdf/2202.09778.pdf + # mainly at formula (9), (12), (13) and the Algorithm 2. + self.pndm_order = 4 + + # running values + self.ets = [] + + def set_timesteps(self, num_inference_steps: int, device: Union[str, torch.device] = None): + """ + Sets the discrete timesteps used for the diffusion chain. Supporting function to be run before inference. + + Args: + num_inference_steps (`int`): + the number of diffusion steps used when generating samples with a pre-trained model. + """ + self.num_inference_steps = num_inference_steps + steps = torch.linspace(1, 0, num_inference_steps + 1)[:-1] + steps = torch.cat([steps, torch.tensor([0.0])]) + + if self.config.trained_betas is not None: + self.betas = torch.tensor(self.config.trained_betas, dtype=torch.float32) + else: + self.betas = torch.sin(steps * math.pi / 2) ** 2 + + self.alphas = (1.0 - self.betas**2) ** 0.5 + + timesteps = (torch.atan2(self.betas, self.alphas) / math.pi * 2)[:-1] + self.timesteps = timesteps.to(device) + + self.ets = [] + + def step( + self, + model_output: torch.FloatTensor, + timestep: int, + sample: torch.FloatTensor, + return_dict: bool = True, + ) -> Union[SchedulerOutput, Tuple]: + """ + Step function propagating the sample with the linear multi-step method. This has one forward pass with multiple + times to approximate the solution. + + Args: + model_output (`torch.FloatTensor`): direct output from learned diffusion model. + timestep (`int`): current discrete timestep in the diffusion chain. + sample (`torch.FloatTensor`): + current instance of sample being created by diffusion process. + return_dict (`bool`): option for returning tuple rather than SchedulerOutput class + + Returns: + [`~scheduling_utils.SchedulerOutput`] or `tuple`: [`~scheduling_utils.SchedulerOutput`] if `return_dict` is + True, otherwise a `tuple`. When returning a tuple, the first element is the sample tensor. 
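+
+        Note (added for clarity; not in the upstream docstring): the cached model outputs in `self.ets` are
+        combined with classical Adams-Bashforth coefficients, e.g. `(55 * e[-1] - 59 * e[-2] + 37 * e[-3] - 9 * e[-4]) / 24`
+        once four previous outputs are available, which is what gives the sampler its fourth-order behaviour.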
+ + """ + if self.num_inference_steps is None: + raise ValueError( + "Number of inference steps is 'None', you need to run 'set_timesteps' after creating the scheduler" + ) + + timestep_index = (self.timesteps == timestep).nonzero().item() + prev_timestep_index = timestep_index + 1 + + ets = sample * self.betas[timestep_index] + model_output * self.alphas[timestep_index] + self.ets.append(ets) + + if len(self.ets) == 1: + ets = self.ets[-1] + elif len(self.ets) == 2: + ets = (3 * self.ets[-1] - self.ets[-2]) / 2 + elif len(self.ets) == 3: + ets = (23 * self.ets[-1] - 16 * self.ets[-2] + 5 * self.ets[-3]) / 12 + else: + ets = (1 / 24) * (55 * self.ets[-1] - 59 * self.ets[-2] + 37 * self.ets[-3] - 9 * self.ets[-4]) + + prev_sample = self._get_prev_sample(sample, timestep_index, prev_timestep_index, ets) + + if not return_dict: + return (prev_sample,) + + return SchedulerOutput(prev_sample=prev_sample) + + def scale_model_input(self, sample: torch.FloatTensor, *args, **kwargs) -> torch.FloatTensor: + """ + Ensures interchangeability with schedulers that need to scale the denoising model input depending on the + current timestep. + + Args: + sample (`torch.FloatTensor`): input sample + + Returns: + `torch.FloatTensor`: scaled input sample + """ + return sample + + def _get_prev_sample(self, sample, timestep_index, prev_timestep_index, ets): + alpha = self.alphas[timestep_index] + sigma = self.betas[timestep_index] + + next_alpha = self.alphas[prev_timestep_index] + next_sigma = self.betas[prev_timestep_index] + + pred = (sample - sigma * ets) / max(alpha, 1e-8) + prev_sample = next_alpha * pred + ets * next_sigma + + return prev_sample + + def __len__(self): + return self.config.num_train_timesteps diff --git a/gf/sd-webui-rich-text/diffusers_official/schedulers/scheduling_k_dpm_2_ancestral_discrete.py b/gf/sd-webui-rich-text/diffusers_official/schedulers/scheduling_k_dpm_2_ancestral_discrete.py new file mode 100644 index 0000000000000000000000000000000000000000..bdf9379b9b90a53e3c8aad20a69e9ab7bffc691e --- /dev/null +++ b/gf/sd-webui-rich-text/diffusers_official/schedulers/scheduling_k_dpm_2_ancestral_discrete.py @@ -0,0 +1,420 @@ +# Copyright 2023 Katherine Crowson, The HuggingFace Team and hlky. All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +import math +from collections import defaultdict +from typing import List, Optional, Tuple, Union + +import numpy as np +import torch + +from ..configuration_utils import ConfigMixin, register_to_config +from ..utils import randn_tensor +from .scheduling_utils import KarrasDiffusionSchedulers, SchedulerMixin, SchedulerOutput + + +# Copied from diffusers.schedulers.scheduling_ddpm.betas_for_alpha_bar +def betas_for_alpha_bar( + num_diffusion_timesteps, + max_beta=0.999, + alpha_transform_type="cosine", +): + """ + Create a beta schedule that discretizes the given alpha_t_bar function, which defines the cumulative product of + (1-beta) over time from t = [0,1]. 
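# Editorial note (illustrative sketch, not part of the patch): `betas_for_alpha_bar`
# recovers per-step betas from the cumulative-product function alpha_bar(t) via
# beta_i = 1 - alpha_bar(t_{i+1}) / alpha_bar(t_i), clipped by `max_beta`. A minimal
# standalone version with the cosine transform looks roughly like this:
import math

def cosine_alpha_bar(t: float) -> float:
    return math.cos((t + 0.008) / 1.008 * math.pi / 2) ** 2

def betas_sketch(num_steps: int, max_beta: float = 0.999) -> list:
    betas = []
    for i in range(num_steps):
        t1, t2 = i / num_steps, (i + 1) / num_steps
        betas.append(min(1 - cosine_alpha_bar(t2) / cosine_alpha_bar(t1), max_beta))
    return betas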
+ + Contains a function alpha_bar that takes an argument t and transforms it to the cumulative product of (1-beta) up + to that part of the diffusion process. + + + Args: + num_diffusion_timesteps (`int`): the number of betas to produce. + max_beta (`float`): the maximum beta to use; use values lower than 1 to + prevent singularities. + alpha_transform_type (`str`, *optional*, default to `cosine`): the type of noise schedule for alpha_bar. + Choose from `cosine` or `exp` + + Returns: + betas (`np.ndarray`): the betas used by the scheduler to step the model outputs + """ + if alpha_transform_type == "cosine": + + def alpha_bar_fn(t): + return math.cos((t + 0.008) / 1.008 * math.pi / 2) ** 2 + + elif alpha_transform_type == "exp": + + def alpha_bar_fn(t): + return math.exp(t * -12.0) + + else: + raise ValueError(f"Unsupported alpha_tranform_type: {alpha_transform_type}") + + betas = [] + for i in range(num_diffusion_timesteps): + t1 = i / num_diffusion_timesteps + t2 = (i + 1) / num_diffusion_timesteps + betas.append(min(1 - alpha_bar_fn(t2) / alpha_bar_fn(t1), max_beta)) + return torch.tensor(betas, dtype=torch.float32) + + +class KDPM2AncestralDiscreteScheduler(SchedulerMixin, ConfigMixin): + """ + Scheduler created by @crowsonkb in [k_diffusion](https://github.com/crowsonkb/k-diffusion), see: + https://github.com/crowsonkb/k-diffusion/blob/5b3af030dd83e0297272d861c19477735d0317ec/k_diffusion/sampling.py#L188 + + Scheduler inspired by DPM-Solver-2 and Algorthim 2 from Karras et al. (2022). + + [`~ConfigMixin`] takes care of storing all config attributes that are passed in the scheduler's `__init__` + function, such as `num_train_timesteps`. They can be accessed via `scheduler.config.num_train_timesteps`. + [`SchedulerMixin`] provides general loading and saving functionality via the [`SchedulerMixin.save_pretrained`] and + [`~SchedulerMixin.from_pretrained`] functions. + + Args: + num_train_timesteps (`int`): number of diffusion steps used to train the model. beta_start (`float`): the + starting `beta` value of inference. beta_end (`float`): the final `beta` value. beta_schedule (`str`): + the beta schedule, a mapping from a beta range to a sequence of betas for stepping the model. Choose from + `linear` or `scaled_linear`. + trained_betas (`np.ndarray`, optional): + option to pass an array of betas directly to the constructor to bypass `beta_start`, `beta_end` etc. + options to clip the variance used when adding noise to the denoised sample. Choose from `fixed_small`, + `fixed_small_log`, `fixed_large`, `fixed_large_log`, `learned` or `learned_range`. + prediction_type (`str`, default `epsilon`, optional): + prediction type of the scheduler function, one of `epsilon` (predicting the noise of the diffusion + process), `sample` (directly predicting the noisy sample`) or `v_prediction` (see section 2.4 + https://imagen.research.google/video/paper.pdf) + timestep_spacing (`str`, default `"linspace"`): + The way the timesteps should be scaled. Refer to Table 2. of [Common Diffusion Noise Schedules and Sample + Steps are Flawed](https://arxiv.org/abs/2305.08891) for more information. + steps_offset (`int`, default `0`): + an offset added to the inference steps. You can use a combination of `offset=1` and + `set_alpha_to_one=False`, to make the last step use step 0 for the previous alpha product, as done in + stable diffusion. 
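# Editorial note (hedged usage sketch, not part of the patch): a minimal sampling loop for
# this ancestral KDPM2 scheduler, imported here under the upstream `diffusers` package name
# (the vendored copy in this repo lives under `diffusers_official`). `fake_model` is a
# stand-in for an epsilon-predicting UNet and the tensor shapes are illustrative.
import torch
from diffusers import KDPM2AncestralDiscreteScheduler

scheduler = KDPM2AncestralDiscreteScheduler()
scheduler.set_timesteps(num_inference_steps=20)

generator = torch.Generator().manual_seed(0)
sample = torch.randn(1, 4, 64, 64) * scheduler.init_noise_sigma

def fake_model(x: torch.Tensor, t: torch.Tensor) -> torch.Tensor:
    return torch.zeros_like(x)  # placeholder for the learned noise prediction

for t in scheduler.timesteps:
    model_input = scheduler.scale_model_input(sample, t)
    noise_pred = fake_model(model_input, t)
    sample = scheduler.step(noise_pred, t, sample, generator=generator).prev_sample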
+ """ + + _compatibles = [e.name for e in KarrasDiffusionSchedulers] + order = 2 + + @register_to_config + def __init__( + self, + num_train_timesteps: int = 1000, + beta_start: float = 0.00085, # sensible defaults + beta_end: float = 0.012, + beta_schedule: str = "linear", + trained_betas: Optional[Union[np.ndarray, List[float]]] = None, + prediction_type: str = "epsilon", + timestep_spacing: str = "linspace", + steps_offset: int = 0, + ): + if trained_betas is not None: + self.betas = torch.tensor(trained_betas, dtype=torch.float32) + elif beta_schedule == "linear": + self.betas = torch.linspace(beta_start, beta_end, num_train_timesteps, dtype=torch.float32) + elif beta_schedule == "scaled_linear": + # this schedule is very specific to the latent diffusion model. + self.betas = ( + torch.linspace(beta_start**0.5, beta_end**0.5, num_train_timesteps, dtype=torch.float32) ** 2 + ) + elif beta_schedule == "squaredcos_cap_v2": + # Glide cosine schedule + self.betas = betas_for_alpha_bar(num_train_timesteps) + else: + raise NotImplementedError(f"{beta_schedule} does is not implemented for {self.__class__}") + + self.alphas = 1.0 - self.betas + self.alphas_cumprod = torch.cumprod(self.alphas, dim=0) + + # set all values + self.set_timesteps(num_train_timesteps, None, num_train_timesteps) + + # Copied from diffusers.schedulers.scheduling_heun_discrete.HeunDiscreteScheduler.index_for_timestep + def index_for_timestep(self, timestep, schedule_timesteps=None): + if schedule_timesteps is None: + schedule_timesteps = self.timesteps + + indices = (schedule_timesteps == timestep).nonzero() + + # The sigma index that is taken for the **very** first `step` + # is always the second index (or the last index if there is only 1) + # This way we can ensure we don't accidentally skip a sigma in + # case we start in the middle of the denoising schedule (e.g. for image-to-image) + if len(self._index_counter) == 0: + pos = 1 if len(indices) > 1 else 0 + else: + timestep_int = timestep.cpu().item() if torch.is_tensor(timestep) else timestep + pos = self._index_counter[timestep_int] + + return indices[pos].item() + + @property + def init_noise_sigma(self): + # standard deviation of the initial noise distribution + if self.config.timestep_spacing in ["linspace", "trailing"]: + return self.sigmas.max() + + return (self.sigmas.max() ** 2 + 1) ** 0.5 + + def scale_model_input( + self, + sample: torch.FloatTensor, + timestep: Union[float, torch.FloatTensor], + ) -> torch.FloatTensor: + """ + Args: + Ensures interchangeability with schedulers that need to scale the denoising model input depending on the + current timestep. + sample (`torch.FloatTensor`): input sample timestep (`int`, optional): current timestep + Returns: + `torch.FloatTensor`: scaled input sample + """ + step_index = self.index_for_timestep(timestep) + + if self.state_in_first_order: + sigma = self.sigmas[step_index] + else: + sigma = self.sigmas_interpol[step_index - 1] + + sample = sample / ((sigma**2 + 1) ** 0.5) + return sample + + def set_timesteps( + self, + num_inference_steps: int, + device: Union[str, torch.device] = None, + num_train_timesteps: Optional[int] = None, + ): + """ + Sets the timesteps used for the diffusion chain. Supporting function to be run before inference. + + Args: + num_inference_steps (`int`): + the number of diffusion steps used when generating samples with a pre-trained model. + device (`str` or `torch.device`, optional): + the device to which the timesteps should be moved to. If `None`, the timesteps are not moved. 
+ """ + self.num_inference_steps = num_inference_steps + + num_train_timesteps = num_train_timesteps or self.config.num_train_timesteps + + # "linspace", "leading", "trailing" corresponds to annotation of Table 2. of https://arxiv.org/abs/2305.08891 + if self.config.timestep_spacing == "linspace": + timesteps = np.linspace(0, num_train_timesteps - 1, num_inference_steps, dtype=float)[::-1].copy() + elif self.config.timestep_spacing == "leading": + step_ratio = num_train_timesteps // self.num_inference_steps + # creates integer timesteps by multiplying by ratio + # casting to int to avoid issues when num_inference_step is power of 3 + timesteps = (np.arange(0, num_inference_steps) * step_ratio).round()[::-1].copy().astype(float) + timesteps += self.config.steps_offset + elif self.config.timestep_spacing == "trailing": + step_ratio = num_train_timesteps / self.num_inference_steps + # creates integer timesteps by multiplying by ratio + # casting to int to avoid issues when num_inference_step is power of 3 + timesteps = (np.arange(num_train_timesteps, 0, -step_ratio)).round().copy().astype(float) + timesteps -= 1 + else: + raise ValueError( + f"{self.config.timestep_spacing} is not supported. Please make sure to choose one of 'linspace', 'leading' or 'trailing'." + ) + + sigmas = np.array(((1 - self.alphas_cumprod) / self.alphas_cumprod) ** 0.5) + self.log_sigmas = torch.from_numpy(np.log(sigmas)).to(device) + + sigmas = np.interp(timesteps, np.arange(0, len(sigmas)), sigmas) + sigmas = np.concatenate([sigmas, [0.0]]).astype(np.float32) + sigmas = torch.from_numpy(sigmas).to(device=device) + + # compute up and down sigmas + sigmas_next = sigmas.roll(-1) + sigmas_next[-1] = 0.0 + sigmas_up = (sigmas_next**2 * (sigmas**2 - sigmas_next**2) / sigmas**2) ** 0.5 + sigmas_down = (sigmas_next**2 - sigmas_up**2) ** 0.5 + sigmas_down[-1] = 0.0 + + # compute interpolated sigmas + sigmas_interpol = sigmas.log().lerp(sigmas_down.log(), 0.5).exp() + sigmas_interpol[-2:] = 0.0 + + # set sigmas + self.sigmas = torch.cat([sigmas[:1], sigmas[1:].repeat_interleave(2), sigmas[-1:]]) + self.sigmas_interpol = torch.cat( + [sigmas_interpol[:1], sigmas_interpol[1:].repeat_interleave(2), sigmas_interpol[-1:]] + ) + self.sigmas_up = torch.cat([sigmas_up[:1], sigmas_up[1:].repeat_interleave(2), sigmas_up[-1:]]) + self.sigmas_down = torch.cat([sigmas_down[:1], sigmas_down[1:].repeat_interleave(2), sigmas_down[-1:]]) + + if str(device).startswith("mps"): + # mps does not support float64 + timesteps = torch.from_numpy(timesteps).to(device, dtype=torch.float32) + else: + timesteps = torch.from_numpy(timesteps).to(device) + + timesteps_interpol = self.sigma_to_t(sigmas_interpol).to(device, dtype=timesteps.dtype) + interleaved_timesteps = torch.stack((timesteps_interpol[:-2, None], timesteps[1:, None]), dim=-1).flatten() + + self.timesteps = torch.cat([timesteps[:1], interleaved_timesteps]) + + self.sample = None + + # for exp beta schedules, such as the one for `pipeline_shap_e.py` + # we need an index counter + self._index_counter = defaultdict(int) + + def sigma_to_t(self, sigma): + # get log sigma + log_sigma = sigma.log() + + # get distribution + dists = log_sigma - self.log_sigmas[:, None] + + # get sigmas range + low_idx = dists.ge(0).cumsum(dim=0).argmax(dim=0).clamp(max=self.log_sigmas.shape[0] - 2) + high_idx = low_idx + 1 + + low = self.log_sigmas[low_idx] + high = self.log_sigmas[high_idx] + + # interpolate sigmas + w = (low - log_sigma) / (low - high) + w = w.clamp(0, 1) + + # transform interpolation to time 
range + t = (1 - w) * low_idx + w * high_idx + t = t.view(sigma.shape) + return t + + @property + def state_in_first_order(self): + return self.sample is None + + def step( + self, + model_output: Union[torch.FloatTensor, np.ndarray], + timestep: Union[float, torch.FloatTensor], + sample: Union[torch.FloatTensor, np.ndarray], + generator: Optional[torch.Generator] = None, + return_dict: bool = True, + ) -> Union[SchedulerOutput, Tuple]: + """ + Args: + Predict the sample at the previous timestep by reversing the SDE. Core function to propagate the diffusion + process from the learned model outputs (most often the predicted noise). + model_output (`torch.FloatTensor` or `np.ndarray`): direct output from learned diffusion model. timestep + (`int`): current discrete timestep in the diffusion chain. sample (`torch.FloatTensor` or `np.ndarray`): + current instance of sample being created by diffusion process. + return_dict (`bool`): option for returning tuple rather than SchedulerOutput class + Returns: + [`~schedulers.scheduling_utils.SchedulerOutput`] or `tuple`: + [`~schedulers.scheduling_utils.SchedulerOutput`] if `return_dict` is True, otherwise a `tuple`. When + returning a tuple, the first element is the sample tensor. + """ + step_index = self.index_for_timestep(timestep) + + # advance index counter by 1 + timestep_int = timestep.cpu().item() if torch.is_tensor(timestep) else timestep + self._index_counter[timestep_int] += 1 + + if self.state_in_first_order: + sigma = self.sigmas[step_index] + sigma_interpol = self.sigmas_interpol[step_index] + sigma_up = self.sigmas_up[step_index] + sigma_down = self.sigmas_down[step_index - 1] + else: + # 2nd order / KPDM2's method + sigma = self.sigmas[step_index - 1] + sigma_interpol = self.sigmas_interpol[step_index - 1] + sigma_up = self.sigmas_up[step_index - 1] + sigma_down = self.sigmas_down[step_index - 1] + + # currently only gamma=0 is supported. This usually works best anyways. + # We can support gamma in the future but then need to scale the timestep before + # passing it to the model which requires a change in API + gamma = 0 + sigma_hat = sigma * (gamma + 1) # Note: sigma_hat == sigma for now + + device = model_output.device + noise = randn_tensor(model_output.shape, dtype=model_output.dtype, device=device, generator=generator) + + # 1. compute predicted original sample (x_0) from sigma-scaled predicted noise + if self.config.prediction_type == "epsilon": + sigma_input = sigma_hat if self.state_in_first_order else sigma_interpol + pred_original_sample = sample - sigma_input * model_output + elif self.config.prediction_type == "v_prediction": + sigma_input = sigma_hat if self.state_in_first_order else sigma_interpol + pred_original_sample = model_output * (-sigma_input / (sigma_input**2 + 1) ** 0.5) + ( + sample / (sigma_input**2 + 1) + ) + elif self.config.prediction_type == "sample": + raise NotImplementedError("prediction_type not implemented yet: sample") + else: + raise ValueError( + f"prediction_type given as {self.config.prediction_type} must be one of `epsilon`, or `v_prediction`" + ) + + if self.state_in_first_order: + # 2. Convert to an ODE derivative for 1st order + derivative = (sample - pred_original_sample) / sigma_hat + # 3. delta timestep + dt = sigma_interpol - sigma_hat + + # store for 2nd order step + self.sample = sample + self.dt = dt + prev_sample = sample + derivative * dt + else: + # DPM-Solver-2 + # 2. Convert to an ODE derivative for 2nd order + derivative = (sample - pred_original_sample) / sigma_interpol + # 3. 
delta timestep + dt = sigma_down - sigma_hat + + sample = self.sample + self.sample = None + + prev_sample = sample + derivative * dt + prev_sample = prev_sample + noise * sigma_up + + if not return_dict: + return (prev_sample,) + + return SchedulerOutput(prev_sample=prev_sample) + + # Copied from diffusers.schedulers.scheduling_heun_discrete.HeunDiscreteScheduler.add_noise + def add_noise( + self, + original_samples: torch.FloatTensor, + noise: torch.FloatTensor, + timesteps: torch.FloatTensor, + ) -> torch.FloatTensor: + # Make sure sigmas and timesteps have the same device and dtype as original_samples + sigmas = self.sigmas.to(device=original_samples.device, dtype=original_samples.dtype) + if original_samples.device.type == "mps" and torch.is_floating_point(timesteps): + # mps does not support float64 + schedule_timesteps = self.timesteps.to(original_samples.device, dtype=torch.float32) + timesteps = timesteps.to(original_samples.device, dtype=torch.float32) + else: + schedule_timesteps = self.timesteps.to(original_samples.device) + timesteps = timesteps.to(original_samples.device) + + step_indices = [self.index_for_timestep(t, schedule_timesteps) for t in timesteps] + + sigma = sigmas[step_indices].flatten() + while len(sigma.shape) < len(original_samples.shape): + sigma = sigma.unsqueeze(-1) + + noisy_samples = original_samples + noise * sigma + return noisy_samples + + def __len__(self): + return self.config.num_train_timesteps diff --git a/gf/sd-webui-rich-text/diffusers_official/schedulers/scheduling_k_dpm_2_discrete.py b/gf/sd-webui-rich-text/diffusers_official/schedulers/scheduling_k_dpm_2_discrete.py new file mode 100644 index 0000000000000000000000000000000000000000..a6a1b4e6640d1bc10ef6475bde39b5f39a87ec80 --- /dev/null +++ b/gf/sd-webui-rich-text/diffusers_official/schedulers/scheduling_k_dpm_2_discrete.py @@ -0,0 +1,401 @@ +# Copyright 2023 Katherine Crowson, The HuggingFace Team and hlky. All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +import math +from collections import defaultdict +from typing import List, Optional, Tuple, Union + +import numpy as np +import torch + +from ..configuration_utils import ConfigMixin, register_to_config +from .scheduling_utils import KarrasDiffusionSchedulers, SchedulerMixin, SchedulerOutput + + +# Copied from diffusers.schedulers.scheduling_ddpm.betas_for_alpha_bar +def betas_for_alpha_bar( + num_diffusion_timesteps, + max_beta=0.999, + alpha_transform_type="cosine", +): + """ + Create a beta schedule that discretizes the given alpha_t_bar function, which defines the cumulative product of + (1-beta) over time from t = [0,1]. + + Contains a function alpha_bar that takes an argument t and transforms it to the cumulative product of (1-beta) up + to that part of the diffusion process. + + + Args: + num_diffusion_timesteps (`int`): the number of betas to produce. + max_beta (`float`): the maximum beta to use; use values lower than 1 to + prevent singularities. 
+ alpha_transform_type (`str`, *optional*, default to `cosine`): the type of noise schedule for alpha_bar. + Choose from `cosine` or `exp` + + Returns: + betas (`np.ndarray`): the betas used by the scheduler to step the model outputs + """ + if alpha_transform_type == "cosine": + + def alpha_bar_fn(t): + return math.cos((t + 0.008) / 1.008 * math.pi / 2) ** 2 + + elif alpha_transform_type == "exp": + + def alpha_bar_fn(t): + return math.exp(t * -12.0) + + else: + raise ValueError(f"Unsupported alpha_tranform_type: {alpha_transform_type}") + + betas = [] + for i in range(num_diffusion_timesteps): + t1 = i / num_diffusion_timesteps + t2 = (i + 1) / num_diffusion_timesteps + betas.append(min(1 - alpha_bar_fn(t2) / alpha_bar_fn(t1), max_beta)) + return torch.tensor(betas, dtype=torch.float32) + + +class KDPM2DiscreteScheduler(SchedulerMixin, ConfigMixin): + """ + Scheduler created by @crowsonkb in [k_diffusion](https://github.com/crowsonkb/k-diffusion), see: + https://github.com/crowsonkb/k-diffusion/blob/5b3af030dd83e0297272d861c19477735d0317ec/k_diffusion/sampling.py#L188 + + Scheduler inspired by DPM-Solver-2 and Algorthim 2 from Karras et al. (2022). + + [`~ConfigMixin`] takes care of storing all config attributes that are passed in the scheduler's `__init__` + function, such as `num_train_timesteps`. They can be accessed via `scheduler.config.num_train_timesteps`. + [`SchedulerMixin`] provides general loading and saving functionality via the [`SchedulerMixin.save_pretrained`] and + [`~SchedulerMixin.from_pretrained`] functions. + + Args: + num_train_timesteps (`int`): number of diffusion steps used to train the model. beta_start (`float`): the + starting `beta` value of inference. beta_end (`float`): the final `beta` value. beta_schedule (`str`): + the beta schedule, a mapping from a beta range to a sequence of betas for stepping the model. Choose from + `linear` or `scaled_linear`. + trained_betas (`np.ndarray`, optional): + option to pass an array of betas directly to the constructor to bypass `beta_start`, `beta_end` etc. + options to clip the variance used when adding noise to the denoised sample. Choose from `fixed_small`, + `fixed_small_log`, `fixed_large`, `fixed_large_log`, `learned` or `learned_range`. + prediction_type (`str`, default `epsilon`, optional): + prediction type of the scheduler function, one of `epsilon` (predicting the noise of the diffusion + process), `sample` (directly predicting the noisy sample`) or `v_prediction` (see section 2.4 + https://imagen.research.google/video/paper.pdf) + timestep_spacing (`str`, default `"linspace"`): + The way the timesteps should be scaled. Refer to Table 2. of [Common Diffusion Noise Schedules and Sample + Steps are Flawed](https://arxiv.org/abs/2305.08891) for more information. + steps_offset (`int`, default `0`): + an offset added to the inference steps. You can use a combination of `offset=1` and + `set_alpha_to_one=False`, to make the last step use step 0 for the previous alpha product, as done in + stable diffusion. 
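# Editorial note (illustrative sketch, not part of the patch): the second-order DPM-Solver-2
# machinery below rests on two sigma constructions -- sigmas derived from the cumulative
# alphas, and intermediate sigmas obtained by interpolating neighbouring sigmas in log
# space (a geometric mean), as done in `set_timesteps`. Function names here are hypothetical.
import torch

def sigmas_from_alphas_cumprod(alphas_cumprod: torch.Tensor) -> torch.Tensor:
    # variance-preserving -> variance-exploding conversion: sigma = sqrt((1 - a_bar) / a_bar)
    return ((1 - alphas_cumprod) / alphas_cumprod) ** 0.5

def interpolate_sigmas(sigmas: torch.Tensor) -> torch.Tensor:
    # midpoint in log space, i.e. the geometric mean of each sigma and its predecessor
    return (0.5 * (sigmas.log() + sigmas.roll(1).log())).exp()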
+ """ + + _compatibles = [e.name for e in KarrasDiffusionSchedulers] + order = 2 + + @register_to_config + def __init__( + self, + num_train_timesteps: int = 1000, + beta_start: float = 0.00085, # sensible defaults + beta_end: float = 0.012, + beta_schedule: str = "linear", + trained_betas: Optional[Union[np.ndarray, List[float]]] = None, + prediction_type: str = "epsilon", + timestep_spacing: str = "linspace", + steps_offset: int = 0, + ): + if trained_betas is not None: + self.betas = torch.tensor(trained_betas, dtype=torch.float32) + elif beta_schedule == "linear": + self.betas = torch.linspace(beta_start, beta_end, num_train_timesteps, dtype=torch.float32) + elif beta_schedule == "scaled_linear": + # this schedule is very specific to the latent diffusion model. + self.betas = ( + torch.linspace(beta_start**0.5, beta_end**0.5, num_train_timesteps, dtype=torch.float32) ** 2 + ) + elif beta_schedule == "squaredcos_cap_v2": + # Glide cosine schedule + self.betas = betas_for_alpha_bar(num_train_timesteps) + else: + raise NotImplementedError(f"{beta_schedule} does is not implemented for {self.__class__}") + + self.alphas = 1.0 - self.betas + self.alphas_cumprod = torch.cumprod(self.alphas, dim=0) + + # set all values + self.set_timesteps(num_train_timesteps, None, num_train_timesteps) + + # Copied from diffusers.schedulers.scheduling_heun_discrete.HeunDiscreteScheduler.index_for_timestep + def index_for_timestep(self, timestep, schedule_timesteps=None): + if schedule_timesteps is None: + schedule_timesteps = self.timesteps + + indices = (schedule_timesteps == timestep).nonzero() + + # The sigma index that is taken for the **very** first `step` + # is always the second index (or the last index if there is only 1) + # This way we can ensure we don't accidentally skip a sigma in + # case we start in the middle of the denoising schedule (e.g. for image-to-image) + if len(self._index_counter) == 0: + pos = 1 if len(indices) > 1 else 0 + else: + timestep_int = timestep.cpu().item() if torch.is_tensor(timestep) else timestep + pos = self._index_counter[timestep_int] + + return indices[pos].item() + + @property + def init_noise_sigma(self): + # standard deviation of the initial noise distribution + if self.config.timestep_spacing in ["linspace", "trailing"]: + return self.sigmas.max() + + return (self.sigmas.max() ** 2 + 1) ** 0.5 + + def scale_model_input( + self, + sample: torch.FloatTensor, + timestep: Union[float, torch.FloatTensor], + ) -> torch.FloatTensor: + """ + Args: + Ensures interchangeability with schedulers that need to scale the denoising model input depending on the + current timestep. + sample (`torch.FloatTensor`): input sample timestep (`int`, optional): current timestep + Returns: + `torch.FloatTensor`: scaled input sample + """ + step_index = self.index_for_timestep(timestep) + + if self.state_in_first_order: + sigma = self.sigmas[step_index] + else: + sigma = self.sigmas_interpol[step_index] + + sample = sample / ((sigma**2 + 1) ** 0.5) + return sample + + def set_timesteps( + self, + num_inference_steps: int, + device: Union[str, torch.device] = None, + num_train_timesteps: Optional[int] = None, + ): + """ + Sets the timesteps used for the diffusion chain. Supporting function to be run before inference. + + Args: + num_inference_steps (`int`): + the number of diffusion steps used when generating samples with a pre-trained model. + device (`str` or `torch.device`, optional): + the device to which the timesteps should be moved to. If `None`, the timesteps are not moved. 
+ """ + self.num_inference_steps = num_inference_steps + + num_train_timesteps = num_train_timesteps or self.config.num_train_timesteps + + # "linspace", "leading", "trailing" corresponds to annotation of Table 2. of https://arxiv.org/abs/2305.08891 + if self.config.timestep_spacing == "linspace": + timesteps = np.linspace(0, num_train_timesteps - 1, num_inference_steps, dtype=float)[::-1].copy() + elif self.config.timestep_spacing == "leading": + step_ratio = num_train_timesteps // self.num_inference_steps + # creates integer timesteps by multiplying by ratio + # casting to int to avoid issues when num_inference_step is power of 3 + timesteps = (np.arange(0, num_inference_steps) * step_ratio).round()[::-1].copy().astype(float) + timesteps += self.config.steps_offset + elif self.config.timestep_spacing == "trailing": + step_ratio = num_train_timesteps / self.num_inference_steps + # creates integer timesteps by multiplying by ratio + # casting to int to avoid issues when num_inference_step is power of 3 + timesteps = (np.arange(num_train_timesteps, 0, -step_ratio)).round().copy().astype(float) + timesteps -= 1 + else: + raise ValueError( + f"{self.config.timestep_spacing} is not supported. Please make sure to choose one of 'linspace', 'leading' or 'trailing'." + ) + + sigmas = np.array(((1 - self.alphas_cumprod) / self.alphas_cumprod) ** 0.5) + self.log_sigmas = torch.from_numpy(np.log(sigmas)).to(device) + + sigmas = np.interp(timesteps, np.arange(0, len(sigmas)), sigmas) + sigmas = np.concatenate([sigmas, [0.0]]).astype(np.float32) + sigmas = torch.from_numpy(sigmas).to(device=device) + + # interpolate sigmas + sigmas_interpol = sigmas.log().lerp(sigmas.roll(1).log(), 0.5).exp() + + self.sigmas = torch.cat([sigmas[:1], sigmas[1:].repeat_interleave(2), sigmas[-1:]]) + self.sigmas_interpol = torch.cat( + [sigmas_interpol[:1], sigmas_interpol[1:].repeat_interleave(2), sigmas_interpol[-1:]] + ) + + if str(device).startswith("mps"): + # mps does not support float64 + timesteps = torch.from_numpy(timesteps).to(device, dtype=torch.float32) + else: + timesteps = torch.from_numpy(timesteps).to(device) + + # interpolate timesteps + timesteps_interpol = self.sigma_to_t(sigmas_interpol).to(device, dtype=timesteps.dtype) + interleaved_timesteps = torch.stack((timesteps_interpol[1:-1, None], timesteps[1:, None]), dim=-1).flatten() + + self.timesteps = torch.cat([timesteps[:1], interleaved_timesteps]) + + self.sample = None + + # for exp beta schedules, such as the one for `pipeline_shap_e.py` + # we need an index counter + self._index_counter = defaultdict(int) + + def sigma_to_t(self, sigma): + # get log sigma + log_sigma = sigma.log() + + # get distribution + dists = log_sigma - self.log_sigmas[:, None] + + # get sigmas range + low_idx = dists.ge(0).cumsum(dim=0).argmax(dim=0).clamp(max=self.log_sigmas.shape[0] - 2) + high_idx = low_idx + 1 + + low = self.log_sigmas[low_idx] + high = self.log_sigmas[high_idx] + + # interpolate sigmas + w = (low - log_sigma) / (low - high) + w = w.clamp(0, 1) + + # transform interpolation to time range + t = (1 - w) * low_idx + w * high_idx + t = t.view(sigma.shape) + return t + + @property + def state_in_first_order(self): + return self.sample is None + + def step( + self, + model_output: Union[torch.FloatTensor, np.ndarray], + timestep: Union[float, torch.FloatTensor], + sample: Union[torch.FloatTensor, np.ndarray], + return_dict: bool = True, + ) -> Union[SchedulerOutput, Tuple]: + """ + Args: + Predict the sample at the previous timestep by reversing the SDE. 
Core function to propagate the diffusion + process from the learned model outputs (most often the predicted noise). + model_output (`torch.FloatTensor` or `np.ndarray`): direct output from learned diffusion model. timestep + (`int`): current discrete timestep in the diffusion chain. sample (`torch.FloatTensor` or `np.ndarray`): + current instance of sample being created by diffusion process. + return_dict (`bool`): option for returning tuple rather than SchedulerOutput class + Returns: + [`~schedulers.scheduling_utils.SchedulerOutput`] or `tuple`: + [`~schedulers.scheduling_utils.SchedulerOutput`] if `return_dict` is True, otherwise a `tuple`. When + returning a tuple, the first element is the sample tensor. + """ + step_index = self.index_for_timestep(timestep) + + # advance index counter by 1 + timestep_int = timestep.cpu().item() if torch.is_tensor(timestep) else timestep + self._index_counter[timestep_int] += 1 + + if self.state_in_first_order: + sigma = self.sigmas[step_index] + sigma_interpol = self.sigmas_interpol[step_index + 1] + sigma_next = self.sigmas[step_index + 1] + else: + # 2nd order / KDPM2's method + sigma = self.sigmas[step_index - 1] + sigma_interpol = self.sigmas_interpol[step_index] + sigma_next = self.sigmas[step_index] + + # currently only gamma=0 is supported. This usually works best anyways. + # We can support gamma in the future but then need to scale the timestep before + # passing it to the model which requires a change in API + gamma = 0 + sigma_hat = sigma * (gamma + 1) # Note: sigma_hat == sigma for now + + # 1. compute predicted original sample (x_0) from sigma-scaled predicted noise + if self.config.prediction_type == "epsilon": + sigma_input = sigma_hat if self.state_in_first_order else sigma_interpol + pred_original_sample = sample - sigma_input * model_output + elif self.config.prediction_type == "v_prediction": + sigma_input = sigma_hat if self.state_in_first_order else sigma_interpol + pred_original_sample = model_output * (-sigma_input / (sigma_input**2 + 1) ** 0.5) + ( + sample / (sigma_input**2 + 1) + ) + elif self.config.prediction_type == "sample": + raise NotImplementedError("prediction_type not implemented yet: sample") + else: + raise ValueError( + f"prediction_type given as {self.config.prediction_type} must be one of `epsilon`, or `v_prediction`" + ) + + if self.state_in_first_order: + # 2. Convert to an ODE derivative for 1st order + derivative = (sample - pred_original_sample) / sigma_hat + # 3. delta timestep + dt = sigma_interpol - sigma_hat + + # store for 2nd order step + self.sample = sample + else: + # DPM-Solver-2 + # 2. Convert to an ODE derivative for 2nd order + derivative = (sample - pred_original_sample) / sigma_interpol + + # 3. 
delta timestep + dt = sigma_next - sigma_hat + + sample = self.sample + self.sample = None + + prev_sample = sample + derivative * dt + + if not return_dict: + return (prev_sample,) + + return SchedulerOutput(prev_sample=prev_sample) + + # Copied from diffusers.schedulers.scheduling_heun_discrete.HeunDiscreteScheduler.add_noise + def add_noise( + self, + original_samples: torch.FloatTensor, + noise: torch.FloatTensor, + timesteps: torch.FloatTensor, + ) -> torch.FloatTensor: + # Make sure sigmas and timesteps have the same device and dtype as original_samples + sigmas = self.sigmas.to(device=original_samples.device, dtype=original_samples.dtype) + if original_samples.device.type == "mps" and torch.is_floating_point(timesteps): + # mps does not support float64 + schedule_timesteps = self.timesteps.to(original_samples.device, dtype=torch.float32) + timesteps = timesteps.to(original_samples.device, dtype=torch.float32) + else: + schedule_timesteps = self.timesteps.to(original_samples.device) + timesteps = timesteps.to(original_samples.device) + + step_indices = [self.index_for_timestep(t, schedule_timesteps) for t in timesteps] + + sigma = sigmas[step_indices].flatten() + while len(sigma.shape) < len(original_samples.shape): + sigma = sigma.unsqueeze(-1) + + noisy_samples = original_samples + noise * sigma + return noisy_samples + + def __len__(self): + return self.config.num_train_timesteps diff --git a/gf/sd-webui-rich-text/diffusers_official/schedulers/scheduling_karras_ve.py b/gf/sd-webui-rich-text/diffusers_official/schedulers/scheduling_karras_ve.py new file mode 100644 index 0000000000000000000000000000000000000000..87f6514a4e93e4a75bd6228ed852306b8c005c3d --- /dev/null +++ b/gf/sd-webui-rich-text/diffusers_official/schedulers/scheduling_karras_ve.py @@ -0,0 +1,232 @@ +# Copyright 2023 NVIDIA and The HuggingFace Team. All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + + +from dataclasses import dataclass +from typing import Optional, Tuple, Union + +import numpy as np +import torch + +from ..configuration_utils import ConfigMixin, register_to_config +from ..utils import BaseOutput, randn_tensor +from .scheduling_utils import SchedulerMixin + + +@dataclass +class KarrasVeOutput(BaseOutput): + """ + Output class for the scheduler's step function output. + + Args: + prev_sample (`torch.FloatTensor` of shape `(batch_size, num_channels, height, width)` for images): + Computed sample (x_{t-1}) of previous timestep. `prev_sample` should be used as next model input in the + denoising loop. + derivative (`torch.FloatTensor` of shape `(batch_size, num_channels, height, width)` for images): + Derivative of predicted original image sample (x_0). + pred_original_sample (`torch.FloatTensor` of shape `(batch_size, num_channels, height, width)` for images): + The predicted denoised sample (x_{0}) based on the model output from the current timestep. + `pred_original_sample` can be used to preview progress or for guidance. 
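# Editorial note (illustrative sketch, not part of the patch): the `step` / `step_correct`
# pair defined below realises Algorithm 2 of Karras et al. (2022): an Euler step from
# sigma_hat to sigma_prev, then a Heun-style correction that averages the two slopes.
# These standalone helpers use hypothetical names and work on plain tensors or floats.

def karras_euler_step(sample_hat, model_output, sigma_hat, sigma_prev):
    pred_original = sample_hat + sigma_hat * model_output     # denoised estimate x_0
    derivative = (sample_hat - pred_original) / sigma_hat     # dx / dsigma
    return sample_hat + (sigma_prev - sigma_hat) * derivative, derivative

def karras_correction_step(sample_hat, sample_prev, model_output_prev, sigma_hat, sigma_prev, derivative):
    pred_original = sample_prev + sigma_prev * model_output_prev
    derivative_corr = (sample_prev - pred_original) / sigma_prev
    return sample_hat + (sigma_prev - sigma_hat) * 0.5 * (derivative + derivative_corr)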
+ """ + + prev_sample: torch.FloatTensor + derivative: torch.FloatTensor + pred_original_sample: Optional[torch.FloatTensor] = None + + +class KarrasVeScheduler(SchedulerMixin, ConfigMixin): + """ + Stochastic sampling from Karras et al. [1] tailored to the Variance-Expanding (VE) models [2]. Use Algorithm 2 and + the VE column of Table 1 from [1] for reference. + + [1] Karras, Tero, et al. "Elucidating the Design Space of Diffusion-Based Generative Models." + https://arxiv.org/abs/2206.00364 [2] Song, Yang, et al. "Score-based generative modeling through stochastic + differential equations." https://arxiv.org/abs/2011.13456 + + [`~ConfigMixin`] takes care of storing all config attributes that are passed in the scheduler's `__init__` + function, such as `num_train_timesteps`. They can be accessed via `scheduler.config.num_train_timesteps`. + [`SchedulerMixin`] provides general loading and saving functionality via the [`SchedulerMixin.save_pretrained`] and + [`~SchedulerMixin.from_pretrained`] functions. + + For more details on the parameters, see the original paper's Appendix E.: "Elucidating the Design Space of + Diffusion-Based Generative Models." https://arxiv.org/abs/2206.00364. The grid search values used to find the + optimal {s_noise, s_churn, s_min, s_max} for a specific model are described in Table 5 of the paper. + + Args: + sigma_min (`float`): minimum noise magnitude + sigma_max (`float`): maximum noise magnitude + s_noise (`float`): the amount of additional noise to counteract loss of detail during sampling. + A reasonable range is [1.000, 1.011]. + s_churn (`float`): the parameter controlling the overall amount of stochasticity. + A reasonable range is [0, 100]. + s_min (`float`): the start value of the sigma range where we add noise (enable stochasticity). + A reasonable range is [0, 10]. + s_max (`float`): the end value of the sigma range where we add noise. + A reasonable range is [0.2, 80]. + + """ + + order = 2 + + @register_to_config + def __init__( + self, + sigma_min: float = 0.02, + sigma_max: float = 100, + s_noise: float = 1.007, + s_churn: float = 80, + s_min: float = 0.05, + s_max: float = 50, + ): + # standard deviation of the initial noise distribution + self.init_noise_sigma = sigma_max + + # setable values + self.num_inference_steps: int = None + self.timesteps: np.IntTensor = None + self.schedule: torch.FloatTensor = None # sigma(t_i) + + def scale_model_input(self, sample: torch.FloatTensor, timestep: Optional[int] = None) -> torch.FloatTensor: + """ + Ensures interchangeability with schedulers that need to scale the denoising model input depending on the + current timestep. + + Args: + sample (`torch.FloatTensor`): input sample + timestep (`int`, optional): current timestep + + Returns: + `torch.FloatTensor`: scaled input sample + """ + return sample + + def set_timesteps(self, num_inference_steps: int, device: Union[str, torch.device] = None): + """ + Sets the continuous timesteps used for the diffusion chain. Supporting function to be run before inference. + + Args: + num_inference_steps (`int`): + the number of diffusion steps used when generating samples with a pre-trained model. 
+ + """ + self.num_inference_steps = num_inference_steps + timesteps = np.arange(0, self.num_inference_steps)[::-1].copy() + self.timesteps = torch.from_numpy(timesteps).to(device) + schedule = [ + ( + self.config.sigma_max**2 + * (self.config.sigma_min**2 / self.config.sigma_max**2) ** (i / (num_inference_steps - 1)) + ) + for i in self.timesteps + ] + self.schedule = torch.tensor(schedule, dtype=torch.float32, device=device) + + def add_noise_to_input( + self, sample: torch.FloatTensor, sigma: float, generator: Optional[torch.Generator] = None + ) -> Tuple[torch.FloatTensor, float]: + """ + Explicit Langevin-like "churn" step of adding noise to the sample according to a factor gamma_i ≥ 0 to reach a + higher noise level sigma_hat = sigma_i + gamma_i*sigma_i. + + TODO Args: + """ + if self.config.s_min <= sigma <= self.config.s_max: + gamma = min(self.config.s_churn / self.num_inference_steps, 2**0.5 - 1) + else: + gamma = 0 + + # sample eps ~ N(0, S_noise^2 * I) + eps = self.config.s_noise * randn_tensor(sample.shape, generator=generator).to(sample.device) + sigma_hat = sigma + gamma * sigma + sample_hat = sample + ((sigma_hat**2 - sigma**2) ** 0.5 * eps) + + return sample_hat, sigma_hat + + def step( + self, + model_output: torch.FloatTensor, + sigma_hat: float, + sigma_prev: float, + sample_hat: torch.FloatTensor, + return_dict: bool = True, + ) -> Union[KarrasVeOutput, Tuple]: + """ + Predict the sample at the previous timestep by reversing the SDE. Core function to propagate the diffusion + process from the learned model outputs (most often the predicted noise). + + Args: + model_output (`torch.FloatTensor`): direct output from learned diffusion model. + sigma_hat (`float`): TODO + sigma_prev (`float`): TODO + sample_hat (`torch.FloatTensor`): TODO + return_dict (`bool`): option for returning tuple rather than KarrasVeOutput class + + KarrasVeOutput: updated sample in the diffusion chain and derivative (TODO double check). + Returns: + [`~schedulers.scheduling_karras_ve.KarrasVeOutput`] or `tuple`: + [`~schedulers.scheduling_karras_ve.KarrasVeOutput`] if `return_dict` is True, otherwise a `tuple`. When + returning a tuple, the first element is the sample tensor. + + """ + + pred_original_sample = sample_hat + sigma_hat * model_output + derivative = (sample_hat - pred_original_sample) / sigma_hat + sample_prev = sample_hat + (sigma_prev - sigma_hat) * derivative + + if not return_dict: + return (sample_prev, derivative) + + return KarrasVeOutput( + prev_sample=sample_prev, derivative=derivative, pred_original_sample=pred_original_sample + ) + + def step_correct( + self, + model_output: torch.FloatTensor, + sigma_hat: float, + sigma_prev: float, + sample_hat: torch.FloatTensor, + sample_prev: torch.FloatTensor, + derivative: torch.FloatTensor, + return_dict: bool = True, + ) -> Union[KarrasVeOutput, Tuple]: + """ + Correct the predicted sample based on the output model_output of the network. TODO complete description + + Args: + model_output (`torch.FloatTensor`): direct output from learned diffusion model. + sigma_hat (`float`): TODO + sigma_prev (`float`): TODO + sample_hat (`torch.FloatTensor`): TODO + sample_prev (`torch.FloatTensor`): TODO + derivative (`torch.FloatTensor`): TODO + return_dict (`bool`): option for returning tuple rather than KarrasVeOutput class + + Returns: + prev_sample (TODO): updated sample in the diffusion chain. 
derivative (TODO): TODO + + """ + pred_original_sample = sample_prev + sigma_prev * model_output + derivative_corr = (sample_prev - pred_original_sample) / sigma_prev + sample_prev = sample_hat + (sigma_prev - sigma_hat) * (0.5 * derivative + 0.5 * derivative_corr) + + if not return_dict: + return (sample_prev, derivative) + + return KarrasVeOutput( + prev_sample=sample_prev, derivative=derivative, pred_original_sample=pred_original_sample + ) + + def add_noise(self, original_samples, noise, timesteps): + raise NotImplementedError() diff --git a/gf/sd-webui-rich-text/diffusers_official/schedulers/scheduling_karras_ve_flax.py b/gf/sd-webui-rich-text/diffusers_official/schedulers/scheduling_karras_ve_flax.py new file mode 100644 index 0000000000000000000000000000000000000000..45c0dbddf7efd22df21cc9859e68d62b54aa8609 --- /dev/null +++ b/gf/sd-webui-rich-text/diffusers_official/schedulers/scheduling_karras_ve_flax.py @@ -0,0 +1,237 @@ +# Copyright 2023 NVIDIA and The HuggingFace Team. All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + + +from dataclasses import dataclass +from typing import Optional, Tuple, Union + +import flax +import jax.numpy as jnp +from jax import random + +from ..configuration_utils import ConfigMixin, register_to_config +from ..utils import BaseOutput +from .scheduling_utils_flax import FlaxSchedulerMixin + + +@flax.struct.dataclass +class KarrasVeSchedulerState: + # setable values + num_inference_steps: Optional[int] = None + timesteps: Optional[jnp.ndarray] = None + schedule: Optional[jnp.ndarray] = None # sigma(t_i) + + @classmethod + def create(cls): + return cls() + + +@dataclass +class FlaxKarrasVeOutput(BaseOutput): + """ + Output class for the scheduler's step function output. + + Args: + prev_sample (`jnp.ndarray` of shape `(batch_size, num_channels, height, width)` for images): + Computed sample (x_{t-1}) of previous timestep. `prev_sample` should be used as next model input in the + denoising loop. + derivative (`jnp.ndarray` of shape `(batch_size, num_channels, height, width)` for images): + Derivative of predicted original image sample (x_0). + state (`KarrasVeSchedulerState`): the `FlaxKarrasVeScheduler` state data class. + """ + + prev_sample: jnp.ndarray + derivative: jnp.ndarray + state: KarrasVeSchedulerState + + +class FlaxKarrasVeScheduler(FlaxSchedulerMixin, ConfigMixin): + """ + Stochastic sampling from Karras et al. [1] tailored to the Variance-Expanding (VE) models [2]. Use Algorithm 2 and + the VE column of Table 1 from [1] for reference. + + [1] Karras, Tero, et al. "Elucidating the Design Space of Diffusion-Based Generative Models." + https://arxiv.org/abs/2206.00364 [2] Song, Yang, et al. "Score-based generative modeling through stochastic + differential equations." https://arxiv.org/abs/2011.13456 + + [`~ConfigMixin`] takes care of storing all config attributes that are passed in the scheduler's `__init__` + function, such as `num_train_timesteps`. They can be accessed via `scheduler.config.num_train_timesteps`. 
+ [`SchedulerMixin`] provides general loading and saving functionality via the [`SchedulerMixin.save_pretrained`] and + [`~SchedulerMixin.from_pretrained`] functions. + + For more details on the parameters, see the original paper's Appendix E.: "Elucidating the Design Space of + Diffusion-Based Generative Models." https://arxiv.org/abs/2206.00364. The grid search values used to find the + optimal {s_noise, s_churn, s_min, s_max} for a specific model are described in Table 5 of the paper. + + Args: + sigma_min (`float`): minimum noise magnitude + sigma_max (`float`): maximum noise magnitude + s_noise (`float`): the amount of additional noise to counteract loss of detail during sampling. + A reasonable range is [1.000, 1.011]. + s_churn (`float`): the parameter controlling the overall amount of stochasticity. + A reasonable range is [0, 100]. + s_min (`float`): the start value of the sigma range where we add noise (enable stochasticity). + A reasonable range is [0, 10]. + s_max (`float`): the end value of the sigma range where we add noise. + A reasonable range is [0.2, 80]. + """ + + @property + def has_state(self): + return True + + @register_to_config + def __init__( + self, + sigma_min: float = 0.02, + sigma_max: float = 100, + s_noise: float = 1.007, + s_churn: float = 80, + s_min: float = 0.05, + s_max: float = 50, + ): + pass + + def create_state(self): + return KarrasVeSchedulerState.create() + + def set_timesteps( + self, state: KarrasVeSchedulerState, num_inference_steps: int, shape: Tuple = () + ) -> KarrasVeSchedulerState: + """ + Sets the continuous timesteps used for the diffusion chain. Supporting function to be run before inference. + + Args: + state (`KarrasVeSchedulerState`): + the `FlaxKarrasVeScheduler` state data class. + num_inference_steps (`int`): + the number of diffusion steps used when generating samples with a pre-trained model. + + """ + timesteps = jnp.arange(0, num_inference_steps)[::-1].copy() + schedule = [ + ( + self.config.sigma_max**2 + * (self.config.sigma_min**2 / self.config.sigma_max**2) ** (i / (num_inference_steps - 1)) + ) + for i in timesteps + ] + + return state.replace( + num_inference_steps=num_inference_steps, + schedule=jnp.array(schedule, dtype=jnp.float32), + timesteps=timesteps, + ) + + def add_noise_to_input( + self, + state: KarrasVeSchedulerState, + sample: jnp.ndarray, + sigma: float, + key: random.KeyArray, + ) -> Tuple[jnp.ndarray, float]: + """ + Explicit Langevin-like "churn" step of adding noise to the sample according to a factor gamma_i ≥ 0 to reach a + higher noise level sigma_hat = sigma_i + gamma_i*sigma_i. + + TODO Args: + """ + if self.config.s_min <= sigma <= self.config.s_max: + gamma = min(self.config.s_churn / state.num_inference_steps, 2**0.5 - 1) + else: + gamma = 0 + + # sample eps ~ N(0, S_noise^2 * I) + key = random.split(key, num=1) + eps = self.config.s_noise * random.normal(key=key, shape=sample.shape) + sigma_hat = sigma + gamma * sigma + sample_hat = sample + ((sigma_hat**2 - sigma**2) ** 0.5 * eps) + + return sample_hat, sigma_hat + + def step( + self, + state: KarrasVeSchedulerState, + model_output: jnp.ndarray, + sigma_hat: float, + sigma_prev: float, + sample_hat: jnp.ndarray, + return_dict: bool = True, + ) -> Union[FlaxKarrasVeOutput, Tuple]: + """ + Predict the sample at the previous timestep by reversing the SDE. Core function to propagate the diffusion + process from the learned model outputs (most often the predicted noise). 
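# Editorial note (hedged sketch, not part of the patch): the Flax scheduler is functional
# rather than stateful -- `create_state` builds an immutable state dataclass and
# `set_timesteps` / `step` return updated copies instead of mutating `self`, which keeps
# the sampling loop compatible with `jax.jit`. Assumes the upstream `diffusers` package
# with its jax/flax extras installed.
from diffusers import FlaxKarrasVeScheduler

scheduler = FlaxKarrasVeScheduler()
state = scheduler.create_state()                                # immutable flax.struct dataclass
state = scheduler.set_timesteps(state, num_inference_steps=10)  # returns a new state object
print(state.num_inference_steps, state.schedule.shape)          # 10 (10,)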
+ + Args: + state (`KarrasVeSchedulerState`): the `FlaxKarrasVeScheduler` state data class. + model_output (`torch.FloatTensor` or `np.ndarray`): direct output from learned diffusion model. + sigma_hat (`float`): TODO + sigma_prev (`float`): TODO + sample_hat (`torch.FloatTensor` or `np.ndarray`): TODO + return_dict (`bool`): option for returning tuple rather than FlaxKarrasVeOutput class + + Returns: + [`~schedulers.scheduling_karras_ve_flax.FlaxKarrasVeOutput`] or `tuple`: Updated sample in the diffusion + chain and derivative. [`~schedulers.scheduling_karras_ve_flax.FlaxKarrasVeOutput`] if `return_dict` is + True, otherwise a `tuple`. When returning a tuple, the first element is the sample tensor. + """ + + pred_original_sample = sample_hat + sigma_hat * model_output + derivative = (sample_hat - pred_original_sample) / sigma_hat + sample_prev = sample_hat + (sigma_prev - sigma_hat) * derivative + + if not return_dict: + return (sample_prev, derivative, state) + + return FlaxKarrasVeOutput(prev_sample=sample_prev, derivative=derivative, state=state) + + def step_correct( + self, + state: KarrasVeSchedulerState, + model_output: jnp.ndarray, + sigma_hat: float, + sigma_prev: float, + sample_hat: jnp.ndarray, + sample_prev: jnp.ndarray, + derivative: jnp.ndarray, + return_dict: bool = True, + ) -> Union[FlaxKarrasVeOutput, Tuple]: + """ + Correct the predicted sample based on the output model_output of the network. TODO complete description + + Args: + state (`KarrasVeSchedulerState`): the `FlaxKarrasVeScheduler` state data class. + model_output (`torch.FloatTensor` or `np.ndarray`): direct output from learned diffusion model. + sigma_hat (`float`): TODO + sigma_prev (`float`): TODO + sample_hat (`torch.FloatTensor` or `np.ndarray`): TODO + sample_prev (`torch.FloatTensor` or `np.ndarray`): TODO + derivative (`torch.FloatTensor` or `np.ndarray`): TODO + return_dict (`bool`): option for returning tuple rather than FlaxKarrasVeOutput class + + Returns: + prev_sample (TODO): updated sample in the diffusion chain. derivative (TODO): TODO + + """ + pred_original_sample = sample_prev + sigma_prev * model_output + derivative_corr = (sample_prev - pred_original_sample) / sigma_prev + sample_prev = sample_hat + (sigma_prev - sigma_hat) * (0.5 * derivative + 0.5 * derivative_corr) + + if not return_dict: + return (sample_prev, derivative, state) + + return FlaxKarrasVeOutput(prev_sample=sample_prev, derivative=derivative, state=state) + + def add_noise(self, state: KarrasVeSchedulerState, original_samples, noise, timesteps): + raise NotImplementedError() diff --git a/gf/sd-webui-rich-text/diffusers_official/schedulers/scheduling_lms_discrete.py b/gf/sd-webui-rich-text/diffusers_official/schedulers/scheduling_lms_discrete.py new file mode 100644 index 0000000000000000000000000000000000000000..d58d4ce45bd17645b86905c1ae36ce937015fc29 --- /dev/null +++ b/gf/sd-webui-rich-text/diffusers_official/schedulers/scheduling_lms_discrete.py @@ -0,0 +1,413 @@ +# Copyright 2023 Katherine Crowson and The HuggingFace Team. All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 
+# See the License for the specific language governing permissions and +# limitations under the License. +import math +import warnings +from dataclasses import dataclass +from typing import List, Optional, Tuple, Union + +import numpy as np +import torch +from scipy import integrate + +from ..configuration_utils import ConfigMixin, register_to_config +from ..utils import BaseOutput +from .scheduling_utils import KarrasDiffusionSchedulers, SchedulerMixin + + +@dataclass +# Copied from diffusers.schedulers.scheduling_ddpm.DDPMSchedulerOutput with DDPM->LMSDiscrete +class LMSDiscreteSchedulerOutput(BaseOutput): + """ + Output class for the scheduler's step function output. + + Args: + prev_sample (`torch.FloatTensor` of shape `(batch_size, num_channels, height, width)` for images): + Computed sample (x_{t-1}) of previous timestep. `prev_sample` should be used as next model input in the + denoising loop. + pred_original_sample (`torch.FloatTensor` of shape `(batch_size, num_channels, height, width)` for images): + The predicted denoised sample (x_{0}) based on the model output from the current timestep. + `pred_original_sample` can be used to preview progress or for guidance. + """ + + prev_sample: torch.FloatTensor + pred_original_sample: Optional[torch.FloatTensor] = None + + +# Copied from diffusers.schedulers.scheduling_ddpm.betas_for_alpha_bar +def betas_for_alpha_bar( + num_diffusion_timesteps, + max_beta=0.999, + alpha_transform_type="cosine", +): + """ + Create a beta schedule that discretizes the given alpha_t_bar function, which defines the cumulative product of + (1-beta) over time from t = [0,1]. + + Contains a function alpha_bar that takes an argument t and transforms it to the cumulative product of (1-beta) up + to that part of the diffusion process. + + + Args: + num_diffusion_timesteps (`int`): the number of betas to produce. + max_beta (`float`): the maximum beta to use; use values lower than 1 to + prevent singularities. + alpha_transform_type (`str`, *optional*, default to `cosine`): the type of noise schedule for alpha_bar. + Choose from `cosine` or `exp` + + Returns: + betas (`np.ndarray`): the betas used by the scheduler to step the model outputs + """ + if alpha_transform_type == "cosine": + + def alpha_bar_fn(t): + return math.cos((t + 0.008) / 1.008 * math.pi / 2) ** 2 + + elif alpha_transform_type == "exp": + + def alpha_bar_fn(t): + return math.exp(t * -12.0) + + else: + raise ValueError(f"Unsupported alpha_tranform_type: {alpha_transform_type}") + + betas = [] + for i in range(num_diffusion_timesteps): + t1 = i / num_diffusion_timesteps + t2 = (i + 1) / num_diffusion_timesteps + betas.append(min(1 - alpha_bar_fn(t2) / alpha_bar_fn(t1), max_beta)) + return torch.tensor(betas, dtype=torch.float32) + + +class LMSDiscreteScheduler(SchedulerMixin, ConfigMixin): + """ + Linear Multistep Scheduler for discrete beta schedules. Based on the original k-diffusion implementation by + Katherine Crowson: + https://github.com/crowsonkb/k-diffusion/blob/481677d114f6ea445aa009cf5bd7a9cdee909e47/k_diffusion/sampling.py#L181 + + [`~ConfigMixin`] takes care of storing all config attributes that are passed in the scheduler's `__init__` + function, such as `num_train_timesteps`. They can be accessed via `scheduler.config.num_train_timesteps`. + [`SchedulerMixin`] provides general loading and saving functionality via the [`SchedulerMixin.save_pretrained`] and + [`~SchedulerMixin.from_pretrained`] functions. 
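# Editorial note (illustrative sketch, not part of the patch): the K-LMS update weights each
# stored derivative by the integral of its Lagrange basis polynomial in sigma over
# [sigma_t, sigma_{t+1}]; this mirrors `get_lms_coefficient` defined further below.
# `lms_coefficient_sketch` is a hypothetical helper name.
import numpy as np
from scipy import integrate

def lms_coefficient_sketch(sigmas: np.ndarray, order: int, t: int, current_order: int) -> float:
    def lms_derivative(tau: float) -> float:
        prod = 1.0
        for k in range(order):
            if current_order == k:
                continue
            prod *= (tau - sigmas[t - k]) / (sigmas[t - current_order] - sigmas[t - k])
        return prod

    return integrate.quad(lms_derivative, sigmas[t], sigmas[t + 1], epsrel=1e-4)[0]

# example: sigmas = np.linspace(14.6, 0.03, 21); lms_coefficient_sketch(sigmas, order=2, t=5, current_order=0)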
+ + Args: + num_train_timesteps (`int`): number of diffusion steps used to train the model. + beta_start (`float`): the starting `beta` value of inference. + beta_end (`float`): the final `beta` value. + beta_schedule (`str`): + the beta schedule, a mapping from a beta range to a sequence of betas for stepping the model. Choose from + `linear` or `scaled_linear`. + trained_betas (`np.ndarray`, optional): + option to pass an array of betas directly to the constructor to bypass `beta_start`, `beta_end` etc. + use_karras_sigmas (`bool`, *optional*, defaults to `False`): + This parameter controls whether to use Karras sigmas (Karras et al. (2022) scheme) for step sizes in the + noise schedule during the sampling process. If True, the sigmas will be determined according to a sequence + of noise levels {σi} as defined in Equation (5) of the paper https://arxiv.org/pdf/2206.00364.pdf. + prediction_type (`str`, default `epsilon`, optional): + prediction type of the scheduler function, one of `epsilon` (predicting the noise of the diffusion + process), `sample` (directly predicting the noisy sample`) or `v_prediction` (see section 2.4 + https://imagen.research.google/video/paper.pdf) + timestep_spacing (`str`, default `"linspace"`): + The way the timesteps should be scaled. Refer to Table 2. of [Common Diffusion Noise Schedules and Sample + Steps are Flawed](https://arxiv.org/abs/2305.08891) for more information. + steps_offset (`int`, default `0`): + an offset added to the inference steps. You can use a combination of `offset=1` and + `set_alpha_to_one=False`, to make the last step use step 0 for the previous alpha product, as done in + stable diffusion. + """ + + _compatibles = [e.name for e in KarrasDiffusionSchedulers] + order = 1 + + @register_to_config + def __init__( + self, + num_train_timesteps: int = 1000, + beta_start: float = 0.0001, + beta_end: float = 0.02, + beta_schedule: str = "linear", + trained_betas: Optional[Union[np.ndarray, List[float]]] = None, + use_karras_sigmas: Optional[bool] = False, + prediction_type: str = "epsilon", + timestep_spacing: str = "linspace", + steps_offset: int = 0, + ): + if trained_betas is not None: + self.betas = torch.tensor(trained_betas, dtype=torch.float32) + elif beta_schedule == "linear": + self.betas = torch.linspace(beta_start, beta_end, num_train_timesteps, dtype=torch.float32) + elif beta_schedule == "scaled_linear": + # this schedule is very specific to the latent diffusion model. 
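+ # i.e. the linspace is taken over sqrt(beta) and the result is squared, so the betas grow quadratically from beta_start to beta_end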
+ self.betas = ( + torch.linspace(beta_start**0.5, beta_end**0.5, num_train_timesteps, dtype=torch.float32) ** 2 + ) + elif beta_schedule == "squaredcos_cap_v2": + # Glide cosine schedule + self.betas = betas_for_alpha_bar(num_train_timesteps) + else: + raise NotImplementedError(f"{beta_schedule} does is not implemented for {self.__class__}") + + self.alphas = 1.0 - self.betas + self.alphas_cumprod = torch.cumprod(self.alphas, dim=0) + + sigmas = np.array(((1 - self.alphas_cumprod) / self.alphas_cumprod) ** 0.5) + sigmas = np.concatenate([sigmas[::-1], [0.0]]).astype(np.float32) + self.sigmas = torch.from_numpy(sigmas) + + # setable values + self.num_inference_steps = None + self.use_karras_sigmas = use_karras_sigmas + self.set_timesteps(num_train_timesteps, None) + self.derivatives = [] + self.is_scale_input_called = False + + @property + def init_noise_sigma(self): + # standard deviation of the initial noise distribution + if self.config.timestep_spacing in ["linspace", "trailing"]: + return self.sigmas.max() + + return (self.sigmas.max() ** 2 + 1) ** 0.5 + + def scale_model_input( + self, sample: torch.FloatTensor, timestep: Union[float, torch.FloatTensor] + ) -> torch.FloatTensor: + """ + Scales the denoising model input by `(sigma**2 + 1) ** 0.5` to match the K-LMS algorithm. + + Args: + sample (`torch.FloatTensor`): input sample + timestep (`float` or `torch.FloatTensor`): the current timestep in the diffusion chain + + Returns: + `torch.FloatTensor`: scaled input sample + """ + if isinstance(timestep, torch.Tensor): + timestep = timestep.to(self.timesteps.device) + step_index = (self.timesteps == timestep).nonzero().item() + sigma = self.sigmas[step_index] + sample = sample / ((sigma**2 + 1) ** 0.5) + self.is_scale_input_called = True + return sample + + def get_lms_coefficient(self, order, t, current_order): + """ + Compute a linear multistep coefficient. + + Args: + order (TODO): + t (TODO): + current_order (TODO): + """ + + def lms_derivative(tau): + prod = 1.0 + for k in range(order): + if current_order == k: + continue + prod *= (tau - self.sigmas[t - k]) / (self.sigmas[t - current_order] - self.sigmas[t - k]) + return prod + + integrated_coeff = integrate.quad(lms_derivative, self.sigmas[t], self.sigmas[t + 1], epsrel=1e-4)[0] + + return integrated_coeff + + def set_timesteps(self, num_inference_steps: int, device: Union[str, torch.device] = None): + """ + Sets the timesteps used for the diffusion chain. Supporting function to be run before inference. + + Args: + num_inference_steps (`int`): + the number of diffusion steps used when generating samples with a pre-trained model. + device (`str` or `torch.device`, optional): + the device to which the timesteps should be moved to. If `None`, the timesteps are not moved. + """ + self.num_inference_steps = num_inference_steps + + # "linspace", "leading", "trailing" corresponds to annotation of Table 2. 
of https://arxiv.org/abs/2305.08891 + if self.config.timestep_spacing == "linspace": + timesteps = np.linspace(0, self.config.num_train_timesteps - 1, num_inference_steps, dtype=float)[ + ::-1 + ].copy() + elif self.config.timestep_spacing == "leading": + step_ratio = self.config.num_train_timesteps // self.num_inference_steps + # creates integer timesteps by multiplying by ratio + # casting to int to avoid issues when num_inference_step is power of 3 + timesteps = (np.arange(0, num_inference_steps) * step_ratio).round()[::-1].copy().astype(float) + timesteps += self.config.steps_offset + elif self.config.timestep_spacing == "trailing": + step_ratio = self.config.num_train_timesteps / self.num_inference_steps + # creates integer timesteps by multiplying by ratio + # casting to int to avoid issues when num_inference_step is power of 3 + timesteps = (np.arange(self.config.num_train_timesteps, 0, -step_ratio)).round().copy().astype(float) + timesteps -= 1 + else: + raise ValueError( + f"{self.config.timestep_spacing} is not supported. Please make sure to choose one of 'linspace', 'leading' or 'trailing'." + ) + + sigmas = np.array(((1 - self.alphas_cumprod) / self.alphas_cumprod) ** 0.5) + log_sigmas = np.log(sigmas) + sigmas = np.interp(timesteps, np.arange(0, len(sigmas)), sigmas) + + if self.use_karras_sigmas: + sigmas = self._convert_to_karras(in_sigmas=sigmas) + timesteps = np.array([self._sigma_to_t(sigma, log_sigmas) for sigma in sigmas]) + + sigmas = np.concatenate([sigmas, [0.0]]).astype(np.float32) + + self.sigmas = torch.from_numpy(sigmas).to(device=device) + if str(device).startswith("mps"): + # mps does not support float64 + self.timesteps = torch.from_numpy(timesteps).to(device, dtype=torch.float32) + else: + self.timesteps = torch.from_numpy(timesteps).to(device=device) + + self.derivatives = [] + + # copied from diffusers.schedulers.scheduling_euler_discrete._sigma_to_t + def _sigma_to_t(self, sigma, log_sigmas): + # get log sigma + log_sigma = np.log(sigma) + + # get distribution + dists = log_sigma - log_sigmas[:, np.newaxis] + + # get sigmas range + low_idx = np.cumsum((dists >= 0), axis=0).argmax(axis=0).clip(max=log_sigmas.shape[0] - 2) + high_idx = low_idx + 1 + + low = log_sigmas[low_idx] + high = log_sigmas[high_idx] + + # interpolate sigmas + w = (low - log_sigma) / (low - high) + w = np.clip(w, 0, 1) + + # transform interpolation to time range + t = (1 - w) * low_idx + w * high_idx + t = t.reshape(sigma.shape) + return t + + # copied from diffusers.schedulers.scheduling_euler_discrete._convert_to_karras + def _convert_to_karras(self, in_sigmas: torch.FloatTensor) -> torch.FloatTensor: + """Constructs the noise schedule of Karras et al. (2022).""" + + sigma_min: float = in_sigmas[-1].item() + sigma_max: float = in_sigmas[0].item() + + rho = 7.0 # 7.0 is the value used in the paper + ramp = np.linspace(0, 1, self.num_inference_steps) + min_inv_rho = sigma_min ** (1 / rho) + max_inv_rho = sigma_max ** (1 / rho) + sigmas = (max_inv_rho + ramp * (min_inv_rho - max_inv_rho)) ** rho + return sigmas + + def step( + self, + model_output: torch.FloatTensor, + timestep: Union[float, torch.FloatTensor], + sample: torch.FloatTensor, + order: int = 4, + return_dict: bool = True, + ) -> Union[LMSDiscreteSchedulerOutput, Tuple]: + """ + Predict the sample at the previous timestep by reversing the SDE. Core function to propagate the diffusion + process from the learned model outputs (most often the predicted noise). 
+ + Args: + model_output (`torch.FloatTensor`): direct output from learned diffusion model. + timestep (`float`): current timestep in the diffusion chain. + sample (`torch.FloatTensor`): + current instance of sample being created by diffusion process. + order: coefficient for multi-step inference. + return_dict (`bool`): option for returning tuple rather than LMSDiscreteSchedulerOutput class + + Returns: + [`~schedulers.scheduling_utils.LMSDiscreteSchedulerOutput`] or `tuple`: + [`~schedulers.scheduling_utils.LMSDiscreteSchedulerOutput`] if `return_dict` is True, otherwise a `tuple`. + When returning a tuple, the first element is the sample tensor. + + """ + if not self.is_scale_input_called: + warnings.warn( + "The `scale_model_input` function should be called before `step` to ensure correct denoising. " + "See `StableDiffusionPipeline` for a usage example." + ) + + if isinstance(timestep, torch.Tensor): + timestep = timestep.to(self.timesteps.device) + step_index = (self.timesteps == timestep).nonzero().item() + sigma = self.sigmas[step_index] + + # 1. compute predicted original sample (x_0) from sigma-scaled predicted noise + if self.config.prediction_type == "epsilon": + pred_original_sample = sample - sigma * model_output + elif self.config.prediction_type == "v_prediction": + # * c_out + input * c_skip + pred_original_sample = model_output * (-sigma / (sigma**2 + 1) ** 0.5) + (sample / (sigma**2 + 1)) + elif self.config.prediction_type == "sample": + pred_original_sample = model_output + else: + raise ValueError( + f"prediction_type given as {self.config.prediction_type} must be one of `epsilon`, or `v_prediction`" + ) + + # 2. Convert to an ODE derivative + derivative = (sample - pred_original_sample) / sigma + self.derivatives.append(derivative) + if len(self.derivatives) > order: + self.derivatives.pop(0) + + # 3. Compute linear multistep coefficients + order = min(step_index + 1, order) + lms_coeffs = [self.get_lms_coefficient(order, step_index, curr_order) for curr_order in range(order)] + + # 4. 
Compute previous sample based on the derivatives path + prev_sample = sample + sum( + coeff * derivative for coeff, derivative in zip(lms_coeffs, reversed(self.derivatives)) + ) + + if not return_dict: + return (prev_sample,) + + return LMSDiscreteSchedulerOutput(prev_sample=prev_sample, pred_original_sample=pred_original_sample) + + # Copied from diffusers.schedulers.scheduling_euler_discrete.EulerDiscreteScheduler.add_noise + def add_noise( + self, + original_samples: torch.FloatTensor, + noise: torch.FloatTensor, + timesteps: torch.FloatTensor, + ) -> torch.FloatTensor: + # Make sure sigmas and timesteps have the same device and dtype as original_samples + sigmas = self.sigmas.to(device=original_samples.device, dtype=original_samples.dtype) + if original_samples.device.type == "mps" and torch.is_floating_point(timesteps): + # mps does not support float64 + schedule_timesteps = self.timesteps.to(original_samples.device, dtype=torch.float32) + timesteps = timesteps.to(original_samples.device, dtype=torch.float32) + else: + schedule_timesteps = self.timesteps.to(original_samples.device) + timesteps = timesteps.to(original_samples.device) + + step_indices = [(schedule_timesteps == t).nonzero().item() for t in timesteps] + + sigma = sigmas[step_indices].flatten() + while len(sigma.shape) < len(original_samples.shape): + sigma = sigma.unsqueeze(-1) + + noisy_samples = original_samples + noise * sigma + return noisy_samples + + def __len__(self): + return self.config.num_train_timesteps diff --git a/gf/sd-webui-rich-text/diffusers_official/schedulers/scheduling_lms_discrete_flax.py b/gf/sd-webui-rich-text/diffusers_official/schedulers/scheduling_lms_discrete_flax.py new file mode 100644 index 0000000000000000000000000000000000000000..f96e602afe121a09876b0ff7db1d3192e441e32a --- /dev/null +++ b/gf/sd-webui-rich-text/diffusers_official/schedulers/scheduling_lms_discrete_flax.py @@ -0,0 +1,283 @@ +# Copyright 2023 Katherine Crowson and The HuggingFace Team. All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
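Before the Flax port that follows, a brief usage note: the PyTorch `LMSDiscreteScheduler` defined above is normally driven by a loop that scales the model input, queries the denoiser, and then calls `step`. The sketch below is illustrative only; the import path assumes the upstream `diffusers` package that this vendored copy mirrors, and `denoiser` is a hypothetical stand-in for a trained noise-prediction UNet.

```python
# Minimal sketch of a K-LMS denoising loop driven by the scheduler above.
# Assumptions: the scheduler class is importable from the upstream `diffusers`
# package, and `denoiser` is a placeholder for a trained epsilon-prediction UNet.
import torch
from diffusers import LMSDiscreteScheduler

scheduler = LMSDiscreteScheduler(beta_start=0.00085, beta_end=0.012, beta_schedule="scaled_linear")
scheduler.set_timesteps(num_inference_steps=25)

def denoiser(x, t):
    # placeholder model: a real UNet would predict the noise residual here
    return torch.randn_like(x)

sample = torch.randn(1, 4, 64, 64) * scheduler.init_noise_sigma
for t in scheduler.timesteps:
    model_input = scheduler.scale_model_input(sample, t)        # divide by (sigma**2 + 1) ** 0.5
    noise_pred = denoiser(model_input, t)
    sample = scheduler.step(noise_pred, t, sample).prev_sample  # linear multistep update (order 4)
```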
+ +from dataclasses import dataclass +from typing import Optional, Tuple, Union + +import flax +import jax.numpy as jnp +from scipy import integrate + +from ..configuration_utils import ConfigMixin, register_to_config +from .scheduling_utils_flax import ( + CommonSchedulerState, + FlaxKarrasDiffusionSchedulers, + FlaxSchedulerMixin, + FlaxSchedulerOutput, + broadcast_to_shape_from_left, +) + + +@flax.struct.dataclass +class LMSDiscreteSchedulerState: + common: CommonSchedulerState + + # setable values + init_noise_sigma: jnp.ndarray + timesteps: jnp.ndarray + sigmas: jnp.ndarray + num_inference_steps: Optional[int] = None + + # running values + derivatives: Optional[jnp.ndarray] = None + + @classmethod + def create( + cls, common: CommonSchedulerState, init_noise_sigma: jnp.ndarray, timesteps: jnp.ndarray, sigmas: jnp.ndarray + ): + return cls(common=common, init_noise_sigma=init_noise_sigma, timesteps=timesteps, sigmas=sigmas) + + +@dataclass +class FlaxLMSSchedulerOutput(FlaxSchedulerOutput): + state: LMSDiscreteSchedulerState + + +class FlaxLMSDiscreteScheduler(FlaxSchedulerMixin, ConfigMixin): + """ + Linear Multistep Scheduler for discrete beta schedules. Based on the original k-diffusion implementation by + Katherine Crowson: + https://github.com/crowsonkb/k-diffusion/blob/481677d114f6ea445aa009cf5bd7a9cdee909e47/k_diffusion/sampling.py#L181 + + [`~ConfigMixin`] takes care of storing all config attributes that are passed in the scheduler's `__init__` + function, such as `num_train_timesteps`. They can be accessed via `scheduler.config.num_train_timesteps`. + [`SchedulerMixin`] provides general loading and saving functionality via the [`SchedulerMixin.save_pretrained`] and + [`~SchedulerMixin.from_pretrained`] functions. + + Args: + num_train_timesteps (`int`): number of diffusion steps used to train the model. + beta_start (`float`): the starting `beta` value of inference. + beta_end (`float`): the final `beta` value. + beta_schedule (`str`): + the beta schedule, a mapping from a beta range to a sequence of betas for stepping the model. Choose from + `linear` or `scaled_linear`. + trained_betas (`jnp.ndarray`, optional): + option to pass an array of betas directly to the constructor to bypass `beta_start`, `beta_end` etc. + prediction_type (`str`, default `epsilon`, optional): + prediction type of the scheduler function, one of `epsilon` (predicting the noise of the diffusion + process), `sample` (directly predicting the noisy sample`) or `v_prediction` (see section 2.4 + https://imagen.research.google/video/paper.pdf) + dtype (`jnp.dtype`, *optional*, defaults to `jnp.float32`): + the `dtype` used for params and computation. 
+ """ + + _compatibles = [e.name for e in FlaxKarrasDiffusionSchedulers] + + dtype: jnp.dtype + + @property + def has_state(self): + return True + + @register_to_config + def __init__( + self, + num_train_timesteps: int = 1000, + beta_start: float = 0.0001, + beta_end: float = 0.02, + beta_schedule: str = "linear", + trained_betas: Optional[jnp.ndarray] = None, + prediction_type: str = "epsilon", + dtype: jnp.dtype = jnp.float32, + ): + self.dtype = dtype + + def create_state(self, common: Optional[CommonSchedulerState] = None) -> LMSDiscreteSchedulerState: + if common is None: + common = CommonSchedulerState.create(self) + + timesteps = jnp.arange(0, self.config.num_train_timesteps).round()[::-1] + sigmas = ((1 - common.alphas_cumprod) / common.alphas_cumprod) ** 0.5 + + # standard deviation of the initial noise distribution + init_noise_sigma = sigmas.max() + + return LMSDiscreteSchedulerState.create( + common=common, + init_noise_sigma=init_noise_sigma, + timesteps=timesteps, + sigmas=sigmas, + ) + + def scale_model_input(self, state: LMSDiscreteSchedulerState, sample: jnp.ndarray, timestep: int) -> jnp.ndarray: + """ + Scales the denoising model input by `(sigma**2 + 1) ** 0.5` to match the K-LMS algorithm. + + Args: + state (`LMSDiscreteSchedulerState`): + the `FlaxLMSDiscreteScheduler` state data class instance. + sample (`jnp.ndarray`): + current instance of sample being created by diffusion process. + timestep (`int`): + current discrete timestep in the diffusion chain. + + Returns: + `jnp.ndarray`: scaled input sample + """ + (step_index,) = jnp.where(state.timesteps == timestep, size=1) + step_index = step_index[0] + + sigma = state.sigmas[step_index] + sample = sample / ((sigma**2 + 1) ** 0.5) + return sample + + def get_lms_coefficient(self, state: LMSDiscreteSchedulerState, order, t, current_order): + """ + Compute a linear multistep coefficient. + + Args: + order (TODO): + t (TODO): + current_order (TODO): + """ + + def lms_derivative(tau): + prod = 1.0 + for k in range(order): + if current_order == k: + continue + prod *= (tau - state.sigmas[t - k]) / (state.sigmas[t - current_order] - state.sigmas[t - k]) + return prod + + integrated_coeff = integrate.quad(lms_derivative, state.sigmas[t], state.sigmas[t + 1], epsrel=1e-4)[0] + + return integrated_coeff + + def set_timesteps( + self, state: LMSDiscreteSchedulerState, num_inference_steps: int, shape: Tuple = () + ) -> LMSDiscreteSchedulerState: + """ + Sets the timesteps used for the diffusion chain. Supporting function to be run before inference. + + Args: + state (`LMSDiscreteSchedulerState`): + the `FlaxLMSDiscreteScheduler` state data class instance. + num_inference_steps (`int`): + the number of diffusion steps used when generating samples with a pre-trained model. 
+ """ + + timesteps = jnp.linspace(self.config.num_train_timesteps - 1, 0, num_inference_steps, dtype=self.dtype) + + low_idx = jnp.floor(timesteps).astype(jnp.int32) + high_idx = jnp.ceil(timesteps).astype(jnp.int32) + + frac = jnp.mod(timesteps, 1.0) + + sigmas = ((1 - state.common.alphas_cumprod) / state.common.alphas_cumprod) ** 0.5 + sigmas = (1 - frac) * sigmas[low_idx] + frac * sigmas[high_idx] + sigmas = jnp.concatenate([sigmas, jnp.array([0.0], dtype=self.dtype)]) + + timesteps = timesteps.astype(jnp.int32) + + # initial running values + derivatives = jnp.zeros((0,) + shape, dtype=self.dtype) + + return state.replace( + timesteps=timesteps, + sigmas=sigmas, + num_inference_steps=num_inference_steps, + derivatives=derivatives, + ) + + def step( + self, + state: LMSDiscreteSchedulerState, + model_output: jnp.ndarray, + timestep: int, + sample: jnp.ndarray, + order: int = 4, + return_dict: bool = True, + ) -> Union[FlaxLMSSchedulerOutput, Tuple]: + """ + Predict the sample at the previous timestep by reversing the SDE. Core function to propagate the diffusion + process from the learned model outputs (most often the predicted noise). + + Args: + state (`LMSDiscreteSchedulerState`): the `FlaxLMSDiscreteScheduler` state data class instance. + model_output (`jnp.ndarray`): direct output from learned diffusion model. + timestep (`int`): current discrete timestep in the diffusion chain. + sample (`jnp.ndarray`): + current instance of sample being created by diffusion process. + order: coefficient for multi-step inference. + return_dict (`bool`): option for returning tuple rather than FlaxLMSSchedulerOutput class + + Returns: + [`FlaxLMSSchedulerOutput`] or `tuple`: [`FlaxLMSSchedulerOutput`] if `return_dict` is True, otherwise a + `tuple`. When returning a tuple, the first element is the sample tensor. + + """ + if state.num_inference_steps is None: + raise ValueError( + "Number of inference steps is 'None', you need to run 'set_timesteps' after creating the scheduler" + ) + + sigma = state.sigmas[timestep] + + # 1. compute predicted original sample (x_0) from sigma-scaled predicted noise + if self.config.prediction_type == "epsilon": + pred_original_sample = sample - sigma * model_output + elif self.config.prediction_type == "v_prediction": + # * c_out + input * c_skip + pred_original_sample = model_output * (-sigma / (sigma**2 + 1) ** 0.5) + (sample / (sigma**2 + 1)) + else: + raise ValueError( + f"prediction_type given as {self.config.prediction_type} must be one of `epsilon`, or `v_prediction`" + ) + + # 2. Convert to an ODE derivative + derivative = (sample - pred_original_sample) / sigma + state = state.replace(derivatives=jnp.append(state.derivatives, derivative)) + if len(state.derivatives) > order: + state = state.replace(derivatives=jnp.delete(state.derivatives, 0)) + + # 3. Compute linear multistep coefficients + order = min(timestep + 1, order) + lms_coeffs = [self.get_lms_coefficient(state, order, timestep, curr_order) for curr_order in range(order)] + + # 4. 
Compute previous sample based on the derivatives path + prev_sample = sample + sum( + coeff * derivative for coeff, derivative in zip(lms_coeffs, reversed(state.derivatives)) + ) + + if not return_dict: + return (prev_sample, state) + + return FlaxLMSSchedulerOutput(prev_sample=prev_sample, state=state) + + def add_noise( + self, + state: LMSDiscreteSchedulerState, + original_samples: jnp.ndarray, + noise: jnp.ndarray, + timesteps: jnp.ndarray, + ) -> jnp.ndarray: + sigma = state.sigmas[timesteps].flatten() + sigma = broadcast_to_shape_from_left(sigma, noise.shape) + + noisy_samples = original_samples + noise * sigma + + return noisy_samples + + def __len__(self): + return self.config.num_train_timesteps diff --git a/gf/sd-webui-rich-text/diffusers_official/schedulers/scheduling_pndm.py b/gf/sd-webui-rich-text/diffusers_official/schedulers/scheduling_pndm.py new file mode 100644 index 0000000000000000000000000000000000000000..794eb3674c1bb5533b938b00b08d48cd5192c317 --- /dev/null +++ b/gf/sd-webui-rich-text/diffusers_official/schedulers/scheduling_pndm.py @@ -0,0 +1,462 @@ +# Copyright 2023 Zhejiang University Team and The HuggingFace Team. All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +# DISCLAIMER: This file is strongly influenced by https://github.com/ermongroup/ddim + +import math +from typing import List, Optional, Tuple, Union + +import numpy as np +import torch + +from ..configuration_utils import ConfigMixin, register_to_config +from .scheduling_utils import KarrasDiffusionSchedulers, SchedulerMixin, SchedulerOutput + + +# Copied from diffusers.schedulers.scheduling_ddpm.betas_for_alpha_bar +def betas_for_alpha_bar( + num_diffusion_timesteps, + max_beta=0.999, + alpha_transform_type="cosine", +): + """ + Create a beta schedule that discretizes the given alpha_t_bar function, which defines the cumulative product of + (1-beta) over time from t = [0,1]. + + Contains a function alpha_bar that takes an argument t and transforms it to the cumulative product of (1-beta) up + to that part of the diffusion process. + + + Args: + num_diffusion_timesteps (`int`): the number of betas to produce. + max_beta (`float`): the maximum beta to use; use values lower than 1 to + prevent singularities. + alpha_transform_type (`str`, *optional*, default to `cosine`): the type of noise schedule for alpha_bar. 
+ Choose from `cosine` or `exp` + + Returns: + betas (`np.ndarray`): the betas used by the scheduler to step the model outputs + """ + if alpha_transform_type == "cosine": + + def alpha_bar_fn(t): + return math.cos((t + 0.008) / 1.008 * math.pi / 2) ** 2 + + elif alpha_transform_type == "exp": + + def alpha_bar_fn(t): + return math.exp(t * -12.0) + + else: + raise ValueError(f"Unsupported alpha_tranform_type: {alpha_transform_type}") + + betas = [] + for i in range(num_diffusion_timesteps): + t1 = i / num_diffusion_timesteps + t2 = (i + 1) / num_diffusion_timesteps + betas.append(min(1 - alpha_bar_fn(t2) / alpha_bar_fn(t1), max_beta)) + return torch.tensor(betas, dtype=torch.float32) + + +class PNDMScheduler(SchedulerMixin, ConfigMixin): + """ + Pseudo numerical methods for diffusion models (PNDM) proposes using more advanced ODE integration techniques, + namely Runge-Kutta method and a linear multi-step method. + + [`~ConfigMixin`] takes care of storing all config attributes that are passed in the scheduler's `__init__` + function, such as `num_train_timesteps`. They can be accessed via `scheduler.config.num_train_timesteps`. + [`SchedulerMixin`] provides general loading and saving functionality via the [`SchedulerMixin.save_pretrained`] and + [`~SchedulerMixin.from_pretrained`] functions. + + For more details, see the original paper: https://arxiv.org/abs/2202.09778 + + Args: + num_train_timesteps (`int`): number of diffusion steps used to train the model. + beta_start (`float`): the starting `beta` value of inference. + beta_end (`float`): the final `beta` value. + beta_schedule (`str`): + the beta schedule, a mapping from a beta range to a sequence of betas for stepping the model. Choose from + `linear`, `scaled_linear`, or `squaredcos_cap_v2`. + trained_betas (`np.ndarray`, optional): + option to pass an array of betas directly to the constructor to bypass `beta_start`, `beta_end` etc. + skip_prk_steps (`bool`): + allows the scheduler to skip the Runge-Kutta steps that are defined in the original paper as being required + before plms steps; defaults to `False`. + set_alpha_to_one (`bool`, default `False`): + each diffusion step uses the value of alphas product at that step and at the previous one. For the final + step there is no previous alpha. When this option is `True` the previous alpha product is fixed to `1`, + otherwise it uses the value of alpha at step 0. + prediction_type (`str`, default `epsilon`, optional): + prediction type of the scheduler function, one of `epsilon` (predicting the noise of the diffusion process) + or `v_prediction` (see section 2.4 https://imagen.research.google/video/paper.pdf) + timestep_spacing (`str`, default `"leading"`): + The way the timesteps should be scaled. Refer to Table 2. of [Common Diffusion Noise Schedules and Sample + Steps are Flawed](https://arxiv.org/abs/2305.08891) for more information. + steps_offset (`int`, default `0`): + an offset added to the inference steps. You can use a combination of `offset=1` and + `set_alpha_to_one=False`, to make the last step use step 0 for the previous alpha product, as done in + stable diffusion. 
+ """ + + _compatibles = [e.name for e in KarrasDiffusionSchedulers] + order = 1 + + @register_to_config + def __init__( + self, + num_train_timesteps: int = 1000, + beta_start: float = 0.0001, + beta_end: float = 0.02, + beta_schedule: str = "linear", + trained_betas: Optional[Union[np.ndarray, List[float]]] = None, + skip_prk_steps: bool = False, + set_alpha_to_one: bool = False, + prediction_type: str = "epsilon", + timestep_spacing: str = "leading", + steps_offset: int = 0, + ): + if trained_betas is not None: + self.betas = torch.tensor(trained_betas, dtype=torch.float32) + elif beta_schedule == "linear": + self.betas = torch.linspace(beta_start, beta_end, num_train_timesteps, dtype=torch.float32) + elif beta_schedule == "scaled_linear": + # this schedule is very specific to the latent diffusion model. + self.betas = ( + torch.linspace(beta_start**0.5, beta_end**0.5, num_train_timesteps, dtype=torch.float32) ** 2 + ) + elif beta_schedule == "squaredcos_cap_v2": + # Glide cosine schedule + self.betas = betas_for_alpha_bar(num_train_timesteps) + else: + raise NotImplementedError(f"{beta_schedule} does is not implemented for {self.__class__}") + + self.alphas = 1.0 - self.betas + self.alphas_cumprod = torch.cumprod(self.alphas, dim=0) + + self.final_alpha_cumprod = torch.tensor(1.0) if set_alpha_to_one else self.alphas_cumprod[0] + + # standard deviation of the initial noise distribution + self.init_noise_sigma = 1.0 + + # For now we only support F-PNDM, i.e. the runge-kutta method + # For more information on the algorithm please take a look at the paper: https://arxiv.org/pdf/2202.09778.pdf + # mainly at formula (9), (12), (13) and the Algorithm 2. + self.pndm_order = 4 + + # running values + self.cur_model_output = 0 + self.counter = 0 + self.cur_sample = None + self.ets = [] + + # setable values + self.num_inference_steps = None + self._timesteps = np.arange(0, num_train_timesteps)[::-1].copy() + self.prk_timesteps = None + self.plms_timesteps = None + self.timesteps = None + + def set_timesteps(self, num_inference_steps: int, device: Union[str, torch.device] = None): + """ + Sets the discrete timesteps used for the diffusion chain. Supporting function to be run before inference. + + Args: + num_inference_steps (`int`): + the number of diffusion steps used when generating samples with a pre-trained model. + """ + + self.num_inference_steps = num_inference_steps + # "linspace", "leading", "trailing" corresponds to annotation of Table 2. of https://arxiv.org/abs/2305.08891 + if self.config.timestep_spacing == "linspace": + self._timesteps = ( + np.linspace(0, self.config.num_train_timesteps - 1, num_inference_steps).round().astype(np.int64) + ) + elif self.config.timestep_spacing == "leading": + step_ratio = self.config.num_train_timesteps // self.num_inference_steps + # creates integer timesteps by multiplying by ratio + # casting to int to avoid issues when num_inference_step is power of 3 + self._timesteps = (np.arange(0, num_inference_steps) * step_ratio).round() + self._timesteps += self.config.steps_offset + elif self.config.timestep_spacing == "trailing": + step_ratio = self.config.num_train_timesteps / self.num_inference_steps + # creates integer timesteps by multiplying by ratio + # casting to int to avoid issues when num_inference_step is power of 3 + self._timesteps = np.round(np.arange(self.config.num_train_timesteps, 0, -step_ratio))[::-1].astype( + np.int64 + ) + self._timesteps -= 1 + else: + raise ValueError( + f"{self.config.timestep_spacing} is not supported. 
Please make sure to choose one of 'linspace', 'leading' or 'trailing'." + ) + + if self.config.skip_prk_steps: + # for some models like stable diffusion the prk steps can/should be skipped to + # produce better results. When using PNDM with `self.config.skip_prk_steps` the implementation + # is based on crowsonkb's PLMS sampler implementation: https://github.com/CompVis/latent-diffusion/pull/51 + self.prk_timesteps = np.array([]) + self.plms_timesteps = np.concatenate([self._timesteps[:-1], self._timesteps[-2:-1], self._timesteps[-1:]])[ + ::-1 + ].copy() + else: + prk_timesteps = np.array(self._timesteps[-self.pndm_order :]).repeat(2) + np.tile( + np.array([0, self.config.num_train_timesteps // num_inference_steps // 2]), self.pndm_order + ) + self.prk_timesteps = (prk_timesteps[:-1].repeat(2)[1:-1])[::-1].copy() + self.plms_timesteps = self._timesteps[:-3][ + ::-1 + ].copy() # we copy to avoid having negative strides which are not supported by torch.from_numpy + + timesteps = np.concatenate([self.prk_timesteps, self.plms_timesteps]).astype(np.int64) + self.timesteps = torch.from_numpy(timesteps).to(device) + + self.ets = [] + self.counter = 0 + self.cur_model_output = 0 + + def step( + self, + model_output: torch.FloatTensor, + timestep: int, + sample: torch.FloatTensor, + return_dict: bool = True, + ) -> Union[SchedulerOutput, Tuple]: + """ + Predict the sample at the previous timestep by reversing the SDE. Core function to propagate the diffusion + process from the learned model outputs (most often the predicted noise). + + This function calls `step_prk()` or `step_plms()` depending on the internal variable `counter`. + + Args: + model_output (`torch.FloatTensor`): direct output from learned diffusion model. + timestep (`int`): current discrete timestep in the diffusion chain. + sample (`torch.FloatTensor`): + current instance of sample being created by diffusion process. + return_dict (`bool`): option for returning tuple rather than SchedulerOutput class + + Returns: + [`~schedulers.scheduling_utils.SchedulerOutput`] or `tuple`: + [`~schedulers.scheduling_utils.SchedulerOutput`] if `return_dict` is True, otherwise a `tuple`. When + returning a tuple, the first element is the sample tensor. + + """ + if self.counter < len(self.prk_timesteps) and not self.config.skip_prk_steps: + return self.step_prk(model_output=model_output, timestep=timestep, sample=sample, return_dict=return_dict) + else: + return self.step_plms(model_output=model_output, timestep=timestep, sample=sample, return_dict=return_dict) + + def step_prk( + self, + model_output: torch.FloatTensor, + timestep: int, + sample: torch.FloatTensor, + return_dict: bool = True, + ) -> Union[SchedulerOutput, Tuple]: + """ + Step function propagating the sample with the Runge-Kutta method. RK takes 4 forward passes to approximate the + solution to the differential equation. + + Args: + model_output (`torch.FloatTensor`): direct output from learned diffusion model. + timestep (`int`): current discrete timestep in the diffusion chain. + sample (`torch.FloatTensor`): + current instance of sample being created by diffusion process. + return_dict (`bool`): option for returning tuple rather than SchedulerOutput class + + Returns: + [`~scheduling_utils.SchedulerOutput`] or `tuple`: [`~scheduling_utils.SchedulerOutput`] if `return_dict` is + True, otherwise a `tuple`. When returning a tuple, the first element is the sample tensor. 
+ + """ + if self.num_inference_steps is None: + raise ValueError( + "Number of inference steps is 'None', you need to run 'set_timesteps' after creating the scheduler" + ) + + diff_to_prev = 0 if self.counter % 2 else self.config.num_train_timesteps // self.num_inference_steps // 2 + prev_timestep = timestep - diff_to_prev + timestep = self.prk_timesteps[self.counter // 4 * 4] + + if self.counter % 4 == 0: + self.cur_model_output += 1 / 6 * model_output + self.ets.append(model_output) + self.cur_sample = sample + elif (self.counter - 1) % 4 == 0: + self.cur_model_output += 1 / 3 * model_output + elif (self.counter - 2) % 4 == 0: + self.cur_model_output += 1 / 3 * model_output + elif (self.counter - 3) % 4 == 0: + model_output = self.cur_model_output + 1 / 6 * model_output + self.cur_model_output = 0 + + # cur_sample should not be `None` + cur_sample = self.cur_sample if self.cur_sample is not None else sample + + prev_sample = self._get_prev_sample(cur_sample, timestep, prev_timestep, model_output) + self.counter += 1 + + if not return_dict: + return (prev_sample,) + + return SchedulerOutput(prev_sample=prev_sample) + + def step_plms( + self, + model_output: torch.FloatTensor, + timestep: int, + sample: torch.FloatTensor, + return_dict: bool = True, + ) -> Union[SchedulerOutput, Tuple]: + """ + Step function propagating the sample with the linear multi-step method. This has one forward pass with multiple + times to approximate the solution. + + Args: + model_output (`torch.FloatTensor`): direct output from learned diffusion model. + timestep (`int`): current discrete timestep in the diffusion chain. + sample (`torch.FloatTensor`): + current instance of sample being created by diffusion process. + return_dict (`bool`): option for returning tuple rather than SchedulerOutput class + + Returns: + [`~scheduling_utils.SchedulerOutput`] or `tuple`: [`~scheduling_utils.SchedulerOutput`] if `return_dict` is + True, otherwise a `tuple`. When returning a tuple, the first element is the sample tensor. + + """ + if self.num_inference_steps is None: + raise ValueError( + "Number of inference steps is 'None', you need to run 'set_timesteps' after creating the scheduler" + ) + + if not self.config.skip_prk_steps and len(self.ets) < 3: + raise ValueError( + f"{self.__class__} can only be run AFTER scheduler has been run " + "in 'prk' mode for at least 12 iterations " + "See: https://github.com/huggingface/diffusers/blob/main/src/diffusers/pipelines/pipeline_pndm.py " + "for more information." 
+ ) + + prev_timestep = timestep - self.config.num_train_timesteps // self.num_inference_steps + + if self.counter != 1: + self.ets = self.ets[-3:] + self.ets.append(model_output) + else: + prev_timestep = timestep + timestep = timestep + self.config.num_train_timesteps // self.num_inference_steps + + if len(self.ets) == 1 and self.counter == 0: + model_output = model_output + self.cur_sample = sample + elif len(self.ets) == 1 and self.counter == 1: + model_output = (model_output + self.ets[-1]) / 2 + sample = self.cur_sample + self.cur_sample = None + elif len(self.ets) == 2: + model_output = (3 * self.ets[-1] - self.ets[-2]) / 2 + elif len(self.ets) == 3: + model_output = (23 * self.ets[-1] - 16 * self.ets[-2] + 5 * self.ets[-3]) / 12 + else: + model_output = (1 / 24) * (55 * self.ets[-1] - 59 * self.ets[-2] + 37 * self.ets[-3] - 9 * self.ets[-4]) + + prev_sample = self._get_prev_sample(sample, timestep, prev_timestep, model_output) + self.counter += 1 + + if not return_dict: + return (prev_sample,) + + return SchedulerOutput(prev_sample=prev_sample) + + def scale_model_input(self, sample: torch.FloatTensor, *args, **kwargs) -> torch.FloatTensor: + """ + Ensures interchangeability with schedulers that need to scale the denoising model input depending on the + current timestep. + + Args: + sample (`torch.FloatTensor`): input sample + + Returns: + `torch.FloatTensor`: scaled input sample + """ + return sample + + def _get_prev_sample(self, sample, timestep, prev_timestep, model_output): + # See formula (9) of PNDM paper https://arxiv.org/pdf/2202.09778.pdf + # this function computes x_(t−δ) using the formula of (9) + # Note that x_t needs to be added to both sides of the equation + + # Notation ( -> + # alpha_prod_t -> α_t + # alpha_prod_t_prev -> α_(t−δ) + # beta_prod_t -> (1 - α_t) + # beta_prod_t_prev -> (1 - α_(t−δ)) + # sample -> x_t + # model_output -> e_θ(x_t, t) + # prev_sample -> x_(t−δ) + alpha_prod_t = self.alphas_cumprod[timestep] + alpha_prod_t_prev = self.alphas_cumprod[prev_timestep] if prev_timestep >= 0 else self.final_alpha_cumprod + beta_prod_t = 1 - alpha_prod_t + beta_prod_t_prev = 1 - alpha_prod_t_prev + + if self.config.prediction_type == "v_prediction": + model_output = (alpha_prod_t**0.5) * model_output + (beta_prod_t**0.5) * sample + elif self.config.prediction_type != "epsilon": + raise ValueError( + f"prediction_type given as {self.config.prediction_type} must be one of `epsilon` or `v_prediction`" + ) + + # corresponds to (α_(t−δ) - α_t) divided by + # denominator of x_t in formula (9) and plus 1 + # Note: (α_(t−δ) - α_t) / (sqrt(α_t) * (sqrt(α_(t−δ)) + sqr(α_t))) = + # sqrt(α_(t−δ)) / sqrt(α_t)) + sample_coeff = (alpha_prod_t_prev / alpha_prod_t) ** (0.5) + + # corresponds to denominator of e_θ(x_t, t) in formula (9) + model_output_denom_coeff = alpha_prod_t * beta_prod_t_prev ** (0.5) + ( + alpha_prod_t * beta_prod_t * alpha_prod_t_prev + ) ** (0.5) + + # full formula (9) + prev_sample = ( + sample_coeff * sample - (alpha_prod_t_prev - alpha_prod_t) * model_output / model_output_denom_coeff + ) + + return prev_sample + + # Copied from diffusers.schedulers.scheduling_ddpm.DDPMScheduler.add_noise + def add_noise( + self, + original_samples: torch.FloatTensor, + noise: torch.FloatTensor, + timesteps: torch.IntTensor, + ) -> torch.FloatTensor: + # Make sure alphas_cumprod and timestep have same device and dtype as original_samples + alphas_cumprod = self.alphas_cumprod.to(device=original_samples.device, dtype=original_samples.dtype) + timesteps = 
timesteps.to(original_samples.device) + + sqrt_alpha_prod = alphas_cumprod[timesteps] ** 0.5 + sqrt_alpha_prod = sqrt_alpha_prod.flatten() + while len(sqrt_alpha_prod.shape) < len(original_samples.shape): + sqrt_alpha_prod = sqrt_alpha_prod.unsqueeze(-1) + + sqrt_one_minus_alpha_prod = (1 - alphas_cumprod[timesteps]) ** 0.5 + sqrt_one_minus_alpha_prod = sqrt_one_minus_alpha_prod.flatten() + while len(sqrt_one_minus_alpha_prod.shape) < len(original_samples.shape): + sqrt_one_minus_alpha_prod = sqrt_one_minus_alpha_prod.unsqueeze(-1) + + noisy_samples = sqrt_alpha_prod * original_samples + sqrt_one_minus_alpha_prod * noise + return noisy_samples + + def __len__(self): + return self.config.num_train_timesteps diff --git a/gf/sd-webui-rich-text/diffusers_official/schedulers/scheduling_pndm_flax.py b/gf/sd-webui-rich-text/diffusers_official/schedulers/scheduling_pndm_flax.py new file mode 100644 index 0000000000000000000000000000000000000000..c654f2de8dd3e4f96403cce4b9db8f8b7b69861f --- /dev/null +++ b/gf/sd-webui-rich-text/diffusers_official/schedulers/scheduling_pndm_flax.py @@ -0,0 +1,511 @@ +# Copyright 2023 Zhejiang University Team and The HuggingFace Team. All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +# DISCLAIMER: This file is strongly influenced by https://github.com/ermongroup/ddim + +from dataclasses import dataclass +from typing import Optional, Tuple, Union + +import flax +import jax +import jax.numpy as jnp + +from ..configuration_utils import ConfigMixin, register_to_config +from .scheduling_utils_flax import ( + CommonSchedulerState, + FlaxKarrasDiffusionSchedulers, + FlaxSchedulerMixin, + FlaxSchedulerOutput, + add_noise_common, +) + + +@flax.struct.dataclass +class PNDMSchedulerState: + common: CommonSchedulerState + final_alpha_cumprod: jnp.ndarray + + # setable values + init_noise_sigma: jnp.ndarray + timesteps: jnp.ndarray + num_inference_steps: Optional[int] = None + prk_timesteps: Optional[jnp.ndarray] = None + plms_timesteps: Optional[jnp.ndarray] = None + + # running values + cur_model_output: Optional[jnp.ndarray] = None + counter: Optional[jnp.int32] = None + cur_sample: Optional[jnp.ndarray] = None + ets: Optional[jnp.ndarray] = None + + @classmethod + def create( + cls, + common: CommonSchedulerState, + final_alpha_cumprod: jnp.ndarray, + init_noise_sigma: jnp.ndarray, + timesteps: jnp.ndarray, + ): + return cls( + common=common, + final_alpha_cumprod=final_alpha_cumprod, + init_noise_sigma=init_noise_sigma, + timesteps=timesteps, + ) + + +@dataclass +class FlaxPNDMSchedulerOutput(FlaxSchedulerOutput): + state: PNDMSchedulerState + + +class FlaxPNDMScheduler(FlaxSchedulerMixin, ConfigMixin): + """ + Pseudo numerical methods for diffusion models (PNDM) proposes using more advanced ODE integration techniques, + namely Runge-Kutta method and a linear multi-step method. + + [`~ConfigMixin`] takes care of storing all config attributes that are passed in the scheduler's `__init__` + function, such as `num_train_timesteps`. 
They can be accessed via `scheduler.config.num_train_timesteps`. + [`SchedulerMixin`] provides general loading and saving functionality via the [`SchedulerMixin.save_pretrained`] and + [`~SchedulerMixin.from_pretrained`] functions. + + For more details, see the original paper: https://arxiv.org/abs/2202.09778 + + Args: + num_train_timesteps (`int`): number of diffusion steps used to train the model. + beta_start (`float`): the starting `beta` value of inference. + beta_end (`float`): the final `beta` value. + beta_schedule (`str`): + the beta schedule, a mapping from a beta range to a sequence of betas for stepping the model. Choose from + `linear`, `scaled_linear`, or `squaredcos_cap_v2`. + trained_betas (`jnp.ndarray`, optional): + option to pass an array of betas directly to the constructor to bypass `beta_start`, `beta_end` etc. + skip_prk_steps (`bool`): + allows the scheduler to skip the Runge-Kutta steps that are defined in the original paper as being required + before plms steps; defaults to `False`. + set_alpha_to_one (`bool`, default `False`): + each diffusion step uses the value of alphas product at that step and at the previous one. For the final + step there is no previous alpha. When this option is `True` the previous alpha product is fixed to `1`, + otherwise it uses the value of alpha at step 0. + steps_offset (`int`, default `0`): + an offset added to the inference steps. You can use a combination of `offset=1` and + `set_alpha_to_one=False`, to make the last step use step 0 for the previous alpha product, as done in + stable diffusion. + prediction_type (`str`, default `epsilon`, optional): + prediction type of the scheduler function, one of `epsilon` (predicting the noise of the diffusion + process), `sample` (directly predicting the noisy sample`) or `v_prediction` (see section 2.4 + https://imagen.research.google/video/paper.pdf) + dtype (`jnp.dtype`, *optional*, defaults to `jnp.float32`): + the `dtype` used for params and computation. + """ + + _compatibles = [e.name for e in FlaxKarrasDiffusionSchedulers] + + dtype: jnp.dtype + pndm_order: int + + @property + def has_state(self): + return True + + @register_to_config + def __init__( + self, + num_train_timesteps: int = 1000, + beta_start: float = 0.0001, + beta_end: float = 0.02, + beta_schedule: str = "linear", + trained_betas: Optional[jnp.ndarray] = None, + skip_prk_steps: bool = False, + set_alpha_to_one: bool = False, + steps_offset: int = 0, + prediction_type: str = "epsilon", + dtype: jnp.dtype = jnp.float32, + ): + self.dtype = dtype + + # For now we only support F-PNDM, i.e. the runge-kutta method + # For more information on the algorithm please take a look at the paper: https://arxiv.org/pdf/2202.09778.pdf + # mainly at formula (9), (12), (13) and the Algorithm 2. + self.pndm_order = 4 + + def create_state(self, common: Optional[CommonSchedulerState] = None) -> PNDMSchedulerState: + if common is None: + common = CommonSchedulerState.create(self) + + # At every step in ddim, we are looking into the previous alphas_cumprod + # For the final step, there is no previous alphas_cumprod because we are already at 0 + # `set_alpha_to_one` decides whether we set this parameter simply to one or + # whether we use the final alpha of the "non-previous" one. 
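+ # i.e. final_alpha_cumprod below is 1.0 when set_alpha_to_one=True, otherwise it falls back to alphas_cumprod[0]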
+ final_alpha_cumprod = ( + jnp.array(1.0, dtype=self.dtype) if self.config.set_alpha_to_one else common.alphas_cumprod[0] + ) + + # standard deviation of the initial noise distribution + init_noise_sigma = jnp.array(1.0, dtype=self.dtype) + + timesteps = jnp.arange(0, self.config.num_train_timesteps).round()[::-1] + + return PNDMSchedulerState.create( + common=common, + final_alpha_cumprod=final_alpha_cumprod, + init_noise_sigma=init_noise_sigma, + timesteps=timesteps, + ) + + def set_timesteps(self, state: PNDMSchedulerState, num_inference_steps: int, shape: Tuple) -> PNDMSchedulerState: + """ + Sets the discrete timesteps used for the diffusion chain. Supporting function to be run before inference. + + Args: + state (`PNDMSchedulerState`): + the `FlaxPNDMScheduler` state data class instance. + num_inference_steps (`int`): + the number of diffusion steps used when generating samples with a pre-trained model. + shape (`Tuple`): + the shape of the samples to be generated. + """ + + step_ratio = self.config.num_train_timesteps // num_inference_steps + # creates integer timesteps by multiplying by ratio + # rounding to avoid issues when num_inference_step is power of 3 + _timesteps = (jnp.arange(0, num_inference_steps) * step_ratio).round() + self.config.steps_offset + + if self.config.skip_prk_steps: + # for some models like stable diffusion the prk steps can/should be skipped to + # produce better results. When using PNDM with `self.config.skip_prk_steps` the implementation + # is based on crowsonkb's PLMS sampler implementation: https://github.com/CompVis/latent-diffusion/pull/51 + + prk_timesteps = jnp.array([], dtype=jnp.int32) + plms_timesteps = jnp.concatenate([_timesteps[:-1], _timesteps[-2:-1], _timesteps[-1:]])[::-1] + + else: + prk_timesteps = _timesteps[-self.pndm_order :].repeat(2) + jnp.tile( + jnp.array([0, self.config.num_train_timesteps // num_inference_steps // 2], dtype=jnp.int32), + self.pndm_order, + ) + + prk_timesteps = (prk_timesteps[:-1].repeat(2)[1:-1])[::-1] + plms_timesteps = _timesteps[:-3][::-1] + + timesteps = jnp.concatenate([prk_timesteps, plms_timesteps]) + + # initial running values + + cur_model_output = jnp.zeros(shape, dtype=self.dtype) + counter = jnp.int32(0) + cur_sample = jnp.zeros(shape, dtype=self.dtype) + ets = jnp.zeros((4,) + shape, dtype=self.dtype) + + return state.replace( + timesteps=timesteps, + num_inference_steps=num_inference_steps, + prk_timesteps=prk_timesteps, + plms_timesteps=plms_timesteps, + cur_model_output=cur_model_output, + counter=counter, + cur_sample=cur_sample, + ets=ets, + ) + + def scale_model_input( + self, state: PNDMSchedulerState, sample: jnp.ndarray, timestep: Optional[int] = None + ) -> jnp.ndarray: + """ + Ensures interchangeability with schedulers that need to scale the denoising model input depending on the + current timestep. + + Args: + state (`PNDMSchedulerState`): the `FlaxPNDMScheduler` state data class instance. + sample (`jnp.ndarray`): input sample + timestep (`int`, optional): current timestep + + Returns: + `jnp.ndarray`: scaled input sample + """ + return sample + + def step( + self, + state: PNDMSchedulerState, + model_output: jnp.ndarray, + timestep: int, + sample: jnp.ndarray, + return_dict: bool = True, + ) -> Union[FlaxPNDMSchedulerOutput, Tuple]: + """ + Predict the sample at the previous timestep by reversing the SDE. Core function to propagate the diffusion + process from the learned model outputs (most often the predicted noise). 
+ + This function calls `step_prk()` or `step_plms()` depending on the internal variable `counter`. + + Args: + state (`PNDMSchedulerState`): the `FlaxPNDMScheduler` state data class instance. + model_output (`jnp.ndarray`): direct output from learned diffusion model. + timestep (`int`): current discrete timestep in the diffusion chain. + sample (`jnp.ndarray`): + current instance of sample being created by diffusion process. + return_dict (`bool`): option for returning tuple rather than FlaxPNDMSchedulerOutput class + + Returns: + [`FlaxPNDMSchedulerOutput`] or `tuple`: [`FlaxPNDMSchedulerOutput`] if `return_dict` is True, otherwise a + `tuple`. When returning a tuple, the first element is the sample tensor. + + """ + + if state.num_inference_steps is None: + raise ValueError( + "Number of inference steps is 'None', you need to run 'set_timesteps' after creating the scheduler" + ) + + if self.config.skip_prk_steps: + prev_sample, state = self.step_plms(state, model_output, timestep, sample) + else: + prk_prev_sample, prk_state = self.step_prk(state, model_output, timestep, sample) + plms_prev_sample, plms_state = self.step_plms(state, model_output, timestep, sample) + + cond = state.counter < len(state.prk_timesteps) + + prev_sample = jax.lax.select(cond, prk_prev_sample, plms_prev_sample) + + state = state.replace( + cur_model_output=jax.lax.select(cond, prk_state.cur_model_output, plms_state.cur_model_output), + ets=jax.lax.select(cond, prk_state.ets, plms_state.ets), + cur_sample=jax.lax.select(cond, prk_state.cur_sample, plms_state.cur_sample), + counter=jax.lax.select(cond, prk_state.counter, plms_state.counter), + ) + + if not return_dict: + return (prev_sample, state) + + return FlaxPNDMSchedulerOutput(prev_sample=prev_sample, state=state) + + def step_prk( + self, + state: PNDMSchedulerState, + model_output: jnp.ndarray, + timestep: int, + sample: jnp.ndarray, + ) -> Union[FlaxPNDMSchedulerOutput, Tuple]: + """ + Step function propagating the sample with the Runge-Kutta method. RK takes 4 forward passes to approximate the + solution to the differential equation. + + Args: + state (`PNDMSchedulerState`): the `FlaxPNDMScheduler` state data class instance. + model_output (`jnp.ndarray`): direct output from learned diffusion model. + timestep (`int`): current discrete timestep in the diffusion chain. + sample (`jnp.ndarray`): + current instance of sample being created by diffusion process. + return_dict (`bool`): option for returning tuple rather than FlaxPNDMSchedulerOutput class + + Returns: + [`FlaxPNDMSchedulerOutput`] or `tuple`: [`FlaxPNDMSchedulerOutput`] if `return_dict` is True, otherwise a + `tuple`. When returning a tuple, the first element is the sample tensor. 
+ + """ + + if state.num_inference_steps is None: + raise ValueError( + "Number of inference steps is 'None', you need to run 'set_timesteps' after creating the scheduler" + ) + + diff_to_prev = jnp.where( + state.counter % 2, 0, self.config.num_train_timesteps // state.num_inference_steps // 2 + ) + prev_timestep = timestep - diff_to_prev + timestep = state.prk_timesteps[state.counter // 4 * 4] + + model_output = jax.lax.select( + (state.counter % 4) != 3, + model_output, # remainder 0, 1, 2 + state.cur_model_output + 1 / 6 * model_output, # remainder 3 + ) + + state = state.replace( + cur_model_output=jax.lax.select_n( + state.counter % 4, + state.cur_model_output + 1 / 6 * model_output, # remainder 0 + state.cur_model_output + 1 / 3 * model_output, # remainder 1 + state.cur_model_output + 1 / 3 * model_output, # remainder 2 + jnp.zeros_like(state.cur_model_output), # remainder 3 + ), + ets=jax.lax.select( + (state.counter % 4) == 0, + state.ets.at[0:3].set(state.ets[1:4]).at[3].set(model_output), # remainder 0 + state.ets, # remainder 1, 2, 3 + ), + cur_sample=jax.lax.select( + (state.counter % 4) == 0, + sample, # remainder 0 + state.cur_sample, # remainder 1, 2, 3 + ), + ) + + cur_sample = state.cur_sample + prev_sample = self._get_prev_sample(state, cur_sample, timestep, prev_timestep, model_output) + state = state.replace(counter=state.counter + 1) + + return (prev_sample, state) + + def step_plms( + self, + state: PNDMSchedulerState, + model_output: jnp.ndarray, + timestep: int, + sample: jnp.ndarray, + ) -> Union[FlaxPNDMSchedulerOutput, Tuple]: + """ + Step function propagating the sample with the linear multi-step method. This has one forward pass with multiple + times to approximate the solution. + + Args: + state (`PNDMSchedulerState`): the `FlaxPNDMScheduler` state data class instance. + model_output (`jnp.ndarray`): direct output from learned diffusion model. + timestep (`int`): current discrete timestep in the diffusion chain. + sample (`jnp.ndarray`): + current instance of sample being created by diffusion process. + return_dict (`bool`): option for returning tuple rather than FlaxPNDMSchedulerOutput class + + Returns: + [`FlaxPNDMSchedulerOutput`] or `tuple`: [`FlaxPNDMSchedulerOutput`] if `return_dict` is True, otherwise a + `tuple`. When returning a tuple, the first element is the sample tensor. 
+ + """ + + if state.num_inference_steps is None: + raise ValueError( + "Number of inference steps is 'None', you need to run 'set_timesteps' after creating the scheduler" + ) + + # NOTE: There is no way to check in the jitted runtime if the prk mode was ran before + + prev_timestep = timestep - self.config.num_train_timesteps // state.num_inference_steps + prev_timestep = jnp.where(prev_timestep > 0, prev_timestep, 0) + + # Reference: + # if state.counter != 1: + # state.ets.append(model_output) + # else: + # prev_timestep = timestep + # timestep = timestep + self.config.num_train_timesteps // state.num_inference_steps + + prev_timestep = jnp.where(state.counter == 1, timestep, prev_timestep) + timestep = jnp.where( + state.counter == 1, timestep + self.config.num_train_timesteps // state.num_inference_steps, timestep + ) + + # Reference: + # if len(state.ets) == 1 and state.counter == 0: + # model_output = model_output + # state.cur_sample = sample + # elif len(state.ets) == 1 and state.counter == 1: + # model_output = (model_output + state.ets[-1]) / 2 + # sample = state.cur_sample + # state.cur_sample = None + # elif len(state.ets) == 2: + # model_output = (3 * state.ets[-1] - state.ets[-2]) / 2 + # elif len(state.ets) == 3: + # model_output = (23 * state.ets[-1] - 16 * state.ets[-2] + 5 * state.ets[-3]) / 12 + # else: + # model_output = (1 / 24) * (55 * state.ets[-1] - 59 * state.ets[-2] + 37 * state.ets[-3] - 9 * state.ets[-4]) + + state = state.replace( + ets=jax.lax.select( + state.counter != 1, + state.ets.at[0:3].set(state.ets[1:4]).at[3].set(model_output), # counter != 1 + state.ets, # counter 1 + ), + cur_sample=jax.lax.select( + state.counter != 1, + sample, # counter != 1 + state.cur_sample, # counter 1 + ), + ) + + state = state.replace( + cur_model_output=jax.lax.select_n( + jnp.clip(state.counter, 0, 4), + model_output, # counter 0 + (model_output + state.ets[-1]) / 2, # counter 1 + (3 * state.ets[-1] - state.ets[-2]) / 2, # counter 2 + (23 * state.ets[-1] - 16 * state.ets[-2] + 5 * state.ets[-3]) / 12, # counter 3 + (1 / 24) + * (55 * state.ets[-1] - 59 * state.ets[-2] + 37 * state.ets[-3] - 9 * state.ets[-4]), # counter >= 4 + ), + ) + + sample = state.cur_sample + model_output = state.cur_model_output + prev_sample = self._get_prev_sample(state, sample, timestep, prev_timestep, model_output) + state = state.replace(counter=state.counter + 1) + + return (prev_sample, state) + + def _get_prev_sample(self, state: PNDMSchedulerState, sample, timestep, prev_timestep, model_output): + # See formula (9) of PNDM paper https://arxiv.org/pdf/2202.09778.pdf + # this function computes x_(t−δ) using the formula of (9) + # Note that x_t needs to be added to both sides of the equation + + # Notation ( -> + # alpha_prod_t -> α_t + # alpha_prod_t_prev -> α_(t−δ) + # beta_prod_t -> (1 - α_t) + # beta_prod_t_prev -> (1 - α_(t−δ)) + # sample -> x_t + # model_output -> e_θ(x_t, t) + # prev_sample -> x_(t−δ) + alpha_prod_t = state.common.alphas_cumprod[timestep] + alpha_prod_t_prev = jnp.where( + prev_timestep >= 0, state.common.alphas_cumprod[prev_timestep], state.final_alpha_cumprod + ) + beta_prod_t = 1 - alpha_prod_t + beta_prod_t_prev = 1 - alpha_prod_t_prev + + if self.config.prediction_type == "v_prediction": + model_output = (alpha_prod_t**0.5) * model_output + (beta_prod_t**0.5) * sample + elif self.config.prediction_type != "epsilon": + raise ValueError( + f"prediction_type given as {self.config.prediction_type} must be one of `epsilon` or `v_prediction`" + ) + + # corresponds to 
(α_(t−δ) - α_t) divided by + # denominator of x_t in formula (9) and plus 1 + # Note: (α_(t−δ) - α_t) / (sqrt(α_t) * (sqrt(α_(t−δ)) + sqr(α_t))) = + # sqrt(α_(t−δ)) / sqrt(α_t)) + sample_coeff = (alpha_prod_t_prev / alpha_prod_t) ** (0.5) + + # corresponds to denominator of e_θ(x_t, t) in formula (9) + model_output_denom_coeff = alpha_prod_t * beta_prod_t_prev ** (0.5) + ( + alpha_prod_t * beta_prod_t * alpha_prod_t_prev + ) ** (0.5) + + # full formula (9) + prev_sample = ( + sample_coeff * sample - (alpha_prod_t_prev - alpha_prod_t) * model_output / model_output_denom_coeff + ) + + return prev_sample + + def add_noise( + self, + state: PNDMSchedulerState, + original_samples: jnp.ndarray, + noise: jnp.ndarray, + timesteps: jnp.ndarray, + ) -> jnp.ndarray: + return add_noise_common(state.common, original_samples, noise, timesteps) + + def __len__(self): + return self.config.num_train_timesteps diff --git a/gf/sd-webui-rich-text/diffusers_official/schedulers/scheduling_repaint.py b/gf/sd-webui-rich-text/diffusers_official/schedulers/scheduling_repaint.py new file mode 100644 index 0000000000000000000000000000000000000000..41e7450d2df68c40c3b4f49669513832e443c5e3 --- /dev/null +++ b/gf/sd-webui-rich-text/diffusers_official/schedulers/scheduling_repaint.py @@ -0,0 +1,344 @@ +# Copyright 2023 ETH Zurich Computer Vision Lab and The HuggingFace Team. All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +import math +from dataclasses import dataclass +from typing import Optional, Tuple, Union + +import numpy as np +import torch + +from ..configuration_utils import ConfigMixin, register_to_config +from ..utils import BaseOutput, randn_tensor +from .scheduling_utils import SchedulerMixin + + +@dataclass +class RePaintSchedulerOutput(BaseOutput): + """ + Output class for the scheduler's step function output. + + Args: + prev_sample (`torch.FloatTensor` of shape `(batch_size, num_channels, height, width)` for images): + Computed sample (x_{t-1}) of previous timestep. `prev_sample` should be used as next model input in the + denoising loop. + pred_original_sample (`torch.FloatTensor` of shape `(batch_size, num_channels, height, width)` for images): + The predicted denoised sample (x_{0}) based on the model output from + the current timestep. `pred_original_sample` can be used to preview progress or for guidance. + """ + + prev_sample: torch.FloatTensor + pred_original_sample: torch.FloatTensor + + +# Copied from diffusers.schedulers.scheduling_ddpm.betas_for_alpha_bar +def betas_for_alpha_bar( + num_diffusion_timesteps, + max_beta=0.999, + alpha_transform_type="cosine", +): + """ + Create a beta schedule that discretizes the given alpha_t_bar function, which defines the cumulative product of + (1-beta) over time from t = [0,1]. + + Contains a function alpha_bar that takes an argument t and transforms it to the cumulative product of (1-beta) up + to that part of the diffusion process. + + + Args: + num_diffusion_timesteps (`int`): the number of betas to produce. 
+ max_beta (`float`): the maximum beta to use; use values lower than 1 to + prevent singularities. + alpha_transform_type (`str`, *optional*, default to `cosine`): the type of noise schedule for alpha_bar. + Choose from `cosine` or `exp` + + Returns: + betas (`np.ndarray`): the betas used by the scheduler to step the model outputs + """ + if alpha_transform_type == "cosine": + + def alpha_bar_fn(t): + return math.cos((t + 0.008) / 1.008 * math.pi / 2) ** 2 + + elif alpha_transform_type == "exp": + + def alpha_bar_fn(t): + return math.exp(t * -12.0) + + else: + raise ValueError(f"Unsupported alpha_tranform_type: {alpha_transform_type}") + + betas = [] + for i in range(num_diffusion_timesteps): + t1 = i / num_diffusion_timesteps + t2 = (i + 1) / num_diffusion_timesteps + betas.append(min(1 - alpha_bar_fn(t2) / alpha_bar_fn(t1), max_beta)) + return torch.tensor(betas, dtype=torch.float32) + + +class RePaintScheduler(SchedulerMixin, ConfigMixin): + """ + RePaint is a schedule for DDPM inpainting inside a given mask. + + [`~ConfigMixin`] takes care of storing all config attributes that are passed in the scheduler's `__init__` + function, such as `num_train_timesteps`. They can be accessed via `scheduler.config.num_train_timesteps`. + [`SchedulerMixin`] provides general loading and saving functionality via the [`SchedulerMixin.save_pretrained`] and + [`~SchedulerMixin.from_pretrained`] functions. + + For more details, see the original paper: https://arxiv.org/pdf/2201.09865.pdf + + Args: + num_train_timesteps (`int`): number of diffusion steps used to train the model. + beta_start (`float`): the starting `beta` value of inference. + beta_end (`float`): the final `beta` value. + beta_schedule (`str`): + the beta schedule, a mapping from a beta range to a sequence of betas for stepping the model. Choose from + `linear`, `scaled_linear`, `squaredcos_cap_v2` or `sigmoid`. + eta (`float`): + The weight of noise for added noise in a diffusion step. Its value is between 0.0 and 1.0 -0.0 is DDIM and + 1.0 is DDPM scheduler respectively. + trained_betas (`np.ndarray`, optional): + option to pass an array of betas directly to the constructor to bypass `beta_start`, `beta_end` etc. + variance_type (`str`): + options to clip the variance used when adding noise to the denoised sample. Choose from `fixed_small`, + `fixed_small_log`, `fixed_large`, `fixed_large_log`, `learned` or `learned_range`. + clip_sample (`bool`, default `True`): + option to clip predicted sample between -1 and 1 for numerical stability. + + """ + + order = 1 + + @register_to_config + def __init__( + self, + num_train_timesteps: int = 1000, + beta_start: float = 0.0001, + beta_end: float = 0.02, + beta_schedule: str = "linear", + eta: float = 0.0, + trained_betas: Optional[np.ndarray] = None, + clip_sample: bool = True, + ): + if trained_betas is not None: + self.betas = torch.from_numpy(trained_betas) + elif beta_schedule == "linear": + self.betas = torch.linspace(beta_start, beta_end, num_train_timesteps, dtype=torch.float32) + elif beta_schedule == "scaled_linear": + # this schedule is very specific to the latent diffusion model. 
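+            # betas are linear in sqrt-space: linspace from beta_start**0.5 to beta_end**0.5, then squared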
+ self.betas = ( + torch.linspace(beta_start**0.5, beta_end**0.5, num_train_timesteps, dtype=torch.float32) ** 2 + ) + elif beta_schedule == "squaredcos_cap_v2": + # Glide cosine schedule + self.betas = betas_for_alpha_bar(num_train_timesteps) + elif beta_schedule == "sigmoid": + # GeoDiff sigmoid schedule + betas = torch.linspace(-6, 6, num_train_timesteps) + self.betas = torch.sigmoid(betas) * (beta_end - beta_start) + beta_start + else: + raise NotImplementedError(f"{beta_schedule} does is not implemented for {self.__class__}") + + self.alphas = 1.0 - self.betas + self.alphas_cumprod = torch.cumprod(self.alphas, dim=0) + self.one = torch.tensor(1.0) + + self.final_alpha_cumprod = torch.tensor(1.0) + + # standard deviation of the initial noise distribution + self.init_noise_sigma = 1.0 + + # setable values + self.num_inference_steps = None + self.timesteps = torch.from_numpy(np.arange(0, num_train_timesteps)[::-1].copy()) + + self.eta = eta + + def scale_model_input(self, sample: torch.FloatTensor, timestep: Optional[int] = None) -> torch.FloatTensor: + """ + Ensures interchangeability with schedulers that need to scale the denoising model input depending on the + current timestep. + + Args: + sample (`torch.FloatTensor`): input sample + timestep (`int`, optional): current timestep + + Returns: + `torch.FloatTensor`: scaled input sample + """ + return sample + + def set_timesteps( + self, + num_inference_steps: int, + jump_length: int = 10, + jump_n_sample: int = 10, + device: Union[str, torch.device] = None, + ): + num_inference_steps = min(self.config.num_train_timesteps, num_inference_steps) + self.num_inference_steps = num_inference_steps + + timesteps = [] + + jumps = {} + for j in range(0, num_inference_steps - jump_length, jump_length): + jumps[j] = jump_n_sample - 1 + + t = num_inference_steps + while t >= 1: + t = t - 1 + timesteps.append(t) + + if jumps.get(t, 0) > 0: + jumps[t] = jumps[t] - 1 + for _ in range(jump_length): + t = t + 1 + timesteps.append(t) + + timesteps = np.array(timesteps) * (self.config.num_train_timesteps // self.num_inference_steps) + self.timesteps = torch.from_numpy(timesteps).to(device) + + def _get_variance(self, t): + prev_timestep = t - self.config.num_train_timesteps // self.num_inference_steps + + alpha_prod_t = self.alphas_cumprod[t] + alpha_prod_t_prev = self.alphas_cumprod[prev_timestep] if prev_timestep >= 0 else self.final_alpha_cumprod + beta_prod_t = 1 - alpha_prod_t + beta_prod_t_prev = 1 - alpha_prod_t_prev + + # For t > 0, compute predicted variance βt (see formula (6) and (7) from + # https://arxiv.org/pdf/2006.11239.pdf) and sample from it to get + # previous sample x_{t-1} ~ N(pred_prev_sample, variance) == add + # variance to pred_sample + # Is equivalent to formula (16) in https://arxiv.org/pdf/2010.02502.pdf + # without eta. + # variance = (1 - alpha_prod_t_prev) / (1 - alpha_prod_t) * self.betas[t] + variance = (beta_prod_t_prev / beta_prod_t) * (1 - alpha_prod_t / alpha_prod_t_prev) + + return variance + + def step( + self, + model_output: torch.FloatTensor, + timestep: int, + sample: torch.FloatTensor, + original_image: torch.FloatTensor, + mask: torch.FloatTensor, + generator: Optional[torch.Generator] = None, + return_dict: bool = True, + ) -> Union[RePaintSchedulerOutput, Tuple]: + """ + Predict the sample at the previous timestep by reversing the SDE. Core function to propagate the diffusion + process from the learned model outputs (most often the predicted noise). 
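+        For RePaint, the denoised (unknown) region is combined with a re-noised copy of `original_image` for the
+        known region, blended together with `mask` as in Algorithm 1 of https://arxiv.org/pdf/2201.09865.pdf.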
+ + Args: + model_output (`torch.FloatTensor`): direct output from learned + diffusion model. + timestep (`int`): current discrete timestep in the diffusion chain. + sample (`torch.FloatTensor`): + current instance of sample being created by diffusion process. + original_image (`torch.FloatTensor`): + the original image to inpaint on. + mask (`torch.FloatTensor`): + the mask where 0.0 values define which part of the original image to inpaint (change). + generator (`torch.Generator`, *optional*): random number generator. + return_dict (`bool`): option for returning tuple rather than + DDPMSchedulerOutput class + + Returns: + [`~schedulers.scheduling_utils.RePaintSchedulerOutput`] or `tuple`: + [`~schedulers.scheduling_utils.RePaintSchedulerOutput`] if `return_dict` is True, otherwise a `tuple`. When + returning a tuple, the first element is the sample tensor. + + """ + t = timestep + prev_timestep = timestep - self.config.num_train_timesteps // self.num_inference_steps + + # 1. compute alphas, betas + alpha_prod_t = self.alphas_cumprod[t] + alpha_prod_t_prev = self.alphas_cumprod[prev_timestep] if prev_timestep >= 0 else self.final_alpha_cumprod + beta_prod_t = 1 - alpha_prod_t + + # 2. compute predicted original sample from predicted noise also called + # "predicted x_0" of formula (15) from https://arxiv.org/pdf/2006.11239.pdf + pred_original_sample = (sample - beta_prod_t**0.5 * model_output) / alpha_prod_t**0.5 + + # 3. Clip "predicted x_0" + if self.config.clip_sample: + pred_original_sample = torch.clamp(pred_original_sample, -1, 1) + + # We choose to follow RePaint Algorithm 1 to get x_{t-1}, however we + # substitute formula (7) in the algorithm coming from DDPM paper + # (formula (4) Algorithm 2 - Sampling) with formula (12) from DDIM paper. + # DDIM schedule gives the same results as DDPM with eta = 1.0 + # Noise is being reused in 7. and 8., but no impact on quality has + # been observed. + + # 5. Add noise + device = model_output.device + noise = randn_tensor(model_output.shape, generator=generator, device=device, dtype=model_output.dtype) + std_dev_t = self.eta * self._get_variance(timestep) ** 0.5 + + variance = 0 + if t > 0 and self.eta > 0: + variance = std_dev_t * noise + + # 6. compute "direction pointing to x_t" of formula (12) + # from https://arxiv.org/pdf/2010.02502.pdf + pred_sample_direction = (1 - alpha_prod_t_prev - std_dev_t**2) ** 0.5 * model_output + + # 7. compute x_{t-1} of formula (12) from https://arxiv.org/pdf/2010.02502.pdf + prev_unknown_part = alpha_prod_t_prev**0.5 * pred_original_sample + pred_sample_direction + variance + + # 8. Algorithm 1 Line 5 https://arxiv.org/pdf/2201.09865.pdf + prev_known_part = (alpha_prod_t_prev**0.5) * original_image + ((1 - alpha_prod_t_prev) ** 0.5) * noise + + # 9. 
Algorithm 1 Line 8 https://arxiv.org/pdf/2201.09865.pdf + pred_prev_sample = mask * prev_known_part + (1.0 - mask) * prev_unknown_part + + if not return_dict: + return ( + pred_prev_sample, + pred_original_sample, + ) + + return RePaintSchedulerOutput(prev_sample=pred_prev_sample, pred_original_sample=pred_original_sample) + + def undo_step(self, sample, timestep, generator=None): + n = self.config.num_train_timesteps // self.num_inference_steps + + for i in range(n): + beta = self.betas[timestep + i] + if sample.device.type == "mps": + # randn does not work reproducibly on mps + noise = randn_tensor(sample.shape, dtype=sample.dtype, generator=generator) + noise = noise.to(sample.device) + else: + noise = randn_tensor(sample.shape, generator=generator, device=sample.device, dtype=sample.dtype) + + # 10. Algorithm 1 Line 10 https://arxiv.org/pdf/2201.09865.pdf + sample = (1 - beta) ** 0.5 * sample + beta**0.5 * noise + + return sample + + def add_noise( + self, + original_samples: torch.FloatTensor, + noise: torch.FloatTensor, + timesteps: torch.IntTensor, + ) -> torch.FloatTensor: + raise NotImplementedError("Use `DDPMScheduler.add_noise()` to train for sampling with RePaint.") + + def __len__(self): + return self.config.num_train_timesteps diff --git a/gf/sd-webui-rich-text/diffusers_official/schedulers/scheduling_sde_ve.py b/gf/sd-webui-rich-text/diffusers_official/schedulers/scheduling_sde_ve.py new file mode 100644 index 0000000000000000000000000000000000000000..339edfbb02eb6ac0f79b3969004418bb29e212b5 --- /dev/null +++ b/gf/sd-webui-rich-text/diffusers_official/schedulers/scheduling_sde_ve.py @@ -0,0 +1,288 @@ +# Copyright 2023 Google Brain and The HuggingFace Team. All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +# DISCLAIMER: This file is strongly influenced by https://github.com/yang-song/score_sde_pytorch + +import math +from dataclasses import dataclass +from typing import Optional, Tuple, Union + +import torch + +from ..configuration_utils import ConfigMixin, register_to_config +from ..utils import BaseOutput, randn_tensor +from .scheduling_utils import SchedulerMixin, SchedulerOutput + + +@dataclass +class SdeVeOutput(BaseOutput): + """ + Output class for the ScoreSdeVeScheduler's step function output. + + Args: + prev_sample (`torch.FloatTensor` of shape `(batch_size, num_channels, height, width)` for images): + Computed sample (x_{t-1}) of previous timestep. `prev_sample` should be used as next model input in the + denoising loop. + prev_sample_mean (`torch.FloatTensor` of shape `(batch_size, num_channels, height, width)` for images): + Mean averaged `prev_sample`. Same as `prev_sample`, only mean-averaged over previous timesteps. + """ + + prev_sample: torch.FloatTensor + prev_sample_mean: torch.FloatTensor + + +class ScoreSdeVeScheduler(SchedulerMixin, ConfigMixin): + """ + The variance exploding stochastic differential equation (SDE) scheduler. 
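+    The forward process adds noise of exponentially growing scale (from `sigma_min` up to `sigma_max`) without
+    rescaling the sample, hence "variance exploding".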
+ + For more information, see the original paper: https://arxiv.org/abs/2011.13456 + + [`~ConfigMixin`] takes care of storing all config attributes that are passed in the scheduler's `__init__` + function, such as `num_train_timesteps`. They can be accessed via `scheduler.config.num_train_timesteps`. + [`SchedulerMixin`] provides general loading and saving functionality via the [`SchedulerMixin.save_pretrained`] and + [`~SchedulerMixin.from_pretrained`] functions. + + Args: + num_train_timesteps (`int`): number of diffusion steps used to train the model. + snr (`float`): + coefficient weighting the step from the model_output sample (from the network) to the random noise. + sigma_min (`float`): + initial noise scale for sigma sequence in sampling procedure. The minimum sigma should mirror the + distribution of the data. + sigma_max (`float`): maximum value used for the range of continuous timesteps passed into the model. + sampling_eps (`float`): the end value of sampling, where timesteps decrease progressively from 1 to + epsilon. + correct_steps (`int`): number of correction steps performed on a produced sample. + """ + + order = 1 + + @register_to_config + def __init__( + self, + num_train_timesteps: int = 2000, + snr: float = 0.15, + sigma_min: float = 0.01, + sigma_max: float = 1348.0, + sampling_eps: float = 1e-5, + correct_steps: int = 1, + ): + # standard deviation of the initial noise distribution + self.init_noise_sigma = sigma_max + + # setable values + self.timesteps = None + + self.set_sigmas(num_train_timesteps, sigma_min, sigma_max, sampling_eps) + + def scale_model_input(self, sample: torch.FloatTensor, timestep: Optional[int] = None) -> torch.FloatTensor: + """ + Ensures interchangeability with schedulers that need to scale the denoising model input depending on the + current timestep. + + Args: + sample (`torch.FloatTensor`): input sample + timestep (`int`, optional): current timestep + + Returns: + `torch.FloatTensor`: scaled input sample + """ + return sample + + def set_timesteps( + self, num_inference_steps: int, sampling_eps: float = None, device: Union[str, torch.device] = None + ): + """ + Sets the continuous timesteps used for the diffusion chain. Supporting function to be run before inference. + + Args: + num_inference_steps (`int`): + the number of diffusion steps used when generating samples with a pre-trained model. + sampling_eps (`float`, optional): + final timestep value (overrides value given at Scheduler instantiation). + + """ + sampling_eps = sampling_eps if sampling_eps is not None else self.config.sampling_eps + + self.timesteps = torch.linspace(1, sampling_eps, num_inference_steps, device=device) + + def set_sigmas( + self, num_inference_steps: int, sigma_min: float = None, sigma_max: float = None, sampling_eps: float = None + ): + """ + Sets the noise scales used for the diffusion chain. Supporting function to be run before inference. + + The sigmas control the weight of the `drift` and `diffusion` components of sample update. + + Args: + num_inference_steps (`int`): + the number of diffusion steps used when generating samples with a pre-trained model. + sigma_min (`float`, optional): + initial noise scale value (overrides value given at Scheduler instantiation). + sigma_max (`float`, optional): + final noise scale value (overrides value given at Scheduler instantiation). + sampling_eps (`float`, optional): + final timestep value (overrides value given at Scheduler instantiation). 
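+
+        The resulting sigmas follow a geometric schedule, sigma(t) = sigma_min * (sigma_max / sigma_min) ** t,
+        evaluated at the continuous timesteps.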
+ + """ + sigma_min = sigma_min if sigma_min is not None else self.config.sigma_min + sigma_max = sigma_max if sigma_max is not None else self.config.sigma_max + sampling_eps = sampling_eps if sampling_eps is not None else self.config.sampling_eps + if self.timesteps is None: + self.set_timesteps(num_inference_steps, sampling_eps) + + self.sigmas = sigma_min * (sigma_max / sigma_min) ** (self.timesteps / sampling_eps) + self.discrete_sigmas = torch.exp(torch.linspace(math.log(sigma_min), math.log(sigma_max), num_inference_steps)) + self.sigmas = torch.tensor([sigma_min * (sigma_max / sigma_min) ** t for t in self.timesteps]) + + def get_adjacent_sigma(self, timesteps, t): + return torch.where( + timesteps == 0, + torch.zeros_like(t.to(timesteps.device)), + self.discrete_sigmas[timesteps - 1].to(timesteps.device), + ) + + def step_pred( + self, + model_output: torch.FloatTensor, + timestep: int, + sample: torch.FloatTensor, + generator: Optional[torch.Generator] = None, + return_dict: bool = True, + ) -> Union[SdeVeOutput, Tuple]: + """ + Predict the sample at the previous timestep by reversing the SDE. Core function to propagate the diffusion + process from the learned model outputs (most often the predicted noise). + + Args: + model_output (`torch.FloatTensor`): direct output from learned diffusion model. + timestep (`int`): current discrete timestep in the diffusion chain. + sample (`torch.FloatTensor`): + current instance of sample being created by diffusion process. + generator: random number generator. + return_dict (`bool`): option for returning tuple rather than SchedulerOutput class + + Returns: + [`~schedulers.scheduling_sde_ve.SdeVeOutput`] or `tuple`: [`~schedulers.scheduling_sde_ve.SdeVeOutput`] if + `return_dict` is True, otherwise a `tuple`. When returning a tuple, the first element is the sample tensor. + + """ + if self.timesteps is None: + raise ValueError( + "`self.timesteps` is not set, you need to run 'set_timesteps' after creating the scheduler" + ) + + timestep = timestep * torch.ones( + sample.shape[0], device=sample.device + ) # torch.repeat_interleave(timestep, sample.shape[0]) + timesteps = (timestep * (len(self.timesteps) - 1)).long() + + # mps requires indices to be in the same device, so we use cpu as is the default with cuda + timesteps = timesteps.to(self.discrete_sigmas.device) + + sigma = self.discrete_sigmas[timesteps].to(sample.device) + adjacent_sigma = self.get_adjacent_sigma(timesteps, timestep).to(sample.device) + drift = torch.zeros_like(sample) + diffusion = (sigma**2 - adjacent_sigma**2) ** 0.5 + + # equation 6 in the paper: the model_output modeled by the network is grad_x log pt(x) + # also equation 47 shows the analog from SDE models to ancestral sampling methods + diffusion = diffusion.flatten() + while len(diffusion.shape) < len(sample.shape): + diffusion = diffusion.unsqueeze(-1) + drift = drift - diffusion**2 * model_output + + # equation 6: sample noise for the diffusion term of + noise = randn_tensor( + sample.shape, layout=sample.layout, generator=generator, device=sample.device, dtype=sample.dtype + ) + prev_sample_mean = sample - drift # subtract because `dt` is a small negative timestep + # TODO is the variable diffusion the correct scaling term for the noise? 
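+        # `diffusion * noise` is the g(t) * z diffusion term of the discretized reverse-time SDE, with z ~ N(0, I)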
+ prev_sample = prev_sample_mean + diffusion * noise # add impact of diffusion field g + + if not return_dict: + return (prev_sample, prev_sample_mean) + + return SdeVeOutput(prev_sample=prev_sample, prev_sample_mean=prev_sample_mean) + + def step_correct( + self, + model_output: torch.FloatTensor, + sample: torch.FloatTensor, + generator: Optional[torch.Generator] = None, + return_dict: bool = True, + ) -> Union[SchedulerOutput, Tuple]: + """ + Correct the predicted sample based on the output model_output of the network. This is often run repeatedly + after making the prediction for the previous timestep. + + Args: + model_output (`torch.FloatTensor`): direct output from learned diffusion model. + sample (`torch.FloatTensor`): + current instance of sample being created by diffusion process. + generator: random number generator. + return_dict (`bool`): option for returning tuple rather than SchedulerOutput class + + Returns: + [`~schedulers.scheduling_sde_ve.SdeVeOutput`] or `tuple`: [`~schedulers.scheduling_sde_ve.SdeVeOutput`] if + `return_dict` is True, otherwise a `tuple`. When returning a tuple, the first element is the sample tensor. + + """ + if self.timesteps is None: + raise ValueError( + "`self.timesteps` is not set, you need to run 'set_timesteps' after creating the scheduler" + ) + + # For small batch sizes, the paper "suggest replacing norm(z) with sqrt(d), where d is the dim. of z" + # sample noise for correction + noise = randn_tensor(sample.shape, layout=sample.layout, generator=generator).to(sample.device) + + # compute step size from the model_output, the noise, and the snr + grad_norm = torch.norm(model_output.reshape(model_output.shape[0], -1), dim=-1).mean() + noise_norm = torch.norm(noise.reshape(noise.shape[0], -1), dim=-1).mean() + step_size = (self.config.snr * noise_norm / grad_norm) ** 2 * 2 + step_size = step_size * torch.ones(sample.shape[0]).to(sample.device) + # self.repeat_scalar(step_size, sample.shape[0]) + + # compute corrected sample: model_output term and noise term + step_size = step_size.flatten() + while len(step_size.shape) < len(sample.shape): + step_size = step_size.unsqueeze(-1) + prev_sample_mean = sample + step_size * model_output + prev_sample = prev_sample_mean + ((step_size * 2) ** 0.5) * noise + + if not return_dict: + return (prev_sample,) + + return SchedulerOutput(prev_sample=prev_sample) + + def add_noise( + self, + original_samples: torch.FloatTensor, + noise: torch.FloatTensor, + timesteps: torch.FloatTensor, + ) -> torch.FloatTensor: + # Make sure sigmas and timesteps have the same device and dtype as original_samples + timesteps = timesteps.to(original_samples.device) + sigmas = self.discrete_sigmas.to(original_samples.device)[timesteps] + noise = ( + noise * sigmas[:, None, None, None] + if noise is not None + else torch.randn_like(original_samples) * sigmas[:, None, None, None] + ) + noisy_samples = noise + original_samples + return noisy_samples + + def __len__(self): + return self.config.num_train_timesteps diff --git a/gf/sd-webui-rich-text/diffusers_official/schedulers/scheduling_sde_ve_flax.py b/gf/sd-webui-rich-text/diffusers_official/schedulers/scheduling_sde_ve_flax.py new file mode 100644 index 0000000000000000000000000000000000000000..b6240559fc88fa45e4612dc3005ba66e10d3269d --- /dev/null +++ b/gf/sd-webui-rich-text/diffusers_official/schedulers/scheduling_sde_ve_flax.py @@ -0,0 +1,279 @@ +# Copyright 2023 Google Brain and The HuggingFace Team. All rights reserved. 
+# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +# DISCLAIMER: This file is strongly influenced by https://github.com/yang-song/score_sde_pytorch + +from dataclasses import dataclass +from typing import Optional, Tuple, Union + +import flax +import jax.numpy as jnp +from jax import random + +from ..configuration_utils import ConfigMixin, register_to_config +from .scheduling_utils_flax import FlaxSchedulerMixin, FlaxSchedulerOutput, broadcast_to_shape_from_left + + +@flax.struct.dataclass +class ScoreSdeVeSchedulerState: + # setable values + timesteps: Optional[jnp.ndarray] = None + discrete_sigmas: Optional[jnp.ndarray] = None + sigmas: Optional[jnp.ndarray] = None + + @classmethod + def create(cls): + return cls() + + +@dataclass +class FlaxSdeVeOutput(FlaxSchedulerOutput): + """ + Output class for the ScoreSdeVeScheduler's step function output. + + Args: + state (`ScoreSdeVeSchedulerState`): + prev_sample (`jnp.ndarray` of shape `(batch_size, num_channels, height, width)` for images): + Computed sample (x_{t-1}) of previous timestep. `prev_sample` should be used as next model input in the + denoising loop. + prev_sample_mean (`jnp.ndarray` of shape `(batch_size, num_channels, height, width)` for images): + Mean averaged `prev_sample`. Same as `prev_sample`, only mean-averaged over previous timesteps. + """ + + state: ScoreSdeVeSchedulerState + prev_sample: jnp.ndarray + prev_sample_mean: Optional[jnp.ndarray] = None + + +class FlaxScoreSdeVeScheduler(FlaxSchedulerMixin, ConfigMixin): + """ + The variance exploding stochastic differential equation (SDE) scheduler. + + For more information, see the original paper: https://arxiv.org/abs/2011.13456 + + [`~ConfigMixin`] takes care of storing all config attributes that are passed in the scheduler's `__init__` + function, such as `num_train_timesteps`. They can be accessed via `scheduler.config.num_train_timesteps`. + [`SchedulerMixin`] provides general loading and saving functionality via the [`SchedulerMixin.save_pretrained`] and + [`~SchedulerMixin.from_pretrained`] functions. + + Args: + num_train_timesteps (`int`): number of diffusion steps used to train the model. + snr (`float`): + coefficient weighting the step from the model_output sample (from the network) to the random noise. + sigma_min (`float`): + initial noise scale for sigma sequence in sampling procedure. The minimum sigma should mirror the + distribution of the data. + sigma_max (`float`): maximum value used for the range of continuous timesteps passed into the model. + sampling_eps (`float`): the end value of sampling, where timesteps decrease progressively from 1 to + epsilon. + correct_steps (`int`): number of correction steps performed on a produced sample. 
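+
+    Unlike the PyTorch scheduler, all mutable values (timesteps, discrete_sigmas, sigmas) are kept in a separate
+    `ScoreSdeVeSchedulerState` instance that is passed to and returned by the scheduler methods.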
+ """ + + @property + def has_state(self): + return True + + @register_to_config + def __init__( + self, + num_train_timesteps: int = 2000, + snr: float = 0.15, + sigma_min: float = 0.01, + sigma_max: float = 1348.0, + sampling_eps: float = 1e-5, + correct_steps: int = 1, + ): + pass + + def create_state(self): + state = ScoreSdeVeSchedulerState.create() + return self.set_sigmas( + state, + self.config.num_train_timesteps, + self.config.sigma_min, + self.config.sigma_max, + self.config.sampling_eps, + ) + + def set_timesteps( + self, state: ScoreSdeVeSchedulerState, num_inference_steps: int, shape: Tuple = (), sampling_eps: float = None + ) -> ScoreSdeVeSchedulerState: + """ + Sets the continuous timesteps used for the diffusion chain. Supporting function to be run before inference. + + Args: + state (`ScoreSdeVeSchedulerState`): the `FlaxScoreSdeVeScheduler` state data class instance. + num_inference_steps (`int`): + the number of diffusion steps used when generating samples with a pre-trained model. + sampling_eps (`float`, optional): + final timestep value (overrides value given at Scheduler instantiation). + + """ + sampling_eps = sampling_eps if sampling_eps is not None else self.config.sampling_eps + + timesteps = jnp.linspace(1, sampling_eps, num_inference_steps) + return state.replace(timesteps=timesteps) + + def set_sigmas( + self, + state: ScoreSdeVeSchedulerState, + num_inference_steps: int, + sigma_min: float = None, + sigma_max: float = None, + sampling_eps: float = None, + ) -> ScoreSdeVeSchedulerState: + """ + Sets the noise scales used for the diffusion chain. Supporting function to be run before inference. + + The sigmas control the weight of the `drift` and `diffusion` components of sample update. + + Args: + state (`ScoreSdeVeSchedulerState`): the `FlaxScoreSdeVeScheduler` state data class instance. + num_inference_steps (`int`): + the number of diffusion steps used when generating samples with a pre-trained model. + sigma_min (`float`, optional): + initial noise scale value (overrides value given at Scheduler instantiation). + sigma_max (`float`, optional): + final noise scale value (overrides value given at Scheduler instantiation). + sampling_eps (`float`, optional): + final timestep value (overrides value given at Scheduler instantiation). + """ + sigma_min = sigma_min if sigma_min is not None else self.config.sigma_min + sigma_max = sigma_max if sigma_max is not None else self.config.sigma_max + sampling_eps = sampling_eps if sampling_eps is not None else self.config.sampling_eps + if state.timesteps is None: + state = self.set_timesteps(state, num_inference_steps, sampling_eps) + + discrete_sigmas = jnp.exp(jnp.linspace(jnp.log(sigma_min), jnp.log(sigma_max), num_inference_steps)) + sigmas = jnp.array([sigma_min * (sigma_max / sigma_min) ** t for t in state.timesteps]) + + return state.replace(discrete_sigmas=discrete_sigmas, sigmas=sigmas) + + def get_adjacent_sigma(self, state, timesteps, t): + return jnp.where(timesteps == 0, jnp.zeros_like(t), state.discrete_sigmas[timesteps - 1]) + + def step_pred( + self, + state: ScoreSdeVeSchedulerState, + model_output: jnp.ndarray, + timestep: int, + sample: jnp.ndarray, + key: random.KeyArray, + return_dict: bool = True, + ) -> Union[FlaxSdeVeOutput, Tuple]: + """ + Predict the sample at the previous timestep by reversing the SDE. Core function to propagate the diffusion + process from the learned model outputs (most often the predicted noise). 
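+        Noise is drawn from the provided JAX PRNG `key`; callers typically split the key once per sampling step so
+        that each step uses fresh randomness.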
+ + Args: + state (`ScoreSdeVeSchedulerState`): the `FlaxScoreSdeVeScheduler` state data class instance. + model_output (`jnp.ndarray`): direct output from learned diffusion model. + timestep (`int`): current discrete timestep in the diffusion chain. + sample (`jnp.ndarray`): + current instance of sample being created by diffusion process. + generator: random number generator. + return_dict (`bool`): option for returning tuple rather than FlaxSdeVeOutput class + + Returns: + [`FlaxSdeVeOutput`] or `tuple`: [`FlaxSdeVeOutput`] if `return_dict` is True, otherwise a `tuple`. When + returning a tuple, the first element is the sample tensor. + + """ + if state.timesteps is None: + raise ValueError( + "`state.timesteps` is not set, you need to run 'set_timesteps' after creating the scheduler" + ) + + timestep = timestep * jnp.ones( + sample.shape[0], + ) + timesteps = (timestep * (len(state.timesteps) - 1)).long() + + sigma = state.discrete_sigmas[timesteps] + adjacent_sigma = self.get_adjacent_sigma(state, timesteps, timestep) + drift = jnp.zeros_like(sample) + diffusion = (sigma**2 - adjacent_sigma**2) ** 0.5 + + # equation 6 in the paper: the model_output modeled by the network is grad_x log pt(x) + # also equation 47 shows the analog from SDE models to ancestral sampling methods + diffusion = diffusion.flatten() + diffusion = broadcast_to_shape_from_left(diffusion, sample.shape) + drift = drift - diffusion**2 * model_output + + # equation 6: sample noise for the diffusion term of + key = random.split(key, num=1) + noise = random.normal(key=key, shape=sample.shape) + prev_sample_mean = sample - drift # subtract because `dt` is a small negative timestep + # TODO is the variable diffusion the correct scaling term for the noise? + prev_sample = prev_sample_mean + diffusion * noise # add impact of diffusion field g + + if not return_dict: + return (prev_sample, prev_sample_mean, state) + + return FlaxSdeVeOutput(prev_sample=prev_sample, prev_sample_mean=prev_sample_mean, state=state) + + def step_correct( + self, + state: ScoreSdeVeSchedulerState, + model_output: jnp.ndarray, + sample: jnp.ndarray, + key: random.KeyArray, + return_dict: bool = True, + ) -> Union[FlaxSdeVeOutput, Tuple]: + """ + Correct the predicted sample based on the output model_output of the network. This is often run repeatedly + after making the prediction for the previous timestep. + + Args: + state (`ScoreSdeVeSchedulerState`): the `FlaxScoreSdeVeScheduler` state data class instance. + model_output (`jnp.ndarray`): direct output from learned diffusion model. + sample (`jnp.ndarray`): + current instance of sample being created by diffusion process. + generator: random number generator. + return_dict (`bool`): option for returning tuple rather than FlaxSdeVeOutput class + + Returns: + [`FlaxSdeVeOutput`] or `tuple`: [`FlaxSdeVeOutput`] if `return_dict` is True, otherwise a `tuple`. When + returning a tuple, the first element is the sample tensor. + + """ + if state.timesteps is None: + raise ValueError( + "`state.timesteps` is not set, you need to run 'set_timesteps' after creating the scheduler" + ) + + # For small batch sizes, the paper "suggest replacing norm(z) with sqrt(d), where d is the dim. 
of z" + # sample noise for correction + key = random.split(key, num=1) + noise = random.normal(key=key, shape=sample.shape) + + # compute step size from the model_output, the noise, and the snr + grad_norm = jnp.linalg.norm(model_output) + noise_norm = jnp.linalg.norm(noise) + step_size = (self.config.snr * noise_norm / grad_norm) ** 2 * 2 + step_size = step_size * jnp.ones(sample.shape[0]) + + # compute corrected sample: model_output term and noise term + step_size = step_size.flatten() + step_size = broadcast_to_shape_from_left(step_size, sample.shape) + prev_sample_mean = sample + step_size * model_output + prev_sample = prev_sample_mean + ((step_size * 2) ** 0.5) * noise + + if not return_dict: + return (prev_sample, state) + + return FlaxSdeVeOutput(prev_sample=prev_sample, state=state) + + def __len__(self): + return self.config.num_train_timesteps diff --git a/gf/sd-webui-rich-text/diffusers_official/schedulers/scheduling_sde_vp.py b/gf/sd-webui-rich-text/diffusers_official/schedulers/scheduling_sde_vp.py new file mode 100644 index 0000000000000000000000000000000000000000..6e2ead90edb57cd1eb1d270695e222d404064180 --- /dev/null +++ b/gf/sd-webui-rich-text/diffusers_official/schedulers/scheduling_sde_vp.py @@ -0,0 +1,90 @@ +# Copyright 2023 Google Brain and The HuggingFace Team. All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +# DISCLAIMER: This file is strongly influenced by https://github.com/yang-song/score_sde_pytorch + +import math +from typing import Union + +import torch + +from ..configuration_utils import ConfigMixin, register_to_config +from ..utils import randn_tensor +from .scheduling_utils import SchedulerMixin + + +class ScoreSdeVpScheduler(SchedulerMixin, ConfigMixin): + """ + The variance preserving stochastic differential equation (SDE) scheduler. + + [`~ConfigMixin`] takes care of storing all config attributes that are passed in the scheduler's `__init__` + function, such as `num_train_timesteps`. They can be accessed via `scheduler.config.num_train_timesteps`. + [`SchedulerMixin`] provides general loading and saving functionality via the [`SchedulerMixin.save_pretrained`] and + [`~SchedulerMixin.from_pretrained`] functions. 
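+    In the variance preserving formulation, the sample is scaled down as noise is added so that the overall
+    variance stays bounded (in contrast to the variance exploding SDE).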
+ + For more information, see the original paper: https://arxiv.org/abs/2011.13456 + + UNDER CONSTRUCTION + + """ + + order = 1 + + @register_to_config + def __init__(self, num_train_timesteps=2000, beta_min=0.1, beta_max=20, sampling_eps=1e-3): + self.sigmas = None + self.discrete_sigmas = None + self.timesteps = None + + def set_timesteps(self, num_inference_steps, device: Union[str, torch.device] = None): + self.timesteps = torch.linspace(1, self.config.sampling_eps, num_inference_steps, device=device) + + def step_pred(self, score, x, t, generator=None): + if self.timesteps is None: + raise ValueError( + "`self.timesteps` is not set, you need to run 'set_timesteps' after creating the scheduler" + ) + + # TODO(Patrick) better comments + non-PyTorch + # postprocess model score + log_mean_coeff = ( + -0.25 * t**2 * (self.config.beta_max - self.config.beta_min) - 0.5 * t * self.config.beta_min + ) + std = torch.sqrt(1.0 - torch.exp(2.0 * log_mean_coeff)) + std = std.flatten() + while len(std.shape) < len(score.shape): + std = std.unsqueeze(-1) + score = -score / std + + # compute + dt = -1.0 / len(self.timesteps) + + beta_t = self.config.beta_min + t * (self.config.beta_max - self.config.beta_min) + beta_t = beta_t.flatten() + while len(beta_t.shape) < len(x.shape): + beta_t = beta_t.unsqueeze(-1) + drift = -0.5 * beta_t * x + + diffusion = torch.sqrt(beta_t) + drift = drift - diffusion**2 * score + x_mean = x + drift * dt + + # add noise + noise = randn_tensor(x.shape, layout=x.layout, generator=generator, device=x.device, dtype=x.dtype) + x = x_mean + diffusion * math.sqrt(-dt) * noise + + return x, x_mean + + def __len__(self): + return self.config.num_train_timesteps diff --git a/gf/sd-webui-rich-text/diffusers_official/schedulers/scheduling_unclip.py b/gf/sd-webui-rich-text/diffusers_official/schedulers/scheduling_unclip.py new file mode 100644 index 0000000000000000000000000000000000000000..fd23e48bad00d16a1086f31b6584ff9df03129fb --- /dev/null +++ b/gf/sd-webui-rich-text/diffusers_official/schedulers/scheduling_unclip.py @@ -0,0 +1,348 @@ +# Copyright 2023 Kakao Brain and The HuggingFace Team. All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +import math +from dataclasses import dataclass +from typing import Optional, Tuple, Union + +import numpy as np +import torch + +from ..configuration_utils import ConfigMixin, register_to_config +from ..utils import BaseOutput, randn_tensor +from .scheduling_utils import SchedulerMixin + + +@dataclass +# Copied from diffusers.schedulers.scheduling_ddpm.DDPMSchedulerOutput with DDPM->UnCLIP +class UnCLIPSchedulerOutput(BaseOutput): + """ + Output class for the scheduler's step function output. + + Args: + prev_sample (`torch.FloatTensor` of shape `(batch_size, num_channels, height, width)` for images): + Computed sample (x_{t-1}) of previous timestep. `prev_sample` should be used as next model input in the + denoising loop. 
+ pred_original_sample (`torch.FloatTensor` of shape `(batch_size, num_channels, height, width)` for images): + The predicted denoised sample (x_{0}) based on the model output from the current timestep. + `pred_original_sample` can be used to preview progress or for guidance. + """ + + prev_sample: torch.FloatTensor + pred_original_sample: Optional[torch.FloatTensor] = None + + +# Copied from diffusers.schedulers.scheduling_ddpm.betas_for_alpha_bar +def betas_for_alpha_bar( + num_diffusion_timesteps, + max_beta=0.999, + alpha_transform_type="cosine", +): + """ + Create a beta schedule that discretizes the given alpha_t_bar function, which defines the cumulative product of + (1-beta) over time from t = [0,1]. + + Contains a function alpha_bar that takes an argument t and transforms it to the cumulative product of (1-beta) up + to that part of the diffusion process. + + + Args: + num_diffusion_timesteps (`int`): the number of betas to produce. + max_beta (`float`): the maximum beta to use; use values lower than 1 to + prevent singularities. + alpha_transform_type (`str`, *optional*, default to `cosine`): the type of noise schedule for alpha_bar. + Choose from `cosine` or `exp` + + Returns: + betas (`np.ndarray`): the betas used by the scheduler to step the model outputs + """ + if alpha_transform_type == "cosine": + + def alpha_bar_fn(t): + return math.cos((t + 0.008) / 1.008 * math.pi / 2) ** 2 + + elif alpha_transform_type == "exp": + + def alpha_bar_fn(t): + return math.exp(t * -12.0) + + else: + raise ValueError(f"Unsupported alpha_tranform_type: {alpha_transform_type}") + + betas = [] + for i in range(num_diffusion_timesteps): + t1 = i / num_diffusion_timesteps + t2 = (i + 1) / num_diffusion_timesteps + betas.append(min(1 - alpha_bar_fn(t2) / alpha_bar_fn(t1), max_beta)) + return torch.tensor(betas, dtype=torch.float32) + + +class UnCLIPScheduler(SchedulerMixin, ConfigMixin): + """ + NOTE: do not use this scheduler. The DDPM scheduler has been updated to support the changes made here. This + scheduler will be removed and replaced with DDPM. + + This is a modified DDPM Scheduler specifically for the karlo unCLIP model. + + This scheduler has some minor variations in how it calculates the learned range variance and dynamically + re-calculates betas based off the timesteps it is skipping. + + The scheduler also uses a slightly different step ratio when computing timesteps to use for inference. + + See [`~DDPMScheduler`] for more information on DDPM scheduling + + Args: + num_train_timesteps (`int`): number of diffusion steps used to train the model. + variance_type (`str`): + options to clip the variance used when adding noise to the denoised sample. Choose from `fixed_small_log` + or `learned_range`. + clip_sample (`bool`, default `True`): + option to clip predicted sample between `-clip_sample_range` and `clip_sample_range` for numerical + stability. + clip_sample_range (`float`, default `1.0`): + The range to clip the sample between. See `clip_sample`. 
+ prediction_type (`str`, default `epsilon`, optional): + prediction type of the scheduler function, one of `epsilon` (predicting the noise of the diffusion process) + or `sample` (directly predicting the noisy sample`) + """ + + @register_to_config + def __init__( + self, + num_train_timesteps: int = 1000, + variance_type: str = "fixed_small_log", + clip_sample: bool = True, + clip_sample_range: Optional[float] = 1.0, + prediction_type: str = "epsilon", + beta_schedule: str = "squaredcos_cap_v2", + ): + if beta_schedule != "squaredcos_cap_v2": + raise ValueError("UnCLIPScheduler only supports `beta_schedule`: 'squaredcos_cap_v2'") + + self.betas = betas_for_alpha_bar(num_train_timesteps) + + self.alphas = 1.0 - self.betas + self.alphas_cumprod = torch.cumprod(self.alphas, dim=0) + self.one = torch.tensor(1.0) + + # standard deviation of the initial noise distribution + self.init_noise_sigma = 1.0 + + # setable values + self.num_inference_steps = None + self.timesteps = torch.from_numpy(np.arange(0, num_train_timesteps)[::-1].copy()) + + self.variance_type = variance_type + + def scale_model_input(self, sample: torch.FloatTensor, timestep: Optional[int] = None) -> torch.FloatTensor: + """ + Ensures interchangeability with schedulers that need to scale the denoising model input depending on the + current timestep. + + Args: + sample (`torch.FloatTensor`): input sample + timestep (`int`, optional): current timestep + + Returns: + `torch.FloatTensor`: scaled input sample + """ + return sample + + def set_timesteps(self, num_inference_steps: int, device: Union[str, torch.device] = None): + """ + Sets the discrete timesteps used for the diffusion chain. Supporting function to be run before inference. + + Note that this scheduler uses a slightly different step ratio than the other diffusers schedulers. The + different step ratio is to mimic the original karlo implementation and does not affect the quality or accuracy + of the results. + + Args: + num_inference_steps (`int`): + the number of diffusion steps used when generating samples with a pre-trained model. 
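+
+        With the step ratio (num_train_timesteps - 1) / (num_inference_steps - 1) used here, the largest inference
+        timestep always coincides with the last training timestep.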
+ """ + self.num_inference_steps = num_inference_steps + step_ratio = (self.config.num_train_timesteps - 1) / (self.num_inference_steps - 1) + timesteps = (np.arange(0, num_inference_steps) * step_ratio).round()[::-1].copy().astype(np.int64) + self.timesteps = torch.from_numpy(timesteps).to(device) + + def _get_variance(self, t, prev_timestep=None, predicted_variance=None, variance_type=None): + if prev_timestep is None: + prev_timestep = t - 1 + + alpha_prod_t = self.alphas_cumprod[t] + alpha_prod_t_prev = self.alphas_cumprod[prev_timestep] if prev_timestep >= 0 else self.one + beta_prod_t = 1 - alpha_prod_t + beta_prod_t_prev = 1 - alpha_prod_t_prev + + if prev_timestep == t - 1: + beta = self.betas[t] + else: + beta = 1 - alpha_prod_t / alpha_prod_t_prev + + # For t > 0, compute predicted variance βt (see formula (6) and (7) from https://arxiv.org/pdf/2006.11239.pdf) + # and sample from it to get previous sample + # x_{t-1} ~ N(pred_prev_sample, variance) == add variance to pred_sample + variance = beta_prod_t_prev / beta_prod_t * beta + + if variance_type is None: + variance_type = self.config.variance_type + + # hacks - were probably added for training stability + if variance_type == "fixed_small_log": + variance = torch.log(torch.clamp(variance, min=1e-20)) + variance = torch.exp(0.5 * variance) + elif variance_type == "learned_range": + # NOTE difference with DDPM scheduler + min_log = variance.log() + max_log = beta.log() + + frac = (predicted_variance + 1) / 2 + variance = frac * max_log + (1 - frac) * min_log + + return variance + + def step( + self, + model_output: torch.FloatTensor, + timestep: int, + sample: torch.FloatTensor, + prev_timestep: Optional[int] = None, + generator=None, + return_dict: bool = True, + ) -> Union[UnCLIPSchedulerOutput, Tuple]: + """ + Predict the sample at the previous timestep by reversing the SDE. Core function to propagate the diffusion + process from the learned model outputs (most often the predicted noise). + + Args: + model_output (`torch.FloatTensor`): direct output from learned diffusion model. + timestep (`int`): current discrete timestep in the diffusion chain. + sample (`torch.FloatTensor`): + current instance of sample being created by diffusion process. + prev_timestep (`int`, *optional*): The previous timestep to predict the previous sample at. + Used to dynamically compute beta. If not given, `t-1` is used and the pre-computed beta is used. + generator: random number generator. + return_dict (`bool`): option for returning tuple rather than UnCLIPSchedulerOutput class + + Returns: + [`~schedulers.scheduling_utils.UnCLIPSchedulerOutput`] or `tuple`: + [`~schedulers.scheduling_utils.UnCLIPSchedulerOutput`] if `return_dict` is True, otherwise a `tuple`. When + returning a tuple, the first element is the sample tensor. + + """ + t = timestep + + if model_output.shape[1] == sample.shape[1] * 2 and self.variance_type == "learned_range": + model_output, predicted_variance = torch.split(model_output, sample.shape[1], dim=1) + else: + predicted_variance = None + + # 1. compute alphas, betas + if prev_timestep is None: + prev_timestep = t - 1 + + alpha_prod_t = self.alphas_cumprod[t] + alpha_prod_t_prev = self.alphas_cumprod[prev_timestep] if prev_timestep >= 0 else self.one + beta_prod_t = 1 - alpha_prod_t + beta_prod_t_prev = 1 - alpha_prod_t_prev + + if prev_timestep == t - 1: + beta = self.betas[t] + alpha = self.alphas[t] + else: + beta = 1 - alpha_prod_t / alpha_prod_t_prev + alpha = 1 - beta + + # 2. 
compute predicted original sample from predicted noise also called + # "predicted x_0" of formula (15) from https://arxiv.org/pdf/2006.11239.pdf + if self.config.prediction_type == "epsilon": + pred_original_sample = (sample - beta_prod_t ** (0.5) * model_output) / alpha_prod_t ** (0.5) + elif self.config.prediction_type == "sample": + pred_original_sample = model_output + else: + raise ValueError( + f"prediction_type given as {self.config.prediction_type} must be one of `epsilon` or `sample`" + " for the UnCLIPScheduler." + ) + + # 3. Clip "predicted x_0" + if self.config.clip_sample: + pred_original_sample = torch.clamp( + pred_original_sample, -self.config.clip_sample_range, self.config.clip_sample_range + ) + + # 4. Compute coefficients for pred_original_sample x_0 and current sample x_t + # See formula (7) from https://arxiv.org/pdf/2006.11239.pdf + pred_original_sample_coeff = (alpha_prod_t_prev ** (0.5) * beta) / beta_prod_t + current_sample_coeff = alpha ** (0.5) * beta_prod_t_prev / beta_prod_t + + # 5. Compute predicted previous sample µ_t + # See formula (7) from https://arxiv.org/pdf/2006.11239.pdf + pred_prev_sample = pred_original_sample_coeff * pred_original_sample + current_sample_coeff * sample + + # 6. Add noise + variance = 0 + if t > 0: + variance_noise = randn_tensor( + model_output.shape, dtype=model_output.dtype, generator=generator, device=model_output.device + ) + + variance = self._get_variance( + t, + predicted_variance=predicted_variance, + prev_timestep=prev_timestep, + ) + + if self.variance_type == "fixed_small_log": + variance = variance + elif self.variance_type == "learned_range": + variance = (0.5 * variance).exp() + else: + raise ValueError( + f"variance_type given as {self.variance_type} must be one of `fixed_small_log` or `learned_range`" + " for the UnCLIPScheduler." 
+ ) + + variance = variance * variance_noise + + pred_prev_sample = pred_prev_sample + variance + + if not return_dict: + return (pred_prev_sample,) + + return UnCLIPSchedulerOutput(prev_sample=pred_prev_sample, pred_original_sample=pred_original_sample) + + # Copied from diffusers.schedulers.scheduling_ddpm.DDPMScheduler.add_noise + def add_noise( + self, + original_samples: torch.FloatTensor, + noise: torch.FloatTensor, + timesteps: torch.IntTensor, + ) -> torch.FloatTensor: + # Make sure alphas_cumprod and timestep have same device and dtype as original_samples + alphas_cumprod = self.alphas_cumprod.to(device=original_samples.device, dtype=original_samples.dtype) + timesteps = timesteps.to(original_samples.device) + + sqrt_alpha_prod = alphas_cumprod[timesteps] ** 0.5 + sqrt_alpha_prod = sqrt_alpha_prod.flatten() + while len(sqrt_alpha_prod.shape) < len(original_samples.shape): + sqrt_alpha_prod = sqrt_alpha_prod.unsqueeze(-1) + + sqrt_one_minus_alpha_prod = (1 - alphas_cumprod[timesteps]) ** 0.5 + sqrt_one_minus_alpha_prod = sqrt_one_minus_alpha_prod.flatten() + while len(sqrt_one_minus_alpha_prod.shape) < len(original_samples.shape): + sqrt_one_minus_alpha_prod = sqrt_one_minus_alpha_prod.unsqueeze(-1) + + noisy_samples = sqrt_alpha_prod * original_samples + sqrt_one_minus_alpha_prod * noise + return noisy_samples diff --git a/gf/sd-webui-rich-text/diffusers_official/schedulers/scheduling_unipc_multistep.py b/gf/sd-webui-rich-text/diffusers_official/schedulers/scheduling_unipc_multistep.py new file mode 100644 index 0000000000000000000000000000000000000000..3caa01a58562f5f12d46354ef6112a64875da79d --- /dev/null +++ b/gf/sd-webui-rich-text/diffusers_official/schedulers/scheduling_unipc_multistep.py @@ -0,0 +1,681 @@ +# Copyright 2023 TSAIL Team and The HuggingFace Team. All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +# DISCLAIMER: check https://arxiv.org/abs/2302.04867 and https://github.com/wl-zhao/UniPC for more info +# The codebase is modified based on https://github.com/huggingface/diffusers/blob/main/src/diffusers/schedulers/scheduling_dpmsolver_multistep.py + +import math +from typing import List, Optional, Tuple, Union + +import numpy as np +import torch + +from ..configuration_utils import ConfigMixin, register_to_config +from .scheduling_utils import KarrasDiffusionSchedulers, SchedulerMixin, SchedulerOutput + + +def betas_for_alpha_bar(num_diffusion_timesteps, max_beta=0.999): + """ + Create a beta schedule that discretizes the given alpha_t_bar function, which defines the cumulative product of + (1-beta) over time from t = [0,1]. + + Contains a function alpha_bar that takes an argument t and transforms it to the cumulative product of (1-beta) up + to that part of the diffusion process. + + + Args: + num_diffusion_timesteps (`int`): the number of betas to produce. + max_beta (`float`): the maximum beta to use; use values lower than 1 to + prevent singularities. 
+ + Returns: + betas (`np.ndarray`): the betas used by the scheduler to step the model outputs + """ + + def alpha_bar(time_step): + return math.cos((time_step + 0.008) / 1.008 * math.pi / 2) ** 2 + + betas = [] + for i in range(num_diffusion_timesteps): + t1 = i / num_diffusion_timesteps + t2 = (i + 1) / num_diffusion_timesteps + betas.append(min(1 - alpha_bar(t2) / alpha_bar(t1), max_beta)) + return torch.tensor(betas, dtype=torch.float32) + + +class UniPCMultistepScheduler(SchedulerMixin, ConfigMixin): + """ + UniPC is a training-free framework designed for the fast sampling of diffusion models, which consists of a + corrector (UniC) and a predictor (UniP) that share a unified analytical form and support arbitrary orders. UniPC is + by desinged model-agnostic, supporting pixel-space/latent-space DPMs on unconditional/conditional sampling. It can + also be applied to both noise prediction model and data prediction model. The corrector UniC can be also applied + after any off-the-shelf solvers to increase the order of accuracy. + + For more details, see the original paper: https://arxiv.org/abs/2302.04867 + + Currently, we support the multistep UniPC for both noise prediction models and data prediction models. We recommend + to use `solver_order=2` for guided sampling, and `solver_order=3` for unconditional sampling. + + We also support the "dynamic thresholding" method in Imagen (https://arxiv.org/abs/2205.11487). For pixel-space + diffusion models, you can set both `predict_x0=True` and `thresholding=True` to use the dynamic thresholding. Note + that the thresholding method is unsuitable for latent-space diffusion models (such as stable-diffusion). + + [`~ConfigMixin`] takes care of storing all config attributes that are passed in the scheduler's `__init__` + function, such as `num_train_timesteps`. They can be accessed via `scheduler.config.num_train_timesteps`. + [`SchedulerMixin`] provides general loading and saving functionality via the [`SchedulerMixin.save_pretrained`] and + [`~SchedulerMixin.from_pretrained`] functions. + + Args: + num_train_timesteps (`int`): number of diffusion steps used to train the model. + beta_start (`float`): the starting `beta` value of inference. + beta_end (`float`): the final `beta` value. + beta_schedule (`str`): + the beta schedule, a mapping from a beta range to a sequence of betas for stepping the model. Choose from + `linear`, `scaled_linear`, or `squaredcos_cap_v2`. + trained_betas (`np.ndarray`, optional): + option to pass an array of betas directly to the constructor to bypass `beta_start`, `beta_end` etc. + solver_order (`int`, default `2`): + the order of UniPC, also the p in UniPC-p; can be any positive integer. Note that the effective order of + accuracy is `solver_order + 1` due to the UniC. We recommend to use `solver_order=2` for guided sampling, + and `solver_order=3` for unconditional sampling. + prediction_type (`str`, default `epsilon`, optional): + prediction type of the scheduler function, one of `epsilon` (predicting the noise of the diffusion + process), `sample` (directly predicting the noisy sample`) or `v_prediction` (see section 2.4 + https://imagen.research.google/video/paper.pdf) + thresholding (`bool`, default `False`): + whether to use the "dynamic thresholding" method (introduced by Imagen, https://arxiv.org/abs/2205.11487). + For pixel-space diffusion models, you can set both `predict_x0=True` and `thresholding=True` to use the + dynamic thresholding. 
Note that the thresholding method is unsuitable for latent-space diffusion models + (such as stable-diffusion). + dynamic_thresholding_ratio (`float`, default `0.995`): + the ratio for the dynamic thresholding method. Default is `0.995`, the same as Imagen + (https://arxiv.org/abs/2205.11487). + sample_max_value (`float`, default `1.0`): + the threshold value for dynamic thresholding. Valid only when `thresholding=True` and `predict_x0=True`. + predict_x0 (`bool`, default `True`): + whether to use the updating algrithm on the predicted x0. See https://arxiv.org/abs/2211.01095 for details + solver_type (`str`, default `bh2`): + the solver type of UniPC. We recommend use `bh1` for unconditional sampling when steps < 10, and use `bh2` + otherwise. + lower_order_final (`bool`, default `True`): + whether to use lower-order solvers in the final steps. Only valid for < 15 inference steps. We empirically + find this trick can stabilize the sampling of DPM-Solver for steps < 15, especially for steps <= 10. + disable_corrector (`list`, default `[]`): + decide which step to disable the corrector. For large guidance scale, the misalignment between the + `epsilon_theta(x_t, c)`and `epsilon_theta(x_t^c, c)` might influence the convergence. This can be mitigated + by disable the corrector at the first few steps (e.g., disable_corrector=[0]) + solver_p (`SchedulerMixin`, default `None`): + can be any other scheduler. If specified, the algorithm will become solver_p + UniC. + use_karras_sigmas (`bool`, *optional*, defaults to `False`): + This parameter controls whether to use Karras sigmas (Karras et al. (2022) scheme) for step sizes in the + noise schedule during the sampling process. If True, the sigmas will be determined according to a sequence + of noise levels {σi} as defined in Equation (5) of the paper https://arxiv.org/pdf/2206.00364.pdf. + timestep_spacing (`str`, default `"linspace"`): + The way the timesteps should be scaled. Refer to Table 2. of [Common Diffusion Noise Schedules and Sample + Steps are Flawed](https://arxiv.org/abs/2305.08891) for more information. + steps_offset (`int`, default `0`): + an offset added to the inference steps. You can use a combination of `offset=1` and + `set_alpha_to_one=False`, to make the last step use step 0 for the previous alpha product, as done in + stable diffusion. + """ + + _compatibles = [e.name for e in KarrasDiffusionSchedulers] + order = 1 + + @register_to_config + def __init__( + self, + num_train_timesteps: int = 1000, + beta_start: float = 0.0001, + beta_end: float = 0.02, + beta_schedule: str = "linear", + trained_betas: Optional[Union[np.ndarray, List[float]]] = None, + solver_order: int = 2, + prediction_type: str = "epsilon", + thresholding: bool = False, + dynamic_thresholding_ratio: float = 0.995, + sample_max_value: float = 1.0, + predict_x0: bool = True, + solver_type: str = "bh2", + lower_order_final: bool = True, + disable_corrector: List[int] = [], + solver_p: SchedulerMixin = None, + use_karras_sigmas: Optional[bool] = False, + timestep_spacing: str = "linspace", + steps_offset: int = 0, + ): + if trained_betas is not None: + self.betas = torch.tensor(trained_betas, dtype=torch.float32) + elif beta_schedule == "linear": + self.betas = torch.linspace(beta_start, beta_end, num_train_timesteps, dtype=torch.float32) + elif beta_schedule == "scaled_linear": + # this schedule is very specific to the latent diffusion model. 
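+            # In this branch the betas assigned just below are a linear ramp between beta_start**0.5 and
+            # beta_end**0.5 that is then squared element-wise, i.e.
+            # beta_i = (sqrt(beta_start) + i / (N - 1) * (sqrt(beta_end) - sqrt(beta_start))) ** 2,
+            # which is the schedule used by Stable Diffusion's latent diffusion models.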
+ self.betas = ( + torch.linspace(beta_start**0.5, beta_end**0.5, num_train_timesteps, dtype=torch.float32) ** 2 + ) + elif beta_schedule == "squaredcos_cap_v2": + # Glide cosine schedule + self.betas = betas_for_alpha_bar(num_train_timesteps) + else: + raise NotImplementedError(f"{beta_schedule} does is not implemented for {self.__class__}") + + self.alphas = 1.0 - self.betas + self.alphas_cumprod = torch.cumprod(self.alphas, dim=0) + # Currently we only support VP-type noise schedule + self.alpha_t = torch.sqrt(self.alphas_cumprod) + self.sigma_t = torch.sqrt(1 - self.alphas_cumprod) + self.lambda_t = torch.log(self.alpha_t) - torch.log(self.sigma_t) + + # standard deviation of the initial noise distribution + self.init_noise_sigma = 1.0 + + if solver_type not in ["bh1", "bh2"]: + if solver_type in ["midpoint", "heun", "logrho"]: + self.register_to_config(solver_type="bh2") + else: + raise NotImplementedError(f"{solver_type} does is not implemented for {self.__class__}") + + self.predict_x0 = predict_x0 + # setable values + self.num_inference_steps = None + timesteps = np.linspace(0, num_train_timesteps - 1, num_train_timesteps, dtype=np.float32)[::-1].copy() + self.timesteps = torch.from_numpy(timesteps) + self.model_outputs = [None] * solver_order + self.timestep_list = [None] * solver_order + self.lower_order_nums = 0 + self.disable_corrector = disable_corrector + self.solver_p = solver_p + self.last_sample = None + + def set_timesteps(self, num_inference_steps: int, device: Union[str, torch.device] = None): + """ + Sets the timesteps used for the diffusion chain. Supporting function to be run before inference. + + Args: + num_inference_steps (`int`): + the number of diffusion steps used when generating samples with a pre-trained model. + device (`str` or `torch.device`, optional): + the device to which the timesteps should be moved to. If `None`, the timesteps are not moved. + """ + # "linspace", "leading", "trailing" corresponds to annotation of Table 2. of https://arxiv.org/abs/2305.08891 + if self.config.timestep_spacing == "linspace": + timesteps = ( + np.linspace(0, self.config.num_train_timesteps - 1, num_inference_steps + 1) + .round()[::-1][:-1] + .copy() + .astype(np.int64) + ) + elif self.config.timestep_spacing == "leading": + step_ratio = self.config.num_train_timesteps // (num_inference_steps + 1) + # creates integer timesteps by multiplying by ratio + # casting to int to avoid issues when num_inference_step is power of 3 + timesteps = (np.arange(0, num_inference_steps + 1) * step_ratio).round()[::-1][:-1].copy().astype(np.int64) + timesteps += self.config.steps_offset + elif self.config.timestep_spacing == "trailing": + step_ratio = self.config.num_train_timesteps / num_inference_steps + # creates integer timesteps by multiplying by ratio + # casting to int to avoid issues when num_inference_step is power of 3 + timesteps = np.arange(self.config.num_train_timesteps, 0, -step_ratio).round().copy().astype(np.int64) + timesteps -= 1 + else: + raise ValueError( + f"{self.config.timestep_spacing} is not supported. Please make sure to choose one of 'linspace', 'leading' or 'trailing'." 
+ ) + + sigmas = np.array(((1 - self.alphas_cumprod) / self.alphas_cumprod) ** 0.5) + if self.config.use_karras_sigmas: + log_sigmas = np.log(sigmas) + sigmas = self._convert_to_karras(in_sigmas=sigmas, num_inference_steps=num_inference_steps) + timesteps = np.array([self._sigma_to_t(sigma, log_sigmas) for sigma in sigmas]).round() + timesteps = np.flip(timesteps).copy().astype(np.int64) + + self.sigmas = torch.from_numpy(sigmas) + + # when num_inference_steps == num_train_timesteps, we can end up with + # duplicates in timesteps. + _, unique_indices = np.unique(timesteps, return_index=True) + timesteps = timesteps[np.sort(unique_indices)] + + self.timesteps = torch.from_numpy(timesteps).to(device) + + self.num_inference_steps = len(timesteps) + + self.model_outputs = [ + None, + ] * self.config.solver_order + self.lower_order_nums = 0 + self.last_sample = None + if self.solver_p: + self.solver_p.set_timesteps(self.num_inference_steps, device=device) + + # Copied from diffusers.schedulers.scheduling_ddpm.DDPMScheduler._threshold_sample + def _threshold_sample(self, sample: torch.FloatTensor) -> torch.FloatTensor: + """ + "Dynamic thresholding: At each sampling step we set s to a certain percentile absolute pixel value in xt0 (the + prediction of x_0 at timestep t), and if s > 1, then we threshold xt0 to the range [-s, s] and then divide by + s. Dynamic thresholding pushes saturated pixels (those near -1 and 1) inwards, thereby actively preventing + pixels from saturation at each step. We find that dynamic thresholding results in significantly better + photorealism as well as better image-text alignment, especially when using very large guidance weights." + + https://arxiv.org/abs/2205.11487 + """ + dtype = sample.dtype + batch_size, channels, height, width = sample.shape + + if dtype not in (torch.float32, torch.float64): + sample = sample.float() # upcast for quantile calculation, and clamp not implemented for cpu half + + # Flatten sample for doing quantile calculation along each image + sample = sample.reshape(batch_size, channels * height * width) + + abs_sample = sample.abs() # "a certain percentile absolute pixel value" + + s = torch.quantile(abs_sample, self.config.dynamic_thresholding_ratio, dim=1) + s = torch.clamp( + s, min=1, max=self.config.sample_max_value + ) # When clamped to min=1, equivalent to standard clipping to [-1, 1] + + s = s.unsqueeze(1) # (batch_size, 1) because clamp will broadcast along dim=0 + sample = torch.clamp(sample, -s, s) / s # "we threshold xt0 to the range [-s, s] and then divide by s" + + sample = sample.reshape(batch_size, channels, height, width) + sample = sample.to(dtype) + + return sample + + def convert_model_output( + self, model_output: torch.FloatTensor, timestep: int, sample: torch.FloatTensor + ) -> torch.FloatTensor: + r""" + Convert the model output to the corresponding type that the algorithm PC needs. + + Args: + model_output (`torch.FloatTensor`): direct output from learned diffusion model. + timestep (`int`): current discrete timestep in the diffusion chain. + sample (`torch.FloatTensor`): + current instance of sample being created by diffusion process. + + Returns: + `torch.FloatTensor`: the converted model output. 
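+
+        With `predict_x0=True` the returned tensor is the predicted clean sample: for `epsilon`
+        prediction it is `(sample - sigma_t * model_output) / alpha_t`, for `v_prediction` it is
+        `alpha_t * sample - sigma_t * model_output`, and for `sample` prediction it is the model output
+        itself, optionally passed through dynamic thresholding. With `predict_x0=False` the
+        corresponding noise prediction `epsilon` is returned instead.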
+ """ + if self.predict_x0: + if self.config.prediction_type == "epsilon": + alpha_t, sigma_t = self.alpha_t[timestep], self.sigma_t[timestep] + x0_pred = (sample - sigma_t * model_output) / alpha_t + elif self.config.prediction_type == "sample": + x0_pred = model_output + elif self.config.prediction_type == "v_prediction": + alpha_t, sigma_t = self.alpha_t[timestep], self.sigma_t[timestep] + x0_pred = alpha_t * sample - sigma_t * model_output + else: + raise ValueError( + f"prediction_type given as {self.config.prediction_type} must be one of `epsilon`, `sample`, or" + " `v_prediction` for the UniPCMultistepScheduler." + ) + + if self.config.thresholding: + x0_pred = self._threshold_sample(x0_pred) + + return x0_pred + else: + if self.config.prediction_type == "epsilon": + return model_output + elif self.config.prediction_type == "sample": + alpha_t, sigma_t = self.alpha_t[timestep], self.sigma_t[timestep] + epsilon = (sample - alpha_t * model_output) / sigma_t + return epsilon + elif self.config.prediction_type == "v_prediction": + alpha_t, sigma_t = self.alpha_t[timestep], self.sigma_t[timestep] + epsilon = alpha_t * model_output + sigma_t * sample + return epsilon + else: + raise ValueError( + f"prediction_type given as {self.config.prediction_type} must be one of `epsilon`, `sample`, or" + " `v_prediction` for the UniPCMultistepScheduler." + ) + + def multistep_uni_p_bh_update( + self, + model_output: torch.FloatTensor, + prev_timestep: int, + sample: torch.FloatTensor, + order: int, + ) -> torch.FloatTensor: + """ + One step for the UniP (B(h) version). Alternatively, `self.solver_p` is used if is specified. + + Args: + model_output (`torch.FloatTensor`): + direct outputs from learned diffusion model at the current timestep. + prev_timestep (`int`): previous discrete timestep in the diffusion chain. + sample (`torch.FloatTensor`): + current instance of sample being created by diffusion process. + order (`int`): the order of UniP at this step, also the p in UniPC-p. + + Returns: + `torch.FloatTensor`: the sample tensor at the previous timestep. 
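+
+        In `predict_x0=True` mode the update uses the B(h) form of UniP: with `hh = -h`,
+        `h_phi_1 = expm1(hh)` and `D1_i = (m_i - m_0) / r_i`, the predicted sample is
+        `x_t = sigma_t / sigma_s0 * x - alpha_t * h_phi_1 * m_0 - alpha_t * B_h * sum_i(rho_i * D1_i)`,
+        where the weights `rho_i` come from a small linear system built from the ratios
+        `r_i = (lambda_{s_i} - lambda_{s_0}) / h` (with a fixed closed form for order 2). The
+        `predict_x0=False` branch is analogous with the roles of `alpha` and `sigma` exchanged and `hh = h`.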
+ """ + timestep_list = self.timestep_list + model_output_list = self.model_outputs + + s0, t = self.timestep_list[-1], prev_timestep + m0 = model_output_list[-1] + x = sample + + if self.solver_p: + x_t = self.solver_p.step(model_output, s0, x).prev_sample + return x_t + + lambda_t, lambda_s0 = self.lambda_t[t], self.lambda_t[s0] + alpha_t, alpha_s0 = self.alpha_t[t], self.alpha_t[s0] + sigma_t, sigma_s0 = self.sigma_t[t], self.sigma_t[s0] + + h = lambda_t - lambda_s0 + device = sample.device + + rks = [] + D1s = [] + for i in range(1, order): + si = timestep_list[-(i + 1)] + mi = model_output_list[-(i + 1)] + lambda_si = self.lambda_t[si] + rk = (lambda_si - lambda_s0) / h + rks.append(rk) + D1s.append((mi - m0) / rk) + + rks.append(1.0) + rks = torch.tensor(rks, device=device) + + R = [] + b = [] + + hh = -h if self.predict_x0 else h + h_phi_1 = torch.expm1(hh) # h\phi_1(h) = e^h - 1 + h_phi_k = h_phi_1 / hh - 1 + + factorial_i = 1 + + if self.config.solver_type == "bh1": + B_h = hh + elif self.config.solver_type == "bh2": + B_h = torch.expm1(hh) + else: + raise NotImplementedError() + + for i in range(1, order + 1): + R.append(torch.pow(rks, i - 1)) + b.append(h_phi_k * factorial_i / B_h) + factorial_i *= i + 1 + h_phi_k = h_phi_k / hh - 1 / factorial_i + + R = torch.stack(R) + b = torch.tensor(b, device=device) + + if len(D1s) > 0: + D1s = torch.stack(D1s, dim=1) # (B, K) + # for order 2, we use a simplified version + if order == 2: + rhos_p = torch.tensor([0.5], dtype=x.dtype, device=device) + else: + rhos_p = torch.linalg.solve(R[:-1, :-1], b[:-1]) + else: + D1s = None + + if self.predict_x0: + x_t_ = sigma_t / sigma_s0 * x - alpha_t * h_phi_1 * m0 + if D1s is not None: + pred_res = torch.einsum("k,bkchw->bchw", rhos_p, D1s) + else: + pred_res = 0 + x_t = x_t_ - alpha_t * B_h * pred_res + else: + x_t_ = alpha_t / alpha_s0 * x - sigma_t * h_phi_1 * m0 + if D1s is not None: + pred_res = torch.einsum("k,bkchw->bchw", rhos_p, D1s) + else: + pred_res = 0 + x_t = x_t_ - sigma_t * B_h * pred_res + + x_t = x_t.to(x.dtype) + return x_t + + def multistep_uni_c_bh_update( + self, + this_model_output: torch.FloatTensor, + this_timestep: int, + last_sample: torch.FloatTensor, + this_sample: torch.FloatTensor, + order: int, + ) -> torch.FloatTensor: + """ + One step for the UniC (B(h) version). + + Args: + this_model_output (`torch.FloatTensor`): the model outputs at `x_t` + this_timestep (`int`): the current timestep `t` + last_sample (`torch.FloatTensor`): the generated sample before the last predictor: `x_{t-1}` + this_sample (`torch.FloatTensor`): the generated sample after the last predictor: `x_{t}` + order (`int`): the `p` of UniC-p at this step. Note that the effective order of accuracy + should be order + 1 + + Returns: + `torch.FloatTensor`: the corrected sample tensor at the current timestep. 
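+
+        The corrector reuses the same B(h) machinery as the predictor but additionally folds in
+        `D1_t = this_model_output - m_0`, the difference between the model output evaluated at the
+        freshly predicted sample and the previous model output, which is what raises the effective
+        order of accuracy by one.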
+ """ + timestep_list = self.timestep_list + model_output_list = self.model_outputs + + s0, t = timestep_list[-1], this_timestep + m0 = model_output_list[-1] + x = last_sample + x_t = this_sample + model_t = this_model_output + + lambda_t, lambda_s0 = self.lambda_t[t], self.lambda_t[s0] + alpha_t, alpha_s0 = self.alpha_t[t], self.alpha_t[s0] + sigma_t, sigma_s0 = self.sigma_t[t], self.sigma_t[s0] + + h = lambda_t - lambda_s0 + device = this_sample.device + + rks = [] + D1s = [] + for i in range(1, order): + si = timestep_list[-(i + 1)] + mi = model_output_list[-(i + 1)] + lambda_si = self.lambda_t[si] + rk = (lambda_si - lambda_s0) / h + rks.append(rk) + D1s.append((mi - m0) / rk) + + rks.append(1.0) + rks = torch.tensor(rks, device=device) + + R = [] + b = [] + + hh = -h if self.predict_x0 else h + h_phi_1 = torch.expm1(hh) # h\phi_1(h) = e^h - 1 + h_phi_k = h_phi_1 / hh - 1 + + factorial_i = 1 + + if self.config.solver_type == "bh1": + B_h = hh + elif self.config.solver_type == "bh2": + B_h = torch.expm1(hh) + else: + raise NotImplementedError() + + for i in range(1, order + 1): + R.append(torch.pow(rks, i - 1)) + b.append(h_phi_k * factorial_i / B_h) + factorial_i *= i + 1 + h_phi_k = h_phi_k / hh - 1 / factorial_i + + R = torch.stack(R) + b = torch.tensor(b, device=device) + + if len(D1s) > 0: + D1s = torch.stack(D1s, dim=1) + else: + D1s = None + + # for order 1, we use a simplified version + if order == 1: + rhos_c = torch.tensor([0.5], dtype=x.dtype, device=device) + else: + rhos_c = torch.linalg.solve(R, b) + + if self.predict_x0: + x_t_ = sigma_t / sigma_s0 * x - alpha_t * h_phi_1 * m0 + if D1s is not None: + corr_res = torch.einsum("k,bkchw->bchw", rhos_c[:-1], D1s) + else: + corr_res = 0 + D1_t = model_t - m0 + x_t = x_t_ - alpha_t * B_h * (corr_res + rhos_c[-1] * D1_t) + else: + x_t_ = alpha_t / alpha_s0 * x - sigma_t * h_phi_1 * m0 + if D1s is not None: + corr_res = torch.einsum("k,bkchw->bchw", rhos_c[:-1], D1s) + else: + corr_res = 0 + D1_t = model_t - m0 + x_t = x_t_ - sigma_t * B_h * (corr_res + rhos_c[-1] * D1_t) + x_t = x_t.to(x.dtype) + return x_t + + def step( + self, + model_output: torch.FloatTensor, + timestep: int, + sample: torch.FloatTensor, + return_dict: bool = True, + ) -> Union[SchedulerOutput, Tuple]: + """ + Step function propagating the sample with the multistep UniPC. + + Args: + model_output (`torch.FloatTensor`): direct output from learned diffusion model. + timestep (`int`): current discrete timestep in the diffusion chain. + sample (`torch.FloatTensor`): + current instance of sample being created by diffusion process. + return_dict (`bool`): option for returning tuple rather than SchedulerOutput class + + Returns: + [`~scheduling_utils.SchedulerOutput`] or `tuple`: [`~scheduling_utils.SchedulerOutput`] if `return_dict` is + True, otherwise a `tuple`. When returning a tuple, the first element is the sample tensor. 
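+
+        A minimal sketch of the expected calling pattern (here `unet`, `latents` and
+        `num_inference_steps` are placeholders for the caller's denoising model and state):
+
+        ```py
+        scheduler.set_timesteps(num_inference_steps)
+        for t in scheduler.timesteps:
+            latent_input = scheduler.scale_model_input(latents, t)  # identity for this scheduler
+            model_output = unet(latent_input, t).sample
+            latents = scheduler.step(model_output, t, latents).prev_sample
+        ```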
+ + """ + + if self.num_inference_steps is None: + raise ValueError( + "Number of inference steps is 'None', you need to run 'set_timesteps' after creating the scheduler" + ) + + if isinstance(timestep, torch.Tensor): + timestep = timestep.to(self.timesteps.device) + step_index = (self.timesteps == timestep).nonzero() + if len(step_index) == 0: + step_index = len(self.timesteps) - 1 + else: + step_index = step_index.item() + + use_corrector = ( + step_index > 0 and step_index - 1 not in self.disable_corrector and self.last_sample is not None + ) + + model_output_convert = self.convert_model_output(model_output, timestep, sample) + if use_corrector: + sample = self.multistep_uni_c_bh_update( + this_model_output=model_output_convert, + this_timestep=timestep, + last_sample=self.last_sample, + this_sample=sample, + order=self.this_order, + ) + + # now prepare to run the predictor + prev_timestep = 0 if step_index == len(self.timesteps) - 1 else self.timesteps[step_index + 1] + + for i in range(self.config.solver_order - 1): + self.model_outputs[i] = self.model_outputs[i + 1] + self.timestep_list[i] = self.timestep_list[i + 1] + + self.model_outputs[-1] = model_output_convert + self.timestep_list[-1] = timestep + + if self.config.lower_order_final: + this_order = min(self.config.solver_order, len(self.timesteps) - step_index) + else: + this_order = self.config.solver_order + + self.this_order = min(this_order, self.lower_order_nums + 1) # warmup for multistep + assert self.this_order > 0 + + self.last_sample = sample + prev_sample = self.multistep_uni_p_bh_update( + model_output=model_output, # pass the original non-converted model output, in case solver-p is used + prev_timestep=prev_timestep, + sample=sample, + order=self.this_order, + ) + + if self.lower_order_nums < self.config.solver_order: + self.lower_order_nums += 1 + + if not return_dict: + return (prev_sample,) + + return SchedulerOutput(prev_sample=prev_sample) + + def scale_model_input(self, sample: torch.FloatTensor, *args, **kwargs) -> torch.FloatTensor: + """ + Ensures interchangeability with schedulers that need to scale the denoising model input depending on the + current timestep. 
+ + Args: + sample (`torch.FloatTensor`): input sample + + Returns: + `torch.FloatTensor`: scaled input sample + """ + return sample + + # Copied from diffusers.schedulers.scheduling_ddpm.DDPMScheduler.add_noise + def add_noise( + self, + original_samples: torch.FloatTensor, + noise: torch.FloatTensor, + timesteps: torch.IntTensor, + ) -> torch.FloatTensor: + # Make sure alphas_cumprod and timestep have same device and dtype as original_samples + alphas_cumprod = self.alphas_cumprod.to(device=original_samples.device, dtype=original_samples.dtype) + timesteps = timesteps.to(original_samples.device) + + sqrt_alpha_prod = alphas_cumprod[timesteps] ** 0.5 + sqrt_alpha_prod = sqrt_alpha_prod.flatten() + while len(sqrt_alpha_prod.shape) < len(original_samples.shape): + sqrt_alpha_prod = sqrt_alpha_prod.unsqueeze(-1) + + sqrt_one_minus_alpha_prod = (1 - alphas_cumprod[timesteps]) ** 0.5 + sqrt_one_minus_alpha_prod = sqrt_one_minus_alpha_prod.flatten() + while len(sqrt_one_minus_alpha_prod.shape) < len(original_samples.shape): + sqrt_one_minus_alpha_prod = sqrt_one_minus_alpha_prod.unsqueeze(-1) + + noisy_samples = sqrt_alpha_prod * original_samples + sqrt_one_minus_alpha_prod * noise + return noisy_samples + + def __len__(self): + return self.config.num_train_timesteps diff --git a/gf/sd-webui-rich-text/diffusers_official/schedulers/scheduling_utils.py b/gf/sd-webui-rich-text/diffusers_official/schedulers/scheduling_utils.py new file mode 100644 index 0000000000000000000000000000000000000000..0f95beb022ac042b6e1ef588a72365b2623338de --- /dev/null +++ b/gf/sd-webui-rich-text/diffusers_official/schedulers/scheduling_utils.py @@ -0,0 +1,177 @@ +# Copyright 2023 The HuggingFace Team. All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +import importlib +import os +from dataclasses import dataclass +from enum import Enum +from typing import Any, Dict, Optional, Union + +import torch + +from ..utils import BaseOutput + + +SCHEDULER_CONFIG_NAME = "scheduler_config.json" + + +# NOTE: We make this type an enum because it simplifies usage in docs and prevents +# circular imports when used for `_compatibles` within the schedulers module. +# When it's used as a type in pipelines, it really is a Union because the actual +# scheduler instance is passed in. +class KarrasDiffusionSchedulers(Enum): + DDIMScheduler = 1 + DDPMScheduler = 2 + PNDMScheduler = 3 + LMSDiscreteScheduler = 4 + EulerDiscreteScheduler = 5 + HeunDiscreteScheduler = 6 + EulerAncestralDiscreteScheduler = 7 + DPMSolverMultistepScheduler = 8 + DPMSolverSinglestepScheduler = 9 + KDPM2DiscreteScheduler = 10 + KDPM2AncestralDiscreteScheduler = 11 + DEISMultistepScheduler = 12 + UniPCMultistepScheduler = 13 + DPMSolverSDEScheduler = 14 + + +@dataclass +class SchedulerOutput(BaseOutput): + """ + Base class for the scheduler's step function output. + + Args: + prev_sample (`torch.FloatTensor` of shape `(batch_size, num_channels, height, width)` for images): + Computed sample (x_{t-1}) of previous timestep. 
`prev_sample` should be used as next model input in the + denoising loop. + """ + + prev_sample: torch.FloatTensor + + +class SchedulerMixin: + """ + Mixin containing common functions for the schedulers. + + Class attributes: + - **_compatibles** (`List[str]`) -- A list of classes that are compatible with the parent class, so that + `from_config` can be used from a class different than the one used to save the config (should be overridden + by parent class). + """ + + config_name = SCHEDULER_CONFIG_NAME + _compatibles = [] + has_compatibles = True + + @classmethod + def from_pretrained( + cls, + pretrained_model_name_or_path: Dict[str, Any] = None, + subfolder: Optional[str] = None, + return_unused_kwargs=False, + **kwargs, + ): + r""" + Instantiate a Scheduler class from a pre-defined JSON configuration file inside a directory or Hub repo. + + Parameters: + pretrained_model_name_or_path (`str` or `os.PathLike`, *optional*): + Can be either: + + - A string, the *model id* of a model repo on huggingface.co. Valid model ids should have an + organization name, like `google/ddpm-celebahq-256`. + - A path to a *directory* containing the schedluer configurations saved using + [`~SchedulerMixin.save_pretrained`], e.g., `./my_model_directory/`. + subfolder (`str`, *optional*): + In case the relevant files are located inside a subfolder of the model repo (either remote in + huggingface.co or downloaded locally), you can specify the folder name here. + return_unused_kwargs (`bool`, *optional*, defaults to `False`): + Whether kwargs that are not consumed by the Python class should be returned or not. + cache_dir (`Union[str, os.PathLike]`, *optional*): + Path to a directory in which a downloaded pretrained model configuration should be cached if the + standard cache should not be used. + force_download (`bool`, *optional*, defaults to `False`): + Whether or not to force the (re-)download of the model weights and configuration files, overriding the + cached versions if they exist. + resume_download (`bool`, *optional*, defaults to `False`): + Whether or not to delete incompletely received files. Will attempt to resume the download if such a + file exists. + proxies (`Dict[str, str]`, *optional*): + A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', + 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request. + output_loading_info(`bool`, *optional*, defaults to `False`): + Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages. + local_files_only(`bool`, *optional*, defaults to `False`): + Whether or not to only look at local files (i.e., do not try to download the model). + use_auth_token (`str` or *bool*, *optional*): + The token to use as HTTP bearer authorization for remote files. If `True`, will use the token generated + when running `transformers-cli login` (stored in `~/.huggingface`). + revision (`str`, *optional*, defaults to `"main"`): + The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a + git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any + identifier allowed by git. + + + + It is required to be logged in (`huggingface-cli login`) when you want to use private or [gated + models](https://huggingface.co/docs/hub/models-gated#gated-models). 
+ + + + + + Activate the special ["offline-mode"](https://huggingface.co/transformers/installation.html#offline-mode) to + use this method in a firewalled environment. + + + + """ + config, kwargs, commit_hash = cls.load_config( + pretrained_model_name_or_path=pretrained_model_name_or_path, + subfolder=subfolder, + return_unused_kwargs=True, + return_commit_hash=True, + **kwargs, + ) + return cls.from_config(config, return_unused_kwargs=return_unused_kwargs, **kwargs) + + def save_pretrained(self, save_directory: Union[str, os.PathLike], push_to_hub: bool = False, **kwargs): + """ + Save a scheduler configuration object to the directory `save_directory`, so that it can be re-loaded using the + [`~SchedulerMixin.from_pretrained`] class method. + + Args: + save_directory (`str` or `os.PathLike`): + Directory where the configuration JSON file will be saved (will be created if it does not exist). + """ + self.save_config(save_directory=save_directory, push_to_hub=push_to_hub, **kwargs) + + @property + def compatibles(self): + """ + Returns all schedulers that are compatible with this scheduler + + Returns: + `List[SchedulerMixin]`: List of compatible schedulers + """ + return self._get_compatibles() + + @classmethod + def _get_compatibles(cls): + compatible_classes_str = list(set([cls.__name__] + cls._compatibles)) + diffusers_library = importlib.import_module(__name__.split(".")[0]) + compatible_classes = [ + getattr(diffusers_library, c) for c in compatible_classes_str if hasattr(diffusers_library, c) + ] + return compatible_classes diff --git a/gf/sd-webui-rich-text/diffusers_official/schedulers/scheduling_utils_flax.py b/gf/sd-webui-rich-text/diffusers_official/schedulers/scheduling_utils_flax.py new file mode 100644 index 0000000000000000000000000000000000000000..19ce5b8360b9be5bb4b4ec46fbeac0715d6b5869 --- /dev/null +++ b/gf/sd-webui-rich-text/diffusers_official/schedulers/scheduling_utils_flax.py @@ -0,0 +1,284 @@ +# Copyright 2023 The HuggingFace Team. All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +import importlib +import math +import os +from dataclasses import dataclass +from enum import Enum +from typing import Any, Dict, Optional, Tuple, Union + +import flax +import jax.numpy as jnp + +from ..utils import BaseOutput + + +SCHEDULER_CONFIG_NAME = "scheduler_config.json" + + +# NOTE: We make this type an enum because it simplifies usage in docs and prevents +# circular imports when used for `_compatibles` within the schedulers module. +# When it's used as a type in pipelines, it really is a Union because the actual +# scheduler instance is passed in. +class FlaxKarrasDiffusionSchedulers(Enum): + FlaxDDIMScheduler = 1 + FlaxDDPMScheduler = 2 + FlaxPNDMScheduler = 3 + FlaxLMSDiscreteScheduler = 4 + FlaxDPMSolverMultistepScheduler = 5 + + +@dataclass +class FlaxSchedulerOutput(BaseOutput): + """ + Base class for the scheduler's step function output. 
+ + Args: + prev_sample (`jnp.ndarray` of shape `(batch_size, num_channels, height, width)` for images): + Computed sample (x_{t-1}) of previous timestep. `prev_sample` should be used as next model input in the + denoising loop. + """ + + prev_sample: jnp.ndarray + + +class FlaxSchedulerMixin: + """ + Mixin containing common functions for the schedulers. + + Class attributes: + - **_compatibles** (`List[str]`) -- A list of classes that are compatible with the parent class, so that + `from_config` can be used from a class different than the one used to save the config (should be overridden + by parent class). + """ + + config_name = SCHEDULER_CONFIG_NAME + ignore_for_config = ["dtype"] + _compatibles = [] + has_compatibles = True + + @classmethod + def from_pretrained( + cls, + pretrained_model_name_or_path: Dict[str, Any] = None, + subfolder: Optional[str] = None, + return_unused_kwargs=False, + **kwargs, + ): + r""" + Instantiate a Scheduler class from a pre-defined JSON-file. + + Parameters: + pretrained_model_name_or_path (`str` or `os.PathLike`, *optional*): + Can be either: + + - A string, the *model id* of a model repo on huggingface.co. Valid model ids should have an + organization name, like `google/ddpm-celebahq-256`. + - A path to a *directory* containing model weights saved using [`~SchedulerMixin.save_pretrained`], + e.g., `./my_model_directory/`. + subfolder (`str`, *optional*): + In case the relevant files are located inside a subfolder of the model repo (either remote in + huggingface.co or downloaded locally), you can specify the folder name here. + return_unused_kwargs (`bool`, *optional*, defaults to `False`): + Whether kwargs that are not consumed by the Python class should be returned or not. + + cache_dir (`Union[str, os.PathLike]`, *optional*): + Path to a directory in which a downloaded pretrained model configuration should be cached if the + standard cache should not be used. + force_download (`bool`, *optional*, defaults to `False`): + Whether or not to force the (re-)download of the model weights and configuration files, overriding the + cached versions if they exist. + resume_download (`bool`, *optional*, defaults to `False`): + Whether or not to delete incompletely received files. Will attempt to resume the download if such a + file exists. + proxies (`Dict[str, str]`, *optional*): + A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', + 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request. + output_loading_info(`bool`, *optional*, defaults to `False`): + Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages. + local_files_only(`bool`, *optional*, defaults to `False`): + Whether or not to only look at local files (i.e., do not try to download the model). + use_auth_token (`str` or *bool*, *optional*): + The token to use as HTTP bearer authorization for remote files. If `True`, will use the token generated + when running `transformers-cli login` (stored in `~/.huggingface`). + revision (`str`, *optional*, defaults to `"main"`): + The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a + git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any + identifier allowed by git. + + + + It is required to be logged in (`huggingface-cli login`) when you want to use private or [gated + models](https://huggingface.co/docs/hub/models-gated#gated-models). 
+ + + + + + Activate the special ["offline-mode"](https://huggingface.co/transformers/installation.html#offline-mode) to + use this method in a firewalled environment. + + + + """ + config, kwargs = cls.load_config( + pretrained_model_name_or_path=pretrained_model_name_or_path, + subfolder=subfolder, + return_unused_kwargs=True, + **kwargs, + ) + scheduler, unused_kwargs = cls.from_config(config, return_unused_kwargs=True, **kwargs) + + if hasattr(scheduler, "create_state") and getattr(scheduler, "has_state", False): + state = scheduler.create_state() + + if return_unused_kwargs: + return scheduler, state, unused_kwargs + + return scheduler, state + + def save_pretrained(self, save_directory: Union[str, os.PathLike], push_to_hub: bool = False, **kwargs): + """ + Save a scheduler configuration object to the directory `save_directory`, so that it can be re-loaded using the + [`~FlaxSchedulerMixin.from_pretrained`] class method. + + Args: + save_directory (`str` or `os.PathLike`): + Directory where the configuration JSON file will be saved (will be created if it does not exist). + """ + self.save_config(save_directory=save_directory, push_to_hub=push_to_hub, **kwargs) + + @property + def compatibles(self): + """ + Returns all schedulers that are compatible with this scheduler + + Returns: + `List[SchedulerMixin]`: List of compatible schedulers + """ + return self._get_compatibles() + + @classmethod + def _get_compatibles(cls): + compatible_classes_str = list(set([cls.__name__] + cls._compatibles)) + diffusers_library = importlib.import_module(__name__.split(".")[0]) + compatible_classes = [ + getattr(diffusers_library, c) for c in compatible_classes_str if hasattr(diffusers_library, c) + ] + return compatible_classes + + +def broadcast_to_shape_from_left(x: jnp.ndarray, shape: Tuple[int]) -> jnp.ndarray: + assert len(shape) >= x.ndim + return jnp.broadcast_to(x.reshape(x.shape + (1,) * (len(shape) - x.ndim)), shape) + + +def betas_for_alpha_bar(num_diffusion_timesteps: int, max_beta=0.999, dtype=jnp.float32) -> jnp.ndarray: + """ + Create a beta schedule that discretizes the given alpha_t_bar function, which defines the cumulative product of + (1-beta) over time from t = [0,1]. + + Contains a function alpha_bar that takes an argument t and transforms it to the cumulative product of (1-beta) up + to that part of the diffusion process. + + + Args: + num_diffusion_timesteps (`int`): the number of betas to produce. + max_beta (`float`): the maximum beta to use; use values lower than 1 to + prevent singularities. 
+ + Returns: + betas (`jnp.ndarray`): the betas used by the scheduler to step the model outputs + """ + + def alpha_bar(time_step): + return math.cos((time_step + 0.008) / 1.008 * math.pi / 2) ** 2 + + betas = [] + for i in range(num_diffusion_timesteps): + t1 = i / num_diffusion_timesteps + t2 = (i + 1) / num_diffusion_timesteps + betas.append(min(1 - alpha_bar(t2) / alpha_bar(t1), max_beta)) + return jnp.array(betas, dtype=dtype) + + +@flax.struct.dataclass +class CommonSchedulerState: + alphas: jnp.ndarray + betas: jnp.ndarray + alphas_cumprod: jnp.ndarray + + @classmethod + def create(cls, scheduler): + config = scheduler.config + + if config.trained_betas is not None: + betas = jnp.asarray(config.trained_betas, dtype=scheduler.dtype) + elif config.beta_schedule == "linear": + betas = jnp.linspace(config.beta_start, config.beta_end, config.num_train_timesteps, dtype=scheduler.dtype) + elif config.beta_schedule == "scaled_linear": + # this schedule is very specific to the latent diffusion model. + betas = ( + jnp.linspace( + config.beta_start**0.5, config.beta_end**0.5, config.num_train_timesteps, dtype=scheduler.dtype + ) + ** 2 + ) + elif config.beta_schedule == "squaredcos_cap_v2": + # Glide cosine schedule + betas = betas_for_alpha_bar(config.num_train_timesteps, dtype=scheduler.dtype) + else: + raise NotImplementedError( + f"beta_schedule {config.beta_schedule} is not implemented for scheduler {scheduler.__class__.__name__}" + ) + + alphas = 1.0 - betas + + alphas_cumprod = jnp.cumprod(alphas, axis=0) + + return cls( + alphas=alphas, + betas=betas, + alphas_cumprod=alphas_cumprod, + ) + + +def get_sqrt_alpha_prod( + state: CommonSchedulerState, original_samples: jnp.ndarray, noise: jnp.ndarray, timesteps: jnp.ndarray +): + alphas_cumprod = state.alphas_cumprod + + sqrt_alpha_prod = alphas_cumprod[timesteps] ** 0.5 + sqrt_alpha_prod = sqrt_alpha_prod.flatten() + sqrt_alpha_prod = broadcast_to_shape_from_left(sqrt_alpha_prod, original_samples.shape) + + sqrt_one_minus_alpha_prod = (1 - alphas_cumprod[timesteps]) ** 0.5 + sqrt_one_minus_alpha_prod = sqrt_one_minus_alpha_prod.flatten() + sqrt_one_minus_alpha_prod = broadcast_to_shape_from_left(sqrt_one_minus_alpha_prod, original_samples.shape) + + return sqrt_alpha_prod, sqrt_one_minus_alpha_prod + + +def add_noise_common( + state: CommonSchedulerState, original_samples: jnp.ndarray, noise: jnp.ndarray, timesteps: jnp.ndarray +): + sqrt_alpha_prod, sqrt_one_minus_alpha_prod = get_sqrt_alpha_prod(state, original_samples, noise, timesteps) + noisy_samples = sqrt_alpha_prod * original_samples + sqrt_one_minus_alpha_prod * noise + return noisy_samples + + +def get_velocity_common(state: CommonSchedulerState, sample: jnp.ndarray, noise: jnp.ndarray, timesteps: jnp.ndarray): + sqrt_alpha_prod, sqrt_one_minus_alpha_prod = get_sqrt_alpha_prod(state, sample, noise, timesteps) + velocity = sqrt_alpha_prod * noise - sqrt_one_minus_alpha_prod * sample + return velocity diff --git a/gf/sd-webui-rich-text/diffusers_official/schedulers/scheduling_vq_diffusion.py b/gf/sd-webui-rich-text/diffusers_official/schedulers/scheduling_vq_diffusion.py new file mode 100644 index 0000000000000000000000000000000000000000..b92722e4d462ca675bbf11230c1c39810de48b6e --- /dev/null +++ b/gf/sd-webui-rich-text/diffusers_official/schedulers/scheduling_vq_diffusion.py @@ -0,0 +1,496 @@ +# Copyright 2023 Microsoft and The HuggingFace Team. All rights reserved. 
+# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +from dataclasses import dataclass +from typing import Optional, Tuple, Union + +import numpy as np +import torch +import torch.nn.functional as F + +from ..configuration_utils import ConfigMixin, register_to_config +from ..utils import BaseOutput +from .scheduling_utils import SchedulerMixin + + +@dataclass +class VQDiffusionSchedulerOutput(BaseOutput): + """ + Output class for the scheduler's step function output. + + Args: + prev_sample (`torch.LongTensor` of shape `(batch size, num latent pixels)`): + Computed sample x_{t-1} of previous timestep. `prev_sample` should be used as next model input in the + denoising loop. + """ + + prev_sample: torch.LongTensor + + +def index_to_log_onehot(x: torch.LongTensor, num_classes: int) -> torch.FloatTensor: + """ + Convert batch of vector of class indices into batch of log onehot vectors + + Args: + x (`torch.LongTensor` of shape `(batch size, vector length)`): + Batch of class indices + + num_classes (`int`): + number of classes to be used for the onehot vectors + + Returns: + `torch.FloatTensor` of shape `(batch size, num classes, vector length)`: + Log onehot vectors + """ + x_onehot = F.one_hot(x, num_classes) + x_onehot = x_onehot.permute(0, 2, 1) + log_x = torch.log(x_onehot.float().clamp(min=1e-30)) + return log_x + + +def gumbel_noised(logits: torch.FloatTensor, generator: Optional[torch.Generator]) -> torch.FloatTensor: + """ + Apply gumbel noise to `logits` + """ + uniform = torch.rand(logits.shape, device=logits.device, generator=generator) + gumbel_noise = -torch.log(-torch.log(uniform + 1e-30) + 1e-30) + noised = gumbel_noise + logits + return noised + + +def alpha_schedules(num_diffusion_timesteps: int, alpha_cum_start=0.99999, alpha_cum_end=0.000009): + """ + Cumulative and non-cumulative alpha schedules. + + See section 4.1. + """ + att = ( + np.arange(0, num_diffusion_timesteps) / (num_diffusion_timesteps - 1) * (alpha_cum_end - alpha_cum_start) + + alpha_cum_start + ) + att = np.concatenate(([1], att)) + at = att[1:] / att[:-1] + att = np.concatenate((att[1:], [1])) + return at, att + + +def gamma_schedules(num_diffusion_timesteps: int, gamma_cum_start=0.000009, gamma_cum_end=0.99999): + """ + Cumulative and non-cumulative gamma schedules. + + See section 4.1. + """ + ctt = ( + np.arange(0, num_diffusion_timesteps) / (num_diffusion_timesteps - 1) * (gamma_cum_end - gamma_cum_start) + + gamma_cum_start + ) + ctt = np.concatenate(([0], ctt)) + one_minus_ctt = 1 - ctt + one_minus_ct = one_minus_ctt[1:] / one_minus_ctt[:-1] + ct = 1 - one_minus_ct + ctt = np.concatenate((ctt[1:], [0])) + return ct, ctt + + +class VQDiffusionScheduler(SchedulerMixin, ConfigMixin): + """ + The VQ-diffusion transformer outputs predicted probabilities of the initial unnoised image. + + The VQ-diffusion scheduler converts the transformer's output into a sample for the unnoised image at the previous + diffusion timestep. 
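+
+    Concretely, each call to `step` converts the model's log probabilities `log p(x_0 | x_t)` into
+    `log p(x_{t-1} | x_t)` via `q_posterior` and the forward transition matrices, perturbs the result
+    with Gumbel noise, and takes the per-pixel argmax as the previous sample.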
+ + [`~ConfigMixin`] takes care of storing all config attributes that are passed in the scheduler's `__init__` + function, such as `num_train_timesteps`. They can be accessed via `scheduler.config.num_train_timesteps`. + [`SchedulerMixin`] provides general loading and saving functionality via the [`SchedulerMixin.save_pretrained`] and + [`~SchedulerMixin.from_pretrained`] functions. + + For more details, see the original paper: https://arxiv.org/abs/2111.14822 + + Args: + num_vec_classes (`int`): + The number of classes of the vector embeddings of the latent pixels. Includes the class for the masked + latent pixel. + + num_train_timesteps (`int`): + Number of diffusion steps used to train the model. + + alpha_cum_start (`float`): + The starting cumulative alpha value. + + alpha_cum_end (`float`): + The ending cumulative alpha value. + + gamma_cum_start (`float`): + The starting cumulative gamma value. + + gamma_cum_end (`float`): + The ending cumulative gamma value. + """ + + order = 1 + + @register_to_config + def __init__( + self, + num_vec_classes: int, + num_train_timesteps: int = 100, + alpha_cum_start: float = 0.99999, + alpha_cum_end: float = 0.000009, + gamma_cum_start: float = 0.000009, + gamma_cum_end: float = 0.99999, + ): + self.num_embed = num_vec_classes + + # By convention, the index for the mask class is the last class index + self.mask_class = self.num_embed - 1 + + at, att = alpha_schedules(num_train_timesteps, alpha_cum_start=alpha_cum_start, alpha_cum_end=alpha_cum_end) + ct, ctt = gamma_schedules(num_train_timesteps, gamma_cum_start=gamma_cum_start, gamma_cum_end=gamma_cum_end) + + num_non_mask_classes = self.num_embed - 1 + bt = (1 - at - ct) / num_non_mask_classes + btt = (1 - att - ctt) / num_non_mask_classes + + at = torch.tensor(at.astype("float64")) + bt = torch.tensor(bt.astype("float64")) + ct = torch.tensor(ct.astype("float64")) + log_at = torch.log(at) + log_bt = torch.log(bt) + log_ct = torch.log(ct) + + att = torch.tensor(att.astype("float64")) + btt = torch.tensor(btt.astype("float64")) + ctt = torch.tensor(ctt.astype("float64")) + log_cumprod_at = torch.log(att) + log_cumprod_bt = torch.log(btt) + log_cumprod_ct = torch.log(ctt) + + self.log_at = log_at.float() + self.log_bt = log_bt.float() + self.log_ct = log_ct.float() + self.log_cumprod_at = log_cumprod_at.float() + self.log_cumprod_bt = log_cumprod_bt.float() + self.log_cumprod_ct = log_cumprod_ct.float() + + # setable values + self.num_inference_steps = None + self.timesteps = torch.from_numpy(np.arange(0, num_train_timesteps)[::-1].copy()) + + def set_timesteps(self, num_inference_steps: int, device: Union[str, torch.device] = None): + """ + Sets the discrete timesteps used for the diffusion chain. Supporting function to be run before inference. + + Args: + num_inference_steps (`int`): + the number of diffusion steps used when generating samples with a pre-trained model. + + device (`str` or `torch.device`): + device to place the timesteps and the diffusion process parameters (alpha, beta, gamma) on. 
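+
+        Besides building `self.timesteps`, this also moves the log transition parameters (`log_at`,
+        `log_bt`, `log_ct` and their cumulative counterparts) to the requested device.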
+ """ + self.num_inference_steps = num_inference_steps + timesteps = np.arange(0, self.num_inference_steps)[::-1].copy() + self.timesteps = torch.from_numpy(timesteps).to(device) + + self.log_at = self.log_at.to(device) + self.log_bt = self.log_bt.to(device) + self.log_ct = self.log_ct.to(device) + self.log_cumprod_at = self.log_cumprod_at.to(device) + self.log_cumprod_bt = self.log_cumprod_bt.to(device) + self.log_cumprod_ct = self.log_cumprod_ct.to(device) + + def step( + self, + model_output: torch.FloatTensor, + timestep: torch.long, + sample: torch.LongTensor, + generator: Optional[torch.Generator] = None, + return_dict: bool = True, + ) -> Union[VQDiffusionSchedulerOutput, Tuple]: + """ + Predict the sample at the previous timestep via the reverse transition distribution i.e. Equation (11). See the + docstring for `self.q_posterior` for more in depth docs on how Equation (11) is computed. + + Args: + log_p_x_0: (`torch.FloatTensor` of shape `(batch size, num classes - 1, num latent pixels)`): + The log probabilities for the predicted classes of the initial latent pixels. Does not include a + prediction for the masked class as the initial unnoised image cannot be masked. + + t (`torch.long`): + The timestep that determines which transition matrices are used. + + x_t: (`torch.LongTensor` of shape `(batch size, num latent pixels)`): + The classes of each latent pixel at time `t` + + generator: (`torch.Generator` or None): + RNG for the noise applied to p(x_{t-1} | x_t) before it is sampled from. + + return_dict (`bool`): + option for returning tuple rather than VQDiffusionSchedulerOutput class + + Returns: + [`~schedulers.scheduling_utils.VQDiffusionSchedulerOutput`] or `tuple`: + [`~schedulers.scheduling_utils.VQDiffusionSchedulerOutput`] if `return_dict` is True, otherwise a `tuple`. + When returning a tuple, the first element is the sample tensor. + """ + if timestep == 0: + log_p_x_t_min_1 = model_output + else: + log_p_x_t_min_1 = self.q_posterior(model_output, sample, timestep) + + log_p_x_t_min_1 = gumbel_noised(log_p_x_t_min_1, generator) + + x_t_min_1 = log_p_x_t_min_1.argmax(dim=1) + + if not return_dict: + return (x_t_min_1,) + + return VQDiffusionSchedulerOutput(prev_sample=x_t_min_1) + + def q_posterior(self, log_p_x_0, x_t, t): + """ + Calculates the log probabilities for the predicted classes of the image at timestep `t-1`. I.e. Equation (11). + + Instead of directly computing equation (11), we use Equation (5) to restate Equation (11) in terms of only + forward probabilities. + + Equation (11) stated in terms of forward probabilities via Equation (5): + + Where: + - the sum is over x_0 = {C_0 ... C_{k-1}} (classes for x_0) + + p(x_{t-1} | x_t) = sum( q(x_t | x_{t-1}) * q(x_{t-1} | x_0) * p(x_0) / q(x_t | x_0) ) + + Args: + log_p_x_0: (`torch.FloatTensor` of shape `(batch size, num classes - 1, num latent pixels)`): + The log probabilities for the predicted classes of the initial latent pixels. Does not include a + prediction for the masked class as the initial unnoised image cannot be masked. + + x_t: (`torch.LongTensor` of shape `(batch size, num latent pixels)`): + The classes of each latent pixel at time `t` + + t (torch.Long): + The timestep that determines which transition matrix is used. + + Returns: + `torch.FloatTensor` of shape `(batch size, num classes, num latent pixels)`: + The log probabilities for the predicted classes of the image at timestep `t-1`. I.e. Equation (11). 
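+
+        All of the arithmetic is carried out in log space (via `logaddexp` / `logsumexp`) so that the
+        products and ratios of transition probabilities stay numerically stable.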
+ """ + log_onehot_x_t = index_to_log_onehot(x_t, self.num_embed) + + log_q_x_t_given_x_0 = self.log_Q_t_transitioning_to_known_class( + t=t, x_t=x_t, log_onehot_x_t=log_onehot_x_t, cumulative=True + ) + + log_q_t_given_x_t_min_1 = self.log_Q_t_transitioning_to_known_class( + t=t, x_t=x_t, log_onehot_x_t=log_onehot_x_t, cumulative=False + ) + + # p_0(x_0=C_0 | x_t) / q(x_t | x_0=C_0) ... p_n(x_0=C_0 | x_t) / q(x_t | x_0=C_0) + # . . . + # . . . + # . . . + # p_0(x_0=C_{k-1} | x_t) / q(x_t | x_0=C_{k-1}) ... p_n(x_0=C_{k-1} | x_t) / q(x_t | x_0=C_{k-1}) + q = log_p_x_0 - log_q_x_t_given_x_0 + + # sum_0 = p_0(x_0=C_0 | x_t) / q(x_t | x_0=C_0) + ... + p_0(x_0=C_{k-1} | x_t) / q(x_t | x_0=C_{k-1}), ... , + # sum_n = p_n(x_0=C_0 | x_t) / q(x_t | x_0=C_0) + ... + p_n(x_0=C_{k-1} | x_t) / q(x_t | x_0=C_{k-1}) + q_log_sum_exp = torch.logsumexp(q, dim=1, keepdim=True) + + # p_0(x_0=C_0 | x_t) / q(x_t | x_0=C_0) / sum_0 ... p_n(x_0=C_0 | x_t) / q(x_t | x_0=C_0) / sum_n + # . . . + # . . . + # . . . + # p_0(x_0=C_{k-1} | x_t) / q(x_t | x_0=C_{k-1}) / sum_0 ... p_n(x_0=C_{k-1} | x_t) / q(x_t | x_0=C_{k-1}) / sum_n + q = q - q_log_sum_exp + + # (p_0(x_0=C_0 | x_t) / q(x_t | x_0=C_0) / sum_0) * a_cumulative_{t-1} + b_cumulative_{t-1} ... (p_n(x_0=C_0 | x_t) / q(x_t | x_0=C_0) / sum_n) * a_cumulative_{t-1} + b_cumulative_{t-1} + # . . . + # . . . + # . . . + # (p_0(x_0=C_{k-1} | x_t) / q(x_t | x_0=C_{k-1}) / sum_0) * a_cumulative_{t-1} + b_cumulative_{t-1} ... (p_n(x_0=C_{k-1} | x_t) / q(x_t | x_0=C_{k-1}) / sum_n) * a_cumulative_{t-1} + b_cumulative_{t-1} + # c_cumulative_{t-1} ... c_cumulative_{t-1} + q = self.apply_cumulative_transitions(q, t - 1) + + # ((p_0(x_0=C_0 | x_t) / q(x_t | x_0=C_0) / sum_0) * a_cumulative_{t-1} + b_cumulative_{t-1}) * q(x_t | x_{t-1}=C_0) * sum_0 ... ((p_n(x_0=C_0 | x_t) / q(x_t | x_0=C_0) / sum_n) * a_cumulative_{t-1} + b_cumulative_{t-1}) * q(x_t | x_{t-1}=C_0) * sum_n + # . . . + # . . . + # . . . + # ((p_0(x_0=C_{k-1} | x_t) / q(x_t | x_0=C_{k-1}) / sum_0) * a_cumulative_{t-1} + b_cumulative_{t-1}) * q(x_t | x_{t-1}=C_{k-1}) * sum_0 ... ((p_n(x_0=C_{k-1} | x_t) / q(x_t | x_0=C_{k-1}) / sum_n) * a_cumulative_{t-1} + b_cumulative_{t-1}) * q(x_t | x_{t-1}=C_{k-1}) * sum_n + # c_cumulative_{t-1} * q(x_t | x_{t-1}=C_k) * sum_0 ... c_cumulative_{t-1} * q(x_t | x_{t-1}=C_k) * sum_0 + log_p_x_t_min_1 = q + log_q_t_given_x_t_min_1 + q_log_sum_exp + + # For each column, there are two possible cases. + # + # Where: + # - sum(p_n(x_0))) is summing over all classes for x_0 + # - C_i is the class transitioning from (not to be confused with c_t and c_cumulative_t being used for gamma's) + # - C_j is the class transitioning to + # + # 1. x_t is masked i.e. x_t = c_k + # + # Simplifying the expression, the column vector is: + # . + # . + # . + # (c_t / c_cumulative_t) * (a_cumulative_{t-1} * p_n(x_0 = C_i | x_t) + b_cumulative_{t-1} * sum(p_n(x_0))) + # . + # . + # . + # (c_cumulative_{t-1} / c_cumulative_t) * sum(p_n(x_0)) + # + # From equation (11) stated in terms of forward probabilities, the last row is trivially verified. + # + # For the other rows, we can state the equation as ... + # + # (c_t / c_cumulative_t) * [b_cumulative_{t-1} * p(x_0=c_0) + ... + (a_cumulative_{t-1} + b_cumulative_{t-1}) * p(x_0=C_i) + ... + b_cumulative_{k-1} * p(x_0=c_{k-1})] + # + # This verifies the other rows. + # + # 2. x_t is not masked + # + # Simplifying the expression, there are two cases for the rows of the column vector, where C_j = C_i and where C_j != C_i: + # . + # . + # . 
+ # C_j != C_i: b_t * ((b_cumulative_{t-1} / b_cumulative_t) * p_n(x_0 = c_0) + ... + ((a_cumulative_{t-1} + b_cumulative_{t-1}) / b_cumulative_t) * p_n(x_0 = C_i) + ... + (b_cumulative_{t-1} / (a_cumulative_t + b_cumulative_t)) * p_n(c_0=C_j) + ... + (b_cumulative_{t-1} / b_cumulative_t) * p_n(x_0 = c_{k-1})) + # . + # . + # . + # C_j = C_i: (a_t + b_t) * ((b_cumulative_{t-1} / b_cumulative_t) * p_n(x_0 = c_0) + ... + ((a_cumulative_{t-1} + b_cumulative_{t-1}) / (a_cumulative_t + b_cumulative_t)) * p_n(x_0 = C_i = C_j) + ... + (b_cumulative_{t-1} / b_cumulative_t) * p_n(x_0 = c_{k-1})) + # . + # . + # . + # 0 + # + # The last row is trivially verified. The other rows can be verified by directly expanding equation (11) stated in terms of forward probabilities. + return log_p_x_t_min_1 + + def log_Q_t_transitioning_to_known_class( + self, *, t: torch.int, x_t: torch.LongTensor, log_onehot_x_t: torch.FloatTensor, cumulative: bool + ): + """ + Returns the log probabilities of the rows from the (cumulative or non-cumulative) transition matrix for each + latent pixel in `x_t`. + + See equation (7) for the complete non-cumulative transition matrix. The complete cumulative transition matrix + is the same structure except the parameters (alpha, beta, gamma) are the cumulative analogs. + + Args: + t (torch.Long): + The timestep that determines which transition matrix is used. + + x_t (`torch.LongTensor` of shape `(batch size, num latent pixels)`): + The classes of each latent pixel at time `t`. + + log_onehot_x_t (`torch.FloatTensor` of shape `(batch size, num classes, num latent pixels)`): + The log one-hot vectors of `x_t` + + cumulative (`bool`): + If cumulative is `False`, we use the single step transition matrix `t-1`->`t`. If cumulative is `True`, + we use the cumulative transition matrix `0`->`t`. + + Returns: + `torch.FloatTensor` of shape `(batch size, num classes - 1, num latent pixels)`: + Each _column_ of the returned matrix is a _row_ of log probabilities of the complete probability + transition matrix. + + When non cumulative, returns `self.num_classes - 1` rows because the initial latent pixel cannot be + masked. + + Where: + - `q_n` is the probability distribution for the forward process of the `n`th latent pixel. + - C_0 is a class of a latent pixel embedding + - C_k is the class of the masked latent pixel + + non-cumulative result (omitting logarithms): + ``` + q_0(x_t | x_{t-1} = C_0) ... q_n(x_t | x_{t-1} = C_0) + . . . + . . . + . . . + q_0(x_t | x_{t-1} = C_k) ... q_n(x_t | x_{t-1} = C_k) + ``` + + cumulative result (omitting logarithms): + ``` + q_0_cumulative(x_t | x_0 = C_0) ... q_n_cumulative(x_t | x_0 = C_0) + . . . + . . . + . . . + q_0_cumulative(x_t | x_0 = C_{k-1}) ... q_n_cumulative(x_t | x_0 = C_{k-1}) + ``` + """ + if cumulative: + a = self.log_cumprod_at[t] + b = self.log_cumprod_bt[t] + c = self.log_cumprod_ct[t] + else: + a = self.log_at[t] + b = self.log_bt[t] + c = self.log_ct[t] + + if not cumulative: + # The values in the onehot vector can also be used as the logprobs for transitioning + # from masked latent pixels. If we are not calculating the cumulative transitions, + # we need to save these vectors to be re-appended to the final matrix so the values + # aren't overwritten. 
+ # + # `P(x_t!=mask|x_{t-1=mask}) = 0` and 0 will be the value of the last row of the onehot vector + # if x_t is not masked + # + # `P(x_t=mask|x_{t-1=mask}) = 1` and 1 will be the value of the last row of the onehot vector + # if x_t is masked + log_onehot_x_t_transitioning_from_masked = log_onehot_x_t[:, -1, :].unsqueeze(1) + + # `index_to_log_onehot` will add onehot vectors for masked pixels, + # so the default one hot matrix has one too many rows. See the doc string + # for an explanation of the dimensionality of the returned matrix. + log_onehot_x_t = log_onehot_x_t[:, :-1, :] + + # this is a cheeky trick to produce the transition probabilities using log one-hot vectors. + # + # Don't worry about what values this sets in the columns that mark transitions + # to masked latent pixels. They are overwrote later with the `mask_class_mask`. + # + # Looking at the below logspace formula in non-logspace, each value will evaluate to either + # `1 * a + b = a + b` where `log_Q_t` has the one hot value in the column + # or + # `0 * a + b = b` where `log_Q_t` has the 0 values in the column. + # + # See equation 7 for more details. + log_Q_t = (log_onehot_x_t + a).logaddexp(b) + + # The whole column of each masked pixel is `c` + mask_class_mask = x_t == self.mask_class + mask_class_mask = mask_class_mask.unsqueeze(1).expand(-1, self.num_embed - 1, -1) + log_Q_t[mask_class_mask] = c + + if not cumulative: + log_Q_t = torch.cat((log_Q_t, log_onehot_x_t_transitioning_from_masked), dim=1) + + return log_Q_t + + def apply_cumulative_transitions(self, q, t): + bsz = q.shape[0] + a = self.log_cumprod_at[t] + b = self.log_cumprod_bt[t] + c = self.log_cumprod_ct[t] + + num_latent_pixels = q.shape[2] + c = c.expand(bsz, 1, num_latent_pixels) + + q = (q + a).logaddexp(b) + q = torch.cat((q, c), dim=1) + + return q diff --git a/gf/sd-webui-rich-text/diffusers_official/training_utils.py b/gf/sd-webui-rich-text/diffusers_official/training_utils.py new file mode 100644 index 0000000000000000000000000000000000000000..eaa9ed64554bf8830e35efd220a77bd2de207f18 --- /dev/null +++ b/gf/sd-webui-rich-text/diffusers_official/training_utils.py @@ -0,0 +1,314 @@ +import contextlib +import copy +import random +from typing import Any, Dict, Iterable, Optional, Union + +import numpy as np +import torch + +from .utils import deprecate, is_transformers_available + + +if is_transformers_available(): + import transformers + + +def set_seed(seed: int): + """ + Args: + Helper function for reproducible behavior to set the seed in `random`, `numpy`, `torch`. + seed (`int`): The seed to set. + """ + random.seed(seed) + np.random.seed(seed) + torch.manual_seed(seed) + torch.cuda.manual_seed_all(seed) + # ^^ safe to call this function even if cuda is not available + + +# Adapted from torch-ema https://github.com/fadel/pytorch_ema/blob/master/torch_ema/ema.py#L14 +class EMAModel: + """ + Exponential Moving Average of models weights + """ + + def __init__( + self, + parameters: Iterable[torch.nn.Parameter], + decay: float = 0.9999, + min_decay: float = 0.0, + update_after_step: int = 0, + use_ema_warmup: bool = False, + inv_gamma: Union[float, int] = 1.0, + power: Union[float, int] = 2 / 3, + model_cls: Optional[Any] = None, + model_config: Dict[str, Any] = None, + **kwargs, + ): + """ + Args: + parameters (Iterable[torch.nn.Parameter]): The parameters to track. + decay (float): The decay factor for the exponential moving average. + min_decay (float): The minimum decay factor for the exponential moving average. 
+ update_after_step (int): The number of steps to wait before starting to update the EMA weights. + use_ema_warmup (bool): Whether to use EMA warmup. + inv_gamma (float): + Inverse multiplicative factor of EMA warmup. Default: 1. Only used if `use_ema_warmup` is True. + power (float): Exponential factor of EMA warmup. Default: 2/3. Only used if `use_ema_warmup` is True. + device (Optional[Union[str, torch.device]]): The device to store the EMA weights on. If None, the EMA + weights will be stored on CPU. + + @crowsonkb's notes on EMA Warmup: + If gamma=1 and power=1, implements a simple average. gamma=1, power=2/3 are good values for models you plan + to train for a million or more steps (reaches decay factor 0.999 at 31.6K steps, 0.9999 at 1M steps), + gamma=1, power=3/4 for models you plan to train for less (reaches decay factor 0.999 at 10K steps, 0.9999 + at 215.4k steps). + """ + + if isinstance(parameters, torch.nn.Module): + deprecation_message = ( + "Passing a `torch.nn.Module` to `ExponentialMovingAverage` is deprecated. " + "Please pass the parameters of the module instead." + ) + deprecate( + "passing a `torch.nn.Module` to `ExponentialMovingAverage`", + "1.0.0", + deprecation_message, + standard_warn=False, + ) + parameters = parameters.parameters() + + # set use_ema_warmup to True if a torch.nn.Module is passed for backwards compatibility + use_ema_warmup = True + + if kwargs.get("max_value", None) is not None: + deprecation_message = "The `max_value` argument is deprecated. Please use `decay` instead." + deprecate("max_value", "1.0.0", deprecation_message, standard_warn=False) + decay = kwargs["max_value"] + + if kwargs.get("min_value", None) is not None: + deprecation_message = "The `min_value` argument is deprecated. Please use `min_decay` instead." + deprecate("min_value", "1.0.0", deprecation_message, standard_warn=False) + min_decay = kwargs["min_value"] + + parameters = list(parameters) + self.shadow_params = [p.clone().detach() for p in parameters] + + if kwargs.get("device", None) is not None: + deprecation_message = "The `device` argument is deprecated. Please use `to` instead." 
+ deprecate("device", "1.0.0", deprecation_message, standard_warn=False) + self.to(device=kwargs["device"]) + + self.temp_stored_params = None + + self.decay = decay + self.min_decay = min_decay + self.update_after_step = update_after_step + self.use_ema_warmup = use_ema_warmup + self.inv_gamma = inv_gamma + self.power = power + self.optimization_step = 0 + self.cur_decay_value = None # set in `step()` + + self.model_cls = model_cls + self.model_config = model_config + + @classmethod + def from_pretrained(cls, path, model_cls) -> "EMAModel": + _, ema_kwargs = model_cls.load_config(path, return_unused_kwargs=True) + model = model_cls.from_pretrained(path) + + ema_model = cls(model.parameters(), model_cls=model_cls, model_config=model.config) + + ema_model.load_state_dict(ema_kwargs) + return ema_model + + def save_pretrained(self, path): + if self.model_cls is None: + raise ValueError("`save_pretrained` can only be used if `model_cls` was defined at __init__.") + + if self.model_config is None: + raise ValueError("`save_pretrained` can only be used if `model_config` was defined at __init__.") + + model = self.model_cls.from_config(self.model_config) + state_dict = self.state_dict() + state_dict.pop("shadow_params", None) + + model.register_to_config(**state_dict) + self.copy_to(model.parameters()) + model.save_pretrained(path) + + def get_decay(self, optimization_step: int) -> float: + """ + Compute the decay factor for the exponential moving average. + """ + step = max(0, optimization_step - self.update_after_step - 1) + + if step <= 0: + return 0.0 + + if self.use_ema_warmup: + cur_decay_value = 1 - (1 + step / self.inv_gamma) ** -self.power + else: + cur_decay_value = (1 + step) / (10 + step) + + cur_decay_value = min(cur_decay_value, self.decay) + # make sure decay is not smaller than min_decay + cur_decay_value = max(cur_decay_value, self.min_decay) + return cur_decay_value + + @torch.no_grad() + def step(self, parameters: Iterable[torch.nn.Parameter]): + if isinstance(parameters, torch.nn.Module): + deprecation_message = ( + "Passing a `torch.nn.Module` to `ExponentialMovingAverage.step` is deprecated. " + "Please pass the parameters of the module instead." + ) + deprecate( + "passing a `torch.nn.Module` to `ExponentialMovingAverage.step`", + "1.0.0", + deprecation_message, + standard_warn=False, + ) + parameters = parameters.parameters() + + parameters = list(parameters) + + self.optimization_step += 1 + + # Compute the decay factor for the exponential moving average. + decay = self.get_decay(self.optimization_step) + self.cur_decay_value = decay + one_minus_decay = 1 - decay + + context_manager = contextlib.nullcontext + if is_transformers_available() and transformers.deepspeed.is_deepspeed_zero3_enabled(): + import deepspeed + + for s_param, param in zip(self.shadow_params, parameters): + if is_transformers_available() and transformers.deepspeed.is_deepspeed_zero3_enabled(): + context_manager = deepspeed.zero.GatheredParameters(param, modifier_rank=None) + + with context_manager(): + if param.requires_grad: + s_param.sub_(one_minus_decay * (s_param - param)) + else: + s_param.copy_(param) + + def copy_to(self, parameters: Iterable[torch.nn.Parameter]) -> None: + """ + Copy current averaged parameters into given collection of parameters. + + Args: + parameters: Iterable of `torch.nn.Parameter`; the parameters to be + updated with the stored moving averages. If `None`, the parameters with which this + `ExponentialMovingAverage` was initialized will be used. 
+        """
+        parameters = list(parameters)
+        for s_param, param in zip(self.shadow_params, parameters):
+            param.data.copy_(s_param.to(param.device).data)
+
+    def to(self, device=None, dtype=None) -> None:
+        r"""Move internal buffers of the ExponentialMovingAverage to `device`.
+
+        Args:
+            device: like `device` argument to `torch.Tensor.to`
+        """
+        # .to() on the tensors handles None correctly
+        self.shadow_params = [
+            p.to(device=device, dtype=dtype) if p.is_floating_point() else p.to(device=device)
+            for p in self.shadow_params
+        ]
+
+    def state_dict(self) -> dict:
+        r"""
+        Returns the state of the ExponentialMovingAverage as a dict. This method is used by accelerate during
+        checkpointing to save the ema state dict.
+        """
+        # Following PyTorch conventions, references to tensors are returned:
+        # "returns a reference to the state and not its copy!" -
+        # https://pytorch.org/tutorials/beginner/saving_loading_models.html#what-is-a-state-dict
+        return {
+            "decay": self.decay,
+            "min_decay": self.min_decay,
+            "optimization_step": self.optimization_step,
+            "update_after_step": self.update_after_step,
+            "use_ema_warmup": self.use_ema_warmup,
+            "inv_gamma": self.inv_gamma,
+            "power": self.power,
+            "shadow_params": self.shadow_params,
+        }
+
+    def store(self, parameters: Iterable[torch.nn.Parameter]) -> None:
+        r"""
+        Save the current parameters for restoring later.
+        Args:
+            parameters: Iterable of `torch.nn.Parameter`; the parameters to be
+                temporarily stored.
+        """
+        self.temp_stored_params = [param.detach().cpu().clone() for param in parameters]
+
+    def restore(self, parameters: Iterable[torch.nn.Parameter]) -> None:
+        r"""
+        Restore the parameters stored with the `store` method. Useful to validate the model with EMA parameters
+        without affecting the original optimization process. Store the parameters before the `copy_to()` method. After
+        validation (or model saving), use this to restore the former parameters.
+        Args:
+            parameters: Iterable of `torch.nn.Parameter`; the parameters to be
+                updated with the stored parameters. If `None`, the parameters with which this
+                `ExponentialMovingAverage` was initialized will be used.
+        """
+        if self.temp_stored_params is None:
+            raise RuntimeError("This ExponentialMovingAverage has no `store()`ed weights " "to `restore()`")
+        for c_param, param in zip(self.temp_stored_params, parameters):
+            param.data.copy_(c_param.data)
+
+        # Better memory-wise.
+        self.temp_stored_params = None
+
+    def load_state_dict(self, state_dict: dict) -> None:
+        r"""
+        Loads the ExponentialMovingAverage state. This method is used by accelerate during checkpointing to load the
+        ema state dict.
+        Args:
+            state_dict (dict): EMA state. Should be an object returned
+                from a call to :meth:`state_dict`.
+ """ + # deepcopy, to be consistent with module API + state_dict = copy.deepcopy(state_dict) + + self.decay = state_dict.get("decay", self.decay) + if self.decay < 0.0 or self.decay > 1.0: + raise ValueError("Decay must be between 0 and 1") + + self.min_decay = state_dict.get("min_decay", self.min_decay) + if not isinstance(self.min_decay, float): + raise ValueError("Invalid min_decay") + + self.optimization_step = state_dict.get("optimization_step", self.optimization_step) + if not isinstance(self.optimization_step, int): + raise ValueError("Invalid optimization_step") + + self.update_after_step = state_dict.get("update_after_step", self.update_after_step) + if not isinstance(self.update_after_step, int): + raise ValueError("Invalid update_after_step") + + self.use_ema_warmup = state_dict.get("use_ema_warmup", self.use_ema_warmup) + if not isinstance(self.use_ema_warmup, bool): + raise ValueError("Invalid use_ema_warmup") + + self.inv_gamma = state_dict.get("inv_gamma", self.inv_gamma) + if not isinstance(self.inv_gamma, (float, int)): + raise ValueError("Invalid inv_gamma") + + self.power = state_dict.get("power", self.power) + if not isinstance(self.power, (float, int)): + raise ValueError("Invalid power") + + shadow_params = state_dict.get("shadow_params", None) + if shadow_params is not None: + self.shadow_params = shadow_params + if not isinstance(self.shadow_params, list): + raise ValueError("shadow_params must be a list") + if not all(isinstance(p, torch.Tensor) for p in self.shadow_params): + raise ValueError("shadow_params must all be Tensors") diff --git a/gf/sd-webui-rich-text/diffusers_official/utils/__init__.py b/gf/sd-webui-rich-text/diffusers_official/utils/__init__.py new file mode 100644 index 0000000000000000000000000000000000000000..7449df99ba809c0055159a7592bb269121c3950c --- /dev/null +++ b/gf/sd-webui-rich-text/diffusers_official/utils/__init__.py @@ -0,0 +1,123 @@ +# Copyright 2023 The HuggingFace Inc. team. All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + + +import os + +from packaging import version + +from .. 
import __version__ +from .accelerate_utils import apply_forward_hook +from .constants import ( + CONFIG_NAME, + DEPRECATED_REVISION_ARGS, + DIFFUSERS_CACHE, + DIFFUSERS_DYNAMIC_MODULE_NAME, + FLAX_WEIGHTS_NAME, + HF_MODULES_CACHE, + HUGGINGFACE_CO_RESOLVE_ENDPOINT, + ONNX_EXTERNAL_WEIGHTS_NAME, + ONNX_WEIGHTS_NAME, + SAFETENSORS_WEIGHTS_NAME, + TEXT_ENCODER_ATTN_MODULE, + WEIGHTS_NAME, +) +from .deprecation_utils import deprecate +from .doc_utils import replace_example_docstring +from .dynamic_modules_utils import get_class_from_dynamic_module +from .hub_utils import ( + HF_HUB_OFFLINE, + _add_variant, + _get_model_file, + extract_commit_hash, + http_user_agent, +) +from .import_utils import ( + BACKENDS_MAPPING, + ENV_VARS_TRUE_AND_AUTO_VALUES, + ENV_VARS_TRUE_VALUES, + USE_JAX, + USE_TF, + USE_TORCH, + DummyObject, + OptionalDependencyNotAvailable, + is_accelerate_available, + is_accelerate_version, + is_bs4_available, + is_flax_available, + is_ftfy_available, + is_inflect_available, + is_invisible_watermark_available, + is_k_diffusion_available, + is_k_diffusion_version, + is_librosa_available, + is_note_seq_available, + is_omegaconf_available, + is_onnx_available, + is_safetensors_available, + is_scipy_available, + is_tensorboard_available, + is_tf_available, + is_torch_available, + is_torch_version, + is_torchsde_available, + is_transformers_available, + is_transformers_version, + is_unidecode_available, + is_wandb_available, + is_xformers_available, + requires_backends, +) +from .logging import get_logger +from .outputs import BaseOutput +from .pil_utils import PIL_INTERPOLATION, numpy_to_pil, pt_to_pil +from .torch_utils import is_compiled_module, randn_tensor + + +if is_torch_available(): + from .testing_utils import ( + floats_tensor, + load_hf_numpy, + load_image, + load_numpy, + load_pt, + nightly, + parse_flag_from_env, + print_tensor_test, + require_torch_2, + require_torch_gpu, + skip_mps, + slow, + torch_all_close, + torch_device, + ) + from .torch_utils import maybe_allow_in_graph + +from .testing_utils import export_to_gif, export_to_video + + +logger = get_logger(__name__) + + +def check_min_version(min_version): + if version.parse(__version__) < version.parse(min_version): + if "dev" in min_version: + error_message = ( + "This example requires a source install from HuggingFace diffusers (see " + "`https://huggingface.co/docs/diffusers/installation#install-from-source`)," + ) + else: + error_message = f"This example requires a minimum version of {min_version}," + error_message += f" but the version found is {__version__}.\n" + raise ImportError(error_message) diff --git a/gf/sd-webui-rich-text/diffusers_official/utils/__pycache__/__init__.cpython-310.pyc b/gf/sd-webui-rich-text/diffusers_official/utils/__pycache__/__init__.cpython-310.pyc new file mode 100644 index 0000000000000000000000000000000000000000..04c6f7bfcc531e3f1106cdde60ff9318cbcb8755 Binary files /dev/null and b/gf/sd-webui-rich-text/diffusers_official/utils/__pycache__/__init__.cpython-310.pyc differ diff --git a/gf/sd-webui-rich-text/diffusers_official/utils/__pycache__/accelerate_utils.cpython-310.pyc b/gf/sd-webui-rich-text/diffusers_official/utils/__pycache__/accelerate_utils.cpython-310.pyc new file mode 100644 index 0000000000000000000000000000000000000000..e3692f36990bd84e391ce4ce1b775d3cfe6a64a8 Binary files /dev/null and b/gf/sd-webui-rich-text/diffusers_official/utils/__pycache__/accelerate_utils.cpython-310.pyc differ diff --git 
a/gf/sd-webui-rich-text/diffusers_official/utils/__pycache__/constants.cpython-310.pyc b/gf/sd-webui-rich-text/diffusers_official/utils/__pycache__/constants.cpython-310.pyc new file mode 100644 index 0000000000000000000000000000000000000000..6e766499f9178818a2dcedd839e1e34ecf1e2912 Binary files /dev/null and b/gf/sd-webui-rich-text/diffusers_official/utils/__pycache__/constants.cpython-310.pyc differ diff --git a/gf/sd-webui-rich-text/diffusers_official/utils/__pycache__/deprecation_utils.cpython-310.pyc b/gf/sd-webui-rich-text/diffusers_official/utils/__pycache__/deprecation_utils.cpython-310.pyc new file mode 100644 index 0000000000000000000000000000000000000000..18e251ef9e0303db24999594f5756b48a7ffd732 Binary files /dev/null and b/gf/sd-webui-rich-text/diffusers_official/utils/__pycache__/deprecation_utils.cpython-310.pyc differ diff --git a/gf/sd-webui-rich-text/diffusers_official/utils/__pycache__/doc_utils.cpython-310.pyc b/gf/sd-webui-rich-text/diffusers_official/utils/__pycache__/doc_utils.cpython-310.pyc new file mode 100644 index 0000000000000000000000000000000000000000..dbd1af45507d9682c5170ff9658c0fa59e65f222 Binary files /dev/null and b/gf/sd-webui-rich-text/diffusers_official/utils/__pycache__/doc_utils.cpython-310.pyc differ diff --git a/gf/sd-webui-rich-text/diffusers_official/utils/__pycache__/dummy_flax_and_transformers_objects.cpython-310.pyc b/gf/sd-webui-rich-text/diffusers_official/utils/__pycache__/dummy_flax_and_transformers_objects.cpython-310.pyc new file mode 100644 index 0000000000000000000000000000000000000000..4f8fdd3258c3185764df3372951a7e952d3b8de7 Binary files /dev/null and b/gf/sd-webui-rich-text/diffusers_official/utils/__pycache__/dummy_flax_and_transformers_objects.cpython-310.pyc differ diff --git a/gf/sd-webui-rich-text/diffusers_official/utils/__pycache__/dummy_flax_objects.cpython-310.pyc b/gf/sd-webui-rich-text/diffusers_official/utils/__pycache__/dummy_flax_objects.cpython-310.pyc new file mode 100644 index 0000000000000000000000000000000000000000..03192470155b5fb6c003e1920d6e4d426411d493 Binary files /dev/null and b/gf/sd-webui-rich-text/diffusers_official/utils/__pycache__/dummy_flax_objects.cpython-310.pyc differ diff --git a/gf/sd-webui-rich-text/diffusers_official/utils/__pycache__/dummy_note_seq_objects.cpython-310.pyc b/gf/sd-webui-rich-text/diffusers_official/utils/__pycache__/dummy_note_seq_objects.cpython-310.pyc new file mode 100644 index 0000000000000000000000000000000000000000..b54bac5aec8784a4fbe91a647d3e81422450b59b Binary files /dev/null and b/gf/sd-webui-rich-text/diffusers_official/utils/__pycache__/dummy_note_seq_objects.cpython-310.pyc differ diff --git a/gf/sd-webui-rich-text/diffusers_official/utils/__pycache__/dummy_torch_and_librosa_objects.cpython-310.pyc b/gf/sd-webui-rich-text/diffusers_official/utils/__pycache__/dummy_torch_and_librosa_objects.cpython-310.pyc new file mode 100644 index 0000000000000000000000000000000000000000..512dcd18162ac12b5d9c8bbd323d0f3c2ae640f8 Binary files /dev/null and b/gf/sd-webui-rich-text/diffusers_official/utils/__pycache__/dummy_torch_and_librosa_objects.cpython-310.pyc differ diff --git a/gf/sd-webui-rich-text/diffusers_official/utils/__pycache__/dummy_torch_and_transformers_and_k_diffusion_objects.cpython-310.pyc b/gf/sd-webui-rich-text/diffusers_official/utils/__pycache__/dummy_torch_and_transformers_and_k_diffusion_objects.cpython-310.pyc new file mode 100644 index 0000000000000000000000000000000000000000..8278ebe9716610c1935477b79aaac3807bbd682d Binary files /dev/null and 
b/gf/sd-webui-rich-text/diffusers_official/utils/__pycache__/dummy_torch_and_transformers_and_k_diffusion_objects.cpython-310.pyc differ diff --git a/gf/sd-webui-rich-text/diffusers_official/utils/__pycache__/dummy_transformers_and_torch_and_note_seq_objects.cpython-310.pyc b/gf/sd-webui-rich-text/diffusers_official/utils/__pycache__/dummy_transformers_and_torch_and_note_seq_objects.cpython-310.pyc new file mode 100644 index 0000000000000000000000000000000000000000..a109ed6920bf8ec691d414ab5f45b1f01804df2a Binary files /dev/null and b/gf/sd-webui-rich-text/diffusers_official/utils/__pycache__/dummy_transformers_and_torch_and_note_seq_objects.cpython-310.pyc differ diff --git a/gf/sd-webui-rich-text/diffusers_official/utils/__pycache__/dynamic_modules_utils.cpython-310.pyc b/gf/sd-webui-rich-text/diffusers_official/utils/__pycache__/dynamic_modules_utils.cpython-310.pyc new file mode 100644 index 0000000000000000000000000000000000000000..3a62d29e9034d5cd028bae8a3725d96833200b73 Binary files /dev/null and b/gf/sd-webui-rich-text/diffusers_official/utils/__pycache__/dynamic_modules_utils.cpython-310.pyc differ diff --git a/gf/sd-webui-rich-text/diffusers_official/utils/__pycache__/hub_utils.cpython-310.pyc b/gf/sd-webui-rich-text/diffusers_official/utils/__pycache__/hub_utils.cpython-310.pyc new file mode 100644 index 0000000000000000000000000000000000000000..39473d5a274a1071af2720f631a60a7e2c44f638 Binary files /dev/null and b/gf/sd-webui-rich-text/diffusers_official/utils/__pycache__/hub_utils.cpython-310.pyc differ diff --git a/gf/sd-webui-rich-text/diffusers_official/utils/__pycache__/import_utils.cpython-310.pyc b/gf/sd-webui-rich-text/diffusers_official/utils/__pycache__/import_utils.cpython-310.pyc new file mode 100644 index 0000000000000000000000000000000000000000..f09ef7a50885061df3906ba27b2b067ca1848fd8 Binary files /dev/null and b/gf/sd-webui-rich-text/diffusers_official/utils/__pycache__/import_utils.cpython-310.pyc differ diff --git a/gf/sd-webui-rich-text/diffusers_official/utils/__pycache__/logging.cpython-310.pyc b/gf/sd-webui-rich-text/diffusers_official/utils/__pycache__/logging.cpython-310.pyc new file mode 100644 index 0000000000000000000000000000000000000000..194afa7e1210a21446bf66644add3cad109f2793 Binary files /dev/null and b/gf/sd-webui-rich-text/diffusers_official/utils/__pycache__/logging.cpython-310.pyc differ diff --git a/gf/sd-webui-rich-text/diffusers_official/utils/__pycache__/outputs.cpython-310.pyc b/gf/sd-webui-rich-text/diffusers_official/utils/__pycache__/outputs.cpython-310.pyc new file mode 100644 index 0000000000000000000000000000000000000000..446d14791142fb1b7d7b8ef40d03e3c3aefd845a Binary files /dev/null and b/gf/sd-webui-rich-text/diffusers_official/utils/__pycache__/outputs.cpython-310.pyc differ diff --git a/gf/sd-webui-rich-text/diffusers_official/utils/__pycache__/pil_utils.cpython-310.pyc b/gf/sd-webui-rich-text/diffusers_official/utils/__pycache__/pil_utils.cpython-310.pyc new file mode 100644 index 0000000000000000000000000000000000000000..e03a95ccf8f087afb9384ee7170426813a2cf944 Binary files /dev/null and b/gf/sd-webui-rich-text/diffusers_official/utils/__pycache__/pil_utils.cpython-310.pyc differ diff --git a/gf/sd-webui-rich-text/diffusers_official/utils/__pycache__/testing_utils.cpython-310.pyc b/gf/sd-webui-rich-text/diffusers_official/utils/__pycache__/testing_utils.cpython-310.pyc new file mode 100644 index 0000000000000000000000000000000000000000..eaa38558ec353d9749d94d377d7cbcd7bd032f12 Binary files /dev/null and 
b/gf/sd-webui-rich-text/diffusers_official/utils/__pycache__/testing_utils.cpython-310.pyc differ diff --git a/gf/sd-webui-rich-text/diffusers_official/utils/__pycache__/torch_utils.cpython-310.pyc b/gf/sd-webui-rich-text/diffusers_official/utils/__pycache__/torch_utils.cpython-310.pyc new file mode 100644 index 0000000000000000000000000000000000000000..95a71491cc73d6bcf27e50c708c0e34fe61c020f Binary files /dev/null and b/gf/sd-webui-rich-text/diffusers_official/utils/__pycache__/torch_utils.cpython-310.pyc differ diff --git a/gf/sd-webui-rich-text/diffusers_official/utils/accelerate_utils.py b/gf/sd-webui-rich-text/diffusers_official/utils/accelerate_utils.py new file mode 100644 index 0000000000000000000000000000000000000000..10a83e1dd209cca198f4038d0d7e7228f9671859 --- /dev/null +++ b/gf/sd-webui-rich-text/diffusers_official/utils/accelerate_utils.py @@ -0,0 +1,48 @@ +# Copyright 2023 The HuggingFace Team. All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +""" +Accelerate utilities: Utilities related to accelerate +""" + +from packaging import version + +from .import_utils import is_accelerate_available + + +if is_accelerate_available(): + import accelerate + + +def apply_forward_hook(method): + """ + Decorator that applies a registered CpuOffload hook to an arbitrary function rather than `forward`. This is useful + for cases where a PyTorch module provides functions other than `forward` that should trigger a move to the + appropriate acceleration device. This is the case for `encode` and `decode` in [`AutoencoderKL`]. + + This decorator looks inside the internal `_hf_hook` property to find a registered offload hook. + + :param method: The method to decorate. This method should be a method of a PyTorch module. + """ + if not is_accelerate_available(): + return method + accelerate_version = version.parse(accelerate.__version__).base_version + if version.parse(accelerate_version) < version.parse("0.17.0"): + return method + + def wrapper(self, *args, **kwargs): + if hasattr(self, "_hf_hook") and hasattr(self._hf_hook, "pre_forward"): + self._hf_hook.pre_forward(self) + return method(self, *args, **kwargs) + + return wrapper diff --git a/gf/sd-webui-rich-text/diffusers_official/utils/constants.py b/gf/sd-webui-rich-text/diffusers_official/utils/constants.py new file mode 100644 index 0000000000000000000000000000000000000000..3c641a259a81145403512f05ea11ac56af09392b --- /dev/null +++ b/gf/sd-webui-rich-text/diffusers_official/utils/constants.py @@ -0,0 +1,33 @@ +# Copyright 2023 The HuggingFace Inc. team. All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 
+# See the License for the specific language governing permissions and +# limitations under the License. +import os + +from huggingface_hub.constants import HUGGINGFACE_HUB_CACHE, hf_cache_home + + +default_cache_path = HUGGINGFACE_HUB_CACHE + + +CONFIG_NAME = "config.json" +WEIGHTS_NAME = "diffusion_pytorch_model.bin" +FLAX_WEIGHTS_NAME = "diffusion_flax_model.msgpack" +ONNX_WEIGHTS_NAME = "model.onnx" +SAFETENSORS_WEIGHTS_NAME = "diffusion_pytorch_model.safetensors" +ONNX_EXTERNAL_WEIGHTS_NAME = "weights.pb" +HUGGINGFACE_CO_RESOLVE_ENDPOINT = "https://huggingface.co" +DIFFUSERS_CACHE = default_cache_path +DIFFUSERS_DYNAMIC_MODULE_NAME = "diffusers_modules" +HF_MODULES_CACHE = os.getenv("HF_MODULES_CACHE", os.path.join(hf_cache_home, "modules")) +DEPRECATED_REVISION_ARGS = ["fp16", "non-ema"] +TEXT_ENCODER_ATTN_MODULE = ".self_attn" diff --git a/gf/sd-webui-rich-text/diffusers_official/utils/deprecation_utils.py b/gf/sd-webui-rich-text/diffusers_official/utils/deprecation_utils.py new file mode 100644 index 0000000000000000000000000000000000000000..f482deddd2f46b8d2e29d5229faa0e9a21f2fd98 --- /dev/null +++ b/gf/sd-webui-rich-text/diffusers_official/utils/deprecation_utils.py @@ -0,0 +1,49 @@ +import inspect +import warnings +from typing import Any, Dict, Optional, Union + +from packaging import version + + +def deprecate(*args, take_from: Optional[Union[Dict, Any]] = None, standard_warn=True, stacklevel=2): + from .. import __version__ + + deprecated_kwargs = take_from + values = () + if not isinstance(args[0], tuple): + args = (args,) + + for attribute, version_name, message in args: + if version.parse(version.parse(__version__).base_version) >= version.parse(version_name): + raise ValueError( + f"The deprecation tuple {(attribute, version_name, message)} should be removed since diffusers'" + f" version {__version__} is >= {version_name}" + ) + + warning = None + if isinstance(deprecated_kwargs, dict) and attribute in deprecated_kwargs: + values += (deprecated_kwargs.pop(attribute),) + warning = f"The `{attribute}` argument is deprecated and will be removed in version {version_name}." + elif hasattr(deprecated_kwargs, attribute): + values += (getattr(deprecated_kwargs, attribute),) + warning = f"The `{attribute}` attribute is deprecated and will be removed in version {version_name}." + elif deprecated_kwargs is None: + warning = f"`{attribute}` is deprecated and will be removed in version {version_name}." + + if warning is not None: + warning = warning + " " if standard_warn else "" + warnings.warn(warning + message, FutureWarning, stacklevel=stacklevel) + + if isinstance(deprecated_kwargs, dict) and len(deprecated_kwargs) > 0: + call_frame = inspect.getouterframes(inspect.currentframe())[1] + filename = call_frame.filename + line_number = call_frame.lineno + function = call_frame.function + key, value = next(iter(deprecated_kwargs.items())) + raise TypeError(f"{function} in {filename} line {line_number-1} got an unexpected keyword argument `{key}`") + + if len(values) == 0: + return + elif len(values) == 1: + return values[0] + return values diff --git a/gf/sd-webui-rich-text/diffusers_official/utils/doc_utils.py b/gf/sd-webui-rich-text/diffusers_official/utils/doc_utils.py new file mode 100644 index 0000000000000000000000000000000000000000..f1f87743f99802931334bd51bf99985775116d59 --- /dev/null +++ b/gf/sd-webui-rich-text/diffusers_official/utils/doc_utils.py @@ -0,0 +1,38 @@ +# Copyright 2023 The HuggingFace Team. All rights reserved. 
+# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +""" +Doc utilities: Utilities related to documentation +""" +import re + + +def replace_example_docstring(example_docstring): + def docstring_decorator(fn): + func_doc = fn.__doc__ + lines = func_doc.split("\n") + i = 0 + while i < len(lines) and re.search(r"^\s*Examples?:\s*$", lines[i]) is None: + i += 1 + if i < len(lines): + lines[i] = example_docstring + func_doc = "\n".join(lines) + else: + raise ValueError( + f"The function {fn} should have an empty 'Examples:' in its docstring as placeholder, " + f"current docstring is:\n{func_doc}" + ) + fn.__doc__ = func_doc + return fn + + return docstring_decorator diff --git a/gf/sd-webui-rich-text/diffusers_official/utils/dummy_flax_and_transformers_objects.py b/gf/sd-webui-rich-text/diffusers_official/utils/dummy_flax_and_transformers_objects.py new file mode 100644 index 0000000000000000000000000000000000000000..162bac1c4331149c4b5abde1eadd8013ab0cda99 --- /dev/null +++ b/gf/sd-webui-rich-text/diffusers_official/utils/dummy_flax_and_transformers_objects.py @@ -0,0 +1,62 @@ +# This file is autogenerated by the command `make fix-copies`, do not edit. +from ..utils import DummyObject, requires_backends + + +class FlaxStableDiffusionControlNetPipeline(metaclass=DummyObject): + _backends = ["flax", "transformers"] + + def __init__(self, *args, **kwargs): + requires_backends(self, ["flax", "transformers"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["flax", "transformers"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + requires_backends(cls, ["flax", "transformers"]) + + +class FlaxStableDiffusionImg2ImgPipeline(metaclass=DummyObject): + _backends = ["flax", "transformers"] + + def __init__(self, *args, **kwargs): + requires_backends(self, ["flax", "transformers"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["flax", "transformers"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + requires_backends(cls, ["flax", "transformers"]) + + +class FlaxStableDiffusionInpaintPipeline(metaclass=DummyObject): + _backends = ["flax", "transformers"] + + def __init__(self, *args, **kwargs): + requires_backends(self, ["flax", "transformers"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["flax", "transformers"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + requires_backends(cls, ["flax", "transformers"]) + + +class FlaxStableDiffusionPipeline(metaclass=DummyObject): + _backends = ["flax", "transformers"] + + def __init__(self, *args, **kwargs): + requires_backends(self, ["flax", "transformers"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["flax", "transformers"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + requires_backends(cls, ["flax", "transformers"]) diff --git a/gf/sd-webui-rich-text/diffusers_official/utils/dummy_flax_objects.py 
b/gf/sd-webui-rich-text/diffusers_official/utils/dummy_flax_objects.py new file mode 100644 index 0000000000000000000000000000000000000000..2bb80d136f338d193c67773266355956afd1d98a --- /dev/null +++ b/gf/sd-webui-rich-text/diffusers_official/utils/dummy_flax_objects.py @@ -0,0 +1,197 @@ +# This file is autogenerated by the command `make fix-copies`, do not edit. +from ..utils import DummyObject, requires_backends + + +class FlaxControlNetModel(metaclass=DummyObject): + _backends = ["flax"] + + def __init__(self, *args, **kwargs): + requires_backends(self, ["flax"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["flax"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + requires_backends(cls, ["flax"]) + + +class FlaxModelMixin(metaclass=DummyObject): + _backends = ["flax"] + + def __init__(self, *args, **kwargs): + requires_backends(self, ["flax"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["flax"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + requires_backends(cls, ["flax"]) + + +class FlaxUNet2DConditionModel(metaclass=DummyObject): + _backends = ["flax"] + + def __init__(self, *args, **kwargs): + requires_backends(self, ["flax"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["flax"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + requires_backends(cls, ["flax"]) + + +class FlaxAutoencoderKL(metaclass=DummyObject): + _backends = ["flax"] + + def __init__(self, *args, **kwargs): + requires_backends(self, ["flax"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["flax"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + requires_backends(cls, ["flax"]) + + +class FlaxDiffusionPipeline(metaclass=DummyObject): + _backends = ["flax"] + + def __init__(self, *args, **kwargs): + requires_backends(self, ["flax"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["flax"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + requires_backends(cls, ["flax"]) + + +class FlaxDDIMScheduler(metaclass=DummyObject): + _backends = ["flax"] + + def __init__(self, *args, **kwargs): + requires_backends(self, ["flax"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["flax"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + requires_backends(cls, ["flax"]) + + +class FlaxDDPMScheduler(metaclass=DummyObject): + _backends = ["flax"] + + def __init__(self, *args, **kwargs): + requires_backends(self, ["flax"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["flax"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + requires_backends(cls, ["flax"]) + + +class FlaxDPMSolverMultistepScheduler(metaclass=DummyObject): + _backends = ["flax"] + + def __init__(self, *args, **kwargs): + requires_backends(self, ["flax"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["flax"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + requires_backends(cls, ["flax"]) + + +class FlaxKarrasVeScheduler(metaclass=DummyObject): + _backends = ["flax"] + + def __init__(self, *args, **kwargs): + requires_backends(self, ["flax"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["flax"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + requires_backends(cls, ["flax"]) + + +class 
FlaxLMSDiscreteScheduler(metaclass=DummyObject): + _backends = ["flax"] + + def __init__(self, *args, **kwargs): + requires_backends(self, ["flax"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["flax"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + requires_backends(cls, ["flax"]) + + +class FlaxPNDMScheduler(metaclass=DummyObject): + _backends = ["flax"] + + def __init__(self, *args, **kwargs): + requires_backends(self, ["flax"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["flax"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + requires_backends(cls, ["flax"]) + + +class FlaxSchedulerMixin(metaclass=DummyObject): + _backends = ["flax"] + + def __init__(self, *args, **kwargs): + requires_backends(self, ["flax"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["flax"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + requires_backends(cls, ["flax"]) + + +class FlaxScoreSdeVeScheduler(metaclass=DummyObject): + _backends = ["flax"] + + def __init__(self, *args, **kwargs): + requires_backends(self, ["flax"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["flax"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + requires_backends(cls, ["flax"]) diff --git a/gf/sd-webui-rich-text/diffusers_official/utils/dummy_note_seq_objects.py b/gf/sd-webui-rich-text/diffusers_official/utils/dummy_note_seq_objects.py new file mode 100644 index 0000000000000000000000000000000000000000..c02d0b015aedc37c01fb3b843bc79547aae5da68 --- /dev/null +++ b/gf/sd-webui-rich-text/diffusers_official/utils/dummy_note_seq_objects.py @@ -0,0 +1,17 @@ +# This file is autogenerated by the command `make fix-copies`, do not edit. +from ..utils import DummyObject, requires_backends + + +class MidiProcessor(metaclass=DummyObject): + _backends = ["note_seq"] + + def __init__(self, *args, **kwargs): + requires_backends(self, ["note_seq"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["note_seq"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + requires_backends(cls, ["note_seq"]) diff --git a/gf/sd-webui-rich-text/diffusers_official/utils/dummy_onnx_objects.py b/gf/sd-webui-rich-text/diffusers_official/utils/dummy_onnx_objects.py new file mode 100644 index 0000000000000000000000000000000000000000..bde5f6ad0793e2d81bc638600b46ff81748d09ee --- /dev/null +++ b/gf/sd-webui-rich-text/diffusers_official/utils/dummy_onnx_objects.py @@ -0,0 +1,17 @@ +# This file is autogenerated by the command `make fix-copies`, do not edit. +from ..utils import DummyObject, requires_backends + + +class OnnxRuntimeModel(metaclass=DummyObject): + _backends = ["onnx"] + + def __init__(self, *args, **kwargs): + requires_backends(self, ["onnx"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["onnx"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + requires_backends(cls, ["onnx"]) diff --git a/gf/sd-webui-rich-text/diffusers_official/utils/dummy_pt_objects.py b/gf/sd-webui-rich-text/diffusers_official/utils/dummy_pt_objects.py new file mode 100644 index 0000000000000000000000000000000000000000..20dbf84681d31d0b79d152871146d5b5f45dd8f6 --- /dev/null +++ b/gf/sd-webui-rich-text/diffusers_official/utils/dummy_pt_objects.py @@ -0,0 +1,780 @@ +# This file is autogenerated by the command `make fix-copies`, do not edit. 
+from ..utils import DummyObject, requires_backends + + +class AutoencoderKL(metaclass=DummyObject): + _backends = ["torch"] + + def __init__(self, *args, **kwargs): + requires_backends(self, ["torch"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["torch"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + requires_backends(cls, ["torch"]) + + +class ControlNetModel(metaclass=DummyObject): + _backends = ["torch"] + + def __init__(self, *args, **kwargs): + requires_backends(self, ["torch"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["torch"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + requires_backends(cls, ["torch"]) + + +class ModelMixin(metaclass=DummyObject): + _backends = ["torch"] + + def __init__(self, *args, **kwargs): + requires_backends(self, ["torch"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["torch"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + requires_backends(cls, ["torch"]) + + +class PriorTransformer(metaclass=DummyObject): + _backends = ["torch"] + + def __init__(self, *args, **kwargs): + requires_backends(self, ["torch"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["torch"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + requires_backends(cls, ["torch"]) + + +class T5FilmDecoder(metaclass=DummyObject): + _backends = ["torch"] + + def __init__(self, *args, **kwargs): + requires_backends(self, ["torch"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["torch"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + requires_backends(cls, ["torch"]) + + +class Transformer2DModel(metaclass=DummyObject): + _backends = ["torch"] + + def __init__(self, *args, **kwargs): + requires_backends(self, ["torch"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["torch"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + requires_backends(cls, ["torch"]) + + +class UNet1DModel(metaclass=DummyObject): + _backends = ["torch"] + + def __init__(self, *args, **kwargs): + requires_backends(self, ["torch"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["torch"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + requires_backends(cls, ["torch"]) + + +class UNet2DConditionModel(metaclass=DummyObject): + _backends = ["torch"] + + def __init__(self, *args, **kwargs): + requires_backends(self, ["torch"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["torch"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + requires_backends(cls, ["torch"]) + + +class UNet2DModel(metaclass=DummyObject): + _backends = ["torch"] + + def __init__(self, *args, **kwargs): + requires_backends(self, ["torch"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["torch"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + requires_backends(cls, ["torch"]) + + +class UNet3DConditionModel(metaclass=DummyObject): + _backends = ["torch"] + + def __init__(self, *args, **kwargs): + requires_backends(self, ["torch"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["torch"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + requires_backends(cls, ["torch"]) + + +class VQModel(metaclass=DummyObject): + _backends 
= ["torch"] + + def __init__(self, *args, **kwargs): + requires_backends(self, ["torch"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["torch"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + requires_backends(cls, ["torch"]) + + +def get_constant_schedule(*args, **kwargs): + requires_backends(get_constant_schedule, ["torch"]) + + +def get_constant_schedule_with_warmup(*args, **kwargs): + requires_backends(get_constant_schedule_with_warmup, ["torch"]) + + +def get_cosine_schedule_with_warmup(*args, **kwargs): + requires_backends(get_cosine_schedule_with_warmup, ["torch"]) + + +def get_cosine_with_hard_restarts_schedule_with_warmup(*args, **kwargs): + requires_backends(get_cosine_with_hard_restarts_schedule_with_warmup, ["torch"]) + + +def get_linear_schedule_with_warmup(*args, **kwargs): + requires_backends(get_linear_schedule_with_warmup, ["torch"]) + + +def get_polynomial_decay_schedule_with_warmup(*args, **kwargs): + requires_backends(get_polynomial_decay_schedule_with_warmup, ["torch"]) + + +def get_scheduler(*args, **kwargs): + requires_backends(get_scheduler, ["torch"]) + + +class AudioPipelineOutput(metaclass=DummyObject): + _backends = ["torch"] + + def __init__(self, *args, **kwargs): + requires_backends(self, ["torch"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["torch"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + requires_backends(cls, ["torch"]) + + +class ConsistencyModelPipeline(metaclass=DummyObject): + _backends = ["torch"] + + def __init__(self, *args, **kwargs): + requires_backends(self, ["torch"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["torch"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + requires_backends(cls, ["torch"]) + + +class DanceDiffusionPipeline(metaclass=DummyObject): + _backends = ["torch"] + + def __init__(self, *args, **kwargs): + requires_backends(self, ["torch"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["torch"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + requires_backends(cls, ["torch"]) + + +class DDIMPipeline(metaclass=DummyObject): + _backends = ["torch"] + + def __init__(self, *args, **kwargs): + requires_backends(self, ["torch"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["torch"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + requires_backends(cls, ["torch"]) + + +class DDPMPipeline(metaclass=DummyObject): + _backends = ["torch"] + + def __init__(self, *args, **kwargs): + requires_backends(self, ["torch"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["torch"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + requires_backends(cls, ["torch"]) + + +class DiffusionPipeline(metaclass=DummyObject): + _backends = ["torch"] + + def __init__(self, *args, **kwargs): + requires_backends(self, ["torch"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["torch"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + requires_backends(cls, ["torch"]) + + +class DiTPipeline(metaclass=DummyObject): + _backends = ["torch"] + + def __init__(self, *args, **kwargs): + requires_backends(self, ["torch"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["torch"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + 
requires_backends(cls, ["torch"]) + + +class ImagePipelineOutput(metaclass=DummyObject): + _backends = ["torch"] + + def __init__(self, *args, **kwargs): + requires_backends(self, ["torch"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["torch"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + requires_backends(cls, ["torch"]) + + +class KarrasVePipeline(metaclass=DummyObject): + _backends = ["torch"] + + def __init__(self, *args, **kwargs): + requires_backends(self, ["torch"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["torch"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + requires_backends(cls, ["torch"]) + + +class LDMPipeline(metaclass=DummyObject): + _backends = ["torch"] + + def __init__(self, *args, **kwargs): + requires_backends(self, ["torch"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["torch"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + requires_backends(cls, ["torch"]) + + +class LDMSuperResolutionPipeline(metaclass=DummyObject): + _backends = ["torch"] + + def __init__(self, *args, **kwargs): + requires_backends(self, ["torch"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["torch"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + requires_backends(cls, ["torch"]) + + +class PNDMPipeline(metaclass=DummyObject): + _backends = ["torch"] + + def __init__(self, *args, **kwargs): + requires_backends(self, ["torch"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["torch"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + requires_backends(cls, ["torch"]) + + +class RePaintPipeline(metaclass=DummyObject): + _backends = ["torch"] + + def __init__(self, *args, **kwargs): + requires_backends(self, ["torch"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["torch"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + requires_backends(cls, ["torch"]) + + +class ScoreSdeVePipeline(metaclass=DummyObject): + _backends = ["torch"] + + def __init__(self, *args, **kwargs): + requires_backends(self, ["torch"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["torch"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + requires_backends(cls, ["torch"]) + + +class CMStochasticIterativeScheduler(metaclass=DummyObject): + _backends = ["torch"] + + def __init__(self, *args, **kwargs): + requires_backends(self, ["torch"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["torch"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + requires_backends(cls, ["torch"]) + + +class DDIMInverseScheduler(metaclass=DummyObject): + _backends = ["torch"] + + def __init__(self, *args, **kwargs): + requires_backends(self, ["torch"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["torch"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + requires_backends(cls, ["torch"]) + + +class DDIMParallelScheduler(metaclass=DummyObject): + _backends = ["torch"] + + def __init__(self, *args, **kwargs): + requires_backends(self, ["torch"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["torch"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + requires_backends(cls, ["torch"]) + + +class 
DDIMScheduler(metaclass=DummyObject): + _backends = ["torch"] + + def __init__(self, *args, **kwargs): + requires_backends(self, ["torch"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["torch"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + requires_backends(cls, ["torch"]) + + +class DDPMParallelScheduler(metaclass=DummyObject): + _backends = ["torch"] + + def __init__(self, *args, **kwargs): + requires_backends(self, ["torch"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["torch"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + requires_backends(cls, ["torch"]) + + +class DDPMScheduler(metaclass=DummyObject): + _backends = ["torch"] + + def __init__(self, *args, **kwargs): + requires_backends(self, ["torch"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["torch"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + requires_backends(cls, ["torch"]) + + +class DEISMultistepScheduler(metaclass=DummyObject): + _backends = ["torch"] + + def __init__(self, *args, **kwargs): + requires_backends(self, ["torch"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["torch"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + requires_backends(cls, ["torch"]) + + +class DPMSolverMultistepInverseScheduler(metaclass=DummyObject): + _backends = ["torch"] + + def __init__(self, *args, **kwargs): + requires_backends(self, ["torch"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["torch"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + requires_backends(cls, ["torch"]) + + +class DPMSolverMultistepScheduler(metaclass=DummyObject): + _backends = ["torch"] + + def __init__(self, *args, **kwargs): + requires_backends(self, ["torch"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["torch"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + requires_backends(cls, ["torch"]) + + +class DPMSolverSinglestepScheduler(metaclass=DummyObject): + _backends = ["torch"] + + def __init__(self, *args, **kwargs): + requires_backends(self, ["torch"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["torch"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + requires_backends(cls, ["torch"]) + + +class EulerAncestralDiscreteScheduler(metaclass=DummyObject): + _backends = ["torch"] + + def __init__(self, *args, **kwargs): + requires_backends(self, ["torch"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["torch"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + requires_backends(cls, ["torch"]) + + +class EulerDiscreteScheduler(metaclass=DummyObject): + _backends = ["torch"] + + def __init__(self, *args, **kwargs): + requires_backends(self, ["torch"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["torch"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + requires_backends(cls, ["torch"]) + + +class HeunDiscreteScheduler(metaclass=DummyObject): + _backends = ["torch"] + + def __init__(self, *args, **kwargs): + requires_backends(self, ["torch"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["torch"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + requires_backends(cls, ["torch"]) + + +class 
IPNDMScheduler(metaclass=DummyObject): + _backends = ["torch"] + + def __init__(self, *args, **kwargs): + requires_backends(self, ["torch"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["torch"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + requires_backends(cls, ["torch"]) + + +class KarrasVeScheduler(metaclass=DummyObject): + _backends = ["torch"] + + def __init__(self, *args, **kwargs): + requires_backends(self, ["torch"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["torch"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + requires_backends(cls, ["torch"]) + + +class KDPM2AncestralDiscreteScheduler(metaclass=DummyObject): + _backends = ["torch"] + + def __init__(self, *args, **kwargs): + requires_backends(self, ["torch"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["torch"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + requires_backends(cls, ["torch"]) + + +class KDPM2DiscreteScheduler(metaclass=DummyObject): + _backends = ["torch"] + + def __init__(self, *args, **kwargs): + requires_backends(self, ["torch"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["torch"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + requires_backends(cls, ["torch"]) + + +class PNDMScheduler(metaclass=DummyObject): + _backends = ["torch"] + + def __init__(self, *args, **kwargs): + requires_backends(self, ["torch"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["torch"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + requires_backends(cls, ["torch"]) + + +class RePaintScheduler(metaclass=DummyObject): + _backends = ["torch"] + + def __init__(self, *args, **kwargs): + requires_backends(self, ["torch"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["torch"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + requires_backends(cls, ["torch"]) + + +class SchedulerMixin(metaclass=DummyObject): + _backends = ["torch"] + + def __init__(self, *args, **kwargs): + requires_backends(self, ["torch"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["torch"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + requires_backends(cls, ["torch"]) + + +class ScoreSdeVeScheduler(metaclass=DummyObject): + _backends = ["torch"] + + def __init__(self, *args, **kwargs): + requires_backends(self, ["torch"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["torch"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + requires_backends(cls, ["torch"]) + + +class UnCLIPScheduler(metaclass=DummyObject): + _backends = ["torch"] + + def __init__(self, *args, **kwargs): + requires_backends(self, ["torch"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["torch"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + requires_backends(cls, ["torch"]) + + +class UniPCMultistepScheduler(metaclass=DummyObject): + _backends = ["torch"] + + def __init__(self, *args, **kwargs): + requires_backends(self, ["torch"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["torch"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + requires_backends(cls, ["torch"]) + + +class VQDiffusionScheduler(metaclass=DummyObject): + _backends = ["torch"] + 
+ def __init__(self, *args, **kwargs): + requires_backends(self, ["torch"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["torch"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + requires_backends(cls, ["torch"]) + + +class EMAModel(metaclass=DummyObject): + _backends = ["torch"] + + def __init__(self, *args, **kwargs): + requires_backends(self, ["torch"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["torch"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + requires_backends(cls, ["torch"]) diff --git a/gf/sd-webui-rich-text/diffusers_official/utils/dummy_torch_and_librosa_objects.py b/gf/sd-webui-rich-text/diffusers_official/utils/dummy_torch_and_librosa_objects.py new file mode 100644 index 0000000000000000000000000000000000000000..2088bc4a744198284f22fe54e6f1055cf3568566 --- /dev/null +++ b/gf/sd-webui-rich-text/diffusers_official/utils/dummy_torch_and_librosa_objects.py @@ -0,0 +1,32 @@ +# This file is autogenerated by the command `make fix-copies`, do not edit. +from ..utils import DummyObject, requires_backends + + +class AudioDiffusionPipeline(metaclass=DummyObject): + _backends = ["torch", "librosa"] + + def __init__(self, *args, **kwargs): + requires_backends(self, ["torch", "librosa"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["torch", "librosa"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + requires_backends(cls, ["torch", "librosa"]) + + +class Mel(metaclass=DummyObject): + _backends = ["torch", "librosa"] + + def __init__(self, *args, **kwargs): + requires_backends(self, ["torch", "librosa"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["torch", "librosa"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + requires_backends(cls, ["torch", "librosa"]) diff --git a/gf/sd-webui-rich-text/diffusers_official/utils/dummy_torch_and_scipy_objects.py b/gf/sd-webui-rich-text/diffusers_official/utils/dummy_torch_and_scipy_objects.py new file mode 100644 index 0000000000000000000000000000000000000000..a1ff25863822b04971d2c6dfdc17f5b28774cf05 --- /dev/null +++ b/gf/sd-webui-rich-text/diffusers_official/utils/dummy_torch_and_scipy_objects.py @@ -0,0 +1,17 @@ +# This file is autogenerated by the command `make fix-copies`, do not edit. +from ..utils import DummyObject, requires_backends + + +class LMSDiscreteScheduler(metaclass=DummyObject): + _backends = ["torch", "scipy"] + + def __init__(self, *args, **kwargs): + requires_backends(self, ["torch", "scipy"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["torch", "scipy"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + requires_backends(cls, ["torch", "scipy"]) diff --git a/gf/sd-webui-rich-text/diffusers_official/utils/dummy_torch_and_torchsde_objects.py b/gf/sd-webui-rich-text/diffusers_official/utils/dummy_torch_and_torchsde_objects.py new file mode 100644 index 0000000000000000000000000000000000000000..a81bbb316f32267c31b06598519f1eef9ddde643 --- /dev/null +++ b/gf/sd-webui-rich-text/diffusers_official/utils/dummy_torch_and_torchsde_objects.py @@ -0,0 +1,17 @@ +# This file is autogenerated by the command `make fix-copies`, do not edit. 
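+# Note: dummy modules like this one are what the package's top-level __init__ falls
+# back to when the optional backends listed in `_backends` (here torch and torchsde)
+# are not installed, so importing diffusers still succeeds and the failure is deferred
+# to the point where the placeholder object is actually constructed or loaded.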
+from ..utils import DummyObject, requires_backends + + +class DPMSolverSDEScheduler(metaclass=DummyObject): + _backends = ["torch", "torchsde"] + + def __init__(self, *args, **kwargs): + requires_backends(self, ["torch", "torchsde"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["torch", "torchsde"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + requires_backends(cls, ["torch", "torchsde"]) diff --git a/gf/sd-webui-rich-text/diffusers_official/utils/dummy_torch_and_transformers_and_invisible_watermark_objects.py b/gf/sd-webui-rich-text/diffusers_official/utils/dummy_torch_and_transformers_and_invisible_watermark_objects.py new file mode 100644 index 0000000000000000000000000000000000000000..6b09b971fbc5808b2090195918e021874cbf56c9 --- /dev/null +++ b/gf/sd-webui-rich-text/diffusers_official/utils/dummy_torch_and_transformers_and_invisible_watermark_objects.py @@ -0,0 +1,32 @@ +# This file is autogenerated by the command `make fix-copies`, do not edit. +from ..utils import DummyObject, requires_backends + + +class StableDiffusionXLImg2ImgPipeline(metaclass=DummyObject): + _backends = ["torch", "transformers", "invisible_watermark"] + + def __init__(self, *args, **kwargs): + requires_backends(self, ["torch", "transformers", "invisible_watermark"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["torch", "transformers", "invisible_watermark"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + requires_backends(cls, ["torch", "transformers", "invisible_watermark"]) + + +class StableDiffusionXLPipeline(metaclass=DummyObject): + _backends = ["torch", "transformers", "invisible_watermark"] + + def __init__(self, *args, **kwargs): + requires_backends(self, ["torch", "transformers", "invisible_watermark"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["torch", "transformers", "invisible_watermark"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + requires_backends(cls, ["torch", "transformers", "invisible_watermark"]) diff --git a/gf/sd-webui-rich-text/diffusers_official/utils/dummy_torch_and_transformers_and_k_diffusion_objects.py b/gf/sd-webui-rich-text/diffusers_official/utils/dummy_torch_and_transformers_and_k_diffusion_objects.py new file mode 100644 index 0000000000000000000000000000000000000000..56836f0b6d77b8daa25e956101694863e418339f --- /dev/null +++ b/gf/sd-webui-rich-text/diffusers_official/utils/dummy_torch_and_transformers_and_k_diffusion_objects.py @@ -0,0 +1,17 @@ +# This file is autogenerated by the command `make fix-copies`, do not edit. 
+from ..utils import DummyObject, requires_backends + + +class StableDiffusionKDiffusionPipeline(metaclass=DummyObject): + _backends = ["torch", "transformers", "k_diffusion"] + + def __init__(self, *args, **kwargs): + requires_backends(self, ["torch", "transformers", "k_diffusion"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["torch", "transformers", "k_diffusion"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + requires_backends(cls, ["torch", "transformers", "k_diffusion"]) diff --git a/gf/sd-webui-rich-text/diffusers_official/utils/dummy_torch_and_transformers_and_onnx_objects.py b/gf/sd-webui-rich-text/diffusers_official/utils/dummy_torch_and_transformers_and_onnx_objects.py new file mode 100644 index 0000000000000000000000000000000000000000..b7afad8226b87292100270e3e7daad6885be0e7f --- /dev/null +++ b/gf/sd-webui-rich-text/diffusers_official/utils/dummy_torch_and_transformers_and_onnx_objects.py @@ -0,0 +1,92 @@ +# This file is autogenerated by the command `make fix-copies`, do not edit. +from ..utils import DummyObject, requires_backends + + +class OnnxStableDiffusionImg2ImgPipeline(metaclass=DummyObject): + _backends = ["torch", "transformers", "onnx"] + + def __init__(self, *args, **kwargs): + requires_backends(self, ["torch", "transformers", "onnx"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["torch", "transformers", "onnx"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + requires_backends(cls, ["torch", "transformers", "onnx"]) + + +class OnnxStableDiffusionInpaintPipeline(metaclass=DummyObject): + _backends = ["torch", "transformers", "onnx"] + + def __init__(self, *args, **kwargs): + requires_backends(self, ["torch", "transformers", "onnx"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["torch", "transformers", "onnx"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + requires_backends(cls, ["torch", "transformers", "onnx"]) + + +class OnnxStableDiffusionInpaintPipelineLegacy(metaclass=DummyObject): + _backends = ["torch", "transformers", "onnx"] + + def __init__(self, *args, **kwargs): + requires_backends(self, ["torch", "transformers", "onnx"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["torch", "transformers", "onnx"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + requires_backends(cls, ["torch", "transformers", "onnx"]) + + +class OnnxStableDiffusionPipeline(metaclass=DummyObject): + _backends = ["torch", "transformers", "onnx"] + + def __init__(self, *args, **kwargs): + requires_backends(self, ["torch", "transformers", "onnx"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["torch", "transformers", "onnx"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + requires_backends(cls, ["torch", "transformers", "onnx"]) + + +class OnnxStableDiffusionUpscalePipeline(metaclass=DummyObject): + _backends = ["torch", "transformers", "onnx"] + + def __init__(self, *args, **kwargs): + requires_backends(self, ["torch", "transformers", "onnx"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["torch", "transformers", "onnx"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + requires_backends(cls, ["torch", "transformers", "onnx"]) + + +class StableDiffusionOnnxPipeline(metaclass=DummyObject): + _backends = ["torch", "transformers", "onnx"] + + def __init__(self, 
*args, **kwargs): + requires_backends(self, ["torch", "transformers", "onnx"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["torch", "transformers", "onnx"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + requires_backends(cls, ["torch", "transformers", "onnx"]) diff --git a/gf/sd-webui-rich-text/diffusers_official/utils/dummy_torch_and_transformers_objects.py b/gf/sd-webui-rich-text/diffusers_official/utils/dummy_torch_and_transformers_objects.py new file mode 100644 index 0000000000000000000000000000000000000000..164206d776fa01b30d81465a2192783eb6c4b7fd --- /dev/null +++ b/gf/sd-webui-rich-text/diffusers_official/utils/dummy_torch_and_transformers_objects.py @@ -0,0 +1,947 @@ +# This file is autogenerated by the command `make fix-copies`, do not edit. +from ..utils import DummyObject, requires_backends + + +class AltDiffusionImg2ImgPipeline(metaclass=DummyObject): + _backends = ["torch", "transformers"] + + def __init__(self, *args, **kwargs): + requires_backends(self, ["torch", "transformers"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["torch", "transformers"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + requires_backends(cls, ["torch", "transformers"]) + + +class AltDiffusionPipeline(metaclass=DummyObject): + _backends = ["torch", "transformers"] + + def __init__(self, *args, **kwargs): + requires_backends(self, ["torch", "transformers"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["torch", "transformers"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + requires_backends(cls, ["torch", "transformers"]) + + +class AudioLDMPipeline(metaclass=DummyObject): + _backends = ["torch", "transformers"] + + def __init__(self, *args, **kwargs): + requires_backends(self, ["torch", "transformers"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["torch", "transformers"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + requires_backends(cls, ["torch", "transformers"]) + + +class CycleDiffusionPipeline(metaclass=DummyObject): + _backends = ["torch", "transformers"] + + def __init__(self, *args, **kwargs): + requires_backends(self, ["torch", "transformers"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["torch", "transformers"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + requires_backends(cls, ["torch", "transformers"]) + + +class IFImg2ImgPipeline(metaclass=DummyObject): + _backends = ["torch", "transformers"] + + def __init__(self, *args, **kwargs): + requires_backends(self, ["torch", "transformers"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["torch", "transformers"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + requires_backends(cls, ["torch", "transformers"]) + + +class IFImg2ImgSuperResolutionPipeline(metaclass=DummyObject): + _backends = ["torch", "transformers"] + + def __init__(self, *args, **kwargs): + requires_backends(self, ["torch", "transformers"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["torch", "transformers"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + requires_backends(cls, ["torch", "transformers"]) + + +class IFInpaintingPipeline(metaclass=DummyObject): + _backends = ["torch", "transformers"] + + def __init__(self, *args, **kwargs): + requires_backends(self, ["torch", 
"transformers"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["torch", "transformers"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + requires_backends(cls, ["torch", "transformers"]) + + +class IFInpaintingSuperResolutionPipeline(metaclass=DummyObject): + _backends = ["torch", "transformers"] + + def __init__(self, *args, **kwargs): + requires_backends(self, ["torch", "transformers"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["torch", "transformers"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + requires_backends(cls, ["torch", "transformers"]) + + +class IFPipeline(metaclass=DummyObject): + _backends = ["torch", "transformers"] + + def __init__(self, *args, **kwargs): + requires_backends(self, ["torch", "transformers"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["torch", "transformers"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + requires_backends(cls, ["torch", "transformers"]) + + +class IFSuperResolutionPipeline(metaclass=DummyObject): + _backends = ["torch", "transformers"] + + def __init__(self, *args, **kwargs): + requires_backends(self, ["torch", "transformers"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["torch", "transformers"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + requires_backends(cls, ["torch", "transformers"]) + + +class ImageTextPipelineOutput(metaclass=DummyObject): + _backends = ["torch", "transformers"] + + def __init__(self, *args, **kwargs): + requires_backends(self, ["torch", "transformers"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["torch", "transformers"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + requires_backends(cls, ["torch", "transformers"]) + + +class KandinskyImg2ImgPipeline(metaclass=DummyObject): + _backends = ["torch", "transformers"] + + def __init__(self, *args, **kwargs): + requires_backends(self, ["torch", "transformers"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["torch", "transformers"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + requires_backends(cls, ["torch", "transformers"]) + + +class KandinskyInpaintPipeline(metaclass=DummyObject): + _backends = ["torch", "transformers"] + + def __init__(self, *args, **kwargs): + requires_backends(self, ["torch", "transformers"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["torch", "transformers"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + requires_backends(cls, ["torch", "transformers"]) + + +class KandinskyPipeline(metaclass=DummyObject): + _backends = ["torch", "transformers"] + + def __init__(self, *args, **kwargs): + requires_backends(self, ["torch", "transformers"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["torch", "transformers"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + requires_backends(cls, ["torch", "transformers"]) + + +class KandinskyPriorPipeline(metaclass=DummyObject): + _backends = ["torch", "transformers"] + + def __init__(self, *args, **kwargs): + requires_backends(self, ["torch", "transformers"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["torch", "transformers"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + requires_backends(cls, 
["torch", "transformers"]) + + +class KandinskyV22ControlnetImg2ImgPipeline(metaclass=DummyObject): + _backends = ["torch", "transformers"] + + def __init__(self, *args, **kwargs): + requires_backends(self, ["torch", "transformers"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["torch", "transformers"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + requires_backends(cls, ["torch", "transformers"]) + + +class KandinskyV22ControlnetPipeline(metaclass=DummyObject): + _backends = ["torch", "transformers"] + + def __init__(self, *args, **kwargs): + requires_backends(self, ["torch", "transformers"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["torch", "transformers"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + requires_backends(cls, ["torch", "transformers"]) + + +class KandinskyV22Img2ImgPipeline(metaclass=DummyObject): + _backends = ["torch", "transformers"] + + def __init__(self, *args, **kwargs): + requires_backends(self, ["torch", "transformers"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["torch", "transformers"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + requires_backends(cls, ["torch", "transformers"]) + + +class KandinskyV22InpaintPipeline(metaclass=DummyObject): + _backends = ["torch", "transformers"] + + def __init__(self, *args, **kwargs): + requires_backends(self, ["torch", "transformers"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["torch", "transformers"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + requires_backends(cls, ["torch", "transformers"]) + + +class KandinskyV22Pipeline(metaclass=DummyObject): + _backends = ["torch", "transformers"] + + def __init__(self, *args, **kwargs): + requires_backends(self, ["torch", "transformers"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["torch", "transformers"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + requires_backends(cls, ["torch", "transformers"]) + + +class KandinskyV22PriorEmb2EmbPipeline(metaclass=DummyObject): + _backends = ["torch", "transformers"] + + def __init__(self, *args, **kwargs): + requires_backends(self, ["torch", "transformers"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["torch", "transformers"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + requires_backends(cls, ["torch", "transformers"]) + + +class KandinskyV22PriorPipeline(metaclass=DummyObject): + _backends = ["torch", "transformers"] + + def __init__(self, *args, **kwargs): + requires_backends(self, ["torch", "transformers"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["torch", "transformers"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + requires_backends(cls, ["torch", "transformers"]) + + +class LDMTextToImagePipeline(metaclass=DummyObject): + _backends = ["torch", "transformers"] + + def __init__(self, *args, **kwargs): + requires_backends(self, ["torch", "transformers"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["torch", "transformers"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + requires_backends(cls, ["torch", "transformers"]) + + +class PaintByExamplePipeline(metaclass=DummyObject): + _backends = ["torch", "transformers"] + + def __init__(self, *args, **kwargs): + 
requires_backends(self, ["torch", "transformers"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["torch", "transformers"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + requires_backends(cls, ["torch", "transformers"]) + + +class SemanticStableDiffusionPipeline(metaclass=DummyObject): + _backends = ["torch", "transformers"] + + def __init__(self, *args, **kwargs): + requires_backends(self, ["torch", "transformers"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["torch", "transformers"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + requires_backends(cls, ["torch", "transformers"]) + + +class ShapEImg2ImgPipeline(metaclass=DummyObject): + _backends = ["torch", "transformers"] + + def __init__(self, *args, **kwargs): + requires_backends(self, ["torch", "transformers"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["torch", "transformers"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + requires_backends(cls, ["torch", "transformers"]) + + +class ShapEPipeline(metaclass=DummyObject): + _backends = ["torch", "transformers"] + + def __init__(self, *args, **kwargs): + requires_backends(self, ["torch", "transformers"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["torch", "transformers"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + requires_backends(cls, ["torch", "transformers"]) + + +class StableDiffusionAttendAndExcitePipeline(metaclass=DummyObject): + _backends = ["torch", "transformers"] + + def __init__(self, *args, **kwargs): + requires_backends(self, ["torch", "transformers"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["torch", "transformers"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + requires_backends(cls, ["torch", "transformers"]) + + +class StableDiffusionControlNetImg2ImgPipeline(metaclass=DummyObject): + _backends = ["torch", "transformers"] + + def __init__(self, *args, **kwargs): + requires_backends(self, ["torch", "transformers"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["torch", "transformers"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + requires_backends(cls, ["torch", "transformers"]) + + +class StableDiffusionControlNetInpaintPipeline(metaclass=DummyObject): + _backends = ["torch", "transformers"] + + def __init__(self, *args, **kwargs): + requires_backends(self, ["torch", "transformers"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["torch", "transformers"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + requires_backends(cls, ["torch", "transformers"]) + + +class StableDiffusionControlNetPipeline(metaclass=DummyObject): + _backends = ["torch", "transformers"] + + def __init__(self, *args, **kwargs): + requires_backends(self, ["torch", "transformers"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["torch", "transformers"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + requires_backends(cls, ["torch", "transformers"]) + + +class StableDiffusionDepth2ImgPipeline(metaclass=DummyObject): + _backends = ["torch", "transformers"] + + def __init__(self, *args, **kwargs): + requires_backends(self, ["torch", "transformers"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["torch", 
"transformers"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + requires_backends(cls, ["torch", "transformers"]) + + +class StableDiffusionDiffEditPipeline(metaclass=DummyObject): + _backends = ["torch", "transformers"] + + def __init__(self, *args, **kwargs): + requires_backends(self, ["torch", "transformers"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["torch", "transformers"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + requires_backends(cls, ["torch", "transformers"]) + + +class StableDiffusionImageVariationPipeline(metaclass=DummyObject): + _backends = ["torch", "transformers"] + + def __init__(self, *args, **kwargs): + requires_backends(self, ["torch", "transformers"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["torch", "transformers"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + requires_backends(cls, ["torch", "transformers"]) + + +class StableDiffusionImg2ImgPipeline(metaclass=DummyObject): + _backends = ["torch", "transformers"] + + def __init__(self, *args, **kwargs): + requires_backends(self, ["torch", "transformers"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["torch", "transformers"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + requires_backends(cls, ["torch", "transformers"]) + + +class StableDiffusionInpaintPipeline(metaclass=DummyObject): + _backends = ["torch", "transformers"] + + def __init__(self, *args, **kwargs): + requires_backends(self, ["torch", "transformers"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["torch", "transformers"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + requires_backends(cls, ["torch", "transformers"]) + + +class StableDiffusionInpaintPipelineLegacy(metaclass=DummyObject): + _backends = ["torch", "transformers"] + + def __init__(self, *args, **kwargs): + requires_backends(self, ["torch", "transformers"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["torch", "transformers"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + requires_backends(cls, ["torch", "transformers"]) + + +class StableDiffusionInstructPix2PixPipeline(metaclass=DummyObject): + _backends = ["torch", "transformers"] + + def __init__(self, *args, **kwargs): + requires_backends(self, ["torch", "transformers"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["torch", "transformers"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + requires_backends(cls, ["torch", "transformers"]) + + +class StableDiffusionLatentUpscalePipeline(metaclass=DummyObject): + _backends = ["torch", "transformers"] + + def __init__(self, *args, **kwargs): + requires_backends(self, ["torch", "transformers"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["torch", "transformers"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + requires_backends(cls, ["torch", "transformers"]) + + +class StableDiffusionLDM3DPipeline(metaclass=DummyObject): + _backends = ["torch", "transformers"] + + def __init__(self, *args, **kwargs): + requires_backends(self, ["torch", "transformers"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["torch", "transformers"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + requires_backends(cls, ["torch", "transformers"]) + + 
+class StableDiffusionModelEditingPipeline(metaclass=DummyObject): + _backends = ["torch", "transformers"] + + def __init__(self, *args, **kwargs): + requires_backends(self, ["torch", "transformers"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["torch", "transformers"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + requires_backends(cls, ["torch", "transformers"]) + + +class StableDiffusionPanoramaPipeline(metaclass=DummyObject): + _backends = ["torch", "transformers"] + + def __init__(self, *args, **kwargs): + requires_backends(self, ["torch", "transformers"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["torch", "transformers"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + requires_backends(cls, ["torch", "transformers"]) + + +class StableDiffusionParadigmsPipeline(metaclass=DummyObject): + _backends = ["torch", "transformers"] + + def __init__(self, *args, **kwargs): + requires_backends(self, ["torch", "transformers"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["torch", "transformers"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + requires_backends(cls, ["torch", "transformers"]) + + +class StableDiffusionPipeline(metaclass=DummyObject): + _backends = ["torch", "transformers"] + + def __init__(self, *args, **kwargs): + requires_backends(self, ["torch", "transformers"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["torch", "transformers"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + requires_backends(cls, ["torch", "transformers"]) + + +class StableDiffusionPipelineSafe(metaclass=DummyObject): + _backends = ["torch", "transformers"] + + def __init__(self, *args, **kwargs): + requires_backends(self, ["torch", "transformers"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["torch", "transformers"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + requires_backends(cls, ["torch", "transformers"]) + + +class StableDiffusionPix2PixZeroPipeline(metaclass=DummyObject): + _backends = ["torch", "transformers"] + + def __init__(self, *args, **kwargs): + requires_backends(self, ["torch", "transformers"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["torch", "transformers"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + requires_backends(cls, ["torch", "transformers"]) + + +class StableDiffusionSAGPipeline(metaclass=DummyObject): + _backends = ["torch", "transformers"] + + def __init__(self, *args, **kwargs): + requires_backends(self, ["torch", "transformers"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["torch", "transformers"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + requires_backends(cls, ["torch", "transformers"]) + + +class StableDiffusionUpscalePipeline(metaclass=DummyObject): + _backends = ["torch", "transformers"] + + def __init__(self, *args, **kwargs): + requires_backends(self, ["torch", "transformers"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["torch", "transformers"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + requires_backends(cls, ["torch", "transformers"]) + + +class StableUnCLIPImg2ImgPipeline(metaclass=DummyObject): + _backends = ["torch", "transformers"] + + def __init__(self, *args, **kwargs): + 
requires_backends(self, ["torch", "transformers"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["torch", "transformers"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + requires_backends(cls, ["torch", "transformers"]) + + +class StableUnCLIPPipeline(metaclass=DummyObject): + _backends = ["torch", "transformers"] + + def __init__(self, *args, **kwargs): + requires_backends(self, ["torch", "transformers"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["torch", "transformers"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + requires_backends(cls, ["torch", "transformers"]) + + +class TextToVideoSDPipeline(metaclass=DummyObject): + _backends = ["torch", "transformers"] + + def __init__(self, *args, **kwargs): + requires_backends(self, ["torch", "transformers"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["torch", "transformers"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + requires_backends(cls, ["torch", "transformers"]) + + +class TextToVideoZeroPipeline(metaclass=DummyObject): + _backends = ["torch", "transformers"] + + def __init__(self, *args, **kwargs): + requires_backends(self, ["torch", "transformers"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["torch", "transformers"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + requires_backends(cls, ["torch", "transformers"]) + + +class UnCLIPImageVariationPipeline(metaclass=DummyObject): + _backends = ["torch", "transformers"] + + def __init__(self, *args, **kwargs): + requires_backends(self, ["torch", "transformers"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["torch", "transformers"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + requires_backends(cls, ["torch", "transformers"]) + + +class UnCLIPPipeline(metaclass=DummyObject): + _backends = ["torch", "transformers"] + + def __init__(self, *args, **kwargs): + requires_backends(self, ["torch", "transformers"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["torch", "transformers"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + requires_backends(cls, ["torch", "transformers"]) + + +class UniDiffuserModel(metaclass=DummyObject): + _backends = ["torch", "transformers"] + + def __init__(self, *args, **kwargs): + requires_backends(self, ["torch", "transformers"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["torch", "transformers"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + requires_backends(cls, ["torch", "transformers"]) + + +class UniDiffuserPipeline(metaclass=DummyObject): + _backends = ["torch", "transformers"] + + def __init__(self, *args, **kwargs): + requires_backends(self, ["torch", "transformers"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["torch", "transformers"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + requires_backends(cls, ["torch", "transformers"]) + + +class UniDiffuserTextDecoder(metaclass=DummyObject): + _backends = ["torch", "transformers"] + + def __init__(self, *args, **kwargs): + requires_backends(self, ["torch", "transformers"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["torch", "transformers"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + 
requires_backends(cls, ["torch", "transformers"]) + + +class VersatileDiffusionDualGuidedPipeline(metaclass=DummyObject): + _backends = ["torch", "transformers"] + + def __init__(self, *args, **kwargs): + requires_backends(self, ["torch", "transformers"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["torch", "transformers"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + requires_backends(cls, ["torch", "transformers"]) + + +class VersatileDiffusionImageVariationPipeline(metaclass=DummyObject): + _backends = ["torch", "transformers"] + + def __init__(self, *args, **kwargs): + requires_backends(self, ["torch", "transformers"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["torch", "transformers"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + requires_backends(cls, ["torch", "transformers"]) + + +class VersatileDiffusionPipeline(metaclass=DummyObject): + _backends = ["torch", "transformers"] + + def __init__(self, *args, **kwargs): + requires_backends(self, ["torch", "transformers"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["torch", "transformers"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + requires_backends(cls, ["torch", "transformers"]) + + +class VersatileDiffusionTextToImagePipeline(metaclass=DummyObject): + _backends = ["torch", "transformers"] + + def __init__(self, *args, **kwargs): + requires_backends(self, ["torch", "transformers"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["torch", "transformers"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + requires_backends(cls, ["torch", "transformers"]) + + +class VideoToVideoSDPipeline(metaclass=DummyObject): + _backends = ["torch", "transformers"] + + def __init__(self, *args, **kwargs): + requires_backends(self, ["torch", "transformers"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["torch", "transformers"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + requires_backends(cls, ["torch", "transformers"]) + + +class VQDiffusionPipeline(metaclass=DummyObject): + _backends = ["torch", "transformers"] + + def __init__(self, *args, **kwargs): + requires_backends(self, ["torch", "transformers"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["torch", "transformers"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + requires_backends(cls, ["torch", "transformers"]) diff --git a/gf/sd-webui-rich-text/diffusers_official/utils/dummy_transformers_and_torch_and_note_seq_objects.py b/gf/sd-webui-rich-text/diffusers_official/utils/dummy_transformers_and_torch_and_note_seq_objects.py new file mode 100644 index 0000000000000000000000000000000000000000..fbde04e33f0abd86d12f3dee048a4f0585c9f19d --- /dev/null +++ b/gf/sd-webui-rich-text/diffusers_official/utils/dummy_transformers_and_torch_and_note_seq_objects.py @@ -0,0 +1,17 @@ +# This file is autogenerated by the command `make fix-copies`, do not edit. 
+from ..utils import DummyObject, requires_backends + + +class SpectrogramDiffusionPipeline(metaclass=DummyObject): + _backends = ["transformers", "torch", "note_seq"] + + def __init__(self, *args, **kwargs): + requires_backends(self, ["transformers", "torch", "note_seq"]) + + @classmethod + def from_config(cls, *args, **kwargs): + requires_backends(cls, ["transformers", "torch", "note_seq"]) + + @classmethod + def from_pretrained(cls, *args, **kwargs): + requires_backends(cls, ["transformers", "torch", "note_seq"]) diff --git a/gf/sd-webui-rich-text/diffusers_official/utils/dynamic_modules_utils.py b/gf/sd-webui-rich-text/diffusers_official/utils/dynamic_modules_utils.py new file mode 100644 index 0000000000000000000000000000000000000000..5b0952f0b514cb52e63fdac8a780ddc9482a5b9d --- /dev/null +++ b/gf/sd-webui-rich-text/diffusers_official/utils/dynamic_modules_utils.py @@ -0,0 +1,456 @@ +# coding=utf-8 +# Copyright 2023 The HuggingFace Inc. team. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +"""Utilities to dynamically load objects from the Hub.""" + +import importlib +import inspect +import json +import os +import re +import shutil +import sys +from pathlib import Path +from typing import Dict, Optional, Union +from urllib import request + +from huggingface_hub import HfFolder, cached_download, hf_hub_download, model_info +from packaging import version + +from .. import __version__ +from . import DIFFUSERS_DYNAMIC_MODULE_NAME, HF_MODULES_CACHE, logging + + +COMMUNITY_PIPELINES_URL = ( + "https://raw.githubusercontent.com/huggingface/diffusers/{revision}/examples/community/{pipeline}.py" +) + + +logger = logging.get_logger(__name__) # pylint: disable=invalid-name + + +def get_diffusers_versions(): + url = "https://pypi.org/pypi/diffusers/json" + releases = json.loads(request.urlopen(url).read())["releases"].keys() + return sorted(releases, key=lambda x: version.Version(x)) + + +def init_hf_modules(): + """ + Creates the cache directory for modules with an init, and adds it to the Python path. + """ + # This function has already been executed if HF_MODULES_CACHE already is in the Python path. + if HF_MODULES_CACHE in sys.path: + return + + sys.path.append(HF_MODULES_CACHE) + os.makedirs(HF_MODULES_CACHE, exist_ok=True) + init_path = Path(HF_MODULES_CACHE) / "__init__.py" + if not init_path.exists(): + init_path.touch() + + +def create_dynamic_module(name: Union[str, os.PathLike]): + """ + Creates a dynamic module in the cache directory for modules. + """ + init_hf_modules() + dynamic_module_path = Path(HF_MODULES_CACHE) / name + # If the parent module does not exist yet, recursively create it. + if not dynamic_module_path.parent.exists(): + create_dynamic_module(dynamic_module_path.parent) + os.makedirs(dynamic_module_path, exist_ok=True) + init_path = dynamic_module_path / "__init__.py" + if not init_path.exists(): + init_path.touch() + + +def get_relative_imports(module_file): + """ + Get the list of modules that are relatively imported in a module file. 
+ + Args: + module_file (`str` or `os.PathLike`): The module file to inspect. + """ + with open(module_file, "r", encoding="utf-8") as f: + content = f.read() + + # Imports of the form `import .xxx` + relative_imports = re.findall("^\s*import\s+\.(\S+)\s*$", content, flags=re.MULTILINE) + # Imports of the form `from .xxx import yyy` + relative_imports += re.findall("^\s*from\s+\.(\S+)\s+import", content, flags=re.MULTILINE) + # Unique-ify + return list(set(relative_imports)) + + +def get_relative_import_files(module_file): + """ + Get the list of all files that are needed for a given module. Note that this function recurses through the relative + imports (if a imports b and b imports c, it will return module files for b and c). + + Args: + module_file (`str` or `os.PathLike`): The module file to inspect. + """ + no_change = False + files_to_check = [module_file] + all_relative_imports = [] + + # Let's recurse through all relative imports + while not no_change: + new_imports = [] + for f in files_to_check: + new_imports.extend(get_relative_imports(f)) + + module_path = Path(module_file).parent + new_import_files = [str(module_path / m) for m in new_imports] + new_import_files = [f for f in new_import_files if f not in all_relative_imports] + files_to_check = [f"{f}.py" for f in new_import_files] + + no_change = len(new_import_files) == 0 + all_relative_imports.extend(files_to_check) + + return all_relative_imports + + +def check_imports(filename): + """ + Check if the current Python environment contains all the libraries that are imported in a file. + """ + with open(filename, "r", encoding="utf-8") as f: + content = f.read() + + # Imports of the form `import xxx` + imports = re.findall("^\s*import\s+(\S+)\s*$", content, flags=re.MULTILINE) + # Imports of the form `from xxx import yyy` + imports += re.findall("^\s*from\s+(\S+)\s+import", content, flags=re.MULTILINE) + # Only keep the top-level module + imports = [imp.split(".")[0] for imp in imports if not imp.startswith(".")] + + # Unique-ify and test we got them all + imports = list(set(imports)) + missing_packages = [] + for imp in imports: + try: + importlib.import_module(imp) + except ImportError: + missing_packages.append(imp) + + if len(missing_packages) > 0: + raise ImportError( + "This modeling file requires the following packages that were not found in your environment: " + f"{', '.join(missing_packages)}. Run `pip install {' '.join(missing_packages)}`" + ) + + return get_relative_imports(filename) + + +def get_class_in_module(class_name, module_path): + """ + Import a module on the cache directory for modules and extract a class from it. + """ + module_path = module_path.replace(os.path.sep, ".") + module = importlib.import_module(module_path) + + if class_name is None: + return find_pipeline_class(module) + return getattr(module, class_name) + + +def find_pipeline_class(loaded_module): + """ + Retrieve pipeline class that inherits from `DiffusionPipeline`. Note that there has to be exactly one class + inheriting from `DiffusionPipeline`. 
+ """ + from ..pipelines import DiffusionPipeline + + cls_members = dict(inspect.getmembers(loaded_module, inspect.isclass)) + + pipeline_class = None + for cls_name, cls in cls_members.items(): + if ( + cls_name != DiffusionPipeline.__name__ + and issubclass(cls, DiffusionPipeline) + and cls.__module__.split(".")[0] != "diffusers" + ): + if pipeline_class is not None: + raise ValueError( + f"Multiple classes that inherit from {DiffusionPipeline.__name__} have been found:" + f" {pipeline_class.__name__}, and {cls_name}. Please make sure to define only one in" + f" {loaded_module}." + ) + pipeline_class = cls + + return pipeline_class + + +def get_cached_module_file( + pretrained_model_name_or_path: Union[str, os.PathLike], + module_file: str, + cache_dir: Optional[Union[str, os.PathLike]] = None, + force_download: bool = False, + resume_download: bool = False, + proxies: Optional[Dict[str, str]] = None, + use_auth_token: Optional[Union[bool, str]] = None, + revision: Optional[str] = None, + local_files_only: bool = False, +): + """ + Prepares Downloads a module from a local folder or a distant repo and returns its path inside the cached + Transformers module. + + Args: + pretrained_model_name_or_path (`str` or `os.PathLike`): + This can be either: + + - a string, the *model id* of a pretrained model configuration hosted inside a model repo on + huggingface.co. Valid model ids can be located at the root-level, like `bert-base-uncased`, or namespaced + under a user or organization name, like `dbmdz/bert-base-german-cased`. + - a path to a *directory* containing a configuration file saved using the + [`~PreTrainedTokenizer.save_pretrained`] method, e.g., `./my_model_directory/`. + + module_file (`str`): + The name of the module file containing the class to look for. + cache_dir (`str` or `os.PathLike`, *optional*): + Path to a directory in which a downloaded pretrained model configuration should be cached if the standard + cache should not be used. + force_download (`bool`, *optional*, defaults to `False`): + Whether or not to force to (re-)download the configuration files and override the cached versions if they + exist. + resume_download (`bool`, *optional*, defaults to `False`): + Whether or not to delete incompletely received file. Attempts to resume the download if such a file exists. + proxies (`Dict[str, str]`, *optional*): + A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', + 'http://hostname': 'foo.bar:4012'}.` The proxies are used on each request. + use_auth_token (`str` or *bool*, *optional*): + The token to use as HTTP bearer authorization for remote files. If `True`, will use the token generated + when running `transformers-cli login` (stored in `~/.huggingface`). + revision (`str`, *optional*, defaults to `"main"`): + The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a + git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any + identifier allowed by git. + local_files_only (`bool`, *optional*, defaults to `False`): + If `True`, will only try to load the tokenizer configuration from local files. + + + + You may pass a token in `use_auth_token` if you are not logged in (`huggingface-cli long`) and want to use private + or [gated models](https://huggingface.co/docs/hub/models-gated#gated-models). + + + + Returns: + `str`: The path to the module inside the cache. 
+ """ + # Download and cache module_file from the repo `pretrained_model_name_or_path` of grab it if it's a local file. + pretrained_model_name_or_path = str(pretrained_model_name_or_path) + + module_file_or_url = os.path.join(pretrained_model_name_or_path, module_file) + + if os.path.isfile(module_file_or_url): + resolved_module_file = module_file_or_url + submodule = "local" + elif pretrained_model_name_or_path.count("/") == 0: + available_versions = get_diffusers_versions() + # cut ".dev0" + latest_version = "v" + ".".join(__version__.split(".")[:3]) + + # retrieve github version that matches + if revision is None: + revision = latest_version if latest_version[1:] in available_versions else "main" + logger.info(f"Defaulting to latest_version: {revision}.") + elif revision in available_versions: + revision = f"v{revision}" + elif revision == "main": + revision = revision + else: + raise ValueError( + f"`custom_revision`: {revision} does not exist. Please make sure to choose one of" + f" {', '.join(available_versions + ['main'])}." + ) + + # community pipeline on GitHub + github_url = COMMUNITY_PIPELINES_URL.format(revision=revision, pipeline=pretrained_model_name_or_path) + try: + resolved_module_file = cached_download( + github_url, + cache_dir=cache_dir, + force_download=force_download, + proxies=proxies, + resume_download=resume_download, + local_files_only=local_files_only, + use_auth_token=False, + ) + submodule = "git" + module_file = pretrained_model_name_or_path + ".py" + except EnvironmentError: + logger.error(f"Could not locate the {module_file} inside {pretrained_model_name_or_path}.") + raise + else: + try: + # Load from URL or cache if already cached + resolved_module_file = hf_hub_download( + pretrained_model_name_or_path, + module_file, + cache_dir=cache_dir, + force_download=force_download, + proxies=proxies, + resume_download=resume_download, + local_files_only=local_files_only, + use_auth_token=use_auth_token, + ) + submodule = os.path.join("local", "--".join(pretrained_model_name_or_path.split("/"))) + except EnvironmentError: + logger.error(f"Could not locate the {module_file} inside {pretrained_model_name_or_path}.") + raise + + # Check we have all the requirements in our environment + modules_needed = check_imports(resolved_module_file) + + # Now we move the module inside our cached dynamic modules. + full_submodule = DIFFUSERS_DYNAMIC_MODULE_NAME + os.path.sep + submodule + create_dynamic_module(full_submodule) + submodule_path = Path(HF_MODULES_CACHE) / full_submodule + if submodule == "local" or submodule == "git": + # We always copy local files (we could hash the file to see if there was a change, and give them the name of + # that hash, to only copy when there is a modification but it seems overkill for now). + # The only reason we do the copy is to avoid putting too many folders in sys.path. + shutil.copy(resolved_module_file, submodule_path / module_file) + for module_needed in modules_needed: + module_needed = f"{module_needed}.py" + shutil.copy(os.path.join(pretrained_model_name_or_path, module_needed), submodule_path / module_needed) + else: + # Get the commit hash + # TODO: we will get this info in the etag soon, so retrieve it from there and not here. 
+ if isinstance(use_auth_token, str): + token = use_auth_token + elif use_auth_token is True: + token = HfFolder.get_token() + else: + token = None + + commit_hash = model_info(pretrained_model_name_or_path, revision=revision, token=token).sha + + # The module file will end up being placed in a subfolder with the git hash of the repo. This way we get the + # benefit of versioning. + submodule_path = submodule_path / commit_hash + full_submodule = full_submodule + os.path.sep + commit_hash + create_dynamic_module(full_submodule) + + if not (submodule_path / module_file).exists(): + shutil.copy(resolved_module_file, submodule_path / module_file) + # Make sure we also have every file with relative + for module_needed in modules_needed: + if not (submodule_path / module_needed).exists(): + get_cached_module_file( + pretrained_model_name_or_path, + f"{module_needed}.py", + cache_dir=cache_dir, + force_download=force_download, + resume_download=resume_download, + proxies=proxies, + use_auth_token=use_auth_token, + revision=revision, + local_files_only=local_files_only, + ) + return os.path.join(full_submodule, module_file) + + +def get_class_from_dynamic_module( + pretrained_model_name_or_path: Union[str, os.PathLike], + module_file: str, + class_name: Optional[str] = None, + cache_dir: Optional[Union[str, os.PathLike]] = None, + force_download: bool = False, + resume_download: bool = False, + proxies: Optional[Dict[str, str]] = None, + use_auth_token: Optional[Union[bool, str]] = None, + revision: Optional[str] = None, + local_files_only: bool = False, + **kwargs, +): + """ + Extracts a class from a module file, present in the local folder or repository of a model. + + + + Calling this function will execute the code in the module file found locally or downloaded from the Hub. It should + therefore only be called on trusted repos. + + + + Args: + pretrained_model_name_or_path (`str` or `os.PathLike`): + This can be either: + + - a string, the *model id* of a pretrained model configuration hosted inside a model repo on + huggingface.co. Valid model ids can be located at the root-level, like `bert-base-uncased`, or namespaced + under a user or organization name, like `dbmdz/bert-base-german-cased`. + - a path to a *directory* containing a configuration file saved using the + [`~PreTrainedTokenizer.save_pretrained`] method, e.g., `./my_model_directory/`. + + module_file (`str`): + The name of the module file containing the class to look for. + class_name (`str`): + The name of the class to import in the module. + cache_dir (`str` or `os.PathLike`, *optional*): + Path to a directory in which a downloaded pretrained model configuration should be cached if the standard + cache should not be used. + force_download (`bool`, *optional*, defaults to `False`): + Whether or not to force to (re-)download the configuration files and override the cached versions if they + exist. + resume_download (`bool`, *optional*, defaults to `False`): + Whether or not to delete incompletely received file. Attempts to resume the download if such a file exists. + proxies (`Dict[str, str]`, *optional*): + A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', + 'http://hostname': 'foo.bar:4012'}.` The proxies are used on each request. + use_auth_token (`str` or `bool`, *optional*): + The token to use as HTTP bearer authorization for remote files. If `True`, will use the token generated + when running `transformers-cli login` (stored in `~/.huggingface`). 
+ revision (`str`, *optional*, defaults to `"main"`): + The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a + git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any + identifier allowed by git. + local_files_only (`bool`, *optional*, defaults to `False`): + If `True`, will only try to load the tokenizer configuration from local files. + + + + You may pass a token in `use_auth_token` if you are not logged in (`huggingface-cli long`) and want to use private + or [gated models](https://huggingface.co/docs/hub/models-gated#gated-models). + + + + Returns: + `type`: The class, dynamically imported from the module. + + Examples: + + ```python + # Download module `modeling.py` from huggingface.co and cache then extract the class `MyBertModel` from this + # module. + cls = get_class_from_dynamic_module("sgugger/my-bert-model", "modeling.py", "MyBertModel") + ```""" + # And lastly we get the class inside our newly created module + final_module = get_cached_module_file( + pretrained_model_name_or_path, + module_file, + cache_dir=cache_dir, + force_download=force_download, + resume_download=resume_download, + proxies=proxies, + use_auth_token=use_auth_token, + revision=revision, + local_files_only=local_files_only, + ) + return get_class_in_module(class_name, final_module.replace(".py", "")) diff --git a/gf/sd-webui-rich-text/diffusers_official/utils/hub_utils.py b/gf/sd-webui-rich-text/diffusers_official/utils/hub_utils.py new file mode 100644 index 0000000000000000000000000000000000000000..4f0cf00a5c5d0d303ba53f62fbf027c0bc31ad49 --- /dev/null +++ b/gf/sd-webui-rich-text/diffusers_official/utils/hub_utils.py @@ -0,0 +1,361 @@ +# coding=utf-8 +# Copyright 2023 The HuggingFace Inc. team. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + + +import os +import re +import sys +import traceback +import warnings +from pathlib import Path +from typing import Dict, Optional, Union +from uuid import uuid4 + +from huggingface_hub import HfFolder, ModelCard, ModelCardData, hf_hub_download, whoami +from huggingface_hub.file_download import REGEX_COMMIT_HASH +from huggingface_hub.utils import ( + EntryNotFoundError, + RepositoryNotFoundError, + RevisionNotFoundError, + is_jinja_available, +) +from packaging import version +from requests import HTTPError + +from .. 
import __version__ +from .constants import ( + DEPRECATED_REVISION_ARGS, + DIFFUSERS_CACHE, + HUGGINGFACE_CO_RESOLVE_ENDPOINT, + SAFETENSORS_WEIGHTS_NAME, + WEIGHTS_NAME, +) +from .import_utils import ( + ENV_VARS_TRUE_VALUES, + _flax_version, + _jax_version, + _onnxruntime_version, + _torch_version, + is_flax_available, + is_onnx_available, + is_torch_available, +) +from .logging import get_logger + + +logger = get_logger(__name__) + + +MODEL_CARD_TEMPLATE_PATH = Path(__file__).parent / "model_card_template.md" +SESSION_ID = uuid4().hex +HF_HUB_OFFLINE = os.getenv("HF_HUB_OFFLINE", "").upper() in ENV_VARS_TRUE_VALUES +DISABLE_TELEMETRY = os.getenv("DISABLE_TELEMETRY", "").upper() in ENV_VARS_TRUE_VALUES +HUGGINGFACE_CO_TELEMETRY = HUGGINGFACE_CO_RESOLVE_ENDPOINT + "/api/telemetry/" + + +def http_user_agent(user_agent: Union[Dict, str, None] = None) -> str: + """ + Formats a user-agent string with basic info about a request. + """ + ua = f"diffusers/{__version__}; python/{sys.version.split()[0]}; session_id/{SESSION_ID}" + if DISABLE_TELEMETRY or HF_HUB_OFFLINE: + return ua + "; telemetry/off" + if is_torch_available(): + ua += f"; torch/{_torch_version}" + if is_flax_available(): + ua += f"; jax/{_jax_version}" + ua += f"; flax/{_flax_version}" + if is_onnx_available(): + ua += f"; onnxruntime/{_onnxruntime_version}" + # CI will set this value to True + if os.environ.get("DIFFUSERS_IS_CI", "").upper() in ENV_VARS_TRUE_VALUES: + ua += "; is_ci/true" + if isinstance(user_agent, dict): + ua += "; " + "; ".join(f"{k}/{v}" for k, v in user_agent.items()) + elif isinstance(user_agent, str): + ua += "; " + user_agent + return ua + + +def get_full_repo_name(model_id: str, organization: Optional[str] = None, token: Optional[str] = None): + if token is None: + token = HfFolder.get_token() + if organization is None: + username = whoami(token)["name"] + return f"{username}/{model_id}" + else: + return f"{organization}/{model_id}" + + +def create_model_card(args, model_name): + if not is_jinja_available(): + raise ValueError( + "Modelcard rendering is based on Jinja templates." + " Please make sure to have `jinja` installed before using `create_model_card`." + " To install it, please run `pip install Jinja2`." 
+ ) + + if hasattr(args, "local_rank") and args.local_rank not in [-1, 0]: + return + + hub_token = args.hub_token if hasattr(args, "hub_token") else None + repo_name = get_full_repo_name(model_name, token=hub_token) + + model_card = ModelCard.from_template( + card_data=ModelCardData( # Card metadata object that will be converted to YAML block + language="en", + license="apache-2.0", + library_name="diffusers", + tags=[], + datasets=args.dataset_name, + metrics=[], + ), + template_path=MODEL_CARD_TEMPLATE_PATH, + model_name=model_name, + repo_name=repo_name, + dataset_name=args.dataset_name if hasattr(args, "dataset_name") else None, + learning_rate=args.learning_rate, + train_batch_size=args.train_batch_size, + eval_batch_size=args.eval_batch_size, + gradient_accumulation_steps=( + args.gradient_accumulation_steps if hasattr(args, "gradient_accumulation_steps") else None + ), + adam_beta1=args.adam_beta1 if hasattr(args, "adam_beta1") else None, + adam_beta2=args.adam_beta2 if hasattr(args, "adam_beta2") else None, + adam_weight_decay=args.adam_weight_decay if hasattr(args, "adam_weight_decay") else None, + adam_epsilon=args.adam_epsilon if hasattr(args, "adam_epsilon") else None, + lr_scheduler=args.lr_scheduler if hasattr(args, "lr_scheduler") else None, + lr_warmup_steps=args.lr_warmup_steps if hasattr(args, "lr_warmup_steps") else None, + ema_inv_gamma=args.ema_inv_gamma if hasattr(args, "ema_inv_gamma") else None, + ema_power=args.ema_power if hasattr(args, "ema_power") else None, + ema_max_decay=args.ema_max_decay if hasattr(args, "ema_max_decay") else None, + mixed_precision=args.mixed_precision, + ) + + card_path = os.path.join(args.output_dir, "README.md") + model_card.save(card_path) + + +def extract_commit_hash(resolved_file: Optional[str], commit_hash: Optional[str] = None): + """ + Extracts the commit hash from a resolved filename toward a cache file. + """ + if resolved_file is None or commit_hash is not None: + return commit_hash + resolved_file = str(Path(resolved_file).as_posix()) + search = re.search(r"snapshots/([^/]+)/", resolved_file) + if search is None: + return None + commit_hash = search.groups()[0] + return commit_hash if REGEX_COMMIT_HASH.match(commit_hash) else None + + +# Old default cache path, potentially to be migrated. +# This logic was more or less taken from `transformers`, with the following differences: +# - Diffusers doesn't use custom environment variables to specify the cache path. +# - There is no need to migrate the cache format, just move the files to the new location. +hf_cache_home = os.path.expanduser( + os.getenv("HF_HOME", os.path.join(os.getenv("XDG_CACHE_HOME", "~/.cache"), "huggingface")) +) +old_diffusers_cache = os.path.join(hf_cache_home, "diffusers") + + +def move_cache(old_cache_dir: Optional[str] = None, new_cache_dir: Optional[str] = None) -> None: + if new_cache_dir is None: + new_cache_dir = DIFFUSERS_CACHE + if old_cache_dir is None: + old_cache_dir = old_diffusers_cache + + old_cache_dir = Path(old_cache_dir).expanduser() + new_cache_dir = Path(new_cache_dir).expanduser() + for old_blob_path in old_cache_dir.glob("**/blobs/*"): + if old_blob_path.is_file() and not old_blob_path.is_symlink(): + new_blob_path = new_cache_dir / old_blob_path.relative_to(old_cache_dir) + new_blob_path.parent.mkdir(parents=True, exist_ok=True) + os.replace(old_blob_path, new_blob_path) + try: + os.symlink(new_blob_path, old_blob_path) + except OSError: + logger.warning( + "Could not create symlink between old cache and new cache. 
If you use an older version of diffusers again, files will be re-downloaded." + ) + # At this point, old_cache_dir contains symlinks to the new cache (it can still be used). + + +cache_version_file = os.path.join(DIFFUSERS_CACHE, "version_diffusers_cache.txt") +if not os.path.isfile(cache_version_file): + cache_version = 0 +else: + with open(cache_version_file) as f: + try: + cache_version = int(f.read()) + except ValueError: + cache_version = 0 + +if cache_version < 1: + old_cache_is_not_empty = os.path.isdir(old_diffusers_cache) and len(os.listdir(old_diffusers_cache)) > 0 + if old_cache_is_not_empty: + logger.warning( + "The cache for model files in Diffusers v0.14.0 has moved to a new location. Moving your " + "existing cached models. This is a one-time operation, you can interrupt it or run it " + "later by calling `diffusers.utils.hub_utils.move_cache()`." + ) + try: + move_cache() + except Exception as e: + trace = "\n".join(traceback.format_tb(e.__traceback__)) + logger.error( + f"There was a problem when trying to move your cache:\n\n{trace}\n{e.__class__.__name__}: {e}\n\nPlease " + "file an issue at https://github.com/huggingface/diffusers/issues/new/choose, copy paste this whole " + "message and we will do our best to help." + ) + +if cache_version < 1: + try: + os.makedirs(DIFFUSERS_CACHE, exist_ok=True) + with open(cache_version_file, "w") as f: + f.write("1") + except Exception: + logger.warning( + f"There was a problem when trying to write in your cache folder ({DIFFUSERS_CACHE}). Please, ensure " + "the directory exists and can be written to." + ) + + +def _add_variant(weights_name: str, variant: Optional[str] = None) -> str: + if variant is not None: + splits = weights_name.split(".") + splits = splits[:-1] + [variant] + splits[-1:] + weights_name = ".".join(splits) + + return weights_name + + +def _get_model_file( + pretrained_model_name_or_path, + *, + weights_name, + subfolder, + cache_dir, + force_download, + proxies, + resume_download, + local_files_only, + use_auth_token, + user_agent, + revision, + commit_hash=None, +): + pretrained_model_name_or_path = str(pretrained_model_name_or_path) + if os.path.isfile(pretrained_model_name_or_path): + return pretrained_model_name_or_path + elif os.path.isdir(pretrained_model_name_or_path): + if os.path.isfile(os.path.join(pretrained_model_name_or_path, weights_name)): + # Load from a PyTorch checkpoint + model_file = os.path.join(pretrained_model_name_or_path, weights_name) + return model_file + elif subfolder is not None and os.path.isfile( + os.path.join(pretrained_model_name_or_path, subfolder, weights_name) + ): + model_file = os.path.join(pretrained_model_name_or_path, subfolder, weights_name) + return model_file + else: + raise EnvironmentError( + f"Error no file named {weights_name} found in directory {pretrained_model_name_or_path}." + ) + else: + # 1. 
First check if deprecated way of loading from branches is used + if ( + revision in DEPRECATED_REVISION_ARGS + and (weights_name == WEIGHTS_NAME or weights_name == SAFETENSORS_WEIGHTS_NAME) + and version.parse(version.parse(__version__).base_version) >= version.parse("0.20.0") + ): + try: + model_file = hf_hub_download( + pretrained_model_name_or_path, + filename=_add_variant(weights_name, revision), + cache_dir=cache_dir, + force_download=force_download, + proxies=proxies, + resume_download=resume_download, + local_files_only=local_files_only, + use_auth_token=use_auth_token, + user_agent=user_agent, + subfolder=subfolder, + revision=revision or commit_hash, + ) + warnings.warn( + f"Loading the variant {revision} from {pretrained_model_name_or_path} via `revision='{revision}'` is deprecated. Loading instead from `revision='main'` with `variant={revision}`. Loading model variants via `revision='{revision}'` will be removed in diffusers v1. Please use `variant='{revision}'` instead.", + FutureWarning, + ) + return model_file + except: # noqa: E722 + warnings.warn( + f"You are loading the variant {revision} from {pretrained_model_name_or_path} via `revision='{revision}'`. This behavior is deprecated and will be removed in diffusers v1. One should use `variant='{revision}'` instead. However, it appears that {pretrained_model_name_or_path} currently does not have a {_add_variant(weights_name, revision)} file in the 'main' branch of {pretrained_model_name_or_path}. \n The Diffusers team and community would be very grateful if you could open an issue: https://github.com/huggingface/diffusers/issues/new with the title '{pretrained_model_name_or_path} is missing {_add_variant(weights_name, revision)}' so that the correct variant file can be added.", + FutureWarning, + ) + try: + # 2. Load model file as usual + model_file = hf_hub_download( + pretrained_model_name_or_path, + filename=weights_name, + cache_dir=cache_dir, + force_download=force_download, + proxies=proxies, + resume_download=resume_download, + local_files_only=local_files_only, + use_auth_token=use_auth_token, + user_agent=user_agent, + subfolder=subfolder, + revision=revision or commit_hash, + ) + return model_file + + except RepositoryNotFoundError: + raise EnvironmentError( + f"{pretrained_model_name_or_path} is not a local folder and is not a valid model identifier " + "listed on 'https://huggingface.co/models'\nIf this is a private repository, make sure to pass a " + "token having permission to this repo with `use_auth_token` or log in with `huggingface-cli " + "login`." + ) + except RevisionNotFoundError: + raise EnvironmentError( + f"{revision} is not a valid git identifier (branch name, tag name or commit id) that exists for " + "this model name. Check the model page at " + f"'https://huggingface.co/{pretrained_model_name_or_path}' for available revisions." + ) + except EntryNotFoundError: + raise EnvironmentError( + f"{pretrained_model_name_or_path} does not appear to have a file named {weights_name}." 
+ ) + except HTTPError as err: + raise EnvironmentError( + f"There was a specific connection error when trying to load {pretrained_model_name_or_path}:\n{err}" + ) + except ValueError: + raise EnvironmentError( + f"We couldn't connect to '{HUGGINGFACE_CO_RESOLVE_ENDPOINT}' to load this model, couldn't find it" + f" in the cached files and it looks like {pretrained_model_name_or_path} is not the path to a" + f" directory containing a file named {weights_name} or" + " \nCheckout your internet connection or see how to run the library in" + " offline mode at 'https://huggingface.co/docs/diffusers/installation#offline-mode'." + ) + except EnvironmentError: + raise EnvironmentError( + f"Can't load the model for '{pretrained_model_name_or_path}'. If you were trying to load it from " + "'https://huggingface.co/models', make sure you don't have a local directory with the same name. " + f"Otherwise, make sure '{pretrained_model_name_or_path}' is the correct path to a directory " + f"containing a file named {weights_name}" + ) diff --git a/gf/sd-webui-rich-text/diffusers_official/utils/import_utils.py b/gf/sd-webui-rich-text/diffusers_official/utils/import_utils.py new file mode 100644 index 0000000000000000000000000000000000000000..3a7539cfb0fb4a99ea7da3b8e88619a7640257ca --- /dev/null +++ b/gf/sd-webui-rich-text/diffusers_official/utils/import_utils.py @@ -0,0 +1,655 @@ +# Copyright 2023 The HuggingFace Team. All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +""" +Import utilities: Utilities related to imports and our lazy inits. +""" +import importlib.util +import operator as op +import os +import sys +from collections import OrderedDict +from typing import Union + +from huggingface_hub.utils import is_jinja_available # noqa: F401 +from packaging import version +from packaging.version import Version, parse + +from . import logging + + +# The package importlib_metadata is in a different place, depending on the python version. 
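The file below applies one detection idiom throughout: `importlib.util.find_spec` to check that a package is importable, then `importlib_metadata.version` (imported via the conditional just below) to read its installed version, marking the backend unavailable on `PackageNotFoundError`. A minimal, self-contained sketch of that idiom, assuming Python ≥ 3.8 and using `numpy` purely as an arbitrary example package (it is not one of the backends the module itself checks):

```python
import importlib.util
import importlib.metadata as importlib_metadata  # stdlib location on Python >= 3.8

# Same pattern as the backend checks below, applied to an arbitrary package.
_numpy_available = importlib.util.find_spec("numpy") is not None
_numpy_version = "N/A"
if _numpy_available:
    try:
        _numpy_version = importlib_metadata.version("numpy")
    except importlib_metadata.PackageNotFoundError:
        _numpy_available = False

print(f"numpy available: {_numpy_available}, version: {_numpy_version}")
```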
+if sys.version_info < (3, 8): + import importlib_metadata +else: + import importlib.metadata as importlib_metadata + + +logger = logging.get_logger(__name__) # pylint: disable=invalid-name + +ENV_VARS_TRUE_VALUES = {"1", "ON", "YES", "TRUE"} +ENV_VARS_TRUE_AND_AUTO_VALUES = ENV_VARS_TRUE_VALUES.union({"AUTO"}) + +USE_TF = os.environ.get("USE_TF", "AUTO").upper() +USE_TORCH = os.environ.get("USE_TORCH", "AUTO").upper() +USE_JAX = os.environ.get("USE_FLAX", "AUTO").upper() +USE_SAFETENSORS = os.environ.get("USE_SAFETENSORS", "AUTO").upper() + +STR_OPERATION_TO_FUNC = {">": op.gt, ">=": op.ge, "==": op.eq, "!=": op.ne, "<=": op.le, "<": op.lt} + +_torch_version = "N/A" +if USE_TORCH in ENV_VARS_TRUE_AND_AUTO_VALUES and USE_TF not in ENV_VARS_TRUE_VALUES: + _torch_available = importlib.util.find_spec("torch") is not None + if _torch_available: + try: + _torch_version = importlib_metadata.version("torch") + logger.info(f"PyTorch version {_torch_version} available.") + except importlib_metadata.PackageNotFoundError: + _torch_available = False +else: + logger.info("Disabling PyTorch because USE_TORCH is set") + _torch_available = False + + +_tf_version = "N/A" +if USE_TF in ENV_VARS_TRUE_AND_AUTO_VALUES and USE_TORCH not in ENV_VARS_TRUE_VALUES: + _tf_available = importlib.util.find_spec("tensorflow") is not None + if _tf_available: + candidates = ( + "tensorflow", + "tensorflow-cpu", + "tensorflow-gpu", + "tf-nightly", + "tf-nightly-cpu", + "tf-nightly-gpu", + "intel-tensorflow", + "intel-tensorflow-avx512", + "tensorflow-rocm", + "tensorflow-macos", + "tensorflow-aarch64", + ) + _tf_version = None + # For the metadata, we have to look for both tensorflow and tensorflow-cpu + for pkg in candidates: + try: + _tf_version = importlib_metadata.version(pkg) + break + except importlib_metadata.PackageNotFoundError: + pass + _tf_available = _tf_version is not None + if _tf_available: + if version.parse(_tf_version) < version.parse("2"): + logger.info(f"TensorFlow found but with version {_tf_version}. 
Diffusers requires version 2 minimum.") + _tf_available = False + else: + logger.info(f"TensorFlow version {_tf_version} available.") +else: + logger.info("Disabling Tensorflow because USE_TORCH is set") + _tf_available = False + +_jax_version = "N/A" +_flax_version = "N/A" +if USE_JAX in ENV_VARS_TRUE_AND_AUTO_VALUES: + _flax_available = importlib.util.find_spec("jax") is not None and importlib.util.find_spec("flax") is not None + if _flax_available: + try: + _jax_version = importlib_metadata.version("jax") + _flax_version = importlib_metadata.version("flax") + logger.info(f"JAX version {_jax_version}, Flax version {_flax_version} available.") + except importlib_metadata.PackageNotFoundError: + _flax_available = False +else: + _flax_available = False + +if USE_SAFETENSORS in ENV_VARS_TRUE_AND_AUTO_VALUES: + _safetensors_available = importlib.util.find_spec("safetensors") is not None + if _safetensors_available: + try: + _safetensors_version = importlib_metadata.version("safetensors") + logger.info(f"Safetensors version {_safetensors_version} available.") + except importlib_metadata.PackageNotFoundError: + _safetensors_available = False +else: + logger.info("Disabling Safetensors because USE_TF is set") + _safetensors_available = False + +_transformers_available = importlib.util.find_spec("transformers") is not None +try: + _transformers_version = importlib_metadata.version("transformers") + logger.debug(f"Successfully imported transformers version {_transformers_version}") +except importlib_metadata.PackageNotFoundError: + _transformers_available = False + + +_inflect_available = importlib.util.find_spec("inflect") is not None +try: + _inflect_version = importlib_metadata.version("inflect") + logger.debug(f"Successfully imported inflect version {_inflect_version}") +except importlib_metadata.PackageNotFoundError: + _inflect_available = False + + +_unidecode_available = importlib.util.find_spec("unidecode") is not None +try: + _unidecode_version = importlib_metadata.version("unidecode") + logger.debug(f"Successfully imported unidecode version {_unidecode_version}") +except importlib_metadata.PackageNotFoundError: + _unidecode_available = False + + +_onnxruntime_version = "N/A" +_onnx_available = importlib.util.find_spec("onnxruntime") is not None +if _onnx_available: + candidates = ( + "onnxruntime", + "onnxruntime-gpu", + "ort_nightly_gpu", + "onnxruntime-directml", + "onnxruntime-openvino", + "ort_nightly_directml", + "onnxruntime-rocm", + "onnxruntime-training", + ) + _onnxruntime_version = None + # For the metadata, we have to look for both onnxruntime and onnxruntime-gpu + for pkg in candidates: + try: + _onnxruntime_version = importlib_metadata.version(pkg) + break + except importlib_metadata.PackageNotFoundError: + pass + _onnx_available = _onnxruntime_version is not None + if _onnx_available: + logger.debug(f"Successfully imported onnxruntime version {_onnxruntime_version}") + +# (sayakpaul): importlib.util.find_spec("opencv-python") returns None even when it's installed. 
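As the comment above notes, some libraries are distributed under names that differ from their import name (cv2 ships as `opencv-python`, `opencv-python-headless`, and so on), so a single `find_spec` call is not enough; the module instead tries each known distribution name in turn, as it already does for TensorFlow and onnxruntime above. A rough sketch of that lookup, where `first_installed_version` is a made-up helper name used only for illustration:

```python
import importlib.metadata as importlib_metadata

def first_installed_version(candidates):
    """Return the version of the first installed distribution in `candidates`, or None."""
    for pkg in candidates:
        try:
            return importlib_metadata.version(pkg)
        except importlib_metadata.PackageNotFoundError:
            continue
    return None

# cv2 can come from any of these distributions; the first hit wins.
print(first_installed_version(("opencv-python", "opencv-contrib-python", "opencv-python-headless")))
```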
+# _opencv_available = importlib.util.find_spec("opencv-python") is not None +try: + candidates = ( + "opencv-python", + "opencv-contrib-python", + "opencv-python-headless", + "opencv-contrib-python-headless", + ) + _opencv_version = None + for pkg in candidates: + try: + _opencv_version = importlib_metadata.version(pkg) + break + except importlib_metadata.PackageNotFoundError: + pass + _opencv_available = _opencv_version is not None + if _opencv_available: + logger.debug(f"Successfully imported cv2 version {_opencv_version}") +except importlib_metadata.PackageNotFoundError: + _opencv_available = False + +_scipy_available = importlib.util.find_spec("scipy") is not None +try: + _scipy_version = importlib_metadata.version("scipy") + logger.debug(f"Successfully imported scipy version {_scipy_version}") +except importlib_metadata.PackageNotFoundError: + _scipy_available = False + +_librosa_available = importlib.util.find_spec("librosa") is not None +try: + _librosa_version = importlib_metadata.version("librosa") + logger.debug(f"Successfully imported librosa version {_librosa_version}") +except importlib_metadata.PackageNotFoundError: + _librosa_available = False + +_accelerate_available = importlib.util.find_spec("accelerate") is not None +try: + _accelerate_version = importlib_metadata.version("accelerate") + logger.debug(f"Successfully imported accelerate version {_accelerate_version}") +except importlib_metadata.PackageNotFoundError: + _accelerate_available = False + +_xformers_available = importlib.util.find_spec("xformers") is not None +try: + _xformers_version = importlib_metadata.version("xformers") + if _torch_available: + import torch + + if version.Version(torch.__version__) < version.Version("1.12"): + raise ValueError("PyTorch should be >= 1.12") + logger.debug(f"Successfully imported xformers version {_xformers_version}") +except importlib_metadata.PackageNotFoundError: + _xformers_available = False + +_k_diffusion_available = importlib.util.find_spec("k_diffusion") is not None +try: + _k_diffusion_version = importlib_metadata.version("k_diffusion") + logger.debug(f"Successfully imported k-diffusion version {_k_diffusion_version}") +except importlib_metadata.PackageNotFoundError: + _k_diffusion_available = False + +_note_seq_available = importlib.util.find_spec("note_seq") is not None +try: + _note_seq_version = importlib_metadata.version("note_seq") + logger.debug(f"Successfully imported note-seq version {_note_seq_version}") +except importlib_metadata.PackageNotFoundError: + _note_seq_available = False + +_wandb_available = importlib.util.find_spec("wandb") is not None +try: + _wandb_version = importlib_metadata.version("wandb") + logger.debug(f"Successfully imported wandb version {_wandb_version }") +except importlib_metadata.PackageNotFoundError: + _wandb_available = False + +_omegaconf_available = importlib.util.find_spec("omegaconf") is not None +try: + _omegaconf_version = importlib_metadata.version("omegaconf") + logger.debug(f"Successfully imported omegaconf version {_omegaconf_version}") +except importlib_metadata.PackageNotFoundError: + _omegaconf_available = False + +_tensorboard_available = importlib.util.find_spec("tensorboard") +try: + _tensorboard_version = importlib_metadata.version("tensorboard") + logger.debug(f"Successfully imported tensorboard version {_tensorboard_version}") +except importlib_metadata.PackageNotFoundError: + _tensorboard_available = False + + +_compel_available = importlib.util.find_spec("compel") +try: + _compel_version = 
importlib_metadata.version("compel") + logger.debug(f"Successfully imported compel version {_compel_version}") +except importlib_metadata.PackageNotFoundError: + _compel_available = False + + +_ftfy_available = importlib.util.find_spec("ftfy") is not None +try: + _ftfy_version = importlib_metadata.version("ftfy") + logger.debug(f"Successfully imported ftfy version {_ftfy_version}") +except importlib_metadata.PackageNotFoundError: + _ftfy_available = False + + +_bs4_available = importlib.util.find_spec("bs4") is not None +try: + # importlib metadata under different name + _bs4_version = importlib_metadata.version("beautifulsoup4") + logger.debug(f"Successfully imported ftfy version {_bs4_version}") +except importlib_metadata.PackageNotFoundError: + _bs4_available = False + +_torchsde_available = importlib.util.find_spec("torchsde") is not None +try: + _torchsde_version = importlib_metadata.version("torchsde") + logger.debug(f"Successfully imported torchsde version {_torchsde_version}") +except importlib_metadata.PackageNotFoundError: + _torchsde_available = False + +_invisible_watermark_available = importlib.util.find_spec("imwatermark") is not None +try: + _invisible_watermark_version = importlib_metadata.version("invisible-watermark") + logger.debug(f"Successfully imported invisible-watermark version {_invisible_watermark_version}") +except importlib_metadata.PackageNotFoundError: + _invisible_watermark_available = False + + +def is_torch_available(): + return _torch_available + + +def is_safetensors_available(): + return _safetensors_available + + +def is_tf_available(): + return _tf_available + + +def is_flax_available(): + return _flax_available + + +def is_transformers_available(): + return _transformers_available + + +def is_inflect_available(): + return _inflect_available + + +def is_unidecode_available(): + return _unidecode_available + + +def is_onnx_available(): + return _onnx_available + + +def is_opencv_available(): + return _opencv_available + + +def is_scipy_available(): + return _scipy_available + + +def is_librosa_available(): + return _librosa_available + + +def is_xformers_available(): + return _xformers_available + + +def is_accelerate_available(): + return _accelerate_available + + +def is_k_diffusion_available(): + return _k_diffusion_available + + +def is_note_seq_available(): + return _note_seq_available + + +def is_wandb_available(): + return _wandb_available + + +def is_omegaconf_available(): + return _omegaconf_available + + +def is_tensorboard_available(): + return _tensorboard_available + + +def is_compel_available(): + return _compel_available + + +def is_ftfy_available(): + return _ftfy_available + + +def is_bs4_available(): + return _bs4_available + + +def is_torchsde_available(): + return _torchsde_available + + +def is_invisible_watermark_available(): + return _invisible_watermark_available + + +# docstyle-ignore +FLAX_IMPORT_ERROR = """ +{0} requires the FLAX library but it was not found in your environment. Checkout the instructions on the +installation page: https://github.com/google/flax and follow the ones that match your environment. +""" + +# docstyle-ignore +INFLECT_IMPORT_ERROR = """ +{0} requires the inflect library but it was not found in your environment. You can install it with pip: `pip install +inflect` +""" + +# docstyle-ignore +PYTORCH_IMPORT_ERROR = """ +{0} requires the PyTorch library but it was not found in your environment. 
Checkout the instructions on the +installation page: https://pytorch.org/get-started/locally/ and follow the ones that match your environment. +""" + +# docstyle-ignore +ONNX_IMPORT_ERROR = """ +{0} requires the onnxruntime library but it was not found in your environment. You can install it with pip: `pip +install onnxruntime` +""" + +# docstyle-ignore +OPENCV_IMPORT_ERROR = """ +{0} requires the OpenCV library but it was not found in your environment. You can install it with pip: `pip +install opencv-python` +""" + +# docstyle-ignore +SCIPY_IMPORT_ERROR = """ +{0} requires the scipy library but it was not found in your environment. You can install it with pip: `pip install +scipy` +""" + +# docstyle-ignore +LIBROSA_IMPORT_ERROR = """ +{0} requires the librosa library but it was not found in your environment. Checkout the instructions on the +installation page: https://librosa.org/doc/latest/install.html and follow the ones that match your environment. +""" + +# docstyle-ignore +TRANSFORMERS_IMPORT_ERROR = """ +{0} requires the transformers library but it was not found in your environment. You can install it with pip: `pip +install transformers` +""" + +# docstyle-ignore +UNIDECODE_IMPORT_ERROR = """ +{0} requires the unidecode library but it was not found in your environment. You can install it with pip: `pip install +Unidecode` +""" + +# docstyle-ignore +K_DIFFUSION_IMPORT_ERROR = """ +{0} requires the k-diffusion library but it was not found in your environment. You can install it with pip: `pip +install k-diffusion` +""" + +# docstyle-ignore +NOTE_SEQ_IMPORT_ERROR = """ +{0} requires the note-seq library but it was not found in your environment. You can install it with pip: `pip +install note-seq` +""" + +# docstyle-ignore +WANDB_IMPORT_ERROR = """ +{0} requires the wandb library but it was not found in your environment. You can install it with pip: `pip +install wandb` +""" + +# docstyle-ignore +OMEGACONF_IMPORT_ERROR = """ +{0} requires the omegaconf library but it was not found in your environment. You can install it with pip: `pip +install omegaconf` +""" + +# docstyle-ignore +TENSORBOARD_IMPORT_ERROR = """ +{0} requires the tensorboard library but it was not found in your environment. You can install it with pip: `pip +install tensorboard` +""" + + +# docstyle-ignore +COMPEL_IMPORT_ERROR = """ +{0} requires the compel library but it was not found in your environment. You can install it with pip: `pip install compel` +""" + +# docstyle-ignore +BS4_IMPORT_ERROR = """ +{0} requires the Beautiful Soup library but it was not found in your environment. You can install it with pip: +`pip install beautifulsoup4`. Please note that you may need to restart your runtime after installation. +""" + +# docstyle-ignore +FTFY_IMPORT_ERROR = """ +{0} requires the ftfy library but it was not found in your environment. Checkout the instructions on the +installation section: https://github.com/rspeer/python-ftfy/tree/master#installing and follow the ones +that match your environment. Please note that you may need to restart your runtime after installation. +""" + +# docstyle-ignore +TORCHSDE_IMPORT_ERROR = """ +{0} requires the torchsde library but it was not found in your environment. You can install it with pip: `pip install torchsde` +""" + +# docstyle-ignore +INVISIBLE_WATERMARK_IMPORT_ERROR = """ +{0} requires the invisible-watermark library but it was not found in your environment. 
You can install it with pip: `pip install invisible-watermark>=2.0` +""" + + +BACKENDS_MAPPING = OrderedDict( + [ + ("bs4", (is_bs4_available, BS4_IMPORT_ERROR)), + ("flax", (is_flax_available, FLAX_IMPORT_ERROR)), + ("inflect", (is_inflect_available, INFLECT_IMPORT_ERROR)), + ("onnx", (is_onnx_available, ONNX_IMPORT_ERROR)), + ("opencv", (is_opencv_available, OPENCV_IMPORT_ERROR)), + ("scipy", (is_scipy_available, SCIPY_IMPORT_ERROR)), + ("torch", (is_torch_available, PYTORCH_IMPORT_ERROR)), + ("transformers", (is_transformers_available, TRANSFORMERS_IMPORT_ERROR)), + ("unidecode", (is_unidecode_available, UNIDECODE_IMPORT_ERROR)), + ("librosa", (is_librosa_available, LIBROSA_IMPORT_ERROR)), + ("k_diffusion", (is_k_diffusion_available, K_DIFFUSION_IMPORT_ERROR)), + ("note_seq", (is_note_seq_available, NOTE_SEQ_IMPORT_ERROR)), + ("wandb", (is_wandb_available, WANDB_IMPORT_ERROR)), + ("omegaconf", (is_omegaconf_available, OMEGACONF_IMPORT_ERROR)), + ("tensorboard", (is_tensorboard_available, TENSORBOARD_IMPORT_ERROR)), + ("compel", (is_compel_available, COMPEL_IMPORT_ERROR)), + ("ftfy", (is_ftfy_available, FTFY_IMPORT_ERROR)), + ("torchsde", (is_torchsde_available, TORCHSDE_IMPORT_ERROR)), + ("invisible_watermark", (is_invisible_watermark_available, INVISIBLE_WATERMARK_IMPORT_ERROR)), + ] +) + + +def requires_backends(obj, backends): + if not isinstance(backends, (list, tuple)): + backends = [backends] + + name = obj.__name__ if hasattr(obj, "__name__") else obj.__class__.__name__ + checks = (BACKENDS_MAPPING[backend] for backend in backends) + failed = [msg.format(name) for available, msg in checks if not available()] + if failed: + raise ImportError("".join(failed)) + + if name in [ + "VersatileDiffusionTextToImagePipeline", + "VersatileDiffusionPipeline", + "VersatileDiffusionDualGuidedPipeline", + "StableDiffusionImageVariationPipeline", + "UnCLIPPipeline", + ] and is_transformers_version("<", "4.25.0"): + raise ImportError( + f"You need to install `transformers>=4.25` in order to use {name}: \n```\n pip install" + " --upgrade transformers \n```" + ) + + if name in ["StableDiffusionDepth2ImgPipeline", "StableDiffusionPix2PixZeroPipeline"] and is_transformers_version( + "<", "4.26.0" + ): + raise ImportError( + f"You need to install `transformers>=4.26` in order to use {name}: \n```\n pip install" + " --upgrade transformers \n```" + ) + + +class DummyObject(type): + """ + Metaclass for the dummy objects. Any class inheriting from it will return the ImportError generated by + `requires_backend` each time a user tries to access any method of that class. + """ + + def __getattr__(cls, key): + if key.startswith("_"): + return super().__getattr__(cls, key) + requires_backends(cls, cls._backends) + + +# This function was copied from: https://github.com/huggingface/accelerate/blob/874c4967d94badd24f893064cc3bef45f57cadf7/src/accelerate/utils/versions.py#L319 +def compare_versions(library_or_version: Union[str, Version], operation: str, requirement_version: str): + """ + Args: + Compares a library version to some requirement using a given operation. + library_or_version (`str` or `packaging.version.Version`): + A library name or a version to check. + operation (`str`): + A string representation of an operator, such as `">"` or `"<="`. 
+ requirement_version (`str`): + The version to compare the library version against + """ + if operation not in STR_OPERATION_TO_FUNC.keys(): + raise ValueError(f"`operation` must be one of {list(STR_OPERATION_TO_FUNC.keys())}, received {operation}") + operation = STR_OPERATION_TO_FUNC[operation] + if isinstance(library_or_version, str): + library_or_version = parse(importlib_metadata.version(library_or_version)) + return operation(library_or_version, parse(requirement_version)) + + +# This function was copied from: https://github.com/huggingface/accelerate/blob/874c4967d94badd24f893064cc3bef45f57cadf7/src/accelerate/utils/versions.py#L338 +def is_torch_version(operation: str, version: str): + """ + Args: + Compares the current PyTorch version to a given reference with an operation. + operation (`str`): + A string representation of an operator, such as `">"` or `"<="` + version (`str`): + A string version of PyTorch + """ + return compare_versions(parse(_torch_version), operation, version) + + +def is_transformers_version(operation: str, version: str): + """ + Args: + Compares the current Transformers version to a given reference with an operation. + operation (`str`): + A string representation of an operator, such as `">"` or `"<="` + version (`str`): + A version string + """ + if not _transformers_available: + return False + return compare_versions(parse(_transformers_version), operation, version) + + +def is_accelerate_version(operation: str, version: str): + """ + Args: + Compares the current Accelerate version to a given reference with an operation. + operation (`str`): + A string representation of an operator, such as `">"` or `"<="` + version (`str`): + A version string + """ + if not _accelerate_available: + return False + return compare_versions(parse(_accelerate_version), operation, version) + + +def is_k_diffusion_version(operation: str, version: str): + """ + Args: + Compares the current k-diffusion version to a given reference with an operation. + operation (`str`): + A string representation of an operator, such as `">"` or `"<="` + version (`str`): + A version string + """ + if not _k_diffusion_available: + return False + return compare_versions(parse(_k_diffusion_version), operation, version) + + +class OptionalDependencyNotAvailable(BaseException): + """An error indicating that an optional dependency of Diffusers was not found in the environment.""" diff --git a/gf/sd-webui-rich-text/diffusers_official/utils/logging.py b/gf/sd-webui-rich-text/diffusers_official/utils/logging.py new file mode 100644 index 0000000000000000000000000000000000000000..4ccc57cd69d57e9bd999e35320cb98416f000522 --- /dev/null +++ b/gf/sd-webui-rich-text/diffusers_official/utils/logging.py @@ -0,0 +1,339 @@ +# coding=utf-8 +# Copyright 2023 Optuna, Hugging Face +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
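For the version helpers defined at the end of `import_utils.py` above (`compare_versions`, `is_torch_version`, `is_transformers_version`, ...), a small usage sketch; it assumes the vendored `diffusers_official` package is importable, which depends on how the extension sets up `sys.path`:

```python
# Illustrative usage only; the import path assumes `diffusers_official` is on sys.path.
from diffusers_official.utils.import_utils import is_torch_available, is_torch_version

# Guard with is_torch_available() first: is_torch_version() parses the cached torch
# version string, which is only meaningful when torch was actually found.
if is_torch_available() and is_torch_version(">=", "2.0.0"):
    print("PyTorch 2.x detected.")
else:
    print("PyTorch 2.x not detected (or PyTorch is not installed).")
```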
+""" Logging utilities.""" + +import logging +import os +import sys +import threading +from logging import ( + CRITICAL, # NOQA + DEBUG, # NOQA + ERROR, # NOQA + FATAL, # NOQA + INFO, # NOQA + NOTSET, # NOQA + WARN, # NOQA + WARNING, # NOQA +) +from typing import Optional + +from tqdm import auto as tqdm_lib + + +_lock = threading.Lock() +_default_handler: Optional[logging.Handler] = None + +log_levels = { + "debug": logging.DEBUG, + "info": logging.INFO, + "warning": logging.WARNING, + "error": logging.ERROR, + "critical": logging.CRITICAL, +} + +_default_log_level = logging.WARNING + +_tqdm_active = True + + +def _get_default_logging_level(): + """ + If DIFFUSERS_VERBOSITY env var is set to one of the valid choices return that as the new default level. If it is + not - fall back to `_default_log_level` + """ + env_level_str = os.getenv("DIFFUSERS_VERBOSITY", None) + if env_level_str: + if env_level_str in log_levels: + return log_levels[env_level_str] + else: + logging.getLogger().warning( + f"Unknown option DIFFUSERS_VERBOSITY={env_level_str}, " + f"has to be one of: { ', '.join(log_levels.keys()) }" + ) + return _default_log_level + + +def _get_library_name() -> str: + return __name__.split(".")[0] + + +def _get_library_root_logger() -> logging.Logger: + return logging.getLogger(_get_library_name()) + + +def _configure_library_root_logger() -> None: + global _default_handler + + with _lock: + if _default_handler: + # This library has already configured the library root logger. + return + _default_handler = logging.StreamHandler() # Set sys.stderr as stream. + _default_handler.flush = sys.stderr.flush + + # Apply our default configuration to the library root logger. + library_root_logger = _get_library_root_logger() + library_root_logger.addHandler(_default_handler) + library_root_logger.setLevel(_get_default_logging_level()) + library_root_logger.propagate = False + + +def _reset_library_root_logger() -> None: + global _default_handler + + with _lock: + if not _default_handler: + return + + library_root_logger = _get_library_root_logger() + library_root_logger.removeHandler(_default_handler) + library_root_logger.setLevel(logging.NOTSET) + _default_handler = None + + +def get_log_levels_dict(): + return log_levels + + +def get_logger(name: Optional[str] = None) -> logging.Logger: + """ + Return a logger with the specified name. + + This function is not supposed to be directly accessed unless you are writing a custom diffusers module. + """ + + if name is None: + name = _get_library_name() + + _configure_library_root_logger() + return logging.getLogger(name) + + +def get_verbosity() -> int: + """ + Return the current level for the 🤗 Diffusers' root logger as an `int`. + + Returns: + `int`: + Logging level integers which can be one of: + + - `50`: `diffusers.logging.CRITICAL` or `diffusers.logging.FATAL` + - `40`: `diffusers.logging.ERROR` + - `30`: `diffusers.logging.WARNING` or `diffusers.logging.WARN` + - `20`: `diffusers.logging.INFO` + - `10`: `diffusers.logging.DEBUG` + + """ + + _configure_library_root_logger() + return _get_library_root_logger().getEffectiveLevel() + + +def set_verbosity(verbosity: int) -> None: + """ + Set the verbosity level for the 🤗 Diffusers' root logger. 
+ + Args: + verbosity (`int`): + Logging level which can be one of: + + - `diffusers.logging.CRITICAL` or `diffusers.logging.FATAL` + - `diffusers.logging.ERROR` + - `diffusers.logging.WARNING` or `diffusers.logging.WARN` + - `diffusers.logging.INFO` + - `diffusers.logging.DEBUG` + """ + + _configure_library_root_logger() + _get_library_root_logger().setLevel(verbosity) + + +def set_verbosity_info(): + """Set the verbosity to the `INFO` level.""" + return set_verbosity(INFO) + + +def set_verbosity_warning(): + """Set the verbosity to the `WARNING` level.""" + return set_verbosity(WARNING) + + +def set_verbosity_debug(): + """Set the verbosity to the `DEBUG` level.""" + return set_verbosity(DEBUG) + + +def set_verbosity_error(): + """Set the verbosity to the `ERROR` level.""" + return set_verbosity(ERROR) + + +def disable_default_handler() -> None: + """Disable the default handler of the 🤗 Diffusers' root logger.""" + + _configure_library_root_logger() + + assert _default_handler is not None + _get_library_root_logger().removeHandler(_default_handler) + + +def enable_default_handler() -> None: + """Enable the default handler of the 🤗 Diffusers' root logger.""" + + _configure_library_root_logger() + + assert _default_handler is not None + _get_library_root_logger().addHandler(_default_handler) + + +def add_handler(handler: logging.Handler) -> None: + """adds a handler to the HuggingFace Diffusers' root logger.""" + + _configure_library_root_logger() + + assert handler is not None + _get_library_root_logger().addHandler(handler) + + +def remove_handler(handler: logging.Handler) -> None: + """removes given handler from the HuggingFace Diffusers' root logger.""" + + _configure_library_root_logger() + + assert handler is not None and handler not in _get_library_root_logger().handlers + _get_library_root_logger().removeHandler(handler) + + +def disable_propagation() -> None: + """ + Disable propagation of the library log outputs. Note that log propagation is disabled by default. + """ + + _configure_library_root_logger() + _get_library_root_logger().propagate = False + + +def enable_propagation() -> None: + """ + Enable propagation of the library log outputs. Please disable the HuggingFace Diffusers' default handler to prevent + double logging if the root logger has been configured. + """ + + _configure_library_root_logger() + _get_library_root_logger().propagate = True + + +def enable_explicit_format() -> None: + """ + Enable explicit formatting for every 🤗 Diffusers' logger. The explicit formatter is as follows: + ``` + [LEVELNAME|FILENAME|LINE NUMBER] TIME >> MESSAGE + ``` + All handlers currently bound to the root logger are affected by this method. + """ + handlers = _get_library_root_logger().handlers + + for handler in handlers: + formatter = logging.Formatter("[%(levelname)s|%(filename)s:%(lineno)s] %(asctime)s >> %(message)s") + handler.setFormatter(formatter) + + +def reset_format() -> None: + """ + Resets the formatting for 🤗 Diffusers' loggers. + + All handlers currently bound to the root logger are affected by this method. 
+ """ + handlers = _get_library_root_logger().handlers + + for handler in handlers: + handler.setFormatter(None) + + +def warning_advice(self, *args, **kwargs): + """ + This method is identical to `logger.warning()`, but if env var DIFFUSERS_NO_ADVISORY_WARNINGS=1 is set, this + warning will not be printed + """ + no_advisory_warnings = os.getenv("DIFFUSERS_NO_ADVISORY_WARNINGS", False) + if no_advisory_warnings: + return + self.warning(*args, **kwargs) + + +logging.Logger.warning_advice = warning_advice + + +class EmptyTqdm: + """Dummy tqdm which doesn't do anything.""" + + def __init__(self, *args, **kwargs): # pylint: disable=unused-argument + self._iterator = args[0] if args else None + + def __iter__(self): + return iter(self._iterator) + + def __getattr__(self, _): + """Return empty function.""" + + def empty_fn(*args, **kwargs): # pylint: disable=unused-argument + return + + return empty_fn + + def __enter__(self): + return self + + def __exit__(self, type_, value, traceback): + return + + +class _tqdm_cls: + def __call__(self, *args, **kwargs): + if _tqdm_active: + return tqdm_lib.tqdm(*args, **kwargs) + else: + return EmptyTqdm(*args, **kwargs) + + def set_lock(self, *args, **kwargs): + self._lock = None + if _tqdm_active: + return tqdm_lib.tqdm.set_lock(*args, **kwargs) + + def get_lock(self): + if _tqdm_active: + return tqdm_lib.tqdm.get_lock() + + +tqdm = _tqdm_cls() + + +def is_progress_bar_enabled() -> bool: + """Return a boolean indicating whether tqdm progress bars are enabled.""" + global _tqdm_active + return bool(_tqdm_active) + + +def enable_progress_bar(): + """Enable tqdm progress bar.""" + global _tqdm_active + _tqdm_active = True + + +def disable_progress_bar(): + """Disable tqdm progress bar.""" + global _tqdm_active + _tqdm_active = False diff --git a/gf/sd-webui-rich-text/diffusers_official/utils/model_card_template.md b/gf/sd-webui-rich-text/diffusers_official/utils/model_card_template.md new file mode 100644 index 0000000000000000000000000000000000000000..f19c85b0fcf2f7b07e9c3f950a9657b3f2053f21 --- /dev/null +++ b/gf/sd-webui-rich-text/diffusers_official/utils/model_card_template.md @@ -0,0 +1,50 @@ +--- +{{ card_data }} +--- + + + +# {{ model_name | default("Diffusion Model") }} + +## Model description + +This diffusion model is trained with the [🤗 Diffusers](https://github.com/huggingface/diffusers) library +on the `{{ dataset_name }}` dataset. 
+ +## Intended uses & limitations + +#### How to use + +```python +# TODO: add an example code snippet for running this diffusion pipeline +``` + +#### Limitations and bias + +[TODO: provide examples of latent issues and potential remediations] + +## Training data + +[TODO: describe the data used to train the model] + +### Training hyperparameters + +The following hyperparameters were used during training: +- learning_rate: {{ learning_rate }} +- train_batch_size: {{ train_batch_size }} +- eval_batch_size: {{ eval_batch_size }} +- gradient_accumulation_steps: {{ gradient_accumulation_steps }} +- optimizer: AdamW with betas=({{ adam_beta1 }}, {{ adam_beta2 }}), weight_decay={{ adam_weight_decay }} and epsilon={{ adam_epsilon }} +- lr_scheduler: {{ lr_scheduler }} +- lr_warmup_steps: {{ lr_warmup_steps }} +- ema_inv_gamma: {{ ema_inv_gamma }} +- ema_inv_gamma: {{ ema_power }} +- ema_inv_gamma: {{ ema_max_decay }} +- mixed_precision: {{ mixed_precision }} + +### Training results + +📈 [TensorBoard logs](https://huggingface.co/{{ repo_name }}/tensorboard?#scalars) + + diff --git a/gf/sd-webui-rich-text/diffusers_official/utils/outputs.py b/gf/sd-webui-rich-text/diffusers_official/utils/outputs.py new file mode 100644 index 0000000000000000000000000000000000000000..37b11561d1e1ee5d5cb40c7630b132e1f451c5b0 --- /dev/null +++ b/gf/sd-webui-rich-text/diffusers_official/utils/outputs.py @@ -0,0 +1,108 @@ +# Copyright 2023 The HuggingFace Team. All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +""" +Generic utilities +""" + +from collections import OrderedDict +from dataclasses import fields +from typing import Any, Tuple + +import numpy as np + +from .import_utils import is_torch_available + + +def is_tensor(x): + """ + Tests if `x` is a `torch.Tensor` or `np.ndarray`. + """ + if is_torch_available(): + import torch + + if isinstance(x, torch.Tensor): + return True + + return isinstance(x, np.ndarray) + + +class BaseOutput(OrderedDict): + """ + Base class for all model outputs as dataclass. Has a `__getitem__` that allows indexing by integer or slice (like a + tuple) or strings (like a dictionary) that will ignore the `None` attributes. Otherwise behaves like a regular + Python dictionary. + + + + You can't unpack a [`BaseOutput`] directly. Use the [`~utils.BaseOutput.to_tuple`] method to convert it to a tuple + first. 
+ + + """ + + def __post_init__(self): + class_fields = fields(self) + + # Safety and consistency checks + if not len(class_fields): + raise ValueError(f"{self.__class__.__name__} has no fields.") + + first_field = getattr(self, class_fields[0].name) + other_fields_are_none = all(getattr(self, field.name) is None for field in class_fields[1:]) + + if other_fields_are_none and isinstance(first_field, dict): + for key, value in first_field.items(): + self[key] = value + else: + for field in class_fields: + v = getattr(self, field.name) + if v is not None: + self[field.name] = v + + def __delitem__(self, *args, **kwargs): + raise Exception(f"You cannot use ``__delitem__`` on a {self.__class__.__name__} instance.") + + def setdefault(self, *args, **kwargs): + raise Exception(f"You cannot use ``setdefault`` on a {self.__class__.__name__} instance.") + + def pop(self, *args, **kwargs): + raise Exception(f"You cannot use ``pop`` on a {self.__class__.__name__} instance.") + + def update(self, *args, **kwargs): + raise Exception(f"You cannot use ``update`` on a {self.__class__.__name__} instance.") + + def __getitem__(self, k): + if isinstance(k, str): + inner_dict = dict(self.items()) + return inner_dict[k] + else: + return self.to_tuple()[k] + + def __setattr__(self, name, value): + if name in self.keys() and value is not None: + # Don't call self.__setitem__ to avoid recursion errors + super().__setitem__(name, value) + super().__setattr__(name, value) + + def __setitem__(self, key, value): + # Will raise a KeyException if needed + super().__setitem__(key, value) + # Don't call self.__setattr__ to avoid recursion errors + super().__setattr__(key, value) + + def to_tuple(self) -> Tuple[Any]: + """ + Convert self to a tuple containing all the attributes/keys that are not `None`. + """ + return tuple(self[k] for k in self.keys()) diff --git a/gf/sd-webui-rich-text/diffusers_official/utils/pil_utils.py b/gf/sd-webui-rich-text/diffusers_official/utils/pil_utils.py new file mode 100644 index 0000000000000000000000000000000000000000..15b97c73dcb7f85b22fcae95c641dde0123b5f05 --- /dev/null +++ b/gf/sd-webui-rich-text/diffusers_official/utils/pil_utils.py @@ -0,0 +1,48 @@ +import PIL.Image +import PIL.ImageOps +from packaging import version +from PIL import Image + + +if version.parse(version.parse(PIL.__version__).base_version) >= version.parse("9.1.0"): + PIL_INTERPOLATION = { + "linear": PIL.Image.Resampling.BILINEAR, + "bilinear": PIL.Image.Resampling.BILINEAR, + "bicubic": PIL.Image.Resampling.BICUBIC, + "lanczos": PIL.Image.Resampling.LANCZOS, + "nearest": PIL.Image.Resampling.NEAREST, + } +else: + PIL_INTERPOLATION = { + "linear": PIL.Image.LINEAR, + "bilinear": PIL.Image.BILINEAR, + "bicubic": PIL.Image.BICUBIC, + "lanczos": PIL.Image.LANCZOS, + "nearest": PIL.Image.NEAREST, + } + + +def pt_to_pil(images): + """ + Convert a torch image to a PIL image. + """ + images = (images / 2 + 0.5).clamp(0, 1) + images = images.cpu().permute(0, 2, 3, 1).float().numpy() + images = numpy_to_pil(images) + return images + + +def numpy_to_pil(images): + """ + Convert a numpy image or a batch of images to a PIL image. + """ + if images.ndim == 3: + images = images[None, ...] 
+ images = (images * 255).round().astype("uint8") + if images.shape[-1] == 1: + # special case for grayscale (single channel) images + pil_images = [Image.fromarray(image.squeeze(), mode="L") for image in images] + else: + pil_images = [Image.fromarray(image) for image in images] + + return pil_images diff --git a/gf/sd-webui-rich-text/diffusers_official/utils/testing_utils.py b/gf/sd-webui-rich-text/diffusers_official/utils/testing_utils.py new file mode 100644 index 0000000000000000000000000000000000000000..64eb3ac925e9240d30766547880c5dea2e0aeb43 --- /dev/null +++ b/gf/sd-webui-rich-text/diffusers_official/utils/testing_utils.py @@ -0,0 +1,602 @@ +import inspect +import logging +import multiprocessing +import os +import random +import re +import tempfile +import unittest +import urllib.parse +from distutils.util import strtobool +from io import BytesIO, StringIO +from pathlib import Path +from typing import List, Optional, Union + +import numpy as np +import PIL.Image +import PIL.ImageOps +import requests +from packaging import version + +from .import_utils import ( + BACKENDS_MAPPING, + is_compel_available, + is_flax_available, + is_note_seq_available, + is_onnx_available, + is_opencv_available, + is_torch_available, + is_torch_version, + is_torchsde_available, +) +from .logging import get_logger + + +global_rng = random.Random() + +logger = get_logger(__name__) + +if is_torch_available(): + import torch + + if "DIFFUSERS_TEST_DEVICE" in os.environ: + torch_device = os.environ["DIFFUSERS_TEST_DEVICE"] + + available_backends = ["cuda", "cpu", "mps"] + if torch_device not in available_backends: + raise ValueError( + f"unknown torch backend for diffusers tests: {torch_device}. Available backends are:" + f" {available_backends}" + ) + logger.info(f"torch_device overrode to {torch_device}") + else: + torch_device = "cuda" if torch.cuda.is_available() else "cpu" + is_torch_higher_equal_than_1_12 = version.parse( + version.parse(torch.__version__).base_version + ) >= version.parse("1.12") + + if is_torch_higher_equal_than_1_12: + # Some builds of torch 1.12 don't have the mps backend registered. See #892 for more details + mps_backend_registered = hasattr(torch.backends, "mps") + torch_device = "mps" if (mps_backend_registered and torch.backends.mps.is_available()) else torch_device + + +def torch_all_close(a, b, *args, **kwargs): + if not is_torch_available(): + raise ValueError("PyTorch needs to be installed to use this function.") + if not torch.allclose(a, b, *args, **kwargs): + assert False, f"Max diff is absolute {(a - b).abs().max()}. Diff tensor is {(a - b).abs()}." 
+ return True + + +def print_tensor_test(tensor, filename="test_corrections.txt", expected_tensor_name="expected_slice"): + test_name = os.environ.get("PYTEST_CURRENT_TEST") + if not torch.is_tensor(tensor): + tensor = torch.from_numpy(tensor) + + tensor_str = str(tensor.detach().cpu().flatten().to(torch.float32)).replace("\n", "") + # format is usually: + # expected_slice = np.array([-0.5713, -0.3018, -0.9814, 0.04663, -0.879, 0.76, -1.734, 0.1044, 1.161]) + output_str = tensor_str.replace("tensor", f"{expected_tensor_name} = np.array") + test_file, test_class, test_fn = test_name.split("::") + test_fn = test_fn.split()[0] + with open(filename, "a") as f: + print(";".join([test_file, test_class, test_fn, output_str]), file=f) + + +def get_tests_dir(append_path=None): + """ + Args: + append_path: optional path to append to the tests dir path + Return: + The full path to the `tests` dir, so that the tests can be invoked from anywhere. Optionally `append_path` is + joined after the `tests` dir the former is provided. + """ + # this function caller's __file__ + caller__file__ = inspect.stack()[1][1] + tests_dir = os.path.abspath(os.path.dirname(caller__file__)) + + while not tests_dir.endswith("tests"): + tests_dir = os.path.dirname(tests_dir) + + if append_path: + return os.path.join(tests_dir, append_path) + else: + return tests_dir + + +def parse_flag_from_env(key, default=False): + try: + value = os.environ[key] + except KeyError: + # KEY isn't set, default to `default`. + _value = default + else: + # KEY is set, convert it to True or False. + try: + _value = strtobool(value) + except ValueError: + # More values are supported, but let's keep the message simple. + raise ValueError(f"If set, {key} must be yes or no.") + return _value + + +_run_slow_tests = parse_flag_from_env("RUN_SLOW", default=False) +_run_nightly_tests = parse_flag_from_env("RUN_NIGHTLY", default=False) + + +def floats_tensor(shape, scale=1.0, rng=None, name=None): + """Creates a random float32 tensor""" + if rng is None: + rng = global_rng + + total_dims = 1 + for dim in shape: + total_dims *= dim + + values = [] + for _ in range(total_dims): + values.append(rng.random() * scale) + + return torch.tensor(data=values, dtype=torch.float).view(shape).contiguous() + + +def slow(test_case): + """ + Decorator marking a test as slow. + + Slow tests are skipped by default. Set the RUN_SLOW environment variable to a truthy value to run them. + + """ + return unittest.skipUnless(_run_slow_tests, "test is slow")(test_case) + + +def nightly(test_case): + """ + Decorator marking a test that runs nightly in the diffusers CI. + + Slow tests are skipped by default. Set the RUN_NIGHTLY environment variable to a truthy value to run them. + + """ + return unittest.skipUnless(_run_nightly_tests, "test is nightly")(test_case) + + +def require_torch(test_case): + """ + Decorator marking a test that requires PyTorch. These tests are skipped when PyTorch isn't installed. + """ + return unittest.skipUnless(is_torch_available(), "test requires PyTorch")(test_case) + + +def require_torch_2(test_case): + """ + Decorator marking a test that requires PyTorch 2. These tests are skipped when it isn't installed. 
+ """ + return unittest.skipUnless(is_torch_available() and is_torch_version(">=", "2.0.0"), "test requires PyTorch 2")( + test_case + ) + + +def require_torch_gpu(test_case): + """Decorator marking a test that requires CUDA and PyTorch.""" + return unittest.skipUnless(is_torch_available() and torch_device == "cuda", "test requires PyTorch+CUDA")( + test_case + ) + + +def skip_mps(test_case): + """Decorator marking a test to skip if torch_device is 'mps'""" + return unittest.skipUnless(torch_device != "mps", "test requires non 'mps' device")(test_case) + + +def require_flax(test_case): + """ + Decorator marking a test that requires JAX & Flax. These tests are skipped when one / both are not installed + """ + return unittest.skipUnless(is_flax_available(), "test requires JAX & Flax")(test_case) + + +def require_compel(test_case): + """ + Decorator marking a test that requires compel: https://github.com/damian0815/compel. These tests are skipped when + the library is not installed. + """ + return unittest.skipUnless(is_compel_available(), "test requires compel")(test_case) + + +def require_onnxruntime(test_case): + """ + Decorator marking a test that requires onnxruntime. These tests are skipped when onnxruntime isn't installed. + """ + return unittest.skipUnless(is_onnx_available(), "test requires onnxruntime")(test_case) + + +def require_note_seq(test_case): + """ + Decorator marking a test that requires note_seq. These tests are skipped when note_seq isn't installed. + """ + return unittest.skipUnless(is_note_seq_available(), "test requires note_seq")(test_case) + + +def require_torchsde(test_case): + """ + Decorator marking a test that requires torchsde. These tests are skipped when torchsde isn't installed. + """ + return unittest.skipUnless(is_torchsde_available(), "test requires torchsde")(test_case) + + +def load_numpy(arry: Union[str, np.ndarray], local_path: Optional[str] = None) -> np.ndarray: + if isinstance(arry, str): + # local_path = "/home/patrick_huggingface_co/" + if local_path is not None: + # local_path can be passed to correct images of tests + return os.path.join(local_path, "/".join([arry.split("/")[-5], arry.split("/")[-2], arry.split("/")[-1]])) + elif arry.startswith("http://") or arry.startswith("https://"): + response = requests.get(arry) + response.raise_for_status() + arry = np.load(BytesIO(response.content)) + elif os.path.isfile(arry): + arry = np.load(arry) + else: + raise ValueError( + f"Incorrect path or url, URLs must start with `http://` or `https://`, and {arry} is not a valid path" + ) + elif isinstance(arry, np.ndarray): + pass + else: + raise ValueError( + "Incorrect format used for numpy ndarray. Should be an url linking to an image, a local path, or a" + " ndarray." + ) + + return arry + + +def load_pt(url: str): + response = requests.get(url) + response.raise_for_status() + arry = torch.load(BytesIO(response.content)) + return arry + + +def load_image(image: Union[str, PIL.Image.Image]) -> PIL.Image.Image: + """ + Loads `image` to a PIL Image. + + Args: + image (`str` or `PIL.Image.Image`): + The image to convert to the PIL Image format. + Returns: + `PIL.Image.Image`: + A PIL Image. 
+ """ + if isinstance(image, str): + if image.startswith("http://") or image.startswith("https://"): + image = PIL.Image.open(requests.get(image, stream=True).raw) + elif os.path.isfile(image): + image = PIL.Image.open(image) + else: + raise ValueError( + f"Incorrect path or url, URLs must start with `http://` or `https://`, and {image} is not a valid path" + ) + elif isinstance(image, PIL.Image.Image): + image = image + else: + raise ValueError( + "Incorrect format used for image. Should be an url linking to an image, a local path, or a PIL image." + ) + image = PIL.ImageOps.exif_transpose(image) + image = image.convert("RGB") + return image + + +def preprocess_image(image: PIL.Image, batch_size: int): + w, h = image.size + w, h = (x - x % 8 for x in (w, h)) # resize to integer multiple of 8 + image = image.resize((w, h), resample=PIL.Image.LANCZOS) + image = np.array(image).astype(np.float32) / 255.0 + image = np.vstack([image[None].transpose(0, 3, 1, 2)] * batch_size) + image = torch.from_numpy(image) + return 2.0 * image - 1.0 + + +def export_to_gif(image: List[PIL.Image.Image], output_gif_path: str = None) -> str: + if output_gif_path is None: + output_gif_path = tempfile.NamedTemporaryFile(suffix=".gif").name + + image[0].save( + output_gif_path, + save_all=True, + append_images=image[1:], + optimize=False, + duration=100, + loop=0, + ) + return output_gif_path + + +def export_to_video(video_frames: List[np.ndarray], output_video_path: str = None) -> str: + if is_opencv_available(): + import cv2 + else: + raise ImportError(BACKENDS_MAPPING["opencv"][1].format("export_to_video")) + if output_video_path is None: + output_video_path = tempfile.NamedTemporaryFile(suffix=".mp4").name + + fourcc = cv2.VideoWriter_fourcc(*"mp4v") + h, w, c = video_frames[0].shape + video_writer = cv2.VideoWriter(output_video_path, fourcc, fps=8, frameSize=(w, h)) + for i in range(len(video_frames)): + img = cv2.cvtColor(video_frames[i], cv2.COLOR_RGB2BGR) + video_writer.write(img) + return output_video_path + + +def load_hf_numpy(path) -> np.ndarray: + if not path.startswith("http://") or path.startswith("https://"): + path = os.path.join( + "https://huggingface.co/datasets/fusing/diffusers-testing/resolve/main", urllib.parse.quote(path) + ) + + return load_numpy(path) + + +# --- pytest conf functions --- # + +# to avoid multiple invocation from tests/conftest.py and examples/conftest.py - make sure it's called only once +pytest_opt_registered = {} + + +def pytest_addoption_shared(parser): + """ + This function is to be called from `conftest.py` via `pytest_addoption` wrapper that has to be defined there. + + It allows loading both `conftest.py` files at once without causing a failure due to adding the same `pytest` + option. + + """ + option = "--make-reports" + if option not in pytest_opt_registered: + parser.addoption( + option, + action="store", + default=False, + help="generate report files. The value of this option is used as a prefix to report names", + ) + pytest_opt_registered[option] = 1 + + +def pytest_terminal_summary_main(tr, id): + """ + Generate multiple reports at the end of test suite run - each report goes into a dedicated file in the current + directory. The report files are prefixed with the test suite name. + + This function emulates --duration and -rA pytest arguments. + + This function is to be called from `conftest.py` via `pytest_terminal_summary` wrapper that has to be defined + there. 
+ + Args: + - tr: `terminalreporter` passed from `conftest.py` + - id: unique id like `tests` or `examples` that will be incorporated into the final reports filenames - this is + needed as some jobs have multiple runs of pytest, so we can't have them overwrite each other. + + NB: this functions taps into a private _pytest API and while unlikely, it could break should + pytest do internal changes - also it calls default internal methods of terminalreporter which + can be hijacked by various `pytest-` plugins and interfere. + + """ + from _pytest.config import create_terminal_writer + + if not len(id): + id = "tests" + + config = tr.config + orig_writer = config.get_terminal_writer() + orig_tbstyle = config.option.tbstyle + orig_reportchars = tr.reportchars + + dir = "reports" + Path(dir).mkdir(parents=True, exist_ok=True) + report_files = { + k: f"{dir}/{id}_{k}.txt" + for k in [ + "durations", + "errors", + "failures_long", + "failures_short", + "failures_line", + "passes", + "stats", + "summary_short", + "warnings", + ] + } + + # custom durations report + # note: there is no need to call pytest --durations=XX to get this separate report + # adapted from https://github.com/pytest-dev/pytest/blob/897f151e/src/_pytest/runner.py#L66 + dlist = [] + for replist in tr.stats.values(): + for rep in replist: + if hasattr(rep, "duration"): + dlist.append(rep) + if dlist: + dlist.sort(key=lambda x: x.duration, reverse=True) + with open(report_files["durations"], "w") as f: + durations_min = 0.05 # sec + f.write("slowest durations\n") + for i, rep in enumerate(dlist): + if rep.duration < durations_min: + f.write(f"{len(dlist)-i} durations < {durations_min} secs were omitted") + break + f.write(f"{rep.duration:02.2f}s {rep.when:<8} {rep.nodeid}\n") + + def summary_failures_short(tr): + # expecting that the reports were --tb=long (default) so we chop them off here to the last frame + reports = tr.getreports("failed") + if not reports: + return + tr.write_sep("=", "FAILURES SHORT STACK") + for rep in reports: + msg = tr._getfailureheadline(rep) + tr.write_sep("_", msg, red=True, bold=True) + # chop off the optional leading extra frames, leaving only the last one + longrepr = re.sub(r".*_ _ _ (_ ){10,}_ _ ", "", rep.longreprtext, 0, re.M | re.S) + tr._tw.line(longrepr) + # note: not printing out any rep.sections to keep the report short + + # use ready-made report funcs, we are just hijacking the filehandle to log to a dedicated file each + # adapted from https://github.com/pytest-dev/pytest/blob/897f151e/src/_pytest/terminal.py#L814 + # note: some pytest plugins may interfere by hijacking the default `terminalreporter` (e.g. 
+ # pytest-instafail does that) + + # report failures with line/short/long styles + config.option.tbstyle = "auto" # full tb + with open(report_files["failures_long"], "w") as f: + tr._tw = create_terminal_writer(config, f) + tr.summary_failures() + + # config.option.tbstyle = "short" # short tb + with open(report_files["failures_short"], "w") as f: + tr._tw = create_terminal_writer(config, f) + summary_failures_short(tr) + + config.option.tbstyle = "line" # one line per error + with open(report_files["failures_line"], "w") as f: + tr._tw = create_terminal_writer(config, f) + tr.summary_failures() + + with open(report_files["errors"], "w") as f: + tr._tw = create_terminal_writer(config, f) + tr.summary_errors() + + with open(report_files["warnings"], "w") as f: + tr._tw = create_terminal_writer(config, f) + tr.summary_warnings() # normal warnings + tr.summary_warnings() # final warnings + + tr.reportchars = "wPpsxXEf" # emulate -rA (used in summary_passes() and short_test_summary()) + with open(report_files["passes"], "w") as f: + tr._tw = create_terminal_writer(config, f) + tr.summary_passes() + + with open(report_files["summary_short"], "w") as f: + tr._tw = create_terminal_writer(config, f) + tr.short_test_summary() + + with open(report_files["stats"], "w") as f: + tr._tw = create_terminal_writer(config, f) + tr.summary_stats() + + # restore: + tr._tw = orig_writer + tr.reportchars = orig_reportchars + config.option.tbstyle = orig_tbstyle + + +# Taken from: https://github.com/huggingface/transformers/blob/3658488ff77ff8d45101293e749263acf437f4d5/src/transformers/testing_utils.py#L1787 +def run_test_in_subprocess(test_case, target_func, inputs=None, timeout=None): + """ + To run a test in a subprocess. In particular, this can avoid (GPU) memory issue. + + Args: + test_case (`unittest.TestCase`): + The test that will run `target_func`. + target_func (`Callable`): + The function implementing the actual testing logic. + inputs (`dict`, *optional*, defaults to `None`): + The inputs that will be passed to `target_func` through an (input) queue. + timeout (`int`, *optional*, defaults to `None`): + The timeout (in seconds) that will be passed to the input and output queues. If not specified, the env. + variable `PYTEST_TIMEOUT` will be checked. If still `None`, its value will be set to `600`. + """ + if timeout is None: + timeout = int(os.environ.get("PYTEST_TIMEOUT", 600)) + + start_methohd = "spawn" + ctx = multiprocessing.get_context(start_methohd) + + input_queue = ctx.Queue(1) + output_queue = ctx.JoinableQueue(1) + + # We can't send `unittest.TestCase` to the child, otherwise we get issues regarding pickle. + input_queue.put(inputs, timeout=timeout) + + process = ctx.Process(target=target_func, args=(input_queue, output_queue, timeout)) + process.start() + # Kill the child process if we can't get outputs from it in time: otherwise, the hanging subprocess prevents + # the test to exit properly. 
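For context, a hedged sketch of a `target_func` that fits the queue convention `run_test_in_subprocess` expects (read inputs from the input queue, put a dict with an `"error"` key on the output queue); the actual check performed here is illustrative only.

```python
import traceback


def _example_target_func(in_queue, out_queue, timeout):
    error = None
    try:
        inputs = in_queue.get(timeout=timeout) or {}
        # The GPU/memory-sensitive assertions of the real test would live here.
        assert inputs.get("x", 0) >= 0, "expected a non-negative input"
    except Exception:
        error = traceback.format_exc()
    # The parent reads this dict and fails the test if "error" is not None.
    out_queue.put({"error": error}, timeout=timeout)
    out_queue.join()
```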
+ try: + results = output_queue.get(timeout=timeout) + output_queue.task_done() + except Exception as e: + process.terminate() + test_case.fail(e) + process.join(timeout=timeout) + + if results["error"] is not None: + test_case.fail(f'{results["error"]}') + + +class CaptureLogger: + """ + Args: + Context manager to capture `logging` streams + logger: 'logging` logger object + Returns: + The captured output is available via `self.out` + Example: + ```python + >>> from diffusers import logging + >>> from diffusers.testing_utils import CaptureLogger + + >>> msg = "Testing 1, 2, 3" + >>> logging.set_verbosity_info() + >>> logger = logging.get_logger("diffusers.pipelines.stable_diffusion.pipeline_stable_diffusion.py") + >>> with CaptureLogger(logger) as cl: + ... logger.info(msg) + >>> assert cl.out, msg + "\n" + ``` + """ + + def __init__(self, logger): + self.logger = logger + self.io = StringIO() + self.sh = logging.StreamHandler(self.io) + self.out = "" + + def __enter__(self): + self.logger.addHandler(self.sh) + return self + + def __exit__(self, *exc): + self.logger.removeHandler(self.sh) + self.out = self.io.getvalue() + + def __repr__(self): + return f"captured: {self.out}\n" + + +def enable_full_determinism(): + """ + Helper function for reproducible behavior during distributed training. See + - https://pytorch.org/docs/stable/notes/randomness.html for pytorch + """ + # Enable PyTorch deterministic mode. This potentially requires either the environment + # variable 'CUDA_LAUNCH_BLOCKING' or 'CUBLAS_WORKSPACE_CONFIG' to be set, + # depending on the CUDA version, so we set them both here + os.environ["CUDA_LAUNCH_BLOCKING"] = "1" + os.environ["CUBLAS_WORKSPACE_CONFIG"] = ":16:8" + torch.use_deterministic_algorithms(True) + + # Enable CUDNN deterministic mode + torch.backends.cudnn.deterministic = True + torch.backends.cudnn.benchmark = False + torch.backends.cuda.matmul.allow_tf32 = False + + +def disable_full_determinism(): + os.environ["CUDA_LAUNCH_BLOCKING"] = "0" + os.environ["CUBLAS_WORKSPACE_CONFIG"] = "" + torch.use_deterministic_algorithms(False) diff --git a/gf/sd-webui-rich-text/diffusers_official/utils/torch_utils.py b/gf/sd-webui-rich-text/diffusers_official/utils/torch_utils.py new file mode 100644 index 0000000000000000000000000000000000000000..5f64bce25e78d5212696f4b06b767d338599670a --- /dev/null +++ b/gf/sd-webui-rich-text/diffusers_official/utils/torch_utils.py @@ -0,0 +1,84 @@ +# Copyright 2023 The HuggingFace Team. All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +""" +PyTorch utilities: Utilities related to PyTorch +""" +from typing import List, Optional, Tuple, Union + +from . 
import logging +from .import_utils import is_torch_available, is_torch_version + + +if is_torch_available(): + import torch + +logger = logging.get_logger(__name__) # pylint: disable=invalid-name + +try: + from torch._dynamo import allow_in_graph as maybe_allow_in_graph +except (ImportError, ModuleNotFoundError): + + def maybe_allow_in_graph(cls): + return cls + + +def randn_tensor( + shape: Union[Tuple, List], + generator: Optional[Union[List["torch.Generator"], "torch.Generator"]] = None, + device: Optional["torch.device"] = None, + dtype: Optional["torch.dtype"] = None, + layout: Optional["torch.layout"] = None, +): + """A helper function to create random tensors on the desired `device` with the desired `dtype`. When + passing a list of generators, you can seed each batch size individually. If CPU generators are passed, the tensor + is always created on the CPU. + """ + # device on which tensor is created defaults to device + rand_device = device + batch_size = shape[0] + + layout = layout or torch.strided + device = device or torch.device("cpu") + + if generator is not None: + gen_device_type = generator.device.type if not isinstance(generator, list) else generator[0].device.type + if gen_device_type != device.type and gen_device_type == "cpu": + rand_device = "cpu" + if device != "mps": + logger.info( + f"The passed generator was created on 'cpu' even though a tensor on {device} was expected." + f" Tensors will be created on 'cpu' and then moved to {device}. Note that one can probably" + f" slighly speed up this function by passing a generator that was created on the {device} device." + ) + elif gen_device_type != device.type and gen_device_type == "cuda": + raise ValueError(f"Cannot generate a {device} tensor from a generator of type {gen_device_type}.") + + if isinstance(generator, list): + shape = (1,) + shape[1:] + latents = [ + torch.randn(shape, generator=generator[i], device=rand_device, dtype=dtype, layout=layout) + for i in range(batch_size) + ] + latents = torch.cat(latents, dim=0).to(device) + else: + latents = torch.randn(shape, generator=generator, device=rand_device, dtype=dtype, layout=layout).to(device) + + return latents + + +def is_compiled_module(module): + """Check whether the module was compiled with torch.compile()""" + if is_torch_version("<", "2.0.0") or not hasattr(torch, "_dynamo"): + return False + return isinstance(module, torch._dynamo.eval_frame.OptimizedModule) diff --git a/gf/sd-webui-rich-text/install.py b/gf/sd-webui-rich-text/install.py new file mode 100644 index 0000000000000000000000000000000000000000..3e7e8c3188bd656ebf2cb6959b7ebfcb9a15b438 --- /dev/null +++ b/gf/sd-webui-rich-text/install.py @@ -0,0 +1,24 @@ +import launch + +# TODO: add pip dependency if need extra module only on extension + +if not launch.is_installed("diffusers"): + launch.run_pip("install diffusers==0.18.2", "requirements for Rich-Text-to-Image") + +if not launch.is_installed("invisible-watermark"): + launch.run_pip("install invisible-watermark==0.2.0", "requirements for Rich-Text-to-Image") + +if not launch.is_installed("accelerate"): + launch.run_pip("install accelerate==0.21.0", "requirements for Rich-Text-to-Image") + +if not launch.is_installed("safetensors"): + launch.run_pip("install safetensors==0.3.1", "requirements for Rich-Text-to-Image") + +if not launch.is_installed("seaborn"): + launch.run_pip("install seaborn==0.12.2", "requirements for Rich-Text-to-Image") + +if not launch.is_installed("scikit-learn"): + launch.run_pip("install scikit-learn==1.3.0", 
"requirements for Rich-Text-to-Image") + +if not launch.is_installed("threadpoolctl"): + launch.run_pip("install threadpoolctl==3.1.0", "requirements for Rich-Text-to-Image") \ No newline at end of file diff --git a/gf/sd-webui-rich-text/rich-text-to-json-iframe.html b/gf/sd-webui-rich-text/rich-text-to-json-iframe.html new file mode 100644 index 0000000000000000000000000000000000000000..c83b3f8063109e442da9d8b86c23272175e68935 --- /dev/null +++ b/gf/sd-webui-rich-text/rich-text-to-json-iframe.html @@ -0,0 +1,341 @@ + + + + + Rich Text to JSON + + + + + + + + +
+ + + + + + + + + + \ No newline at end of file diff --git a/gf/sd-webui-rich-text/scripts/__pycache__/rich_text_on_tab.cpython-310.pyc b/gf/sd-webui-rich-text/scripts/__pycache__/rich_text_on_tab.cpython-310.pyc new file mode 100644 index 0000000000000000000000000000000000000000..a57d646fbdbba57953256fb7b58a9f2233da1276 Binary files /dev/null and b/gf/sd-webui-rich-text/scripts/__pycache__/rich_text_on_tab.cpython-310.pyc differ diff --git a/gf/sd-webui-rich-text/scripts/__pycache__/rich_text_settings.cpython-310.pyc b/gf/sd-webui-rich-text/scripts/__pycache__/rich_text_settings.cpython-310.pyc new file mode 100644 index 0000000000000000000000000000000000000000..d52066bc9b9b6404b3070ce04c58827fe638b0fe Binary files /dev/null and b/gf/sd-webui-rich-text/scripts/__pycache__/rich_text_settings.cpython-310.pyc differ diff --git a/gf/sd-webui-rich-text/scripts/models/__pycache__/attention.cpython-310.pyc b/gf/sd-webui-rich-text/scripts/models/__pycache__/attention.cpython-310.pyc new file mode 100644 index 0000000000000000000000000000000000000000..2589c21c8fb9fb353f6b0049101b6e0dc60e45a6 Binary files /dev/null and b/gf/sd-webui-rich-text/scripts/models/__pycache__/attention.cpython-310.pyc differ diff --git a/gf/sd-webui-rich-text/scripts/models/__pycache__/attention_processor.cpython-310.pyc b/gf/sd-webui-rich-text/scripts/models/__pycache__/attention_processor.cpython-310.pyc new file mode 100644 index 0000000000000000000000000000000000000000..fe349391b16339c1e89b0f3625ba8d495595267e Binary files /dev/null and b/gf/sd-webui-rich-text/scripts/models/__pycache__/attention_processor.cpython-310.pyc differ diff --git a/gf/sd-webui-rich-text/scripts/models/__pycache__/dual_transformer_2d.cpython-310.pyc b/gf/sd-webui-rich-text/scripts/models/__pycache__/dual_transformer_2d.cpython-310.pyc new file mode 100644 index 0000000000000000000000000000000000000000..8e2cff421145a58e23c6af831cfb297a5580413c Binary files /dev/null and b/gf/sd-webui-rich-text/scripts/models/__pycache__/dual_transformer_2d.cpython-310.pyc differ diff --git a/gf/sd-webui-rich-text/scripts/models/__pycache__/region_diffusion.cpython-310.pyc b/gf/sd-webui-rich-text/scripts/models/__pycache__/region_diffusion.cpython-310.pyc new file mode 100644 index 0000000000000000000000000000000000000000..b1aef0129d55ed39454493d6544ef20bd5d6320d Binary files /dev/null and b/gf/sd-webui-rich-text/scripts/models/__pycache__/region_diffusion.cpython-310.pyc differ diff --git a/gf/sd-webui-rich-text/scripts/models/__pycache__/region_diffusion_xl.cpython-310.pyc b/gf/sd-webui-rich-text/scripts/models/__pycache__/region_diffusion_xl.cpython-310.pyc new file mode 100644 index 0000000000000000000000000000000000000000..43a8651027410222a047f780e59454a0c6bc94f5 Binary files /dev/null and b/gf/sd-webui-rich-text/scripts/models/__pycache__/region_diffusion_xl.cpython-310.pyc differ diff --git a/gf/sd-webui-rich-text/scripts/models/__pycache__/resnet.cpython-310.pyc b/gf/sd-webui-rich-text/scripts/models/__pycache__/resnet.cpython-310.pyc new file mode 100644 index 0000000000000000000000000000000000000000..1676944b05fe5d60526b0a7c6d942fb2af2fb0b6 Binary files /dev/null and b/gf/sd-webui-rich-text/scripts/models/__pycache__/resnet.cpython-310.pyc differ diff --git a/gf/sd-webui-rich-text/scripts/models/__pycache__/transformer_2d.cpython-310.pyc b/gf/sd-webui-rich-text/scripts/models/__pycache__/transformer_2d.cpython-310.pyc new file mode 100644 index 0000000000000000000000000000000000000000..eca50243485664fe2101dde9886c39da15f35980 Binary files /dev/null 
and b/gf/sd-webui-rich-text/scripts/models/__pycache__/transformer_2d.cpython-310.pyc differ diff --git a/gf/sd-webui-rich-text/scripts/models/__pycache__/unet_2d_blocks.cpython-310.pyc b/gf/sd-webui-rich-text/scripts/models/__pycache__/unet_2d_blocks.cpython-310.pyc new file mode 100644 index 0000000000000000000000000000000000000000..abfaa9f4149f2d7ef0f762a2f5febd117206fe97 Binary files /dev/null and b/gf/sd-webui-rich-text/scripts/models/__pycache__/unet_2d_blocks.cpython-310.pyc differ diff --git a/gf/sd-webui-rich-text/scripts/models/__pycache__/unet_2d_condition.cpython-310.pyc b/gf/sd-webui-rich-text/scripts/models/__pycache__/unet_2d_condition.cpython-310.pyc new file mode 100644 index 0000000000000000000000000000000000000000..9f0f0427478bcd5ef417dc6d73f5a0bcb6b2eb59 Binary files /dev/null and b/gf/sd-webui-rich-text/scripts/models/__pycache__/unet_2d_condition.cpython-310.pyc differ diff --git a/gf/sd-webui-rich-text/scripts/models/attention.py b/gf/sd-webui-rich-text/scripts/models/attention.py new file mode 100644 index 0000000000000000000000000000000000000000..17155841e05cceb2076be0dc5262d7e3c2150b62 --- /dev/null +++ b/gf/sd-webui-rich-text/scripts/models/attention.py @@ -0,0 +1,391 @@ +# Copyright 2023 The HuggingFace Team. All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +from typing import Any, Dict, Optional + +import torch +import torch.nn.functional as F +from torch import nn + +from diffusers_official.utils import maybe_allow_in_graph +from diffusers_official.models.activations import get_activation +from diffusers_official.models.embeddings import CombinedTimestepLabelEmbeddings + +from scripts.models.attention_processor import Attention + +@maybe_allow_in_graph +class BasicTransformerBlock(nn.Module): + r""" + A basic Transformer block. + + Parameters: + dim (`int`): The number of channels in the input and output. + num_attention_heads (`int`): The number of heads to use for multi-head attention. + attention_head_dim (`int`): The number of channels in each head. + dropout (`float`, *optional*, defaults to 0.0): The dropout probability to use. + cross_attention_dim (`int`, *optional*): The size of the encoder_hidden_states vector for cross attention. + only_cross_attention (`bool`, *optional*): + Whether to use only cross-attention layers. In this case two cross attention layers are used. + double_self_attention (`bool`, *optional*): + Whether to use two self-attention layers. In this case no cross attention layers are used. + activation_fn (`str`, *optional*, defaults to `"geglu"`): Activation function to be used in feed-forward. + num_embeds_ada_norm (: + obj: `int`, *optional*): The number of diffusion steps used during training. See `Transformer2DModel`. + attention_bias (: + obj: `bool`, *optional*, defaults to `False`): Configure if the attentions should contain a bias parameter. 
+ """ + + def __init__( + self, + dim: int, + num_attention_heads: int, + attention_head_dim: int, + dropout=0.0, + cross_attention_dim: Optional[int] = None, + activation_fn: str = "geglu", + num_embeds_ada_norm: Optional[int] = None, + attention_bias: bool = False, + only_cross_attention: bool = False, + double_self_attention: bool = False, + upcast_attention: bool = False, + norm_elementwise_affine: bool = True, + norm_type: str = "layer_norm", + final_dropout: bool = False, + ): + super().__init__() + self.only_cross_attention = only_cross_attention + + self.use_ada_layer_norm_zero = (num_embeds_ada_norm is not None) and norm_type == "ada_norm_zero" + self.use_ada_layer_norm = (num_embeds_ada_norm is not None) and norm_type == "ada_norm" + + if norm_type in ("ada_norm", "ada_norm_zero") and num_embeds_ada_norm is None: + raise ValueError( + f"`norm_type` is set to {norm_type}, but `num_embeds_ada_norm` is not defined. Please make sure to" + f" define `num_embeds_ada_norm` if setting `norm_type` to {norm_type}." + ) + + # Define 3 blocks. Each block has its own normalization layer. + # 1. Self-Attn + if self.use_ada_layer_norm: + self.norm1 = AdaLayerNorm(dim, num_embeds_ada_norm) + elif self.use_ada_layer_norm_zero: + self.norm1 = AdaLayerNormZero(dim, num_embeds_ada_norm) + else: + self.norm1 = nn.LayerNorm(dim, elementwise_affine=norm_elementwise_affine) + self.attn1 = Attention( + query_dim=dim, + heads=num_attention_heads, + dim_head=attention_head_dim, + dropout=dropout, + bias=attention_bias, + cross_attention_dim=cross_attention_dim if only_cross_attention else None, + upcast_attention=upcast_attention, + ) + + # 2. Cross-Attn + if cross_attention_dim is not None or double_self_attention: + # We currently only use AdaLayerNormZero for self attention where there will only be one attention block. + # I.e. the number of returned modulation chunks from AdaLayerZero would not make sense if returned during + # the second cross attention block. + self.norm2 = ( + AdaLayerNorm(dim, num_embeds_ada_norm) + if self.use_ada_layer_norm + else nn.LayerNorm(dim, elementwise_affine=norm_elementwise_affine) + ) + self.attn2 = Attention( + query_dim=dim, + cross_attention_dim=cross_attention_dim if not double_self_attention else None, + heads=num_attention_heads, + dim_head=attention_head_dim, + dropout=dropout, + bias=attention_bias, + upcast_attention=upcast_attention, + ) # is self-attn if encoder_hidden_states is none + else: + self.norm2 = None + self.attn2 = None + + # 3. Feed-forward + self.norm3 = nn.LayerNorm(dim, elementwise_affine=norm_elementwise_affine) + self.ff = FeedForward(dim, dropout=dropout, activation_fn=activation_fn, final_dropout=final_dropout) + + # let chunk size default to None + self._chunk_size = None + self._chunk_dim = 0 + + def set_chunk_feed_forward(self, chunk_size: Optional[int], dim: int): + # Sets chunk feed-forward + self._chunk_size = chunk_size + self._chunk_dim = dim + + def forward( + self, + hidden_states: torch.FloatTensor, + attention_mask: Optional[torch.FloatTensor] = None, + encoder_hidden_states: Optional[torch.FloatTensor] = None, + encoder_attention_mask: Optional[torch.FloatTensor] = None, + timestep: Optional[torch.LongTensor] = None, + cross_attention_kwargs: Dict[str, Any] = None, + class_labels: Optional[torch.LongTensor] = None, + ): + # Notice that normalization is always applied before the real computation in the following blocks. + # 1. 
Self-Attention + if self.use_ada_layer_norm: + norm_hidden_states = self.norm1(hidden_states, timestep) + elif self.use_ada_layer_norm_zero: + norm_hidden_states, gate_msa, shift_mlp, scale_mlp, gate_mlp = self.norm1( + hidden_states, timestep, class_labels, hidden_dtype=hidden_states.dtype + ) + else: + norm_hidden_states = self.norm1(hidden_states) + + cross_attention_kwargs = cross_attention_kwargs if cross_attention_kwargs is not None else {} + + # Rich-Text: ignore the attention probs + attn_output, _ = self.attn1( + norm_hidden_states, + encoder_hidden_states=encoder_hidden_states if self.only_cross_attention else None, + attention_mask=attention_mask, + **cross_attention_kwargs, + ) + if self.use_ada_layer_norm_zero: + attn_output = gate_msa.unsqueeze(1) * attn_output + hidden_states = attn_output + hidden_states + + # 2. Cross-Attention + if self.attn2 is not None: + norm_hidden_states = ( + self.norm2(hidden_states, timestep) if self.use_ada_layer_norm else self.norm2(hidden_states) + ) + + # Rich-Text: ignore the attention probs + attn_output, _ = self.attn2( + norm_hidden_states, + encoder_hidden_states=encoder_hidden_states, + attention_mask=encoder_attention_mask, + **cross_attention_kwargs, + ) + hidden_states = attn_output + hidden_states + + # 3. Feed-forward + norm_hidden_states = self.norm3(hidden_states) + + if self.use_ada_layer_norm_zero: + norm_hidden_states = norm_hidden_states * (1 + scale_mlp[:, None]) + shift_mlp[:, None] + + if self._chunk_size is not None: + # "feed_forward_chunk_size" can be used to save memory + if norm_hidden_states.shape[self._chunk_dim] % self._chunk_size != 0: + raise ValueError( + f"`hidden_states` dimension to be chunked: {norm_hidden_states.shape[self._chunk_dim]} has to be divisible by chunk size: {self._chunk_size}. Make sure to set an appropriate `chunk_size` when calling `unet.enable_forward_chunking`." + ) + + num_chunks = norm_hidden_states.shape[self._chunk_dim] // self._chunk_size + ff_output = torch.cat( + [self.ff(hid_slice) for hid_slice in norm_hidden_states.chunk(num_chunks, dim=self._chunk_dim)], + dim=self._chunk_dim, + ) + else: + ff_output = self.ff(norm_hidden_states) + + if self.use_ada_layer_norm_zero: + ff_output = gate_mlp.unsqueeze(1) * ff_output + + hidden_states = ff_output + hidden_states + + return hidden_states + + +class FeedForward(nn.Module): + r""" + A feed-forward layer. + + Parameters: + dim (`int`): The number of channels in the input. + dim_out (`int`, *optional*): The number of channels in the output. If not given, defaults to `dim`. + mult (`int`, *optional*, defaults to 4): The multiplier to use for the hidden dimension. + dropout (`float`, *optional*, defaults to 0.0): The dropout probability to use. + activation_fn (`str`, *optional*, defaults to `"geglu"`): Activation function to be used in feed-forward. + final_dropout (`bool` *optional*, defaults to False): Apply a final dropout. 
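The `_chunk_size` branch above amounts to slicing the sequence, applying the feed-forward per slice, and concatenating the results. A standalone sketch, with a plain `Linear` standing in for the `FeedForward` module:

```python
import torch
from torch import nn


def chunked_feed_forward(ff, hidden_states, chunk_size, dim=1):
    # Run ff on slices along `dim` and concatenate; lowers peak activation memory.
    num_chunks = hidden_states.shape[dim] // chunk_size
    return torch.cat([ff(chunk) for chunk in hidden_states.chunk(num_chunks, dim=dim)], dim=dim)


ff = nn.Linear(320, 320)
x = torch.randn(2, 64, 320)
print(torch.allclose(chunked_feed_forward(ff, x, chunk_size=16), ff(x), atol=1e-6))  # True
```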
+ """ + + def __init__( + self, + dim: int, + dim_out: Optional[int] = None, + mult: int = 4, + dropout: float = 0.0, + activation_fn: str = "geglu", + final_dropout: bool = False, + ): + super().__init__() + inner_dim = int(dim * mult) + dim_out = dim_out if dim_out is not None else dim + + if activation_fn == "gelu": + act_fn = GELU(dim, inner_dim) + if activation_fn == "gelu-approximate": + act_fn = GELU(dim, inner_dim, approximate="tanh") + elif activation_fn == "geglu": + act_fn = GEGLU(dim, inner_dim) + elif activation_fn == "geglu-approximate": + act_fn = ApproximateGELU(dim, inner_dim) + + self.net = nn.ModuleList([]) + # project in + self.net.append(act_fn) + # project dropout + self.net.append(nn.Dropout(dropout)) + # project out + self.net.append(nn.Linear(inner_dim, dim_out)) + # FF as used in Vision Transformer, MLP-Mixer, etc. have a final dropout + if final_dropout: + self.net.append(nn.Dropout(dropout)) + + def forward(self, hidden_states): + for module in self.net: + hidden_states = module(hidden_states) + return hidden_states + + +class GELU(nn.Module): + r""" + GELU activation function with tanh approximation support with `approximate="tanh"`. + """ + + def __init__(self, dim_in: int, dim_out: int, approximate: str = "none"): + super().__init__() + self.proj = nn.Linear(dim_in, dim_out) + self.approximate = approximate + + def gelu(self, gate): + if gate.device.type != "mps": + return F.gelu(gate, approximate=self.approximate) + # mps: gelu is not implemented for float16 + return F.gelu(gate.to(dtype=torch.float32), approximate=self.approximate).to(dtype=gate.dtype) + + def forward(self, hidden_states): + hidden_states = self.proj(hidden_states) + hidden_states = self.gelu(hidden_states) + return hidden_states + + +class GEGLU(nn.Module): + r""" + A variant of the gated linear unit activation function from https://arxiv.org/abs/2002.05202. + + Parameters: + dim_in (`int`): The number of channels in the input. + dim_out (`int`): The number of channels in the output. + """ + + def __init__(self, dim_in: int, dim_out: int): + super().__init__() + self.proj = nn.Linear(dim_in, dim_out * 2) + + def gelu(self, gate): + if gate.device.type != "mps": + return F.gelu(gate) + # mps: gelu is not implemented for float16 + return F.gelu(gate.to(dtype=torch.float32)).to(dtype=gate.dtype) + + def forward(self, hidden_states): + hidden_states, gate = self.proj(hidden_states).chunk(2, dim=-1) + return hidden_states * self.gelu(gate) + + +class ApproximateGELU(nn.Module): + """ + The approximate form of Gaussian Error Linear Unit (GELU) + + For more details, see section 2: https://arxiv.org/abs/1606.08415 + """ + + def __init__(self, dim_in: int, dim_out: int): + super().__init__() + self.proj = nn.Linear(dim_in, dim_out) + + def forward(self, x): + x = self.proj(x) + return x * torch.sigmoid(1.702 * x) + + +class AdaLayerNorm(nn.Module): + """ + Norm layer modified to incorporate timestep embeddings. + """ + + def __init__(self, embedding_dim, num_embeddings): + super().__init__() + self.emb = nn.Embedding(num_embeddings, embedding_dim) + self.silu = nn.SiLU() + self.linear = nn.Linear(embedding_dim, embedding_dim * 2) + self.norm = nn.LayerNorm(embedding_dim, elementwise_affine=False) + + def forward(self, x, timestep): + emb = self.linear(self.silu(self.emb(timestep))) + scale, shift = torch.chunk(emb, 2) + x = self.norm(x) * (1 + scale) + shift + return x + + +class AdaLayerNormZero(nn.Module): + """ + Norm layer adaptive layer norm zero (adaLN-Zero). 
+ """ + + def __init__(self, embedding_dim, num_embeddings): + super().__init__() + + self.emb = CombinedTimestepLabelEmbeddings(num_embeddings, embedding_dim) + + self.silu = nn.SiLU() + self.linear = nn.Linear(embedding_dim, 6 * embedding_dim, bias=True) + self.norm = nn.LayerNorm(embedding_dim, elementwise_affine=False, eps=1e-6) + + def forward(self, x, timestep, class_labels, hidden_dtype=None): + emb = self.linear(self.silu(self.emb(timestep, class_labels, hidden_dtype=hidden_dtype))) + shift_msa, scale_msa, gate_msa, shift_mlp, scale_mlp, gate_mlp = emb.chunk(6, dim=1) + x = self.norm(x) * (1 + scale_msa[:, None]) + shift_msa[:, None] + return x, gate_msa, shift_mlp, scale_mlp, gate_mlp + + +class AdaGroupNorm(nn.Module): + """ + GroupNorm layer modified to incorporate timestep embeddings. + """ + + def __init__( + self, embedding_dim: int, out_dim: int, num_groups: int, act_fn: Optional[str] = None, eps: float = 1e-5 + ): + super().__init__() + self.num_groups = num_groups + self.eps = eps + + if act_fn is None: + self.act = None + else: + self.act = get_activation(act_fn) + + self.linear = nn.Linear(embedding_dim, out_dim * 2) + + def forward(self, x, emb): + if self.act: + emb = self.act(emb) + emb = self.linear(emb) + emb = emb[:, :, None, None] + scale, shift = emb.chunk(2, dim=1) + + x = F.group_norm(x, self.num_groups, eps=self.eps) + x = x * (1 + scale) + shift + return x diff --git a/gf/sd-webui-rich-text/scripts/models/attention_processor.py b/gf/sd-webui-rich-text/scripts/models/attention_processor.py new file mode 100644 index 0000000000000000000000000000000000000000..7f1d1a5950a09ef9c5fc85dbf7f5bc5d2e8054a6 --- /dev/null +++ b/gf/sd-webui-rich-text/scripts/models/attention_processor.py @@ -0,0 +1,1681 @@ +# Copyright 2023 The HuggingFace Team. All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +from typing import Callable, Optional, Union + +import torch +import torch.nn.functional as F +from torch import nn + +from diffusers_official.utils import deprecate, logging, maybe_allow_in_graph +from diffusers_official.utils.import_utils import is_xformers_available + + +logger = logging.get_logger(__name__) # pylint: disable=invalid-name + + +if is_xformers_available(): + import xformers + import xformers.ops +else: + xformers = None + + +@maybe_allow_in_graph +class Attention(nn.Module): + r""" + A cross attention layer. + + Parameters: + query_dim (`int`): The number of channels in the query. + cross_attention_dim (`int`, *optional*): + The number of channels in the encoder_hidden_states. If not given, defaults to `query_dim`. + heads (`int`, *optional*, defaults to 8): The number of heads to use for multi-head attention. + dim_head (`int`, *optional*, defaults to 64): The number of channels in each head. + dropout (`float`, *optional*, defaults to 0.0): The dropout probability to use. + bias (`bool`, *optional*, defaults to False): + Set to `True` for the query, key, and value linear layers to contain a bias parameter. 
+ """ + + def __init__( + self, + query_dim: int, + cross_attention_dim: Optional[int] = None, + heads: int = 8, + dim_head: int = 64, + dropout: float = 0.0, + bias=False, + upcast_attention: bool = False, + upcast_softmax: bool = False, + cross_attention_norm: Optional[str] = None, + cross_attention_norm_num_groups: int = 32, + added_kv_proj_dim: Optional[int] = None, + norm_num_groups: Optional[int] = None, + spatial_norm_dim: Optional[int] = None, + out_bias: bool = True, + scale_qk: bool = True, + only_cross_attention: bool = False, + eps: float = 1e-5, + rescale_output_factor: float = 1.0, + residual_connection: bool = False, + _from_deprecated_attn_block=False, + processor: Optional["AttnProcessor"] = None, + ): + super().__init__() + inner_dim = dim_head * heads + cross_attention_dim = cross_attention_dim if cross_attention_dim is not None else query_dim + self.upcast_attention = upcast_attention + self.upcast_softmax = upcast_softmax + self.rescale_output_factor = rescale_output_factor + self.residual_connection = residual_connection + self.dropout = dropout + + # we make use of this private variable to know whether this class is loaded + # with an deprecated state dict so that we can convert it on the fly + self._from_deprecated_attn_block = _from_deprecated_attn_block + + self.scale_qk = scale_qk + self.scale = dim_head**-0.5 if self.scale_qk else 1.0 + + self.heads = heads + # for slice_size > 0 the attention score computation + # is split across the batch axis to save memory + # You can set slice_size with `set_attention_slice` + self.sliceable_head_dim = heads + + self.added_kv_proj_dim = added_kv_proj_dim + self.only_cross_attention = only_cross_attention + + if self.added_kv_proj_dim is None and self.only_cross_attention: + raise ValueError( + "`only_cross_attention` can only be set to True if `added_kv_proj_dim` is not None. Make sure to set either `only_cross_attention=False` or define `added_kv_proj_dim`." + ) + + if norm_num_groups is not None: + self.group_norm = nn.GroupNorm(num_channels=query_dim, num_groups=norm_num_groups, eps=eps, affine=True) + else: + self.group_norm = None + + if spatial_norm_dim is not None: + self.spatial_norm = SpatialNorm(f_channels=query_dim, zq_channels=spatial_norm_dim) + else: + self.spatial_norm = None + + if cross_attention_norm is None: + self.norm_cross = None + elif cross_attention_norm == "layer_norm": + self.norm_cross = nn.LayerNorm(cross_attention_dim) + elif cross_attention_norm == "group_norm": + if self.added_kv_proj_dim is not None: + # The given `encoder_hidden_states` are initially of shape + # (batch_size, seq_len, added_kv_proj_dim) before being projected + # to (batch_size, seq_len, cross_attention_dim). The norm is applied + # before the projection, so we need to use `added_kv_proj_dim` as + # the number of channels for the group norm. + norm_cross_num_channels = added_kv_proj_dim + else: + norm_cross_num_channels = cross_attention_dim + + self.norm_cross = nn.GroupNorm( + num_channels=norm_cross_num_channels, num_groups=cross_attention_norm_num_groups, eps=1e-5, affine=True + ) + else: + raise ValueError( + f"unknown cross_attention_norm: {cross_attention_norm}. 
Should be None, 'layer_norm' or 'group_norm'" + ) + + self.to_q = nn.Linear(query_dim, inner_dim, bias=bias) + + if not self.only_cross_attention: + # only relevant for the `AddedKVProcessor` classes + self.to_k = nn.Linear(cross_attention_dim, inner_dim, bias=bias) + self.to_v = nn.Linear(cross_attention_dim, inner_dim, bias=bias) + else: + self.to_k = None + self.to_v = None + + if self.added_kv_proj_dim is not None: + self.add_k_proj = nn.Linear(added_kv_proj_dim, inner_dim) + self.add_v_proj = nn.Linear(added_kv_proj_dim, inner_dim) + + self.to_out = nn.ModuleList([]) + self.to_out.append(nn.Linear(inner_dim, query_dim, bias=out_bias)) + self.to_out.append(nn.Dropout(dropout)) + + # set attention processor + # We use the AttnProcessor2_0 by default when torch 2.x is used which uses + # torch.nn.functional.scaled_dot_product_attention for native Flash/memory_efficient_attention + # but only if it has the default `scale` argument. TODO remove scale_qk check when we move to torch 2.1 + if processor is None: + processor = ( + AttnProcessor2_0() if hasattr(F, "scaled_dot_product_attention") and self.scale_qk else AttnProcessor() + ) + self.set_processor(processor) + + # Rich-Text: util function for averaging over attention heads + def reshape_batch_dim_to_heads_and_average(self, tensor): + batch_size, seq_len, seq_len2 = tensor.shape + head_size = self.heads + tensor = tensor.reshape(batch_size // head_size, + head_size, seq_len, seq_len2) + return tensor.mean(1) + + def set_use_memory_efficient_attention_xformers( + self, use_memory_efficient_attention_xformers: bool, attention_op: Optional[Callable] = None + ): + is_lora = hasattr(self, "processor") and isinstance( + self.processor, + (LoRAAttnProcessor, LoRAAttnProcessor2_0, LoRAXFormersAttnProcessor, LoRAAttnAddedKVProcessor), + ) + is_custom_diffusion = hasattr(self, "processor") and isinstance( + self.processor, (CustomDiffusionAttnProcessor, CustomDiffusionXFormersAttnProcessor) + ) + is_added_kv_processor = hasattr(self, "processor") and isinstance( + self.processor, + ( + AttnAddedKVProcessor, + AttnAddedKVProcessor2_0, + SlicedAttnAddedKVProcessor, + XFormersAttnAddedKVProcessor, + LoRAAttnAddedKVProcessor, + ), + ) + + if use_memory_efficient_attention_xformers: + if is_added_kv_processor and (is_lora or is_custom_diffusion): + raise NotImplementedError( + f"Memory efficient attention is currently not supported for LoRA or custom diffuson for attention processor type {self.processor}" + ) + if not is_xformers_available(): + raise ModuleNotFoundError( + ( + "Refer to https://github.com/facebookresearch/xformers for more information on how to install" + " xformers" + ), + name="xformers", + ) + elif not torch.cuda.is_available(): + raise ValueError( + "torch.cuda.is_available() should be True but is False. xformers' memory efficient attention is" + " only available for GPU " + ) + else: + try: + # Make sure we can run the memory efficient attention + _ = xformers.ops.memory_efficient_attention( + torch.randn((1, 2, 40), device="cuda"), + torch.randn((1, 2, 40), device="cuda"), + torch.randn((1, 2, 40), device="cuda"), + ) + except Exception as e: + raise e + + if is_lora: + # TODO (sayakpaul): should we throw a warning if someone wants to use the xformers + # variant when using PT 2.0 now that we have LoRAAttnProcessor2_0? 
+ processor = LoRAXFormersAttnProcessor( + hidden_size=self.processor.hidden_size, + cross_attention_dim=self.processor.cross_attention_dim, + rank=self.processor.rank, + attention_op=attention_op, + ) + processor.load_state_dict(self.processor.state_dict()) + processor.to(self.processor.to_q_lora.up.weight.device) + elif is_custom_diffusion: + processor = CustomDiffusionXFormersAttnProcessor( + train_kv=self.processor.train_kv, + train_q_out=self.processor.train_q_out, + hidden_size=self.processor.hidden_size, + cross_attention_dim=self.processor.cross_attention_dim, + attention_op=attention_op, + ) + processor.load_state_dict(self.processor.state_dict()) + if hasattr(self.processor, "to_k_custom_diffusion"): + processor.to(self.processor.to_k_custom_diffusion.weight.device) + elif is_added_kv_processor: + # TODO(Patrick, Suraj, William) - currently xformers doesn't work for UnCLIP + # which uses this type of cross attention ONLY because the attention mask of format + # [0, ..., -10.000, ..., 0, ...,] is not supported + # throw warning + logger.info( + "Memory efficient attention with `xformers` might currently not work correctly if an attention mask is required for the attention operation." + ) + processor = XFormersAttnAddedKVProcessor(attention_op=attention_op) + else: + processor = XFormersAttnProcessor(attention_op=attention_op) + else: + if is_lora: + attn_processor_class = ( + LoRAAttnProcessor2_0 if hasattr(F, "scaled_dot_product_attention") else LoRAAttnProcessor + ) + processor = attn_processor_class( + hidden_size=self.processor.hidden_size, + cross_attention_dim=self.processor.cross_attention_dim, + rank=self.processor.rank, + ) + processor.load_state_dict(self.processor.state_dict()) + processor.to(self.processor.to_q_lora.up.weight.device) + elif is_custom_diffusion: + processor = CustomDiffusionAttnProcessor( + train_kv=self.processor.train_kv, + train_q_out=self.processor.train_q_out, + hidden_size=self.processor.hidden_size, + cross_attention_dim=self.processor.cross_attention_dim, + ) + processor.load_state_dict(self.processor.state_dict()) + if hasattr(self.processor, "to_k_custom_diffusion"): + processor.to(self.processor.to_k_custom_diffusion.weight.device) + else: + # set attention processor + # We use the AttnProcessor2_0 by default when torch 2.x is used which uses + # torch.nn.functional.scaled_dot_product_attention for native Flash/memory_efficient_attention + # but only if it has the default `scale` argument. TODO remove scale_qk check when we move to torch 2.1 + processor = ( + AttnProcessor2_0() + if hasattr(F, "scaled_dot_product_attention") and self.scale_qk + else AttnProcessor() + ) + + self.set_processor(processor) + + def set_attention_slice(self, slice_size): + if slice_size is not None and slice_size > self.sliceable_head_dim: + raise ValueError(f"slice_size {slice_size} has to be smaller or equal to {self.sliceable_head_dim}.") + + if slice_size is not None and self.added_kv_proj_dim is not None: + processor = SlicedAttnAddedKVProcessor(slice_size) + elif slice_size is not None: + processor = SlicedAttnProcessor(slice_size) + elif self.added_kv_proj_dim is not None: + processor = AttnAddedKVProcessor() + else: + # set attention processor + # We use the AttnProcessor2_0 by default when torch 2.x is used which uses + # torch.nn.functional.scaled_dot_product_attention for native Flash/memory_efficient_attention + # but only if it has the default `scale` argument. 
TODO remove scale_qk check when we move to torch 2.1 + processor = ( + AttnProcessor2_0() if hasattr(F, "scaled_dot_product_attention") and self.scale_qk else AttnProcessor() + ) + + self.set_processor(processor) + + def set_processor(self, processor: "AttnProcessor"): + # if current processor is in `self._modules` and if passed `processor` is not, we need to + # pop `processor` from `self._modules` + if ( + hasattr(self, "processor") + and isinstance(self.processor, torch.nn.Module) + and not isinstance(processor, torch.nn.Module) + ): + logger.info(f"You are removing possibly trained weights of {self.processor} with {processor}") + self._modules.pop("processor") + + self.processor = processor + + # Rich-Text: inject self-attention maps + def forward(self, hidden_states, real_attn_probs=None, attn_weights=None, encoder_hidden_states=None, attention_mask=None, **cross_attention_kwargs): + # The `Attention` class can call different attention processors / attention functions + # here we simply pass along all tensors to the selected processor class + # For standard processors that are defined here, `**cross_attention_kwargs` is empty + return self.processor( + self, + hidden_states, + real_attn_probs=real_attn_probs, + attn_weights=attn_weights, + encoder_hidden_states=encoder_hidden_states, + attention_mask=attention_mask, + **cross_attention_kwargs, + ) + + def batch_to_head_dim(self, tensor): + head_size = self.heads + batch_size, seq_len, dim = tensor.shape + tensor = tensor.reshape(batch_size // head_size, head_size, seq_len, dim) + tensor = tensor.permute(0, 2, 1, 3).reshape(batch_size // head_size, seq_len, dim * head_size) + return tensor + + def head_to_batch_dim(self, tensor, out_dim=3): + head_size = self.heads + batch_size, seq_len, dim = tensor.shape + tensor = tensor.reshape(batch_size, seq_len, head_size, dim // head_size) + tensor = tensor.permute(0, 2, 1, 3) + + if out_dim == 3: + tensor = tensor.reshape(batch_size * head_size, seq_len, dim // head_size) + + return tensor + + # Rich-Text: return attention scores + def get_attention_scores(self, query, key, attention_mask=None, attn_weights=False): + dtype = query.dtype + if self.upcast_attention: + query = query.float() + key = key.float() + + if attention_mask is None: + baddbmm_input = torch.empty( + query.shape[0], query.shape[1], key.shape[1], dtype=query.dtype, device=query.device + ) + beta = 0 + else: + baddbmm_input = attention_mask + beta = 1 + + attention_scores = torch.baddbmm( + baddbmm_input, + query, + key.transpose(-1, -2), + beta=beta, + alpha=self.scale, + ) + del baddbmm_input + + if self.upcast_softmax: + attention_scores = attention_scores.float() + + # Rich-Text: font size + if attn_weights is not None: + assert key.shape[1] == 77 + attention_scores_stable = attention_scores - attention_scores.max(-1, True)[0] + attention_score_exp = attention_scores_stable.float().exp() + # attention_score_exp = attention_scores.float().exp() + font_size_abs, font_size_sign = attn_weights['font_size'].abs(), attn_weights['font_size'].sign() + attention_score_exp[:, :, attn_weights['word_pos']] = attention_score_exp[:, :, attn_weights['word_pos']].clone( + )*font_size_abs + attention_probs = attention_score_exp / attention_score_exp.sum(-1, True) + attention_probs[:, :, attn_weights['word_pos']] *= font_size_sign + # import ipdb; ipdb.set_trace() + if attention_probs.isnan().any(): + import ipdb; ipdb.set_trace() + else: + attention_probs = attention_scores.softmax(dim=-1) + + del attention_scores + + attention_probs = 
attention_probs.to(dtype) + + return attention_probs + + def prepare_attention_mask(self, attention_mask, target_length, batch_size=None, out_dim=3): + if batch_size is None: + deprecate( + "batch_size=None", + "0.0.15", + ( + "Not passing the `batch_size` parameter to `prepare_attention_mask` can lead to incorrect" + " attention mask preparation and is deprecated behavior. Please make sure to pass `batch_size` to" + " `prepare_attention_mask` when preparing the attention_mask." + ), + ) + batch_size = 1 + + head_size = self.heads + if attention_mask is None: + return attention_mask + + current_length: int = attention_mask.shape[-1] + if current_length != target_length: + if attention_mask.device.type == "mps": + # HACK: MPS: Does not support padding by greater than dimension of input tensor. + # Instead, we can manually construct the padding tensor. + padding_shape = (attention_mask.shape[0], attention_mask.shape[1], target_length) + padding = torch.zeros(padding_shape, dtype=attention_mask.dtype, device=attention_mask.device) + attention_mask = torch.cat([attention_mask, padding], dim=2) + else: + # TODO: for pipelines such as stable-diffusion, padding cross-attn mask: + # we want to instead pad by (0, remaining_length), where remaining_length is: + # remaining_length: int = target_length - current_length + # TODO: re-enable tests/models/test_models_unet_2d_condition.py#test_model_xattn_padding + attention_mask = F.pad(attention_mask, (0, target_length), value=0.0) + + if out_dim == 3: + if attention_mask.shape[0] < batch_size * head_size: + attention_mask = attention_mask.repeat_interleave(head_size, dim=0) + elif out_dim == 4: + attention_mask = attention_mask.unsqueeze(1) + attention_mask = attention_mask.repeat_interleave(head_size, dim=1) + + return attention_mask + + def norm_encoder_hidden_states(self, encoder_hidden_states): + assert self.norm_cross is not None, "self.norm_cross must be defined to call self.norm_encoder_hidden_states" + + if isinstance(self.norm_cross, nn.LayerNorm): + encoder_hidden_states = self.norm_cross(encoder_hidden_states) + elif isinstance(self.norm_cross, nn.GroupNorm): + # Group norm norms along the channels dimension and expects + # input to be in the shape of (N, C, *). In this case, we want + # to norm along the hidden dimension, so we need to move + # (batch_size, sequence_length, hidden_size) -> + # (batch_size, hidden_size, sequence_length) + encoder_hidden_states = encoder_hidden_states.transpose(1, 2) + encoder_hidden_states = self.norm_cross(encoder_hidden_states) + encoder_hidden_states = encoder_hidden_states.transpose(1, 2) + else: + assert False + + return encoder_hidden_states + + +class AttnProcessor: + r""" + Default processor for performing attention-related computations. 
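To make the Rich-Text font-size branch of `get_attention_scores` concrete, here is a standalone sketch of the same reweighting on toy scores: the exponentiated scores at the styled token positions are scaled by |font_size| before normalization, and the sign is reapplied afterwards. Shapes and values are illustrative.

```python
import torch


def reweight_attention(scores, word_pos, font_size):
    exp = (scores - scores.max(-1, keepdim=True)[0]).exp()
    exp[:, :, word_pos] = exp[:, :, word_pos] * abs(font_size)
    probs = exp / exp.sum(-1, keepdim=True)
    probs[:, :, word_pos] = probs[:, :, word_pos] * (1.0 if font_size >= 0 else -1.0)
    return probs


scores = torch.randn(1, 4, 77)   # (batch * heads, query_len, 77 text tokens)
word_pos = [5, 6]                # token positions of the styled word
boosted = reweight_attention(scores, word_pos, font_size=2.0)
baseline = reweight_attention(scores, word_pos, font_size=1.0)
print((boosted[:, :, word_pos] > baseline[:, :, word_pos]).all().item())  # True
```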
+ """ + + # Rich-Text: inject self-attention maps + def __call__( + self, + attn: Attention, + hidden_states, + real_attn_probs=None, + attn_weights=None, + encoder_hidden_states=None, + attention_mask=None, + temb=None, + ): + residual = hidden_states + + if attn.spatial_norm is not None: + hidden_states = attn.spatial_norm(hidden_states, temb) + + input_ndim = hidden_states.ndim + + if input_ndim == 4: + batch_size, channel, height, width = hidden_states.shape + hidden_states = hidden_states.view(batch_size, channel, height * width).transpose(1, 2) + + batch_size, sequence_length, _ = ( + hidden_states.shape if encoder_hidden_states is None else encoder_hidden_states.shape + ) + attention_mask = attn.prepare_attention_mask(attention_mask, sequence_length, batch_size) + + if attn.group_norm is not None: + hidden_states = attn.group_norm(hidden_states.transpose(1, 2)).transpose(1, 2) + + query = attn.to_q(hidden_states) + + if encoder_hidden_states is None: + encoder_hidden_states = hidden_states + elif attn.norm_cross: + encoder_hidden_states = attn.norm_encoder_hidden_states(encoder_hidden_states) + + key = attn.to_k(encoder_hidden_states) + value = attn.to_v(encoder_hidden_states) + + query = attn.head_to_batch_dim(query) + key = attn.head_to_batch_dim(key) + value = attn.head_to_batch_dim(value) + + if real_attn_probs is None: + # Rich-Text: font size + attention_probs = attn.get_attention_scores(query, key, attention_mask, attn_weights=attn_weights) + else: + # Rich-Text: inject self-attention maps + attention_probs = real_attn_probs + hidden_states = torch.bmm(attention_probs, value) + hidden_states = attn.batch_to_head_dim(hidden_states) + + # linear proj + hidden_states = attn.to_out[0](hidden_states) + # dropout + hidden_states = attn.to_out[1](hidden_states) + + if input_ndim == 4: + hidden_states = hidden_states.transpose(-1, -2).reshape(batch_size, channel, height, width) + + if attn.residual_connection: + hidden_states = hidden_states + residual + + hidden_states = hidden_states / attn.rescale_output_factor + + # Rich-Text Modified: return attn probs + # We return the map averaged over heads to save memory footprint + attention_probs_avg = attn.reshape_batch_dim_to_heads_and_average( + attention_probs) + return hidden_states, [attention_probs_avg, attention_probs] + + +class LoRALinearLayer(nn.Module): + def __init__(self, in_features, out_features, rank=4, network_alpha=None): + super().__init__() + + if rank > min(in_features, out_features): + raise ValueError(f"LoRA rank {rank} must be less or equal than {min(in_features, out_features)}") + + self.down = nn.Linear(in_features, rank, bias=False) + self.up = nn.Linear(rank, out_features, bias=False) + # This value has the same meaning as the `--network_alpha` option in the kohya-ss trainer script. + # See https://github.com/darkstorm2150/sd-scripts/blob/main/docs/train_network_README-en.md#execute-learning + self.network_alpha = network_alpha + self.rank = rank + + nn.init.normal_(self.down.weight, std=1 / rank) + nn.init.zeros_(self.up.weight) + + def forward(self, hidden_states): + orig_dtype = hidden_states.dtype + dtype = self.down.weight.dtype + + down_hidden_states = self.down(hidden_states.to(dtype)) + up_hidden_states = self.up(down_hidden_states) + + if self.network_alpha is not None: + up_hidden_states *= self.network_alpha / self.rank + + return up_hidden_states.to(orig_dtype) + + +class LoRAAttnProcessor(nn.Module): + r""" + Processor for implementing the LoRA attention mechanism. 
+ + Args: + hidden_size (`int`, *optional*): + The hidden size of the attention layer. + cross_attention_dim (`int`, *optional*): + The number of channels in the `encoder_hidden_states`. + rank (`int`, defaults to 4): + The dimension of the LoRA update matrices. + network_alpha (`int`, *optional*): + Equivalent to `alpha` but it's usage is specific to Kohya (A1111) style LoRAs. + """ + + def __init__(self, hidden_size, cross_attention_dim=None, rank=4, network_alpha=None): + super().__init__() + + self.hidden_size = hidden_size + self.cross_attention_dim = cross_attention_dim + self.rank = rank + + self.to_q_lora = LoRALinearLayer(hidden_size, hidden_size, rank, network_alpha) + self.to_k_lora = LoRALinearLayer(cross_attention_dim or hidden_size, hidden_size, rank, network_alpha) + self.to_v_lora = LoRALinearLayer(cross_attention_dim or hidden_size, hidden_size, rank, network_alpha) + self.to_out_lora = LoRALinearLayer(hidden_size, hidden_size, rank, network_alpha) + + def __call__( + self, attn: Attention, hidden_states, encoder_hidden_states=None, attention_mask=None, scale=1.0, temb=None + ): + residual = hidden_states + + if attn.spatial_norm is not None: + hidden_states = attn.spatial_norm(hidden_states, temb) + + input_ndim = hidden_states.ndim + + if input_ndim == 4: + batch_size, channel, height, width = hidden_states.shape + hidden_states = hidden_states.view(batch_size, channel, height * width).transpose(1, 2) + + batch_size, sequence_length, _ = ( + hidden_states.shape if encoder_hidden_states is None else encoder_hidden_states.shape + ) + attention_mask = attn.prepare_attention_mask(attention_mask, sequence_length, batch_size) + + if attn.group_norm is not None: + hidden_states = attn.group_norm(hidden_states.transpose(1, 2)).transpose(1, 2) + + query = attn.to_q(hidden_states) + scale * self.to_q_lora(hidden_states) + query = attn.head_to_batch_dim(query) + + if encoder_hidden_states is None: + encoder_hidden_states = hidden_states + elif attn.norm_cross: + encoder_hidden_states = attn.norm_encoder_hidden_states(encoder_hidden_states) + + key = attn.to_k(encoder_hidden_states) + scale * self.to_k_lora(encoder_hidden_states) + value = attn.to_v(encoder_hidden_states) + scale * self.to_v_lora(encoder_hidden_states) + + key = attn.head_to_batch_dim(key) + value = attn.head_to_batch_dim(value) + + attention_probs = attn.get_attention_scores(query, key, attention_mask) + hidden_states = torch.bmm(attention_probs, value) + hidden_states = attn.batch_to_head_dim(hidden_states) + + # linear proj + hidden_states = attn.to_out[0](hidden_states) + scale * self.to_out_lora(hidden_states) + # dropout + hidden_states = attn.to_out[1](hidden_states) + + if input_ndim == 4: + hidden_states = hidden_states.transpose(-1, -2).reshape(batch_size, channel, height, width) + + if attn.residual_connection: + hidden_states = hidden_states + residual + + hidden_states = hidden_states / attn.rescale_output_factor + + return hidden_states + + +class CustomDiffusionAttnProcessor(nn.Module): + r""" + Processor for implementing attention for the Custom Diffusion method. + + Args: + train_kv (`bool`, defaults to `True`): + Whether to newly train the key and value matrices corresponding to the text features. + train_q_out (`bool`, defaults to `True`): + Whether to newly train query matrices corresponding to the latent image features. + hidden_size (`int`, *optional*, defaults to `None`): + The hidden size of the attention layer. 
+ cross_attention_dim (`int`, *optional*, defaults to `None`): + The number of channels in the `encoder_hidden_states`. + out_bias (`bool`, defaults to `True`): + Whether to include the bias parameter in `train_q_out`. + dropout (`float`, *optional*, defaults to 0.0): + The dropout probability to use. + """ + + def __init__( + self, + train_kv=True, + train_q_out=True, + hidden_size=None, + cross_attention_dim=None, + out_bias=True, + dropout=0.0, + ): + super().__init__() + self.train_kv = train_kv + self.train_q_out = train_q_out + + self.hidden_size = hidden_size + self.cross_attention_dim = cross_attention_dim + + # `_custom_diffusion` id for easy serialization and loading. + if self.train_kv: + self.to_k_custom_diffusion = nn.Linear(cross_attention_dim or hidden_size, hidden_size, bias=False) + self.to_v_custom_diffusion = nn.Linear(cross_attention_dim or hidden_size, hidden_size, bias=False) + if self.train_q_out: + self.to_q_custom_diffusion = nn.Linear(hidden_size, hidden_size, bias=False) + self.to_out_custom_diffusion = nn.ModuleList([]) + self.to_out_custom_diffusion.append(nn.Linear(hidden_size, hidden_size, bias=out_bias)) + self.to_out_custom_diffusion.append(nn.Dropout(dropout)) + + def __call__(self, attn: Attention, hidden_states, encoder_hidden_states=None, attention_mask=None): + batch_size, sequence_length, _ = hidden_states.shape + attention_mask = attn.prepare_attention_mask(attention_mask, sequence_length, batch_size) + if self.train_q_out: + query = self.to_q_custom_diffusion(hidden_states) + else: + query = attn.to_q(hidden_states) + + if encoder_hidden_states is None: + crossattn = False + encoder_hidden_states = hidden_states + else: + crossattn = True + if attn.norm_cross: + encoder_hidden_states = attn.norm_encoder_hidden_states(encoder_hidden_states) + + if self.train_kv: + key = self.to_k_custom_diffusion(encoder_hidden_states) + value = self.to_v_custom_diffusion(encoder_hidden_states) + else: + key = attn.to_k(encoder_hidden_states) + value = attn.to_v(encoder_hidden_states) + + if crossattn: + detach = torch.ones_like(key) + detach[:, :1, :] = detach[:, :1, :] * 0.0 + key = detach * key + (1 - detach) * key.detach() + value = detach * value + (1 - detach) * value.detach() + + query = attn.head_to_batch_dim(query) + key = attn.head_to_batch_dim(key) + value = attn.head_to_batch_dim(value) + + attention_probs = attn.get_attention_scores(query, key, attention_mask) + hidden_states = torch.bmm(attention_probs, value) + hidden_states = attn.batch_to_head_dim(hidden_states) + + if self.train_q_out: + # linear proj + hidden_states = self.to_out_custom_diffusion[0](hidden_states) + # dropout + hidden_states = self.to_out_custom_diffusion[1](hidden_states) + else: + # linear proj + hidden_states = attn.to_out[0](hidden_states) + # dropout + hidden_states = attn.to_out[1](hidden_states) + + return hidden_states + + +class AttnAddedKVProcessor: + r""" + Processor for performing attention-related computations with extra learnable key and value matrices for the text + encoder. 
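+
+    The text-encoder states are projected with `add_k_proj`/`add_v_proj` and, unless
+    `only_cross_attention` is set, concatenated with the usual self-attention keys and
+    values along the token dimension, so the image query attends over both. Condensed
+    sketch of the key path from the implementation below:
+
+        key_text = attn.head_to_batch_dim(attn.add_k_proj(encoder_hidden_states))
+        key_self = attn.head_to_batch_dim(attn.to_k(hidden_states))
+        key = torch.cat([key_text, key_self], dim=1)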
+ """ + + def __call__(self, attn: Attention, hidden_states, encoder_hidden_states=None, attention_mask=None): + residual = hidden_states + hidden_states = hidden_states.view(hidden_states.shape[0], hidden_states.shape[1], -1).transpose(1, 2) + batch_size, sequence_length, _ = hidden_states.shape + + attention_mask = attn.prepare_attention_mask(attention_mask, sequence_length, batch_size) + + if encoder_hidden_states is None: + encoder_hidden_states = hidden_states + elif attn.norm_cross: + encoder_hidden_states = attn.norm_encoder_hidden_states(encoder_hidden_states) + + hidden_states = attn.group_norm(hidden_states.transpose(1, 2)).transpose(1, 2) + + query = attn.to_q(hidden_states) + query = attn.head_to_batch_dim(query) + + encoder_hidden_states_key_proj = attn.add_k_proj(encoder_hidden_states) + encoder_hidden_states_value_proj = attn.add_v_proj(encoder_hidden_states) + encoder_hidden_states_key_proj = attn.head_to_batch_dim(encoder_hidden_states_key_proj) + encoder_hidden_states_value_proj = attn.head_to_batch_dim(encoder_hidden_states_value_proj) + + if not attn.only_cross_attention: + key = attn.to_k(hidden_states) + value = attn.to_v(hidden_states) + key = attn.head_to_batch_dim(key) + value = attn.head_to_batch_dim(value) + key = torch.cat([encoder_hidden_states_key_proj, key], dim=1) + value = torch.cat([encoder_hidden_states_value_proj, value], dim=1) + else: + key = encoder_hidden_states_key_proj + value = encoder_hidden_states_value_proj + + attention_probs = attn.get_attention_scores(query, key, attention_mask) + hidden_states = torch.bmm(attention_probs, value) + hidden_states = attn.batch_to_head_dim(hidden_states) + + # linear proj + hidden_states = attn.to_out[0](hidden_states) + # dropout + hidden_states = attn.to_out[1](hidden_states) + + hidden_states = hidden_states.transpose(-1, -2).reshape(residual.shape) + hidden_states = hidden_states + residual + + return hidden_states + + +class AttnAddedKVProcessor2_0: + r""" + Processor for performing scaled dot-product attention (enabled by default if you're using PyTorch 2.0), with extra + learnable key and value matrices for the text encoder. + """ + + def __init__(self): + if not hasattr(F, "scaled_dot_product_attention"): + raise ImportError( + "AttnAddedKVProcessor2_0 requires PyTorch 2.0, to use it, please upgrade PyTorch to 2.0." 
+ ) + + def __call__(self, attn: Attention, hidden_states, encoder_hidden_states=None, attention_mask=None): + residual = hidden_states + hidden_states = hidden_states.view(hidden_states.shape[0], hidden_states.shape[1], -1).transpose(1, 2) + batch_size, sequence_length, _ = hidden_states.shape + + attention_mask = attn.prepare_attention_mask(attention_mask, sequence_length, batch_size, out_dim=4) + + if encoder_hidden_states is None: + encoder_hidden_states = hidden_states + elif attn.norm_cross: + encoder_hidden_states = attn.norm_encoder_hidden_states(encoder_hidden_states) + + hidden_states = attn.group_norm(hidden_states.transpose(1, 2)).transpose(1, 2) + + query = attn.to_q(hidden_states) + query = attn.head_to_batch_dim(query, out_dim=4) + + encoder_hidden_states_key_proj = attn.add_k_proj(encoder_hidden_states) + encoder_hidden_states_value_proj = attn.add_v_proj(encoder_hidden_states) + encoder_hidden_states_key_proj = attn.head_to_batch_dim(encoder_hidden_states_key_proj, out_dim=4) + encoder_hidden_states_value_proj = attn.head_to_batch_dim(encoder_hidden_states_value_proj, out_dim=4) + + if not attn.only_cross_attention: + key = attn.to_k(hidden_states) + value = attn.to_v(hidden_states) + key = attn.head_to_batch_dim(key, out_dim=4) + value = attn.head_to_batch_dim(value, out_dim=4) + key = torch.cat([encoder_hidden_states_key_proj, key], dim=2) + value = torch.cat([encoder_hidden_states_value_proj, value], dim=2) + else: + key = encoder_hidden_states_key_proj + value = encoder_hidden_states_value_proj + + # the output of sdp = (batch, num_heads, seq_len, head_dim) + # TODO: add support for attn.scale when we move to Torch 2.1 + hidden_states = F.scaled_dot_product_attention( + query, key, value, attn_mask=attention_mask, dropout_p=0.0, is_causal=False + ) + hidden_states = hidden_states.transpose(1, 2).reshape(batch_size, -1, residual.shape[1]) + + # linear proj + hidden_states = attn.to_out[0](hidden_states) + # dropout + hidden_states = attn.to_out[1](hidden_states) + + hidden_states = hidden_states.transpose(-1, -2).reshape(residual.shape) + hidden_states = hidden_states + residual + + return hidden_states + + +class LoRAAttnAddedKVProcessor(nn.Module): + r""" + Processor for implementing the LoRA attention mechanism with extra learnable key and value matrices for the text + encoder. + + Args: + hidden_size (`int`, *optional*): + The hidden size of the attention layer. + cross_attention_dim (`int`, *optional*, defaults to `None`): + The number of channels in the `encoder_hidden_states`. + rank (`int`, defaults to 4): + The dimension of the LoRA update matrices. 
+ + """ + + def __init__(self, hidden_size, cross_attention_dim=None, rank=4, network_alpha=None): + super().__init__() + + self.hidden_size = hidden_size + self.cross_attention_dim = cross_attention_dim + self.rank = rank + + self.to_q_lora = LoRALinearLayer(hidden_size, hidden_size, rank, network_alpha) + self.add_k_proj_lora = LoRALinearLayer(cross_attention_dim or hidden_size, hidden_size, rank, network_alpha) + self.add_v_proj_lora = LoRALinearLayer(cross_attention_dim or hidden_size, hidden_size, rank, network_alpha) + self.to_k_lora = LoRALinearLayer(hidden_size, hidden_size, rank, network_alpha) + self.to_v_lora = LoRALinearLayer(hidden_size, hidden_size, rank, network_alpha) + self.to_out_lora = LoRALinearLayer(hidden_size, hidden_size, rank, network_alpha) + + def __call__(self, attn: Attention, hidden_states, encoder_hidden_states=None, attention_mask=None, scale=1.0): + residual = hidden_states + hidden_states = hidden_states.view(hidden_states.shape[0], hidden_states.shape[1], -1).transpose(1, 2) + batch_size, sequence_length, _ = hidden_states.shape + + attention_mask = attn.prepare_attention_mask(attention_mask, sequence_length, batch_size) + + if encoder_hidden_states is None: + encoder_hidden_states = hidden_states + elif attn.norm_cross: + encoder_hidden_states = attn.norm_encoder_hidden_states(encoder_hidden_states) + + hidden_states = attn.group_norm(hidden_states.transpose(1, 2)).transpose(1, 2) + + query = attn.to_q(hidden_states) + scale * self.to_q_lora(hidden_states) + query = attn.head_to_batch_dim(query) + + encoder_hidden_states_key_proj = attn.add_k_proj(encoder_hidden_states) + scale * self.add_k_proj_lora( + encoder_hidden_states + ) + encoder_hidden_states_value_proj = attn.add_v_proj(encoder_hidden_states) + scale * self.add_v_proj_lora( + encoder_hidden_states + ) + encoder_hidden_states_key_proj = attn.head_to_batch_dim(encoder_hidden_states_key_proj) + encoder_hidden_states_value_proj = attn.head_to_batch_dim(encoder_hidden_states_value_proj) + + if not attn.only_cross_attention: + key = attn.to_k(hidden_states) + scale * self.to_k_lora(hidden_states) + value = attn.to_v(hidden_states) + scale * self.to_v_lora(hidden_states) + key = attn.head_to_batch_dim(key) + value = attn.head_to_batch_dim(value) + key = torch.cat([encoder_hidden_states_key_proj, key], dim=1) + value = torch.cat([encoder_hidden_states_value_proj, value], dim=1) + else: + key = encoder_hidden_states_key_proj + value = encoder_hidden_states_value_proj + + attention_probs = attn.get_attention_scores(query, key, attention_mask) + hidden_states = torch.bmm(attention_probs, value) + hidden_states = attn.batch_to_head_dim(hidden_states) + + # linear proj + hidden_states = attn.to_out[0](hidden_states) + scale * self.to_out_lora(hidden_states) + # dropout + hidden_states = attn.to_out[1](hidden_states) + + hidden_states = hidden_states.transpose(-1, -2).reshape(residual.shape) + hidden_states = hidden_states + residual + + return hidden_states + + +class XFormersAttnAddedKVProcessor: + r""" + Processor for implementing memory efficient attention using xFormers. + + Args: + attention_op (`Callable`, *optional*, defaults to `None`): + The base + [operator](https://facebookresearch.github.io/xformers/components/ops.html#xformers.ops.AttentionOpBase) to + use as the attention operator. It is recommended to set to `None`, and allow xFormers to choose the best + operator. 
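+
+    Illustrative sketch; `MemoryEfficientAttentionFlashAttentionOp` is an operator exposed
+    by recent xformers releases and is only an example choice; leaving `attention_op=None`
+    lets xformers pick automatically:
+
+        from xformers.ops import MemoryEfficientAttentionFlashAttentionOp
+
+        processor = XFormersAttnAddedKVProcessor(attention_op=MemoryEfficientAttentionFlashAttentionOp)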
+ """ + + def __init__(self, attention_op: Optional[Callable] = None): + self.attention_op = attention_op + + def __call__(self, attn: Attention, hidden_states, encoder_hidden_states=None, attention_mask=None): + residual = hidden_states + hidden_states = hidden_states.view(hidden_states.shape[0], hidden_states.shape[1], -1).transpose(1, 2) + batch_size, sequence_length, _ = hidden_states.shape + + attention_mask = attn.prepare_attention_mask(attention_mask, sequence_length, batch_size) + + if encoder_hidden_states is None: + encoder_hidden_states = hidden_states + elif attn.norm_cross: + encoder_hidden_states = attn.norm_encoder_hidden_states(encoder_hidden_states) + + hidden_states = attn.group_norm(hidden_states.transpose(1, 2)).transpose(1, 2) + + query = attn.to_q(hidden_states) + query = attn.head_to_batch_dim(query) + + encoder_hidden_states_key_proj = attn.add_k_proj(encoder_hidden_states) + encoder_hidden_states_value_proj = attn.add_v_proj(encoder_hidden_states) + encoder_hidden_states_key_proj = attn.head_to_batch_dim(encoder_hidden_states_key_proj) + encoder_hidden_states_value_proj = attn.head_to_batch_dim(encoder_hidden_states_value_proj) + + if not attn.only_cross_attention: + key = attn.to_k(hidden_states) + value = attn.to_v(hidden_states) + key = attn.head_to_batch_dim(key) + value = attn.head_to_batch_dim(value) + key = torch.cat([encoder_hidden_states_key_proj, key], dim=1) + value = torch.cat([encoder_hidden_states_value_proj, value], dim=1) + else: + key = encoder_hidden_states_key_proj + value = encoder_hidden_states_value_proj + + hidden_states = xformers.ops.memory_efficient_attention( + query, key, value, attn_bias=attention_mask, op=self.attention_op, scale=attn.scale + ) + hidden_states = hidden_states.to(query.dtype) + hidden_states = attn.batch_to_head_dim(hidden_states) + + # linear proj + hidden_states = attn.to_out[0](hidden_states) + # dropout + hidden_states = attn.to_out[1](hidden_states) + + hidden_states = hidden_states.transpose(-1, -2).reshape(residual.shape) + hidden_states = hidden_states + residual + + return hidden_states + + +class XFormersAttnProcessor: + r""" + Processor for implementing memory efficient attention using xFormers. + + Args: + attention_op (`Callable`, *optional*, defaults to `None`): + The base + [operator](https://facebookresearch.github.io/xformers/components/ops.html#xformers.ops.AttentionOpBase) to + use as the attention operator. It is recommended to set to `None`, and allow xFormers to choose the best + operator. 
+ """ + + def __init__(self, attention_op: Optional[Callable] = None): + self.attention_op = attention_op + + def __call__( + self, + attn: Attention, + hidden_states: torch.FloatTensor, + encoder_hidden_states: Optional[torch.FloatTensor] = None, + attention_mask: Optional[torch.FloatTensor] = None, + temb: Optional[torch.FloatTensor] = None, + ): + residual = hidden_states + + if attn.spatial_norm is not None: + hidden_states = attn.spatial_norm(hidden_states, temb) + + input_ndim = hidden_states.ndim + + if input_ndim == 4: + batch_size, channel, height, width = hidden_states.shape + hidden_states = hidden_states.view(batch_size, channel, height * width).transpose(1, 2) + + batch_size, key_tokens, _ = ( + hidden_states.shape if encoder_hidden_states is None else encoder_hidden_states.shape + ) + + attention_mask = attn.prepare_attention_mask(attention_mask, key_tokens, batch_size) + if attention_mask is not None: + # expand our mask's singleton query_tokens dimension: + # [batch*heads, 1, key_tokens] -> + # [batch*heads, query_tokens, key_tokens] + # so that it can be added as a bias onto the attention scores that xformers computes: + # [batch*heads, query_tokens, key_tokens] + # we do this explicitly because xformers doesn't broadcast the singleton dimension for us. + _, query_tokens, _ = hidden_states.shape + attention_mask = attention_mask.expand(-1, query_tokens, -1) + + if attn.group_norm is not None: + hidden_states = attn.group_norm(hidden_states.transpose(1, 2)).transpose(1, 2) + + query = attn.to_q(hidden_states) + + if encoder_hidden_states is None: + encoder_hidden_states = hidden_states + elif attn.norm_cross: + encoder_hidden_states = attn.norm_encoder_hidden_states(encoder_hidden_states) + + key = attn.to_k(encoder_hidden_states) + value = attn.to_v(encoder_hidden_states) + + query = attn.head_to_batch_dim(query).contiguous() + key = attn.head_to_batch_dim(key).contiguous() + value = attn.head_to_batch_dim(value).contiguous() + + hidden_states = xformers.ops.memory_efficient_attention( + query, key, value, attn_bias=attention_mask, op=self.attention_op, scale=attn.scale + ) + hidden_states = hidden_states.to(query.dtype) + hidden_states = attn.batch_to_head_dim(hidden_states) + + # linear proj + hidden_states = attn.to_out[0](hidden_states) + # dropout + hidden_states = attn.to_out[1](hidden_states) + + if input_ndim == 4: + hidden_states = hidden_states.transpose(-1, -2).reshape(batch_size, channel, height, width) + + if attn.residual_connection: + hidden_states = hidden_states + residual + + hidden_states = hidden_states / attn.rescale_output_factor + + return hidden_states + + +class AttnProcessor2_0: + r""" + Default processor for performing attention-related computations. 
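+
+    Note (Rich-Text fork): as implemented below, this processor mirrors `AttnProcessor`,
+    computing attention explicitly with `get_attention_scores` and `torch.bmm` rather than
+    `F.scaled_dot_product_attention`, so the per-head attention maps can be captured and
+    returned alongside the hidden states.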
+ """ + + # Rich-Text: inject self-attention maps + def __call__( + self, + attn: Attention, + hidden_states, + real_attn_probs=None, + attn_weights=None, + encoder_hidden_states=None, + attention_mask=None, + temb=None, + ): + residual = hidden_states + + if attn.spatial_norm is not None: + hidden_states = attn.spatial_norm(hidden_states, temb) + + input_ndim = hidden_states.ndim + + if input_ndim == 4: + batch_size, channel, height, width = hidden_states.shape + hidden_states = hidden_states.view(batch_size, channel, height * width).transpose(1, 2) + + batch_size, sequence_length, _ = ( + hidden_states.shape if encoder_hidden_states is None else encoder_hidden_states.shape + ) + attention_mask = attn.prepare_attention_mask(attention_mask, sequence_length, batch_size) + + if attn.group_norm is not None: + hidden_states = attn.group_norm(hidden_states.transpose(1, 2)).transpose(1, 2) + + query = attn.to_q(hidden_states) + + if encoder_hidden_states is None: + encoder_hidden_states = hidden_states + elif attn.norm_cross: + encoder_hidden_states = attn.norm_encoder_hidden_states(encoder_hidden_states) + + key = attn.to_k(encoder_hidden_states) + value = attn.to_v(encoder_hidden_states) + + query = attn.head_to_batch_dim(query) + key = attn.head_to_batch_dim(key) + value = attn.head_to_batch_dim(value) + + if real_attn_probs is None: + # Rich-Text: font size + attention_probs = attn.get_attention_scores(query, key, attention_mask, attn_weights=attn_weights) + else: + # Rich-Text: inject self-attention maps + attention_probs = real_attn_probs + hidden_states = torch.bmm(attention_probs, value) + hidden_states = attn.batch_to_head_dim(hidden_states) + + # linear proj + hidden_states = attn.to_out[0](hidden_states) + # dropout + hidden_states = attn.to_out[1](hidden_states) + + if input_ndim == 4: + hidden_states = hidden_states.transpose(-1, -2).reshape(batch_size, channel, height, width) + + if attn.residual_connection: + hidden_states = hidden_states + residual + + hidden_states = hidden_states / attn.rescale_output_factor + + # Rich-Text Modified: return attn probs + # We return the map averaged over heads to save memory footprint + attention_probs_avg = attn.reshape_batch_dim_to_heads_and_average( + attention_probs) + return hidden_states, [attention_probs_avg, attention_probs] + + +class LoRAXFormersAttnProcessor(nn.Module): + r""" + Processor for implementing the LoRA attention mechanism with memory efficient attention using xFormers. + + Args: + hidden_size (`int`, *optional*): + The hidden size of the attention layer. + cross_attention_dim (`int`, *optional*): + The number of channels in the `encoder_hidden_states`. + rank (`int`, defaults to 4): + The dimension of the LoRA update matrices. + attention_op (`Callable`, *optional*, defaults to `None`): + The base + [operator](https://facebookresearch.github.io/xformers/components/ops.html#xformers.ops.AttentionOpBase) to + use as the attention operator. It is recommended to set to `None`, and allow xFormers to choose the best + operator. + network_alpha (`int`, *optional*): + Equivalent to `alpha` but it's usage is specific to Kohya (A1111) style LoRAs. 
+ + """ + + def __init__( + self, hidden_size, cross_attention_dim, rank=4, attention_op: Optional[Callable] = None, network_alpha=None + ): + super().__init__() + + self.hidden_size = hidden_size + self.cross_attention_dim = cross_attention_dim + self.rank = rank + self.attention_op = attention_op + + self.to_q_lora = LoRALinearLayer(hidden_size, hidden_size, rank, network_alpha) + self.to_k_lora = LoRALinearLayer(cross_attention_dim or hidden_size, hidden_size, rank, network_alpha) + self.to_v_lora = LoRALinearLayer(cross_attention_dim or hidden_size, hidden_size, rank, network_alpha) + self.to_out_lora = LoRALinearLayer(hidden_size, hidden_size, rank, network_alpha) + + def __call__( + self, attn: Attention, hidden_states, encoder_hidden_states=None, attention_mask=None, scale=1.0, temb=None + ): + residual = hidden_states + + if attn.spatial_norm is not None: + hidden_states = attn.spatial_norm(hidden_states, temb) + + input_ndim = hidden_states.ndim + + if input_ndim == 4: + batch_size, channel, height, width = hidden_states.shape + hidden_states = hidden_states.view(batch_size, channel, height * width).transpose(1, 2) + + batch_size, sequence_length, _ = ( + hidden_states.shape if encoder_hidden_states is None else encoder_hidden_states.shape + ) + attention_mask = attn.prepare_attention_mask(attention_mask, sequence_length, batch_size) + + if attn.group_norm is not None: + hidden_states = attn.group_norm(hidden_states.transpose(1, 2)).transpose(1, 2) + + query = attn.to_q(hidden_states) + scale * self.to_q_lora(hidden_states) + query = attn.head_to_batch_dim(query).contiguous() + + if encoder_hidden_states is None: + encoder_hidden_states = hidden_states + elif attn.norm_cross: + encoder_hidden_states = attn.norm_encoder_hidden_states(encoder_hidden_states) + + key = attn.to_k(encoder_hidden_states) + scale * self.to_k_lora(encoder_hidden_states) + value = attn.to_v(encoder_hidden_states) + scale * self.to_v_lora(encoder_hidden_states) + + key = attn.head_to_batch_dim(key).contiguous() + value = attn.head_to_batch_dim(value).contiguous() + + hidden_states = xformers.ops.memory_efficient_attention( + query, key, value, attn_bias=attention_mask, op=self.attention_op, scale=attn.scale + ) + hidden_states = attn.batch_to_head_dim(hidden_states) + + # linear proj + hidden_states = attn.to_out[0](hidden_states) + scale * self.to_out_lora(hidden_states) + # dropout + hidden_states = attn.to_out[1](hidden_states) + + if input_ndim == 4: + hidden_states = hidden_states.transpose(-1, -2).reshape(batch_size, channel, height, width) + + if attn.residual_connection: + hidden_states = hidden_states + residual + + hidden_states = hidden_states / attn.rescale_output_factor + + return hidden_states + + +class LoRAAttnProcessor2_0(nn.Module): + r""" + Processor for implementing the LoRA attention mechanism using PyTorch 2.0's memory-efficient scaled dot-product + attention. + + Args: + hidden_size (`int`): + The hidden size of the attention layer. + cross_attention_dim (`int`, *optional*): + The number of channels in the `encoder_hidden_states`. + rank (`int`, defaults to 4): + The dimension of the LoRA update matrices. + network_alpha (`int`, *optional*): + Equivalent to `alpha` but it's usage is specific to Kohya (A1111) style LoRAs. 
+ """ + + def __init__(self, hidden_size, cross_attention_dim=None, rank=4, network_alpha=None): + super().__init__() + if not hasattr(F, "scaled_dot_product_attention"): + raise ImportError("AttnProcessor2_0 requires PyTorch 2.0, to use it, please upgrade PyTorch to 2.0.") + + self.hidden_size = hidden_size + self.cross_attention_dim = cross_attention_dim + self.rank = rank + + self.to_q_lora = LoRALinearLayer(hidden_size, hidden_size, rank, network_alpha) + self.to_k_lora = LoRALinearLayer(cross_attention_dim or hidden_size, hidden_size, rank, network_alpha) + self.to_v_lora = LoRALinearLayer(cross_attention_dim or hidden_size, hidden_size, rank, network_alpha) + self.to_out_lora = LoRALinearLayer(hidden_size, hidden_size, rank, network_alpha) + + def __call__(self, attn: Attention, hidden_states, encoder_hidden_states=None, attention_mask=None, scale=1.0): + residual = hidden_states + + input_ndim = hidden_states.ndim + + if input_ndim == 4: + batch_size, channel, height, width = hidden_states.shape + hidden_states = hidden_states.view(batch_size, channel, height * width).transpose(1, 2) + + batch_size, sequence_length, _ = ( + hidden_states.shape if encoder_hidden_states is None else encoder_hidden_states.shape + ) + inner_dim = hidden_states.shape[-1] + + if attention_mask is not None: + attention_mask = attn.prepare_attention_mask(attention_mask, sequence_length, batch_size) + # scaled_dot_product_attention expects attention_mask shape to be + # (batch, heads, source_length, target_length) + attention_mask = attention_mask.view(batch_size, attn.heads, -1, attention_mask.shape[-1]) + + if attn.group_norm is not None: + hidden_states = attn.group_norm(hidden_states.transpose(1, 2)).transpose(1, 2) + + query = attn.to_q(hidden_states) + scale * self.to_q_lora(hidden_states) + + if encoder_hidden_states is None: + encoder_hidden_states = hidden_states + elif attn.norm_cross: + encoder_hidden_states = attn.norm_encoder_hidden_states(encoder_hidden_states) + + key = attn.to_k(encoder_hidden_states) + scale * self.to_k_lora(encoder_hidden_states) + value = attn.to_v(encoder_hidden_states) + scale * self.to_v_lora(encoder_hidden_states) + + head_dim = inner_dim // attn.heads + query = query.view(batch_size, -1, attn.heads, head_dim).transpose(1, 2) + key = key.view(batch_size, -1, attn.heads, head_dim).transpose(1, 2) + value = value.view(batch_size, -1, attn.heads, head_dim).transpose(1, 2) + + # TODO: add support for attn.scale when we move to Torch 2.1 + hidden_states = F.scaled_dot_product_attention( + query, key, value, attn_mask=attention_mask, dropout_p=0.0, is_causal=False + ) + hidden_states = hidden_states.transpose(1, 2).reshape(batch_size, -1, attn.heads * head_dim) + hidden_states = hidden_states.to(query.dtype) + + # linear proj + hidden_states = attn.to_out[0](hidden_states) + scale * self.to_out_lora(hidden_states) + # dropout + hidden_states = attn.to_out[1](hidden_states) + + if input_ndim == 4: + hidden_states = hidden_states.transpose(-1, -2).reshape(batch_size, channel, height, width) + + if attn.residual_connection: + hidden_states = hidden_states + residual + + hidden_states = hidden_states / attn.rescale_output_factor + + return hidden_states + + +class CustomDiffusionXFormersAttnProcessor(nn.Module): + r""" + Processor for implementing memory efficient attention using xFormers for the Custom Diffusion method. + + Args: + train_kv (`bool`, defaults to `True`): + Whether to newly train the key and value matrices corresponding to the text features. 
+ train_q_out (`bool`, defaults to `True`): + Whether to newly train query matrices corresponding to the latent image features. + hidden_size (`int`, *optional*, defaults to `None`): + The hidden size of the attention layer. + cross_attention_dim (`int`, *optional*, defaults to `None`): + The number of channels in the `encoder_hidden_states`. + out_bias (`bool`, defaults to `True`): + Whether to include the bias parameter in `train_q_out`. + dropout (`float`, *optional*, defaults to 0.0): + The dropout probability to use. + attention_op (`Callable`, *optional*, defaults to `None`): + The base + [operator](https://facebookresearch.github.io/xformers/components/ops.html#xformers.ops.AttentionOpBase) to use + as the attention operator. It is recommended to set to `None`, and allow xFormers to choose the best operator. + """ + + def __init__( + self, + train_kv=True, + train_q_out=False, + hidden_size=None, + cross_attention_dim=None, + out_bias=True, + dropout=0.0, + attention_op: Optional[Callable] = None, + ): + super().__init__() + self.train_kv = train_kv + self.train_q_out = train_q_out + + self.hidden_size = hidden_size + self.cross_attention_dim = cross_attention_dim + self.attention_op = attention_op + + # `_custom_diffusion` id for easy serialization and loading. + if self.train_kv: + self.to_k_custom_diffusion = nn.Linear(cross_attention_dim or hidden_size, hidden_size, bias=False) + self.to_v_custom_diffusion = nn.Linear(cross_attention_dim or hidden_size, hidden_size, bias=False) + if self.train_q_out: + self.to_q_custom_diffusion = nn.Linear(hidden_size, hidden_size, bias=False) + self.to_out_custom_diffusion = nn.ModuleList([]) + self.to_out_custom_diffusion.append(nn.Linear(hidden_size, hidden_size, bias=out_bias)) + self.to_out_custom_diffusion.append(nn.Dropout(dropout)) + + def __call__(self, attn: Attention, hidden_states, encoder_hidden_states=None, attention_mask=None): + batch_size, sequence_length, _ = ( + hidden_states.shape if encoder_hidden_states is None else encoder_hidden_states.shape + ) + + attention_mask = attn.prepare_attention_mask(attention_mask, sequence_length, batch_size) + + if self.train_q_out: + query = self.to_q_custom_diffusion(hidden_states) + else: + query = attn.to_q(hidden_states) + + if encoder_hidden_states is None: + crossattn = False + encoder_hidden_states = hidden_states + else: + crossattn = True + if attn.norm_cross: + encoder_hidden_states = attn.norm_encoder_hidden_states(encoder_hidden_states) + + if self.train_kv: + key = self.to_k_custom_diffusion(encoder_hidden_states) + value = self.to_v_custom_diffusion(encoder_hidden_states) + else: + key = attn.to_k(encoder_hidden_states) + value = attn.to_v(encoder_hidden_states) + + if crossattn: + detach = torch.ones_like(key) + detach[:, :1, :] = detach[:, :1, :] * 0.0 + key = detach * key + (1 - detach) * key.detach() + value = detach * value + (1 - detach) * value.detach() + + query = attn.head_to_batch_dim(query).contiguous() + key = attn.head_to_batch_dim(key).contiguous() + value = attn.head_to_batch_dim(value).contiguous() + + hidden_states = xformers.ops.memory_efficient_attention( + query, key, value, attn_bias=attention_mask, op=self.attention_op, scale=attn.scale + ) + hidden_states = hidden_states.to(query.dtype) + hidden_states = attn.batch_to_head_dim(hidden_states) + + if self.train_q_out: + # linear proj + hidden_states = self.to_out_custom_diffusion[0](hidden_states) + # dropout + hidden_states = self.to_out_custom_diffusion[1](hidden_states) + else: + # linear proj + 
hidden_states = attn.to_out[0](hidden_states) + # dropout + hidden_states = attn.to_out[1](hidden_states) + return hidden_states + + +class SlicedAttnProcessor: + r""" + Processor for implementing sliced attention. + + Args: + slice_size (`int`, *optional*): + The number of steps to compute attention. Uses as many slices as `attention_head_dim // slice_size`, and + `attention_head_dim` must be a multiple of the `slice_size`. + """ + + def __init__(self, slice_size): + self.slice_size = slice_size + + def __call__(self, attn: Attention, hidden_states, encoder_hidden_states=None, attention_mask=None): + residual = hidden_states + + input_ndim = hidden_states.ndim + + if input_ndim == 4: + batch_size, channel, height, width = hidden_states.shape + hidden_states = hidden_states.view(batch_size, channel, height * width).transpose(1, 2) + + batch_size, sequence_length, _ = ( + hidden_states.shape if encoder_hidden_states is None else encoder_hidden_states.shape + ) + attention_mask = attn.prepare_attention_mask(attention_mask, sequence_length, batch_size) + + if attn.group_norm is not None: + hidden_states = attn.group_norm(hidden_states.transpose(1, 2)).transpose(1, 2) + + query = attn.to_q(hidden_states) + dim = query.shape[-1] + query = attn.head_to_batch_dim(query) + + if encoder_hidden_states is None: + encoder_hidden_states = hidden_states + elif attn.norm_cross: + encoder_hidden_states = attn.norm_encoder_hidden_states(encoder_hidden_states) + + key = attn.to_k(encoder_hidden_states) + value = attn.to_v(encoder_hidden_states) + key = attn.head_to_batch_dim(key) + value = attn.head_to_batch_dim(value) + + batch_size_attention, query_tokens, _ = query.shape + hidden_states = torch.zeros( + (batch_size_attention, query_tokens, dim // attn.heads), device=query.device, dtype=query.dtype + ) + + for i in range(batch_size_attention // self.slice_size): + start_idx = i * self.slice_size + end_idx = (i + 1) * self.slice_size + + query_slice = query[start_idx:end_idx] + key_slice = key[start_idx:end_idx] + attn_mask_slice = attention_mask[start_idx:end_idx] if attention_mask is not None else None + + attn_slice = attn.get_attention_scores(query_slice, key_slice, attn_mask_slice) + + attn_slice = torch.bmm(attn_slice, value[start_idx:end_idx]) + + hidden_states[start_idx:end_idx] = attn_slice + + hidden_states = attn.batch_to_head_dim(hidden_states) + + # linear proj + hidden_states = attn.to_out[0](hidden_states) + # dropout + hidden_states = attn.to_out[1](hidden_states) + + if input_ndim == 4: + hidden_states = hidden_states.transpose(-1, -2).reshape(batch_size, channel, height, width) + + if attn.residual_connection: + hidden_states = hidden_states + residual + + hidden_states = hidden_states / attn.rescale_output_factor + + return hidden_states + + +class SlicedAttnAddedKVProcessor: + r""" + Processor for implementing sliced attention with extra learnable key and value matrices for the text encoder. + + Args: + slice_size (`int`, *optional*): + The number of steps to compute attention. Uses as many slices as `attention_head_dim // slice_size`, and + `attention_head_dim` must be a multiple of the `slice_size`. 
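+
+    As implemented in `__call__` below, slicing trades speed for peak memory: the
+    `batch_size * heads` attention matrices are computed in chunks of `slice_size` rather
+    than all at once. With assumed values `batch_size=2` and `attn.heads=8`, `query.shape[0]`
+    is 16, so `slice_size=4` runs the loop 16 // 4 = 4 times, each iteration materialising
+    only a `(4, query_tokens, key_tokens)` score tensor.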
+ """ + + def __init__(self, slice_size): + self.slice_size = slice_size + + def __call__(self, attn: "Attention", hidden_states, encoder_hidden_states=None, attention_mask=None, temb=None): + residual = hidden_states + + if attn.spatial_norm is not None: + hidden_states = attn.spatial_norm(hidden_states, temb) + + hidden_states = hidden_states.view(hidden_states.shape[0], hidden_states.shape[1], -1).transpose(1, 2) + + batch_size, sequence_length, _ = hidden_states.shape + + attention_mask = attn.prepare_attention_mask(attention_mask, sequence_length, batch_size) + + if encoder_hidden_states is None: + encoder_hidden_states = hidden_states + elif attn.norm_cross: + encoder_hidden_states = attn.norm_encoder_hidden_states(encoder_hidden_states) + + hidden_states = attn.group_norm(hidden_states.transpose(1, 2)).transpose(1, 2) + + query = attn.to_q(hidden_states) + dim = query.shape[-1] + query = attn.head_to_batch_dim(query) + + encoder_hidden_states_key_proj = attn.add_k_proj(encoder_hidden_states) + encoder_hidden_states_value_proj = attn.add_v_proj(encoder_hidden_states) + + encoder_hidden_states_key_proj = attn.head_to_batch_dim(encoder_hidden_states_key_proj) + encoder_hidden_states_value_proj = attn.head_to_batch_dim(encoder_hidden_states_value_proj) + + if not attn.only_cross_attention: + key = attn.to_k(hidden_states) + value = attn.to_v(hidden_states) + key = attn.head_to_batch_dim(key) + value = attn.head_to_batch_dim(value) + key = torch.cat([encoder_hidden_states_key_proj, key], dim=1) + value = torch.cat([encoder_hidden_states_value_proj, value], dim=1) + else: + key = encoder_hidden_states_key_proj + value = encoder_hidden_states_value_proj + + batch_size_attention, query_tokens, _ = query.shape + hidden_states = torch.zeros( + (batch_size_attention, query_tokens, dim // attn.heads), device=query.device, dtype=query.dtype + ) + + for i in range(batch_size_attention // self.slice_size): + start_idx = i * self.slice_size + end_idx = (i + 1) * self.slice_size + + query_slice = query[start_idx:end_idx] + key_slice = key[start_idx:end_idx] + attn_mask_slice = attention_mask[start_idx:end_idx] if attention_mask is not None else None + + attn_slice = attn.get_attention_scores(query_slice, key_slice, attn_mask_slice) + + attn_slice = torch.bmm(attn_slice, value[start_idx:end_idx]) + + hidden_states[start_idx:end_idx] = attn_slice + + hidden_states = attn.batch_to_head_dim(hidden_states) + + # linear proj + hidden_states = attn.to_out[0](hidden_states) + # dropout + hidden_states = attn.to_out[1](hidden_states) + + hidden_states = hidden_states.transpose(-1, -2).reshape(residual.shape) + hidden_states = hidden_states + residual + + return hidden_states + + +AttentionProcessor = Union[ + AttnProcessor, + AttnProcessor2_0, + XFormersAttnProcessor, + SlicedAttnProcessor, + AttnAddedKVProcessor, + SlicedAttnAddedKVProcessor, + AttnAddedKVProcessor2_0, + XFormersAttnAddedKVProcessor, + LoRAAttnProcessor, + LoRAXFormersAttnProcessor, + LoRAAttnProcessor2_0, + LoRAAttnAddedKVProcessor, + CustomDiffusionAttnProcessor, + CustomDiffusionXFormersAttnProcessor, +] + + +class SpatialNorm(nn.Module): + """ + Spatially conditioned normalization as defined in https://arxiv.org/abs/2209.09002 + """ + + def __init__( + self, + f_channels, + zq_channels, + ): + super().__init__() + self.norm_layer = nn.GroupNorm(num_channels=f_channels, num_groups=32, eps=1e-6, affine=True) + self.conv_y = nn.Conv2d(zq_channels, f_channels, kernel_size=1, stride=1, padding=0) + self.conv_b = nn.Conv2d(zq_channels, 
f_channels, kernel_size=1, stride=1, padding=0) + + def forward(self, f, zq): + f_size = f.shape[-2:] + zq = F.interpolate(zq, size=f_size, mode="nearest") + norm_f = self.norm_layer(f) + new_f = norm_f * self.conv_y(zq) + self.conv_b(zq) + return new_f diff --git a/gf/sd-webui-rich-text/scripts/models/dual_transformer_2d.py b/gf/sd-webui-rich-text/scripts/models/dual_transformer_2d.py new file mode 100644 index 0000000000000000000000000000000000000000..5594aea3ec39ea105ddbfdcb2f445b56ddff6548 --- /dev/null +++ b/gf/sd-webui-rich-text/scripts/models/dual_transformer_2d.py @@ -0,0 +1,151 @@ +# Copyright 2023 The HuggingFace Team. All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +from typing import Optional + +from torch import nn + +from scripts.models.transformer_2d import Transformer2DModel, Transformer2DModelOutput + + +class DualTransformer2DModel(nn.Module): + """ + Dual transformer wrapper that combines two `Transformer2DModel`s for mixed inference. + + Parameters: + num_attention_heads (`int`, *optional*, defaults to 16): The number of heads to use for multi-head attention. + attention_head_dim (`int`, *optional*, defaults to 88): The number of channels in each head. + in_channels (`int`, *optional*): + Pass if the input is continuous. The number of channels in the input and output. + num_layers (`int`, *optional*, defaults to 1): The number of layers of Transformer blocks to use. + dropout (`float`, *optional*, defaults to 0.1): The dropout probability to use. + cross_attention_dim (`int`, *optional*): The number of encoder_hidden_states dimensions to use. + sample_size (`int`, *optional*): Pass if the input is discrete. The width of the latent images. + Note that this is fixed at training time as it is used for learning a number of position embeddings. See + `ImagePositionalEmbeddings`. + num_vector_embeds (`int`, *optional*): + Pass if the input is discrete. The number of classes of the vector embeddings of the latent pixels. + Includes the class for the masked latent pixel. + activation_fn (`str`, *optional*, defaults to `"geglu"`): Activation function to be used in feed-forward. + num_embeds_ada_norm ( `int`, *optional*): Pass if at least one of the norm_layers is `AdaLayerNorm`. + The number of diffusion steps used during training. Note that this is fixed at training time as it is used + to learn a number of embeddings that are added to the hidden states. During inference, you can denoise for + up to but not more than steps than `num_embeds_ada_norm`. + attention_bias (`bool`, *optional*): + Configure if the TransformerBlocks' attention should contain a bias parameter. 
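+
+    A pipeline typically configures the mixing behaviour on an existing instance; the
+    attribute names below are the ones initialised in `__init__`, and the values shown
+    are simply the defaults:
+
+        model.mix_ratio = 0.5                        # blend weight between the two branches
+        model.condition_lengths = [77, 257]          # tokens consumed per condition
+        model.transformer_index_for_condition = [1, 0]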
+ """ + + def __init__( + self, + num_attention_heads: int = 16, + attention_head_dim: int = 88, + in_channels: Optional[int] = None, + num_layers: int = 1, + dropout: float = 0.0, + norm_num_groups: int = 32, + cross_attention_dim: Optional[int] = None, + attention_bias: bool = False, + sample_size: Optional[int] = None, + num_vector_embeds: Optional[int] = None, + activation_fn: str = "geglu", + num_embeds_ada_norm: Optional[int] = None, + ): + super().__init__() + self.transformers = nn.ModuleList( + [ + Transformer2DModel( + num_attention_heads=num_attention_heads, + attention_head_dim=attention_head_dim, + in_channels=in_channels, + num_layers=num_layers, + dropout=dropout, + norm_num_groups=norm_num_groups, + cross_attention_dim=cross_attention_dim, + attention_bias=attention_bias, + sample_size=sample_size, + num_vector_embeds=num_vector_embeds, + activation_fn=activation_fn, + num_embeds_ada_norm=num_embeds_ada_norm, + ) + for _ in range(2) + ] + ) + + # Variables that can be set by a pipeline: + + # The ratio of transformer1 to transformer2's output states to be combined during inference + self.mix_ratio = 0.5 + + # The shape of `encoder_hidden_states` is expected to be + # `(batch_size, condition_lengths[0]+condition_lengths[1], num_features)` + self.condition_lengths = [77, 257] + + # Which transformer to use to encode which condition. + # E.g. `(1, 0)` means that we'll use `transformers[1](conditions[0])` and `transformers[0](conditions[1])` + self.transformer_index_for_condition = [1, 0] + + def forward( + self, + hidden_states, + encoder_hidden_states, + timestep=None, + attention_mask=None, + cross_attention_kwargs=None, + return_dict: bool = True, + ): + """ + Args: + hidden_states ( When discrete, `torch.LongTensor` of shape `(batch size, num latent pixels)`. + When continuous, `torch.FloatTensor` of shape `(batch size, channel, height, width)`): Input + hidden_states + encoder_hidden_states ( `torch.LongTensor` of shape `(batch size, encoder_hidden_states dim)`, *optional*): + Conditional embeddings for cross attention layer. If not given, cross-attention defaults to + self-attention. + timestep ( `torch.long`, *optional*): + Optional timestep to be applied as an embedding in AdaLayerNorm's. Used to indicate denoising step. + attention_mask (`torch.FloatTensor`, *optional*): + Optional attention mask to be applied in Attention + return_dict (`bool`, *optional*, defaults to `True`): + Whether or not to return a [`models.unet_2d_condition.UNet2DConditionOutput`] instead of a plain tuple. + + Returns: + [`~models.transformer_2d.Transformer2DModelOutput`] or `tuple`: + [`~models.transformer_2d.Transformer2DModelOutput`] if `return_dict` is True, otherwise a `tuple`. When + returning a tuple, the first element is the sample tensor. 
+ """ + input_states = hidden_states + + encoded_states = [] + tokens_start = 0 + # attention_mask is not used yet + for i in range(2): + # for each of the two transformers, pass the corresponding condition tokens + condition_state = encoder_hidden_states[:, tokens_start : tokens_start + self.condition_lengths[i]] + transformer_index = self.transformer_index_for_condition[i] + encoded_state = self.transformers[transformer_index]( + input_states, + encoder_hidden_states=condition_state, + timestep=timestep, + cross_attention_kwargs=cross_attention_kwargs, + return_dict=False, + )[0] + encoded_states.append(encoded_state - input_states) + tokens_start += self.condition_lengths[i] + + output_states = encoded_states[0] * self.mix_ratio + encoded_states[1] * (1 - self.mix_ratio) + output_states = output_states + input_states + + if not return_dict: + return (output_states,) + + return Transformer2DModelOutput(sample=output_states) diff --git a/gf/sd-webui-rich-text/scripts/models/region_diffusion.py b/gf/sd-webui-rich-text/scripts/models/region_diffusion.py new file mode 100644 index 0000000000000000000000000000000000000000..74342153525436772ece0bd8e118f0aebed33c80 --- /dev/null +++ b/gf/sd-webui-rich-text/scripts/models/region_diffusion.py @@ -0,0 +1,503 @@ +import os +import torch +import collections +import torch.nn as nn +from functools import partial +from transformers import CLIPTextModel, CLIPTokenizer, logging +from diffusers_official import AutoencoderKL, PNDMScheduler, EulerDiscreteScheduler, DPMSolverMultistepScheduler +from scripts.models.unet_2d_condition import UNet2DConditionModel +from scripts.models.utils.attention_utils import CrossAttentionLayers, SelfAttentionLayers + +# suppress partial model loading warning +logging.set_verbosity_error() + + +class RegionDiffusion(nn.Module): + def __init__(self, device, model_id='runwayml/stable-diffusion-v1-5'): + super().__init__() + + self.model_id = model_id + self.device = device + self.num_train_timesteps = 1000 + self.clip_gradient = False + + print(f'[INFO] loading stable diffusion...') + self.model_id = model_id + + for load_attemp in range(5): + print(f'[INFO] loading stable diffusion {model_id} ... attempt {load_attemp}') + try: + self.vae = AutoencoderKL.from_pretrained( + model_id, subfolder="vae").to(self.device) + self.tokenizer = CLIPTokenizer.from_pretrained( + model_id, subfolder='tokenizer') + self.text_encoder = CLIPTextModel.from_pretrained( + model_id, subfolder='text_encoder').to(self.device) + self.unet = UNet2DConditionModel.from_pretrained( + model_id, subfolder="unet").to(self.device) + except Exception as e: + print(f'[ERROR] loading stable diffusion {model_id} ... 
attempt {load_attemp} failed with error {e}') + continue + if self.unet is not None and self.vae is not None and self.text_encoder is not None: + break + + self.scheduler = PNDMScheduler(beta_start=0.00085, beta_end=0.012, beta_schedule="scaled_linear", + num_train_timesteps=self.num_train_timesteps, skip_prk_steps=True, steps_offset=1) + self.alphas_cumprod = self.scheduler.alphas_cumprod.to(self.device) + + self.masks = [] + self.attention_maps = None + self.selfattn_maps = None + self.crossattn_maps = None + self.color_loss = torch.nn.functional.mse_loss + self.forward_hooks = [] + self.forward_replacement_hooks = [] + + print(f'[INFO] loaded stable diffusion!') + + def get_text_embeds(self, prompt, negative_prompt): + # prompt, negative_prompt: [str] + + # Tokenize text and get embeddings + text_input = self.tokenizer( + prompt, padding='max_length', max_length=self.tokenizer.model_max_length, truncation=True, return_tensors='pt') + + with torch.no_grad(): + text_embeddings = self.text_encoder( + text_input.input_ids.to(self.device))[0] + + # Do the same for unconditional embeddings + uncond_input = self.tokenizer(negative_prompt, padding='max_length', + max_length=self.tokenizer.model_max_length, return_tensors='pt') + + with torch.no_grad(): + uncond_embeddings = self.text_encoder( + uncond_input.input_ids.to(self.device))[0] + + # Cat for final embeddings + text_embeddings = torch.cat([uncond_embeddings, text_embeddings]) + return text_embeddings + + def get_text_embeds_list(self, prompts): + # prompts: [list] + text_embeddings = [] + for prompt in prompts: + # Tokenize text and get embeddings + text_input = self.tokenizer( + [prompt], padding='max_length', max_length=self.tokenizer.model_max_length, truncation=True, return_tensors='pt') + + with torch.no_grad(): + text_embeddings.append(self.text_encoder( + text_input.input_ids.to(self.device))[0]) + + return text_embeddings + + def produce_latents(self, text_embeddings, height=512, width=512, num_inference_steps=50, guidance_scale=7.5, + latents=None, use_guidance=False, text_format_dict={}, inject_selfattn=0, inject_background=0): + + if latents is None: + latents = torch.randn( + (1, self.unet.in_channels, height // 8, width // 8), device=self.device) + + if inject_selfattn > 0 or inject_background > 0: + latents_reference = latents.clone().detach() + self.scheduler.set_timesteps(num_inference_steps) + n_styles = text_embeddings.shape[0]-1 + assert n_styles == len(self.masks) + with torch.autocast('cuda'): + for i, t in enumerate(self.scheduler.timesteps): + + # predict the noise residual + with torch.no_grad(): + # tokens without any attributes + feat_inject_step = t > (1-inject_selfattn) * 1000 + background_inject_step = i == int(inject_background * len(self.scheduler.timesteps)) and inject_background > 0 + noise_pred_uncond_cur = self.unet(latents, t, encoder_hidden_states=text_embeddings[:1], + )['sample'] + self.register_fontsize_hooks(text_format_dict) + noise_pred_text_cur = self.unet(latents, t, encoder_hidden_states=text_embeddings[-1:], + )['sample'] + self.remove_fontsize_hooks() + if inject_selfattn > 0 or inject_background > 0: + noise_pred_uncond_refer = self.unet(latents_reference, t, encoder_hidden_states=text_embeddings[:1], + )['sample'] + self.register_selfattn_hooks(feat_inject_step) + noise_pred_text_refer = self.unet(latents_reference, t, encoder_hidden_states=text_embeddings[-1:], + )['sample'] + self.remove_selfattn_hooks() + noise_pred_uncond = noise_pred_uncond_cur * self.masks[-1] + noise_pred_text = 
noise_pred_text_cur * self.masks[-1] + # tokens with attributes + for style_i, mask in enumerate(self.masks[:-1]): + self.register_replacement_hooks(feat_inject_step) + noise_pred_text_cur = self.unet(latents, t, encoder_hidden_states=text_embeddings[style_i+1:style_i+2], + )['sample'] + self.remove_replacement_hooks() + noise_pred_uncond = noise_pred_uncond + noise_pred_uncond_cur*mask + noise_pred_text = noise_pred_text + noise_pred_text_cur*mask + + # perform classifier-free guidance + noise_pred = noise_pred_uncond + guidance_scale * \ + (noise_pred_text - noise_pred_uncond) + + if inject_selfattn > 0 or inject_background > 0: + noise_pred_refer = noise_pred_uncond_refer + guidance_scale * \ + (noise_pred_text_refer - noise_pred_uncond_refer) + + # compute the previous noisy sample x_t -> x_t-1 + latents_reference = self.scheduler.step(torch.cat([noise_pred, noise_pred_refer]), t, + torch.cat([latents, latents_reference]))[ + 'prev_sample'] + latents, latents_reference = torch.chunk( + latents_reference, 2, dim=0) + + else: + # compute the previous noisy sample x_t -> x_t-1 + latents = self.scheduler.step(noise_pred, t, latents)[ + 'prev_sample'] + + # apply guidance + if use_guidance and t < text_format_dict['guidance_start_step']: + with torch.enable_grad(): + if not latents.requires_grad: + latents.requires_grad = True + latents_0 = self.predict_x0(latents, noise_pred, t) + latents_inp = 1 / 0.18215 * latents_0 + imgs = self.vae.decode(latents_inp).sample + imgs = (imgs / 2 + 0.5).clamp(0, 1) + loss_total = 0. + for attn_map, rgb_val in zip(text_format_dict['color_obj_atten'], text_format_dict['target_RGB']): + avg_rgb = ( + imgs*attn_map[:, 0]).sum(2).sum(2)/attn_map[:, 0].sum() + loss = self.color_loss( + avg_rgb, rgb_val[:, :, 0, 0])*100 + loss_total += loss + loss_total.backward() + latents = ( + latents - latents.grad * text_format_dict['color_guidance_weight'] * text_format_dict['color_obj_atten_all']).detach().clone() + + # apply background injection + if background_inject_step: + latents = latents_reference * self.masks[-1] + latents * \ + (1-self.masks[-1]) + return latents + + def predict_x0(self, x_t, eps_t, t): + alpha_t = self.scheduler.alphas_cumprod[t] + return (x_t - eps_t * torch.sqrt(1-alpha_t)) / torch.sqrt(alpha_t) + + def produce_attn_maps(self, prompts, negative_prompts='', height=512, width=512, num_inference_steps=50, + guidance_scale=7.5, latents=None): + + if isinstance(prompts, str): + prompts = [prompts] + + if isinstance(negative_prompts, str): + negative_prompts = [negative_prompts] + + # Prompts -> text embeds + text_embeddings = self.get_text_embeds( + prompts, negative_prompts) # [2, 77, 768] + if latents is None: + latents = torch.randn( + (text_embeddings.shape[0] // 2, self.unet.in_channels, height // 8, width // 8), device=self.device) + + self.scheduler.set_timesteps(num_inference_steps) + self.remove_replacement_hooks() + + with torch.autocast('cuda'): + for i, t in enumerate(self.scheduler.timesteps): + # expand the latents if we are doing classifier-free guidance to avoid doing two forward passes. 
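+                # (stacking two copies of the latents lets a single UNet call produce both the
+                #  unconditional and the text-conditioned noise predictions used for guidance)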
+ latent_model_input = torch.cat([latents] * 2) + + # predict the noise residual + with torch.no_grad(): + noise_pred = self.unet( + latent_model_input, t, encoder_hidden_states=text_embeddings)['sample'] + + # perform guidance + noise_pred_uncond, noise_pred_text = noise_pred.chunk(2) + noise_pred = noise_pred_uncond + guidance_scale * \ + (noise_pred_text - noise_pred_uncond) + + # compute the previous noisy sample x_t -> x_t-1 + latents = self.scheduler.step(noise_pred, t, latents)[ + 'prev_sample'] + + # Img latents -> imgs + imgs = self.decode_latents(latents) # [1, 3, 512, 512] + + # Img to Numpy + imgs = imgs.detach().cpu().permute(0, 2, 3, 1).numpy() + imgs = (imgs * 255).round().astype('uint8') + + return imgs + + def decode_latents(self, latents): + + latents = 1 / 0.18215 * latents + + with torch.no_grad(): + imgs = self.vae.decode(latents).sample + + imgs = (imgs / 2 + 0.5).clamp(0, 1) + + return imgs + + def encode_imgs(self, imgs): + # imgs: [B, 3, H, W] + + imgs = 2 * imgs - 1 + + posterior = self.vae.encode(imgs).latent_dist + latents = posterior.sample() * 0.18215 + + return latents + + def prompt_to_img(self, prompts, negative_prompts='', height=512, width=512, num_inference_steps=50, + guidance_scale=7.5, latents=None, text_format_dict={}, use_guidance=False, inject_selfattn=0, inject_background=0): + + if isinstance(prompts, str): + prompts = [prompts] + + if isinstance(negative_prompts, str): + negative_prompts = [negative_prompts] + + # Prompts -> text embeds + text_embeds = self.get_text_embeds( + prompts, negative_prompts) # [2, 77, 768] + + # else: + latents = self.produce_latents(text_embeds, height=height, width=width, latents=latents, + num_inference_steps=num_inference_steps, guidance_scale=guidance_scale, + use_guidance=use_guidance, text_format_dict=text_format_dict, + inject_selfattn=inject_selfattn, inject_background=inject_background) # [1, 4, 64, 64] + # Img latents -> imgs + imgs = self.decode_latents(latents) # [1, 3, 512, 512] + + # Img to Numpy + imgs = imgs.detach().cpu().permute(0, 2, 3, 1).numpy() + imgs = (imgs * 255).round().astype('uint8') + + return imgs + + def reset_attention_maps(self): + r"""Function to reset attention maps. + We reset attention maps because we append them while getting hooks + to visualize attention maps for every step. + """ + for key in self.selfattn_maps: + self.selfattn_maps[key] = [] + for key in self.crossattn_maps: + self.crossattn_maps[key] = [] + + def register_evaluation_hooks(self): + r"""Function for registering hooks during evaluation. + We mainly store activation maps averaged over queries. + """ + self.forward_hooks = [] + + def save_activations(activations, name, module, inp, out): + r""" + PyTorch Forward hook to save outputs at each forward pass. + """ + # out[0] - final output of attention layer + # out[1] - attention probability matrix + if 'attn2' in name: + assert out[1].shape[-1] == 77 + activations[name].append(out[1].detach().cpu()) + else: + assert out[1].shape[-1] != 77 + attention_dict = collections.defaultdict(list) + for name, module in self.unet.named_modules(): + leaf_name = name.split('.')[-1] + if 'attn' in leaf_name: + # Register hook to obtain outputs at every attention layer. 
+ self.forward_hooks.append(module.register_forward_hook( + partial(save_activations, attention_dict, name) + )) + # attention_dict is a dictionary containing attention maps for every attention layer + self.attention_maps = attention_dict + + def register_selfattn_hooks(self, feat_inject_step=False): + r"""Function for registering hooks during evaluation. + We mainly store activation maps averaged over queries. + """ + self.selfattn_forward_hooks = [] + + def save_activations(activations, name, module, inp, out): + r""" + PyTorch Forward hook to save outputs at each forward pass. + """ + # out[0] - final output of attention layer + # out[1] - attention probability matrix + if 'attn2' in name: + assert out[1][1].shape[-1] == 77 + # cross attention injection + # activations[name] = out[1][1].detach() + else: + assert out[1][1].shape[-1] != 77 + activations[name] = out[1][1].detach() + + def save_resnet_activations(activations, name, module, inp, out): + r""" + PyTorch Forward hook to save outputs at each forward pass. + """ + # out[0] - final output of residual layer + # out[1] - residual hidden feature + assert out[1].shape[-1] == 16 + activations[name] = out[1].detach() + attention_dict = collections.defaultdict(list) + for name, module in self.unet.named_modules(): + leaf_name = name.split('.')[-1] + if 'attn' in leaf_name and feat_inject_step: + # Register hook to obtain outputs at every attention layer. + self.selfattn_forward_hooks.append(module.register_forward_hook( + partial(save_activations, attention_dict, name) + )) + if name == 'up_blocks.1.resnets.1' and feat_inject_step: + self.selfattn_forward_hooks.append(module.register_forward_hook( + partial(save_resnet_activations, attention_dict, name) + )) + # attention_dict is a dictionary containing attention maps for every attention layer + self.self_attention_maps_cur = attention_dict + + def register_replacement_hooks(self, feat_inject_step=False): + r"""Function for registering hooks to replace self attention. + """ + self.forward_replacement_hooks = [] + + def replace_activations(name, module, args): + r""" + PyTorch Forward hook to save outputs at each forward pass. + """ + if 'attn1' in name: + modified_args = (args[0], self.self_attention_maps_cur[name]) + return modified_args + # cross attention injection + # elif 'attn2' in name: + # modified_map = { + # 'reference': self.self_attention_maps_cur[name], + # 'inject_pos': self.inject_pos, + # } + # modified_args = (args[0], modified_map) + # return modified_args + + def replace_resnet_activations(name, module, args): + r""" + PyTorch Forward hook to save outputs at each forward pass. + """ + modified_args = (args[0], args[1], + self.self_attention_maps_cur[name]) + return modified_args + for name, module in self.unet.named_modules(): + leaf_name = name.split('.')[-1] + if 'attn' in leaf_name and feat_inject_step: + # Register hook to obtain outputs at every attention layer. + self.forward_replacement_hooks.append(module.register_forward_pre_hook( + partial(replace_activations, name) + )) + if name == 'up_blocks.1.resnets.1' and feat_inject_step: + # Register hook to obtain outputs at every attention layer. + self.forward_replacement_hooks.append(module.register_forward_pre_hook( + partial(replace_resnet_activations, name) + )) + + def register_tokenmap_hooks(self): + r"""Function for registering hooks during evaluation. + We mainly store activation maps averaged over queries. 
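+        Cross- and self-attention probabilities are accumulated per layer into
+        self.crossattn_maps / self.selfattn_maps, keyed by module name, while
+        self.n_maps counts how many forward calls each layer has seen.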
+ """ + self.forward_hooks = [] + + def save_activations(selfattn_maps, crossattn_maps, n_maps, name, module, inp, out): + r""" + PyTorch Forward hook to save outputs at each forward pass. + """ + # out[0] - final output of attention layer + # out[1] - attention probability matrices + if name in n_maps: + n_maps[name] += 1 + else: + n_maps[name] = 1 + if 'attn2' in name: + assert out[1][0].shape[-1] == 77 + if name in CrossAttentionLayers and n_maps[name] > 10: + if name in crossattn_maps: + crossattn_maps[name] += out[1][0].detach().cpu()[1:2] + else: + crossattn_maps[name] = out[1][0].detach().cpu()[1:2] + else: + assert out[1][0].shape[-1] != 77 + if name in SelfAttentionLayers and n_maps[name] > 10: + if name in crossattn_maps: + selfattn_maps[name] += out[1][0].detach().cpu()[1:2] + else: + selfattn_maps[name] = out[1][0].detach().cpu()[1:2] + + selfattn_maps = collections.defaultdict(list) + crossattn_maps = collections.defaultdict(list) + n_maps = collections.defaultdict(list) + + for name, module in self.unet.named_modules(): + leaf_name = name.split('.')[-1] + if 'attn' in leaf_name: + # Register hook to obtain outputs at every attention layer. + self.forward_hooks.append(module.register_forward_hook( + partial(save_activations, selfattn_maps, + crossattn_maps, n_maps, name) + )) + # attention_dict is a dictionary containing attention maps for every attention layer + self.selfattn_maps = selfattn_maps + self.crossattn_maps = crossattn_maps + self.n_maps = n_maps + + def remove_tokenmap_hooks(self): + for hook in self.forward_hooks: + hook.remove() + self.selfattn_maps = None + self.crossattn_maps = None + self.n_maps = None + + def remove_evaluation_hooks(self): + for hook in self.forward_hooks: + hook.remove() + self.attention_maps = None + + def remove_replacement_hooks(self): + for hook in self.forward_replacement_hooks: + hook.remove() + + def remove_selfattn_hooks(self): + for hook in self.selfattn_forward_hooks: + hook.remove() + + def register_fontsize_hooks(self, text_format_dict={}): + r"""Function for registering hooks to replace self attention. + """ + self.forward_fontsize_hooks = [] + + def adjust_attn_weights(name, module, args): + r""" + PyTorch Forward hook to save outputs at each forward pass. + """ + if 'attn2' in name: + modified_args = (args[0], None, attn_weights) + return modified_args + + if 'word_pos' in text_format_dict and text_format_dict['word_pos'] is not None \ + and 'font_size' in text_format_dict and text_format_dict['font_size'] is not None: + attn_weights = {'word_pos': text_format_dict['word_pos'], 'font_size': text_format_dict['font_size']} + else: + attn_weights = None + + for name, module in self.unet.named_modules(): + leaf_name = name.split('.')[-1] + if 'attn' in leaf_name and attn_weights is not None: + # Register hook to obtain outputs at every attention layer. 
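+                # register_forward_pre_hook callbacks receive (module, args) and may
+                # return a replacement args tuple; adjust_attn_weights uses this to pass
+                # the font-size attention weights as an extra positional argument to
+                # cross-attention ('attn2') layers and leaves other layers untouched.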
+ self.forward_fontsize_hooks.append(module.register_forward_pre_hook( + partial(adjust_attn_weights, name) + )) + + def remove_fontsize_hooks(self): + for hook in self.forward_fontsize_hooks: + hook.remove() \ No newline at end of file diff --git a/gf/sd-webui-rich-text/scripts/models/region_diffusion_xl.py b/gf/sd-webui-rich-text/scripts/models/region_diffusion_xl.py new file mode 100644 index 0000000000000000000000000000000000000000..021d16125ef0c652b6963d76fe8a7ad4fb602d5c --- /dev/null +++ b/gf/sd-webui-rich-text/scripts/models/region_diffusion_xl.py @@ -0,0 +1,1150 @@ +# Adapted from diffusers_official.pipelines.stable_diffusion.pipelines.stable_diffusion_xl.pipeline_stable_diffusion_xl.py + +import inspect +from typing import Any, Callable, Dict, List, Optional, Tuple, Union + +import torch +from transformers import CLIPTextModel, CLIPTextModelWithProjection, CLIPTokenizer + +from diffusers_official.image_processor import VaeImageProcessor +from diffusers_official.loaders import FromSingleFileMixin, LoraLoaderMixin, TextualInversionLoaderMixin +# from diffusers_official.models import AutoencoderKL, UNet2DConditionModel +from diffusers_official.models import AutoencoderKL + +from diffusers_official.models.attention_processor import ( + AttnProcessor2_0, + LoRAAttnProcessor2_0, + LoRAXFormersAttnProcessor, + XFormersAttnProcessor, +) +from diffusers_official.schedulers import EulerDiscreteScheduler +from diffusers_official.utils import ( + is_accelerate_available, + is_accelerate_version, + logging, + randn_tensor, + replace_example_docstring, +) +from diffusers_official.pipelines.pipeline_utils import DiffusionPipeline +from diffusers_official.pipelines.stable_diffusion_xl.watermark import StableDiffusionXLWatermarker + +### cutomized modules +import collections +from functools import partial +from diffusers_official.pipelines.stable_diffusion_xl import StableDiffusionXLPipelineOutput + +from scripts.models.unet_2d_condition import UNet2DConditionModel +from scripts.models.utils.attention_utils import CrossAttentionLayers_XL + +logger = logging.get_logger(__name__) # pylint: disable=invalid-name + + +def rescale_noise_cfg(noise_cfg, noise_pred_text, guidance_rescale=0.0): + """ + Rescale `noise_cfg` according to `guidance_rescale`. Based on findings of [Common Diffusion Noise Schedules and + Sample Steps are Flawed](https://arxiv.org/pdf/2305.08891.pdf). See Section 3.4 + """ + std_text = noise_pred_text.std(dim=list(range(1, noise_pred_text.ndim)), keepdim=True) + std_cfg = noise_cfg.std(dim=list(range(1, noise_cfg.ndim)), keepdim=True) + # rescale the results from guidance (fixes overexposure) + noise_pred_rescaled = noise_cfg * (std_text / std_cfg) + # mix with the original results from guidance by factor guidance_rescale to avoid "plain looking" images + noise_cfg = guidance_rescale * noise_pred_rescaled + (1 - guidance_rescale) * noise_cfg + return noise_cfg + + +class RegionDiffusionXL(DiffusionPipeline, FromSingleFileMixin): + r""" + Pipeline for text-to-image generation using Stable Diffusion. + + This model inherits from [`DiffusionPipeline`]. Check the superclass documentation for the generic methods the + library implements for all the pipelines (such as downloading or saving, running on a particular device, etc.) 
+ + In addition the pipeline inherits the following loading methods: + - *Textual-Inversion*: [`loaders.TextualInversionLoaderMixin.load_textual_inversion`] + - *LoRA*: [`loaders.LoraLoaderMixin.load_lora_weights`] + - *Ckpt*: [`loaders.FromSingleFileMixin.from_single_file`] + + as well as the following saving methods: + - *LoRA*: [`loaders.LoraLoaderMixin.save_lora_weights`] + + Args: + vae ([`AutoencoderKL`]): + Variational Auto-Encoder (VAE) Model to encode and decode images to and from latent representations. + text_encoder ([`CLIPTextModel`]): + Frozen text-encoder. Stable Diffusion uses the text portion of + [CLIP](https://huggingface.co/docs/transformers/model_doc/clip#transformers.CLIPTextModel), specifically + the [clip-vit-large-patch14](https://huggingface.co/openai/clip-vit-large-patch14) variant. + tokenizer (`CLIPTokenizer`): + Tokenizer of class + [CLIPTokenizer](https://huggingface.co/docs/transformers/v4.21.0/en/model_doc/clip#transformers.CLIPTokenizer). + unet ([`UNet2DConditionModel`]): Conditional U-Net architecture to denoise the encoded image latents. + scheduler ([`SchedulerMixin`]): + A scheduler to be used in combination with `unet` to denoise the encoded image latents. Can be one of + [`DDIMScheduler`], [`LMSDiscreteScheduler`], or [`PNDMScheduler`]. + """ + + def __init__( + self, + load_path: str = "stabilityai/stable-diffusion-xl-base-1.0", + device: str = "cuda", + force_zeros_for_empty_prompt: bool = True, + ): + super().__init__() + + # self.register_modules( + # vae=vae, + # text_encoder=text_encoder, + # text_encoder_2=text_encoder_2, + # tokenizer=tokenizer, + # tokenizer_2=tokenizer_2, + # unet=unet, + # scheduler=scheduler, + # ) + self.model_id = load_path + + variant = "fp16" if "stable-diffusion-xl" in load_path else None + + for load_attemp in range(10): + print(f'[INFO] loading stable diffusion {self.model_id} ... attempt {load_attemp}') + # try: + # 1. Load the autoencoder model which will be used to decode the latents into image space. + self.vae = AutoencoderKL.from_pretrained(load_path, subfolder="vae", use_safetensors=True, variant=variant).to(device) + # 2. Load the tokenizer and text encoder to tokenize and encode the text. + self.tokenizer = CLIPTokenizer.from_pretrained(load_path, subfolder='tokenizer') + self.tokenizer_2 = CLIPTokenizer.from_pretrained(load_path, subfolder='tokenizer_2') + self.text_encoder = CLIPTextModel.from_pretrained(load_path, subfolder='text_encoder', torch_dtype=torch.float16, use_safetensors=True, variant=variant).to(device) + self.text_encoder_2 = CLIPTextModelWithProjection.from_pretrained(load_path, subfolder='text_encoder_2', torch_dtype=torch.float16, use_safetensors=True, variant=variant).to(device) + # 3. The UNet model for generating the latents. + self.unet = UNet2DConditionModel.from_pretrained(load_path, subfolder="unet", torch_dtype=torch.float16, use_safetensors=True, variant=variant).to(device) + # 4. Scheduler. + self.scheduler = EulerDiscreteScheduler.from_pretrained(load_path, subfolder="scheduler") + # except Exception as e: + # print(f'[INFO] failed to load stable diffusion {self.model_id} ... 
error {e}') + # continue + if self.unet is not None and self.vae is not None and self.text_encoder is not None: + break + + self.register_to_config(force_zeros_for_empty_prompt=force_zeros_for_empty_prompt) + self.vae_scale_factor = 2 ** (len(self.vae.config.block_out_channels) - 1) + self.image_processor = VaeImageProcessor(vae_scale_factor=self.vae_scale_factor) + self.default_sample_size = self.unet.config.sample_size + + self.watermark = StableDiffusionXLWatermarker() + + self.device_type = device + + self.masks = [] + self.attention_maps = None + self.selfattn_maps = None + self.crossattn_maps = None + self.color_loss = torch.nn.functional.mse_loss + self.forward_hooks = [] + self.forward_replacement_hooks = [] + + # Overwriting the method from diffusers_official.pipelines.diffusion_pipeline.DiffusionPipeline + @property + def device(self) -> torch.device: + r""" + Returns: + `torch.device`: The torch device on which the pipeline is located. + """ + + return torch.device(self.device_type) + + # Copied from diffusers_official.pipelines.stable_diffusion.pipeline_stable_diffusion.StableDiffusionPipeline.enable_vae_slicing + def enable_vae_slicing(self): + r""" + Enable sliced VAE decoding. + + When this option is enabled, the VAE will split the input tensor in slices to compute decoding in several + steps. This is useful to save some memory and allow larger batch sizes. + """ + self.vae.enable_slicing() + + # Copied from diffusers_official.pipelines.stable_diffusion.pipeline_stable_diffusion.StableDiffusionPipeline.disable_vae_slicing + def disable_vae_slicing(self): + r""" + Disable sliced VAE decoding. If `enable_vae_slicing` was previously invoked, this method will go back to + computing decoding in one step. + """ + self.vae.disable_slicing() + + # Copied from diffusers_official.pipelines.stable_diffusion.pipeline_stable_diffusion.StableDiffusionPipeline.enable_vae_tiling + def enable_vae_tiling(self): + r""" + Enable tiled VAE decoding. + + When this option is enabled, the VAE will split the input tensor into tiles to compute decoding and encoding in + several steps. This is useful to save a large amount of memory and to allow the processing of larger images. + """ + self.vae.enable_tiling() + + # Copied from diffusers_official.pipelines.stable_diffusion.pipeline_stable_diffusion.StableDiffusionPipeline.disable_vae_tiling + def disable_vae_tiling(self): + r""" + Disable tiled VAE decoding. If `enable_vae_tiling` was previously invoked, this method will go back to + computing decoding in one step. + """ + self.vae.disable_tiling() + + def enable_sequential_cpu_offload(self, gpu_id=0): + r""" + Offloads all models to CPU using accelerate, significantly reducing memory usage. When called, unet, + text_encoder, vae and safety checker have their state dicts saved to CPU and then are moved to a + `torch.device('meta') and loaded to GPU only when their specific submodule has its `forward` method called. + Note that offloading happens on a submodule basis. Memory savings are higher than with + `enable_model_cpu_offload`, but performance is lower. 
+ """ + if is_accelerate_available() and is_accelerate_version(">=", "0.14.0"): + from accelerate import cpu_offload + else: + raise ImportError("`enable_sequential_cpu_offload` requires `accelerate v0.14.0` or higher") + + device = torch.device(f"cuda:{gpu_id}") + + if self.device.type != "cpu": + self.to("cpu", silence_dtype_warnings=True) + torch.cuda.empty_cache() # otherwise we don't see the memory savings (but they probably exist) + + for cpu_offloaded_model in [self.unet, self.text_encoder, self.text_encoder_2, self.vae]: + cpu_offload(cpu_offloaded_model, device) + + def enable_model_cpu_offload(self, gpu_id=0): + r""" + Offloads all models to CPU using accelerate, reducing memory usage with a low impact on performance. Compared + to `enable_sequential_cpu_offload`, this method moves one whole model at a time to the GPU when its `forward` + method is called, and the model remains in GPU until the next model runs. Memory savings are lower than with + `enable_sequential_cpu_offload`, but performance is much better due to the iterative execution of the `unet`. + """ + if is_accelerate_available() and is_accelerate_version(">=", "0.17.0.dev0"): + from accelerate import cpu_offload_with_hook + else: + raise ImportError("`enable_model_cpu_offload` requires `accelerate v0.17.0` or higher.") + + device = torch.device(f"cuda:{gpu_id}") + + if self.device.type != "cpu": + self.to("cpu", silence_dtype_warnings=True) + torch.cuda.empty_cache() # otherwise we don't see the memory savings (but they probably exist) + + model_sequence = ( + [self.text_encoder, self.text_encoder_2] if self.text_encoder is not None else [self.text_encoder_2] + ) + model_sequence.extend([self.unet, self.vae]) + + hook = None + for cpu_offloaded_model in model_sequence: + _, hook = cpu_offload_with_hook(cpu_offloaded_model, device, prev_module_hook=hook) + + # We'll offload the last model manually. + self.final_offload_hook = hook + + @property + # Copied from diffusers_official.pipelines.stable_diffusion.pipeline_stable_diffusion.StableDiffusionPipeline._execution_device + def _execution_device(self): + r""" + Returns the device on which the pipeline's models will be executed. After calling + `pipeline.enable_sequential_cpu_offload()` the execution device can only be inferred from Accelerate's module + hooks. + """ + if not hasattr(self.unet, "_hf_hook"): + return self.device + for module in self.unet.modules(): + if ( + hasattr(module, "_hf_hook") + and hasattr(module._hf_hook, "execution_device") + and module._hf_hook.execution_device is not None + ): + return torch.device(module._hf_hook.execution_device) + return self.device + + def encode_prompt( + self, + prompt, + device: Optional[torch.device] = None, + num_images_per_prompt: int = 1, + do_classifier_free_guidance: bool = True, + negative_prompt=None, + prompt_embeds: Optional[torch.FloatTensor] = None, + negative_prompt_embeds: Optional[torch.FloatTensor] = None, + pooled_prompt_embeds: Optional[torch.FloatTensor] = None, + negative_pooled_prompt_embeds: Optional[torch.FloatTensor] = None, + lora_scale: Optional[float] = None, + ): + r""" + Encodes the prompt into text encoder hidden states. 
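+
+        For SDXL both tokenizer/text-encoder pairs are run: the penultimate hidden
+        states of the two encoders are concatenated along the feature dimension,
+        while the pooled output of the second encoder is returned separately for the
+        added conditioning embeddings.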
+ + Args: + prompt (`str` or `List[str]`, *optional*): + prompt to be encoded + device: (`torch.device`): + torch device + num_images_per_prompt (`int`): + number of images that should be generated per prompt + do_classifier_free_guidance (`bool`): + whether to use classifier free guidance or not + negative_prompt (`str` or `List[str]`, *optional*): + The prompt or prompts not to guide the image generation. If not defined, one has to pass + `negative_prompt_embeds` instead. Ignored when not using guidance (i.e., ignored if `guidance_scale` is + less than `1`). + prompt_embeds (`torch.FloatTensor`, *optional*): + Pre-generated text embeddings. Can be used to easily tweak text inputs, *e.g.* prompt weighting. If not + provided, text embeddings will be generated from `prompt` input argument. + negative_prompt_embeds (`torch.FloatTensor`, *optional*): + Pre-generated negative text embeddings. Can be used to easily tweak text inputs, *e.g.* prompt + weighting. If not provided, negative_prompt_embeds will be generated from `negative_prompt` input + argument. + pooled_prompt_embeds (`torch.FloatTensor`, *optional*): + Pre-generated pooled text embeddings. Can be used to easily tweak text inputs, *e.g.* prompt weighting. + If not provided, pooled text embeddings will be generated from `prompt` input argument. + negative_pooled_prompt_embeds (`torch.FloatTensor`, *optional*): + Pre-generated negative pooled text embeddings. Can be used to easily tweak text inputs, *e.g.* prompt + weighting. If not provided, pooled negative_prompt_embeds will be generated from `negative_prompt` + input argument. + lora_scale (`float`, *optional*): + A lora scale that will be applied to all LoRA layers of the text encoder if LoRA layers are loaded. + """ + device = device or self._execution_device + + # set lora scale so that monkey patched LoRA + # function of text encoder can correctly access it + if lora_scale is not None and isinstance(self, LoraLoaderMixin): + self._lora_scale = lora_scale + + if prompt is not None and isinstance(prompt, str): + batch_size = 1 + elif prompt is not None and isinstance(prompt, list): + batch_size = len(prompt) + batch_size_neg = len(negative_prompt) + else: + batch_size = prompt_embeds.shape[0] + + # Define tokenizers and text encoders + tokenizers = [self.tokenizer, self.tokenizer_2] if self.tokenizer is not None else [self.tokenizer_2] + text_encoders = ( + [self.text_encoder, self.text_encoder_2] if self.text_encoder is not None else [self.text_encoder_2] + ) + + if prompt_embeds is None: + # textual inversion: procecss multi-vector tokens if necessary + prompt_embeds_list = [] + for tokenizer, text_encoder in zip(tokenizers, text_encoders): + if isinstance(self, TextualInversionLoaderMixin): + prompt = self.maybe_convert_prompt(prompt, tokenizer) + + text_inputs = tokenizer( + prompt, + padding="max_length", + max_length=tokenizer.model_max_length, + truncation=True, + return_tensors="pt", + ) + text_input_ids = text_inputs.input_ids + untruncated_ids = tokenizer(prompt, padding="longest", return_tensors="pt").input_ids + + if untruncated_ids.shape[-1] >= text_input_ids.shape[-1] and not torch.equal( + text_input_ids, untruncated_ids + ): + removed_text = tokenizer.batch_decode(untruncated_ids[:, tokenizer.model_max_length - 1 : -1]) + logger.warning( + "The following part of your input was truncated because CLIP can only handle sequences up to" + f" {tokenizer.model_max_length} tokens: {removed_text}" + ) + + prompt_embeds = text_encoder( + text_input_ids.to(device), + 
output_hidden_states=True, + ) + + # We are only ALWAYS interested in the pooled output of the final text encoder + pooled_prompt_embeds = prompt_embeds[0] + prompt_embeds = prompt_embeds.hidden_states[-2] + + bs_embed, seq_len, _ = prompt_embeds.shape + # duplicate text embeddings for each generation per prompt, using mps friendly method + prompt_embeds = prompt_embeds.repeat(1, num_images_per_prompt, 1) + prompt_embeds = prompt_embeds.view(bs_embed * num_images_per_prompt, seq_len, -1) + + prompt_embeds_list.append(prompt_embeds) + + prompt_embeds = torch.concat(prompt_embeds_list, dim=-1) + + # get unconditional embeddings for classifier free guidance + zero_out_negative_prompt = negative_prompt is None and self.config.force_zeros_for_empty_prompt + if do_classifier_free_guidance and negative_prompt_embeds is None and zero_out_negative_prompt: + negative_prompt_embeds = torch.zeros_like(prompt_embeds) + negative_pooled_prompt_embeds = torch.zeros_like(pooled_prompt_embeds) + elif do_classifier_free_guidance and negative_prompt_embeds is None: + negative_prompt = negative_prompt or "" + uncond_tokens: List[str] + if prompt is not None and type(prompt) is not type(negative_prompt): + raise TypeError( + f"`negative_prompt` should be the same type to `prompt`, but got {type(negative_prompt)} !=" + f" {type(prompt)}." + ) + elif isinstance(negative_prompt, str): + uncond_tokens = [negative_prompt] + # elif batch_size != len(negative_prompt): + # raise ValueError( + # f"`negative_prompt`: {negative_prompt} has batch size {len(negative_prompt)}, but `prompt`:" + # f" {prompt} has batch size {batch_size}. Please make sure that passed `negative_prompt` matches" + # " the batch size of `prompt`." + # ) + else: + uncond_tokens = negative_prompt + + negative_prompt_embeds_list = [] + for tokenizer, text_encoder in zip(tokenizers, text_encoders): + # textual inversion: procecss multi-vector tokens if necessary + if isinstance(self, TextualInversionLoaderMixin): + uncond_tokens = self.maybe_convert_prompt(uncond_tokens, tokenizer) + + max_length = prompt_embeds.shape[1] + uncond_input = tokenizer( + uncond_tokens, + padding="max_length", + max_length=max_length, + truncation=True, + return_tensors="pt", + ) + + negative_prompt_embeds = text_encoder( + uncond_input.input_ids.to(device), + output_hidden_states=True, + ) + # We are only ALWAYS interested in the pooled output of the final text encoder + negative_pooled_prompt_embeds = negative_prompt_embeds[0] + negative_prompt_embeds = negative_prompt_embeds.hidden_states[-2] + + if do_classifier_free_guidance: + # duplicate unconditional embeddings for each generation per prompt, using mps friendly method + seq_len = negative_prompt_embeds.shape[1] + + negative_prompt_embeds = negative_prompt_embeds.to(dtype=text_encoder.dtype, device=device) + + negative_prompt_embeds = negative_prompt_embeds.repeat(1, num_images_per_prompt, 1) + negative_prompt_embeds = negative_prompt_embeds.view( + batch_size_neg * num_images_per_prompt, seq_len, -1 + ) + + # For classifier free guidance, we need to do two forward passes. 
+ # Here we concatenate the unconditional and text embeddings into a single batch + # to avoid doing two forward passes + + negative_prompt_embeds_list.append(negative_prompt_embeds) + + negative_prompt_embeds = torch.concat(negative_prompt_embeds_list, dim=-1) + + bs_embed = pooled_prompt_embeds.shape[0] + pooled_prompt_embeds = pooled_prompt_embeds.repeat(1, num_images_per_prompt).view( + bs_embed * num_images_per_prompt, -1 + ) + bs_embed = negative_pooled_prompt_embeds.shape[0] + negative_pooled_prompt_embeds = negative_pooled_prompt_embeds.repeat(1, num_images_per_prompt).view( + bs_embed * num_images_per_prompt, -1 + ) + + return prompt_embeds, negative_prompt_embeds, pooled_prompt_embeds, negative_pooled_prompt_embeds + + # Copied from diffusers_official.pipelines.stable_diffusion.pipeline_stable_diffusion.StableDiffusionPipeline.prepare_extra_step_kwargs + def prepare_extra_step_kwargs(self, generator, eta): + # prepare extra kwargs for the scheduler step, since not all schedulers have the same signature + # eta (η) is only used with the DDIMScheduler, it will be ignored for other schedulers. + # eta corresponds to η in DDIM paper: https://arxiv.org/abs/2010.02502 + # and should be between [0, 1] + + accepts_eta = "eta" in set(inspect.signature(self.scheduler.step).parameters.keys()) + extra_step_kwargs = {} + if accepts_eta: + extra_step_kwargs["eta"] = eta + + # check if the scheduler accepts generator + accepts_generator = "generator" in set(inspect.signature(self.scheduler.step).parameters.keys()) + if accepts_generator: + extra_step_kwargs["generator"] = generator + return extra_step_kwargs + + def check_inputs( + self, + prompt, + height, + width, + callback_steps, + negative_prompt=None, + prompt_embeds=None, + negative_prompt_embeds=None, + pooled_prompt_embeds=None, + negative_pooled_prompt_embeds=None, + ): + if height % 8 != 0 or width % 8 != 0: + raise ValueError(f"`height` and `width` have to be divisible by 8 but are {height} and {width}.") + + if (callback_steps is None) or ( + callback_steps is not None and (not isinstance(callback_steps, int) or callback_steps <= 0) + ): + raise ValueError( + f"`callback_steps` has to be a positive integer but is {callback_steps} of type" + f" {type(callback_steps)}." + ) + + if prompt is not None and prompt_embeds is not None: + raise ValueError( + f"Cannot forward both `prompt`: {prompt} and `prompt_embeds`: {prompt_embeds}. Please make sure to" + " only forward one of the two." + ) + elif prompt is None and prompt_embeds is None: + raise ValueError( + "Provide either `prompt` or `prompt_embeds`. Cannot leave both `prompt` and `prompt_embeds` undefined." + ) + elif prompt is not None and (not isinstance(prompt, str) and not isinstance(prompt, list)): + raise ValueError(f"`prompt` has to be of type `str` or `list` but is {type(prompt)}") + + if negative_prompt is not None and negative_prompt_embeds is not None: + raise ValueError( + f"Cannot forward both `negative_prompt`: {negative_prompt} and `negative_prompt_embeds`:" + f" {negative_prompt_embeds}. Please make sure to only forward one of the two." + ) + + if prompt_embeds is not None and negative_prompt_embeds is not None: + if prompt_embeds.shape != negative_prompt_embeds.shape: + raise ValueError( + "`prompt_embeds` and `negative_prompt_embeds` must have the same shape when passed directly, but" + f" got: `prompt_embeds` {prompt_embeds.shape} != `negative_prompt_embeds`" + f" {negative_prompt_embeds.shape}." 
+ ) + + if prompt_embeds is not None and pooled_prompt_embeds is None: + raise ValueError( + "If `prompt_embeds` are provided, `pooled_prompt_embeds` also have to be passed. Make sure to generate `pooled_prompt_embeds` from the same text encoder that was used to generate `prompt_embeds`." + ) + + if negative_prompt_embeds is not None and negative_pooled_prompt_embeds is None: + raise ValueError( + "If `negative_prompt_embeds` are provided, `negative_pooled_prompt_embeds` also have to be passed. Make sure to generate `negative_pooled_prompt_embeds` from the same text encoder that was used to generate `negative_prompt_embeds`." + ) + + # Copied from diffusers_official.pipelines.stable_diffusion.pipeline_stable_diffusion.StableDiffusionPipeline.prepare_latents + def prepare_latents(self, batch_size, num_channels_latents, height, width, dtype, device, generator, latents=None): + shape = (batch_size, num_channels_latents, height // self.vae_scale_factor, width // self.vae_scale_factor) + if isinstance(generator, list) and len(generator) != batch_size: + raise ValueError( + f"You have passed a list of generators of length {len(generator)}, but requested an effective batch" + f" size of {batch_size}. Make sure the batch size matches the length of the generators." + ) + + if latents is None: + latents = randn_tensor(shape, generator=generator, device=device, dtype=dtype) + else: + latents = latents.to(device) + + # scale the initial noise by the standard deviation required by the scheduler + latents = latents * self.scheduler.init_noise_sigma + return latents + + def _get_add_time_ids(self, original_size, crops_coords_top_left, target_size, dtype): + add_time_ids = list(original_size + crops_coords_top_left + target_size) + + passed_add_embed_dim = ( + self.unet.config.addition_time_embed_dim * len(add_time_ids) + self.text_encoder_2.config.projection_dim + ) + expected_add_embed_dim = self.unet.add_embedding.linear_1.in_features + + if expected_add_embed_dim != passed_add_embed_dim: + raise ValueError( + f"Model expects an added time embedding vector of length {expected_add_embed_dim}, but a vector of {passed_add_embed_dim} was created. The model has an incorrect config. Please check `unet.config.time_embedding_type` and `text_encoder_2.config.projection_dim`." 
+ ) + + add_time_ids = torch.tensor([add_time_ids], dtype=dtype) + return add_time_ids + + @torch.no_grad() + def sample( + self, + prompt: Union[str, List[str]] = None, + height: Optional[int] = None, + width: Optional[int] = None, + num_inference_steps: int = 50, + guidance_scale: float = 5.0, + negative_prompt: Optional[Union[str, List[str]]] = None, + num_images_per_prompt: Optional[int] = 1, + eta: float = 0.0, + generator: Optional[Union[torch.Generator, List[torch.Generator]]] = None, + latents: Optional[torch.FloatTensor] = None, + prompt_embeds: Optional[torch.FloatTensor] = None, + negative_prompt_embeds: Optional[torch.FloatTensor] = None, + pooled_prompt_embeds: Optional[torch.FloatTensor] = None, + negative_pooled_prompt_embeds: Optional[torch.FloatTensor] = None, + output_type: Optional[str] = "pil", + return_dict: bool = True, + callback: Optional[Callable[[int, int, torch.FloatTensor], None]] = None, + callback_steps: int = 1, + cross_attention_kwargs: Optional[Dict[str, Any]] = None, + guidance_rescale: float = 0.0, + original_size: Optional[Tuple[int, int]] = None, + crops_coords_top_left: Tuple[int, int] = (0, 0), + target_size: Optional[Tuple[int, int]] = None, + # Rich-Text args + use_guidance: bool = False, + inject_selfattn: float = 0.0, + inject_background: float = 0.0, + text_format_dict: Optional[dict] = None, + run_rich_text: bool = False, + ): + r""" + Function invoked when calling the pipeline for generation. + + Args: + prompt (`str` or `List[str]`, *optional*): + The prompt or prompts to guide the image generation. If not defined, one has to pass `prompt_embeds`. + instead. + height (`int`, *optional*, defaults to self.unet.config.sample_size * self.vae_scale_factor): + The height in pixels of the generated image. + width (`int`, *optional*, defaults to self.unet.config.sample_size * self.vae_scale_factor): + The width in pixels of the generated image. + num_inference_steps (`int`, *optional*, defaults to 50): + The number of denoising steps. More denoising steps usually lead to a higher quality image at the + expense of slower inference. + guidance_scale (`float`, *optional*, defaults to 7.5): + Guidance scale as defined in [Classifier-Free Diffusion Guidance](https://arxiv.org/abs/2207.12598). + `guidance_scale` is defined as `w` of equation 2. of [Imagen + Paper](https://arxiv.org/pdf/2205.11487.pdf). Guidance scale is enabled by setting `guidance_scale > + 1`. Higher guidance scale encourages to generate images that are closely linked to the text `prompt`, + usually at the expense of lower image quality. + negative_prompt (`str` or `List[str]`, *optional*): + The prompt or prompts not to guide the image generation. If not defined, one has to pass + `negative_prompt_embeds` instead. Ignored when not using guidance (i.e., ignored if `guidance_scale` is + less than `1`). + num_images_per_prompt (`int`, *optional*, defaults to 1): + The number of images to generate per prompt. + eta (`float`, *optional*, defaults to 0.0): + Corresponds to parameter eta (η) in the DDIM paper: https://arxiv.org/abs/2010.02502. Only applies to + [`schedulers.DDIMScheduler`], will be ignored for others. + generator (`torch.Generator` or `List[torch.Generator]`, *optional*): + One or a list of [torch generator(s)](https://pytorch.org/docs/stable/generated/torch.Generator.html) + to make generation deterministic. + latents (`torch.FloatTensor`, *optional*): + Pre-generated noisy latents, sampled from a Gaussian distribution, to be used as inputs for image + generation. 
Can be used to tweak the same generation with different prompts. If not provided, a latents + tensor will ge generated by sampling using the supplied random `generator`. + prompt_embeds (`torch.FloatTensor`, *optional*): + Pre-generated text embeddings. Can be used to easily tweak text inputs, *e.g.* prompt weighting. If not + provided, text embeddings will be generated from `prompt` input argument. + negative_prompt_embeds (`torch.FloatTensor`, *optional*): + Pre-generated negative text embeddings. Can be used to easily tweak text inputs, *e.g.* prompt + weighting. If not provided, negative_prompt_embeds will be generated from `negative_prompt` input + argument. + pooled_prompt_embeds (`torch.FloatTensor`, *optional*): + Pre-generated pooled text embeddings. Can be used to easily tweak text inputs, *e.g.* prompt weighting. + If not provided, pooled text embeddings will be generated from `prompt` input argument. + negative_pooled_prompt_embeds (`torch.FloatTensor`, *optional*): + Pre-generated negative pooled text embeddings. Can be used to easily tweak text inputs, *e.g.* prompt + weighting. If not provided, pooled negative_prompt_embeds will be generated from `negative_prompt` + input argument. + output_type (`str`, *optional*, defaults to `"pil"`): + The output format of the generate image. Choose between + [PIL](https://pillow.readthedocs.io/en/stable/): `PIL.Image.Image` or `np.array`. + return_dict (`bool`, *optional*, defaults to `True`): + Whether or not to return a [`~pipelines.stable_diffusion.StableDiffusionXLPipelineOutput`] instead of a + plain tuple. + callback (`Callable`, *optional*): + A function that will be called every `callback_steps` steps during inference. The function will be + called with the following arguments: `callback(step: int, timestep: int, latents: torch.FloatTensor)`. + callback_steps (`int`, *optional*, defaults to 1): + The frequency at which the `callback` function will be called. If not specified, the callback will be + called at every step. + cross_attention_kwargs (`dict`, *optional*): + A kwargs dictionary that if specified is passed along to the `AttentionProcessor` as defined under + `self.processor` in + [diffusers.cross_attention](https://github.com/huggingface/diffusers/blob/main/src/diffusers/models/cross_attention.py). + guidance_rescale (`float`, *optional*, defaults to 0.7): + Guidance rescale factor proposed by [Common Diffusion Noise Schedules and Sample Steps are + Flawed](https://arxiv.org/pdf/2305.08891.pdf) `guidance_scale` is defined as `φ` in equation 16. of + [Common Diffusion Noise Schedules and Sample Steps are Flawed](https://arxiv.org/pdf/2305.08891.pdf). + Guidance rescale factor should fix overexposure when using zero terminal SNR. + original_size (`Tuple[int]`, *optional*, defaults to (1024, 1024)): + TODO + crops_coords_top_left (`Tuple[int]`, *optional*, defaults to (0, 0)): + TODO + target_size (`Tuple[int]`, *optional*, defaults to (1024, 1024)): + TODO + + Examples: + + Returns: + [`~pipelines.stable_diffusion.StableDiffusionXLPipelineOutput`] or `tuple`: + [`~pipelines.stable_diffusion.StableDiffusionXLPipelineOutput`] if `return_dict` is True, otherwise a + `tuple. When returning a tuple, the first element is a list with the generated images, and the second + element is a list of `bool`s denoting whether the corresponding generated image likely represents + "not-safe-for-work" (nsfw) content, according to the `safety_checker`. + """ + # 0. 
Default height and width to unet + height = height or self.default_sample_size * self.vae_scale_factor + width = width or self.default_sample_size * self.vae_scale_factor + + original_size = original_size or (height, width) + target_size = target_size or (height, width) + + # 1. Check inputs. Raise error if not correct + self.check_inputs( + prompt, + height, + width, + callback_steps, + negative_prompt, + prompt_embeds, + negative_prompt_embeds, + pooled_prompt_embeds, + negative_pooled_prompt_embeds, + ) + + # 2. Define call parameters + if prompt is not None and isinstance(prompt, str): + batch_size = 1 + elif prompt is not None and isinstance(prompt, list): + # TODO: support batched prompts + batch_size = 1 + # batch_size = len(prompt) + else: + batch_size = prompt_embeds.shape[0] + + device = self._execution_device + + # here `guidance_scale` is defined analog to the guidance weight `w` of equation (2) + # of the Imagen paper: https://arxiv.org/pdf/2205.11487.pdf . `guidance_scale = 1` + # corresponds to doing no classifier free guidance. + do_classifier_free_guidance = guidance_scale > 1.0 + + # 3. Encode input prompt + text_encoder_lora_scale = ( + cross_attention_kwargs.get("scale", None) if cross_attention_kwargs is not None else None + ) + ( + prompt_embeds, + negative_prompt_embeds, + pooled_prompt_embeds, + negative_pooled_prompt_embeds, + ) = self.encode_prompt( + prompt, + device, + num_images_per_prompt, + do_classifier_free_guidance, + negative_prompt, + prompt_embeds=prompt_embeds, + negative_prompt_embeds=negative_prompt_embeds, + pooled_prompt_embeds=pooled_prompt_embeds, + negative_pooled_prompt_embeds=negative_pooled_prompt_embeds, + lora_scale=text_encoder_lora_scale, + ) + + # 4. Prepare timesteps + self.scheduler.set_timesteps(num_inference_steps, device=device) + + timesteps = self.scheduler.timesteps + + # 5. Prepare latent variables + num_channels_latents = self.unet.config.in_channels + latents = self.prepare_latents( + batch_size * num_images_per_prompt, + num_channels_latents, + height, + width, + prompt_embeds.dtype, + device, + generator, + latents, + ) + + # 6. Prepare extra step kwargs. TODO: Logic should ideally just be moved out of the pipeline + extra_step_kwargs = self.prepare_extra_step_kwargs(generator, eta) + + # 7. Prepare added time ids & embeddings + add_text_embeds = pooled_prompt_embeds + add_time_ids = self._get_add_time_ids( + original_size, crops_coords_top_left, target_size, dtype=prompt_embeds.dtype + ) + + if do_classifier_free_guidance: + prompt_embeds = torch.cat([negative_prompt_embeds, prompt_embeds], dim=0) + add_text_embeds = torch.cat([negative_pooled_prompt_embeds, add_text_embeds], dim=0) + add_time_ids = torch.cat([add_time_ids, add_time_ids], dim=0) + + prompt_embeds = prompt_embeds.to(device) + add_text_embeds = add_text_embeds.to(device) + add_time_ids = add_time_ids.to(device).repeat(batch_size * num_images_per_prompt, 1) + + # make sure the VAE is in float32 mode, as it overflows in float16 + self.vae.to(dtype=torch.float32) + + # 8. 
Denoising loop + num_warmup_steps = len(timesteps) - num_inference_steps * self.scheduler.order + if run_rich_text: + if inject_selfattn > 0 or inject_background > 0: + latents_reference = latents.clone().detach() + n_styles = prompt_embeds.shape[0]-1 + self.masks = [mask.to(dtype=prompt_embeds.dtype) for mask in self.masks] + print(n_styles, len(self.masks)) + with self.progress_bar(total=num_inference_steps) as progress_bar: + for i, t in enumerate(self.scheduler.timesteps): + # predict the noise residual + with torch.no_grad(): + feat_inject_step = t > (1-inject_selfattn) * 1000 + background_inject_step = i < inject_background * len(self.scheduler.timesteps) + latent_model_input = self.scheduler.scale_model_input(latents, t) + # import ipdb;ipdb.set_trace() + # unconditional prediction + noise_pred_uncond_cur = self.unet(latent_model_input, t, encoder_hidden_states=prompt_embeds[:1], + cross_attention_kwargs=cross_attention_kwargs, + added_cond_kwargs={"text_embeds": add_text_embeds[:1], "time_ids": add_time_ids[:1]} + )['sample'] + # tokens without any style or footnote + self.register_fontsize_hooks(text_format_dict) + noise_pred_text_cur = self.unet(latent_model_input, t, encoder_hidden_states=prompt_embeds[-1:], + cross_attention_kwargs=cross_attention_kwargs, + added_cond_kwargs={"text_embeds": add_text_embeds[-1:], "time_ids": add_time_ids[:1]} + )['sample'] + self.remove_fontsize_hooks() + if inject_selfattn > 0 or inject_background > 0: + latent_reference_model_input = self.scheduler.scale_model_input(latents_reference, t) + noise_pred_uncond_refer = self.unet(latent_reference_model_input, t, encoder_hidden_states=prompt_embeds[:1], + cross_attention_kwargs=cross_attention_kwargs, + added_cond_kwargs={"text_embeds": add_text_embeds[:1], "time_ids": add_time_ids[:1]} + )['sample'] + self.register_selfattn_hooks(feat_inject_step) + noise_pred_text_refer = self.unet(latent_reference_model_input, t, encoder_hidden_states=prompt_embeds[-1:], + cross_attention_kwargs=cross_attention_kwargs, + added_cond_kwargs={"text_embeds": add_text_embeds[-1:], "time_ids": add_time_ids[:1]} + )['sample'] + self.remove_selfattn_hooks() + noise_pred_uncond = noise_pred_uncond_cur * self.masks[-1] + noise_pred_text = noise_pred_text_cur * self.masks[-1] + # tokens with style or footnote + for style_i, mask in enumerate(self.masks[:-1]): + self.register_replacement_hooks(feat_inject_step) + noise_pred_text_cur = self.unet(latent_model_input, t, encoder_hidden_states=prompt_embeds[style_i+1:style_i+2], + cross_attention_kwargs=cross_attention_kwargs, + added_cond_kwargs={"text_embeds": add_text_embeds[style_i+1:style_i+2], "time_ids": add_time_ids[:1]} + )['sample'] + self.remove_replacement_hooks() + noise_pred_uncond = noise_pred_uncond + noise_pred_uncond_cur*mask + noise_pred_text = noise_pred_text + noise_pred_text_cur*mask + + # perform guidance + noise_pred = noise_pred_uncond + guidance_scale * \ + (noise_pred_text - noise_pred_uncond) + + if do_classifier_free_guidance and guidance_rescale > 0.0: + # TODO: Based on 3.4. 
in https://arxiv.org/pdf/2305.08891.pdf + # noise_pred = rescale_noise_cfg(noise_pred, noise_pred_text, guidance_rescale=guidance_rescale) + raise NotImplementedError + + if inject_selfattn > 0 or background_inject_step > 0: + noise_pred_refer = noise_pred_uncond_refer + guidance_scale * \ + (noise_pred_text_refer - noise_pred_uncond_refer) + + # compute the previous noisy sample x_t -> x_t-1 + latents_reference = self.scheduler.step(torch.cat([noise_pred, noise_pred_refer]), t, + torch.cat([latents, latents_reference]))[ + 'prev_sample'] + latents, latents_reference = torch.chunk( + latents_reference, 2, dim=0) + + else: + # compute the previous noisy sample x_t -> x_t-1 + latents = self.scheduler.step(noise_pred, t, latents)[ + 'prev_sample'] + + # apply guidance + if use_guidance and t < text_format_dict['guidance_start_step']: + with torch.enable_grad(): + if not latents.requires_grad: + latents.requires_grad = True + # import ipdb;ipdb.set_trace() + # latents_0 = self.predict_x0(latents, noise_pred, t).to(dtype=latents.dtype) + latents_0 = self.predict_x0(latents, noise_pred, t).to(dtype=torch.bfloat16) + latents_inp = latents_0 / self.vae.config.scaling_factor + # imgs = self.vae.decode(latents_inp.to(dtype=torch.float32)).sample + imgs = self.vae.to(dtype=latents_inp.dtype).decode(latents_inp).sample + imgs = (imgs / 2 + 0.5).clamp(0, 1) + loss_total = 0. + for attn_map, rgb_val in zip(text_format_dict['color_obj_atten'], text_format_dict['target_RGB']): + avg_rgb = ( + imgs*attn_map[:, 0]).sum(2).sum(2)/attn_map[:, 0].sum() + loss = self.color_loss( + avg_rgb, rgb_val[:, :, 0, 0])*100 + loss_total += loss + loss_total.backward() + latents = ( + latents - latents.grad * text_format_dict['color_guidance_weight'] * text_format_dict['color_obj_atten_all']).detach().clone().to(dtype=prompt_embeds.dtype) + self.unet.to(device=latents.device) + + # apply background injection + if i == int(inject_background * len(self.scheduler.timesteps)) and inject_background > 0: + latents = latents_reference * self.masks[-1] + latents * \ + (1-self.masks[-1]) + + # call the callback, if provided + if i == len(timesteps) - 1 or ((i + 1) > num_warmup_steps and (i + 1) % self.scheduler.order == 0): + progress_bar.update() + if callback is not None and i % callback_steps == 0: + callback(i, t, latents) + else: + with self.progress_bar(total=num_inference_steps) as progress_bar: + for i, t in enumerate(timesteps): + # expand the latents if we are doing classifier free guidance + latent_model_input = torch.cat([latents] * 2) if do_classifier_free_guidance else latents + + latent_model_input = self.scheduler.scale_model_input(latent_model_input, t) + + # predict the noise residual + added_cond_kwargs = {"text_embeds": add_text_embeds, "time_ids": add_time_ids} + noise_pred = self.unet( + latent_model_input, + t, + encoder_hidden_states=prompt_embeds, + cross_attention_kwargs=cross_attention_kwargs, + added_cond_kwargs=added_cond_kwargs, + return_dict=False, + )[0] + + # perform guidance + if do_classifier_free_guidance: + noise_pred_uncond, noise_pred_text = noise_pred.chunk(2) + noise_pred = noise_pred_uncond + guidance_scale * (noise_pred_text - noise_pred_uncond) + + if do_classifier_free_guidance and guidance_rescale > 0.0: + # Based on 3.4. 
in https://arxiv.org/pdf/2305.08891.pdf + noise_pred = rescale_noise_cfg(noise_pred, noise_pred_text, guidance_rescale=guidance_rescale) + + # compute the previous noisy sample x_t -> x_t-1 + latents = self.scheduler.step(noise_pred, t, latents, **extra_step_kwargs, return_dict=False)[0] + + # call the callback, if provided + if i == len(timesteps) - 1 or ((i + 1) > num_warmup_steps and (i + 1) % self.scheduler.order == 0): + progress_bar.update() + if callback is not None and i % callback_steps == 0: + callback(i, t, latents) + + + use_torch_2_0_or_xformers = isinstance( + self.vae.decoder.mid_block.attentions[0].processor, + ( + AttnProcessor2_0, + XFormersAttnProcessor, + LoRAXFormersAttnProcessor, + LoRAAttnProcessor2_0, + ), + ) + # if xformers or torch_2_0 is used attention block does not need + # to be in float32 which can save lots of memory + if use_torch_2_0_or_xformers: + self.vae.post_quant_conv.to(latents.dtype) + self.vae.decoder.conv_in.to(latents.dtype) + self.vae.decoder.mid_block.to(latents.dtype) + else: + latents = latents.float() + + if not output_type == "latent": + image = self.vae.decode(latents / self.vae.config.scaling_factor, return_dict=False)[0] + else: + image = latents + return StableDiffusionXLPipelineOutput(images=image) + + image = self.watermark.apply_watermark(image) + image = self.image_processor.postprocess(image, output_type=output_type) + + # Offload last model to CPU + if hasattr(self, "final_offload_hook") and self.final_offload_hook is not None: + self.final_offload_hook.offload() + + if not return_dict: + return (image,) + + return StableDiffusionXLPipelineOutput(images=image) + + def predict_x0(self, x_t, eps_t, t): + alpha_t = self.scheduler.alphas_cumprod[t.cpu().long().item()] + return (x_t - eps_t * torch.sqrt(1-alpha_t)) / torch.sqrt(alpha_t) + + def register_tokenmap_hooks(self): + r"""Function for registering hooks during evaluation. + We mainly store activation maps averaged over queries. + """ + self.forward_hooks = [] + + def save_activations(selfattn_maps, crossattn_maps, n_maps, name, module, inp, out): + r""" + PyTorch Forward hook to save outputs at each forward pass. + """ + # out[0] - final output of attention layer + # out[1] - attention probability matrices + if name in n_maps: + n_maps[name] += 1 + else: + n_maps[name] = 1 + if 'attn2' in name: + assert out[1][0].shape[-1] == 77 + if name in CrossAttentionLayers_XL and n_maps[name] > 10: + # if n_maps[name] > 10: + if name in crossattn_maps: + crossattn_maps[name] += out[1][0].detach().cpu()[1:2] + else: + crossattn_maps[name] = out[1][0].detach().cpu()[1:2] + # For visualization + # crossattn_maps[name].append(out[1][0].detach().cpu()[1:2]) + else: + assert out[1][0].shape[-1] != 77 + # if name in SelfAttentionLayers and n_maps[name] > 10: + if n_maps[name] > 10: + if name in selfattn_maps: + selfattn_maps[name] += out[1][0].detach().cpu()[1:2] + else: + selfattn_maps[name] = out[1][0].detach().cpu()[1:2] + + selfattn_maps = collections.defaultdict(list) + crossattn_maps = collections.defaultdict(list) + n_maps = collections.defaultdict(list) + + for name, module in self.unet.named_modules(): + leaf_name = name.split('.')[-1] + if 'attn' in leaf_name: + # Register hook to obtain outputs at every attention layer. 
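+                # These hooks accumulate (sum) the per-layer attention probabilities from
+                # out[1][0][1:2] once n_maps[name] > 10, so selfattn_maps / crossattn_maps
+                # hold running sums over timesteps; any averaging is presumably done by
+                # the consumer using n_maps.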
+ self.forward_hooks.append(module.register_forward_hook( + partial(save_activations, selfattn_maps, + crossattn_maps, n_maps, name) + )) + # attention_dict is a dictionary containing attention maps for every attention layer + self.selfattn_maps = selfattn_maps + self.crossattn_maps = crossattn_maps + self.n_maps = n_maps + + def remove_tokenmap_hooks(self): + for hook in self.forward_hooks: + hook.remove() + self.selfattn_maps = None + self.crossattn_maps = None + self.n_maps = None + + def register_replacement_hooks(self, feat_inject_step=False): + r"""Function for registering hooks to replace self attention. + """ + self.forward_replacement_hooks = [] + + def replace_activations(name, module, args): + r""" + PyTorch Forward hook to save outputs at each forward pass. + """ + if 'attn1' in name: + modified_args = (args[0], self.self_attention_maps_cur[name].to(args[0].device)) + return modified_args + # cross attention injection + # elif 'attn2' in name: + # modified_map = { + # 'reference': self.self_attention_maps_cur[name], + # 'inject_pos': self.inject_pos, + # } + # modified_args = (args[0], modified_map) + # return modified_args + + def replace_resnet_activations(name, module, args): + r""" + PyTorch Forward hook to save outputs at each forward pass. + """ + modified_args = (args[0], args[1], + self.self_attention_maps_cur[name].to(args[0].device)) + return modified_args + for name, module in self.unet.named_modules(): + leaf_name = name.split('.')[-1] + if 'attn' in leaf_name and feat_inject_step: + # Register hook to obtain outputs at every attention layer. + self.forward_replacement_hooks.append(module.register_forward_pre_hook( + partial(replace_activations, name) + )) + if name == 'up_blocks.1.resnets.1' and feat_inject_step: + # Register hook to obtain outputs at every attention layer. + self.forward_replacement_hooks.append(module.register_forward_pre_hook( + partial(replace_resnet_activations, name) + )) + + def remove_replacement_hooks(self): + for hook in self.forward_replacement_hooks: + hook.remove() + + + def register_selfattn_hooks(self, feat_inject_step=False): + r"""Function for registering hooks during evaluation. + We mainly store activation maps averaged over queries. + """ + self.selfattn_forward_hooks = [] + + def save_activations(activations, name, module, inp, out): + r""" + PyTorch Forward hook to save outputs at each forward pass. + """ + # out[0] - final output of attention layer + # out[1] - attention probability matrix + if 'attn2' in name: + assert out[1][1].shape[-1] == 77 + # cross attention injection + # activations[name] = out[1][1].detach() + else: + assert out[1][1].shape[-1] != 77 + activations[name] = out[1][1].detach().cpu() + + def save_resnet_activations(activations, name, module, inp, out): + r""" + PyTorch Forward hook to save outputs at each forward pass. + """ + # out[0] - final output of residual layer + # out[1] - residual hidden feature + # import ipdb;ipdb.set_trace() + # assert out[1].shape[-1] == 64 + activations[name] = out[1].detach().cpu() + attention_dict = collections.defaultdict(list) + for name, module in self.unet.named_modules(): + leaf_name = name.split('.')[-1] + if 'attn' in leaf_name and feat_inject_step: + # Register hook to obtain outputs at every attention layer. 
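+                # The customized attention modules in scripts.models are expected to
+                # return the attention probability matrix alongside the layer output
+                # (out[1]); the maps saved here are fed back by register_replacement_hooks
+                # for self-attention feature injection within the same denoising step.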
+ self.selfattn_forward_hooks.append(module.register_forward_hook( + partial(save_activations, attention_dict, name) + )) + if name == 'up_blocks.1.resnets.1' and feat_inject_step: + self.selfattn_forward_hooks.append(module.register_forward_hook( + partial(save_resnet_activations, attention_dict, name) + )) + # attention_dict is a dictionary containing attention maps for every attention layer + self.self_attention_maps_cur = attention_dict + + def remove_selfattn_hooks(self): + for hook in self.selfattn_forward_hooks: + hook.remove() + + def register_fontsize_hooks(self, text_format_dict={}): + r"""Function for registering hooks to replace self attention. + """ + self.forward_fontsize_hooks = [] + + def adjust_attn_weights(name, module, args): + r""" + PyTorch Forward hook to save outputs at each forward pass. + """ + if 'attn2' in name: + modified_args = (args[0], None, attn_weights) + return modified_args + + if text_format_dict['word_pos'] is not None and text_format_dict['font_size'] is not None: + attn_weights = {'word_pos': text_format_dict['word_pos'], 'font_size': text_format_dict['font_size']} + else: + attn_weights = None + + for name, module in self.unet.named_modules(): + leaf_name = name.split('.')[-1] + if 'attn' in leaf_name and attn_weights is not None: + # Register hook to obtain outputs at every attention layer. + self.forward_fontsize_hooks.append(module.register_forward_pre_hook( + partial(adjust_attn_weights, name) + )) + + def remove_fontsize_hooks(self): + for hook in self.forward_fontsize_hooks: + hook.remove() \ No newline at end of file diff --git a/gf/sd-webui-rich-text/scripts/models/resnet.py b/gf/sd-webui-rich-text/scripts/models/resnet.py new file mode 100644 index 0000000000000000000000000000000000000000..278519dfa924aef61aca99dae7c782361fc5171b --- /dev/null +++ b/gf/sd-webui-rich-text/scripts/models/resnet.py @@ -0,0 +1,882 @@ +# Copyright 2023 The HuggingFace Team. All rights reserved. +# `TemporalConvLayer` Copyright 2023 Alibaba DAMO-VILAB, The ModelScope Team and The HuggingFace Team. All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +from functools import partial +from typing import Optional + +import torch +import torch.nn as nn +import torch.nn.functional as F + +from diffusers_official.models.activations import get_activation +from diffusers_official.models.attention import AdaGroupNorm +from scripts.models.attention_processor import SpatialNorm + + +class Upsample1D(nn.Module): + """A 1D upsampling layer with an optional convolution. + + Parameters: + channels (`int`): + number of channels in the inputs and outputs. + use_conv (`bool`, default `False`): + option to use a convolution. + use_conv_transpose (`bool`, default `False`): + option to use a convolution transpose. + out_channels (`int`, optional): + number of output channels. Defaults to `channels`. 
+ """ + + def __init__(self, channels, use_conv=False, use_conv_transpose=False, out_channels=None, name="conv"): + super().__init__() + self.channels = channels + self.out_channels = out_channels or channels + self.use_conv = use_conv + self.use_conv_transpose = use_conv_transpose + self.name = name + + self.conv = None + if use_conv_transpose: + self.conv = nn.ConvTranspose1d(channels, self.out_channels, 4, 2, 1) + elif use_conv: + self.conv = nn.Conv1d(self.channels, self.out_channels, 3, padding=1) + + def forward(self, inputs): + assert inputs.shape[1] == self.channels + if self.use_conv_transpose: + return self.conv(inputs) + + outputs = F.interpolate(inputs, scale_factor=2.0, mode="nearest") + + if self.use_conv: + outputs = self.conv(outputs) + + return outputs + + +class Downsample1D(nn.Module): + """A 1D downsampling layer with an optional convolution. + + Parameters: + channels (`int`): + number of channels in the inputs and outputs. + use_conv (`bool`, default `False`): + option to use a convolution. + out_channels (`int`, optional): + number of output channels. Defaults to `channels`. + padding (`int`, default `1`): + padding for the convolution. + """ + + def __init__(self, channels, use_conv=False, out_channels=None, padding=1, name="conv"): + super().__init__() + self.channels = channels + self.out_channels = out_channels or channels + self.use_conv = use_conv + self.padding = padding + stride = 2 + self.name = name + + if use_conv: + self.conv = nn.Conv1d(self.channels, self.out_channels, 3, stride=stride, padding=padding) + else: + assert self.channels == self.out_channels + self.conv = nn.AvgPool1d(kernel_size=stride, stride=stride) + + def forward(self, inputs): + assert inputs.shape[1] == self.channels + return self.conv(inputs) + + +class Upsample2D(nn.Module): + """A 2D upsampling layer with an optional convolution. + + Parameters: + channels (`int`): + number of channels in the inputs and outputs. + use_conv (`bool`, default `False`): + option to use a convolution. + use_conv_transpose (`bool`, default `False`): + option to use a convolution transpose. + out_channels (`int`, optional): + number of output channels. Defaults to `channels`. + """ + + def __init__(self, channels, use_conv=False, use_conv_transpose=False, out_channels=None, name="conv"): + super().__init__() + self.channels = channels + self.out_channels = out_channels or channels + self.use_conv = use_conv + self.use_conv_transpose = use_conv_transpose + self.name = name + + conv = None + if use_conv_transpose: + conv = nn.ConvTranspose2d(channels, self.out_channels, 4, 2, 1) + elif use_conv: + conv = nn.Conv2d(self.channels, self.out_channels, 3, padding=1) + + # TODO(Suraj, Patrick) - clean up after weight dicts are correctly renamed + if name == "conv": + self.conv = conv + else: + self.Conv2d_0 = conv + + def forward(self, hidden_states, output_size=None): + assert hidden_states.shape[1] == self.channels + + if self.use_conv_transpose: + return self.conv(hidden_states) + + # Cast to float32 to as 'upsample_nearest2d_out_frame' op does not support bfloat16 + # TODO(Suraj): Remove this cast once the issue is fixed in PyTorch + # https://github.com/pytorch/pytorch/issues/86679 + dtype = hidden_states.dtype + if dtype == torch.bfloat16: + hidden_states = hidden_states.to(torch.float32) + + # upsample_nearest_nhwc fails with large batch sizes. 
see https://github.com/huggingface/diffusers/issues/984 + if hidden_states.shape[0] >= 64: + hidden_states = hidden_states.contiguous() + + # if `output_size` is passed we force the interpolation output + # size and do not make use of `scale_factor=2` + if output_size is None: + hidden_states = F.interpolate(hidden_states, scale_factor=2.0, mode="nearest") + else: + hidden_states = F.interpolate(hidden_states, size=output_size, mode="nearest") + + # If the input is bfloat16, we cast back to bfloat16 + if dtype == torch.bfloat16: + hidden_states = hidden_states.to(dtype) + + # TODO(Suraj, Patrick) - clean up after weight dicts are correctly renamed + if self.use_conv: + if self.name == "conv": + hidden_states = self.conv(hidden_states) + else: + hidden_states = self.Conv2d_0(hidden_states) + + return hidden_states + + +class Downsample2D(nn.Module): + """A 2D downsampling layer with an optional convolution. + + Parameters: + channels (`int`): + number of channels in the inputs and outputs. + use_conv (`bool`, default `False`): + option to use a convolution. + out_channels (`int`, optional): + number of output channels. Defaults to `channels`. + padding (`int`, default `1`): + padding for the convolution. + """ + + def __init__(self, channels, use_conv=False, out_channels=None, padding=1, name="conv"): + super().__init__() + self.channels = channels + self.out_channels = out_channels or channels + self.use_conv = use_conv + self.padding = padding + stride = 2 + self.name = name + + if use_conv: + conv = nn.Conv2d(self.channels, self.out_channels, 3, stride=stride, padding=padding) + else: + assert self.channels == self.out_channels + conv = nn.AvgPool2d(kernel_size=stride, stride=stride) + + # TODO(Suraj, Patrick) - clean up after weight dicts are correctly renamed + if name == "conv": + self.Conv2d_0 = conv + self.conv = conv + elif name == "Conv2d_0": + self.conv = conv + else: + self.conv = conv + + def forward(self, hidden_states): + assert hidden_states.shape[1] == self.channels + if self.use_conv and self.padding == 0: + pad = (0, 1, 0, 1) + hidden_states = F.pad(hidden_states, pad, mode="constant", value=0) + + assert hidden_states.shape[1] == self.channels + hidden_states = self.conv(hidden_states) + + return hidden_states + + +class FirUpsample2D(nn.Module): + """A 2D FIR upsampling layer with an optional convolution. + + Parameters: + channels (`int`): + number of channels in the inputs and outputs. + use_conv (`bool`, default `False`): + option to use a convolution. + out_channels (`int`, optional): + number of output channels. Defaults to `channels`. + fir_kernel (`tuple`, default `(1, 3, 3, 1)`): + kernel for the FIR filter. + """ + + def __init__(self, channels=None, out_channels=None, use_conv=False, fir_kernel=(1, 3, 3, 1)): + super().__init__() + out_channels = out_channels if out_channels else channels + if use_conv: + self.Conv2d_0 = nn.Conv2d(channels, out_channels, kernel_size=3, stride=1, padding=1) + self.use_conv = use_conv + self.fir_kernel = fir_kernel + self.out_channels = out_channels + + def _upsample_2d(self, hidden_states, weight=None, kernel=None, factor=2, gain=1): + """Fused `upsample_2d()` followed by `Conv2d()`. + + Padding is performed only once at the beginning, not between the operations. The fused op is considerably more + efficient than performing the same calculation using standard TensorFlow ops. It supports gradients of + arbitrary order. + + Args: + hidden_states: Input tensor of the shape `[N, C, H, W]` or `[N, H, W, C]`. 
+ weight: Weight tensor of the shape `[filterH, filterW, inChannels, + outChannels]`. Grouped convolution can be performed by `inChannels = x.shape[0] // numGroups`. + kernel: FIR filter of the shape `[firH, firW]` or `[firN]` + (separable). The default is `[1] * factor`, which corresponds to nearest-neighbor upsampling. + factor: Integer upsampling factor (default: 2). + gain: Scaling factor for signal magnitude (default: 1.0). + + Returns: + output: Tensor of the shape `[N, C, H * factor, W * factor]` or `[N, H * factor, W * factor, C]`, and same + datatype as `hidden_states`. + """ + + assert isinstance(factor, int) and factor >= 1 + + # Setup filter kernel. + if kernel is None: + kernel = [1] * factor + + # setup kernel + kernel = torch.tensor(kernel, dtype=torch.float32) + if kernel.ndim == 1: + kernel = torch.outer(kernel, kernel) + kernel /= torch.sum(kernel) + + kernel = kernel * (gain * (factor**2)) + + if self.use_conv: + convH = weight.shape[2] + convW = weight.shape[3] + inC = weight.shape[1] + + pad_value = (kernel.shape[0] - factor) - (convW - 1) + + stride = (factor, factor) + # Determine data dimensions. + output_shape = ( + (hidden_states.shape[2] - 1) * factor + convH, + (hidden_states.shape[3] - 1) * factor + convW, + ) + output_padding = ( + output_shape[0] - (hidden_states.shape[2] - 1) * stride[0] - convH, + output_shape[1] - (hidden_states.shape[3] - 1) * stride[1] - convW, + ) + assert output_padding[0] >= 0 and output_padding[1] >= 0 + num_groups = hidden_states.shape[1] // inC + + # Transpose weights. + weight = torch.reshape(weight, (num_groups, -1, inC, convH, convW)) + weight = torch.flip(weight, dims=[3, 4]).permute(0, 2, 1, 3, 4) + weight = torch.reshape(weight, (num_groups * inC, -1, convH, convW)) + + inverse_conv = F.conv_transpose2d( + hidden_states, weight, stride=stride, output_padding=output_padding, padding=0 + ) + + output = upfirdn2d_native( + inverse_conv, + torch.tensor(kernel, device=inverse_conv.device), + pad=((pad_value + 1) // 2 + factor - 1, pad_value // 2 + 1), + ) + else: + pad_value = kernel.shape[0] - factor + output = upfirdn2d_native( + hidden_states, + torch.tensor(kernel, device=hidden_states.device), + up=factor, + pad=((pad_value + 1) // 2 + factor - 1, pad_value // 2), + ) + + return output + + def forward(self, hidden_states): + if self.use_conv: + height = self._upsample_2d(hidden_states, self.Conv2d_0.weight, kernel=self.fir_kernel) + height = height + self.Conv2d_0.bias.reshape(1, -1, 1, 1) + else: + height = self._upsample_2d(hidden_states, kernel=self.fir_kernel, factor=2) + + return height + + +class FirDownsample2D(nn.Module): + """A 2D FIR downsampling layer with an optional convolution. + + Parameters: + channels (`int`): + number of channels in the inputs and outputs. + use_conv (`bool`, default `False`): + option to use a convolution. + out_channels (`int`, optional): + number of output channels. Defaults to `channels`. + fir_kernel (`tuple`, default `(1, 3, 3, 1)`): + kernel for the FIR filter. + """ + + def __init__(self, channels=None, out_channels=None, use_conv=False, fir_kernel=(1, 3, 3, 1)): + super().__init__() + out_channels = out_channels if out_channels else channels + if use_conv: + self.Conv2d_0 = nn.Conv2d(channels, out_channels, kernel_size=3, stride=1, padding=1) + self.fir_kernel = fir_kernel + self.use_conv = use_conv + self.out_channels = out_channels + + def _downsample_2d(self, hidden_states, weight=None, kernel=None, factor=2, gain=1): + """Fused `Conv2d()` followed by `downsample_2d()`. 
+ Padding is performed only once at the beginning, not between the operations. The fused op is considerably more + efficient than performing the same calculation using standard TensorFlow ops. It supports gradients of + arbitrary order. + + Args: + hidden_states: Input tensor of the shape `[N, C, H, W]` or `[N, H, W, C]`. + weight: + Weight tensor of the shape `[filterH, filterW, inChannels, outChannels]`. Grouped convolution can be + performed by `inChannels = x.shape[0] // numGroups`. + kernel: FIR filter of the shape `[firH, firW]` or `[firN]` (separable). The default is `[1] * + factor`, which corresponds to average pooling. + factor: Integer downsampling factor (default: 2). + gain: Scaling factor for signal magnitude (default: 1.0). + + Returns: + output: Tensor of the shape `[N, C, H // factor, W // factor]` or `[N, H // factor, W // factor, C]`, and + same datatype as `x`. + """ + + assert isinstance(factor, int) and factor >= 1 + if kernel is None: + kernel = [1] * factor + + # setup kernel + kernel = torch.tensor(kernel, dtype=torch.float32) + if kernel.ndim == 1: + kernel = torch.outer(kernel, kernel) + kernel /= torch.sum(kernel) + + kernel = kernel * gain + + if self.use_conv: + _, _, convH, convW = weight.shape + pad_value = (kernel.shape[0] - factor) + (convW - 1) + stride_value = [factor, factor] + upfirdn_input = upfirdn2d_native( + hidden_states, + torch.tensor(kernel, device=hidden_states.device), + pad=((pad_value + 1) // 2, pad_value // 2), + ) + output = F.conv2d(upfirdn_input, weight, stride=stride_value, padding=0) + else: + pad_value = kernel.shape[0] - factor + output = upfirdn2d_native( + hidden_states, + torch.tensor(kernel, device=hidden_states.device), + down=factor, + pad=((pad_value + 1) // 2, pad_value // 2), + ) + + return output + + def forward(self, hidden_states): + if self.use_conv: + downsample_input = self._downsample_2d(hidden_states, weight=self.Conv2d_0.weight, kernel=self.fir_kernel) + hidden_states = downsample_input + self.Conv2d_0.bias.reshape(1, -1, 1, 1) + else: + hidden_states = self._downsample_2d(hidden_states, kernel=self.fir_kernel, factor=2) + + return hidden_states + + +# downsample/upsample layer used in k-upscaler, might be able to use FirDownsample2D/DirUpsample2D instead +class KDownsample2D(nn.Module): + def __init__(self, pad_mode="reflect"): + super().__init__() + self.pad_mode = pad_mode + kernel_1d = torch.tensor([[1 / 8, 3 / 8, 3 / 8, 1 / 8]]) + self.pad = kernel_1d.shape[1] // 2 - 1 + self.register_buffer("kernel", kernel_1d.T @ kernel_1d, persistent=False) + + def forward(self, inputs): + inputs = F.pad(inputs, (self.pad,) * 4, self.pad_mode) + weight = inputs.new_zeros([inputs.shape[1], inputs.shape[1], self.kernel.shape[0], self.kernel.shape[1]]) + indices = torch.arange(inputs.shape[1], device=inputs.device) + kernel = self.kernel.to(weight)[None, :].expand(inputs.shape[1], -1, -1) + weight[indices, indices] = kernel + return F.conv2d(inputs, weight, stride=2) + + +class KUpsample2D(nn.Module): + def __init__(self, pad_mode="reflect"): + super().__init__() + self.pad_mode = pad_mode + kernel_1d = torch.tensor([[1 / 8, 3 / 8, 3 / 8, 1 / 8]]) * 2 + self.pad = kernel_1d.shape[1] // 2 - 1 + self.register_buffer("kernel", kernel_1d.T @ kernel_1d, persistent=False) + + def forward(self, inputs): + inputs = F.pad(inputs, ((self.pad + 1) // 2,) * 4, self.pad_mode) + weight = inputs.new_zeros([inputs.shape[1], inputs.shape[1], self.kernel.shape[0], self.kernel.shape[1]]) + indices = torch.arange(inputs.shape[1], 
device=inputs.device) + kernel = self.kernel.to(weight)[None, :].expand(inputs.shape[1], -1, -1) + weight[indices, indices] = kernel + return F.conv_transpose2d(inputs, weight, stride=2, padding=self.pad * 2 + 1) + + +class ResnetBlock2D(nn.Module): + r""" + A Resnet block. + + Parameters: + in_channels (`int`): The number of channels in the input. + out_channels (`int`, *optional*, default to be `None`): + The number of output channels for the first conv2d layer. If None, same as `in_channels`. + dropout (`float`, *optional*, defaults to `0.0`): The dropout probability to use. + temb_channels (`int`, *optional*, default to `512`): the number of channels in timestep embedding. + groups (`int`, *optional*, default to `32`): The number of groups to use for the first normalization layer. + groups_out (`int`, *optional*, default to None): + The number of groups to use for the second normalization layer. if set to None, same as `groups`. + eps (`float`, *optional*, defaults to `1e-6`): The epsilon to use for the normalization. + non_linearity (`str`, *optional*, default to `"swish"`): the activation function to use. + time_embedding_norm (`str`, *optional*, default to `"default"` ): Time scale shift config. + By default, apply timestep embedding conditioning with a simple shift mechanism. Choose "scale_shift" or + "ada_group" for a stronger conditioning with scale and shift. + kernel (`torch.FloatTensor`, optional, default to None): FIR filter, see + [`~models.resnet.FirUpsample2D`] and [`~models.resnet.FirDownsample2D`]. + output_scale_factor (`float`, *optional*, default to be `1.0`): the scale factor to use for the output. + use_in_shortcut (`bool`, *optional*, default to `True`): + If `True`, add a 1x1 nn.conv2d layer for skip-connection. + up (`bool`, *optional*, default to `False`): If `True`, add an upsample layer. + down (`bool`, *optional*, default to `False`): If `True`, add a downsample layer. + conv_shortcut_bias (`bool`, *optional*, default to `True`): If `True`, adds a learnable bias to the + `conv_shortcut` output. + conv_2d_out_channels (`int`, *optional*, default to `None`): the number of channels in the output. + If None, same as `out_channels`. 
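+
+ Note (Rich-Text modification): unlike the upstream diffusers block, `forward` takes an extra
+ `inject_states` argument and returns a `(output_tensor, hidden_states)` tuple, so a caller can
+ keep the residual features from one denoising pass and inject them on a later one. A minimal,
+ illustrative sketch (tensor shapes and channel counts are assumed, not part of this diff):
+
+ block = ResnetBlock2D(in_channels=64, temb_channels=512)
+ x, temb = torch.randn(1, 64, 32, 32), torch.randn(1, 512)
+ out, feats = block(x, temb)                       # plain pass, keep the hidden features
+ out_inj, _ = block(x, temb, inject_states=feats)  # later pass, re-inject them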
+ """ + + def __init__( + self, + *, + in_channels, + out_channels=None, + conv_shortcut=False, + dropout=0.0, + temb_channels=512, + groups=32, + groups_out=None, + pre_norm=True, + eps=1e-6, + non_linearity="swish", + skip_time_act=False, + time_embedding_norm="default", # default, scale_shift, ada_group, spatial + kernel=None, + output_scale_factor=1.0, + use_in_shortcut=None, + up=False, + down=False, + conv_shortcut_bias: bool = True, + conv_2d_out_channels: Optional[int] = None, + ): + super().__init__() + self.pre_norm = pre_norm + self.pre_norm = True + self.in_channels = in_channels + out_channels = in_channels if out_channels is None else out_channels + self.out_channels = out_channels + self.use_conv_shortcut = conv_shortcut + self.up = up + self.down = down + self.output_scale_factor = output_scale_factor + self.time_embedding_norm = time_embedding_norm + self.skip_time_act = skip_time_act + + if groups_out is None: + groups_out = groups + + if self.time_embedding_norm == "ada_group": + self.norm1 = AdaGroupNorm(temb_channels, in_channels, groups, eps=eps) + elif self.time_embedding_norm == "spatial": + self.norm1 = SpatialNorm(in_channels, temb_channels) + else: + self.norm1 = torch.nn.GroupNorm(num_groups=groups, num_channels=in_channels, eps=eps, affine=True) + + self.conv1 = torch.nn.Conv2d(in_channels, out_channels, kernel_size=3, stride=1, padding=1) + + if temb_channels is not None: + if self.time_embedding_norm == "default": + self.time_emb_proj = torch.nn.Linear(temb_channels, out_channels) + elif self.time_embedding_norm == "scale_shift": + self.time_emb_proj = torch.nn.Linear(temb_channels, 2 * out_channels) + elif self.time_embedding_norm == "ada_group" or self.time_embedding_norm == "spatial": + self.time_emb_proj = None + else: + raise ValueError(f"unknown time_embedding_norm : {self.time_embedding_norm} ") + else: + self.time_emb_proj = None + + if self.time_embedding_norm == "ada_group": + self.norm2 = AdaGroupNorm(temb_channels, out_channels, groups_out, eps=eps) + elif self.time_embedding_norm == "spatial": + self.norm2 = SpatialNorm(out_channels, temb_channels) + else: + self.norm2 = torch.nn.GroupNorm(num_groups=groups_out, num_channels=out_channels, eps=eps, affine=True) + + self.dropout = torch.nn.Dropout(dropout) + conv_2d_out_channels = conv_2d_out_channels or out_channels + self.conv2 = torch.nn.Conv2d(out_channels, conv_2d_out_channels, kernel_size=3, stride=1, padding=1) + + self.nonlinearity = get_activation(non_linearity) + + self.upsample = self.downsample = None + if self.up: + if kernel == "fir": + fir_kernel = (1, 3, 3, 1) + self.upsample = lambda x: upsample_2d(x, kernel=fir_kernel) + elif kernel == "sde_vp": + self.upsample = partial(F.interpolate, scale_factor=2.0, mode="nearest") + else: + self.upsample = Upsample2D(in_channels, use_conv=False) + elif self.down: + if kernel == "fir": + fir_kernel = (1, 3, 3, 1) + self.downsample = lambda x: downsample_2d(x, kernel=fir_kernel) + elif kernel == "sde_vp": + self.downsample = partial(F.avg_pool2d, kernel_size=2, stride=2) + else: + self.downsample = Downsample2D(in_channels, use_conv=False, padding=1, name="op") + + self.use_in_shortcut = self.in_channels != conv_2d_out_channels if use_in_shortcut is None else use_in_shortcut + + self.conv_shortcut = None + if self.use_in_shortcut: + self.conv_shortcut = torch.nn.Conv2d( + in_channels, conv_2d_out_channels, kernel_size=1, stride=1, padding=0, bias=conv_shortcut_bias + ) + + # Rich-Text: feature injection + def forward(self, input_tensor, temb, 
inject_states=None): + hidden_states = input_tensor + + if self.time_embedding_norm == "ada_group" or self.time_embedding_norm == "spatial": + hidden_states = self.norm1(hidden_states, temb) + else: + hidden_states = self.norm1(hidden_states) + + hidden_states = self.nonlinearity(hidden_states) + + if self.upsample is not None: + # upsample_nearest_nhwc fails with large batch sizes. see https://github.com/huggingface/diffusers/issues/984 + if hidden_states.shape[0] >= 64: + input_tensor = input_tensor.contiguous() + hidden_states = hidden_states.contiguous() + input_tensor = self.upsample(input_tensor) + hidden_states = self.upsample(hidden_states) + elif self.downsample is not None: + input_tensor = self.downsample(input_tensor) + hidden_states = self.downsample(hidden_states) + + hidden_states = self.conv1(hidden_states) + + if self.time_emb_proj is not None: + if not self.skip_time_act: + temb = self.nonlinearity(temb) + temb = self.time_emb_proj(temb)[:, :, None, None] + + if temb is not None and self.time_embedding_norm == "default": + hidden_states = hidden_states + temb + + if self.time_embedding_norm == "ada_group" or self.time_embedding_norm == "spatial": + hidden_states = self.norm2(hidden_states, temb) + else: + hidden_states = self.norm2(hidden_states) + + if temb is not None and self.time_embedding_norm == "scale_shift": + scale, shift = torch.chunk(temb, 2, dim=1) + hidden_states = hidden_states * (1 + scale) + shift + + hidden_states = self.nonlinearity(hidden_states) + + hidden_states = self.dropout(hidden_states) + hidden_states = self.conv2(hidden_states) + + if self.conv_shortcut is not None: + input_tensor = self.conv_shortcut(input_tensor) + + # Rich-Text: feature injection + if inject_states is not None: + output_tensor = (input_tensor + inject_states) / self.output_scale_factor + else: + output_tensor = (input_tensor + hidden_states) / self.output_scale_factor + + return output_tensor, hidden_states + + +# unet_rl.py +def rearrange_dims(tensor): + if len(tensor.shape) == 2: + return tensor[:, :, None] + if len(tensor.shape) == 3: + return tensor[:, :, None, :] + elif len(tensor.shape) == 4: + return tensor[:, :, 0, :] + else: + raise ValueError(f"`len(tensor)`: {len(tensor)} has to be 2, 3 or 4.") + + +class Conv1dBlock(nn.Module): + """ + Conv1d --> GroupNorm --> Mish + """ + + def __init__(self, inp_channels, out_channels, kernel_size, n_groups=8): + super().__init__() + + self.conv1d = nn.Conv1d(inp_channels, out_channels, kernel_size, padding=kernel_size // 2) + self.group_norm = nn.GroupNorm(n_groups, out_channels) + self.mish = nn.Mish() + + def forward(self, inputs): + intermediate_repr = self.conv1d(inputs) + intermediate_repr = rearrange_dims(intermediate_repr) + intermediate_repr = self.group_norm(intermediate_repr) + intermediate_repr = rearrange_dims(intermediate_repr) + output = self.mish(intermediate_repr) + return output + + +# unet_rl.py +class ResidualTemporalBlock1D(nn.Module): + def __init__(self, inp_channels, out_channels, embed_dim, kernel_size=5): + super().__init__() + self.conv_in = Conv1dBlock(inp_channels, out_channels, kernel_size) + self.conv_out = Conv1dBlock(out_channels, out_channels, kernel_size) + + self.time_emb_act = nn.Mish() + self.time_emb = nn.Linear(embed_dim, out_channels) + + self.residual_conv = ( + nn.Conv1d(inp_channels, out_channels, 1) if inp_channels != out_channels else nn.Identity() + ) + + def forward(self, inputs, t): + """ + Args: + inputs : [ batch_size x inp_channels x horizon ] + t : [ batch_size x embed_dim ] 
+ + returns: + out : [ batch_size x out_channels x horizon ] + """ + t = self.time_emb_act(t) + t = self.time_emb(t) + out = self.conv_in(inputs) + rearrange_dims(t) + out = self.conv_out(out) + return out + self.residual_conv(inputs) + + +def upsample_2d(hidden_states, kernel=None, factor=2, gain=1): + r"""Upsample2D a batch of 2D images with the given filter. + Accepts a batch of 2D images of the shape `[N, C, H, W]` or `[N, H, W, C]` and upsamples each image with the given + filter. The filter is normalized so that if the input pixels are constant, they will be scaled by the specified + `gain`. Pixels outside the image are assumed to be zero, and the filter is padded with zeros so that its shape is + a: multiple of the upsampling factor. + + Args: + hidden_states: Input tensor of the shape `[N, C, H, W]` or `[N, H, W, C]`. + kernel: FIR filter of the shape `[firH, firW]` or `[firN]` + (separable). The default is `[1] * factor`, which corresponds to nearest-neighbor upsampling. + factor: Integer upsampling factor (default: 2). + gain: Scaling factor for signal magnitude (default: 1.0). + + Returns: + output: Tensor of the shape `[N, C, H * factor, W * factor]` + """ + assert isinstance(factor, int) and factor >= 1 + if kernel is None: + kernel = [1] * factor + + kernel = torch.tensor(kernel, dtype=torch.float32) + if kernel.ndim == 1: + kernel = torch.outer(kernel, kernel) + kernel /= torch.sum(kernel) + + kernel = kernel * (gain * (factor**2)) + pad_value = kernel.shape[0] - factor + output = upfirdn2d_native( + hidden_states, + kernel.to(device=hidden_states.device), + up=factor, + pad=((pad_value + 1) // 2 + factor - 1, pad_value // 2), + ) + return output + + +def downsample_2d(hidden_states, kernel=None, factor=2, gain=1): + r"""Downsample2D a batch of 2D images with the given filter. + Accepts a batch of 2D images of the shape `[N, C, H, W]` or `[N, H, W, C]` and downsamples each image with the + given filter. The filter is normalized so that if the input pixels are constant, they will be scaled by the + specified `gain`. Pixels outside the image are assumed to be zero, and the filter is padded with zeros so that its + shape is a multiple of the downsampling factor. + + Args: + hidden_states: Input tensor of the shape `[N, C, H, W]` or `[N, H, W, C]`. + kernel: FIR filter of the shape `[firH, firW]` or `[firN]` + (separable). The default is `[1] * factor`, which corresponds to average pooling. + factor: Integer downsampling factor (default: 2). + gain: Scaling factor for signal magnitude (default: 1.0). 
+ + Returns: + output: Tensor of the shape `[N, C, H // factor, W // factor]` + """ + + assert isinstance(factor, int) and factor >= 1 + if kernel is None: + kernel = [1] * factor + + kernel = torch.tensor(kernel, dtype=torch.float32) + if kernel.ndim == 1: + kernel = torch.outer(kernel, kernel) + kernel /= torch.sum(kernel) + + kernel = kernel * gain + pad_value = kernel.shape[0] - factor + output = upfirdn2d_native( + hidden_states, kernel.to(device=hidden_states.device), down=factor, pad=((pad_value + 1) // 2, pad_value // 2) + ) + return output + + +def upfirdn2d_native(tensor, kernel, up=1, down=1, pad=(0, 0)): + up_x = up_y = up + down_x = down_y = down + pad_x0 = pad_y0 = pad[0] + pad_x1 = pad_y1 = pad[1] + + _, channel, in_h, in_w = tensor.shape + tensor = tensor.reshape(-1, in_h, in_w, 1) + + _, in_h, in_w, minor = tensor.shape + kernel_h, kernel_w = kernel.shape + + out = tensor.view(-1, in_h, 1, in_w, 1, minor) + out = F.pad(out, [0, 0, 0, up_x - 1, 0, 0, 0, up_y - 1]) + out = out.view(-1, in_h * up_y, in_w * up_x, minor) + + out = F.pad(out, [0, 0, max(pad_x0, 0), max(pad_x1, 0), max(pad_y0, 0), max(pad_y1, 0)]) + out = out.to(tensor.device) # Move back to mps if necessary + out = out[ + :, + max(-pad_y0, 0) : out.shape[1] - max(-pad_y1, 0), + max(-pad_x0, 0) : out.shape[2] - max(-pad_x1, 0), + :, + ] + + out = out.permute(0, 3, 1, 2) + out = out.reshape([-1, 1, in_h * up_y + pad_y0 + pad_y1, in_w * up_x + pad_x0 + pad_x1]) + w = torch.flip(kernel, [0, 1]).view(1, 1, kernel_h, kernel_w) + out = F.conv2d(out, w) + out = out.reshape( + -1, + minor, + in_h * up_y + pad_y0 + pad_y1 - kernel_h + 1, + in_w * up_x + pad_x0 + pad_x1 - kernel_w + 1, + ) + out = out.permute(0, 2, 3, 1) + out = out[:, ::down_y, ::down_x, :] + + out_h = (in_h * up_y + pad_y0 + pad_y1 - kernel_h) // down_y + 1 + out_w = (in_w * up_x + pad_x0 + pad_x1 - kernel_w) // down_x + 1 + + return out.view(-1, channel, out_h, out_w) + + +class TemporalConvLayer(nn.Module): + """ + Temporal convolutional layer that can be used for video (sequence of images) input Code mostly copied from: + https://github.com/modelscope/modelscope/blob/1509fdb973e5871f37148a4b5e5964cafd43e64d/modelscope/models/multi_modal/video_synthesis/unet_sd.py#L1016 + """ + + def __init__(self, in_dim, out_dim=None, dropout=0.0): + super().__init__() + out_dim = out_dim or in_dim + self.in_dim = in_dim + self.out_dim = out_dim + + # conv layers + self.conv1 = nn.Sequential( + nn.GroupNorm(32, in_dim), nn.SiLU(), nn.Conv3d(in_dim, out_dim, (3, 1, 1), padding=(1, 0, 0)) + ) + self.conv2 = nn.Sequential( + nn.GroupNorm(32, out_dim), + nn.SiLU(), + nn.Dropout(dropout), + nn.Conv3d(out_dim, in_dim, (3, 1, 1), padding=(1, 0, 0)), + ) + self.conv3 = nn.Sequential( + nn.GroupNorm(32, out_dim), + nn.SiLU(), + nn.Dropout(dropout), + nn.Conv3d(out_dim, in_dim, (3, 1, 1), padding=(1, 0, 0)), + ) + self.conv4 = nn.Sequential( + nn.GroupNorm(32, out_dim), + nn.SiLU(), + nn.Dropout(dropout), + nn.Conv3d(out_dim, in_dim, (3, 1, 1), padding=(1, 0, 0)), + ) + + # zero out the last layer params,so the conv block is identity + nn.init.zeros_(self.conv4[-1].weight) + nn.init.zeros_(self.conv4[-1].bias) + + def forward(self, hidden_states, num_frames=1): + hidden_states = ( + hidden_states[None, :].reshape((-1, num_frames) + hidden_states.shape[1:]).permute(0, 2, 1, 3, 4) + ) + + identity = hidden_states + hidden_states = self.conv1(hidden_states) + hidden_states = self.conv2(hidden_states) + hidden_states = self.conv3(hidden_states) + hidden_states = 
self.conv4(hidden_states) + + hidden_states = identity + hidden_states + + hidden_states = hidden_states.permute(0, 2, 1, 3, 4).reshape( + (hidden_states.shape[0] * hidden_states.shape[2], -1) + hidden_states.shape[3:] + ) + return hidden_states diff --git a/gf/sd-webui-rich-text/scripts/models/transformer_2d.py b/gf/sd-webui-rich-text/scripts/models/transformer_2d.py new file mode 100644 index 0000000000000000000000000000000000000000..703987da00b7062acdbaf928ee0f768423514820 --- /dev/null +++ b/gf/sd-webui-rich-text/scripts/models/transformer_2d.py @@ -0,0 +1,341 @@ +# Copyright 2023 The HuggingFace Team. All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +from dataclasses import dataclass +from typing import Any, Dict, Optional + +import torch +import torch.nn.functional as F +from torch import nn + +from diffusers_official.configuration_utils import ConfigMixin, register_to_config +from diffusers_official.models.embeddings import ImagePositionalEmbeddings +from diffusers_official.utils import BaseOutput, deprecate +from diffusers_official.models.embeddings import PatchEmbed +from diffusers_official.models.modeling_utils import ModelMixin + +from scripts.models.attention import BasicTransformerBlock + +@dataclass +class Transformer2DModelOutput(BaseOutput): + """ + The output of [`Transformer2DModel`]. + + Args: + sample (`torch.FloatTensor` of shape `(batch_size, num_channels, height, width)` or `(batch size, num_vector_embeds - 1, num_latent_pixels)` if [`Transformer2DModel`] is discrete): + The hidden states output conditioned on the `encoder_hidden_states` input. If discrete, returns probability + distributions for the unnoised latent pixels. + """ + + sample: torch.FloatTensor + + +class Transformer2DModel(ModelMixin, ConfigMixin): + """ + A 2D Transformer model for image-like data. + + Parameters: + num_attention_heads (`int`, *optional*, defaults to 16): The number of heads to use for multi-head attention. + attention_head_dim (`int`, *optional*, defaults to 88): The number of channels in each head. + in_channels (`int`, *optional*): + The number of channels in the input and output (specify if the input is **continuous**). + num_layers (`int`, *optional*, defaults to 1): The number of layers of Transformer blocks to use. + dropout (`float`, *optional*, defaults to 0.0): The dropout probability to use. + cross_attention_dim (`int`, *optional*): The number of `encoder_hidden_states` dimensions to use. + sample_size (`int`, *optional*): The width of the latent images (specify if the input is **discrete**). + This is fixed during training since it is used to learn a number of position embeddings. + num_vector_embeds (`int`, *optional*): + The number of classes of the vector embeddings of the latent pixels (specify if the input is **discrete**). + Includes the class for the masked latent pixel. + activation_fn (`str`, *optional*, defaults to `"geglu"`): Activation function to use in feed-forward. 
+ num_embeds_ada_norm ( `int`, *optional*): + The number of diffusion steps used during training. Pass if at least one of the norm_layers is + `AdaLayerNorm`. This is fixed during training since it is used to learn a number of embeddings that are + added to the hidden states. + + During inference, you can denoise for up to but not more steps than `num_embeds_ada_norm`. + attention_bias (`bool`, *optional*): + Configure if the `TransformerBlocks` attention should contain a bias parameter. + """ + + @register_to_config + def __init__( + self, + num_attention_heads: int = 16, + attention_head_dim: int = 88, + in_channels: Optional[int] = None, + out_channels: Optional[int] = None, + num_layers: int = 1, + dropout: float = 0.0, + norm_num_groups: int = 32, + cross_attention_dim: Optional[int] = None, + attention_bias: bool = False, + sample_size: Optional[int] = None, + num_vector_embeds: Optional[int] = None, + patch_size: Optional[int] = None, + activation_fn: str = "geglu", + num_embeds_ada_norm: Optional[int] = None, + use_linear_projection: bool = False, + only_cross_attention: bool = False, + upcast_attention: bool = False, + norm_type: str = "layer_norm", + norm_elementwise_affine: bool = True, + ): + super().__init__() + self.use_linear_projection = use_linear_projection + self.num_attention_heads = num_attention_heads + self.attention_head_dim = attention_head_dim + inner_dim = num_attention_heads * attention_head_dim + + # 1. Transformer2DModel can process both standard continuous images of shape `(batch_size, num_channels, width, height)` as well as quantized image embeddings of shape `(batch_size, num_image_vectors)` + # Define whether input is continuous or discrete depending on configuration + self.is_input_continuous = (in_channels is not None) and (patch_size is None) + self.is_input_vectorized = num_vector_embeds is not None + self.is_input_patches = in_channels is not None and patch_size is not None + + if norm_type == "layer_norm" and num_embeds_ada_norm is not None: + deprecation_message = ( + f"The configuration file of this model: {self.__class__} is outdated. `norm_type` is either not set or" + " incorrectly set to `'layer_norm'`.Make sure to set `norm_type` to `'ada_norm'` in the config." + " Please make sure to update the config accordingly as leaving `norm_type` might led to incorrect" + " results in future versions. If you have downloaded this checkpoint from the Hugging Face Hub, it" + " would be very nice if you could open a Pull request for the `transformer/config.json` file" + ) + deprecate("norm_type!=num_embeds_ada_norm", "1.0.0", deprecation_message, standard_warn=False) + norm_type = "ada_norm" + + if self.is_input_continuous and self.is_input_vectorized: + raise ValueError( + f"Cannot define both `in_channels`: {in_channels} and `num_vector_embeds`: {num_vector_embeds}. Make" + " sure that either `in_channels` or `num_vector_embeds` is None." + ) + elif self.is_input_vectorized and self.is_input_patches: + raise ValueError( + f"Cannot define both `num_vector_embeds`: {num_vector_embeds} and `patch_size`: {patch_size}. Make" + " sure that either `num_vector_embeds` or `num_patches` is None." + ) + elif not self.is_input_continuous and not self.is_input_vectorized and not self.is_input_patches: + raise ValueError( + f"Has to define `in_channels`: {in_channels}, `num_vector_embeds`: {num_vector_embeds}, or patch_size:" + f" {patch_size}. Make sure that `in_channels`, `num_vector_embeds` or `num_patches` is not None." + ) + + # 2. 
Define input layers + if self.is_input_continuous: + self.in_channels = in_channels + + self.norm = torch.nn.GroupNorm(num_groups=norm_num_groups, num_channels=in_channels, eps=1e-6, affine=True) + if use_linear_projection: + self.proj_in = nn.Linear(in_channels, inner_dim) + else: + self.proj_in = nn.Conv2d(in_channels, inner_dim, kernel_size=1, stride=1, padding=0) + elif self.is_input_vectorized: + assert sample_size is not None, "Transformer2DModel over discrete input must provide sample_size" + assert num_vector_embeds is not None, "Transformer2DModel over discrete input must provide num_embed" + + self.height = sample_size + self.width = sample_size + self.num_vector_embeds = num_vector_embeds + self.num_latent_pixels = self.height * self.width + + self.latent_image_embedding = ImagePositionalEmbeddings( + num_embed=num_vector_embeds, embed_dim=inner_dim, height=self.height, width=self.width + ) + elif self.is_input_patches: + assert sample_size is not None, "Transformer2DModel over patched input must provide sample_size" + + self.height = sample_size + self.width = sample_size + + self.patch_size = patch_size + self.pos_embed = PatchEmbed( + height=sample_size, + width=sample_size, + patch_size=patch_size, + in_channels=in_channels, + embed_dim=inner_dim, + ) + + # 3. Define transformers blocks + self.transformer_blocks = nn.ModuleList( + [ + BasicTransformerBlock( + inner_dim, + num_attention_heads, + attention_head_dim, + dropout=dropout, + cross_attention_dim=cross_attention_dim, + activation_fn=activation_fn, + num_embeds_ada_norm=num_embeds_ada_norm, + attention_bias=attention_bias, + only_cross_attention=only_cross_attention, + upcast_attention=upcast_attention, + norm_type=norm_type, + norm_elementwise_affine=norm_elementwise_affine, + ) + for d in range(num_layers) + ] + ) + + # 4. Define output layers + self.out_channels = in_channels if out_channels is None else out_channels + if self.is_input_continuous: + # TODO: should use out_channels for continuous projections + if use_linear_projection: + self.proj_out = nn.Linear(inner_dim, in_channels) + else: + self.proj_out = nn.Conv2d(inner_dim, in_channels, kernel_size=1, stride=1, padding=0) + elif self.is_input_vectorized: + self.norm_out = nn.LayerNorm(inner_dim) + self.out = nn.Linear(inner_dim, self.num_vector_embeds - 1) + elif self.is_input_patches: + self.norm_out = nn.LayerNorm(inner_dim, elementwise_affine=False, eps=1e-6) + self.proj_out_1 = nn.Linear(inner_dim, 2 * inner_dim) + self.proj_out_2 = nn.Linear(inner_dim, patch_size * patch_size * self.out_channels) + + def forward( + self, + hidden_states: torch.Tensor, + encoder_hidden_states: Optional[torch.Tensor] = None, + timestep: Optional[torch.LongTensor] = None, + class_labels: Optional[torch.LongTensor] = None, + cross_attention_kwargs: Dict[str, Any] = None, + attention_mask: Optional[torch.Tensor] = None, + encoder_attention_mask: Optional[torch.Tensor] = None, + return_dict: bool = True, + ): + """ + The [`Transformer2DModel`] forward method. + + Args: + hidden_states (`torch.LongTensor` of shape `(batch size, num latent pixels)` if discrete, `torch.FloatTensor` of shape `(batch size, channel, height, width)` if continuous): + Input `hidden_states`. + encoder_hidden_states ( `torch.FloatTensor` of shape `(batch size, sequence len, embed dims)`, *optional*): + Conditional embeddings for cross attention layer. If not given, cross-attention defaults to + self-attention. + timestep ( `torch.LongTensor`, *optional*): + Used to indicate denoising step. 
Optional timestep to be applied as an embedding in `AdaLayerNorm`. + class_labels ( `torch.LongTensor` of shape `(batch size, num classes)`, *optional*): + Used to indicate class labels conditioning. Optional class labels to be applied as an embedding in + `AdaLayerZeroNorm`. + encoder_attention_mask ( `torch.Tensor`, *optional*): + Cross-attention mask applied to `encoder_hidden_states`. Two formats supported: + + * Mask `(batch, sequence_length)` True = keep, False = discard. + * Bias `(batch, 1, sequence_length)` 0 = keep, -10000 = discard. + + If `ndim == 2`: will be interpreted as a mask, then converted into a bias consistent with the format + above. This bias will be added to the cross-attention scores. + return_dict (`bool`, *optional*, defaults to `True`): + Whether or not to return a [`~models.unet_2d_condition.UNet2DConditionOutput`] instead of a plain + tuple. + + Returns: + If `return_dict` is True, an [`~models.transformer_2d.Transformer2DModelOutput`] is returned, otherwise a + `tuple` where the first element is the sample tensor. + """ + # ensure attention_mask is a bias, and give it a singleton query_tokens dimension. + # we may have done this conversion already, e.g. if we came here via UNet2DConditionModel#forward. + # we can tell by counting dims; if ndim == 2: it's a mask rather than a bias. + # expects mask of shape: + # [batch, key_tokens] + # adds singleton query_tokens dimension: + # [batch, 1, key_tokens] + # this helps to broadcast it as a bias over attention scores, which will be in one of the following shapes: + # [batch, heads, query_tokens, key_tokens] (e.g. torch sdp attn) + # [batch * heads, query_tokens, key_tokens] (e.g. xformers or classic attn) + if attention_mask is not None and attention_mask.ndim == 2: + # assume that mask is expressed as: + # (1 = keep, 0 = discard) + # convert mask into a bias that can be added to attention scores: + # (keep = +0, discard = -10000.0) + attention_mask = (1 - attention_mask.to(hidden_states.dtype)) * -10000.0 + attention_mask = attention_mask.unsqueeze(1) + + # convert encoder_attention_mask to a bias the same way we do for attention_mask + if encoder_attention_mask is not None and encoder_attention_mask.ndim == 2: + encoder_attention_mask = (1 - encoder_attention_mask.to(hidden_states.dtype)) * -10000.0 + encoder_attention_mask = encoder_attention_mask.unsqueeze(1) + + # 1. Input + if self.is_input_continuous: + batch, _, height, width = hidden_states.shape + residual = hidden_states + + hidden_states = self.norm(hidden_states) + if not self.use_linear_projection: + hidden_states = self.proj_in(hidden_states) + inner_dim = hidden_states.shape[1] + hidden_states = hidden_states.permute(0, 2, 3, 1).reshape(batch, height * width, inner_dim) + else: + inner_dim = hidden_states.shape[1] + hidden_states = hidden_states.permute(0, 2, 3, 1).reshape(batch, height * width, inner_dim) + hidden_states = self.proj_in(hidden_states) + elif self.is_input_vectorized: + hidden_states = self.latent_image_embedding(hidden_states) + elif self.is_input_patches: + hidden_states = self.pos_embed(hidden_states) + + # 2. Blocks + for block in self.transformer_blocks: + hidden_states = block( + hidden_states, + attention_mask=attention_mask, + encoder_hidden_states=encoder_hidden_states, + encoder_attention_mask=encoder_attention_mask, + timestep=timestep, + cross_attention_kwargs=cross_attention_kwargs, + class_labels=class_labels, + ) + + # 3. 
Output + if self.is_input_continuous: + if not self.use_linear_projection: + hidden_states = hidden_states.reshape(batch, height, width, inner_dim).permute(0, 3, 1, 2).contiguous() + hidden_states = self.proj_out(hidden_states) + else: + hidden_states = self.proj_out(hidden_states) + hidden_states = hidden_states.reshape(batch, height, width, inner_dim).permute(0, 3, 1, 2).contiguous() + + output = hidden_states + residual + elif self.is_input_vectorized: + hidden_states = self.norm_out(hidden_states) + logits = self.out(hidden_states) + # (batch, self.num_vector_embeds - 1, self.num_latent_pixels) + logits = logits.permute(0, 2, 1) + + # log(p(x_0)) + output = F.log_softmax(logits.double(), dim=1).float() + elif self.is_input_patches: + # TODO: cleanup! + conditioning = self.transformer_blocks[0].norm1.emb( + timestep, class_labels, hidden_dtype=hidden_states.dtype + ) + shift, scale = self.proj_out_1(F.silu(conditioning)).chunk(2, dim=1) + hidden_states = self.norm_out(hidden_states) * (1 + scale[:, None]) + shift[:, None] + hidden_states = self.proj_out_2(hidden_states) + + # unpatchify + height = width = int(hidden_states.shape[1] ** 0.5) + hidden_states = hidden_states.reshape( + shape=(-1, height, width, self.patch_size, self.patch_size, self.out_channels) + ) + hidden_states = torch.einsum("nhwpqc->nchpwq", hidden_states) + output = hidden_states.reshape( + shape=(-1, self.out_channels, height * self.patch_size, width * self.patch_size) + ) + + if not return_dict: + return (output,) + + return Transformer2DModelOutput(sample=output) diff --git a/gf/sd-webui-rich-text/scripts/models/unet_2d_blocks.py b/gf/sd-webui-rich-text/scripts/models/unet_2d_blocks.py new file mode 100644 index 0000000000000000000000000000000000000000..e82868bfce623d847820866cbda716676036d60a --- /dev/null +++ b/gf/sd-webui-rich-text/scripts/models/unet_2d_blocks.py @@ -0,0 +1,3198 @@ +# Copyright 2023 The HuggingFace Team. All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
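+# Note: this module mirrors diffusers' `unet_2d_blocks.py`, but imports the Rich-Text variants of
+# `ResnetBlock2D` and `Transformer2DModel` from `scripts.models` (see the imports below), so the
+# attention and feature-injection hooks defined earlier can reach the modified blocks.
+# A minimal, illustrative call to the `get_down_block` factory defined below; the channel sizes,
+# head count and cross-attention dim are assumed, typical Stable Diffusion values, not part of
+# this diff:
+#
+#     block = get_down_block(
+#         "CrossAttnDownBlock2D",
+#         num_layers=2, in_channels=320, out_channels=640, temb_channels=1280,
+#         add_downsample=True, resnet_eps=1e-5, resnet_act_fn="silu",
+#         num_attention_heads=8, cross_attention_dim=768,
+#     )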
+from typing import Any, Dict, Optional, Tuple + +import numpy as np +import torch +import torch.nn.functional as F +from torch import nn + +from diffusers_official.utils import is_torch_version, logging +from diffusers_official.models.attention import AdaGroupNorm +from scripts.models.attention_processor import Attention, AttnAddedKVProcessor, AttnAddedKVProcessor2_0 +from scripts.models.dual_transformer_2d import DualTransformer2DModel +from scripts.models.resnet import Downsample2D, FirDownsample2D, FirUpsample2D, KDownsample2D, KUpsample2D, ResnetBlock2D, Upsample2D +from scripts.models.transformer_2d import Transformer2DModel + + +logger = logging.get_logger(__name__) # pylint: disable=invalid-name + + +def get_down_block( + down_block_type, + num_layers, + in_channels, + out_channels, + temb_channels, + add_downsample, + resnet_eps, + resnet_act_fn, + transformer_layers_per_block=1, + num_attention_heads=None, + resnet_groups=None, + cross_attention_dim=None, + downsample_padding=None, + dual_cross_attention=False, + use_linear_projection=False, + only_cross_attention=False, + upcast_attention=False, + resnet_time_scale_shift="default", + resnet_skip_time_act=False, + resnet_out_scale_factor=1.0, + cross_attention_norm=None, + attention_head_dim=None, + downsample_type=None, +): + # If attn head dim is not defined, we default it to the number of heads + if attention_head_dim is None: + logger.warn( + f"It is recommended to provide `attention_head_dim` when calling `get_down_block`. Defaulting `attention_head_dim` to {num_attention_heads}." + ) + attention_head_dim = num_attention_heads + + down_block_type = down_block_type[7:] if down_block_type.startswith("UNetRes") else down_block_type + if down_block_type == "DownBlock2D": + return DownBlock2D( + num_layers=num_layers, + in_channels=in_channels, + out_channels=out_channels, + temb_channels=temb_channels, + add_downsample=add_downsample, + resnet_eps=resnet_eps, + resnet_act_fn=resnet_act_fn, + resnet_groups=resnet_groups, + downsample_padding=downsample_padding, + resnet_time_scale_shift=resnet_time_scale_shift, + ) + elif down_block_type == "ResnetDownsampleBlock2D": + return ResnetDownsampleBlock2D( + num_layers=num_layers, + in_channels=in_channels, + out_channels=out_channels, + temb_channels=temb_channels, + add_downsample=add_downsample, + resnet_eps=resnet_eps, + resnet_act_fn=resnet_act_fn, + resnet_groups=resnet_groups, + resnet_time_scale_shift=resnet_time_scale_shift, + skip_time_act=resnet_skip_time_act, + output_scale_factor=resnet_out_scale_factor, + ) + elif down_block_type == "AttnDownBlock2D": + if add_downsample is False: + downsample_type = None + else: + downsample_type = downsample_type or "conv" # default to 'conv' + return AttnDownBlock2D( + num_layers=num_layers, + in_channels=in_channels, + out_channels=out_channels, + temb_channels=temb_channels, + resnet_eps=resnet_eps, + resnet_act_fn=resnet_act_fn, + resnet_groups=resnet_groups, + downsample_padding=downsample_padding, + attention_head_dim=attention_head_dim, + resnet_time_scale_shift=resnet_time_scale_shift, + downsample_type=downsample_type, + ) + elif down_block_type == "CrossAttnDownBlock2D": + if cross_attention_dim is None: + raise ValueError("cross_attention_dim must be specified for CrossAttnDownBlock2D") + return CrossAttnDownBlock2D( + num_layers=num_layers, + transformer_layers_per_block=transformer_layers_per_block, + in_channels=in_channels, + out_channels=out_channels, + temb_channels=temb_channels, + add_downsample=add_downsample, + 
resnet_eps=resnet_eps, + resnet_act_fn=resnet_act_fn, + resnet_groups=resnet_groups, + downsample_padding=downsample_padding, + cross_attention_dim=cross_attention_dim, + num_attention_heads=num_attention_heads, + dual_cross_attention=dual_cross_attention, + use_linear_projection=use_linear_projection, + only_cross_attention=only_cross_attention, + upcast_attention=upcast_attention, + resnet_time_scale_shift=resnet_time_scale_shift, + ) + elif down_block_type == "SimpleCrossAttnDownBlock2D": + if cross_attention_dim is None: + raise ValueError("cross_attention_dim must be specified for SimpleCrossAttnDownBlock2D") + return SimpleCrossAttnDownBlock2D( + num_layers=num_layers, + in_channels=in_channels, + out_channels=out_channels, + temb_channels=temb_channels, + add_downsample=add_downsample, + resnet_eps=resnet_eps, + resnet_act_fn=resnet_act_fn, + resnet_groups=resnet_groups, + cross_attention_dim=cross_attention_dim, + attention_head_dim=attention_head_dim, + resnet_time_scale_shift=resnet_time_scale_shift, + skip_time_act=resnet_skip_time_act, + output_scale_factor=resnet_out_scale_factor, + only_cross_attention=only_cross_attention, + cross_attention_norm=cross_attention_norm, + ) + elif down_block_type == "SkipDownBlock2D": + return SkipDownBlock2D( + num_layers=num_layers, + in_channels=in_channels, + out_channels=out_channels, + temb_channels=temb_channels, + add_downsample=add_downsample, + resnet_eps=resnet_eps, + resnet_act_fn=resnet_act_fn, + downsample_padding=downsample_padding, + resnet_time_scale_shift=resnet_time_scale_shift, + ) + elif down_block_type == "AttnSkipDownBlock2D": + return AttnSkipDownBlock2D( + num_layers=num_layers, + in_channels=in_channels, + out_channels=out_channels, + temb_channels=temb_channels, + add_downsample=add_downsample, + resnet_eps=resnet_eps, + resnet_act_fn=resnet_act_fn, + attention_head_dim=attention_head_dim, + resnet_time_scale_shift=resnet_time_scale_shift, + ) + elif down_block_type == "DownEncoderBlock2D": + return DownEncoderBlock2D( + num_layers=num_layers, + in_channels=in_channels, + out_channels=out_channels, + add_downsample=add_downsample, + resnet_eps=resnet_eps, + resnet_act_fn=resnet_act_fn, + resnet_groups=resnet_groups, + downsample_padding=downsample_padding, + resnet_time_scale_shift=resnet_time_scale_shift, + ) + elif down_block_type == "AttnDownEncoderBlock2D": + return AttnDownEncoderBlock2D( + num_layers=num_layers, + in_channels=in_channels, + out_channels=out_channels, + add_downsample=add_downsample, + resnet_eps=resnet_eps, + resnet_act_fn=resnet_act_fn, + resnet_groups=resnet_groups, + downsample_padding=downsample_padding, + attention_head_dim=attention_head_dim, + resnet_time_scale_shift=resnet_time_scale_shift, + ) + elif down_block_type == "KDownBlock2D": + return KDownBlock2D( + num_layers=num_layers, + in_channels=in_channels, + out_channels=out_channels, + temb_channels=temb_channels, + add_downsample=add_downsample, + resnet_eps=resnet_eps, + resnet_act_fn=resnet_act_fn, + ) + elif down_block_type == "KCrossAttnDownBlock2D": + return KCrossAttnDownBlock2D( + num_layers=num_layers, + in_channels=in_channels, + out_channels=out_channels, + temb_channels=temb_channels, + add_downsample=add_downsample, + resnet_eps=resnet_eps, + resnet_act_fn=resnet_act_fn, + cross_attention_dim=cross_attention_dim, + attention_head_dim=attention_head_dim, + add_self_attention=True if not add_downsample else False, + ) + raise ValueError(f"{down_block_type} does not exist.") + + +def get_up_block( + up_block_type, + 
num_layers, + in_channels, + out_channels, + prev_output_channel, + temb_channels, + add_upsample, + resnet_eps, + resnet_act_fn, + transformer_layers_per_block=1, + num_attention_heads=None, + resnet_groups=None, + cross_attention_dim=None, + dual_cross_attention=False, + use_linear_projection=False, + only_cross_attention=False, + upcast_attention=False, + resnet_time_scale_shift="default", + resnet_skip_time_act=False, + resnet_out_scale_factor=1.0, + cross_attention_norm=None, + attention_head_dim=None, + upsample_type=None, +): + # If attn head dim is not defined, we default it to the number of heads + if attention_head_dim is None: + logger.warn( + f"It is recommended to provide `attention_head_dim` when calling `get_up_block`. Defaulting `attention_head_dim` to {num_attention_heads}." + ) + attention_head_dim = num_attention_heads + + up_block_type = up_block_type[7:] if up_block_type.startswith("UNetRes") else up_block_type + if up_block_type == "UpBlock2D": + return UpBlock2D( + num_layers=num_layers, + in_channels=in_channels, + out_channels=out_channels, + prev_output_channel=prev_output_channel, + temb_channels=temb_channels, + add_upsample=add_upsample, + resnet_eps=resnet_eps, + resnet_act_fn=resnet_act_fn, + resnet_groups=resnet_groups, + resnet_time_scale_shift=resnet_time_scale_shift, + ) + elif up_block_type == "ResnetUpsampleBlock2D": + return ResnetUpsampleBlock2D( + num_layers=num_layers, + in_channels=in_channels, + out_channels=out_channels, + prev_output_channel=prev_output_channel, + temb_channels=temb_channels, + add_upsample=add_upsample, + resnet_eps=resnet_eps, + resnet_act_fn=resnet_act_fn, + resnet_groups=resnet_groups, + resnet_time_scale_shift=resnet_time_scale_shift, + skip_time_act=resnet_skip_time_act, + output_scale_factor=resnet_out_scale_factor, + ) + elif up_block_type == "CrossAttnUpBlock2D": + if cross_attention_dim is None: + raise ValueError("cross_attention_dim must be specified for CrossAttnUpBlock2D") + return CrossAttnUpBlock2D( + num_layers=num_layers, + transformer_layers_per_block=transformer_layers_per_block, + in_channels=in_channels, + out_channels=out_channels, + prev_output_channel=prev_output_channel, + temb_channels=temb_channels, + add_upsample=add_upsample, + resnet_eps=resnet_eps, + resnet_act_fn=resnet_act_fn, + resnet_groups=resnet_groups, + cross_attention_dim=cross_attention_dim, + num_attention_heads=num_attention_heads, + dual_cross_attention=dual_cross_attention, + use_linear_projection=use_linear_projection, + only_cross_attention=only_cross_attention, + upcast_attention=upcast_attention, + resnet_time_scale_shift=resnet_time_scale_shift, + ) + elif up_block_type == "SimpleCrossAttnUpBlock2D": + if cross_attention_dim is None: + raise ValueError("cross_attention_dim must be specified for SimpleCrossAttnUpBlock2D") + return SimpleCrossAttnUpBlock2D( + num_layers=num_layers, + in_channels=in_channels, + out_channels=out_channels, + prev_output_channel=prev_output_channel, + temb_channels=temb_channels, + add_upsample=add_upsample, + resnet_eps=resnet_eps, + resnet_act_fn=resnet_act_fn, + resnet_groups=resnet_groups, + cross_attention_dim=cross_attention_dim, + attention_head_dim=attention_head_dim, + resnet_time_scale_shift=resnet_time_scale_shift, + skip_time_act=resnet_skip_time_act, + output_scale_factor=resnet_out_scale_factor, + only_cross_attention=only_cross_attention, + cross_attention_norm=cross_attention_norm, + ) + elif up_block_type == "AttnUpBlock2D": + if add_upsample is False: + upsample_type = None + else: + 
upsample_type = upsample_type or "conv" # default to 'conv' + + return AttnUpBlock2D( + num_layers=num_layers, + in_channels=in_channels, + out_channels=out_channels, + prev_output_channel=prev_output_channel, + temb_channels=temb_channels, + resnet_eps=resnet_eps, + resnet_act_fn=resnet_act_fn, + resnet_groups=resnet_groups, + attention_head_dim=attention_head_dim, + resnet_time_scale_shift=resnet_time_scale_shift, + upsample_type=upsample_type, + ) + elif up_block_type == "SkipUpBlock2D": + return SkipUpBlock2D( + num_layers=num_layers, + in_channels=in_channels, + out_channels=out_channels, + prev_output_channel=prev_output_channel, + temb_channels=temb_channels, + add_upsample=add_upsample, + resnet_eps=resnet_eps, + resnet_act_fn=resnet_act_fn, + resnet_time_scale_shift=resnet_time_scale_shift, + ) + elif up_block_type == "AttnSkipUpBlock2D": + return AttnSkipUpBlock2D( + num_layers=num_layers, + in_channels=in_channels, + out_channels=out_channels, + prev_output_channel=prev_output_channel, + temb_channels=temb_channels, + add_upsample=add_upsample, + resnet_eps=resnet_eps, + resnet_act_fn=resnet_act_fn, + attention_head_dim=attention_head_dim, + resnet_time_scale_shift=resnet_time_scale_shift, + ) + elif up_block_type == "UpDecoderBlock2D": + return UpDecoderBlock2D( + num_layers=num_layers, + in_channels=in_channels, + out_channels=out_channels, + add_upsample=add_upsample, + resnet_eps=resnet_eps, + resnet_act_fn=resnet_act_fn, + resnet_groups=resnet_groups, + resnet_time_scale_shift=resnet_time_scale_shift, + temb_channels=temb_channels, + ) + elif up_block_type == "AttnUpDecoderBlock2D": + return AttnUpDecoderBlock2D( + num_layers=num_layers, + in_channels=in_channels, + out_channels=out_channels, + add_upsample=add_upsample, + resnet_eps=resnet_eps, + resnet_act_fn=resnet_act_fn, + resnet_groups=resnet_groups, + attention_head_dim=attention_head_dim, + resnet_time_scale_shift=resnet_time_scale_shift, + temb_channels=temb_channels, + ) + elif up_block_type == "KUpBlock2D": + return KUpBlock2D( + num_layers=num_layers, + in_channels=in_channels, + out_channels=out_channels, + temb_channels=temb_channels, + add_upsample=add_upsample, + resnet_eps=resnet_eps, + resnet_act_fn=resnet_act_fn, + ) + elif up_block_type == "KCrossAttnUpBlock2D": + return KCrossAttnUpBlock2D( + num_layers=num_layers, + in_channels=in_channels, + out_channels=out_channels, + temb_channels=temb_channels, + add_upsample=add_upsample, + resnet_eps=resnet_eps, + resnet_act_fn=resnet_act_fn, + cross_attention_dim=cross_attention_dim, + attention_head_dim=attention_head_dim, + ) + + raise ValueError(f"{up_block_type} does not exist.") + + +class UNetMidBlock2D(nn.Module): + def __init__( + self, + in_channels: int, + temb_channels: int, + dropout: float = 0.0, + num_layers: int = 1, + resnet_eps: float = 1e-6, + resnet_time_scale_shift: str = "default", # default, spatial + resnet_act_fn: str = "swish", + resnet_groups: int = 32, + resnet_pre_norm: bool = True, + add_attention: bool = True, + attention_head_dim=1, + output_scale_factor=1.0, + ): + super().__init__() + resnet_groups = resnet_groups if resnet_groups is not None else min(in_channels // 4, 32) + self.add_attention = add_attention + + # there is always at least one resnet + resnets = [ + ResnetBlock2D( + in_channels=in_channels, + out_channels=in_channels, + temb_channels=temb_channels, + eps=resnet_eps, + groups=resnet_groups, + dropout=dropout, + time_embedding_norm=resnet_time_scale_shift, + non_linearity=resnet_act_fn, + 
output_scale_factor=output_scale_factor,
+ pre_norm=resnet_pre_norm,
+ )
+ ]
+ attentions = []
+
+ if attention_head_dim is None:
+ logger.warn(
+ f"It is not recommended to pass `attention_head_dim=None`. Defaulting `attention_head_dim` to `in_channels`: {in_channels}."
+ )
+ attention_head_dim = in_channels
+
+ for _ in range(num_layers):
+ if self.add_attention:
+ attentions.append(
+ Attention(
+ in_channels,
+ heads=in_channels // attention_head_dim,
+ dim_head=attention_head_dim,
+ rescale_output_factor=output_scale_factor,
+ eps=resnet_eps,
+ norm_num_groups=resnet_groups if resnet_time_scale_shift == "default" else None,
+ spatial_norm_dim=temb_channels if resnet_time_scale_shift == "spatial" else None,
+ residual_connection=True,
+ bias=True,
+ upcast_softmax=True,
+ _from_deprecated_attn_block=True,
+ )
+ )
+ else:
+ attentions.append(None)
+
+ resnets.append(
+ ResnetBlock2D(
+ in_channels=in_channels,
+ out_channels=in_channels,
+ temb_channels=temb_channels,
+ eps=resnet_eps,
+ groups=resnet_groups,
+ dropout=dropout,
+ time_embedding_norm=resnet_time_scale_shift,
+ non_linearity=resnet_act_fn,
+ output_scale_factor=output_scale_factor,
+ pre_norm=resnet_pre_norm,
+ )
+ )
+
+ self.attentions = nn.ModuleList(attentions)
+ self.resnets = nn.ModuleList(resnets)
+
+ def forward(self, hidden_states, temb=None):
+ # Rich-Text: ignore the features
+ hidden_states, _ = self.resnets[0](hidden_states, temb)
+ for attn, resnet in zip(self.attentions, self.resnets[1:]):
+ if attn is not None:
+ hidden_states = attn(hidden_states, temb=temb)
+ # Rich-Text: ignore the features
+ hidden_states, _ = resnet(hidden_states, temb)
+
+ return hidden_states
+
+
+class UNetMidBlock2DCrossAttn(nn.Module):
+ def __init__(
+ self,
+ in_channels: int,
+ temb_channels: int,
+ dropout: float = 0.0,
+ num_layers: int = 1,
+ transformer_layers_per_block: int = 1,
+ resnet_eps: float = 1e-6,
+ resnet_time_scale_shift: str = "default",
+ resnet_act_fn: str = "swish",
+ resnet_groups: int = 32,
+ resnet_pre_norm: bool = True,
+ num_attention_heads=1,
+ output_scale_factor=1.0,
+ cross_attention_dim=1280,
+ dual_cross_attention=False,
+ use_linear_projection=False,
+ upcast_attention=False,
+ ):
+ super().__init__()
+
+ self.has_cross_attention = True
+ self.num_attention_heads = num_attention_heads
+ resnet_groups = resnet_groups if resnet_groups is not None else min(in_channels // 4, 32)
+
+ # there is always at least one resnet
+ resnets = [
+ ResnetBlock2D(
+ in_channels=in_channels,
+ out_channels=in_channels,
+ temb_channels=temb_channels,
+ eps=resnet_eps,
+ groups=resnet_groups,
+ dropout=dropout,
+ time_embedding_norm=resnet_time_scale_shift,
+ non_linearity=resnet_act_fn,
+ output_scale_factor=output_scale_factor,
+ pre_norm=resnet_pre_norm,
+ )
+ ]
+ attentions = []
+
+ for _ in range(num_layers):
+ if not dual_cross_attention:
+ attentions.append(
+ Transformer2DModel(
+ num_attention_heads,
+ in_channels // num_attention_heads,
+ in_channels=in_channels,
+ num_layers=transformer_layers_per_block,
+ cross_attention_dim=cross_attention_dim,
+ norm_num_groups=resnet_groups,
+ use_linear_projection=use_linear_projection,
+ upcast_attention=upcast_attention,
+ )
+ )
+ else:
+ attentions.append(
+ DualTransformer2DModel(
+ num_attention_heads,
+ in_channels // num_attention_heads,
+ in_channels=in_channels,
+ num_layers=1,
+ cross_attention_dim=cross_attention_dim,
+ norm_num_groups=resnet_groups,
+ )
+ )
+ resnets.append(
+ ResnetBlock2D(
+ in_channels=in_channels,
+ out_channels=in_channels,
+ temb_channels=temb_channels,
+ eps=resnet_eps,
+ 
groups=resnet_groups, + dropout=dropout, + time_embedding_norm=resnet_time_scale_shift, + non_linearity=resnet_act_fn, + output_scale_factor=output_scale_factor, + pre_norm=resnet_pre_norm, + ) + ) + + self.attentions = nn.ModuleList(attentions) + self.resnets = nn.ModuleList(resnets) + + def forward( + self, + hidden_states: torch.FloatTensor, + temb: Optional[torch.FloatTensor] = None, + encoder_hidden_states: Optional[torch.FloatTensor] = None, + attention_mask: Optional[torch.FloatTensor] = None, + cross_attention_kwargs: Optional[Dict[str, Any]] = None, + encoder_attention_mask: Optional[torch.FloatTensor] = None, + ) -> torch.FloatTensor: + # Rich-Text: ignore the features + hidden_states, _ = self.resnets[0](hidden_states, temb) + for attn, resnet in zip(self.attentions, self.resnets[1:]): + hidden_states = attn( + hidden_states, + encoder_hidden_states=encoder_hidden_states, + cross_attention_kwargs=cross_attention_kwargs, + attention_mask=attention_mask, + encoder_attention_mask=encoder_attention_mask, + return_dict=False, + )[0] + # Rich-Text: ignore the features + hidden_states, _ = resnet(hidden_states, temb) + + return hidden_states + + +class UNetMidBlock2DSimpleCrossAttn(nn.Module): + def __init__( + self, + in_channels: int, + temb_channels: int, + dropout: float = 0.0, + num_layers: int = 1, + resnet_eps: float = 1e-6, + resnet_time_scale_shift: str = "default", + resnet_act_fn: str = "swish", + resnet_groups: int = 32, + resnet_pre_norm: bool = True, + attention_head_dim=1, + output_scale_factor=1.0, + cross_attention_dim=1280, + skip_time_act=False, + only_cross_attention=False, + cross_attention_norm=None, + ): + super().__init__() + + self.has_cross_attention = True + + self.attention_head_dim = attention_head_dim + resnet_groups = resnet_groups if resnet_groups is not None else min(in_channels // 4, 32) + + self.num_heads = in_channels // self.attention_head_dim + + # there is always at least one resnet + resnets = [ + ResnetBlock2D( + in_channels=in_channels, + out_channels=in_channels, + temb_channels=temb_channels, + eps=resnet_eps, + groups=resnet_groups, + dropout=dropout, + time_embedding_norm=resnet_time_scale_shift, + non_linearity=resnet_act_fn, + output_scale_factor=output_scale_factor, + pre_norm=resnet_pre_norm, + skip_time_act=skip_time_act, + ) + ] + attentions = [] + + for _ in range(num_layers): + processor = ( + AttnAddedKVProcessor2_0() if hasattr(F, "scaled_dot_product_attention") else AttnAddedKVProcessor() + ) + + attentions.append( + Attention( + query_dim=in_channels, + cross_attention_dim=in_channels, + heads=self.num_heads, + dim_head=self.attention_head_dim, + added_kv_proj_dim=cross_attention_dim, + norm_num_groups=resnet_groups, + bias=True, + upcast_softmax=True, + only_cross_attention=only_cross_attention, + cross_attention_norm=cross_attention_norm, + processor=processor, + ) + ) + resnets.append( + ResnetBlock2D( + in_channels=in_channels, + out_channels=in_channels, + temb_channels=temb_channels, + eps=resnet_eps, + groups=resnet_groups, + dropout=dropout, + time_embedding_norm=resnet_time_scale_shift, + non_linearity=resnet_act_fn, + output_scale_factor=output_scale_factor, + pre_norm=resnet_pre_norm, + skip_time_act=skip_time_act, + ) + ) + + self.attentions = nn.ModuleList(attentions) + self.resnets = nn.ModuleList(resnets) + + def forward( + self, + hidden_states: torch.FloatTensor, + temb: Optional[torch.FloatTensor] = None, + encoder_hidden_states: Optional[torch.FloatTensor] = None, + attention_mask: Optional[torch.FloatTensor] 
= None,
+ cross_attention_kwargs: Optional[Dict[str, Any]] = None,
+ encoder_attention_mask: Optional[torch.FloatTensor] = None,
+ ):
+ cross_attention_kwargs = cross_attention_kwargs if cross_attention_kwargs is not None else {}
+
+ if attention_mask is None:
+ # if encoder_hidden_states is defined: we are doing cross-attn, so we should use cross-attn mask.
+ mask = None if encoder_hidden_states is None else encoder_attention_mask
+ else:
+ # when attention_mask is defined: we don't even check for encoder_attention_mask.
+ # this is to maintain compatibility with UnCLIP, which uses 'attention_mask' param for cross-attn masks.
+ # TODO: UnCLIP should express cross-attn mask via encoder_attention_mask param instead of via attention_mask.
+ # then we can simplify this whole if/else block to:
+ # mask = attention_mask if encoder_hidden_states is None else encoder_attention_mask
+ mask = attention_mask
+
+ # Rich-Text: ignore the features
+ hidden_states, _ = self.resnets[0](hidden_states, temb)
+ for attn, resnet in zip(self.attentions, self.resnets[1:]):
+ # attn
+ hidden_states = attn(
+ hidden_states,
+ encoder_hidden_states=encoder_hidden_states,
+ attention_mask=mask,
+ **cross_attention_kwargs,
+ )
+
+ # resnet
+ # Rich-Text: ignore the features
+ hidden_states, _ = resnet(hidden_states, temb)
+
+ return hidden_states
+
+
+class AttnDownBlock2D(nn.Module):
+ def __init__(
+ self,
+ in_channels: int,
+ out_channels: int,
+ temb_channels: int,
+ dropout: float = 0.0,
+ num_layers: int = 1,
+ resnet_eps: float = 1e-6,
+ resnet_time_scale_shift: str = "default",
+ resnet_act_fn: str = "swish",
+ resnet_groups: int = 32,
+ resnet_pre_norm: bool = True,
+ attention_head_dim=1,
+ output_scale_factor=1.0,
+ downsample_padding=1,
+ downsample_type="conv",
+ ):
+ super().__init__()
+ resnets = []
+ attentions = []
+ self.downsample_type = downsample_type
+
+ if attention_head_dim is None:
+ logger.warn(
+ f"It is not recommended to pass `attention_head_dim=None`. Defaulting `attention_head_dim` to `out_channels`: {out_channels}."
+ ) + attention_head_dim = out_channels + + for i in range(num_layers): + in_channels = in_channels if i == 0 else out_channels + resnets.append( + ResnetBlock2D( + in_channels=in_channels, + out_channels=out_channels, + temb_channels=temb_channels, + eps=resnet_eps, + groups=resnet_groups, + dropout=dropout, + time_embedding_norm=resnet_time_scale_shift, + non_linearity=resnet_act_fn, + output_scale_factor=output_scale_factor, + pre_norm=resnet_pre_norm, + ) + ) + attentions.append( + Attention( + out_channels, + heads=out_channels // attention_head_dim, + dim_head=attention_head_dim, + rescale_output_factor=output_scale_factor, + eps=resnet_eps, + norm_num_groups=resnet_groups, + residual_connection=True, + bias=True, + upcast_softmax=True, + _from_deprecated_attn_block=True, + ) + ) + + self.attentions = nn.ModuleList(attentions) + self.resnets = nn.ModuleList(resnets) + + if downsample_type == "conv": + self.downsamplers = nn.ModuleList( + [ + Downsample2D( + out_channels, use_conv=True, out_channels=out_channels, padding=downsample_padding, name="op" + ) + ] + ) + elif downsample_type == "resnet": + self.downsamplers = nn.ModuleList( + [ + ResnetBlock2D( + in_channels=out_channels, + out_channels=out_channels, + temb_channels=temb_channels, + eps=resnet_eps, + groups=resnet_groups, + dropout=dropout, + time_embedding_norm=resnet_time_scale_shift, + non_linearity=resnet_act_fn, + output_scale_factor=output_scale_factor, + pre_norm=resnet_pre_norm, + down=True, + ) + ] + ) + else: + self.downsamplers = None + + def forward(self, hidden_states, temb=None, upsample_size=None): + output_states = () + + for resnet, attn in zip(self.resnets, self.attentions): + # Rich-Text: ignore the features + hidden_states, _ = resnet(hidden_states, temb) + hidden_states = attn(hidden_states) + output_states = output_states + (hidden_states,) + + if self.downsamplers is not None: + for downsampler in self.downsamplers: + if self.downsample_type == "resnet": + hidden_states = downsampler(hidden_states, temb=temb) + else: + hidden_states = downsampler(hidden_states) + + output_states += (hidden_states,) + + return hidden_states, output_states + + +class CrossAttnDownBlock2D(nn.Module): + def __init__( + self, + in_channels: int, + out_channels: int, + temb_channels: int, + dropout: float = 0.0, + num_layers: int = 1, + transformer_layers_per_block: int = 1, + resnet_eps: float = 1e-6, + resnet_time_scale_shift: str = "default", + resnet_act_fn: str = "swish", + resnet_groups: int = 32, + resnet_pre_norm: bool = True, + num_attention_heads=1, + cross_attention_dim=1280, + output_scale_factor=1.0, + downsample_padding=1, + add_downsample=True, + dual_cross_attention=False, + use_linear_projection=False, + only_cross_attention=False, + upcast_attention=False, + ): + super().__init__() + resnets = [] + attentions = [] + + self.has_cross_attention = True + self.num_attention_heads = num_attention_heads + + for i in range(num_layers): + in_channels = in_channels if i == 0 else out_channels + resnets.append( + ResnetBlock2D( + in_channels=in_channels, + out_channels=out_channels, + temb_channels=temb_channels, + eps=resnet_eps, + groups=resnet_groups, + dropout=dropout, + time_embedding_norm=resnet_time_scale_shift, + non_linearity=resnet_act_fn, + output_scale_factor=output_scale_factor, + pre_norm=resnet_pre_norm, + ) + ) + if not dual_cross_attention: + attentions.append( + Transformer2DModel( + num_attention_heads, + out_channels // num_attention_heads, + in_channels=out_channels, + 
num_layers=transformer_layers_per_block, + cross_attention_dim=cross_attention_dim, + norm_num_groups=resnet_groups, + use_linear_projection=use_linear_projection, + only_cross_attention=only_cross_attention, + upcast_attention=upcast_attention, + ) + ) + else: + attentions.append( + DualTransformer2DModel( + num_attention_heads, + out_channels // num_attention_heads, + in_channels=out_channels, + num_layers=1, + cross_attention_dim=cross_attention_dim, + norm_num_groups=resnet_groups, + ) + ) + self.attentions = nn.ModuleList(attentions) + self.resnets = nn.ModuleList(resnets) + + if add_downsample: + self.downsamplers = nn.ModuleList( + [ + Downsample2D( + out_channels, use_conv=True, out_channels=out_channels, padding=downsample_padding, name="op" + ) + ] + ) + else: + self.downsamplers = None + + self.gradient_checkpointing = False + + def forward( + self, + hidden_states: torch.FloatTensor, + temb: Optional[torch.FloatTensor] = None, + encoder_hidden_states: Optional[torch.FloatTensor] = None, + attention_mask: Optional[torch.FloatTensor] = None, + cross_attention_kwargs: Optional[Dict[str, Any]] = None, + encoder_attention_mask: Optional[torch.FloatTensor] = None, + ): + output_states = () + + for resnet, attn in zip(self.resnets, self.attentions): + if self.training and self.gradient_checkpointing: + + def create_custom_forward(module, return_dict=None): + def custom_forward(*inputs): + if return_dict is not None: + return module(*inputs, return_dict=return_dict) + else: + return module(*inputs) + + return custom_forward + + ckpt_kwargs: Dict[str, Any] = {"use_reentrant": False} if is_torch_version(">=", "1.11.0") else {} + hidden_states = torch.utils.checkpoint.checkpoint( + create_custom_forward(resnet), + hidden_states, + temb, + **ckpt_kwargs, + ) + hidden_states = torch.utils.checkpoint.checkpoint( + create_custom_forward(attn, return_dict=False), + hidden_states, + encoder_hidden_states, + None, # timestep + None, # class_labels + cross_attention_kwargs, + attention_mask, + encoder_attention_mask, + **ckpt_kwargs, + )[0] + else: + # Rich-Text: ignore the features + hidden_states, _ = resnet(hidden_states, temb) + hidden_states = attn( + hidden_states, + encoder_hidden_states=encoder_hidden_states, + cross_attention_kwargs=cross_attention_kwargs, + attention_mask=attention_mask, + encoder_attention_mask=encoder_attention_mask, + return_dict=False, + )[0] + + output_states = output_states + (hidden_states,) + + if self.downsamplers is not None: + for downsampler in self.downsamplers: + hidden_states = downsampler(hidden_states) + + output_states = output_states + (hidden_states,) + + return hidden_states, output_states + + +class DownBlock2D(nn.Module): + def __init__( + self, + in_channels: int, + out_channels: int, + temb_channels: int, + dropout: float = 0.0, + num_layers: int = 1, + resnet_eps: float = 1e-6, + resnet_time_scale_shift: str = "default", + resnet_act_fn: str = "swish", + resnet_groups: int = 32, + resnet_pre_norm: bool = True, + output_scale_factor=1.0, + add_downsample=True, + downsample_padding=1, + ): + super().__init__() + resnets = [] + + for i in range(num_layers): + in_channels = in_channels if i == 0 else out_channels + resnets.append( + ResnetBlock2D( + in_channels=in_channels, + out_channels=out_channels, + temb_channels=temb_channels, + eps=resnet_eps, + groups=resnet_groups, + dropout=dropout, + time_embedding_norm=resnet_time_scale_shift, + non_linearity=resnet_act_fn, + output_scale_factor=output_scale_factor, + pre_norm=resnet_pre_norm, + ) + ) + 
+ self.resnets = nn.ModuleList(resnets) + + if add_downsample: + self.downsamplers = nn.ModuleList( + [ + Downsample2D( + out_channels, use_conv=True, out_channels=out_channels, padding=downsample_padding, name="op" + ) + ] + ) + else: + self.downsamplers = None + + self.gradient_checkpointing = False + + def forward(self, hidden_states, temb=None): + output_states = () + + for resnet in self.resnets: + if self.training and self.gradient_checkpointing: + + def create_custom_forward(module): + def custom_forward(*inputs): + return module(*inputs) + + return custom_forward + + if is_torch_version(">=", "1.11.0"): + hidden_states = torch.utils.checkpoint.checkpoint( + create_custom_forward(resnet), hidden_states, temb, use_reentrant=False + ) + else: + hidden_states = torch.utils.checkpoint.checkpoint( + create_custom_forward(resnet), hidden_states, temb + ) + else: + # Rich-Text: ignore the features + hidden_states, _ = resnet(hidden_states, temb) + + output_states = output_states + (hidden_states,) + + if self.downsamplers is not None: + for downsampler in self.downsamplers: + hidden_states = downsampler(hidden_states) + + output_states = output_states + (hidden_states,) + + return hidden_states, output_states + + +class DownEncoderBlock2D(nn.Module): + def __init__( + self, + in_channels: int, + out_channels: int, + dropout: float = 0.0, + num_layers: int = 1, + resnet_eps: float = 1e-6, + resnet_time_scale_shift: str = "default", + resnet_act_fn: str = "swish", + resnet_groups: int = 32, + resnet_pre_norm: bool = True, + output_scale_factor=1.0, + add_downsample=True, + downsample_padding=1, + ): + super().__init__() + resnets = [] + + for i in range(num_layers): + in_channels = in_channels if i == 0 else out_channels + resnets.append( + ResnetBlock2D( + in_channels=in_channels, + out_channels=out_channels, + temb_channels=None, + eps=resnet_eps, + groups=resnet_groups, + dropout=dropout, + time_embedding_norm=resnet_time_scale_shift, + non_linearity=resnet_act_fn, + output_scale_factor=output_scale_factor, + pre_norm=resnet_pre_norm, + ) + ) + + self.resnets = nn.ModuleList(resnets) + + if add_downsample: + self.downsamplers = nn.ModuleList( + [ + Downsample2D( + out_channels, use_conv=True, out_channels=out_channels, padding=downsample_padding, name="op" + ) + ] + ) + else: + self.downsamplers = None + + def forward(self, hidden_states): + for resnet in self.resnets: + hidden_states = resnet(hidden_states, temb=None) + + if self.downsamplers is not None: + for downsampler in self.downsamplers: + hidden_states = downsampler(hidden_states) + + return hidden_states + + +class AttnDownEncoderBlock2D(nn.Module): + def __init__( + self, + in_channels: int, + out_channels: int, + dropout: float = 0.0, + num_layers: int = 1, + resnet_eps: float = 1e-6, + resnet_time_scale_shift: str = "default", + resnet_act_fn: str = "swish", + resnet_groups: int = 32, + resnet_pre_norm: bool = True, + attention_head_dim=1, + output_scale_factor=1.0, + add_downsample=True, + downsample_padding=1, + ): + super().__init__() + resnets = [] + attentions = [] + + if attention_head_dim is None: + logger.warn( + f"It is not recommend to pass `attention_head_dim=None`. Defaulting `attention_head_dim` to `in_channels`: {out_channels}." 
+ ) + attention_head_dim = out_channels + + for i in range(num_layers): + in_channels = in_channels if i == 0 else out_channels + resnets.append( + ResnetBlock2D( + in_channels=in_channels, + out_channels=out_channels, + temb_channels=None, + eps=resnet_eps, + groups=resnet_groups, + dropout=dropout, + time_embedding_norm=resnet_time_scale_shift, + non_linearity=resnet_act_fn, + output_scale_factor=output_scale_factor, + pre_norm=resnet_pre_norm, + ) + ) + attentions.append( + Attention( + out_channels, + heads=out_channels // attention_head_dim, + dim_head=attention_head_dim, + rescale_output_factor=output_scale_factor, + eps=resnet_eps, + norm_num_groups=resnet_groups, + residual_connection=True, + bias=True, + upcast_softmax=True, + _from_deprecated_attn_block=True, + ) + ) + + self.attentions = nn.ModuleList(attentions) + self.resnets = nn.ModuleList(resnets) + + if add_downsample: + self.downsamplers = nn.ModuleList( + [ + Downsample2D( + out_channels, use_conv=True, out_channels=out_channels, padding=downsample_padding, name="op" + ) + ] + ) + else: + self.downsamplers = None + + def forward(self, hidden_states): + for resnet, attn in zip(self.resnets, self.attentions): + # Rich-Text: ignore the features + hidden_states, _ = resnet(hidden_states, temb=None) + hidden_states = attn(hidden_states) + + if self.downsamplers is not None: + for downsampler in self.downsamplers: + hidden_states = downsampler(hidden_states) + + return hidden_states + + +class AttnSkipDownBlock2D(nn.Module): + def __init__( + self, + in_channels: int, + out_channels: int, + temb_channels: int, + dropout: float = 0.0, + num_layers: int = 1, + resnet_eps: float = 1e-6, + resnet_time_scale_shift: str = "default", + resnet_act_fn: str = "swish", + resnet_pre_norm: bool = True, + attention_head_dim=1, + output_scale_factor=np.sqrt(2.0), + add_downsample=True, + ): + super().__init__() + self.attentions = nn.ModuleList([]) + self.resnets = nn.ModuleList([]) + + if attention_head_dim is None: + logger.warn( + f"It is not recommend to pass `attention_head_dim=None`. Defaulting `attention_head_dim` to `in_channels`: {out_channels}." 
+ ) + attention_head_dim = out_channels + + for i in range(num_layers): + in_channels = in_channels if i == 0 else out_channels + self.resnets.append( + ResnetBlock2D( + in_channels=in_channels, + out_channels=out_channels, + temb_channels=temb_channels, + eps=resnet_eps, + groups=min(in_channels // 4, 32), + groups_out=min(out_channels // 4, 32), + dropout=dropout, + time_embedding_norm=resnet_time_scale_shift, + non_linearity=resnet_act_fn, + output_scale_factor=output_scale_factor, + pre_norm=resnet_pre_norm, + ) + ) + self.attentions.append( + Attention( + out_channels, + heads=out_channels // attention_head_dim, + dim_head=attention_head_dim, + rescale_output_factor=output_scale_factor, + eps=resnet_eps, + norm_num_groups=32, + residual_connection=True, + bias=True, + upcast_softmax=True, + _from_deprecated_attn_block=True, + ) + ) + + if add_downsample: + self.resnet_down = ResnetBlock2D( + in_channels=out_channels, + out_channels=out_channels, + temb_channels=temb_channels, + eps=resnet_eps, + groups=min(out_channels // 4, 32), + dropout=dropout, + time_embedding_norm=resnet_time_scale_shift, + non_linearity=resnet_act_fn, + output_scale_factor=output_scale_factor, + pre_norm=resnet_pre_norm, + use_in_shortcut=True, + down=True, + kernel="fir", + ) + self.downsamplers = nn.ModuleList([FirDownsample2D(out_channels, out_channels=out_channels)]) + self.skip_conv = nn.Conv2d(3, out_channels, kernel_size=(1, 1), stride=(1, 1)) + else: + self.resnet_down = None + self.downsamplers = None + self.skip_conv = None + + def forward(self, hidden_states, temb=None, skip_sample=None): + output_states = () + + for resnet, attn in zip(self.resnets, self.attentions): + # Rich-Text: ignore the features + hidden_states, _ = resnet(hidden_states, temb) + hidden_states = attn(hidden_states) + output_states += (hidden_states,) + + if self.downsamplers is not None: + hidden_states = self.resnet_down(hidden_states, temb) + for downsampler in self.downsamplers: + skip_sample = downsampler(skip_sample) + + hidden_states = self.skip_conv(skip_sample) + hidden_states + + output_states += (hidden_states,) + + return hidden_states, output_states, skip_sample + + +class SkipDownBlock2D(nn.Module): + def __init__( + self, + in_channels: int, + out_channels: int, + temb_channels: int, + dropout: float = 0.0, + num_layers: int = 1, + resnet_eps: float = 1e-6, + resnet_time_scale_shift: str = "default", + resnet_act_fn: str = "swish", + resnet_pre_norm: bool = True, + output_scale_factor=np.sqrt(2.0), + add_downsample=True, + downsample_padding=1, + ): + super().__init__() + self.resnets = nn.ModuleList([]) + + for i in range(num_layers): + in_channels = in_channels if i == 0 else out_channels + self.resnets.append( + ResnetBlock2D( + in_channels=in_channels, + out_channels=out_channels, + temb_channels=temb_channels, + eps=resnet_eps, + groups=min(in_channels // 4, 32), + groups_out=min(out_channels // 4, 32), + dropout=dropout, + time_embedding_norm=resnet_time_scale_shift, + non_linearity=resnet_act_fn, + output_scale_factor=output_scale_factor, + pre_norm=resnet_pre_norm, + ) + ) + + if add_downsample: + self.resnet_down = ResnetBlock2D( + in_channels=out_channels, + out_channels=out_channels, + temb_channels=temb_channels, + eps=resnet_eps, + groups=min(out_channels // 4, 32), + dropout=dropout, + time_embedding_norm=resnet_time_scale_shift, + non_linearity=resnet_act_fn, + output_scale_factor=output_scale_factor, + pre_norm=resnet_pre_norm, + use_in_shortcut=True, + down=True, + kernel="fir", + ) + 
self.downsamplers = nn.ModuleList([FirDownsample2D(out_channels, out_channels=out_channels)]) + self.skip_conv = nn.Conv2d(3, out_channels, kernel_size=(1, 1), stride=(1, 1)) + else: + self.resnet_down = None + self.downsamplers = None + self.skip_conv = None + + def forward(self, hidden_states, temb=None, skip_sample=None): + output_states = () + + for resnet in self.resnets: + # Rich-Text: ignore the features + hidden_states, _ = resnet(hidden_states, temb) + output_states += (hidden_states,) + + if self.downsamplers is not None: + hidden_states = self.resnet_down(hidden_states, temb) + for downsampler in self.downsamplers: + skip_sample = downsampler(skip_sample) + + hidden_states = self.skip_conv(skip_sample) + hidden_states + + output_states += (hidden_states,) + + return hidden_states, output_states, skip_sample + + +class ResnetDownsampleBlock2D(nn.Module): + def __init__( + self, + in_channels: int, + out_channels: int, + temb_channels: int, + dropout: float = 0.0, + num_layers: int = 1, + resnet_eps: float = 1e-6, + resnet_time_scale_shift: str = "default", + resnet_act_fn: str = "swish", + resnet_groups: int = 32, + resnet_pre_norm: bool = True, + output_scale_factor=1.0, + add_downsample=True, + skip_time_act=False, + ): + super().__init__() + resnets = [] + + for i in range(num_layers): + in_channels = in_channels if i == 0 else out_channels + resnets.append( + ResnetBlock2D( + in_channels=in_channels, + out_channels=out_channels, + temb_channels=temb_channels, + eps=resnet_eps, + groups=resnet_groups, + dropout=dropout, + time_embedding_norm=resnet_time_scale_shift, + non_linearity=resnet_act_fn, + output_scale_factor=output_scale_factor, + pre_norm=resnet_pre_norm, + skip_time_act=skip_time_act, + ) + ) + + self.resnets = nn.ModuleList(resnets) + + if add_downsample: + self.downsamplers = nn.ModuleList( + [ + ResnetBlock2D( + in_channels=out_channels, + out_channels=out_channels, + temb_channels=temb_channels, + eps=resnet_eps, + groups=resnet_groups, + dropout=dropout, + time_embedding_norm=resnet_time_scale_shift, + non_linearity=resnet_act_fn, + output_scale_factor=output_scale_factor, + pre_norm=resnet_pre_norm, + skip_time_act=skip_time_act, + down=True, + ) + ] + ) + else: + self.downsamplers = None + + self.gradient_checkpointing = False + + def forward(self, hidden_states, temb=None): + output_states = () + + for resnet in self.resnets: + if self.training and self.gradient_checkpointing: + + def create_custom_forward(module): + def custom_forward(*inputs): + return module(*inputs) + + return custom_forward + + if is_torch_version(">=", "1.11.0"): + hidden_states = torch.utils.checkpoint.checkpoint( + create_custom_forward(resnet), hidden_states, temb, use_reentrant=False + ) + else: + hidden_states = torch.utils.checkpoint.checkpoint( + create_custom_forward(resnet), hidden_states, temb + ) + else: + # Rich-Text: ignore the features + hidden_states, _ = resnet(hidden_states, temb) + + output_states = output_states + (hidden_states,) + + if self.downsamplers is not None: + for downsampler in self.downsamplers: + hidden_states = downsampler(hidden_states, temb) + + output_states = output_states + (hidden_states,) + + return hidden_states, output_states + + +class SimpleCrossAttnDownBlock2D(nn.Module): + def __init__( + self, + in_channels: int, + out_channels: int, + temb_channels: int, + dropout: float = 0.0, + num_layers: int = 1, + resnet_eps: float = 1e-6, + resnet_time_scale_shift: str = "default", + resnet_act_fn: str = "swish", + resnet_groups: int = 32, + 
resnet_pre_norm: bool = True, + attention_head_dim=1, + cross_attention_dim=1280, + output_scale_factor=1.0, + add_downsample=True, + skip_time_act=False, + only_cross_attention=False, + cross_attention_norm=None, + ): + super().__init__() + + self.has_cross_attention = True + + resnets = [] + attentions = [] + + self.attention_head_dim = attention_head_dim + self.num_heads = out_channels // self.attention_head_dim + + for i in range(num_layers): + in_channels = in_channels if i == 0 else out_channels + resnets.append( + ResnetBlock2D( + in_channels=in_channels, + out_channels=out_channels, + temb_channels=temb_channels, + eps=resnet_eps, + groups=resnet_groups, + dropout=dropout, + time_embedding_norm=resnet_time_scale_shift, + non_linearity=resnet_act_fn, + output_scale_factor=output_scale_factor, + pre_norm=resnet_pre_norm, + skip_time_act=skip_time_act, + ) + ) + + processor = ( + AttnAddedKVProcessor2_0() if hasattr(F, "scaled_dot_product_attention") else AttnAddedKVProcessor() + ) + + attentions.append( + Attention( + query_dim=out_channels, + cross_attention_dim=out_channels, + heads=self.num_heads, + dim_head=attention_head_dim, + added_kv_proj_dim=cross_attention_dim, + norm_num_groups=resnet_groups, + bias=True, + upcast_softmax=True, + only_cross_attention=only_cross_attention, + cross_attention_norm=cross_attention_norm, + processor=processor, + ) + ) + self.attentions = nn.ModuleList(attentions) + self.resnets = nn.ModuleList(resnets) + + if add_downsample: + self.downsamplers = nn.ModuleList( + [ + ResnetBlock2D( + in_channels=out_channels, + out_channels=out_channels, + temb_channels=temb_channels, + eps=resnet_eps, + groups=resnet_groups, + dropout=dropout, + time_embedding_norm=resnet_time_scale_shift, + non_linearity=resnet_act_fn, + output_scale_factor=output_scale_factor, + pre_norm=resnet_pre_norm, + skip_time_act=skip_time_act, + down=True, + ) + ] + ) + else: + self.downsamplers = None + + self.gradient_checkpointing = False + + def forward( + self, + hidden_states: torch.FloatTensor, + temb: Optional[torch.FloatTensor] = None, + encoder_hidden_states: Optional[torch.FloatTensor] = None, + attention_mask: Optional[torch.FloatTensor] = None, + cross_attention_kwargs: Optional[Dict[str, Any]] = None, + encoder_attention_mask: Optional[torch.FloatTensor] = None, + ): + output_states = () + cross_attention_kwargs = cross_attention_kwargs if cross_attention_kwargs is not None else {} + + if attention_mask is None: + # if encoder_hidden_states is defined: we are doing cross-attn, so we should use cross-attn mask. + mask = None if encoder_hidden_states is None else encoder_attention_mask + else: + # when attention_mask is defined: we don't even check for encoder_attention_mask. + # this is to maintain compatibility with UnCLIP, which uses 'attention_mask' param for cross-attn masks. + # TODO: UnCLIP should express cross-attn mask via encoder_attention_mask param instead of via attention_mask. 
+ # then we can simplify this whole if/else block to: + # mask = attention_mask if encoder_hidden_states is None else encoder_attention_mask + mask = attention_mask + + for resnet, attn in zip(self.resnets, self.attentions): + if self.training and self.gradient_checkpointing: + + def create_custom_forward(module, return_dict=None): + def custom_forward(*inputs): + if return_dict is not None: + return module(*inputs, return_dict=return_dict) + else: + return module(*inputs) + + return custom_forward + + hidden_states = torch.utils.checkpoint.checkpoint(create_custom_forward(resnet), hidden_states, temb) + hidden_states = torch.utils.checkpoint.checkpoint( + create_custom_forward(attn, return_dict=False), + hidden_states, + encoder_hidden_states, + mask, + cross_attention_kwargs, + )[0] + else: + # Rich-Text: ignore the features + hidden_states, _ = resnet(hidden_states, temb) + + hidden_states = attn( + hidden_states, + encoder_hidden_states=encoder_hidden_states, + attention_mask=mask, + **cross_attention_kwargs, + ) + + output_states = output_states + (hidden_states,) + + if self.downsamplers is not None: + for downsampler in self.downsamplers: + hidden_states = downsampler(hidden_states, temb) + + output_states = output_states + (hidden_states,) + + return hidden_states, output_states + + +class KDownBlock2D(nn.Module): + def __init__( + self, + in_channels: int, + out_channels: int, + temb_channels: int, + dropout: float = 0.0, + num_layers: int = 4, + resnet_eps: float = 1e-5, + resnet_act_fn: str = "gelu", + resnet_group_size: int = 32, + add_downsample=False, + ): + super().__init__() + resnets = [] + + for i in range(num_layers): + in_channels = in_channels if i == 0 else out_channels + groups = in_channels // resnet_group_size + groups_out = out_channels // resnet_group_size + + resnets.append( + ResnetBlock2D( + in_channels=in_channels, + out_channels=out_channels, + dropout=dropout, + temb_channels=temb_channels, + groups=groups, + groups_out=groups_out, + eps=resnet_eps, + non_linearity=resnet_act_fn, + time_embedding_norm="ada_group", + conv_shortcut_bias=False, + ) + ) + + self.resnets = nn.ModuleList(resnets) + + if add_downsample: + # YiYi's comments- might be able to use FirDownsample2D, look into details later + self.downsamplers = nn.ModuleList([KDownsample2D()]) + else: + self.downsamplers = None + + self.gradient_checkpointing = False + + def forward(self, hidden_states, temb=None): + output_states = () + + for resnet in self.resnets: + if self.training and self.gradient_checkpointing: + + def create_custom_forward(module): + def custom_forward(*inputs): + return module(*inputs) + + return custom_forward + + if is_torch_version(">=", "1.11.0"): + hidden_states = torch.utils.checkpoint.checkpoint( + create_custom_forward(resnet), hidden_states, temb, use_reentrant=False + ) + else: + hidden_states = torch.utils.checkpoint.checkpoint( + create_custom_forward(resnet), hidden_states, temb + ) + else: + # Rich-Text: ignore the features + hidden_states, _ = resnet(hidden_states, temb) + + output_states += (hidden_states,) + + if self.downsamplers is not None: + for downsampler in self.downsamplers: + hidden_states = downsampler(hidden_states) + + return hidden_states, output_states + + +class KCrossAttnDownBlock2D(nn.Module): + def __init__( + self, + in_channels: int, + out_channels: int, + temb_channels: int, + cross_attention_dim: int, + dropout: float = 0.0, + num_layers: int = 4, + resnet_group_size: int = 32, + add_downsample=True, + attention_head_dim: int = 64, + 
add_self_attention: bool = False, + resnet_eps: float = 1e-5, + resnet_act_fn: str = "gelu", + ): + super().__init__() + resnets = [] + attentions = [] + + self.has_cross_attention = True + + for i in range(num_layers): + in_channels = in_channels if i == 0 else out_channels + groups = in_channels // resnet_group_size + groups_out = out_channels // resnet_group_size + + resnets.append( + ResnetBlock2D( + in_channels=in_channels, + out_channels=out_channels, + dropout=dropout, + temb_channels=temb_channels, + groups=groups, + groups_out=groups_out, + eps=resnet_eps, + non_linearity=resnet_act_fn, + time_embedding_norm="ada_group", + conv_shortcut_bias=False, + ) + ) + attentions.append( + KAttentionBlock( + out_channels, + out_channels // attention_head_dim, + attention_head_dim, + cross_attention_dim=cross_attention_dim, + temb_channels=temb_channels, + attention_bias=True, + add_self_attention=add_self_attention, + cross_attention_norm="layer_norm", + group_size=resnet_group_size, + ) + ) + + self.resnets = nn.ModuleList(resnets) + self.attentions = nn.ModuleList(attentions) + + if add_downsample: + self.downsamplers = nn.ModuleList([KDownsample2D()]) + else: + self.downsamplers = None + + self.gradient_checkpointing = False + + def forward( + self, + hidden_states: torch.FloatTensor, + temb: Optional[torch.FloatTensor] = None, + encoder_hidden_states: Optional[torch.FloatTensor] = None, + attention_mask: Optional[torch.FloatTensor] = None, + cross_attention_kwargs: Optional[Dict[str, Any]] = None, + encoder_attention_mask: Optional[torch.FloatTensor] = None, + ): + output_states = () + + for resnet, attn in zip(self.resnets, self.attentions): + if self.training and self.gradient_checkpointing: + + def create_custom_forward(module, return_dict=None): + def custom_forward(*inputs): + if return_dict is not None: + return module(*inputs, return_dict=return_dict) + else: + return module(*inputs) + + return custom_forward + + ckpt_kwargs: Dict[str, Any] = {"use_reentrant": False} if is_torch_version(">=", "1.11.0") else {} + hidden_states = torch.utils.checkpoint.checkpoint( + create_custom_forward(resnet), + hidden_states, + temb, + **ckpt_kwargs, + ) + hidden_states = torch.utils.checkpoint.checkpoint( + create_custom_forward(attn, return_dict=False), + hidden_states, + encoder_hidden_states, + temb, + attention_mask, + cross_attention_kwargs, + encoder_attention_mask, + **ckpt_kwargs, + ) + else: + # Rich-Text: ignore the features + hidden_states, _ = resnet(hidden_states, temb) + hidden_states = attn( + hidden_states, + encoder_hidden_states=encoder_hidden_states, + emb=temb, + attention_mask=attention_mask, + cross_attention_kwargs=cross_attention_kwargs, + encoder_attention_mask=encoder_attention_mask, + ) + + if self.downsamplers is None: + output_states += (None,) + else: + output_states += (hidden_states,) + + if self.downsamplers is not None: + for downsampler in self.downsamplers: + hidden_states = downsampler(hidden_states) + + return hidden_states, output_states + + +class AttnUpBlock2D(nn.Module): + def __init__( + self, + in_channels: int, + prev_output_channel: int, + out_channels: int, + temb_channels: int, + dropout: float = 0.0, + num_layers: int = 1, + resnet_eps: float = 1e-6, + resnet_time_scale_shift: str = "default", + resnet_act_fn: str = "swish", + resnet_groups: int = 32, + resnet_pre_norm: bool = True, + attention_head_dim=1, + output_scale_factor=1.0, + upsample_type="conv", + ): + super().__init__() + resnets = [] + attentions = [] + + self.upsample_type = 
upsample_type + + if attention_head_dim is None: + logger.warn( + f"It is not recommend to pass `attention_head_dim=None`. Defaulting `attention_head_dim` to `in_channels`: {out_channels}." + ) + attention_head_dim = out_channels + + for i in range(num_layers): + res_skip_channels = in_channels if (i == num_layers - 1) else out_channels + resnet_in_channels = prev_output_channel if i == 0 else out_channels + + resnets.append( + ResnetBlock2D( + in_channels=resnet_in_channels + res_skip_channels, + out_channels=out_channels, + temb_channels=temb_channels, + eps=resnet_eps, + groups=resnet_groups, + dropout=dropout, + time_embedding_norm=resnet_time_scale_shift, + non_linearity=resnet_act_fn, + output_scale_factor=output_scale_factor, + pre_norm=resnet_pre_norm, + ) + ) + attentions.append( + Attention( + out_channels, + heads=out_channels // attention_head_dim, + dim_head=attention_head_dim, + rescale_output_factor=output_scale_factor, + eps=resnet_eps, + norm_num_groups=resnet_groups, + residual_connection=True, + bias=True, + upcast_softmax=True, + _from_deprecated_attn_block=True, + ) + ) + + self.attentions = nn.ModuleList(attentions) + self.resnets = nn.ModuleList(resnets) + + if upsample_type == "conv": + self.upsamplers = nn.ModuleList([Upsample2D(out_channels, use_conv=True, out_channels=out_channels)]) + elif upsample_type == "resnet": + self.upsamplers = nn.ModuleList( + [ + ResnetBlock2D( + in_channels=out_channels, + out_channels=out_channels, + temb_channels=temb_channels, + eps=resnet_eps, + groups=resnet_groups, + dropout=dropout, + time_embedding_norm=resnet_time_scale_shift, + non_linearity=resnet_act_fn, + output_scale_factor=output_scale_factor, + pre_norm=resnet_pre_norm, + up=True, + ) + ] + ) + else: + self.upsamplers = None + + def forward(self, hidden_states, res_hidden_states_tuple, temb=None, upsample_size=None): + for resnet, attn in zip(self.resnets, self.attentions): + # pop res hidden states + res_hidden_states = res_hidden_states_tuple[-1] + res_hidden_states_tuple = res_hidden_states_tuple[:-1] + hidden_states = torch.cat([hidden_states, res_hidden_states], dim=1) + + # Rich-Text: ignore the features + hidden_states, _ = resnet(hidden_states, temb) + hidden_states = attn(hidden_states) + + if self.upsamplers is not None: + for upsampler in self.upsamplers: + if self.upsample_type == "resnet": + hidden_states = upsampler(hidden_states, temb=temb) + else: + hidden_states = upsampler(hidden_states) + + return hidden_states + + +class CrossAttnUpBlock2D(nn.Module): + def __init__( + self, + in_channels: int, + out_channels: int, + prev_output_channel: int, + temb_channels: int, + dropout: float = 0.0, + num_layers: int = 1, + transformer_layers_per_block: int = 1, + resnet_eps: float = 1e-6, + resnet_time_scale_shift: str = "default", + resnet_act_fn: str = "swish", + resnet_groups: int = 32, + resnet_pre_norm: bool = True, + num_attention_heads=1, + cross_attention_dim=1280, + output_scale_factor=1.0, + add_upsample=True, + dual_cross_attention=False, + use_linear_projection=False, + only_cross_attention=False, + upcast_attention=False, + ): + super().__init__() + resnets = [] + attentions = [] + + self.has_cross_attention = True + self.num_attention_heads = num_attention_heads + + for i in range(num_layers): + res_skip_channels = in_channels if (i == num_layers - 1) else out_channels + resnet_in_channels = prev_output_channel if i == 0 else out_channels + + resnets.append( + ResnetBlock2D( + in_channels=resnet_in_channels + res_skip_channels, + 
out_channels=out_channels, + temb_channels=temb_channels, + eps=resnet_eps, + groups=resnet_groups, + dropout=dropout, + time_embedding_norm=resnet_time_scale_shift, + non_linearity=resnet_act_fn, + output_scale_factor=output_scale_factor, + pre_norm=resnet_pre_norm, + ) + ) + if not dual_cross_attention: + attentions.append( + Transformer2DModel( + num_attention_heads, + out_channels // num_attention_heads, + in_channels=out_channels, + num_layers=transformer_layers_per_block, + cross_attention_dim=cross_attention_dim, + norm_num_groups=resnet_groups, + use_linear_projection=use_linear_projection, + only_cross_attention=only_cross_attention, + upcast_attention=upcast_attention, + ) + ) + else: + attentions.append( + DualTransformer2DModel( + num_attention_heads, + out_channels // num_attention_heads, + in_channels=out_channels, + num_layers=1, + cross_attention_dim=cross_attention_dim, + norm_num_groups=resnet_groups, + ) + ) + self.attentions = nn.ModuleList(attentions) + self.resnets = nn.ModuleList(resnets) + + if add_upsample: + self.upsamplers = nn.ModuleList([Upsample2D(out_channels, use_conv=True, out_channels=out_channels)]) + else: + self.upsamplers = None + + self.gradient_checkpointing = False + + def forward( + self, + hidden_states: torch.FloatTensor, + res_hidden_states_tuple: Tuple[torch.FloatTensor, ...], + temb: Optional[torch.FloatTensor] = None, + encoder_hidden_states: Optional[torch.FloatTensor] = None, + cross_attention_kwargs: Optional[Dict[str, Any]] = None, + upsample_size: Optional[int] = None, + attention_mask: Optional[torch.FloatTensor] = None, + encoder_attention_mask: Optional[torch.FloatTensor] = None, + ): + for resnet, attn in zip(self.resnets, self.attentions): + # pop res hidden states + res_hidden_states = res_hidden_states_tuple[-1] + res_hidden_states_tuple = res_hidden_states_tuple[:-1] + hidden_states = torch.cat([hidden_states, res_hidden_states], dim=1) + + if self.training and self.gradient_checkpointing: + + def create_custom_forward(module, return_dict=None): + def custom_forward(*inputs): + if return_dict is not None: + return module(*inputs, return_dict=return_dict) + else: + return module(*inputs) + + return custom_forward + + ckpt_kwargs: Dict[str, Any] = {"use_reentrant": False} if is_torch_version(">=", "1.11.0") else {} + hidden_states = torch.utils.checkpoint.checkpoint( + create_custom_forward(resnet), + hidden_states, + temb, + **ckpt_kwargs, + ) + hidden_states = torch.utils.checkpoint.checkpoint( + create_custom_forward(attn, return_dict=False), + hidden_states, + encoder_hidden_states, + None, # timestep + None, # class_labels + cross_attention_kwargs, + attention_mask, + encoder_attention_mask, + **ckpt_kwargs, + )[0] + else: + # Rich-Text: ignore the features + hidden_states, _ = resnet(hidden_states, temb) + hidden_states = attn( + hidden_states, + encoder_hidden_states=encoder_hidden_states, + cross_attention_kwargs=cross_attention_kwargs, + attention_mask=attention_mask, + encoder_attention_mask=encoder_attention_mask, + return_dict=False, + )[0] + + if self.upsamplers is not None: + for upsampler in self.upsamplers: + hidden_states = upsampler(hidden_states, upsample_size) + + return hidden_states + + +class UpBlock2D(nn.Module): + def __init__( + self, + in_channels: int, + prev_output_channel: int, + out_channels: int, + temb_channels: int, + dropout: float = 0.0, + num_layers: int = 1, + resnet_eps: float = 1e-6, + resnet_time_scale_shift: str = "default", + resnet_act_fn: str = "swish", + resnet_groups: int = 32, + 
resnet_pre_norm: bool = True, + output_scale_factor=1.0, + add_upsample=True, + ): + super().__init__() + resnets = [] + + for i in range(num_layers): + res_skip_channels = in_channels if (i == num_layers - 1) else out_channels + resnet_in_channels = prev_output_channel if i == 0 else out_channels + + resnets.append( + ResnetBlock2D( + in_channels=resnet_in_channels + res_skip_channels, + out_channels=out_channels, + temb_channels=temb_channels, + eps=resnet_eps, + groups=resnet_groups, + dropout=dropout, + time_embedding_norm=resnet_time_scale_shift, + non_linearity=resnet_act_fn, + output_scale_factor=output_scale_factor, + pre_norm=resnet_pre_norm, + ) + ) + + self.resnets = nn.ModuleList(resnets) + + if add_upsample: + self.upsamplers = nn.ModuleList([Upsample2D(out_channels, use_conv=True, out_channels=out_channels)]) + else: + self.upsamplers = None + + self.gradient_checkpointing = False + + def forward(self, hidden_states, res_hidden_states_tuple, temb=None, upsample_size=None): + for resnet in self.resnets: + # pop res hidden states + res_hidden_states = res_hidden_states_tuple[-1] + res_hidden_states_tuple = res_hidden_states_tuple[:-1] + hidden_states = torch.cat([hidden_states, res_hidden_states], dim=1) + + if self.training and self.gradient_checkpointing: + + def create_custom_forward(module): + def custom_forward(*inputs): + return module(*inputs) + + return custom_forward + + if is_torch_version(">=", "1.11.0"): + hidden_states = torch.utils.checkpoint.checkpoint( + create_custom_forward(resnet), hidden_states, temb, use_reentrant=False + ) + else: + hidden_states = torch.utils.checkpoint.checkpoint( + create_custom_forward(resnet), hidden_states, temb + ) + else: + # Rich-Text: ignore the features + hidden_states, _ = resnet(hidden_states, temb) + + if self.upsamplers is not None: + for upsampler in self.upsamplers: + hidden_states = upsampler(hidden_states, upsample_size) + + return hidden_states + + +class UpDecoderBlock2D(nn.Module): + def __init__( + self, + in_channels: int, + out_channels: int, + dropout: float = 0.0, + num_layers: int = 1, + resnet_eps: float = 1e-6, + resnet_time_scale_shift: str = "default", # default, spatial + resnet_act_fn: str = "swish", + resnet_groups: int = 32, + resnet_pre_norm: bool = True, + output_scale_factor=1.0, + add_upsample=True, + temb_channels=None, + ): + super().__init__() + resnets = [] + + for i in range(num_layers): + input_channels = in_channels if i == 0 else out_channels + + resnets.append( + ResnetBlock2D( + in_channels=input_channels, + out_channels=out_channels, + temb_channels=temb_channels, + eps=resnet_eps, + groups=resnet_groups, + dropout=dropout, + time_embedding_norm=resnet_time_scale_shift, + non_linearity=resnet_act_fn, + output_scale_factor=output_scale_factor, + pre_norm=resnet_pre_norm, + ) + ) + + self.resnets = nn.ModuleList(resnets) + + if add_upsample: + self.upsamplers = nn.ModuleList([Upsample2D(out_channels, use_conv=True, out_channels=out_channels)]) + else: + self.upsamplers = None + + def forward(self, hidden_states, temb=None): + for resnet in self.resnets: + hidden_states = resnet(hidden_states, temb=temb) + + if self.upsamplers is not None: + for upsampler in self.upsamplers: + hidden_states = upsampler(hidden_states) + + return hidden_states + + +class AttnUpDecoderBlock2D(nn.Module): + def __init__( + self, + in_channels: int, + out_channels: int, + dropout: float = 0.0, + num_layers: int = 1, + resnet_eps: float = 1e-6, + resnet_time_scale_shift: str = "default", + resnet_act_fn: str = 
"swish", + resnet_groups: int = 32, + resnet_pre_norm: bool = True, + attention_head_dim=1, + output_scale_factor=1.0, + add_upsample=True, + temb_channels=None, + ): + super().__init__() + resnets = [] + attentions = [] + + if attention_head_dim is None: + logger.warn( + f"It is not recommend to pass `attention_head_dim=None`. Defaulting `attention_head_dim` to `out_channels`: {out_channels}." + ) + attention_head_dim = out_channels + + for i in range(num_layers): + input_channels = in_channels if i == 0 else out_channels + + resnets.append( + ResnetBlock2D( + in_channels=input_channels, + out_channels=out_channels, + temb_channels=temb_channels, + eps=resnet_eps, + groups=resnet_groups, + dropout=dropout, + time_embedding_norm=resnet_time_scale_shift, + non_linearity=resnet_act_fn, + output_scale_factor=output_scale_factor, + pre_norm=resnet_pre_norm, + ) + ) + attentions.append( + Attention( + out_channels, + heads=out_channels // attention_head_dim, + dim_head=attention_head_dim, + rescale_output_factor=output_scale_factor, + eps=resnet_eps, + norm_num_groups=resnet_groups if resnet_time_scale_shift != "spatial" else None, + spatial_norm_dim=temb_channels if resnet_time_scale_shift == "spatial" else None, + residual_connection=True, + bias=True, + upcast_softmax=True, + _from_deprecated_attn_block=True, + ) + ) + + self.attentions = nn.ModuleList(attentions) + self.resnets = nn.ModuleList(resnets) + + if add_upsample: + self.upsamplers = nn.ModuleList([Upsample2D(out_channels, use_conv=True, out_channels=out_channels)]) + else: + self.upsamplers = None + + def forward(self, hidden_states, temb=None): + for resnet, attn in zip(self.resnets, self.attentions): + hidden_states = resnet(hidden_states, temb=temb) + hidden_states = attn(hidden_states, temb=temb) + + if self.upsamplers is not None: + for upsampler in self.upsamplers: + hidden_states = upsampler(hidden_states) + + return hidden_states + + +class AttnSkipUpBlock2D(nn.Module): + def __init__( + self, + in_channels: int, + prev_output_channel: int, + out_channels: int, + temb_channels: int, + dropout: float = 0.0, + num_layers: int = 1, + resnet_eps: float = 1e-6, + resnet_time_scale_shift: str = "default", + resnet_act_fn: str = "swish", + resnet_pre_norm: bool = True, + attention_head_dim=1, + output_scale_factor=np.sqrt(2.0), + add_upsample=True, + ): + super().__init__() + self.attentions = nn.ModuleList([]) + self.resnets = nn.ModuleList([]) + + for i in range(num_layers): + res_skip_channels = in_channels if (i == num_layers - 1) else out_channels + resnet_in_channels = prev_output_channel if i == 0 else out_channels + + self.resnets.append( + ResnetBlock2D( + in_channels=resnet_in_channels + res_skip_channels, + out_channels=out_channels, + temb_channels=temb_channels, + eps=resnet_eps, + groups=min(resnet_in_channels + res_skip_channels // 4, 32), + groups_out=min(out_channels // 4, 32), + dropout=dropout, + time_embedding_norm=resnet_time_scale_shift, + non_linearity=resnet_act_fn, + output_scale_factor=output_scale_factor, + pre_norm=resnet_pre_norm, + ) + ) + + if attention_head_dim is None: + logger.warn( + f"It is not recommend to pass `attention_head_dim=None`. Defaulting `attention_head_dim` to `out_channels`: {out_channels}." 
+ ) + attention_head_dim = out_channels + + self.attentions.append( + Attention( + out_channels, + heads=out_channels // attention_head_dim, + dim_head=attention_head_dim, + rescale_output_factor=output_scale_factor, + eps=resnet_eps, + norm_num_groups=32, + residual_connection=True, + bias=True, + upcast_softmax=True, + _from_deprecated_attn_block=True, + ) + ) + + self.upsampler = FirUpsample2D(in_channels, out_channels=out_channels) + if add_upsample: + self.resnet_up = ResnetBlock2D( + in_channels=out_channels, + out_channels=out_channels, + temb_channels=temb_channels, + eps=resnet_eps, + groups=min(out_channels // 4, 32), + groups_out=min(out_channels // 4, 32), + dropout=dropout, + time_embedding_norm=resnet_time_scale_shift, + non_linearity=resnet_act_fn, + output_scale_factor=output_scale_factor, + pre_norm=resnet_pre_norm, + use_in_shortcut=True, + up=True, + kernel="fir", + ) + self.skip_conv = nn.Conv2d(out_channels, 3, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)) + self.skip_norm = torch.nn.GroupNorm( + num_groups=min(out_channels // 4, 32), num_channels=out_channels, eps=resnet_eps, affine=True + ) + self.act = nn.SiLU() + else: + self.resnet_up = None + self.skip_conv = None + self.skip_norm = None + self.act = None + + def forward(self, hidden_states, res_hidden_states_tuple, temb=None, skip_sample=None): + for resnet in self.resnets: + # pop res hidden states + res_hidden_states = res_hidden_states_tuple[-1] + res_hidden_states_tuple = res_hidden_states_tuple[:-1] + hidden_states = torch.cat([hidden_states, res_hidden_states], dim=1) + + # Rich-Text: ignore the features + hidden_states, _ = resnet(hidden_states, temb) + + hidden_states = self.attentions[0](hidden_states) + + if skip_sample is not None: + skip_sample = self.upsampler(skip_sample) + else: + skip_sample = 0 + + if self.resnet_up is not None: + skip_sample_states = self.skip_norm(hidden_states) + skip_sample_states = self.act(skip_sample_states) + skip_sample_states = self.skip_conv(skip_sample_states) + + skip_sample = skip_sample + skip_sample_states + + hidden_states = self.resnet_up(hidden_states, temb) + + return hidden_states, skip_sample + + +class SkipUpBlock2D(nn.Module): + def __init__( + self, + in_channels: int, + prev_output_channel: int, + out_channels: int, + temb_channels: int, + dropout: float = 0.0, + num_layers: int = 1, + resnet_eps: float = 1e-6, + resnet_time_scale_shift: str = "default", + resnet_act_fn: str = "swish", + resnet_pre_norm: bool = True, + output_scale_factor=np.sqrt(2.0), + add_upsample=True, + upsample_padding=1, + ): + super().__init__() + self.resnets = nn.ModuleList([]) + + for i in range(num_layers): + res_skip_channels = in_channels if (i == num_layers - 1) else out_channels + resnet_in_channels = prev_output_channel if i == 0 else out_channels + + self.resnets.append( + ResnetBlock2D( + in_channels=resnet_in_channels + res_skip_channels, + out_channels=out_channels, + temb_channels=temb_channels, + eps=resnet_eps, + groups=min((resnet_in_channels + res_skip_channels) // 4, 32), + groups_out=min(out_channels // 4, 32), + dropout=dropout, + time_embedding_norm=resnet_time_scale_shift, + non_linearity=resnet_act_fn, + output_scale_factor=output_scale_factor, + pre_norm=resnet_pre_norm, + ) + ) + + self.upsampler = FirUpsample2D(in_channels, out_channels=out_channels) + if add_upsample: + self.resnet_up = ResnetBlock2D( + in_channels=out_channels, + out_channels=out_channels, + temb_channels=temb_channels, + eps=resnet_eps, + groups=min(out_channels // 4, 32), + 
groups_out=min(out_channels // 4, 32), + dropout=dropout, + time_embedding_norm=resnet_time_scale_shift, + non_linearity=resnet_act_fn, + output_scale_factor=output_scale_factor, + pre_norm=resnet_pre_norm, + use_in_shortcut=True, + up=True, + kernel="fir", + ) + self.skip_conv = nn.Conv2d(out_channels, 3, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)) + self.skip_norm = torch.nn.GroupNorm( + num_groups=min(out_channels // 4, 32), num_channels=out_channels, eps=resnet_eps, affine=True + ) + self.act = nn.SiLU() + else: + self.resnet_up = None + self.skip_conv = None + self.skip_norm = None + self.act = None + + def forward(self, hidden_states, res_hidden_states_tuple, temb=None, skip_sample=None): + for resnet in self.resnets: + # pop res hidden states + res_hidden_states = res_hidden_states_tuple[-1] + res_hidden_states_tuple = res_hidden_states_tuple[:-1] + hidden_states = torch.cat([hidden_states, res_hidden_states], dim=1) + + # Rich-Text: ignore the features + hidden_states, _ = resnet(hidden_states, temb) + + if skip_sample is not None: + skip_sample = self.upsampler(skip_sample) + else: + skip_sample = 0 + + if self.resnet_up is not None: + skip_sample_states = self.skip_norm(hidden_states) + skip_sample_states = self.act(skip_sample_states) + skip_sample_states = self.skip_conv(skip_sample_states) + + skip_sample = skip_sample + skip_sample_states + + hidden_states = self.resnet_up(hidden_states, temb) + + return hidden_states, skip_sample + + +class ResnetUpsampleBlock2D(nn.Module): + def __init__( + self, + in_channels: int, + prev_output_channel: int, + out_channels: int, + temb_channels: int, + dropout: float = 0.0, + num_layers: int = 1, + resnet_eps: float = 1e-6, + resnet_time_scale_shift: str = "default", + resnet_act_fn: str = "swish", + resnet_groups: int = 32, + resnet_pre_norm: bool = True, + output_scale_factor=1.0, + add_upsample=True, + skip_time_act=False, + ): + super().__init__() + resnets = [] + + for i in range(num_layers): + res_skip_channels = in_channels if (i == num_layers - 1) else out_channels + resnet_in_channels = prev_output_channel if i == 0 else out_channels + + resnets.append( + ResnetBlock2D( + in_channels=resnet_in_channels + res_skip_channels, + out_channels=out_channels, + temb_channels=temb_channels, + eps=resnet_eps, + groups=resnet_groups, + dropout=dropout, + time_embedding_norm=resnet_time_scale_shift, + non_linearity=resnet_act_fn, + output_scale_factor=output_scale_factor, + pre_norm=resnet_pre_norm, + skip_time_act=skip_time_act, + ) + ) + + self.resnets = nn.ModuleList(resnets) + + if add_upsample: + self.upsamplers = nn.ModuleList( + [ + ResnetBlock2D( + in_channels=out_channels, + out_channels=out_channels, + temb_channels=temb_channels, + eps=resnet_eps, + groups=resnet_groups, + dropout=dropout, + time_embedding_norm=resnet_time_scale_shift, + non_linearity=resnet_act_fn, + output_scale_factor=output_scale_factor, + pre_norm=resnet_pre_norm, + skip_time_act=skip_time_act, + up=True, + ) + ] + ) + else: + self.upsamplers = None + + self.gradient_checkpointing = False + + def forward(self, hidden_states, res_hidden_states_tuple, temb=None, upsample_size=None): + for resnet in self.resnets: + # pop res hidden states + res_hidden_states = res_hidden_states_tuple[-1] + res_hidden_states_tuple = res_hidden_states_tuple[:-1] + hidden_states = torch.cat([hidden_states, res_hidden_states], dim=1) + + if self.training and self.gradient_checkpointing: + + def create_custom_forward(module): + def custom_forward(*inputs): + return 
module(*inputs) + + return custom_forward + + if is_torch_version(">=", "1.11.0"): + hidden_states = torch.utils.checkpoint.checkpoint( + create_custom_forward(resnet), hidden_states, temb, use_reentrant=False + ) + else: + hidden_states = torch.utils.checkpoint.checkpoint( + create_custom_forward(resnet), hidden_states, temb + ) + else: + # Rich-Text: ignore the features + hidden_states, _ = resnet(hidden_states, temb) + + if self.upsamplers is not None: + for upsampler in self.upsamplers: + hidden_states = upsampler(hidden_states, temb) + + return hidden_states + + +class SimpleCrossAttnUpBlock2D(nn.Module): + def __init__( + self, + in_channels: int, + out_channels: int, + prev_output_channel: int, + temb_channels: int, + dropout: float = 0.0, + num_layers: int = 1, + resnet_eps: float = 1e-6, + resnet_time_scale_shift: str = "default", + resnet_act_fn: str = "swish", + resnet_groups: int = 32, + resnet_pre_norm: bool = True, + attention_head_dim=1, + cross_attention_dim=1280, + output_scale_factor=1.0, + add_upsample=True, + skip_time_act=False, + only_cross_attention=False, + cross_attention_norm=None, + ): + super().__init__() + resnets = [] + attentions = [] + + self.has_cross_attention = True + self.attention_head_dim = attention_head_dim + + self.num_heads = out_channels // self.attention_head_dim + + for i in range(num_layers): + res_skip_channels = in_channels if (i == num_layers - 1) else out_channels + resnet_in_channels = prev_output_channel if i == 0 else out_channels + + resnets.append( + ResnetBlock2D( + in_channels=resnet_in_channels + res_skip_channels, + out_channels=out_channels, + temb_channels=temb_channels, + eps=resnet_eps, + groups=resnet_groups, + dropout=dropout, + time_embedding_norm=resnet_time_scale_shift, + non_linearity=resnet_act_fn, + output_scale_factor=output_scale_factor, + pre_norm=resnet_pre_norm, + skip_time_act=skip_time_act, + ) + ) + + processor = ( + AttnAddedKVProcessor2_0() if hasattr(F, "scaled_dot_product_attention") else AttnAddedKVProcessor() + ) + + attentions.append( + Attention( + query_dim=out_channels, + cross_attention_dim=out_channels, + heads=self.num_heads, + dim_head=self.attention_head_dim, + added_kv_proj_dim=cross_attention_dim, + norm_num_groups=resnet_groups, + bias=True, + upcast_softmax=True, + only_cross_attention=only_cross_attention, + cross_attention_norm=cross_attention_norm, + processor=processor, + ) + ) + self.attentions = nn.ModuleList(attentions) + self.resnets = nn.ModuleList(resnets) + + if add_upsample: + self.upsamplers = nn.ModuleList( + [ + ResnetBlock2D( + in_channels=out_channels, + out_channels=out_channels, + temb_channels=temb_channels, + eps=resnet_eps, + groups=resnet_groups, + dropout=dropout, + time_embedding_norm=resnet_time_scale_shift, + non_linearity=resnet_act_fn, + output_scale_factor=output_scale_factor, + pre_norm=resnet_pre_norm, + skip_time_act=skip_time_act, + up=True, + ) + ] + ) + else: + self.upsamplers = None + + self.gradient_checkpointing = False + + def forward( + self, + hidden_states: torch.FloatTensor, + res_hidden_states_tuple: Tuple[torch.FloatTensor, ...], + temb: Optional[torch.FloatTensor] = None, + encoder_hidden_states: Optional[torch.FloatTensor] = None, + upsample_size: Optional[int] = None, + attention_mask: Optional[torch.FloatTensor] = None, + cross_attention_kwargs: Optional[Dict[str, Any]] = None, + encoder_attention_mask: Optional[torch.FloatTensor] = None, + ): + cross_attention_kwargs = cross_attention_kwargs if cross_attention_kwargs is not None else {} + + if 
attention_mask is None: + # if encoder_hidden_states is defined: we are doing cross-attn, so we should use cross-attn mask. + mask = None if encoder_hidden_states is None else encoder_attention_mask + else: + # when attention_mask is defined: we don't even check for encoder_attention_mask. + # this is to maintain compatibility with UnCLIP, which uses 'attention_mask' param for cross-attn masks. + # TODO: UnCLIP should express cross-attn mask via encoder_attention_mask param instead of via attention_mask. + # then we can simplify this whole if/else block to: + # mask = attention_mask if encoder_hidden_states is None else encoder_attention_mask + mask = attention_mask + + for resnet, attn in zip(self.resnets, self.attentions): + # resnet + # pop res hidden states + res_hidden_states = res_hidden_states_tuple[-1] + res_hidden_states_tuple = res_hidden_states_tuple[:-1] + hidden_states = torch.cat([hidden_states, res_hidden_states], dim=1) + + if self.training and self.gradient_checkpointing: + + def create_custom_forward(module, return_dict=None): + def custom_forward(*inputs): + if return_dict is not None: + return module(*inputs, return_dict=return_dict) + else: + return module(*inputs) + + return custom_forward + + hidden_states = torch.utils.checkpoint.checkpoint(create_custom_forward(resnet), hidden_states, temb) + hidden_states = torch.utils.checkpoint.checkpoint( + create_custom_forward(attn, return_dict=False), + hidden_states, + encoder_hidden_states, + mask, + cross_attention_kwargs, + )[0] + else: + # Rich-Text: ignore the features + hidden_states, _ = resnet(hidden_states, temb) + + hidden_states = attn( + hidden_states, + encoder_hidden_states=encoder_hidden_states, + attention_mask=mask, + **cross_attention_kwargs, + ) + + if self.upsamplers is not None: + for upsampler in self.upsamplers: + hidden_states = upsampler(hidden_states, temb) + + return hidden_states + + +class KUpBlock2D(nn.Module): + def __init__( + self, + in_channels: int, + out_channels: int, + temb_channels: int, + dropout: float = 0.0, + num_layers: int = 5, + resnet_eps: float = 1e-5, + resnet_act_fn: str = "gelu", + resnet_group_size: Optional[int] = 32, + add_upsample=True, + ): + super().__init__() + resnets = [] + k_in_channels = 2 * out_channels + k_out_channels = in_channels + num_layers = num_layers - 1 + + for i in range(num_layers): + in_channels = k_in_channels if i == 0 else out_channels + groups = in_channels // resnet_group_size + groups_out = out_channels // resnet_group_size + + resnets.append( + ResnetBlock2D( + in_channels=in_channels, + out_channels=k_out_channels if (i == num_layers - 1) else out_channels, + temb_channels=temb_channels, + eps=resnet_eps, + groups=groups, + groups_out=groups_out, + dropout=dropout, + non_linearity=resnet_act_fn, + time_embedding_norm="ada_group", + conv_shortcut_bias=False, + ) + ) + + self.resnets = nn.ModuleList(resnets) + + if add_upsample: + self.upsamplers = nn.ModuleList([KUpsample2D()]) + else: + self.upsamplers = None + + self.gradient_checkpointing = False + + def forward(self, hidden_states, res_hidden_states_tuple, temb=None, upsample_size=None): + res_hidden_states_tuple = res_hidden_states_tuple[-1] + if res_hidden_states_tuple is not None: + hidden_states = torch.cat([hidden_states, res_hidden_states_tuple], dim=1) + + for resnet in self.resnets: + if self.training and self.gradient_checkpointing: + + def create_custom_forward(module): + def custom_forward(*inputs): + return module(*inputs) + + return custom_forward + + if 
is_torch_version(">=", "1.11.0"): + hidden_states = torch.utils.checkpoint.checkpoint( + create_custom_forward(resnet), hidden_states, temb, use_reentrant=False + ) + else: + hidden_states = torch.utils.checkpoint.checkpoint( + create_custom_forward(resnet), hidden_states, temb + ) + else: + # Rich-Text: ignore the features + hidden_states, _ = resnet(hidden_states, temb) + + if self.upsamplers is not None: + for upsampler in self.upsamplers: + hidden_states = upsampler(hidden_states) + + return hidden_states + + +class KCrossAttnUpBlock2D(nn.Module): + def __init__( + self, + in_channels: int, + out_channels: int, + temb_channels: int, + dropout: float = 0.0, + num_layers: int = 4, + resnet_eps: float = 1e-5, + resnet_act_fn: str = "gelu", + resnet_group_size: int = 32, + attention_head_dim=1, # attention dim_head + cross_attention_dim: int = 768, + add_upsample: bool = True, + upcast_attention: bool = False, + ): + super().__init__() + resnets = [] + attentions = [] + + is_first_block = in_channels == out_channels == temb_channels + is_middle_block = in_channels != out_channels + add_self_attention = True if is_first_block else False + + self.has_cross_attention = True + self.attention_head_dim = attention_head_dim + + # in_channels, and out_channels for the block (k-unet) + k_in_channels = out_channels if is_first_block else 2 * out_channels + k_out_channels = in_channels + + num_layers = num_layers - 1 + + for i in range(num_layers): + in_channels = k_in_channels if i == 0 else out_channels + groups = in_channels // resnet_group_size + groups_out = out_channels // resnet_group_size + + if is_middle_block and (i == num_layers - 1): + conv_2d_out_channels = k_out_channels + else: + conv_2d_out_channels = None + + resnets.append( + ResnetBlock2D( + in_channels=in_channels, + out_channels=out_channels, + conv_2d_out_channels=conv_2d_out_channels, + temb_channels=temb_channels, + eps=resnet_eps, + groups=groups, + groups_out=groups_out, + dropout=dropout, + non_linearity=resnet_act_fn, + time_embedding_norm="ada_group", + conv_shortcut_bias=False, + ) + ) + attentions.append( + KAttentionBlock( + k_out_channels if (i == num_layers - 1) else out_channels, + k_out_channels // attention_head_dim + if (i == num_layers - 1) + else out_channels // attention_head_dim, + attention_head_dim, + cross_attention_dim=cross_attention_dim, + temb_channels=temb_channels, + attention_bias=True, + add_self_attention=add_self_attention, + cross_attention_norm="layer_norm", + upcast_attention=upcast_attention, + ) + ) + + self.resnets = nn.ModuleList(resnets) + self.attentions = nn.ModuleList(attentions) + + if add_upsample: + self.upsamplers = nn.ModuleList([KUpsample2D()]) + else: + self.upsamplers = None + + self.gradient_checkpointing = False + + def forward( + self, + hidden_states: torch.FloatTensor, + res_hidden_states_tuple: Tuple[torch.FloatTensor, ...], + temb: Optional[torch.FloatTensor] = None, + encoder_hidden_states: Optional[torch.FloatTensor] = None, + cross_attention_kwargs: Optional[Dict[str, Any]] = None, + upsample_size: Optional[int] = None, + attention_mask: Optional[torch.FloatTensor] = None, + encoder_attention_mask: Optional[torch.FloatTensor] = None, + ): + res_hidden_states_tuple = res_hidden_states_tuple[-1] + if res_hidden_states_tuple is not None: + hidden_states = torch.cat([hidden_states, res_hidden_states_tuple], dim=1) + + for resnet, attn in zip(self.resnets, self.attentions): + if self.training and self.gradient_checkpointing: + + def create_custom_forward(module, 
return_dict=None): + def custom_forward(*inputs): + if return_dict is not None: + return module(*inputs, return_dict=return_dict) + else: + return module(*inputs) + + return custom_forward + + ckpt_kwargs: Dict[str, Any] = {"use_reentrant": False} if is_torch_version(">=", "1.11.0") else {} + hidden_states = torch.utils.checkpoint.checkpoint( + create_custom_forward(resnet), + hidden_states, + temb, + **ckpt_kwargs, + ) + hidden_states = torch.utils.checkpoint.checkpoint( + create_custom_forward(attn, return_dict=False), + hidden_states, + encoder_hidden_states, + temb, + attention_mask, + cross_attention_kwargs, + encoder_attention_mask, + **ckpt_kwargs, + )[0] + else: + # Rich-Text: ignore the features + hidden_states, _ = resnet(hidden_states, temb) + hidden_states = attn( + hidden_states, + encoder_hidden_states=encoder_hidden_states, + emb=temb, + attention_mask=attention_mask, + cross_attention_kwargs=cross_attention_kwargs, + encoder_attention_mask=encoder_attention_mask, + ) + + if self.upsamplers is not None: + for upsampler in self.upsamplers: + hidden_states = upsampler(hidden_states) + + return hidden_states + + +# can potentially later be renamed to `No-feed-forward` attention +class KAttentionBlock(nn.Module): + r""" + A basic Transformer block. + + Parameters: + dim (`int`): The number of channels in the input and output. + num_attention_heads (`int`): The number of heads to use for multi-head attention. + attention_head_dim (`int`): The number of channels in each head. + dropout (`float`, *optional*, defaults to 0.0): The dropout probability to use. + cross_attention_dim (`int`, *optional*): The size of the encoder_hidden_states vector for cross attention. + activation_fn (`str`, *optional*, defaults to `"geglu"`): Activation function to be used in feed-forward. + num_embeds_ada_norm (: + obj: `int`, *optional*): The number of diffusion steps used during training. See `Transformer2DModel`. + attention_bias (: + obj: `bool`, *optional*, defaults to `False`): Configure if the attentions should contain a bias parameter. + """ + + def __init__( + self, + dim: int, + num_attention_heads: int, + attention_head_dim: int, + dropout: float = 0.0, + cross_attention_dim: Optional[int] = None, + attention_bias: bool = False, + upcast_attention: bool = False, + temb_channels: int = 768, # for ada_group_norm + add_self_attention: bool = False, + cross_attention_norm: Optional[str] = None, + group_size: int = 32, + ): + super().__init__() + self.add_self_attention = add_self_attention + + # 1. Self-Attn + if add_self_attention: + self.norm1 = AdaGroupNorm(temb_channels, dim, max(1, dim // group_size)) + self.attn1 = Attention( + query_dim=dim, + heads=num_attention_heads, + dim_head=attention_head_dim, + dropout=dropout, + bias=attention_bias, + cross_attention_dim=None, + cross_attention_norm=None, + ) + + # 2. 
Cross-Attn + self.norm2 = AdaGroupNorm(temb_channels, dim, max(1, dim // group_size)) + self.attn2 = Attention( + query_dim=dim, + cross_attention_dim=cross_attention_dim, + heads=num_attention_heads, + dim_head=attention_head_dim, + dropout=dropout, + bias=attention_bias, + upcast_attention=upcast_attention, + cross_attention_norm=cross_attention_norm, + ) + + def _to_3d(self, hidden_states, height, weight): + return hidden_states.permute(0, 2, 3, 1).reshape(hidden_states.shape[0], height * weight, -1) + + def _to_4d(self, hidden_states, height, weight): + return hidden_states.permute(0, 2, 1).reshape(hidden_states.shape[0], -1, height, weight) + + def forward( + self, + hidden_states: torch.FloatTensor, + encoder_hidden_states: Optional[torch.FloatTensor] = None, + # TODO: mark emb as non-optional (self.norm2 requires it). + # requires assessing impact of change to positional param interface. + emb: Optional[torch.FloatTensor] = None, + attention_mask: Optional[torch.FloatTensor] = None, + cross_attention_kwargs: Optional[Dict[str, Any]] = None, + encoder_attention_mask: Optional[torch.FloatTensor] = None, + ): + cross_attention_kwargs = cross_attention_kwargs if cross_attention_kwargs is not None else {} + + # 1. Self-Attention + if self.add_self_attention: + norm_hidden_states = self.norm1(hidden_states, emb) + + height, weight = norm_hidden_states.shape[2:] + norm_hidden_states = self._to_3d(norm_hidden_states, height, weight) + + attn_output = self.attn1( + norm_hidden_states, + encoder_hidden_states=None, + attention_mask=attention_mask, + **cross_attention_kwargs, + ) + attn_output = self._to_4d(attn_output, height, weight) + + hidden_states = attn_output + hidden_states + + # 2. Cross-Attention/None + norm_hidden_states = self.norm2(hidden_states, emb) + + height, weight = norm_hidden_states.shape[2:] + norm_hidden_states = self._to_3d(norm_hidden_states, height, weight) + attn_output = self.attn2( + norm_hidden_states, + encoder_hidden_states=encoder_hidden_states, + attention_mask=attention_mask if encoder_hidden_states is None else encoder_attention_mask, + **cross_attention_kwargs, + ) + attn_output = self._to_4d(attn_output, height, weight) + + hidden_states = attn_output + hidden_states + + return hidden_states diff --git a/gf/sd-webui-rich-text/scripts/models/unet_2d_condition.py b/gf/sd-webui-rich-text/scripts/models/unet_2d_condition.py new file mode 100644 index 0000000000000000000000000000000000000000..8d9acb6c6579f192b78550af768a9bf086a54590 --- /dev/null +++ b/gf/sd-webui-rich-text/scripts/models/unet_2d_condition.py @@ -0,0 +1,983 @@ +# Copyright 2023 The HuggingFace Team. All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
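+# ---------------------------------------------------------------------------------------
+# Illustrative usage sketch (editor's aside, not part of the upstream diffusers file):
+# this module is a lightly modified copy of diffusers' unet_2d_condition wired to the
+# local `scripts.models` block and attention-processor implementations.  Assuming the
+# default config defined below (in_channels=4, cross_attention_dim=1280), a single
+# denoising call looks roughly like:
+#
+#     unet = UNet2DConditionModel()                          # default config
+#     latents = torch.randn(1, 4, 64, 64)                    # noisy sample
+#     t = torch.tensor([10])                                 # timestep
+#     text = torch.randn(1, 77, 1280)                        # stand-in encoder states
+#     out = unet(latents, t, encoder_hidden_states=text).sample
+#
+# The tensor shapes above are placeholders chosen to match the defaults; real callers
+# pass the text-encoder hidden states and a scheduler-provided timestep instead.
+# ---------------------------------------------------------------------------------------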
+from dataclasses import dataclass +from typing import Any, Dict, List, Optional, Tuple, Union + +import torch +import torch.nn as nn +import torch.utils.checkpoint + +from diffusers_official.configuration_utils import ConfigMixin, register_to_config +from diffusers_official.loaders import UNet2DConditionLoadersMixin +from diffusers_official.utils import BaseOutput, logging +from diffusers_official.models.activations import get_activation + +from diffusers_official.models.embeddings import ( + GaussianFourierProjection, + ImageHintTimeEmbedding, + ImageProjection, + ImageTimeEmbedding, + TextImageProjection, + TextImageTimeEmbedding, + TextTimeEmbedding, + TimestepEmbedding, + Timesteps, +) +from diffusers_official.models.modeling_utils import ModelMixin + +from scripts.models.attention_processor import AttentionProcessor, AttnProcessor + +from scripts.models.unet_2d_blocks import ( + CrossAttnDownBlock2D, + CrossAttnUpBlock2D, + DownBlock2D, + UNetMidBlock2DCrossAttn, + UNetMidBlock2DSimpleCrossAttn, + UpBlock2D, + get_down_block, + get_up_block, +) + + +logger = logging.get_logger(__name__) # pylint: disable=invalid-name + + +@dataclass +class UNet2DConditionOutput(BaseOutput): + """ + The output of [`UNet2DConditionModel`]. + + Args: + sample (`torch.FloatTensor` of shape `(batch_size, num_channels, height, width)`): + The hidden states output conditioned on `encoder_hidden_states` input. Output of last layer of model. + """ + + sample: torch.FloatTensor = None + + +class UNet2DConditionModel(ModelMixin, ConfigMixin, UNet2DConditionLoadersMixin): + r""" + A conditional 2D UNet model that takes a noisy sample, conditional state, and a timestep and returns a sample + shaped output. + + This model inherits from [`ModelMixin`]. Check the superclass documentation for it's generic methods implemented + for all models (such as downloading or saving). + + Parameters: + sample_size (`int` or `Tuple[int, int]`, *optional*, defaults to `None`): + Height and width of input/output sample. + in_channels (`int`, *optional*, defaults to 4): Number of channels in the input sample. + out_channels (`int`, *optional*, defaults to 4): Number of channels in the output. + center_input_sample (`bool`, *optional*, defaults to `False`): Whether to center the input sample. + flip_sin_to_cos (`bool`, *optional*, defaults to `False`): + Whether to flip the sin to cos in the time embedding. + freq_shift (`int`, *optional*, defaults to 0): The frequency shift to apply to the time embedding. + down_block_types (`Tuple[str]`, *optional*, defaults to `("CrossAttnDownBlock2D", "CrossAttnDownBlock2D", "CrossAttnDownBlock2D", "DownBlock2D")`): + The tuple of downsample blocks to use. + mid_block_type (`str`, *optional*, defaults to `"UNetMidBlock2DCrossAttn"`): + Block type for middle of UNet, it can be either `UNetMidBlock2DCrossAttn` or + `UNetMidBlock2DSimpleCrossAttn`. If `None`, the mid block layer is skipped. + up_block_types (`Tuple[str]`, *optional*, defaults to `("UpBlock2D", "CrossAttnUpBlock2D", "CrossAttnUpBlock2D", "CrossAttnUpBlock2D")`): + The tuple of upsample blocks to use. + only_cross_attention(`bool` or `Tuple[bool]`, *optional*, default to `False`): + Whether to include self-attention in the basic transformer blocks, see + [`~models.attention.BasicTransformerBlock`]. + block_out_channels (`Tuple[int]`, *optional*, defaults to `(320, 640, 1280, 1280)`): + The tuple of output channels for each block. + layers_per_block (`int`, *optional*, defaults to 2): The number of layers per block. 
+ downsample_padding (`int`, *optional*, defaults to 1): The padding to use for the downsampling convolution. + mid_block_scale_factor (`float`, *optional*, defaults to 1.0): The scale factor to use for the mid block. + act_fn (`str`, *optional*, defaults to `"silu"`): The activation function to use. + norm_num_groups (`int`, *optional*, defaults to 32): The number of groups to use for the normalization. + If `None`, normalization and activation layers is skipped in post-processing. + norm_eps (`float`, *optional*, defaults to 1e-5): The epsilon to use for the normalization. + cross_attention_dim (`int` or `Tuple[int]`, *optional*, defaults to 1280): + The dimension of the cross attention features. + transformer_layers_per_block (`int` or `Tuple[int]`, *optional*, defaults to 1): + The number of transformer blocks of type [`~models.attention.BasicTransformerBlock`]. Only relevant for + [`~models.unet_2d_blocks.CrossAttnDownBlock2D`], [`~models.unet_2d_blocks.CrossAttnUpBlock2D`], + [`~models.unet_2d_blocks.UNetMidBlock2DCrossAttn`]. + encoder_hid_dim (`int`, *optional*, defaults to None): + If `encoder_hid_dim_type` is defined, `encoder_hidden_states` will be projected from `encoder_hid_dim` + dimension to `cross_attention_dim`. + encoder_hid_dim_type (`str`, *optional*, defaults to `None`): + If given, the `encoder_hidden_states` and potentially other embeddings are down-projected to text + embeddings of dimension `cross_attention` according to `encoder_hid_dim_type`. + attention_head_dim (`int`, *optional*, defaults to 8): The dimension of the attention heads. + num_attention_heads (`int`, *optional*): + The number of attention heads. If not defined, defaults to `attention_head_dim` + resnet_time_scale_shift (`str`, *optional*, defaults to `"default"`): Time scale shift config + for ResNet blocks (see [`~models.resnet.ResnetBlock2D`]). Choose from `default` or `scale_shift`. + class_embed_type (`str`, *optional*, defaults to `None`): + The type of class embedding to use which is ultimately summed with the time embeddings. Choose from `None`, + `"timestep"`, `"identity"`, `"projection"`, or `"simple_projection"`. + addition_embed_type (`str`, *optional*, defaults to `None`): + Configures an optional embedding which will be summed with the time embeddings. Choose from `None` or + "text". "text" will use the `TextTimeEmbedding` layer. + addition_time_embed_dim: (`int`, *optional*, defaults to `None`): + Dimension for the timestep embeddings. + num_class_embeds (`int`, *optional*, defaults to `None`): + Input dimension of the learnable embedding matrix to be projected to `time_embed_dim`, when performing + class conditioning with `class_embed_type` equal to `None`. + time_embedding_type (`str`, *optional*, defaults to `positional`): + The type of position embedding to use for timesteps. Choose from `positional` or `fourier`. + time_embedding_dim (`int`, *optional*, defaults to `None`): + An optional override for the dimension of the projected time embedding. + time_embedding_act_fn (`str`, *optional*, defaults to `None`): + Optional activation function to use only once on the time embeddings before they are passed to the rest of + the UNet. Choose from `silu`, `mish`, `gelu`, and `swish`. + timestep_post_act (`str`, *optional*, defaults to `None`): + The second activation function to use in timestep embedding. Choose from `silu`, `mish` and `gelu`. + time_cond_proj_dim (`int`, *optional*, defaults to `None`): + The dimension of `cond_proj` layer in the timestep embedding. 
+ conv_in_kernel (`int`, *optional*, default to `3`): The kernel size of `conv_in` layer. + conv_out_kernel (`int`, *optional*, default to `3`): The kernel size of `conv_out` layer. + projection_class_embeddings_input_dim (`int`, *optional*): The dimension of the `class_labels` input when + `class_embed_type="projection"`. Required when `class_embed_type="projection"`. + class_embeddings_concat (`bool`, *optional*, defaults to `False`): Whether to concatenate the time + embeddings with the class embeddings. + mid_block_only_cross_attention (`bool`, *optional*, defaults to `None`): + Whether to use cross attention with the mid block when using the `UNetMidBlock2DSimpleCrossAttn`. If + `only_cross_attention` is given as a single boolean and `mid_block_only_cross_attention` is `None`, the + `only_cross_attention` value is used as the value for `mid_block_only_cross_attention`. Default to `False` + otherwise. + """ + + _supports_gradient_checkpointing = True + + @register_to_config + def __init__( + self, + sample_size: Optional[int] = None, + in_channels: int = 4, + out_channels: int = 4, + center_input_sample: bool = False, + flip_sin_to_cos: bool = True, + freq_shift: int = 0, + down_block_types: Tuple[str] = ( + "CrossAttnDownBlock2D", + "CrossAttnDownBlock2D", + "CrossAttnDownBlock2D", + "DownBlock2D", + ), + mid_block_type: Optional[str] = "UNetMidBlock2DCrossAttn", + up_block_types: Tuple[str] = ("UpBlock2D", "CrossAttnUpBlock2D", "CrossAttnUpBlock2D", "CrossAttnUpBlock2D"), + only_cross_attention: Union[bool, Tuple[bool]] = False, + block_out_channels: Tuple[int] = (320, 640, 1280, 1280), + layers_per_block: Union[int, Tuple[int]] = 2, + downsample_padding: int = 1, + mid_block_scale_factor: float = 1, + act_fn: str = "silu", + norm_num_groups: Optional[int] = 32, + norm_eps: float = 1e-5, + cross_attention_dim: Union[int, Tuple[int]] = 1280, + transformer_layers_per_block: Union[int, Tuple[int]] = 1, + encoder_hid_dim: Optional[int] = None, + encoder_hid_dim_type: Optional[str] = None, + attention_head_dim: Union[int, Tuple[int]] = 8, + num_attention_heads: Optional[Union[int, Tuple[int]]] = None, + dual_cross_attention: bool = False, + use_linear_projection: bool = False, + class_embed_type: Optional[str] = None, + addition_embed_type: Optional[str] = None, + addition_time_embed_dim: Optional[int] = None, + num_class_embeds: Optional[int] = None, + upcast_attention: bool = False, + resnet_time_scale_shift: str = "default", + resnet_skip_time_act: bool = False, + resnet_out_scale_factor: int = 1.0, + time_embedding_type: str = "positional", + time_embedding_dim: Optional[int] = None, + time_embedding_act_fn: Optional[str] = None, + timestep_post_act: Optional[str] = None, + time_cond_proj_dim: Optional[int] = None, + conv_in_kernel: int = 3, + conv_out_kernel: int = 3, + projection_class_embeddings_input_dim: Optional[int] = None, + class_embeddings_concat: bool = False, + mid_block_only_cross_attention: Optional[bool] = None, + cross_attention_norm: Optional[str] = None, + addition_embed_type_num_heads=64, + ): + super().__init__() + + self.sample_size = sample_size + + if num_attention_heads is not None: + raise ValueError( + "At the moment it is not possible to define the number of attention heads via `num_attention_heads` because of a naming issue as described in https://github.com/huggingface/diffusers/issues/2011#issuecomment-1547958131. Passing `num_attention_heads` will only be supported in diffusers v0.19." 
+ ) + + # If `num_attention_heads` is not defined (which is the case for most models) + # it will default to `attention_head_dim`. This looks weird upon first reading it and it is. + # The reason for this behavior is to correct for incorrectly named variables that were introduced + # when this library was created. The incorrect naming was only discovered much later in https://github.com/huggingface/diffusers/issues/2011#issuecomment-1547958131 + # Changing `attention_head_dim` to `num_attention_heads` for 40,000+ configurations is too backwards breaking + # which is why we correct for the naming here. + num_attention_heads = num_attention_heads or attention_head_dim + + # Check inputs + if len(down_block_types) != len(up_block_types): + raise ValueError( + f"Must provide the same number of `down_block_types` as `up_block_types`. `down_block_types`: {down_block_types}. `up_block_types`: {up_block_types}." + ) + + if len(block_out_channels) != len(down_block_types): + raise ValueError( + f"Must provide the same number of `block_out_channels` as `down_block_types`. `block_out_channels`: {block_out_channels}. `down_block_types`: {down_block_types}." + ) + + if not isinstance(only_cross_attention, bool) and len(only_cross_attention) != len(down_block_types): + raise ValueError( + f"Must provide the same number of `only_cross_attention` as `down_block_types`. `only_cross_attention`: {only_cross_attention}. `down_block_types`: {down_block_types}." + ) + + if not isinstance(num_attention_heads, int) and len(num_attention_heads) != len(down_block_types): + raise ValueError( + f"Must provide the same number of `num_attention_heads` as `down_block_types`. `num_attention_heads`: {num_attention_heads}. `down_block_types`: {down_block_types}." + ) + + if not isinstance(attention_head_dim, int) and len(attention_head_dim) != len(down_block_types): + raise ValueError( + f"Must provide the same number of `attention_head_dim` as `down_block_types`. `attention_head_dim`: {attention_head_dim}. `down_block_types`: {down_block_types}." + ) + + if isinstance(cross_attention_dim, list) and len(cross_attention_dim) != len(down_block_types): + raise ValueError( + f"Must provide the same number of `cross_attention_dim` as `down_block_types`. `cross_attention_dim`: {cross_attention_dim}. `down_block_types`: {down_block_types}." + ) + + if not isinstance(layers_per_block, int) and len(layers_per_block) != len(down_block_types): + raise ValueError( + f"Must provide the same number of `layers_per_block` as `down_block_types`. `layers_per_block`: {layers_per_block}. `down_block_types`: {down_block_types}." + ) + + # input + conv_in_padding = (conv_in_kernel - 1) // 2 + self.conv_in = nn.Conv2d( + in_channels, block_out_channels[0], kernel_size=conv_in_kernel, padding=conv_in_padding + ) + + # time + if time_embedding_type == "fourier": + time_embed_dim = time_embedding_dim or block_out_channels[0] * 2 + if time_embed_dim % 2 != 0: + raise ValueError(f"`time_embed_dim` should be divisible by 2, but is {time_embed_dim}.") + self.time_proj = GaussianFourierProjection( + time_embed_dim // 2, set_W_to_weight=False, log=False, flip_sin_to_cos=flip_sin_to_cos + ) + timestep_input_dim = time_embed_dim + elif time_embedding_type == "positional": + time_embed_dim = time_embedding_dim or block_out_channels[0] * 4 + + self.time_proj = Timesteps(block_out_channels[0], flip_sin_to_cos, freq_shift) + timestep_input_dim = block_out_channels[0] + else: + raise ValueError( + f"{time_embedding_type} does not exist. 
Please make sure to use one of `fourier` or `positional`." + ) + + self.time_embedding = TimestepEmbedding( + timestep_input_dim, + time_embed_dim, + act_fn=act_fn, + post_act_fn=timestep_post_act, + cond_proj_dim=time_cond_proj_dim, + ) + + if encoder_hid_dim_type is None and encoder_hid_dim is not None: + encoder_hid_dim_type = "text_proj" + self.register_to_config(encoder_hid_dim_type=encoder_hid_dim_type) + logger.info("encoder_hid_dim_type defaults to 'text_proj' as `encoder_hid_dim` is defined.") + + if encoder_hid_dim is None and encoder_hid_dim_type is not None: + raise ValueError( + f"`encoder_hid_dim` has to be defined when `encoder_hid_dim_type` is set to {encoder_hid_dim_type}." + ) + + if encoder_hid_dim_type == "text_proj": + self.encoder_hid_proj = nn.Linear(encoder_hid_dim, cross_attention_dim) + elif encoder_hid_dim_type == "text_image_proj": + # image_embed_dim DOESN'T have to be `cross_attention_dim`. To not clutter the __init__ too much + # they are set to `cross_attention_dim` here as this is exactly the required dimension for the currently only use + # case when `addition_embed_type == "text_image_proj"` (Kadinsky 2.1)` + self.encoder_hid_proj = TextImageProjection( + text_embed_dim=encoder_hid_dim, + image_embed_dim=cross_attention_dim, + cross_attention_dim=cross_attention_dim, + ) + elif encoder_hid_dim_type == "image_proj": + # Kandinsky 2.2 + self.encoder_hid_proj = ImageProjection( + image_embed_dim=encoder_hid_dim, + cross_attention_dim=cross_attention_dim, + ) + elif encoder_hid_dim_type is not None: + raise ValueError( + f"encoder_hid_dim_type: {encoder_hid_dim_type} must be None, 'text_proj' or 'text_image_proj'." + ) + else: + self.encoder_hid_proj = None + + # class embedding + if class_embed_type is None and num_class_embeds is not None: + self.class_embedding = nn.Embedding(num_class_embeds, time_embed_dim) + elif class_embed_type == "timestep": + self.class_embedding = TimestepEmbedding(timestep_input_dim, time_embed_dim, act_fn=act_fn) + elif class_embed_type == "identity": + self.class_embedding = nn.Identity(time_embed_dim, time_embed_dim) + elif class_embed_type == "projection": + if projection_class_embeddings_input_dim is None: + raise ValueError( + "`class_embed_type`: 'projection' requires `projection_class_embeddings_input_dim` be set" + ) + # The projection `class_embed_type` is the same as the timestep `class_embed_type` except + # 1. the `class_labels` inputs are not first converted to sinusoidal embeddings + # 2. it projects from an arbitrary input dimension. + # + # Note that `TimestepEmbedding` is quite general, being mainly linear layers and activations. + # When used for embedding actual timesteps, the timesteps are first converted to sinusoidal embeddings. + # As a result, `TimestepEmbedding` can be passed arbitrary vectors. 
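+            # Illustrative example (editor's comment): with class_embed_type="projection"
+            # and, say, projection_class_embeddings_input_dim=512, `class_labels` is
+            # expected to be a float tensor of shape (batch, 512) that the MLP below maps
+            # directly to (batch, time_embed_dim); unlike the "timestep" case above, no
+            # sinusoidal encoding is applied first.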
+ self.class_embedding = TimestepEmbedding(projection_class_embeddings_input_dim, time_embed_dim) + elif class_embed_type == "simple_projection": + if projection_class_embeddings_input_dim is None: + raise ValueError( + "`class_embed_type`: 'simple_projection' requires `projection_class_embeddings_input_dim` be set" + ) + self.class_embedding = nn.Linear(projection_class_embeddings_input_dim, time_embed_dim) + else: + self.class_embedding = None + + if addition_embed_type == "text": + if encoder_hid_dim is not None: + text_time_embedding_from_dim = encoder_hid_dim + else: + text_time_embedding_from_dim = cross_attention_dim + + self.add_embedding = TextTimeEmbedding( + text_time_embedding_from_dim, time_embed_dim, num_heads=addition_embed_type_num_heads + ) + elif addition_embed_type == "text_image": + # text_embed_dim and image_embed_dim DON'T have to be `cross_attention_dim`. To not clutter the __init__ too much + # they are set to `cross_attention_dim` here as this is exactly the required dimension for the currently only use + # case when `addition_embed_type == "text_image"` (Kadinsky 2.1)` + self.add_embedding = TextImageTimeEmbedding( + text_embed_dim=cross_attention_dim, image_embed_dim=cross_attention_dim, time_embed_dim=time_embed_dim + ) + elif addition_embed_type == "text_time": + self.add_time_proj = Timesteps(addition_time_embed_dim, flip_sin_to_cos, freq_shift) + self.add_embedding = TimestepEmbedding(projection_class_embeddings_input_dim, time_embed_dim) + elif addition_embed_type == "image": + # Kandinsky 2.2 + self.add_embedding = ImageTimeEmbedding(image_embed_dim=encoder_hid_dim, time_embed_dim=time_embed_dim) + elif addition_embed_type == "image_hint": + # Kandinsky 2.2 ControlNet + self.add_embedding = ImageHintTimeEmbedding(image_embed_dim=encoder_hid_dim, time_embed_dim=time_embed_dim) + elif addition_embed_type is not None: + raise ValueError(f"addition_embed_type: {addition_embed_type} must be None, 'text' or 'text_image'.") + + if time_embedding_act_fn is None: + self.time_embed_act = None + else: + self.time_embed_act = get_activation(time_embedding_act_fn) + + self.down_blocks = nn.ModuleList([]) + self.up_blocks = nn.ModuleList([]) + + if isinstance(only_cross_attention, bool): + if mid_block_only_cross_attention is None: + mid_block_only_cross_attention = only_cross_attention + + only_cross_attention = [only_cross_attention] * len(down_block_types) + + if mid_block_only_cross_attention is None: + mid_block_only_cross_attention = False + + if isinstance(num_attention_heads, int): + num_attention_heads = (num_attention_heads,) * len(down_block_types) + + if isinstance(attention_head_dim, int): + attention_head_dim = (attention_head_dim,) * len(down_block_types) + + if isinstance(cross_attention_dim, int): + cross_attention_dim = (cross_attention_dim,) * len(down_block_types) + + if isinstance(layers_per_block, int): + layers_per_block = [layers_per_block] * len(down_block_types) + + if isinstance(transformer_layers_per_block, int): + transformer_layers_per_block = [transformer_layers_per_block] * len(down_block_types) + + if class_embeddings_concat: + # The time embeddings are concatenated with the class embeddings. 
The dimension of the + # time embeddings passed to the down, middle, and up blocks is twice the dimension of the + # regular time embeddings + blocks_time_embed_dim = time_embed_dim * 2 + else: + blocks_time_embed_dim = time_embed_dim + + # down + output_channel = block_out_channels[0] + for i, down_block_type in enumerate(down_block_types): + input_channel = output_channel + output_channel = block_out_channels[i] + is_final_block = i == len(block_out_channels) - 1 + + down_block = get_down_block( + down_block_type, + num_layers=layers_per_block[i], + transformer_layers_per_block=transformer_layers_per_block[i], + in_channels=input_channel, + out_channels=output_channel, + temb_channels=blocks_time_embed_dim, + add_downsample=not is_final_block, + resnet_eps=norm_eps, + resnet_act_fn=act_fn, + resnet_groups=norm_num_groups, + cross_attention_dim=cross_attention_dim[i], + num_attention_heads=num_attention_heads[i], + downsample_padding=downsample_padding, + dual_cross_attention=dual_cross_attention, + use_linear_projection=use_linear_projection, + only_cross_attention=only_cross_attention[i], + upcast_attention=upcast_attention, + resnet_time_scale_shift=resnet_time_scale_shift, + resnet_skip_time_act=resnet_skip_time_act, + resnet_out_scale_factor=resnet_out_scale_factor, + cross_attention_norm=cross_attention_norm, + attention_head_dim=attention_head_dim[i] if attention_head_dim[i] is not None else output_channel, + ) + self.down_blocks.append(down_block) + + # mid + if mid_block_type == "UNetMidBlock2DCrossAttn": + self.mid_block = UNetMidBlock2DCrossAttn( + transformer_layers_per_block=transformer_layers_per_block[-1], + in_channels=block_out_channels[-1], + temb_channels=blocks_time_embed_dim, + resnet_eps=norm_eps, + resnet_act_fn=act_fn, + output_scale_factor=mid_block_scale_factor, + resnet_time_scale_shift=resnet_time_scale_shift, + cross_attention_dim=cross_attention_dim[-1], + num_attention_heads=num_attention_heads[-1], + resnet_groups=norm_num_groups, + dual_cross_attention=dual_cross_attention, + use_linear_projection=use_linear_projection, + upcast_attention=upcast_attention, + ) + elif mid_block_type == "UNetMidBlock2DSimpleCrossAttn": + self.mid_block = UNetMidBlock2DSimpleCrossAttn( + in_channels=block_out_channels[-1], + temb_channels=blocks_time_embed_dim, + resnet_eps=norm_eps, + resnet_act_fn=act_fn, + output_scale_factor=mid_block_scale_factor, + cross_attention_dim=cross_attention_dim[-1], + attention_head_dim=attention_head_dim[-1], + resnet_groups=norm_num_groups, + resnet_time_scale_shift=resnet_time_scale_shift, + skip_time_act=resnet_skip_time_act, + only_cross_attention=mid_block_only_cross_attention, + cross_attention_norm=cross_attention_norm, + ) + elif mid_block_type is None: + self.mid_block = None + else: + raise ValueError(f"unknown mid_block_type : {mid_block_type}") + + # count how many layers upsample the images + self.num_upsamplers = 0 + + # up + reversed_block_out_channels = list(reversed(block_out_channels)) + reversed_num_attention_heads = list(reversed(num_attention_heads)) + reversed_layers_per_block = list(reversed(layers_per_block)) + reversed_cross_attention_dim = list(reversed(cross_attention_dim)) + reversed_transformer_layers_per_block = list(reversed(transformer_layers_per_block)) + only_cross_attention = list(reversed(only_cross_attention)) + + output_channel = reversed_block_out_channels[0] + for i, up_block_type in enumerate(up_block_types): + is_final_block = i == len(block_out_channels) - 1 + + prev_output_channel = output_channel + 
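+            # Channel bookkeeping, worked through for the default config (editor's
+            # comment): block_out_channels=(320, 640, 1280, 1280) reverses to
+            # (1280, 1280, 640, 320), so i=0 gets prev/out/in channels of 1280/1280/1280,
+            # i=2 gets 1280/640/320, and the final block (i=3) gets 640/320/320 with no
+            # upsampler attached.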
output_channel = reversed_block_out_channels[i] + input_channel = reversed_block_out_channels[min(i + 1, len(block_out_channels) - 1)] + + # add upsample block for all BUT final layer + if not is_final_block: + add_upsample = True + self.num_upsamplers += 1 + else: + add_upsample = False + + up_block = get_up_block( + up_block_type, + num_layers=reversed_layers_per_block[i] + 1, + transformer_layers_per_block=reversed_transformer_layers_per_block[i], + in_channels=input_channel, + out_channels=output_channel, + prev_output_channel=prev_output_channel, + temb_channels=blocks_time_embed_dim, + add_upsample=add_upsample, + resnet_eps=norm_eps, + resnet_act_fn=act_fn, + resnet_groups=norm_num_groups, + cross_attention_dim=reversed_cross_attention_dim[i], + num_attention_heads=reversed_num_attention_heads[i], + dual_cross_attention=dual_cross_attention, + use_linear_projection=use_linear_projection, + only_cross_attention=only_cross_attention[i], + upcast_attention=upcast_attention, + resnet_time_scale_shift=resnet_time_scale_shift, + resnet_skip_time_act=resnet_skip_time_act, + resnet_out_scale_factor=resnet_out_scale_factor, + cross_attention_norm=cross_attention_norm, + attention_head_dim=attention_head_dim[i] if attention_head_dim[i] is not None else output_channel, + ) + self.up_blocks.append(up_block) + prev_output_channel = output_channel + + # out + if norm_num_groups is not None: + self.conv_norm_out = nn.GroupNorm( + num_channels=block_out_channels[0], num_groups=norm_num_groups, eps=norm_eps + ) + + self.conv_act = get_activation(act_fn) + + else: + self.conv_norm_out = None + self.conv_act = None + + conv_out_padding = (conv_out_kernel - 1) // 2 + self.conv_out = nn.Conv2d( + block_out_channels[0], out_channels, kernel_size=conv_out_kernel, padding=conv_out_padding + ) + + @property + def attn_processors(self) -> Dict[str, AttentionProcessor]: + r""" + Returns: + `dict` of attention processors: A dictionary containing all attention processors used in the model with + indexed by its weight name. + """ + # set recursively + processors = {} + + def fn_recursive_add_processors(name: str, module: torch.nn.Module, processors: Dict[str, AttentionProcessor]): + if hasattr(module, "set_processor"): + processors[f"{name}.processor"] = module.processor + + for sub_name, child in module.named_children(): + fn_recursive_add_processors(f"{name}.{sub_name}", child, processors) + + return processors + + for name, module in self.named_children(): + fn_recursive_add_processors(name, module, processors) + + return processors + + def set_attn_processor(self, processor: Union[AttentionProcessor, Dict[str, AttentionProcessor]]): + r""" + Sets the attention processor to use to compute attention. + + Parameters: + processor (`dict` of `AttentionProcessor` or only `AttentionProcessor`): + The instantiated processor class or a dictionary of processor classes that will be set as the processor + for **all** `Attention` layers. + + If `processor` is a dict, the key needs to define the path to the corresponding cross attention + processor. This is strongly recommended when setting trainable attention processors. + + """ + count = len(self.attn_processors.keys()) + + if isinstance(processor, dict) and len(processor) != count: + raise ValueError( + f"A dict of processors was passed, but the number of processors {len(processor)} does not match the" + f" number of attention layers: {count}. Please make sure to pass {count} processor classes." 
+ ) + + def fn_recursive_attn_processor(name: str, module: torch.nn.Module, processor): + if hasattr(module, "set_processor"): + if not isinstance(processor, dict): + module.set_processor(processor) + else: + module.set_processor(processor.pop(f"{name}.processor")) + + for sub_name, child in module.named_children(): + fn_recursive_attn_processor(f"{name}.{sub_name}", child, processor) + + for name, module in self.named_children(): + fn_recursive_attn_processor(name, module, processor) + + def set_default_attn_processor(self): + """ + Disables custom attention processors and sets the default attention implementation. + """ + self.set_attn_processor(AttnProcessor()) + + def set_attention_slice(self, slice_size): + r""" + Enable sliced attention computation. + + When this option is enabled, the attention module splits the input tensor in slices to compute attention in + several steps. This is useful for saving some memory in exchange for a small decrease in speed. + + Args: + slice_size (`str` or `int` or `list(int)`, *optional*, defaults to `"auto"`): + When `"auto"`, input to the attention heads is halved, so attention is computed in two steps. If + `"max"`, maximum amount of memory is saved by running only one slice at a time. If a number is + provided, uses as many slices as `attention_head_dim // slice_size`. In this case, `attention_head_dim` + must be a multiple of `slice_size`. + """ + sliceable_head_dims = [] + + def fn_recursive_retrieve_sliceable_dims(module: torch.nn.Module): + if hasattr(module, "set_attention_slice"): + sliceable_head_dims.append(module.sliceable_head_dim) + + for child in module.children(): + fn_recursive_retrieve_sliceable_dims(child) + + # retrieve number of attention layers + for module in self.children(): + fn_recursive_retrieve_sliceable_dims(module) + + num_sliceable_layers = len(sliceable_head_dims) + + if slice_size == "auto": + # half the attention head size is usually a good trade-off between + # speed and memory + slice_size = [dim // 2 for dim in sliceable_head_dims] + elif slice_size == "max": + # make smallest slice possible + slice_size = num_sliceable_layers * [1] + + slice_size = num_sliceable_layers * [slice_size] if not isinstance(slice_size, list) else slice_size + + if len(slice_size) != len(sliceable_head_dims): + raise ValueError( + f"You have provided {len(slice_size)}, but {self.config} has {len(sliceable_head_dims)} different" + f" attention layers. Make sure to match `len(slice_size)` to be {len(sliceable_head_dims)}." + ) + + for i in range(len(slice_size)): + size = slice_size[i] + dim = sliceable_head_dims[i] + if size is not None and size > dim: + raise ValueError(f"size {size} has to be smaller or equal to {dim}.") + + # Recursively walk through all the children. 
+ # Any children which exposes the set_attention_slice method + # gets the message + def fn_recursive_set_attention_slice(module: torch.nn.Module, slice_size: List[int]): + if hasattr(module, "set_attention_slice"): + module.set_attention_slice(slice_size.pop()) + + for child in module.children(): + fn_recursive_set_attention_slice(child, slice_size) + + reversed_slice_size = list(reversed(slice_size)) + for module in self.children(): + fn_recursive_set_attention_slice(module, reversed_slice_size) + + def _set_gradient_checkpointing(self, module, value=False): + if isinstance(module, (CrossAttnDownBlock2D, DownBlock2D, CrossAttnUpBlock2D, UpBlock2D)): + module.gradient_checkpointing = value + + def forward( + self, + sample: torch.FloatTensor, + timestep: Union[torch.Tensor, float, int], + encoder_hidden_states: torch.Tensor, + class_labels: Optional[torch.Tensor] = None, + timestep_cond: Optional[torch.Tensor] = None, + attention_mask: Optional[torch.Tensor] = None, + cross_attention_kwargs: Optional[Dict[str, Any]] = None, + added_cond_kwargs: Optional[Dict[str, torch.Tensor]] = None, + down_block_additional_residuals: Optional[Tuple[torch.Tensor]] = None, + mid_block_additional_residual: Optional[torch.Tensor] = None, + encoder_attention_mask: Optional[torch.Tensor] = None, + return_dict: bool = True, + ) -> Union[UNet2DConditionOutput, Tuple]: + r""" + The [`UNet2DConditionModel`] forward method. + + Args: + sample (`torch.FloatTensor`): + The noisy input tensor with the following shape `(batch, channel, height, width)`. + timestep (`torch.FloatTensor` or `float` or `int`): The number of timesteps to denoise an input. + encoder_hidden_states (`torch.FloatTensor`): + The encoder hidden states with shape `(batch, sequence_length, feature_dim)`. + encoder_attention_mask (`torch.Tensor`): + A cross-attention mask of shape `(batch, sequence_length)` is applied to `encoder_hidden_states`. If + `True` the mask is kept, otherwise if `False` it is discarded. Mask will be converted into a bias, + which adds large negative values to the attention scores corresponding to "discard" tokens. + return_dict (`bool`, *optional*, defaults to `True`): + Whether or not to return a [`~models.unet_2d_condition.UNet2DConditionOutput`] instead of a plain + tuple. + cross_attention_kwargs (`dict`, *optional*): + A kwargs dictionary that if specified is passed along to the [`AttnProcessor`]. + added_cond_kwargs: (`dict`, *optional*): + A kwargs dictionary containin additional embeddings that if specified are added to the embeddings that + are passed along to the UNet blocks. + + Returns: + [`~models.unet_2d_condition.UNet2DConditionOutput`] or `tuple`: + If `return_dict` is True, an [`~models.unet_2d_condition.UNet2DConditionOutput`] is returned, otherwise + a `tuple` is returned where the first element is the sample tensor. + """ + # By default samples have to be AT least a multiple of the overall upsampling factor. + # The overall upsampling factor is equal to 2 ** (# num of upsampling layers). + # However, the upsampling interpolation output size can be forced to fit any upsampling size + # on the fly if necessary. 
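+        # Worked example (editor's comment): with the default four-block configuration,
+        # three of the up blocks add an upsampler, so default_overall_up_factor is
+        # 2**3 = 8.  A 64x64 latent therefore needs no special handling, while e.g. a
+        # 65x64 latent would set forward_upsample_size below so the decoder explicitly
+        # resizes to match the skip-connection shapes.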
+ default_overall_up_factor = 2**self.num_upsamplers + + # upsample size should be forwarded when sample is not a multiple of `default_overall_up_factor` + forward_upsample_size = False + upsample_size = None + + if any(s % default_overall_up_factor != 0 for s in sample.shape[-2:]): + logger.info("Forward upsample size to force interpolation output size.") + forward_upsample_size = True + + # ensure attention_mask is a bias, and give it a singleton query_tokens dimension + # expects mask of shape: + # [batch, key_tokens] + # adds singleton query_tokens dimension: + # [batch, 1, key_tokens] + # this helps to broadcast it as a bias over attention scores, which will be in one of the following shapes: + # [batch, heads, query_tokens, key_tokens] (e.g. torch sdp attn) + # [batch * heads, query_tokens, key_tokens] (e.g. xformers or classic attn) + if attention_mask is not None: + # assume that mask is expressed as: + # (1 = keep, 0 = discard) + # convert mask into a bias that can be added to attention scores: + # (keep = +0, discard = -10000.0) + attention_mask = (1 - attention_mask.to(sample.dtype)) * -10000.0 + attention_mask = attention_mask.unsqueeze(1) + + # convert encoder_attention_mask to a bias the same way we do for attention_mask + if encoder_attention_mask is not None: + encoder_attention_mask = (1 - encoder_attention_mask.to(sample.dtype)) * -10000.0 + encoder_attention_mask = encoder_attention_mask.unsqueeze(1) + + # 0. center input if necessary + if self.config.center_input_sample: + sample = 2 * sample - 1.0 + + # 1. time + timesteps = timestep + if not torch.is_tensor(timesteps): + # TODO: this requires sync between CPU and GPU. So try to pass timesteps as tensors if you can + # This would be a good case for the `match` statement (Python 3.10+) + is_mps = sample.device.type == "mps" + if isinstance(timestep, float): + dtype = torch.float32 if is_mps else torch.float64 + else: + dtype = torch.int32 if is_mps else torch.int64 + timesteps = torch.tensor([timesteps], dtype=dtype, device=sample.device) + elif len(timesteps.shape) == 0: + timesteps = timesteps[None].to(sample.device) + + # broadcast to batch dimension in a way that's compatible with ONNX/Core ML + timesteps = timesteps.expand(sample.shape[0]) + + t_emb = self.time_proj(timesteps) + + # `Timesteps` does not contain any weights and will always return f32 tensors + # but time_embedding might actually be running in fp16. so we need to cast here. + # there might be better ways to encapsulate this. + t_emb = t_emb.to(dtype=sample.dtype) + + emb = self.time_embedding(t_emb, timestep_cond) + aug_emb = None + + if self.class_embedding is not None: + if class_labels is None: + raise ValueError("class_labels should be provided when num_class_embeds > 0") + + if self.config.class_embed_type == "timestep": + class_labels = self.time_proj(class_labels) + + # `Timesteps` does not contain any weights and will always return f32 tensors + # there might be better ways to encapsulate this. 
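+                # (editor's comment) e.g. when the model weights run in fp16, the
+                # sinusoidal projection above still returns float32, so the cast below
+                # keeps `class_labels` in the same dtype as `sample` before the class
+                # embedding is added to (or concatenated with) the time embedding.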
+ class_labels = class_labels.to(dtype=sample.dtype) + + class_emb = self.class_embedding(class_labels).to(dtype=sample.dtype) + + if self.config.class_embeddings_concat: + emb = torch.cat([emb, class_emb], dim=-1) + else: + emb = emb + class_emb + + if self.config.addition_embed_type == "text": + aug_emb = self.add_embedding(encoder_hidden_states) + elif self.config.addition_embed_type == "text_image": + # Kandinsky 2.1 - style + if "image_embeds" not in added_cond_kwargs: + raise ValueError( + f"{self.__class__} has the config param `addition_embed_type` set to 'text_image' which requires the keyword argument `image_embeds` to be passed in `added_cond_kwargs`" + ) + + image_embs = added_cond_kwargs.get("image_embeds") + text_embs = added_cond_kwargs.get("text_embeds", encoder_hidden_states) + aug_emb = self.add_embedding(text_embs, image_embs) + elif self.config.addition_embed_type == "text_time": + if "text_embeds" not in added_cond_kwargs: + raise ValueError( + f"{self.__class__} has the config param `addition_embed_type` set to 'text_time' which requires the keyword argument `text_embeds` to be passed in `added_cond_kwargs`" + ) + text_embeds = added_cond_kwargs.get("text_embeds") + if "time_ids" not in added_cond_kwargs: + raise ValueError( + f"{self.__class__} has the config param `addition_embed_type` set to 'text_time' which requires the keyword argument `time_ids` to be passed in `added_cond_kwargs`" + ) + time_ids = added_cond_kwargs.get("time_ids") + time_embeds = self.add_time_proj(time_ids.flatten()) + time_embeds = time_embeds.reshape((text_embeds.shape[0], -1)) + + add_embeds = torch.concat([text_embeds, time_embeds], dim=-1) + add_embeds = add_embeds.to(emb.dtype) + aug_emb = self.add_embedding(add_embeds) + elif self.config.addition_embed_type == "image": + # Kandinsky 2.2 - style + if "image_embeds" not in added_cond_kwargs: + raise ValueError( + f"{self.__class__} has the config param `addition_embed_type` set to 'image' which requires the keyword argument `image_embeds` to be passed in `added_cond_kwargs`" + ) + image_embs = added_cond_kwargs.get("image_embeds") + aug_emb = self.add_embedding(image_embs) + elif self.config.addition_embed_type == "image_hint": + # Kandinsky 2.2 - style + if "image_embeds" not in added_cond_kwargs or "hint" not in added_cond_kwargs: + raise ValueError( + f"{self.__class__} has the config param `addition_embed_type` set to 'image_hint' which requires the keyword arguments `image_embeds` and `hint` to be passed in `added_cond_kwargs`" + ) + image_embs = added_cond_kwargs.get("image_embeds") + hint = added_cond_kwargs.get("hint") + aug_emb, hint = self.add_embedding(image_embs, hint) + sample = torch.cat([sample, hint], dim=1) + + emb = emb + aug_emb if aug_emb is not None else emb + + if self.time_embed_act is not None: + emb = self.time_embed_act(emb) + + if self.encoder_hid_proj is not None and self.config.encoder_hid_dim_type == "text_proj": + encoder_hidden_states = self.encoder_hid_proj(encoder_hidden_states) + elif self.encoder_hid_proj is not None and self.config.encoder_hid_dim_type == "text_image_proj": + # Kadinsky 2.1 - style + if "image_embeds" not in added_cond_kwargs: + raise ValueError( + f"{self.__class__} has the config param `encoder_hid_dim_type` set to 'text_image_proj' which requires the keyword argument `image_embeds` to be passed in `added_conditions`" + ) + + image_embeds = added_cond_kwargs.get("image_embeds") + encoder_hidden_states = self.encoder_hid_proj(encoder_hidden_states, image_embeds) + elif 
self.encoder_hid_proj is not None and self.config.encoder_hid_dim_type == "image_proj": + # Kandinsky 2.2 - style + if "image_embeds" not in added_cond_kwargs: + raise ValueError( + f"{self.__class__} has the config param `encoder_hid_dim_type` set to 'image_proj' which requires the keyword argument `image_embeds` to be passed in `added_conditions`" + ) + image_embeds = added_cond_kwargs.get("image_embeds") + encoder_hidden_states = self.encoder_hid_proj(image_embeds) + # 2. pre-process + sample = self.conv_in(sample) + + # 3. down + down_block_res_samples = (sample,) + for downsample_block in self.down_blocks: + if hasattr(downsample_block, "has_cross_attention") and downsample_block.has_cross_attention: + sample, res_samples = downsample_block( + hidden_states=sample, + temb=emb, + encoder_hidden_states=encoder_hidden_states, + attention_mask=attention_mask, + cross_attention_kwargs=cross_attention_kwargs, + encoder_attention_mask=encoder_attention_mask, + ) + else: + sample, res_samples = downsample_block(hidden_states=sample, temb=emb) + + down_block_res_samples += res_samples + + if down_block_additional_residuals is not None: + new_down_block_res_samples = () + + for down_block_res_sample, down_block_additional_residual in zip( + down_block_res_samples, down_block_additional_residuals + ): + down_block_res_sample = down_block_res_sample + down_block_additional_residual + new_down_block_res_samples = new_down_block_res_samples + (down_block_res_sample,) + + down_block_res_samples = new_down_block_res_samples + + # 4. mid + if self.mid_block is not None: + sample = self.mid_block( + sample, + emb, + encoder_hidden_states=encoder_hidden_states, + attention_mask=attention_mask, + cross_attention_kwargs=cross_attention_kwargs, + encoder_attention_mask=encoder_attention_mask, + ) + + if mid_block_additional_residual is not None: + sample = sample + mid_block_additional_residual + + # 5. up + for i, upsample_block in enumerate(self.up_blocks): + is_final_block = i == len(self.up_blocks) - 1 + + res_samples = down_block_res_samples[-len(upsample_block.resnets) :] + down_block_res_samples = down_block_res_samples[: -len(upsample_block.resnets)] + + # if we have not reached the final block and need to forward the + # upsample size, we do it here + if not is_final_block and forward_upsample_size: + upsample_size = down_block_res_samples[-1].shape[2:] + + if hasattr(upsample_block, "has_cross_attention") and upsample_block.has_cross_attention: + sample = upsample_block( + hidden_states=sample, + temb=emb, + res_hidden_states_tuple=res_samples, + encoder_hidden_states=encoder_hidden_states, + cross_attention_kwargs=cross_attention_kwargs, + upsample_size=upsample_size, + attention_mask=attention_mask, + encoder_attention_mask=encoder_attention_mask, + ) + else: + sample = upsample_block( + hidden_states=sample, temb=emb, res_hidden_states_tuple=res_samples, upsample_size=upsample_size + ) + + # 6. 
post-process + if self.conv_norm_out: + sample = self.conv_norm_out(sample) + sample = self.conv_act(sample) + sample = self.conv_out(sample) + + if not return_dict: + return (sample,) + + return UNet2DConditionOutput(sample=sample) diff --git a/gf/sd-webui-rich-text/scripts/models/utils/.DS_Store b/gf/sd-webui-rich-text/scripts/models/utils/.DS_Store new file mode 100644 index 0000000000000000000000000000000000000000..b00dcba513dee725f9155ab9bed4dd95dcef3ad1 Binary files /dev/null and b/gf/sd-webui-rich-text/scripts/models/utils/.DS_Store differ diff --git a/gf/sd-webui-rich-text/scripts/models/utils/__pycache__/attention_utils.cpython-310.pyc b/gf/sd-webui-rich-text/scripts/models/utils/__pycache__/attention_utils.cpython-310.pyc new file mode 100644 index 0000000000000000000000000000000000000000..ba10e4a88a54bbb589f57ac0fb7790724b390d10 Binary files /dev/null and b/gf/sd-webui-rich-text/scripts/models/utils/__pycache__/attention_utils.cpython-310.pyc differ diff --git a/gf/sd-webui-rich-text/scripts/models/utils/__pycache__/richtext_utils.cpython-310.pyc b/gf/sd-webui-rich-text/scripts/models/utils/__pycache__/richtext_utils.cpython-310.pyc new file mode 100644 index 0000000000000000000000000000000000000000..b4211743cc8a958ee1acd51cdb2a010b585a7a48 Binary files /dev/null and b/gf/sd-webui-rich-text/scripts/models/utils/__pycache__/richtext_utils.cpython-310.pyc differ diff --git a/gf/sd-webui-rich-text/scripts/models/utils/attention_utils.py b/gf/sd-webui-rich-text/scripts/models/utils/attention_utils.py new file mode 100644 index 0000000000000000000000000000000000000000..6d7a3b8f54ae0eee08d57ffae2a9a82e86287a9a --- /dev/null +++ b/gf/sd-webui-rich-text/scripts/models/utils/attention_utils.py @@ -0,0 +1,727 @@ +import numpy as np +import os +import matplotlib as mpl +import matplotlib.pyplot as plt +import seaborn as sns +import torch +import torchvision + +from scripts.models.utils.richtext_utils import seed_everything +from sklearn.cluster import KMeans, SpectralClustering + +# SelfAttentionLayers = [ +# # 'down_blocks.0.attentions.0.transformer_blocks.0.attn1', +# # 'down_blocks.0.attentions.1.transformer_blocks.0.attn1', +# 'down_blocks.1.attentions.0.transformer_blocks.0.attn1', +# # 'down_blocks.1.attentions.1.transformer_blocks.0.attn1', +# 'down_blocks.2.attentions.0.transformer_blocks.0.attn1', +# 'down_blocks.2.attentions.1.transformer_blocks.0.attn1', +# 'mid_block.attentions.0.transformer_blocks.0.attn1', +# 'up_blocks.1.attentions.0.transformer_blocks.0.attn1', +# 'up_blocks.1.attentions.1.transformer_blocks.0.attn1', +# 'up_blocks.1.attentions.2.transformer_blocks.0.attn1', +# # 'up_blocks.2.attentions.0.transformer_blocks.0.attn1', +# 'up_blocks.2.attentions.1.transformer_blocks.0.attn1', +# # 'up_blocks.2.attentions.2.transformer_blocks.0.attn1', +# # 'up_blocks.3.attentions.0.transformer_blocks.0.attn1', +# # 'up_blocks.3.attentions.1.transformer_blocks.0.attn1', +# # 'up_blocks.3.attentions.2.transformer_blocks.0.attn1', +# ] + +SelfAttentionLayers = [ + # 'down_blocks.0.attentions.0.transformer_blocks.0.attn1', + # 'down_blocks.0.attentions.1.transformer_blocks.0.attn1', + 'down_blocks.1.attentions.0.transformer_blocks.0.attn1', + # 'down_blocks.1.attentions.1.transformer_blocks.0.attn1', + 'down_blocks.2.attentions.0.transformer_blocks.0.attn1', + 'down_blocks.2.attentions.1.transformer_blocks.0.attn1', + 'mid_block.attentions.0.transformer_blocks.0.attn1', + 'up_blocks.1.attentions.0.transformer_blocks.0.attn1', + 
'up_blocks.1.attentions.1.transformer_blocks.0.attn1', + 'up_blocks.1.attentions.2.transformer_blocks.0.attn1', + # 'up_blocks.2.attentions.0.transformer_blocks.0.attn1', + 'up_blocks.2.attentions.1.transformer_blocks.0.attn1', + # 'up_blocks.2.attentions.2.transformer_blocks.0.attn1', + # 'up_blocks.3.attentions.0.transformer_blocks.0.attn1', + # 'up_blocks.3.attentions.1.transformer_blocks.0.attn1', + # 'up_blocks.3.attentions.2.transformer_blocks.0.attn1', +] + + +CrossAttentionLayers = [ + # 'down_blocks.0.attentions.0.transformer_blocks.0.attn2', + # 'down_blocks.0.attentions.1.transformer_blocks.0.attn2', + 'down_blocks.1.attentions.0.transformer_blocks.0.attn2', + # 'down_blocks.1.attentions.1.transformer_blocks.0.attn2', + 'down_blocks.2.attentions.0.transformer_blocks.0.attn2', + 'down_blocks.2.attentions.1.transformer_blocks.0.attn2', + 'mid_block.attentions.0.transformer_blocks.0.attn2', + 'up_blocks.1.attentions.0.transformer_blocks.0.attn2', + 'up_blocks.1.attentions.1.transformer_blocks.0.attn2', + 'up_blocks.1.attentions.2.transformer_blocks.0.attn2', + # 'up_blocks.2.attentions.0.transformer_blocks.0.attn2', + 'up_blocks.2.attentions.1.transformer_blocks.0.attn2', + # 'up_blocks.2.attentions.2.transformer_blocks.0.attn2', + # 'up_blocks.3.attentions.0.transformer_blocks.0.attn2', + # 'up_blocks.3.attentions.1.transformer_blocks.0.attn2', + # 'up_blocks.3.attentions.2.transformer_blocks.0.attn2' +] + +# CrossAttentionLayers = [ +# 'down_blocks.0.attentions.0.transformer_blocks.0.attn2', +# 'down_blocks.0.attentions.1.transformer_blocks.0.attn2', +# 'down_blocks.1.attentions.0.transformer_blocks.0.attn2', +# 'down_blocks.1.attentions.1.transformer_blocks.0.attn2', +# 'down_blocks.2.attentions.0.transformer_blocks.0.attn2', +# 'down_blocks.2.attentions.1.transformer_blocks.0.attn2', +# 'mid_block.attentions.0.transformer_blocks.0.attn2', +# 'up_blocks.1.attentions.0.transformer_blocks.0.attn2', +# 'up_blocks.1.attentions.1.transformer_blocks.0.attn2', +# 'up_blocks.1.attentions.2.transformer_blocks.0.attn2', +# 'up_blocks.2.attentions.0.transformer_blocks.0.attn2', +# 'up_blocks.2.attentions.1.transformer_blocks.0.attn2', +# 'up_blocks.2.attentions.2.transformer_blocks.0.attn2', +# 'up_blocks.3.attentions.0.transformer_blocks.0.attn2', +# 'up_blocks.3.attentions.1.transformer_blocks.0.attn2', +# 'up_blocks.3.attentions.2.transformer_blocks.0.attn2' +# ] + +# CrossAttentionLayers_XL = [ +# 'up_blocks.0.attentions.0.transformer_blocks.1.attn2', +# 'up_blocks.0.attentions.0.transformer_blocks.2.attn2', +# 'up_blocks.0.attentions.0.transformer_blocks.3.attn2', +# 'up_blocks.0.attentions.0.transformer_blocks.4.attn2', +# 'up_blocks.0.attentions.0.transformer_blocks.5.attn2', +# 'up_blocks.0.attentions.0.transformer_blocks.6.attn2', +# 'up_blocks.0.attentions.0.transformer_blocks.7.attn2', +# ] +CrossAttentionLayers_XL = [ + 'down_blocks.2.attentions.1.transformer_blocks.3.attn2', + 'down_blocks.2.attentions.1.transformer_blocks.4.attn2', + 'mid_block.attentions.0.transformer_blocks.0.attn2', + 'mid_block.attentions.0.transformer_blocks.1.attn2', + 'mid_block.attentions.0.transformer_blocks.2.attn2', + 'mid_block.attentions.0.transformer_blocks.3.attn2', + 'up_blocks.0.attentions.0.transformer_blocks.1.attn2', + 'up_blocks.0.attentions.0.transformer_blocks.2.attn2', + 'up_blocks.0.attentions.0.transformer_blocks.3.attn2', + 'up_blocks.0.attentions.0.transformer_blocks.4.attn2', + 'up_blocks.0.attentions.0.transformer_blocks.5.attn2', + 
'up_blocks.0.attentions.0.transformer_blocks.6.attn2', + 'up_blocks.0.attentions.0.transformer_blocks.7.attn2', + 'up_blocks.1.attentions.0.transformer_blocks.0.attn2' +] + +def split_attention_maps_over_steps(attention_maps): + r"""Function for splitting attention maps over steps. + Args: + attention_maps (dict): Dictionary of attention maps. + sampler_order (int): Order of the sampler. + """ + # This function splits attention maps into unconditional and conditional score and over steps + + attention_maps_cond = dict() # Maps corresponding to conditional score + attention_maps_uncond = dict() # Maps corresponding to unconditional score + + for layer in attention_maps.keys(): + + for step_num in range(len(attention_maps[layer])): + if step_num not in attention_maps_cond: + attention_maps_cond[step_num] = dict() + attention_maps_uncond[step_num] = dict() + + attention_maps_uncond[step_num].update( + {layer: attention_maps[layer][step_num][:1]}) + attention_maps_cond[step_num].update( + {layer: attention_maps[layer][step_num][1:2]}) + + return attention_maps_cond, attention_maps_uncond + + +def save_attention_heatmaps(attention_maps, tokens_vis, save_dir, prefix): + r"""Function to plot heatmaps for attention maps. + + Args: + attention_maps (dict): Dictionary of attention maps per layer + save_dir (str): Directory to save attention maps + prefix (str): Filename prefix for html files + + Returns: + Heatmaps, one per sample. + """ + + html_names = [] + + idx = 0 + html_list = [] + + for layer in attention_maps.keys(): + if idx == 0: + # import ipdb;ipdb.set_trace() + # create a set of html files. + + batch_size = attention_maps[layer].shape[0] + + for sample_num in range(batch_size): + # html path + html_rel_path = os.path.join('sample_{}'.format( + sample_num), '{}.html'.format(prefix)) + html_names.append(html_rel_path) + html_path = os.path.join(save_dir, html_rel_path) + os.makedirs(os.path.dirname(html_path), exist_ok=True) + html_list.append(open(html_path, 'wt')) + html_list[sample_num].write( + '\n') + + for sample_num in range(batch_size): + + save_path = os.path.join(save_dir, 'sample_{}'.format(sample_num), + prefix, 'layer_{}'.format(layer)) + '.jpg' + Path(os.path.dirname(save_path)).mkdir(parents=True, exist_ok=True) + + layer_name = 'layer_{}'.format(layer) + html_list[sample_num].write( + f'\n') + + prefix_stem = prefix.split('/')[-1] + relative_image_path = os.path.join( + prefix_stem, 'layer_{}'.format(layer)) + '.jpg' + html_list[sample_num].write( + f'\n') + + plt.figure() + plt.clf() + nrows = 2 + ncols = 7 + fig, axs = plt.subplots(nrows=nrows, ncols=ncols) + + fig.set_figheight(8) + fig.set_figwidth(28.5) + + # axs[0].set_aspect('equal') + # axs[1].set_aspect('equal') + # axs[2].set_aspect('equal') + # axs[3].set_aspect('equal') + # axs[4].set_aspect('equal') + # axs[5].set_aspect('equal') + + cmap = plt.get_cmap('YlOrRd') + + for rid in range(nrows): + for cid in range(ncols): + tid = rid*ncols + cid + # import ipdb;ipdb.set_trace() + attention_map_cur = attention_maps[layer][sample_num, :, :, tid].numpy( + ) + vmax = float(attention_map_cur.max()) + vmin = float(attention_map_cur.min()) + sns.heatmap( + attention_map_cur, annot=False, cbar=False, ax=axs[rid, cid], + cmap=cmap, vmin=vmin, vmax=vmax + ) + axs[rid, cid].set_xlabel(tokens_vis[tid]) + + # axs[0].set_xlabel('Self attention') + # axs[1].set_xlabel('Temporal attention') + # axs[2].set_xlabel('T5 text attention') + # axs[3].set_xlabel('CLIP text attention') + # axs[4].set_xlabel('CLIP image attention') + # 
axs[5].set_xlabel('Null text token') + + norm = mpl.colors.Normalize(vmin=vmin, vmax=vmax) + sm = plt.cm.ScalarMappable(cmap=cmap, norm=norm) + # fig.colorbar(sm, cax=axs[6]) + + fig.tight_layout() + plt.savefig(save_path, dpi=64) + plt.close('all') + + if idx == (len(attention_maps.keys()) - 1): + for sample_num in range(batch_size): + html_list[sample_num].write('

{layer_name}

') + html_list[sample_num].close() + + idx += 1 + + return html_names + + +def create_recursive_html_link(html_path, save_dir): + r"""Function for creating recursive html links. + If the path is dir1/dir2/dir3/*.html, + we create chained directories + -dir1 + dir1.html (has links to all children) + -dir2 + dir2.html (has links to all children) + -dir3 + dir3.html + + Args: + html_path (str): Path to html file. + save_dir (str): Save directory. + """ + + html_path_split = os.path.splitext(html_path)[0].split('/') + if len(html_path_split) == 1: + return + + # First create the root directory + root_dir = html_path_split[0] + child_dir = html_path_split[1] + + cur_html_path = os.path.join(save_dir, '{}.html'.format(root_dir)) + if os.path.exists(cur_html_path): + + fp = open(cur_html_path, 'r') + lines_written = fp.readlines() + fp.close() + + fp = open(cur_html_path, 'a+') + child_path = os.path.join(root_dir, f'{child_dir}.html') + line_to_write = f'{child_dir}\n' + + if line_to_write not in lines_written: + fp.write('\n') + fp.write(line_to_write) + fp.write('
') + fp.close() + + else: + + fp = open(cur_html_path, 'w') + + child_path = os.path.join(root_dir, f'{child_dir}.html') + line_to_write = f'{child_dir}\n' + + fp.write('\n') + fp.write(line_to_write) + fp.write('
') + + fp.close() + + child_path = '/'.join(html_path.split('/')[1:]) + save_dir = os.path.join(save_dir, root_dir) + create_recursive_html_link(child_path, save_dir) + + +def visualize_attention_maps(attention_maps_all, save_dir, width, height, tokens_vis): + r"""Function to visualize attention maps. + Args: + save_dir (str): Path to save attention maps + batch_size (int): Batch size + sampler_order (int): Sampler order + """ + + rand_name = list(attention_maps_all.keys())[0] + nsteps = len(attention_maps_all[rand_name]) + hw_ori = width * height + + # html_path = save_dir + '.html' + text_input = save_dir.split('/')[-1] + # f = open(html_path, 'wt') + + all_html_paths = [] + + for step_num in range(0, nsteps, 5): + + # if cond_id == 'cond': + # attention_maps_cur = attention_maps_cond[step_num] + # else: + # attention_maps_cur = attention_maps_uncond[step_num] + + attention_maps = dict() + + for layer in attention_maps_all.keys(): + + attention_ind = attention_maps_all[layer][step_num].cpu() + + # Attention maps are of shape [batch_size, nkeys, 77] + # since they are averaged out while collecting from hooks to save memory. + # Now split the heads from batch dimension + bs, hw, nclip = attention_ind.shape + down_ratio = np.sqrt(hw_ori // hw) + width_cur = int(width // down_ratio) + height_cur = int(height // down_ratio) + attention_ind = attention_ind.reshape( + bs, height_cur, width_cur, nclip) + + attention_maps[layer] = attention_ind + + # Obtain heatmaps corresponding to random heads and individual heads + + html_names = save_attention_heatmaps( + attention_maps, tokens_vis, save_dir=save_dir, prefix='step_{}/attention_maps_cond'.format( + step_num) + ) + + # Write the logic for recursively creating pages + for html_name_cur in html_names: + all_html_paths.append(os.path.join(text_input, html_name_cur)) + + save_dir_root = '/'.join(save_dir.split('/')[0:-1]) + for html_pth in all_html_paths: + create_recursive_html_link(html_pth, save_dir_root) + + +def plot_attention_maps(atten_map_list, obj_tokens, save_dir, seed, tokens_vis=None): + for i, attn_map in enumerate(atten_map_list): + n_obj = len(attn_map) + plt.figure() + plt.clf() + + fig, axs = plt.subplots( + ncols=n_obj+1, gridspec_kw=dict(width_ratios=[1 for _ in range(n_obj)]+[0.1])) + + fig.set_figheight(3) + fig.set_figwidth(3*n_obj+0.1) + + cmap = plt.get_cmap('YlOrRd') + + vmax = 0 + vmin = 1 + for tid in range(n_obj): + attention_map_cur = attn_map[tid] + vmax = max(vmax, float(attention_map_cur.max())) + vmin = min(vmin, float(attention_map_cur.min())) + + for tid in range(n_obj): + sns.heatmap( + attn_map[tid][0], annot=False, cbar=False, ax=axs[tid], + cmap=cmap, vmin=vmin, vmax=vmax + ) + axs[tid].set_axis_off() + + if tokens_vis is not None: + if tid == n_obj-1: + axs_xlabel = 'other tokens' + else: + axs_xlabel = '' + for token_id in obj_tokens[tid]: + axs_xlabel += ' ' + tokens_vis[token_id.item() - + 1][:-len('')] + axs[tid].set_title(axs_xlabel) + + norm = mpl.colors.Normalize(vmin=vmin, vmax=vmax) + sm = plt.cm.ScalarMappable(cmap=cmap, norm=norm) + fig.colorbar(sm, cax=axs[-1]) + + fig.tight_layout() + + canvas = fig.canvas + canvas.draw() + width, height = canvas.get_width_height() + img = np.frombuffer(canvas.tostring_rgb(), + dtype='uint8').reshape((height, width, 3)) + # plt.savefig(os.path.join( + # save_dir, 'average_seed%d_attn%d.jpg' % (seed, i)), dpi=100) + plt.close('all') + return img + + +def get_average_attention_maps(attention_maps, save_dir, width, height, obj_tokens, seed=0, tokens_vis=None, + 
preprocess=False): + r"""Function to visualize attention maps. + Args: + save_dir (str): Path to save attention maps + batch_size (int): Batch size + sampler_order (int): Sampler order + """ + + # Split attention maps over steps + attention_maps_cond, _ = split_attention_maps_over_steps( + attention_maps + ) + + nsteps = len(attention_maps_cond) + hw_ori = width * height + + attention_maps = [] + for obj_token in obj_tokens: + attention_maps.append([]) + + for step_num in range(nsteps): + attention_maps_cur = attention_maps_cond[step_num] + + for layer in attention_maps_cur.keys(): + if step_num < 10 or layer not in CrossAttentionLayers: + continue + + attention_ind = attention_maps_cur[layer].cpu() + + # Attention maps are of shape [batch_size, nkeys, 77] + # since they are averaged out while collecting from hooks to save memory. + # Now split the heads from batch dimension + bs, hw, nclip = attention_ind.shape + down_ratio = np.sqrt(hw_ori // hw) + width_cur = int(width // down_ratio) + height_cur = int(height // down_ratio) + attention_ind = attention_ind.reshape( + bs, height_cur, width_cur, nclip) + for obj_id, obj_token in enumerate(obj_tokens): + if obj_token[0] == -1: + attention_map_prev = torch.stack( + [attention_maps[i][-1] for i in range(obj_id)]).sum(0) + attention_maps[obj_id].append( + attention_map_prev.max()-attention_map_prev) + else: + obj_attention_map = attention_ind[:, :, :, obj_token].max(-1, True)[ + 0].permute([3, 0, 1, 2]) + # obj_attention_map = attention_ind[:, :, :, obj_token].mean(-1, True).permute([3, 0, 1, 2]) + obj_attention_map = torchvision.transforms.functional.resize(obj_attention_map, (height, width), + interpolation=torchvision.transforms.InterpolationMode.BICUBIC, antialias=True) + attention_maps[obj_id].append(obj_attention_map) + + attention_maps_averaged = [] + for obj_id, obj_token in enumerate(obj_tokens): + if obj_id == len(obj_tokens) - 1: + attention_maps_averaged.append( + torch.cat(attention_maps[obj_id]).mean(0)) + else: + attention_maps_averaged.append( + torch.cat(attention_maps[obj_id]).mean(0)) + + attention_maps_averaged_normalized = [] + attention_maps_averaged_sum = torch.cat(attention_maps_averaged).sum(0) + for obj_id, obj_token in enumerate(obj_tokens): + attention_maps_averaged_normalized.append( + attention_maps_averaged[obj_id]/attention_maps_averaged_sum) + + if obj_tokens[-1][0] != -1: + attention_maps_averaged_normalized = ( + torch.cat(attention_maps_averaged)/0.001).softmax(0) + attention_maps_averaged_normalized = [ + attention_maps_averaged_normalized[i:i+1] for i in range(attention_maps_averaged_normalized.shape[0])] + + if preprocess: + selem = square(5) + selem = square(3) + selem = square(1) + attention_maps_averaged_eroded = [erosion(skimage.img_as_float( + map[0].numpy()*255), selem) for map in attention_maps_averaged_normalized[:2]] + attention_maps_averaged_eroded = [(torch.from_numpy(map).unsqueeze( + 0)/255. 
> 0.8).float() for map in attention_maps_averaged_eroded] + attention_maps_averaged_eroded.append( + 1 - torch.cat(attention_maps_averaged_eroded).sum(0, True)) + plot_attention_maps([attention_maps_averaged, attention_maps_averaged_normalized, + attention_maps_averaged_eroded], obj_tokens, save_dir, seed, tokens_vis) + attention_maps_averaged_eroded = [attn_mask.unsqueeze(1).repeat( + [1, 4, 1, 1]).cuda() for attn_mask in attention_maps_averaged_eroded] + return attention_maps_averaged_eroded + else: + plot_attention_maps([attention_maps_averaged, attention_maps_averaged_normalized], + obj_tokens, save_dir, seed, tokens_vis) + attention_maps_averaged_normalized = [attn_mask.unsqueeze(1).repeat( + [1, 4, 1, 1]).cuda() for attn_mask in attention_maps_averaged_normalized] + return attention_maps_averaged_normalized + + +def get_average_attention_maps_threshold(attention_maps, save_dir, width, height, obj_tokens, seed=0, threshold=0.02): + r"""Function to visualize attention maps. + Args: + save_dir (str): Path to save attention maps + batch_size (int): Batch size + sampler_order (int): Sampler order + """ + + _EPS = 1e-8 + # Split attention maps over steps + attention_maps_cond, _ = split_attention_maps_over_steps( + attention_maps + ) + + nsteps = len(attention_maps_cond) + hw_ori = width * height + + attention_maps = [] + for obj_token in obj_tokens: + attention_maps.append([]) + + # for each side prompt, get attention maps for all steps and all layers + for step_num in range(nsteps): + attention_maps_cur = attention_maps_cond[step_num] + for layer in attention_maps_cur.keys(): + attention_ind = attention_maps_cur[layer].cpu() + bs, hw, nclip = attention_ind.shape + down_ratio = np.sqrt(hw_ori // hw) + width_cur = int(width // down_ratio) + height_cur = int(height // down_ratio) + attention_ind = attention_ind.reshape( + bs, height_cur, width_cur, nclip) + for obj_id, obj_token in enumerate(obj_tokens): + if attention_ind.shape[1] > width//2: + continue + if obj_token[0] != -1: + obj_attention_map = attention_ind[:, :, :, + obj_token].mean(-1, True).permute([3, 0, 1, 2]) + obj_attention_map = torchvision.transforms.functional.resize(obj_attention_map, (height, width), + interpolation=torchvision.transforms.InterpolationMode.BICUBIC, antialias=True) + attention_maps[obj_id].append(obj_attention_map) + + # average of all steps and layers, thresholding + attention_maps_thres = [] + attention_maps_averaged = [] + for obj_id, obj_token in enumerate(obj_tokens): + if obj_token[0] != -1: + average_map = torch.cat(attention_maps[obj_id]).mean(0) + attention_maps_averaged.append(average_map) + attention_maps_thres.append((average_map > threshold).float()) + + # get the remaining region except for the original prompt + attention_maps_averaged_normalized = [] + attention_maps_averaged_sum = torch.cat(attention_maps_thres).sum(0) + _EPS + for obj_id, obj_token in enumerate(obj_tokens): + if obj_token[0] != -1: + attention_maps_averaged_normalized.append( + attention_maps_thres[obj_id]/attention_maps_averaged_sum) + else: + attention_map_prev = torch.stack( + attention_maps_averaged_normalized).sum(0) + attention_maps_averaged_normalized.append(1.-attention_map_prev) + + plot_attention_maps( + [attention_maps_averaged, attention_maps_averaged_normalized], save_dir, seed) + + attention_maps_averaged_normalized = [attn_mask.unsqueeze(1).repeat( + [1, 4, 1, 1]).cuda() for attn_mask in attention_maps_averaged_normalized] + # attention_maps_averaged_normalized = 
attention_maps_averaged_normalized.unsqueeze(1).repeat([1, 4, 1, 1]).cuda() + return attention_maps_averaged_normalized + + +def get_token_maps(selfattn_maps, crossattn_maps, n_maps, save_dir, width, height, obj_tokens, kmeans_seed=0, tokens_vis=None, + preprocess=False, segment_threshold=0.3, num_segments=5, return_vis=False, save_attn=False): + r"""Function to visualize attention maps. + Args: + save_dir (str): Path to save attention maps + batch_size (int): Batch size + sampler_order (int): Sampler order + """ + + target_height = np.ceil(height / 4).astype(int) + target_width = np.ceil(width / 4).astype(int) + target_hw = target_height * target_width + hw = width * height + # attn_maps_1024 = [attn_map for attn_map in selfattn_maps.values( + # ) if attn_map.shape[1] == resolution**2] + # attn_maps_1024 = torch.cat(attn_maps_1024).mean(0).cpu().numpy() + attn_maps_1024 = {target_hw: []} + for attn_map in selfattn_maps.values(): + resolution_map = attn_map.shape[1] + if resolution_map != target_hw: + continue + # attn_map = torch.nn.functional.interpolate(rearrange(attn_map, '1 c (h w) -> 1 c h w', h=resolution_map), (resolution, resolution), + # mode='bicubic', antialias=True) + # attn_map = rearrange(attn_map, '1 (h w) a b -> 1 (a b) h w', h=resolution_map) + attn_map = attn_map.reshape( + 1, target_height, target_width, resolution_map).permute([3, 0, 1, 2]).float() + attn_map = torch.nn.functional.interpolate(attn_map, (target_height, target_width), + mode='bicubic', antialias=True) + attn_maps_1024[resolution_map].append(attn_map.permute([1, 2, 3, 0]).reshape( + 1, target_hw, resolution_map)) + attn_maps_1024 = torch.cat([torch.cat(v).mean(0).cpu() + for v in attn_maps_1024.values() if len(v) > 0], -1).numpy() + if save_attn: + print('saving self-attention maps...', attn_maps_1024.shape) + torch.save(torch.from_numpy(attn_maps_1024), + 'results/maps/selfattn_maps.pth') + seed_everything(kmeans_seed) + # import ipdb;ipdb.set_trace() + # kmeans = KMeans(n_clusters=num_segments, + # n_init=10).fit(attn_maps_1024) + # clusters = kmeans.labels_ + # clusters = clusters.reshape(resolution, resolution) + # mesh = np.array(np.meshgrid(range(resolution), range(resolution), indexing='ij'), dtype=np.float32)/resolution + # dists = mesh.reshape(2, -1).T + # delta = 0.01 + # spatial_sim = rbf_kernel(dists, dists)*delta + sc = SpectralClustering(num_segments, affinity='precomputed', n_init=100, + assign_labels='kmeans') + clusters = sc.fit_predict(attn_maps_1024) + clusters = clusters.reshape(target_height, target_width) + fig = plt.figure() + plt.imshow(clusters) + plt.axis('off') + # plt.savefig(os.path.join(save_dir, 'segmentation_k%d_seed%d.jpg' % (num_segments, kmeans_seed)), + # bbox_inches='tight', pad_inches=0) + if return_vis: + canvas = fig.canvas + canvas.draw() + cav_width, cav_height = canvas.get_width_height() + segments_vis = np.frombuffer(canvas.tostring_rgb(), + dtype='uint8').reshape((cav_height, cav_width, 3)) + + plt.close() + + # label the segmentation mask using cross-attention maps + cross_attn_maps_1024 = [] + for attn_map in crossattn_maps.values(): + resolution_map = np.sqrt(attn_map.shape[1]).astype(int) + # if resolution_map != 16: + # continue + attn_map = attn_map.reshape( + 1, resolution_map, resolution_map, -1).permute([0, 3, 1, 2]).float() + attn_map = torch.nn.functional.interpolate(attn_map, (target_height, target_width), + mode='bicubic', antialias=True) + cross_attn_maps_1024.append(attn_map.permute([0, 2, 3, 1])) + + cross_attn_maps_1024 = torch.cat( + 
cross_attn_maps_1024).mean(0).cpu().numpy() + normalized_span_maps = [] + for token_ids in obj_tokens: + token_ids = torch.clip(token_ids, 0, 76) + span_token_maps = cross_attn_maps_1024[:, :, token_ids.numpy()] + normalized_span_map = np.zeros_like(span_token_maps) + for i in range(span_token_maps.shape[-1]): + curr_noun_map = span_token_maps[:, :, i] + normalized_span_map[:, :, i] = ( + # curr_noun_map - np.abs(curr_noun_map.min())) / curr_noun_map.max() + curr_noun_map - np.abs(curr_noun_map.min())) / (curr_noun_map.max()-curr_noun_map.min()) + normalized_span_maps.append(normalized_span_map) + foreground_token_maps = [np.zeros([clusters.shape[0], clusters.shape[1]]).squeeze( + ) for normalized_span_map in normalized_span_maps] + background_map = np.zeros([clusters.shape[0], clusters.shape[1]]).squeeze() + for c in range(num_segments): + cluster_mask = np.zeros_like(clusters) + cluster_mask[clusters == c] = 1. + is_foreground = False + for normalized_span_map, foreground_nouns_map, token_ids in zip(normalized_span_maps, foreground_token_maps, obj_tokens): + score_maps = [cluster_mask * normalized_span_map[:, :, i] + for i in range(len(token_ids))] + scores = [score_map.sum() / cluster_mask.sum() + for score_map in score_maps] + if max(scores) > segment_threshold: + foreground_nouns_map += cluster_mask + is_foreground = True + if not is_foreground: + background_map += cluster_mask + foreground_token_maps.append(background_map) + + # resize the token maps and visualization + resized_token_maps = torch.cat([torch.nn.functional.interpolate(torch.from_numpy(token_map).unsqueeze(0).unsqueeze( + 0), (height, width), mode='bicubic', antialias=True)[0] for token_map in foreground_token_maps]).clamp(0, 1) + + resized_token_maps = resized_token_maps / \ + (resized_token_maps.sum(0, True)+1e-8) + resized_token_maps = [token_map.unsqueeze( + 0) for token_map in resized_token_maps] + foreground_token_maps = [token_map[None, :, :] + for token_map in foreground_token_maps] + if preprocess: + selem = square(5) + eroded_token_maps = torch.stack([torch.from_numpy(erosion(skimage.img_as_float( + map[0].numpy()*255), selem))/255. 
for map in resized_token_maps[:-1]]).clamp(0, 1) + # import ipdb; ipdb.set_trace() + eroded_background_maps = (1-eroded_token_maps.sum(0, True)).clamp(0, 1) + eroded_token_maps = torch.cat([eroded_token_maps, eroded_background_maps]) + eroded_token_maps = eroded_token_maps / (eroded_token_maps.sum(0, True)+1e-8) + resized_token_maps = [token_map.unsqueeze( + 0) for token_map in eroded_token_maps] + + token_maps_vis = plot_attention_maps([foreground_token_maps, resized_token_maps], obj_tokens, + save_dir, kmeans_seed, tokens_vis) + resized_token_maps = [token_map.unsqueeze(1).repeat( + [1, 4, 1, 1]).to(attn_map.dtype).cuda() for token_map in resized_token_maps] + if return_vis: + return resized_token_maps, segments_vis, token_maps_vis + else: + return resized_token_maps diff --git a/gf/sd-webui-rich-text/scripts/models/utils/richtext_utils.py b/gf/sd-webui-rich-text/scripts/models/utils/richtext_utils.py new file mode 100644 index 0000000000000000000000000000000000000000..b52144f6805b01a5b6f3e1c1f3ab9d48262e859f --- /dev/null +++ b/gf/sd-webui-rich-text/scripts/models/utils/richtext_utils.py @@ -0,0 +1,234 @@ +import os +import json +import torch +import random +import numpy as np + +COLORS = { + 'brown': [165, 42, 42], + 'red': [255, 0, 0], + 'pink': [253, 108, 158], + 'orange': [255, 165, 0], + 'yellow': [255, 255, 0], + 'purple': [128, 0, 128], + 'green': [0, 128, 0], + 'blue': [0, 0, 255], + 'white': [255, 255, 255], + 'gray': [128, 128, 128], + 'black': [0, 0, 0], +} + + +def seed_everything(seed): + random.seed(seed) + os.environ['PYTHONHASHSEED'] = str(seed) + np.random.seed(seed) + torch.manual_seed(seed) + torch.cuda.manual_seed(seed) + + +def hex_to_rgb(hex_string, return_nearest_color=False): + r""" + Covert Hex triplet to RGB triplet. + """ + # Remove '#' symbol if present + hex_string = hex_string.lstrip('#') + # Convert hex values to integers + red = int(hex_string[0:2], 16) + green = int(hex_string[2:4], 16) + blue = int(hex_string[4:6], 16) + rgb = torch.FloatTensor((red, green, blue))[None, :, None, None]/255. + if return_nearest_color: + nearest_color = find_nearest_color(rgb) + return rgb.cuda(), nearest_color + return rgb.cuda() + + +def find_nearest_color(rgb): + r""" + Find the nearest neighbor color given the RGB value. + """ + if isinstance(rgb, list) or isinstance(rgb, tuple): + rgb = torch.FloatTensor(rgb)[None, :, None, None]/255. + color_distance = torch.FloatTensor([np.linalg.norm( + rgb - torch.FloatTensor(COLORS[color])[None, :, None, None]/255.) for color in COLORS.keys()]) + nearest_color = list(COLORS.keys())[torch.argmin(color_distance).item()] + return nearest_color + + +def font2style(font): + r""" + Convert the font name to the style name. + """ + return {'mirza': 'Claud Monet, impressionism, oil on canvas', + 'roboto': 'Ukiyoe', + 'cursive': 'Cyber Punk, futuristic, blade runner, william gibson, trending on artstation hq', + 'sofia': 'Pop Art, masterpiece, andy warhol', + 'slabo': 'Vincent Van Gogh', + 'inconsolata': 'Pixel Art, 8 bits, 16 bits', + 'ubuntu': 'Rembrandt', + 'Monoton': 'neon art, colorful light, highly details, octane render', + 'Akronim': 'Abstract Cubism, Pablo Picasso', }[font] + + +def parse_json(json_str): + r""" + Convert the JSON string to attributes. + """ + # initialze region-base attributes. 
+ base_text_prompt = '' + style_text_prompts = [] + footnote_text_prompts = [] + footnote_target_tokens = [] + color_text_prompts = [] + color_rgbs = [] + color_names = [] + size_text_prompts_and_sizes = [] + + # parse the attributes from JSON. + prev_style = None + prev_color_rgb = None + use_grad_guidance = False + for span in json_str['ops']: + text_prompt = span['insert'].rstrip('\n') + base_text_prompt += span['insert'].rstrip('\n') + if text_prompt == ' ': + continue + if 'attributes' in span: + if 'font' in span['attributes']: + style = font2style(span['attributes']['font']) + if prev_style == style: + prev_text_prompt = style_text_prompts[-1].split('in the style of')[ + 0] + style_text_prompts[-1] = prev_text_prompt + \ + ' ' + text_prompt + f' in the style of {style}' + else: + style_text_prompts.append( + text_prompt + f' in the style of {style}') + prev_style = style + else: + prev_style = None + if 'link' in span['attributes']: + footnote_text_prompts.append(span['attributes']['link']) + footnote_target_tokens.append(text_prompt) + font_size = 1 + if 'size' in span['attributes'] and 'strike' not in span['attributes']: + font_size = float(span['attributes']['size'][:-2])/3. + elif 'size' in span['attributes'] and 'strike' in span['attributes']: + font_size = -float(span['attributes']['size'][:-2])/3. + elif 'size' not in span['attributes'] and 'strike' not in span['attributes']: + font_size = 1 + if 'color' in span['attributes']: + use_grad_guidance = True + color_rgb, nearest_color = hex_to_rgb( + span['attributes']['color'], True) + if prev_color_rgb == color_rgb: + prev_text_prompt = color_text_prompts[-1] + color_text_prompts[-1] = prev_text_prompt + \ + ' ' + text_prompt + else: + color_rgbs.append(color_rgb) + color_names.append(nearest_color) + color_text_prompts.append(text_prompt) + if font_size != 1: + size_text_prompts_and_sizes.append([text_prompt, font_size]) + return base_text_prompt, style_text_prompts, footnote_text_prompts, footnote_target_tokens,\ + color_text_prompts, color_names, color_rgbs, size_text_prompts_and_sizes, use_grad_guidance + + +def get_region_diffusion_input(model, base_text_prompt, style_text_prompts, footnote_text_prompts, + footnote_target_tokens, color_text_prompts, color_names): + r""" + Algorithm 1 in the paper. 
+ """ + region_text_prompts = [] + region_target_token_ids = [] + base_tokens = model.tokenizer._tokenize(base_text_prompt) + # process the style text prompt + for text_prompt in style_text_prompts: + region_text_prompts.append(text_prompt) + region_target_token_ids.append([]) + style_tokens = model.tokenizer._tokenize( + text_prompt.split('in the style of')[0]) + for style_token in style_tokens: + region_target_token_ids[-1].append( + base_tokens.index(style_token)+1) + + # process the complementary text prompt + for footnote_text_prompt, text_prompt in zip(footnote_text_prompts, footnote_target_tokens): + region_target_token_ids.append([]) + region_text_prompts.append(footnote_text_prompt) + style_tokens = model.tokenizer._tokenize(text_prompt) + for style_token in style_tokens: + region_target_token_ids[-1].append( + base_tokens.index(style_token)+1) + + # process the color text prompt + for color_text_prompt, color_name in zip(color_text_prompts, color_names): + region_target_token_ids.append([]) + region_text_prompts.append(color_name+' '+color_text_prompt) + style_tokens = model.tokenizer._tokenize(color_text_prompt) + for style_token in style_tokens: + region_target_token_ids[-1].append( + base_tokens.index(style_token)+1) + + # process the remaining tokens without any attributes + region_text_prompts.append(base_text_prompt) + region_target_token_ids_all = [ + id for ids in region_target_token_ids for id in ids] + target_token_ids_rest = [id for id in range( + 1, len(base_tokens)+1) if id not in region_target_token_ids_all] + region_target_token_ids.append(target_token_ids_rest) + + region_target_token_ids = [torch.LongTensor( + obj_token_id) for obj_token_id in region_target_token_ids] + return region_text_prompts, region_target_token_ids, base_tokens + + +def get_attention_control_input(model, base_tokens, size_text_prompts_and_sizes): + r""" + Control the token impact using font sizes. + """ + word_pos = [] + font_sizes = [] + for text_prompt, font_size in size_text_prompts_and_sizes: + size_tokens = model.tokenizer._tokenize(text_prompt) + for size_token in size_tokens: + word_pos.append(base_tokens.index(size_token)+1) + font_sizes.append(font_size) + if len(word_pos) > 0: + word_pos = torch.LongTensor(word_pos).cuda() + font_sizes = torch.FloatTensor(font_sizes).cuda() + else: + word_pos = None + font_sizes = None + text_format_dict = { + 'word_pos': word_pos, + 'font_size': font_sizes, + } + return text_format_dict + + +def get_gradient_guidance_input(model, base_tokens, color_text_prompts, color_rgbs, text_format_dict, + guidance_start_step=999, color_guidance_weight=1): + r""" + Control the token impact using font sizes. 
+ """ + color_target_token_ids = [] + for text_prompt in color_text_prompts: + color_target_token_ids.append([]) + color_tokens = model.tokenizer._tokenize(text_prompt) + for color_token in color_tokens: + color_target_token_ids[-1].append(base_tokens.index(color_token)+1) + color_target_token_ids_all = [ + id for ids in color_target_token_ids for id in ids] + color_target_token_ids_rest = [id for id in range( + 1, len(base_tokens)+1) if id not in color_target_token_ids_all] + color_target_token_ids.append(color_target_token_ids_rest) + color_target_token_ids = [torch.LongTensor( + obj_token_id) for obj_token_id in color_target_token_ids] + + text_format_dict['target_RGB'] = color_rgbs + text_format_dict['guidance_start_step'] = guidance_start_step + text_format_dict['color_guidance_weight'] = color_guidance_weight + return text_format_dict, color_target_token_ids diff --git a/gf/sd-webui-rich-text/scripts/rich-text-to-json-iframe.html b/gf/sd-webui-rich-text/scripts/rich-text-to-json-iframe.html new file mode 100644 index 0000000000000000000000000000000000000000..c83b3f8063109e442da9d8b86c23272175e68935 --- /dev/null +++ b/gf/sd-webui-rich-text/scripts/rich-text-to-json-iframe.html @@ -0,0 +1,341 @@ + + + + + Rich Text to JSON + + + + + + + + +
+
+ + + + + + + + + + + + + + + + + + + +
+
+
+ + + + + + + + + + \ No newline at end of file diff --git a/gf/sd-webui-rich-text/scripts/rich-text-to-json.js b/gf/sd-webui-rich-text/scripts/rich-text-to-json.js new file mode 100644 index 0000000000000000000000000000000000000000..80aa38b524ff86eedccc8092c5b997e2a4f1e0a6 --- /dev/null +++ b/gf/sd-webui-rich-text/scripts/rich-text-to-json.js @@ -0,0 +1,349 @@ +class RichTextEditor extends HTMLElement { + constructor() { + super(); + this.loadExternalScripts(); + this.attachShadow({ mode: 'open' }); + this.shadowRoot.innerHTML = ` + ${RichTextEditor.header()} + ${RichTextEditor.template()} + `; + } + connectedCallback() { + this.myQuill = this.mountQuill(); + } + loadExternalScripts() { + const links = ["https://cdn.quilljs.com/1.3.6/quill.snow.css", "https://cdn.jsdelivr.net/npm/bulma@0.9.4/css/bulma.min.css", "https://fonts.googleapis.com/css?family=Mirza|Roboto|Slabo+27px|Sofia|Inconsolata|Ubuntu|Akronim|Monoton&display=swap"] + links.forEach(link => { + const css = document.createElement("link"); + css.href = link; + css.rel = "stylesheet" + document.head.appendChild(css); + }) + + } + static template() { + return ` +
+
+ + + + + + + + + + + + + + + + + + + +
+
+
+ `; + } + + static header() { + return ` + + + + `; + } + async mountQuill() { + // Register the customs format with Quill + const lib = await import("https://cdn.jsdelivr.net/npm/shadow-selection-polyfill"); + const getRange = lib.getRange; + + const Font = Quill.import('formats/font'); + Font.whitelist = ['mirza', 'roboto', 'sofia', 'slabo', 'inconsolata', 'ubuntu', 'cursive', 'Akronim', 'Monoton']; + const Link = Quill.import('formats/link'); + Link.sanitize = function (url) { + // modify url if desired + return url; + } + const SizeStyle = Quill.import('attributors/style/size'); + SizeStyle.whitelist = ['10px', '18px', '32px', '50px', '64px']; + Quill.register(SizeStyle, true); + Quill.register(Link, true); + Quill.register(Font, true); + const icons = Quill.import('ui/icons'); + const icon = `` + icons['link'] = icon; + const editorContainer = this.shadowRoot.querySelector('#editor-container') + const toolbarContainer = this.shadowRoot.querySelector('#toolbar-container') + const myQuill = new Quill(editorContainer, { + modules: { + toolbar: { + container: toolbarContainer, + }, + }, + theme: 'snow' + }); + const normalizeNative = (nativeRange) => { + + if (nativeRange) { + const range = nativeRange; + + if (range.baseNode) { + range.startContainer = nativeRange.baseNode; + range.endContainer = nativeRange.focusNode; + range.startOffset = nativeRange.baseOffset; + range.endOffset = nativeRange.focusOffset; + + if (range.endOffset < range.startOffset) { + range.startContainer = nativeRange.focusNode; + range.endContainer = nativeRange.baseNode; + range.startOffset = nativeRange.focusOffset; + range.endOffset = nativeRange.baseOffset; + } + } + + if (range.startContainer) { + return { + start: { node: range.startContainer, offset: range.startOffset }, + end: { node: range.endContainer, offset: range.endOffset }, + native: range + }; + } + } + + return null + }; + + myQuill.selection.getNativeRange = () => { + + const dom = myQuill.root.getRootNode(); + const selection = getRange(dom); + const range = normalizeNative(selection); + + return range; + }; + let fromEditor = false; + editorContainer.addEventListener("pointerup", (e) => { + fromEditor = false; + }); + editorContainer.addEventListener("pointerout", (e) => { + fromEditor = false; + }); + editorContainer.addEventListener("pointerdown", (e) => { + fromEditor = true; + }); + + document.addEventListener("selectionchange", () => { + if (fromEditor) { + myQuill.selection.update() + } + }); + + + myQuill.on('text-change', () => { + // keep qull data inside _data to communicate with Gradio + document.querySelector("#rich-text-root")._data = myQuill.getContents() + }) + return myQuill + } +} + +customElements.define('rich-text-editor', RichTextEditor); \ No newline at end of file diff --git a/gf/sd-webui-rich-text/scripts/rich_text_on_tab.py b/gf/sd-webui-rich-text/scripts/rich_text_on_tab.py new file mode 100644 index 0000000000000000000000000000000000000000..db3c04ca3e54d83cace3d366d26e1823b8e2cc82 --- /dev/null +++ b/gf/sd-webui-rich-text/scripts/rich_text_on_tab.py @@ -0,0 +1,314 @@ +import modules.scripts as scripts +import gradio as gr +import os + +from modules import script_callbacks +import math +import random +import os +import json +import time +import argparse +import torch +import numpy as np +from torchvision import transforms + +from scripts.models.utils.attention_utils import get_token_maps +from scripts.models.region_diffusion import RegionDiffusion +from scripts.models.region_diffusion_xl import RegionDiffusionXL +from 
scripts.models.utils.richtext_utils import seed_everything, parse_json, get_region_diffusion_input,\ + get_attention_control_input, get_gradient_guidance_input + + +import gradio as gr +from PIL import Image, ImageOps +from share_btn import community_icon_html, loading_icon_html, share_js, css + + +help_text = """ +If you are encountering an error or not achieving your desired outcome, here are some potential reasons and recommendations to consider: +1. If you format only a portion of a word rather than the complete word, an error may occur. +2. If you use font color and get completely corrupted results, you may consider decrease the color weight lambda. +3. Consider using a different seed. +""" + +canvas_html = """""" +get_js_data = """ +async (model_id, text_input, negative_prompt, num_segments, segment_threshold, inject_interval, inject_background, seed, color_guidance_weight, rich_text_input, height, width, steps, guidance_weights) => { + const richEl = document.getElementById("rich-text-root"); + const data = richEl? richEl.contentDocument.body._data : {}; + return [model_id, text_input, negative_prompt, num_segments, segment_threshold, inject_interval, inject_background, seed, color_guidance_weight, JSON.stringify(data), height, width, steps, guidance_weights]; +} +""" +set_js_data = """ +async (text_input) => { + const richEl = document.getElementById("rich-text-root"); + const data = text_input ? JSON.parse(text_input) : null; + if (richEl && data) richEl.contentDocument.body.setQuillContents(data); +} +""" + +get_window_url_params = """ +async (url_params) => { + const params = new URLSearchParams(window.location.search); + url_params = Object.fromEntries(params); + return [url_params]; +} +""" + +class RichText2Img(): + def __init__(self): + self.device = torch.device('cuda' if torch.cuda.is_available() else 'cpu') + # self.model = RegionDiffusion(self.device, 'runwayml/stable-diffusion-v1-5') + self.model = None + + +def load_url_params(url_params): + if 'prompt' in url_params: + return gr.update(visible=True), url_params + else: + return gr.update(visible=False), url_params + + +def on_ui_tabs(): + device = torch.device('cuda' if torch.cuda.is_available() else 'cpu') + # model = RegionDiffusion(device, 'runwayml/stable-diffusion-v1-5') + # model = RegionDiffusionXL() + richtext2img = RichText2Img() + + def generate( + model_id: str, + text_input: str, + negative_text: str, + num_segments: int, + segment_threshold: float, + inject_interval: float, + inject_background: float, + seed: int, + color_guidance_weight: float, + rich_text_input: str, + height: int, + width: int, + steps: int, + guidance_weight: float, + ): + model = richtext2img.model + if model_id == "runwayml/stable-diffusion-v1-5": + if model is None or model.model_id != model_id: + richtext2img.model = RegionDiffusion(device, model_id) + model = richtext2img.model + if width == 0 or height == 0: + width = 512 + height = 512 + elif model_id in ["stabilityai/stable-diffusion-xl-base-1.0", "Linaqruf/animagine-xl"]: + if model is None or model.model_id != model_id: + richtext2img.model = RegionDiffusionXL(model_id) + model = richtext2img.model + if width == 0 or height == 0: + width = 1024 + height = 1024 + else: + raise gr.Error("Please select a model.") + run_dir = 'results/' + os.makedirs(run_dir, exist_ok=True) + steps = 41 if not steps else steps + guidance_weight = 8.5 if not guidance_weight else guidance_weight + text_input = rich_text_input if rich_text_input != '' and rich_text_input != None else text_input + 
print('text_input', text_input, width, height, steps, guidance_weight, num_segments, segment_threshold, inject_interval, inject_background, color_guidance_weight, negative_text) + if (text_input == '' or rich_text_input == ''): + raise gr.Error("Please enter some text.") + # parse json to span attributes + base_text_prompt, style_text_prompts, footnote_text_prompts, footnote_target_tokens,\ + color_text_prompts, color_names, color_rgbs, size_text_prompts_and_sizes, use_grad_guidance = parse_json( + json.loads(text_input)) + + # create control input for region diffusion + region_text_prompts, region_target_token_ids, base_tokens = get_region_diffusion_input( + model, base_text_prompt, style_text_prompts, footnote_text_prompts, + footnote_target_tokens, color_text_prompts, color_names) + + # create control input for cross attention + text_format_dict = get_attention_control_input( + model, base_tokens, size_text_prompts_and_sizes) + + # create control input for region guidance + text_format_dict, color_target_token_ids = get_gradient_guidance_input( + model, base_tokens, color_text_prompts, color_rgbs, text_format_dict, color_guidance_weight=color_guidance_weight) + + seed_everything(seed) + + # get token maps from plain text to image generation. + begin_time = time.time() + if model.selfattn_maps is None and model.crossattn_maps is None: + model.remove_tokenmap_hooks() + model.register_tokenmap_hooks() + else: + model.remove_tokenmap_hooks() + model.remove_tokenmap_hooks() + if model_id == "runwayml/stable-diffusion-v1-5": + plain_img = model.produce_attn_maps([base_text_prompt], [negative_text], + height=height, width=width, num_inference_steps=steps, + guidance_scale=guidance_weight) + else: + plain_img = model.sample([base_text_prompt], negative_prompt=[negative_text], + height=height, width=width, num_inference_steps=steps, + guidance_scale=guidance_weight, run_rich_text=False) + print('time lapses to get attention maps: %.4f' % + (time.time()-begin_time)) + seed_everything(seed) + color_obj_masks, segments_vis, token_maps = get_token_maps(model.selfattn_maps, model.crossattn_maps, model.n_maps, run_dir, + height//8, width//8, color_target_token_ids[:-1], seed, + base_tokens, segment_threshold=segment_threshold, num_segments=num_segments, + return_vis=True) + seed_everything(seed) + model.masks, segments_vis, token_maps = get_token_maps(model.selfattn_maps, model.crossattn_maps, model.n_maps, run_dir, + height//8, width//8, region_target_token_ids[:-1], seed, + base_tokens, segment_threshold=segment_threshold, num_segments=num_segments, + return_vis=True) + color_obj_atten_all = torch.zeros_like(color_obj_masks[-1]) + for obj_mask in color_obj_masks[:-1]: + color_obj_atten_all += obj_mask + color_obj_masks = [transforms.functional.resize(color_obj_mask, (height, width), + interpolation=transforms.InterpolationMode.BICUBIC, + antialias=True) + for color_obj_mask in color_obj_masks] + text_format_dict['color_obj_atten'] = color_obj_masks + text_format_dict['color_obj_atten_all'] = color_obj_atten_all + model.remove_tokenmap_hooks() + + # generate image from rich text + begin_time = time.time() + seed_everything(seed) + if model_id == "runwayml/stable-diffusion-v1-5": + rich_img = model.prompt_to_img(region_text_prompts, [negative_text], + height=height, width=width, num_inference_steps=steps, + guidance_scale=guidance_weight, use_guidance=use_grad_guidance, + inject_selfattn=inject_interval, text_format_dict=text_format_dict, + inject_background=inject_background) + print('time lapses to 
generate image from rich text: %.4f' % + (time.time()-begin_time)) + return [plain_img[0], rich_img[0], segments_vis, token_maps] + else: + rich_img = model.sample(region_text_prompts, negative_prompt=[negative_text], + height=height, width=width, num_inference_steps=steps, + guidance_scale=guidance_weight, use_guidance=use_grad_guidance, + inject_selfattn=inject_interval, text_format_dict=text_format_dict, + inject_background=inject_background, run_rich_text=True) + print('time lapses to generate image from rich text: %.4f' % + (time.time()-begin_time)) + return [plain_img.images[0], rich_img.images[0], segments_vis, token_maps] + + with gr.Blocks(analytics_enabled=False) as ui_component: + url_params = gr.JSON({}, visible=False, label="URL Params") + gr.HTML("""

Expressive Text-to-Image Generation with Rich Text

+

[Website] | [Code] | [Paper]

+ """) + with gr.Row(): + with gr.Column(): + rich_text_el = gr.HTML(canvas_html, elem_id="canvas_html") + rich_text_input = gr.Textbox(value="", visible=False) + text_input = gr.Textbox( + label='Rich-text JSON Input', + visible=False, + max_lines=1, + placeholder='Example: \'{"ops":[{"insert":"a Gothic "},{"attributes":{"color":"#b26b00"},"insert":"church"},{"insert":" in a the sunset with a beautiful landscape in the background.\n"}]}\'', + elem_id="text_input" + ) + model_id = gr.Radio(choices=["runwayml/stable-diffusion-v1-5", "stabilityai/stable-diffusion-xl-base-1.0", "Linaqruf/animagine-xl"], value="runwayml/stable-diffusion-v1-5", label="Model ID", elem_id="model_id") + negative_prompt = gr.Textbox( + label='Negative Prompt', + max_lines=1, + placeholder='Example: poor quality, blurry, dark, low resolution, low quality, worst quality', + elem_id="negative_prompt" + ) + segment_threshold = gr.Slider(label='Token map threshold', + info='(See less area in token maps? Decrease this. See too much area? Increase this.)', + minimum=0, + maximum=1, + step=0.01, + value=0.45) + inject_interval = gr.Slider(label='Detail preservation', + info='(To preserve more structure from plain-text generation, increase this. To see more rich-text attributes, decrease this.)', + minimum=0, + maximum=1, + step=0.01, + value=0.3) + inject_background = gr.Slider(label='Unformatted token preservation', + info='(To affect less the tokens without any rich-text attributes, increase this.)', + minimum=0, + maximum=1, + step=0.01, + value=0.3) + color_guidance_weight = gr.Slider(label='Color weight', + info='(To obtain more precise color, increase this, while too large value may cause artifacts.)', + minimum=0, + maximum=2, + step=0.1, + value=0.5) + num_segments = gr.Slider(label='Number of segments', + minimum=2, + maximum=20, + step=1, + value=9) + seed = gr.Slider(label='Seed', + minimum=0, + maximum=100000, + step=1, + value=6, + elem_id="seed" + ) + with gr.Accordion('Other Parameters', open=False): + steps = gr.Slider(label='Number of Steps', + minimum=0, + maximum=500, + step=1, + value=41) + guidance_weight = gr.Slider(label='CFG weight', + minimum=0, + maximum=50, + step=0.1, + value=8.5) + width = gr.Slider(minimum=64, maximum=2048, step=8, label="Width", value=0) + height = gr.Slider(minimum=64, maximum=2048, step=8, label="Height", value=0) + + with gr.Row(): + with gr.Column(scale=1, min_width=100): + generate_button = gr.Button("Generate") + with gr.Column(): + richtext_result = gr.Image( + label='Rich-text', elem_id="rich-text-image") + richtext_result.style(height=784) + with gr.Row(): + plaintext_result = gr.Image( + label='Plain-text', elem_id="plain-text-image") + segments = gr.Image(label='Segmentation') + with gr.Row(): + token_map = gr.Image(label='Token Maps') + + generate_button.click(fn=lambda: gr.update(visible=False), inputs=None, queue=False).then( + fn=generate, + inputs=[ + model_id, + text_input, + negative_prompt, + num_segments, + segment_threshold, + inject_interval, + inject_background, + seed, + color_guidance_weight, + rich_text_input, + height, + width, + steps, + guidance_weight, + ], + outputs=[plaintext_result, richtext_result, segments, token_map], + _js=get_js_data + ).then( + fn=lambda: gr.update(visible=True), inputs=None, queue=False) + text_input.change( + fn=None, inputs=[text_input], outputs=None, _js=set_js_data, queue=False) + return [(ui_component, "rich-text2img", "rich_text2img")] + +script_callbacks.on_ui_tabs(on_ui_tabs) diff --git 
a/gf/sd-webui-rich-text/scripts/rich_text_settings.py b/gf/sd-webui-rich-text/scripts/rich_text_settings.py new file mode 100644 index 0000000000000000000000000000000000000000..a3f62691a0d521ad04198f2faf5391863b8d5546 --- /dev/null +++ b/gf/sd-webui-rich-text/scripts/rich_text_settings.py @@ -0,0 +1,20 @@ +import modules.scripts as scripts +import gradio as gr +import os + +from modules import shared +from modules import script_callbacks + +def on_ui_settings(): + section = ('template', "Rich-Text-to-Image") + shared.opts.add_option( + "option1", + shared.OptionInfo( + False, + "This is a placeholder for option. It is not used yet.", + gr.Checkbox, + {"interactive": True}, + section=section) + ) + +script_callbacks.on_ui_settings(on_ui_settings) diff --git a/gf/sd-webui-rich-text/share_btn.py b/gf/sd-webui-rich-text/share_btn.py new file mode 100644 index 0000000000000000000000000000000000000000..1c1c77321d8d8cdb92a3a52606ab78a18d519d9a --- /dev/null +++ b/gf/sd-webui-rich-text/share_btn.py @@ -0,0 +1,116 @@ +community_icon_html = """""" + +loading_icon_html = """""" + +share_js = """async () => { + async function uploadFile(file){ + const UPLOAD_URL = 'https://huggingface.co/uploads'; + const response = await fetch(UPLOAD_URL, { + method: 'POST', + headers: { + 'Content-Type': file.type, + 'X-Requested-With': 'XMLHttpRequest', + }, + body: file, /// <- File inherits from Blob + }); + const url = await response.text(); + return url; + } + async function getInputImageFile(imageEl){ + const res = await fetch(imageEl.src); + const blob = await res.blob(); + const imageId = Date.now(); + const fileName = `rich-text-image-${{imageId}}.png`; + return new File([blob], fileName, { type: 'image/png'}); + } + const gradioEl = document.querySelector("gradio-app").shadowRoot || document.querySelector('body > gradio-app'); + const richEl = document.getElementById("rich-text-root"); + const data = richEl? 
richEl.contentDocument.body._data : {}; + const text_input = JSON.stringify(data); + const negative_prompt = gradioEl.querySelector('#negative_prompt input').value; + const seed = gradioEl.querySelector('#seed input').value; + const richTextImg = gradioEl.querySelector('#rich-text-image img'); + const plainTextImg = gradioEl.querySelector('#plain-text-image img'); + const text_input_obj = JSON.parse(text_input); + const plain_prompt = text_input_obj.ops.map(e=> e.insert).join(''); + const linkSrc = `https://huggingface.co/spaces/songweig/rich-text-to-image?prompt=${encodeURIComponent(text_input)}`; + + const titleTxt = `RT2I: ${plain_prompt.slice(0, 50)}...`; + const shareBtnEl = gradioEl.querySelector('#share-btn'); + const shareIconEl = gradioEl.querySelector('#share-btn-share-icon'); + const loadingIconEl = gradioEl.querySelector('#share-btn-loading-icon'); + if(!richTextImg){ + return; + }; + shareBtnEl.style.pointerEvents = 'none'; + shareIconEl.style.display = 'none'; + loadingIconEl.style.removeProperty('display'); + + const richImgFile = await getInputImageFile(richTextImg); + const plainImgFile = await getInputImageFile(plainTextImg); + const richImgURL = await uploadFile(richImgFile); + const plainImgURL = await uploadFile(plainImgFile); + + const descriptionMd = ` +### Plain Prompt +${plain_prompt} + +🔗 Shareable Link + Params: [here](${linkSrc}) + +### Rich Tech Image + + +### Plain Text Image + + +`; + const params = new URLSearchParams({ + title: titleTxt, + description: descriptionMd, + }); + const paramsStr = params.toString(); + window.open(`https://huggingface.co/spaces/songweig/rich-text-to-image/discussions/new?${paramsStr}`, '_blank'); + shareBtnEl.style.removeProperty('pointer-events'); + shareIconEl.style.removeProperty('display'); + loadingIconEl.style.display = 'none'; +}""" + +css = """ + #share-btn-container { + display: flex; + padding-left: 0.5rem !important; + padding-right: 0.5rem !important; + background-color: #000000; + justify-content: center; + align-items: center; + border-radius: 9999px !important; + width: 13rem; + margin-top: 10px; + margin-left: auto; + flex: unset !important; + } + #share-btn { + all: initial; + color: #ffffff; + font-weight: 600; + cursor: pointer; + font-family: 'IBM Plex Sans', sans-serif; + margin-left: 0.5rem !important; + padding-top: 0.25rem !important; + padding-bottom: 0.25rem !important; + right:0; + } + #share-btn * { + all: unset !important; + } + #share-btn-container div:nth-child(-n+2){ + width: auto !important; + min-height: 0px !important; + } + #share-btn-container .wrap { + display: none !important; + } +""" diff --git a/gf/sd-webui-sdxl-refiner-hack/.gitignore b/gf/sd-webui-sdxl-refiner-hack/.gitignore new file mode 100644 index 0000000000000000000000000000000000000000..18e5f36ccf678abee94262fd8843294c84449a62 --- /dev/null +++ b/gf/sd-webui-sdxl-refiner-hack/.gitignore @@ -0,0 +1 @@ +scripts/__pycache__ \ No newline at end of file diff --git a/gf/sd-webui-sdxl-refiner-hack/LICENSE b/gf/sd-webui-sdxl-refiner-hack/LICENSE new file mode 100644 index 0000000000000000000000000000000000000000..ef10bf9f8afc2ea6c61d89ee029febd5094ff9dc --- /dev/null +++ b/gf/sd-webui-sdxl-refiner-hack/LICENSE @@ -0,0 +1,21 @@ +MIT License + +Copyright (c) 2023 Tom Haelbich + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, 
publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.
diff --git a/gf/sd-webui-sdxl-refiner-hack/README.md b/gf/sd-webui-sdxl-refiner-hack/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..3968327ca9ecae24a7763895aaf674c24c30f253
--- /dev/null
+++ b/gf/sd-webui-sdxl-refiner-hack/README.md
@@ -0,0 +1,44 @@
+# This extension becomes unnecessary with release 1.6 of A1111!
+
+There will be native SDXL support, and the model used for the hires. fix pass can be changed. See:
+
+[release notes](https://github.com/AUTOMATIC1111/stable-diffusion-webui/releases/tag/v1.6.0-RC)
+
+


+ + + +# ~~SD WebUI SDXL Refiner Hack~~ + +~~Use the SDXL Refiner Model for the high-res fix pass. Hopefully, we won't need this extension for a long time.~~ + +### ~~This is not the same as the Refiner used in Comfy UI!~~ + +
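For readers who want the mechanics behind the hack rather than the UI steps: below is a minimal sketch (an illustration only, assuming the AUTOMATIC1111 WebUI runtime) of the checkpoint swap that `scripts/main.py` further down in this diff performs in its `before_process_batch` and `before_hr` hooks. `swap_to` is a hypothetical helper name introduced here for illustration; it is not part of the extension.

```python
# Minimal sketch of the checkpoint swap used by scripts/main.py (see later in
# this diff). It only runs inside the AUTOMATIC1111 WebUI, where the "modules"
# package is importable; swap_to is a hypothetical helper name.
import modules.sd_models as sd_models
import modules.shared as shared


def swap_to(checkpoint_title, p):
    # Fuzzy-match the configured name against the installed checkpoints
    # (the function really is spelled "closet" in A1111's sd_models).
    info = sd_models.get_closet_checkpoint_match(checkpoint_title)
    # Load those weights into the shared model.
    sd_models.reload_model_weights(shared.sd_model, info)
    # Keep the job's active-checkpoint setting in sync with the loaded weights.
    p.override_settings['sd_model_checkpoint'] = info.name
    return info

# Roughly: the extension does the equivalent of swap_to(base_model, p) before
# the first (low-res) pass and swap_to(refiner_model, p) in before_hr, so the
# refiner weights are active only during the high-res fix pass.
```

Recording the swap in `p.override_settings` keeps the generation parameters consistent with the weights that are actually loaded for each pass.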
+
+~~You need to enable high-res fix.~~
+
+### ~~System Requirements~~
+
+- ~~32GB RAM (Not VRAM)~~
+
+~~SDXL requires a lot of RAM, so the more the better. With 32GB, it will use **ALL** of your RAM. Close **everything** you don't need to have open. Also, use [this pull request](https://github.com/AUTOMATIC1111/stable-diffusion-webui/pull/11958) with 32GB RAM to reduce RAM usage.~~
+
+### ~~Recommendations~~
+
+- ~~Use [this](https://huggingface.co/madebyollin/sdxl-vae-fp16-fix) VAE (fp16)~~
+- ~~Set batch count to 1 (for 32GB RAM)~~
+- ~~Other settings as you would normally~~
+
+~~My settings:~~
+
+- ~~Upscaler: Latent~~
+- ~~Upscale by: 1.25~~
+- ~~Denoising: 0.53~~
+
+
+~~The default values can be changed in the settings~~
+
+~~Of course, this extension can simply be used to pick a different checkpoint for the high-res fix pass for non-SDXL models.~~
diff --git a/gf/sd-webui-sdxl-refiner-hack/scripts/__pycache__/main.cpython-310.pyc b/gf/sd-webui-sdxl-refiner-hack/scripts/__pycache__/main.cpython-310.pyc
new file mode 100644
index 0000000000000000000000000000000000000000..8181572c87015aeca6c57abbf01db5e146e72319
Binary files /dev/null and b/gf/sd-webui-sdxl-refiner-hack/scripts/__pycache__/main.cpython-310.pyc differ
diff --git a/gf/sd-webui-sdxl-refiner-hack/scripts/main.py b/gf/sd-webui-sdxl-refiner-hack/scripts/main.py
new file mode 100644
index 0000000000000000000000000000000000000000..e5ce021ec8eaa25ad584a3bc9bcab7ac5f34919f
--- /dev/null
+++ b/gf/sd-webui-sdxl-refiner-hack/scripts/main.py
@@ -0,0 +1,78 @@
+import sys
+import time
+import modules.scripts as scripts
+import modules
+import modules.shared as shared
+import gradio as gr
+from modules.ui_components import FormColumn, FormRow
+from modules import script_callbacks
+
+
+def on_ui_settings():
+    section = ("sdxlHiresHack", "SDXL Refiner Hack")
+    shared.opts.add_option(
+        key="sdxl_base_model",
+        info=shared.OptionInfo(
+            "sd_xl_base_1.0.safetensors",
+            "SDXL Base model",
+            section=section)
+    )
+
+    shared.opts.add_option(
+        key="sdxl_refiner_model",
+        info=shared.OptionInfo(
+            "sd_xl_refiner_1.0.safetensors",
+            "SDXL refiner model",
+            section=section)
+    )
+
+script_callbacks.on_ui_settings(on_ui_settings)
+
+class sdxlRefinerHack(scripts.Script):
+
+    def __init__(self):
+        self.info_base = None
+        self.info_hr = None
+        self.first_pass = True
+
+    def title(self):
+        return "SDXL Refiner Hack"
+
+    def show(self, is_img2img):
+        return scripts.AlwaysVisible
+
+    def ui(self, is_img2img):
+        with gr.Accordion(self.title(), open=False):
+            gr.Markdown("will become unnecessary in the 1.6 release of A1111")
+            if is_img2img:
+                gr.Markdown("will not do anything in img2img")
+            else:
+
+                with FormRow():
+                    is_enabled = gr.Checkbox(value=False, label="Enable")
+                    with FormColumn():
+                        base_model = gr.Textbox(lines=1, label="SDXL base model name", value=getattr(shared.opts, "sdxl_base_model", ""))
+                        refiner_model = gr.Textbox(lines=1, label="SDXL refiner model name", value=getattr(shared.opts, "sdxl_refiner_model", ""))
+                return [base_model, refiner_model, is_enabled]
+
+
+    def before_process_batch(self, p, *args, **kwargs):
+        print(f"\nEnabled: {args[2]}\n\ncheckpoint: {args[0]}")
+        if args[2]:
+            if self.first_pass:
+                self.first_pass = False
+            else:
+                modules.sd_models.unload_model_weights(shared.sd_model, self.info_hr)
+            self.info_base = modules.sd_models.get_closet_checkpoint_match(args[0])
+            modules.sd_models.reload_model_weights(shared.sd_model, self.info_base)
+            p.override_settings['sd_model_checkpoint'] = self.info_base.name
+
+    def before_hr(self, p, *args, **kwargs):
+        if args[2]:
+            modules.sd_models.unload_model_weights(shared.sd_model, self.info_base)
+            self.info_hr = modules.sd_models.get_closet_checkpoint_match(args[1])
+            modules.sd_models.reload_model_weights(shared.sd_model, self.info_hr)
+            p.override_settings['sd_model_checkpoint'] = self.info_hr.name
+            p.extra_generation_params['base model'] = self.info_base.name
+
+
diff --git a/gf/sd-webui-split-prompt/.gitattributes b/gf/sd-webui-split-prompt/.gitattributes
new file mode 100644
index 0000000000000000000000000000000000000000..dfe0770424b2a19faf507a501ebfc23be8f54e7b
--- /dev/null
+++ 
b/gf/sd-webui-split-prompt/.gitattributes @@ -0,0 +1,2 @@ +# Auto detect text files and perform LF normalization +* text=auto diff --git a/gf/sd-webui-split-prompt/.gitignore b/gf/sd-webui-split-prompt/.gitignore new file mode 100644 index 0000000000000000000000000000000000000000..3eb60f9469dcc352f12865aa6254fe51ec81bc19 --- /dev/null +++ b/gf/sd-webui-split-prompt/.gitignore @@ -0,0 +1 @@ +分類/ \ No newline at end of file diff --git a/gf/sd-webui-split-prompt/scripts/__pycache__/split.cpython-310.pyc b/gf/sd-webui-split-prompt/scripts/__pycache__/split.cpython-310.pyc new file mode 100644 index 0000000000000000000000000000000000000000..6e7c5896ee7ca39e8fcf2fc3f2993cc073f6bc93 Binary files /dev/null and b/gf/sd-webui-split-prompt/scripts/__pycache__/split.cpython-310.pyc differ diff --git a/gf/sd-webui-split-prompt/scripts/split.py b/gf/sd-webui-split-prompt/scripts/split.py new file mode 100644 index 0000000000000000000000000000000000000000..b1f8e2fa8dbbc142a40eda25a43dd51fe9737aac --- /dev/null +++ b/gf/sd-webui-split-prompt/scripts/split.py @@ -0,0 +1,154 @@ +import os +import os.path +import re +import gradio as gr +from modules import script_callbacks, shared +from modules.ui import create_refresh_button + + +def add_tab(): + with gr.Blocks(analytics_enabled=False) as ui: + with gr.Row().style(equal_height=False): + with gr.Column(variant='panel'): + gr.HTML( + value="

Differentiate prompts according to different types.") + + replace_underscore = gr.Checkbox( + value=lambda: True, + label="Replace underscores with spaces", + elem_id="replace_underscore_checkbox") + + need_split_prompts = gr.Textbox( + lines=6, + label="Prompts", + elem_id="need_split_prompts") + + button_split_prompts = gr.Button( + elem_id="button_split_prompts", + value="Split", + variant='primary') + + with gr.Column(variant='panel'): + submit_result = gr.Textbox( + label="Result", + elem_id="split_result", + interactive=False, + show_copy_button=True) + + button_split_prompts.click( + fn=do_split, + inputs=[ + need_split_prompts, + replace_underscore, + ], + outputs=[submit_result] + ) + + return [(ui, "Split prompts", "split_prompts")] + + +def load_txt_to_dictionary(filePath, main_folder, file_contents): + with open(os.path.join(filePath), "r", encoding="utf-8") as f: + # Add the contents of the txt file to the main folder's list + file_contents[main_folder].extend( + [word.lower() for word in f.read().splitlines()]) + + +def load_classification_files(): + base_path = os.path.join(os.path.dirname( + os.path.dirname(os.path.realpath(__file__))), "分類") + + # A dictionary to hold the contents of the txt files classified by the main folders + file_contents = {} + + # Iterate through the main folders in the "分類" directory + for main_folder in os.listdir(base_path): + main_folder_path = os.path.join(base_path, main_folder) + + # Only process directories (main folders) + if os.path.isdir(main_folder_path): + file_contents[main_folder] = [] + + # Iterate through all subdirectories and txt files inside the main folder + for dirpath, dirnames, filenames in os.walk(main_folder_path): + for file in filenames: + if file.endswith(".txt"): + load_txt_to_dictionary( + os.path.join(dirpath, file), main_folder, file_contents) + elif os.path.isfile(main_folder_path) and main_folder.endswith(".txt"): + file_contents[main_folder.replace(".txt", "")] = [] + load_txt_to_dictionary( + main_folder_path, main_folder.replace(".txt", ""), file_contents) + + return file_contents + + +def simplify_word(word): + # 使用正则表达式提取词部分 + match = re.match(r'^\(([\w\s]+):\d+\.\d+\)$|^\(+([\w\s]+)\)+$', word) + if match: + processed_word = match.group(1) or match.group(2) + return processed_word + else: + return word + +# Modify do_split function to use the classification files + + +def do_split(need_split_prompts, replace_underscore): + classifications = load_classification_files() + results = {key: [] for key in classifications.keys()} + + if "其他" not in results: + results["其他"] = [] + + if "lora" not in results: + results["lora"] = [] + + prompts = re.split(r'[,|\n]', need_split_prompts) + for prompt in prompts: + prompt = prompt.strip().lower() + + if not prompt: + continue + + if prompt.startswith(" +;p \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/eyes/eyes_emotions_and_expressions.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/eyes/eyes_emotions_and_expressions.txt" new file mode 100644 index 0000000000000000000000000000000000000000..762d551d9bfa3bd6f357c2c48feb9f1a4fc451c5 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/eyes/eyes_emotions_and_expressions.txt" @@ -0,0 +1,17 @@ +> < +X3 +XD +DX +O o +0 0 +3 3 +6 9 +@ @ +^ ^ +^o^ +| | +|| || += = ++ + +. . 
+ \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/eyes/eyes_misc.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/eyes/eyes_misc.txt" new file mode 100644 index 0000000000000000000000000000000000000000..8f95b077309fa1ff198673784283086283bfa1c8 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/eyes/eyes_misc.txt" @@ -0,0 +1,20 @@ +akanbe +asymmetrical irises +big eyes +blind +cross-eyed +drawn on eyes +eyeball +eye beam +eye poke +eye pop +half-closed eyes +persona eyes +shading eyes +squinting +staring +uneven eyes +upturned eyes +wall-eyed +wide-eyed +wince \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/eyes/eyes_number.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/eyes/eyes_number.txt" new file mode 100644 index 0000000000000000000000000000000000000000..1578f8fdf9e0dd20c5fae7158ea1f8390a52a2d8 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/eyes/eyes_number.txt" @@ -0,0 +1,5 @@ +missing eye +one-eyed +third eye +extra eyes +no eyes \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/eyes/eyes_pupils.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/eyes/eyes_pupils.txt" new file mode 100644 index 0000000000000000000000000000000000000000..3a32da01d33cf60e3e0e87eea45e6c41fe070fb5 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/eyes/eyes_pupils.txt" @@ -0,0 +1,12 @@ +constricted pupils +dilated pupils +horizontal pupils +no pupils +slit pupils +symbol-shaped pupils ++ + +diamond-shaped pupils +flower-shaped pupils +heart-shaped pupils +star-shaped pupils +mismatched pupils \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/eyes/eyes_sclera.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/eyes/eyes_sclera.txt" new file mode 100644 index 0000000000000000000000000000000000000000..03500a3af1898ae58e4bf43aafa1359326b68d11 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/eyes/eyes_sclera.txt" @@ -0,0 +1,10 @@ +blue sclera +black sclera +blank eyes +bloodshot eyes +green sclera +mismatched sclera +no sclera +orange sclera +red sclera +yellow sclera \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/eyes/iris/eyes_iris_colors.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/eyes/iris/eyes_iris_colors.txt" new file mode 100644 index 0000000000000000000000000000000000000000..6d1486ec53ab390eb2a309d5c8a8c60f06f33c87 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/eyes/iris/eyes_iris_colors.txt" @@ -0,0 +1,13 @@ +aqua eyes +black eyes +blue eyes +brown eyes +green eyes +grey eyes +orange eyes +purple eyes +pink eyes +red eyes +white eyes +yellow eyes +amber eyes \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/eyes/iris/eyes_iris_form.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/eyes/iris/eyes_iris_form.txt" new file mode 100644 index 0000000000000000000000000000000000000000..d4489196b6de0b5ec2ab9c3995544d24a26cb461 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/eyes/iris/eyes_iris_form.txt" @@ -0,0 +1,4 @@ +@ @ +dashed eyes 
+Pac-man eyes +ringed eyes \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/eyes/iris/eyes_iris_multiple_colors.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/eyes/iris/eyes_iris_multiple_colors.txt" new file mode 100644 index 0000000000000000000000000000000000000000..23ef4a52a7aceaf39f153688526ee9eaf9ea9f2c --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/eyes/iris/eyes_iris_multiple_colors.txt" @@ -0,0 +1,2 @@ +heterochromia +multicolored eyes \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/eyes/more_appearance/eyes_more_appearance_animal.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/eyes/more_appearance/eyes_more_appearance_animal.txt" new file mode 100644 index 0000000000000000000000000000000000000000..c5cf2faecd47396488fdcdf177e54ff8f7533823 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/eyes/more_appearance/eyes_more_appearance_animal.txt" @@ -0,0 +1,5 @@ +amphibian eyes +button eyes +cephalopod eyes +compound eyes +horizontal pupils \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/eyes/more_appearance/eyes_more_appearance_other.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/eyes/more_appearance/eyes_more_appearance_other.txt" new file mode 100644 index 0000000000000000000000000000000000000000..c9f5d88ceb0df8fd5be5c10cb2b7fb1655d91e03 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/eyes/more_appearance/eyes_more_appearance_other.txt" @@ -0,0 +1,2 @@ +eye reflection +text in eyes \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/eyes/more_appearance/eyes_more_appearance_series_specific.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/eyes/more_appearance/eyes_more_appearance_series_specific.txt" new file mode 100644 index 0000000000000000000000000000000000000000..423015c91415a9f766423c52ca1988fe80d31af7 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/eyes/more_appearance/eyes_more_appearance_series_specific.txt" @@ -0,0 +1,3 @@ +Geass +Sharingan +Mangekyou Sharingan \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/eyes/more_appearance/eyes_more_appearance_stylistic.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/eyes/more_appearance/eyes_more_appearance_stylistic.txt" new file mode 100644 index 0000000000000000000000000000000000000000..b454dc45427e93980433d92cbaf2f1a1a123f7b3 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/eyes/more_appearance/eyes_more_appearance_stylistic.txt" @@ -0,0 +1,12 @@ +crazy eyes +empty eyes +dashed eyes +heart-shaped eyes +Nonowa +solid circle eyes +o o +0 0 +jitome +tareme +tsurime +sanpaku \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/face.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/face.txt" new file mode 100644 index 0000000000000000000000000000000000000000..b378aa2366bb8088429a95249c8d320bdeb7a768 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/face.txt" @@ -0,0 +1,12 @@ +laughing +giggling +smile +embarrassed +smug +kubrick stare +trembling +scared +;o +:p +;d 
+flustered \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/face/face_drawing_styles.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/face/face_drawing_styles.txt" new file mode 100644 index 0000000000000000000000000000000000000000..d84c7693983a2df5e1f96fd4c001e064f7cac5ce --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/face/face_drawing_styles.txt" @@ -0,0 +1,12 @@ +constricted pupils +cross-eyed +dot mouth +dot nose +mouth drool +no nose +no mouth +rectangular mouth +sideways mouth +split mouth +wavy mouth +wide-eyed \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/face/face_emotes.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/face/face_emotes.txt" new file mode 100644 index 0000000000000000000000000000000000000000..d60475a1f56c83c4c3afa5742de3937587f2ee18 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/face/face_emotes.txt" @@ -0,0 +1,57 @@ +;) +:d +;d +xd +d: +:3 +;3 +x3 +3: +uwu +:p +;p +:q +;q +>:) +>:( +:t +:i +:/ +:| +:x +:c +c: +:< +;< +:<> +:> +:>= +:o +;o +o3o +(-3-) +>3< +o o +0 0 +| | +. . +solid circle eyes +heart-shaped eyes +^ ^ +^o^ +\(^o^)/ +^q^ +> < +xd [ XD ] +x3 +>o< +@ @ +> @ ++ + ++ - += = +=^= +=v= + +<|> <|> +Drawing \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/face/face_emotions.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/face/face_emotions.txt" new file mode 100644 index 0000000000000000000000000000000000000000..3f5e3ec1f31cdbd82ffe1fd2880c1510d250c19f --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/face/face_emotions.txt" @@ -0,0 +1,61 @@ +angry +anger vein +annoyed +clenched teeth +annoyed +blush +blush stickers +embarrassed +full-face blush +nose blush +bored +closed eyes +confused +crazy +determined +disappointed +disdain +disgust +despair +drunk +envy +expressionless +evil +facepalm +flustered +frustrated +furrowed brow +grimace +guilt +happy +kubrick stare +lonely +nervous +nosebleed +one eye closed +open mouth +parted lips +pain +pout +raised eyebrow +rape face +rolling eyes +sad +depressed +frown +gloom (expression) +tears +scared +panicking +worried +serious +sigh +sleepy +tired +sulking +surprised +thinking +pensive +v-shaped eyebrows +wince +upset \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/face/face_looking_at.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/face/face_looking_at.txt" new file mode 100644 index 0000000000000000000000000000000000000000..12473fe887f1d30c555aa8c9dcf19661b8a3c12c --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/face/face_looking_at.txt" @@ -0,0 +1,4 @@ +food awe +breast awe +penis awe +staring \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/face/face_main.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/face/face_main.txt" new file mode 100644 index 0000000000000000000000000000000000000000..c4148e1601281850ebf5238dbfea8a54d2ebf34c --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/face/face_main.txt" @@ -0,0 +1 @@ +expressions \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/face/face_misc.txt" 
"b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/face/face_misc.txt" new file mode 100644 index 0000000000000000000000000000000000000000..09693dbde873317bb57e2814231ec82a26a3c660 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/face/face_misc.txt" @@ -0,0 +1,10 @@ +awesome face +face of the people who sank all their money into the fx +foodgasm +henohenomoheji +nonowa +portrait +profile +smiley face +troll face +uso da \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/face/face_sexual.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/face/face_sexual.txt" new file mode 100644 index 0000000000000000000000000000000000000000..0d3512dbca29c68d216495fa4e169b97d7a074a5 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/face/face_sexual.txt" @@ -0,0 +1,9 @@ +afterglow +ahegao +fucked silly +aroused +fucked silly +in heat +naughty face +seductive smile +torogao \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/face/face_smile.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/face/face_smile.txt" new file mode 100644 index 0000000000000000000000000000000000000000..0bedc0f4ff15285583d9d036c6c8b270cc7dfcd9 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/face/face_smile.txt" @@ -0,0 +1,11 @@ +crazy smile +evil smile +fingersmile +forced smile +glasgow smile +grin +evil grin +light smile +sad smile +seductive smile +stifled laugh \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/face/face_smile_main.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/face/face_smile_main.txt" new file mode 100644 index 0000000000000000000000000000000000000000..36d697f0bedb4243011969ebfd89c246d3a66f60 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/face/face_smile_main.txt" @@ -0,0 +1 @@ +smile \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/face/face_smug.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/face/face_smug.txt" new file mode 100644 index 0000000000000000000000000000000000000000..a5c71f077a398140e5bcfc134348973031cc1111 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/face/face_smug.txt" @@ -0,0 +1,4 @@ +doyagao +smirk +smug +troll face \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/face/face_smug_main.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/face/face_smug_main.txt" new file mode 100644 index 0000000000000000000000000000000000000000..76f5daaa2b58759c9f698a2f1db00bbee1ee5377 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/face/face_smug_main.txt" @@ -0,0 +1 @@ +smug \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/face/surprised_scared_sad/face_surprised_scared_sad.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/face/surprised_scared_sad/face_surprised_scared_sad.txt" new file mode 100644 index 0000000000000000000000000000000000000000..2bba8514e784b4ae4428e48213f6657d3797e765 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/face/surprised_scared_sad/face_surprised_scared_sad.txt" @@ -0,0 
+1,12 @@ +^^^ +color drain +depressed +despair +gloom (expression) +horrified +screaming +sobbing +turn pale +trembling +wavy mouth +upset \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/face/surprised_scared_sad/face_surprised_scared_sad_main.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/face/surprised_scared_sad/face_surprised_scared_sad_main.txt" new file mode 100644 index 0000000000000000000000000000000000000000..684c3a0ffe199e39f92bbe7e8f4221cd689d2880 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/face/surprised_scared_sad/face_surprised_scared_sad_main.txt" @@ -0,0 +1,3 @@ +surprised +scared +sad \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/gen-modifier.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/gen-modifier.txt" new file mode 100644 index 0000000000000000000000000000000000000000..1211a665719aa782fb1e42495b9f2c467319cb64 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/gen-modifier.txt" @@ -0,0 +1,68 @@ +60s kitsch and psychedelia +abstract illusionism +ambrotype +american romanticism +anaglyph effect +anaglyph filter +blueprint +brutalism +chillwave +cottagecore +crayon art +cyanotype +dark academia +digital art +dutch golden age +fractalism +fresco +futuresynth +german romanticism +glitch art +gothic art +graffiti +hyperrealism +infrared +Lovecraftian +magic realism +multiple exposure +narrative realism +photorealism +realism +retrowave +Sabattier effect +Sabattier filter +still life +street art +surrealism +synthwave +vaporwave +afrofuturism +anime +avant-garde +calotype +camcorder effect +cubism +daguerreotype +doge +expressionism +figurativism +fisheye lens +gothic +holography +impressionism +inverted colors +long exposure +mac and cheese +macro lens +manga +modernism +naturalism +optical illusion +positivism +stuckism +symbolism +telephoto lens +the matrix +tilt-shift +tintype +tonalism diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/gender-ext.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/gender-ext.txt" new file mode 100644 index 0000000000000000000000000000000000000000..47e873445856dbb4f1c7bdf2333e7c4f6ba41cbb --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/gender-ext.txt" @@ -0,0 +1,54 @@ +girl +boy +female +male +agender +androgyne +androgynous +bigender +cis +cisgender +cis female +cis male +cis man +cis woman +cisgender female +cisgender male +cisgender man +cisgender woman +female to male +ftm +gender fluid +gender nonconforming +gender questioning +gender variant +genderqueer +intersex +male to female +mtf +neither +neutrois +non-binary +other +pangender +trans +trans female +trans male +trans man +trans person +trans woman +transfeminine +transgender +transgender female +transgender male +transgender man +transgender person +transgender woman +transmasculine +transsexual +transsexual female +transsexual male +transsexual man +transsexual person +transsexual woman +two-spirit diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/gender.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/gender.txt" new file mode 100644 index 0000000000000000000000000000000000000000..191d05e40df113523f192370e30ff76152147f6a --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/gender.txt" 
@@ -0,0 +1,5 @@ +girl +boy +female +male +non-binary diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/groups_by_gender_boys.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/groups_by_gender_boys.txt" new file mode 100644 index 0000000000000000000000000000000000000000..5b94be6f7e5d356d5e1020ecf9751d1927def92e --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/groups_by_gender_boys.txt" @@ -0,0 +1,7 @@ +multiple boys +1girl +2boys +3boys +4boys +5boys +6+boys \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/groups_by_gender_girls.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/groups_by_gender_girls.txt" new file mode 100644 index 0000000000000000000000000000000000000000..b2f33a839db45804487d5169559890306cca5be8 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/groups_by_gender_girls.txt" @@ -0,0 +1,7 @@ +multiple girls +1girl +2girls +3girls +4girls +5girls +6+girls \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/hair.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/hair.txt" new file mode 100644 index 0000000000000000000000000000000000000000..d1e55efd9e8341edc0f057dbe8bd9c032e590ff1 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/hair.txt" @@ -0,0 +1,46 @@ +undercut +buzzcut +dreadlocks +braids +wavy hair +curly hair +short hair +very short hair +long hair +very long hair +spiral hair +full hair +layered hair +thick hair +face framing hair +hair covering eyes +straight hair +braided hair +curling iron hair +hime hair +ojousama hair +hair buns +punk hair +slicked back hair +side-swept hair +royal hair +updo hair +drill bits hair +long bangs +wispy bangs +blinding bangs +ponytail +pigtails +twintails +messy hair +crazy hair +wild hair +bed head hair +shaggy hair +bob haircut +towering hair +ahoge hair +odango hair +spiky hair +edgy hair +floating hair \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/hair/color/hair_color.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/hair/color/hair_color.txt" new file mode 100644 index 0000000000000000000000000000000000000000..b129bc395378996fdeb727047e3433f1186f7978 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/hair/color/hair_color.txt" @@ -0,0 +1,18 @@ +aqua hair +black hair +blonde hair +blue hair +light blue hair +dark blue hair +brown hair +light brown hair +green hair +dark green hair +light green hair +grey hair +orange hair +pink hair +purple hair +light purple hair +red hair +white hair \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/hair/color/hair_color_misc.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/hair/color/hair_color_misc.txt" new file mode 100644 index 0000000000000000000000000000000000000000..7c9f5b22021f26df2f4414d39c04280734e72f7f --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/hair/color/hair_color_misc.txt" @@ -0,0 +1,8 @@ +multicolored hair +colored inner hair +gradient hair +rainbow hair +split-color hair +streaked hair +two-tone hair +colored tips \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/hair/color/hair_color_multiple.txt" 
"b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/hair/color/hair_color_multiple.txt" new file mode 100644 index 0000000000000000000000000000000000000000..7c9f5b22021f26df2f4414d39c04280734e72f7f --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/hair/color/hair_color_multiple.txt" @@ -0,0 +1,8 @@ +multicolored hair +colored inner hair +gradient hair +rainbow hair +split-color hair +streaked hair +two-tone hair +colored tips \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/hair/hair_actions.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/hair/hair_actions.txt" new file mode 100644 index 0000000000000000000000000000000000000000..73613057f2065f8738c42638f10e3a6113f7f484 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/hair/hair_actions.txt" @@ -0,0 +1,16 @@ +grabbing another's hair +hairdressing +cutting hair +brushing hair +heartbreak haircut +hand in own hair +adjusting hair +bunching hair +hair flip +grabbing another's hair +hair tucking +holding hair +ruffling hair +hair twirling +hair sex +whipping hair \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/hair/hair_facial.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/hair/hair_facial.txt" new file mode 100644 index 0000000000000000000000000000000000000000..888aa89ba03880199e6d4156ee79833f50bf4716 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/hair/hair_facial.txt" @@ -0,0 +1,7 @@ +facial hair +beard +bearded girl +goatee +mustache +fake mustache +stubble \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/hair/hair_fantasy.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/hair/hair_fantasy.txt" new file mode 100644 index 0000000000000000000000000000000000000000..3a4ae160709a120fdb7373b80de30e1e751211bd --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/hair/hair_fantasy.txt" @@ -0,0 +1,10 @@ +Liquid Hair +Cloud Hair +Chocolate Hair +Plant Hair +Crystal Hair +Fiery hair +Prehensile hair +helicopter hair +tentacle hair +living hair \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/hair/hair_misc.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/hair/hair_misc.txt" new file mode 100644 index 0000000000000000000000000000000000000000..a3387ed7fd68645a8d7b79e258e218d21ce87fbb --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/hair/hair_misc.txt" @@ -0,0 +1,5 @@ +detached hair +severed hair +floating hair +hair spread out +wet hair \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/hair/objects/hair_objects_care.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/hair/objects/hair_objects_care.txt" new file mode 100644 index 0000000000000000000000000000000000000000..27312470aee9c83f7539161e4c8301f5ef43f744 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/hair/objects/hair_objects_care.txt" @@ -0,0 +1,3 @@ +hair brush +hair dryer +shampoo \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/hair/styles/hair_styles_back.txt" 
"b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/hair/styles/hair_styles_back.txt" new file mode 100644 index 0000000000000000000000000000000000000000..1762d0600b5c2afad64c81acdffe95a7696ad02b --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/hair/styles/hair_styles_back.txt" @@ -0,0 +1 @@ +lone nape hair \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/hair/styles/hair_styles_front.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/hair/styles/hair_styles_front.txt" new file mode 100644 index 0000000000000000000000000000000000000000..65494bc7bfaa24412f981518fa5fbc9e4d1f6077 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/hair/styles/hair_styles_front.txt" @@ -0,0 +1,12 @@ +bangs +asymmetrical bangs +blunt bangs +hair over eyes +hair over one eye +parted bangs +swept bangs +hair between eyes +hair intakes +single hair intake +sidelocks +widow's peak \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/hair/styles/hair_styles_length.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/hair/styles/hair_styles_length.txt" new file mode 100644 index 0000000000000000000000000000000000000000..056021cadb3eb388f05a19549e03cf0badbf91dd --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/hair/styles/hair_styles_length.txt" @@ -0,0 +1,9 @@ +very short hair +short hair +medium hair +long hair +very long hair +absurdly long hair +big hair +bald +bald girl \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/hair/styles/hair_styles_long.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/hair/styles/hair_styles_long.txt" new file mode 100644 index 0000000000000000000000000000000000000000..c96085ab3cbca8949fb1f64e7c15d4159c274224 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/hair/styles/hair_styles_long.txt" @@ -0,0 +1,4 @@ +cornrows +dreadlocks +hime cut +mullet \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/hair/styles/hair_styles_medium.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/hair/styles/hair_styles_medium.txt" new file mode 100644 index 0000000000000000000000000000000000000000..83581b3477408f8dfd0562f36cd502c3342985c8 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/hair/styles/hair_styles_medium.txt" @@ -0,0 +1,2 @@ +bob cut +flipped hair \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/hair/styles/hair_styles_misc.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/hair/styles/hair_styles_misc.txt" new file mode 100644 index 0000000000000000000000000000000000000000..588e921ab5b4329d332fac2922490962559c4880 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/hair/styles/hair_styles_misc.txt" @@ -0,0 +1,5 @@ +alternate hairstyle +hair down +hair up +asymmetrical hair +sidecut \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/hair/styles/hair_styles_over_the_body.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/hair/styles/hair_styles_over_the_body.txt" new file mode 100644 index 
0000000000000000000000000000000000000000..4f6db96a7723de708f2df12f458bb5af59fa2279 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/hair/styles/hair_styles_over_the_body.txt" @@ -0,0 +1,8 @@ +hair bikini +hair censor +hair in mouth +hair over breasts +hair over one breast +hair over crotch +hair over shoulder +hair scarf \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/hair/styles/hair_styles_short.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/hair/styles/hair_styles_short.txt" new file mode 100644 index 0000000000000000000000000000000000000000..fea2f7b5ad5b81d124ae0a6d1bf3cf24b0432906 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/hair/styles/hair_styles_short.txt" @@ -0,0 +1,8 @@ +bowl cut +buzz cut +chonmage +crew cut +flattop +okappa +pixie cut +undercut \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/hair/styles/hair_styles_tall.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/hair/styles/hair_styles_tall.txt" new file mode 100644 index 0000000000000000000000000000000000000000..f837857ff8375bf31db0812ee36e119965a8ff1c --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/hair/styles/hair_styles_tall.txt" @@ -0,0 +1,6 @@ +afro +huge afro +beehive hairdo +pompadour +quiff +shouten pegasus mix mori \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/hair/styles/hair_styles_texture.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/hair/styles/hair_styles_texture.txt" new file mode 100644 index 0000000000000000000000000000000000000000..b10ddee052217679025fccf80115714633531cd0 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/hair/styles/hair_styles_texture.txt" @@ -0,0 +1,9 @@ +curly hair +drill hair +twin drills +hair flaps +messy hair +pointy hair +ringlets +spiked hair +wavy hair \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/hair/styles/hair_styles_tied.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/hair/styles/hair_styles_tied.txt" new file mode 100644 index 0000000000000000000000000000000000000000..b115f2153e7409d29b583e584efe868bdf5ceb0f --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/hair/styles/hair_styles_tied.txt" @@ -0,0 +1,46 @@ +bow-shaped hair +braid +braided bangs +front braid +side braid +french braid +crown braid +single braid +multiple braids +twin braids +low twin braids +tri braids +quad braids +hair bun +braided bun +single hair bun +double bun +cone hair bun +doughnut hair bun +heart hair bun +triple bun +cone hair bun +hair rings +half updo +one side up +two side up +low-braided long hair +low-tied long hair +mizura +multi-tied hair +nihongami +ponytail +folded ponytail +front ponytail +high ponytail +short ponytail +side ponytail +split ponytail +topknot +twintails +low twintails +short twintails +uneven twintails +tri tails +quad tails +quin tails \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/hair/styles/hair_styles_top.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/hair/styles/hair_styles_top.txt" new file mode 100644 index 
0000000000000000000000000000000000000000..b5acecacf89b5e25bcd395b35f8bf6281352a3e4 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/hair/styles/hair_styles_top.txt" @@ -0,0 +1,8 @@ +ahoge +heart ahoge +huge ahoge +antenna hair +comb over +hair pulled back +hair slicked back +mohawk \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/identity-adult.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/identity-adult.txt" new file mode 100644 index 0000000000000000000000000000000000000000..1f53a4abb7582a8e1697c4bdb14849ffedf74fd9 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/identity-adult.txt" @@ -0,0 +1,10 @@ +old woman +old man +older woman +older man +middle-aged woman +middle-aged man +woman +man +young woman +young man diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/identity-young.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/identity-young.txt" new file mode 100644 index 0000000000000000000000000000000000000000..9e1a63318cea80627d93fa9f4934bfdb05abf9d9 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/identity-young.txt" @@ -0,0 +1,7 @@ +teenage girl +teenage boy +girl +boy +young girl +young boy +baby diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/identity.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/identity.txt" new file mode 100644 index 0000000000000000000000000000000000000000..50888fcea6cc2120df57a1d69ab81c43cf274e9e --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/identity.txt" @@ -0,0 +1,17 @@ +old woman +old man +older woman +older man +middle-aged woman +middle-aged man +woman +man +young woman +young man +teenage girl +teenage boy +girl +boy +young girl +young boy +baby diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/new.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/new.txt" new file mode 100644 index 0000000000000000000000000000000000000000..920cfe5dae9692506a0493ee12c29057be673d64 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/new.txt" @@ -0,0 +1,2 @@ +closed mouth +solo \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/skin-color.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/skin-color.txt" new file mode 100644 index 0000000000000000000000000000000000000000..898fad0c7f886883b4384cbdd37060d50e0076a6 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/skin-color.txt" @@ -0,0 +1,26 @@ +pale +light +fair +dark +brown +black +tan +golden +olive +white +alabaster +ivory +cream +beige +taupe +espresso +chocolate +mahogany +caramel +bronze +copper +peach +pink +salmon +ruby +wine diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/skin_color/skin_color_abnormal.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/skin_color/skin_color_abnormal.txt" new file mode 100644 index 0000000000000000000000000000000000000000..088675032c83ccf4bfd5714d736167b652dc2889 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/skin_color/skin_color_abnormal.txt" @@ -0,0 +1,10 @@ +Black skin +Blue skin +Green skin +Grey skin +Orange skin +Pink skin +Purple skin +Red skin +White skin +Yellow skin \ No newline at 
end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/skin_color/skin_color_misc.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/skin_color/skin_color_misc.txt" new file mode 100644 index 0000000000000000000000000000000000000000..ab9ee965045549c89893ced6a9159ceaad983883 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/skin_color/skin_color_misc.txt" @@ -0,0 +1,2 @@ +Colored skin +Shiny skin \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/skin_color/skin_color_normal.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/skin_color/skin_color_normal.txt" new file mode 100644 index 0000000000000000000000000000000000000000..e3ad81a14aec18512af952f555063a52519893dc --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\344\272\272/skin_color/skin_color_normal.txt" @@ -0,0 +1,5 @@ +Dark skin +Pale skin +Tan +Tanlines +Sun tattoo \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/3d-terms.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/3d-terms.txt" new file mode 100644 index 0000000000000000000000000000000000000000..203f0ed6565093036566b7ca07e7924f4a399990 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/3d-terms.txt" @@ -0,0 +1,38 @@ +2D Sprite +Ambient Lighting +Ambient Occlusion +Anti-aliasing +Augmented Reality +Bitmap +Cel shading +Chromatic Abberation +Direct Lighting +Flat shading +FXAA +Global Illumination +Indirect Illumination +Lumen Global Illumination +Manhattan Distance +Microdisplacement +Nanite Geometry +Path Traced +PBR Material +Phong Shading +Physically based rendering (PBR) +Post Processed +Procedural Generation +Procedural Texture +Raster graphics +Ray Traced +Screen space ambient occlusion +Skybox +Terrain rendering +Tesselation +Virtual reality +Volumetric Lighting +Voronoi Manhattan +Voronoi Minkowski +Voxel Engine +Voxel Geometry +Wireframe Model +Wireframe Rendering diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/Subjects.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/Subjects.txt" new file mode 100644 index 0000000000000000000000000000000000000000..b5df30d28460dcdfc609ec32d67ed8a40e78ff41 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/Subjects.txt" @@ -0,0 +1,80 @@ +Knight +Futuristic Knight +Medieval Knight +Cyborg +Damaged Cyborg +Robot +Damaged Robot +Futuristic Soldier +Soldier +Beautiful lady +Lady +Georgeous young woman +Masculine man +Bearded man +Amazon +Sorceress +Witch +Priestess +Barbarian +Demon hunter +Necromancer +Wizard +Alien +Monster +Princess +Animal +Cat +Dog +Fox +Steampunk Robot +Scientist +Ghost +Demon +Technician +Car +Bicycle +Motorbike +Tree +Death +Life +Humanoid +Dragon +Minotaur +Golem +Vampire +Machine +Creature +Shiba Inu +Elemental +Frozen monster +Fire monster +Dinosaur +Skeleton +Bounty hunter +Pirate +Officer +Astronaut +Mech +Battle robot +Cute robot +Hero +Superhero +Hologram +Fashion Model +Panda +Tiger +Bear +Statue +Prince +Marine +Handsome man +Handsome woman +Cyborg cat +Cyborg dog +Ninja +Buff man +Buff woman +Attractive woman +Attractive man +Husky \ No newline at end of file diff --git 
"a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/adj-general.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/adj-general.txt" new file mode 100644 index 0000000000000000000000000000000000000000..a7ecef96b187f01b899fe4d22c3a9951de44ee09 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/adj-general.txt" @@ -0,0 +1,1347 @@ +abandoned +able +absolute +adorable +adventurous +academic +acceptable +acclaimed +accomplished +accurate +aching +acidic +acrobatic +active +actual +adept +admirable +admired +adolescent +adorable +adored +advanced +afraid +affectionate +aged +aggravating +aggressive +agile +agitated +agonizing +agreeable +ajar +alarmed +alarming +alert +alienated +alive +all +altruistic +amazing +ambitious +ample +amused +amusing +anchored +ancient +angelic +angry +anguished +animated +annual +another +antique +anxious +any +apprehensive +appropriate +apt +arctic +arid +aromatic +artistic +ashamed +assured +astonishing +athletic +attached +attentive +attractive +austere +authentic +authorized +automatic +avaricious +average +aware +awesome +awful +awkward +babyish +bad +back +baggy +bare +barren +basic +beautiful +belated +beloved +beneficial +better +best +bewitched +big +big-hearted +biodegradable +bite-sized +bitter +black +black-and-white +bland +blank +blaring +bleak +blind +blissful +blond +blue +blushing +bogus +boiling +bold +bony +boring +bossy +both +bouncy +bountiful +bowed +brave +breakable +brief +bright +brilliant +brisk +broken +bronze +brown +bruised +bubbly +bulky +bumpy +buoyant +burdensome +burly +bustling +busy +buttery +buzzing +calculating +calm +candid +canine +capital +carefree +careful +careless +caring +cautious +cavernous +celebrated +charming +cheap +cheerful +cheery +chief +chilly +chubby +circular +classic +clean +clear +clear-cut +clever +close +closed +cloudy +clueless +clumsy +cluttered +coarse +cold +colorful +colorless +colossal +comfortable +common +compassionate +competent +complete +complex +complicated +composed +concerned +concrete +confused +conscious +considerate +constant +content +conventional +cooked +cool +cooperative +coordinated +corny +corrupt +costly +courageous +courteous +crafty +crazy +creamy +creative +creepy +criminal +crisp +critical +crooked +crowded +cruel +crushing +cuddly +cultivated +cultured +cumbersome +curly +curvy +cute +cylindrical +damaged +damp +dangerous +dapper +daring +darling +dark +dazzling +dead +deadly +deafening +dear +dearest +decent +decimal +decisive +deep +defenseless +defensive +defiant +deficient +definite +definitive +delayed +delectable +delicious +delightful +delirious +demanding +dense +dental +dependable +dependent +descriptive +deserted +detailed +determined +devoted +different +difficult +digital +diligent +dim +dimpled +dimwitted +direct +disastrous +discrete +disfigured +disgusting +disloyal +dismal +distant +downright +dreary +dirty +disguised +dishonest +dismal +distant +distinct +distorted +dizzy +dopey +doting +double +downright +drab +drafty +dramatic +dreary +droopy +dry +dual +dull +dutiful +each +eager +earnest +early +easy +easy-going +ecstatic +edible +educated +elaborate +elastic +elated +elderly +electric +elegant +elementary +elliptical +embarrassed +embellished +eminent +emotional +empty +enchanted +enchanting +energetic +enlightened +enormous +enraged +entire +envious +equal +equatorial +essential +esteemed +ethical +euphoric +even +evergreen +everlasting +every +evil 
+exalted +excellent +exemplary +exhausted +excitable +excited +exciting +exotic +expensive +experienced +expert +extraneous +extroverted +extra-large +extra-small +fabulous +failing +faint +fair +faithful +fake +false +familiar +famous +fancy +fantastic +far +faraway +far-flung +far-off +fast +fat +fatal +fatherly +favorable +favorite +fearful +fearless +feisty +feline +female +feminine +few +fickle +filthy +fine +finished +firm +first +firsthand +fitting +fixed +flaky +flamboyant +flashy +flat +flawed +flawless +flickering +flimsy +flippant +flowery +fluffy +fluid +flustered +focused +fond +foolhardy +foolish +forceful +forked +formal +forsaken +forthright +fortunate +fragrant +frail +frank +frayed +free +French +fresh +frequent +friendly +frightened +frightening +frigid +frilly +frizzy +frivolous +front +frosty +frozen +frugal +fruitful +full +fumbling +functional +funny +fussy +fuzzy +gargantuan +gaseous +general +generous +gentle +genuine +giant +giddy +gigantic +gifted +giving +glamorous +glaring +glass +gleaming +gleeful +glistening +glittering +gloomy +glorious +glossy +glum +golden +good +good-natured +gorgeous +graceful +gracious +grand +grandiose +granular +grateful +grave +gray +great +greedy +green +gregarious +grim +grimy +gripping +grizzled +gross +grotesque +grouchy +grounded +growing +growling +grown +grubby +gruesome +grumpy +guilty +gullible +gummy +hairy +half +handmade +handsome +handy +happy +happy-go-lucky +hard +hard-to-find +harmful +harmless +harmonious +harsh +hasty +hateful +haunting +healthy +heartfelt +hearty +heavenly +heavy +hefty +helpful +helpless +hidden +hideous +high +high-level +hilarious +hoarse +hollow +homely +honest +honorable +honored +hopeful +horrible +hospitable +hot +huge +humble +humiliating +humming +humongous +hungry +hurtful +husky +icky +icy +ideal +idealistic +identical +idle +idiotic +idolized +ignorant +ill +illegal +ill-fated +ill-informed +illiterate +illustrious +imaginary +imaginative +immaculate +immaterial +immediate +immense +impassioned +impeccable +impartial +imperfect +imperturbable +impish +impolite +important +impossible +impractical +impressionable +impressive +improbable +impure +inborn +incomparable +incompatible +incomplete +inconsequential +incredible +indelible +inexperienced +indolent +infamous +infantile +infatuated +inferior +infinite +informal +innocent +insecure +insidious +insignificant +insistent +instructive +insubstantial +intelligent +intent +intentional +interesting +internal +international +intrepid +ironclad +irresponsible +irritating +itchy +jaded +jagged +jam-packed +jaunty +jealous +jittery +joint +jolly +jovial +joyful +joyous +jubilant +judicious +juicy +jumbo +junior +jumpy +juvenile +kaleidoscopic +keen +key +kind +kindhearted +kindly +klutzy +knobby +knotty +knowledgeable +knowing +known +kooky +kosher +lame +lanky +large +last +lasting +late +lavish +lawful +lazy +leading +lean +leafy +left +legal +legitimate +light +lighthearted +likable +likely +limited +limp +limping +linear +lined +liquid +little +live +lively +livid +loathsome +lone +lonely +long +long-term +loose +lopsided +lost +loud +lovable +lovely +loving +low +loyal +lucky +lumbering +luminous +lumpy +lustrous +luxurious +mad +made-up +magnificent +majestic +major +male +mammoth +married +marvelous +masculine +massive +mature +meager +mealy +mean +measly +meaty +medical +mediocre +medium +meek +mellow +melodic +memorable +menacing +merry +messy +metallic +mild +milky +mindless +miniature +minor +minty +miserable +miserly +misguided 
+misty +mixed +modern +modest +moist +monstrous +monthly +monumental +moral +mortified +motherly +motionless +mountainous +muddy +muffled +multicolored +mundane +murky +mushy +musty +muted +mysterious +naive +narrow +nasty +natural +naughty +nautical +near +neat +necessary +needy +negative +neglected +negligible +neighboring +nervous +new +next +nice +nifty +nimble +nippy +nocturnal +noisy +nonstop +normal +notable +noted +noteworthy +novel +noxious +numb +nutritious +nutty +obedient +obese +oblong +oily +oblong +obvious +occasional +odd +oddball +offbeat +offensive +official +old +old-fashioned +only +open +optimal +optimistic +opulent +orange +orderly +organic +ornate +ornery +ordinary +original +other +our +outlying +outgoing +outlandish +outrageous +outstanding +oval +overcooked +overdue +overjoyed +overlooked +palatable +pale +paltry +parallel +parched +partial +passionate +past +pastel +peaceful +peppery +perfect +perfumed +periodic +perky +personal +pertinent +pesky +pessimistic +petty +phony +physical +piercing +pink +pitiful +plain +plaintive +plastic +playful +pleasant +pleased +pleasing +plump +plush +polished +polite +political +pointed +pointless +poised +poor +popular +portly +posh +positive +possible +potable +powerful +powerless +practical +precious +present +prestigious +pretty +precious +previous +pricey +prickly +primary +prime +pristine +private +prize +probable +productive +profitable +profuse +proper +proud +prudent +punctual +pungent +puny +pure +purple +pushy +putrid +puzzled +puzzling +quaint +qualified +quarrelsome +quarterly +queasy +querulous +questionable +quick +quick-witted +quiet +quintessential +quirky +quixotic +quizzical +radiant +ragged +rapid +rare +rash +raw +recent +reckless +rectangular +ready +real +realistic +reasonable +red +reflecting +regal +regular +reliable +relieved +remarkable +remorseful +remote +repentant +required +respectful +responsible +repulsive +revolving +rewarding +rich +rigid +right +ringed +ripe +roasted +robust +rosy +rotating +rotten +rough +round +rowdy +royal +rubbery +rundown +ruddy +rude +runny +rural +rusty +sad +safe +salty +same +sandy +sane +sarcastic +sardonic +satisfied +scaly +scarce +scared +scary +scented +scholarly +scientific +scornful +scratchy +scrawny +second +secondary +second-hand +secret +self-assured +self-reliant +selfish +sentimental +separate +serene +serious +serpentine +several +severe +shabby +shadowy +shady +shallow +shameful +shameless +sharp +shimmering +shiny +shocked +shocking +shoddy +short +short-term +showy +shrill +shy +sick +silent +silky +silly +silver +similar +simple +simplistic +sinful +single +sizzling +skeletal +skinny +sleepy +slight +slim +slimy +slippery +slow +slushy +small +smart +smoggy +smooth +smug +snappy +snarling +sneaky +sniveling +snoopy +sociable +soft +soggy +solid +somber +some +spherical +sophisticated +sore +sorrowful +soulful +soupy +sour +Spanish +sparkling +sparse +specific +spectacular +speedy +spicy +spiffy +spirited +spiteful +splendid +spotless +spotted +spry +square +squeaky +squiggly +stable +staid +stained +stale +standard +starchy +stark +starry +steep +sticky +stiff +stimulating +stingy +stormy +straight +strange +steel +strict +strident +striking +striped +strong +studious +stunning +stupendous +stupid +sturdy +stylish +subdued +submissive +substantial +subtle +suburban +sudden +sugary +sunny +super +superb +superficial +superior +supportive +sure-footed +surprised +suspicious +svelte +sweaty +sweet +sweltering +swift +sympathetic +tall +talkative +tame 
+tan +tangible +tart +tasty +tattered +taut +tedious +teeming +tempting +tender +tense +tepid +terrible +terrific +testy +thankful +that +these +thick +thin +third +thirsty +this +thorough +thorny +those +thoughtful +threadbare +thrifty +thunderous +tidy +tight +timely +tinted +tiny +tired +torn +total +tough +traumatic +treasured +tremendous +tragic +trained +tremendous +triangular +tricky +trifling +trim +trivial +troubled +true +trusting +trustworthy +trusty +truthful +tubby +turbulent +twin +ugly +ultimate +unacceptable +unaware +uncomfortable +uncommon +unconscious +understated +unequaled +uneven +unfinished +unfit +unfolded +unfortunate +unhappy +unhealthy +uniform +unimportant +unique +united +unkempt +unknown +unlawful +unlined +unlucky +unnatural +unpleasant +unrealistic +unripe +unruly +unselfish +unsightly +unsteady +unsung +untidy +untimely +untried +untrue +unused +unusual +unwelcome +unwieldy +unwilling +unwitting +unwritten +upbeat +upright +upset +urban +usable +used +useful +useless +utilized +utter +vacant +vague +vain +valid +valuable +vapid +variable +vast +velvety +venerated +vengeful +verifiable +vibrant +vicious +victorious +vigilant +vigorous +villainous +violet +violent +virtual +virtuous +visible +vital +vivacious +vivid +voluminous +wan +warlike +warm +warmhearted +warped +wary +wasteful +watchful +waterlogged +watery +wavy +wealthy +weak +weary +webbed +wee +weekly +weepy +weighty +weird +welcome +well-documented +well-groomed +well-informed +well-lit +well-made +well-off +well-to-do +well-worn +wet +which +whimsical +whirlwind +whispered +white +whole +whopping +wicked +wide +wide-eyed +wiggly +wild +willing +wilted +winding +windy +winged +wiry +wise +witty +wobbly +woeful +wonderful +wooden +woozy +wordy +worldly +worn +worried +worrisome +worse +worst +worthless +worthwhile +worthy +wrathful +wretched +writhing +wrong +wry +yawning +yearly +yellow +yellowish +young +youthful +yummy +zany +zealous +zesty +zigzag diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/adj-horror.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/adj-horror.txt" new file mode 100644 index 0000000000000000000000000000000000000000..96d9f6b986e44b634524af1a5b46a9fa2bfb7e88 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/adj-horror.txt" @@ -0,0 +1,46 @@ +alarming +dire +direful +dread +dreadful +fearful +fearsome +forbidding +foreboding +formidable +frightening +hair-raising +horrendous +horrible +horrifying +intimidating +redoubtable +shocking +spine-chilling +terrible +terrifying +daunting +demoralizing +disconcerting +discouraging +dismaying +disquieting +distressing +disturbing +perturbing +startling +threatening +troubling +trying +unnerving +atrocious +awful +dreadful +frightful +ghastly +grisly +gruesome +hideous +horrid +monstrosity +repulsive diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/animals.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/animals.txt" new file mode 100644 index 0000000000000000000000000000000000000000..2a461fbba1b9ff5c3dde57a8632ef919512debae --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/animals.txt" @@ -0,0 +1,1878 @@ +Aardvark +Aardwolf +Abyssinian +Achrioptera Manga +Addax +Adelie Penguin +Aesculapian snake +Affenpinscher +Afghan Hound +African Bullfrog +African Bush Elephant +African Civet +African Clawed Frog 
+African Fish Eagle +African Forest Elephant +African Golden Cat +African Grey Parrot +African Jacana +African Palm Civet +African Penguin +African Tree Toad +African Wild Dog +Agama Lizard +Aidi +Ainu +Airedale Terrier +Airedoodle +Akbash +Akita +Akita Shepherd +Alabai +Alaskan Husky +Alaskan Klee Kai +Alaskan Malamute +Alaskan Pollock +Alaskan Shepherd +Albacore Tuna +Albatross +Aldabra Giant Tortoise +Alligator Gar +Allosaurus +Alpaca +Alpine Dachsbracke +Alpine Goat +Alusky +Amargasaurus +Amazon Parrot +Amazon River Dolphin (Pink Dolphin) +Amazon Tree Boa +Ambrosia Beetle +American Alligator +American Alsatian +American Bulldog +American Cocker Spaniel +American Cockroach +American Coonhound +American Dog Tick +American Eskimo Dog +American Foxhound +American Hairless Terrier +American Leopard Hound +American Paddlefish +American Pit Bull Terrier +American Pugabull +American Pygmy Goat +American Robin +American Staffordshire Terrier +American Toad +American Water Spaniel +American Wirehair Traits +Amethystine Python (Scrub Python) +Amur Leopard +Anaconda +Anatolian Shepherd Dog +Anchovies +Angelfish +Angelshark +Anglerfish +Angora Ferret +Angora Goat +Anole Lizard +Anomalocaris +Ant +Antarctic scale worm +Anteater +Antelope +Ape +Appenzeller Dog +Apple Head Chihuahua +Apple Moth +Arabian Cobra +Arafura File Snake +Arapaima +Archaeopteryx +Archaeotherium +Archelon Turtle +Arctic Char +Arctic Fox +Arctic Hare +Arctic Wolf +Arctodus +Arctotherium +Arizona Bark Scorpion +Arizona Black Rattlesnake +Arizona Coral Snake +Armadillo +Armadillo Lizard +Armyworm +Aruba Rattlesnake +Asian Arowana +Asian Cockroach +Asian Elephant +Asian Giant Hornet +Asian Lady Beetle +Asian Palm Civet +Asian Vine Snake +Asiatic Black Bear +Asp +Assassin Bug +Atlantic Salmon +Atlas Moth +Aurochs +Aussiedoodle +Aussiedor +Australian Bulldog +Australian Cattle Dog +Australian Cockroach +Australian Flathead Perch +Australian Gecko +Australian Kelpie Dog +Australian Labradoodle +Australian Mist +Australian Retriever +Australian Shepherd +Australian Terrier +Australopithecus +Avocet +Axolotl +Aye Aye +Babirusa +Baboon +Bactrian Camel +Badger +Baiji +Baird’s Rat Snake +Bald Eagle +Baleen Whale +Balinese +Balkan Lynx +Ball Python +Bamboo Worms +Banana Ball Python +Banana Eel +Banana Spider +Banded Krait +Banded Palm Civet +Banded Water Snake +Bandicoot +Banjo Catfish +Barb +Barbet +Barn Owl +Barn Swallow +Barnacle +Barracuda +Barramundi Fish +Barred Owl +Basenji Dog +Basilisk Lizard +Basilosaurus +Basking Shark +Bassador +Basset Fauve de Bretagne +Basset Hound +Bassetoodle +Bat +Batfish +Bavarian Mountain Hound +Baya +Bea-Tzu +Beabull +Beagador +Beagle +Beagle Shepherd +Beaglier +Beago +Bear +Bearded Collie +Bearded Dragon +Bearded Vulture +Beaski +Beauceron +Beauty rat snake +Beaver +Bed Bugs +Bedlington Terrier +Bee +Beefalo +Beetle +Belgian Shepherd +Belgian Tervuren +Belted Kingfisher +Beluga Sturgeon +Bengal Tiger +Bergamasco +Berger Picard +Bernedoodle +Bernese Mountain Dog +Bernese Shepherd +Betta Fish (Siamese Fighting Fish) +Bichir +Bichon Frise +Bichpoo +Biewer Terrier +Bighorn Sheep +Bilby +Binturong +Bird +Bird Of Paradise +Bird Snake +Birman +Biscuit Beetle +Bismarck Ringed Python +Bison +Black And Tan Coonhound +Black and White Warbler +Black Mamba +Black Marlin +Black Rat Snake +Black Rhinoceros +Black Russian Terrier +Black Wasp +Black Widow Spider +Black Witch Moth +Black-Capped Chickadee +Black-Footed Ferret +Black-headed python +Blacknose Shark +Blackpoll Warbler +Blacktip Reef Shark +Bladefin Basslet 
+Blanket Octopus +Blind Snake +Blister Beetle +Blobfish +Blood Python +Bloodhound +Blue Belly Lizard +Blue Catfish +Blue Eyed Pleco +Blue Gray Gnatcatcher +Blue grosbeak +Blue Iguana +Blue Jay +Blue Lacy Dog +Blue Picardy Spaniel +Blue Racer +Blue Shark +Blue Tit +Blue Whale +Blue-Ringed Octopus +Bluefin Tuna +Bluegill +Bluetick Coonhound +Boas +Bobcat +Bobolink +Boelen’s python +Boggle +Boglen Terrier +Boiga +Bolivian Anaconda +Bolognese Dog +Bombay +Bongo +Bonito Fish +Bonnethead Shark +Bonobo +Booby +Boomslang +Borador +Border Collie +Border Terrier +Bordoodle +Borkie +Bornean Orang-utan +Borneo Elephant +Boston Terrier +Bottlenose Dolphin +Bouvier Des Flandres +Bowfin +Bowhead Whale +Box Tree Moth +Box Turtle +Boxachi Dog Breed Complete Guide +Boxador +Boxer Dog +Boxerdoodle +Boxfish +Boxsky +Boxweiler +Boykin Spaniel +Brachiosaurus +Brahminy Blindsnake +Brazilian Terrier +Bredl’s Python +British Timber +Brittany +Brontosaurus +Bronze Whaler Shark +Brookesia Micra +Brown Bear +Brown Dog Tick +Brown Hyena +Brown Snake +Brown Water Snake +Brown-banded Cockroach +Brug +Brussels Griffon +Budgerigar +Buffalo +Bull Shark +Bull Terrier +Bull Trout +Bulldog +Bullfrog +Bullmastiff +Bullsnake +Bumblebee +Burmese +Burmese Python +Burrowing Frog +Burrowing Owl +Bush Baby +Bush Viper +Bushmaster Snake +Butterfly +Butterfly Fish +Cabbage Moth +Cactus Moth +Cactus Mouse +Cactus Wren +Caecilian +Caiman +Caiman Lizard +Cairn Terrier +California Condor +California Kingsnake +Camel +Camel Cricket +Camel Spider +Canaan Dog +Canada Lynx +Canada Warbler +Canadian Eskimo Dog +Canadian Horse +Cane Corso +Cantil +Cape Lion +Capybara +Caracal +Caribbean Reef Shark +Caribou +Carolina Dog +Carolina Parakeet +Carp +Carpenter Ant +Carpet Python +Carpet Viper +Cascabel +Cashmere Goat +Cassowary +Cat +Cat Snake +Cat-Eyed Snake +Catahoula Leopard +Catalan Sheepdog +Caterpillar +Catfish +Cava Tzu +Cavador +Cavalier King Charles Spaniel +Cavapoo +Cave Bear +Cave Lion +Cecropia Moth +Centipede +Central Ranges Taipan +Ceratosaurus +Cesky Fousek +Cesky Terrier +Chameleon +Chamois +Chartreux +Cheagle +Checkered Garter Snake +Cheetah +Chesapeake Bay Retriever +Chestnut-Sided Warbler +Chickadee +Chicken +Chicken Snake +Chihuahua +Children’s python +Chimaera +Chimpanzee +Chinchilla +Chinese Alligator +Chinese Cobra +Chinese Crested Dog +Chinese Geese +Chinese Paddlefish +Chinese Shar-Pei +Chinook +Chinook Salmon +Chinstrap Penguin +Chipmunk +Chipoo +Chipping Sparrow +Chiweenie +Chorkie +Chow Chow +Chow Shepherd +Christmas Island Red Crab +Cicada +Cichlid +Cinereous Vulture +Cinnamon Ferret +Clothes Moth +Clouded Leopard +Clownfish +Clumber Spaniel +Coachwhip Snake +Coastal Carpet Python +Coastal Taipan +Coati +Cobia Fish +Cobras +Cockalier +Cockapoo +Cockatiel +Cockatoo +Cocker Spaniel +Cockroach +Codfish +Codling Moth +Coelacanth +Collared Peccary +Collett’s Snake +Collie +Colossal Squid +Comet Moth +Common Buzzard +Common Carp +Common European Adder +Common Frog +Common Furniture Beetle +Common House Spider +Common Loon +Common Raven +Common Toad +Compsognathus +Conger Eel +Congo Snake +Cookiecutter Shark +Cooper’s Hawk +Copperhead +Coral +Coral Snake +Corgidor +Corgipoo +Corkie +Corman Shepherd +Corn Rex Cat (Cornish Rex) +Corn Snake +Cory Catfish +Coton de Tulear +Cotton-top Tamarin +Cottonmouth +Cougar +Cow +Coyote +Crab +Crab Spider +Crab-Eating Macaque +Crabeater Seal +Crane +Crappie Fish +Crested Gecko +Crested Penguin +Cricket +Crocodile +Crocodile Monitor +Cross Fox +Cross River Gorilla +Crow +Crucian Carp +Cuban Boa 
+Cuban Cockroach +Cuckoo +Curly Coated Retriever +Cuscus +Cuttlefish +Czechoslovakian Wolfdog +Dachsador +Dachshund +Daeodon +Dalmadoodle +Dalmador +Dalmatian +Danios +Dapple Dachshund +Dark-Eyed Junco +Darwin’s fox +Darwin’s Frog +Death Adder +Death’s Head Cockroach +Deathwatch Beetle +Deer +Deer Head Chihuahua +Deer Tick +Desert Kingsnake +Desert Locust +Desert Rain Frog +Desert Tortoise +Deutsche Bracke +Devils Coach Horse Beetle +Devon Rex +Dhole +Diamond python +Diamondback Moth +Dik-Dik +Dingo +Dinopithecus +Diplodocus +Dire Wolf +Discus +Doberman Pinscher +Dodo +Dog +Dog Tick +Dogo Argentino +Dogue De Bordeaux +Dolphin +Donkey +Dorgi +Dorkie +Dormouse +Double Doodle +Douc +Downy Woodpecker +Doxiepoo +Doxle +Draco Volans Lizard +Dragon Eel +Dragonfish +Dragonfly +Dreadnoughtus +Drever +Drum Fish +Dubia Cockroach +Duck +Dugong +Dumeril’s Boa +Dung Beetle +Dunker +Dunnock +Dusky Dolphin +Dusky Shark +Dwarf Boa +Dwarf Crocodile +Dwarf Hamster +Eagle +Earthworm +Earwig +Eastern Barred Bandicoot +Eastern Bluebird +Eastern Brown Snake +Eastern Chipmunk +Eastern Coral Snake +Eastern Diamondback Rattlesnake +Eastern Fence Lizard +Eastern Glass Lizard +Eastern Gorilla +Eastern Gray Squirrel +Eastern Green Mamba +Eastern Hognose Snake +Eastern Indigo Snake +Eastern Lowland Gorilla +Eastern Phoebe +Eastern Racer +Eastern Rat snake +Eastern Woodrat +Echidna +Eclectus Parrot +Edible Frog +Eel +Egyptian Cobra (Egyptian Asp) +Egyptian Goose +Egyptian Mau +Eland +Elasmosaurus +Elasmotherium +Electric Catfish +Electric Eel +Elegant Tern +Elephant +Elephant Seal +Elephant Shrew +Elk +Emerald Tree Boa +Emerald Tree Monitor +Emperor Penguin +Emperor Tamarin +Emu +English Bulldog +English Cocker Spaniel +English Cream Golden Retriever +English Foxhound +English Pointer +English Setter +English Shepherd +English Springer Spaniel +Entlebucher Mountain Dog +Epagneul Pont Audemer +Equatorial Spitting Cobra +Ermine +Escolar +Eskimo Dog +Eskipoo +Estrela Mountain Dog +Eurasian Beaver +Eurasian Jay +Eurasian Lynx +Eurasian Nuthatch +Eurasian Wolf +European Bee-Eater +European Corn Borer +European Polecat +European Robin +European Starling +European Wildcat +Evening Bat +Eyelash Viper +Fairy-Wren +Falcon +Fallow deer +False Cobra +False coral snake +False Killer Whale +False Water Cobra +False Widow Spider +Fangtooth +Feist +Fennec Fox +Fer-de-lance Snake +Ferret +Ferruginous Hawk +Fiddler Crab +Field Spaniel +Fierce Snake +Fila Brasileiro +Fin Whale +Finnish Spitz +Fire Eel +Fire salamander +Fire-Bellied Toad +Firefly +Fish +Fisher Cat +Fishing Cat +Flamingo +Flat-Coated Retriever +Flea +Florida Gar +Florida Panther +Florida Woods Cockroach +Flounder +Flowerhorn Fish +Fluke Fish (summer flounder) +Fly +Flying Fish +Flying Lemur +Flying Snake +Flying Squirrel +Football Fish +Forest Cobra +Formosan Mountain Dog +Fossa +Fox +Fox Snakes +Fox Squirrel +Fox Terrier +French Bulldog +Frenchton +Frengle +Freshwater Crocodile +Freshwater Eel +Freshwater Jellyfish +Frigatebird +Frilled Lizard +Frilled Shark +Frog +Frogfish +Fruit Bat +Fruit Fly +Fur Seal +Gaboon Viper +Galapagos Penguin +Galapagos Shark +Galapagos Tortoise +Gar +Garden Eel +Gargoyle Gecko +Garter Snake +Gazelle +Gecko +Genet +Gentoo Penguin +Geoffroys Tamarin +Gerberian Shepsky +Gerbil +German Cockroach +German Pinscher +German Shepherd Guide +German Sheppit +German Sheprador +German Shorthaired Pointer +German Spitz +Gharial +Ghost Catfish +Ghost Crab +Giant African Land Snail +Giant Armadillo +Giant Clam +Giant Desert Centipede +Giant Leopard Moth 
+Giant Panda Bear +Giant Salamander +Giant Schnauzer +Giant Schnoodle +Giant Weta +Gibbon +Gigantopithecus +Gila Monster +Giraffe +Glass Frog +Glass Lizard +Glechon +Glen Of Imaal Terrier +Glow Worm +Gnat +Goat +Goberian +Goblin Shark +Goby Fish +Goldador +Golden Dox +Golden Lancehead +Golden Lion Tamarin +Golden Masked Owl +Golden Newfie +Golden Oriole +Golden Pyrenees +Golden Retriever +Golden Saint +Golden Shepherd +Golden Tortoise Beetle +Golden-Crowned Flying Fox +Golden-Crowned Kinglet +Goldendoodle +Goldfish +Goliath Beetle +Goliath Frog +Goliath Tigerfish +Gomphotherium +Goonch Catfish +Goose +Gopher +Gopher Snake +Gopher Tortoise +Gordon Setter +Gorilla +Goshawk +Gouldian Finch +Grass Carp +Grass Snake +Grasshopper +Grasshopper Mouse +Gray Fox +Gray Tree Frog +Great Blue Heron +Great Crested Flycatcher +Great Dane +Great Danoodle +Great Egret +Great Hammerhead Shark +Great Plains Rat Snake +Great Potoo Bird +Great Pyrenees +Great White Shark +Greater Swiss Mountain Dog +Green Anaconda +Green Anole +Green Aphids +Green Bee-Eater +Green Frog +Green Heron +Green Mamba +Green Rat Snake +Green Snake +Green Tree Frog +Green Tree Python +Greenland Dog +Greenland Shark +Grey Heron +Grey Mouse Lemur +Grey Reef Shark +Grey Seal +Greyhound +Griffon Vulture +Griffonshire +Grizzly Bear +Groenendael +Ground Snake +Groundhog (Woodchuck) +Grouper +Grouse +Guinea Fowl +Guinea Pig +Guppy +Habu Snake +Haddock +Hagfish +Hairy Woodpecker +Halibut +Hamburg Chicken +Hammerhead Shark +Hamster +Harbor Seal +Hardhead Catfish +Hare +Harlequin Snake +Harp Seal +Harpy Eagle +Harrier +Harris Hawk +Havanese +Havapoo +Havashire +Hawaiian Crow +Hawaiian Goose +Hawaiian Monk Seal +Hawk +Hawk Moth Caterpillar +Hedgehog +Hellbender +Hercules Beetle +Hercules Moth +Hermit Crab +Heron +Herring +Highland Cattle +Himalayan +Hippopotamus +Hoary Bat +Hobo Spider +Hognose snake +Hokkaido +Honduran White Bat +Honey Badger +Honey Bee +Honey Buzzard +Hooded Oriole +Hooded Seal +Hook-Nosed Sea Snake +Hoopoe +Horgi +Horn Shark +Hornbill +Horned Adder +Horned Frog +Horned Lizard +Horned Viper +Horse +Horsefly +Horseshoe Crab +House Finch +House Sparrow +House wren +Housefly +Howler Monkey +Human +Humboldt Penguin +Humboldt Squid +Hummingbird +Hummingbird Hawk-Moth +Humpback Whale +Huntaway +Huntsman Spider +Huskador +Huskita +Husky +Husky Jack +Huskydoodle +Hyaenodon +Hyena +Ibex +Ibis +Ibizan Hound +Icelandic Sheepdog +Ichthyosaurus +Iguana +Iguanodon +Immortal Jellyfish +Impala +Imperial Moth +Indian Cobra +Indian Elephant +Indian Giant Squirrel +Indian Palm Squirrel +Indian python +Indian Rhinoceros +Indian Star Tortoise +Indianmeal Moth +Indigo Snake +Indochinese Tiger +Indri +Inland Taipan +Insect +Insects +Irish Doodle +Irish Setter +Irish Terrier +Irish Water Spaniel +Irish WolfHound +Italian Greyhound +Ivory-billed woodpecker +Jacana +Jack Russell +Jackabee +Jackal +Jackdaw +Jackrabbit +Jackson’s Chameleon +Jaguar +Jaguarundi Cat +Jamaican Boa +Japanese Beetle +Japanese Chin +Japanese Macaque +Japanese rat snake +Japanese Spitz +Japanese Squirrel +Japanese Terrier +Javan Rhinoceros +Javanese +Jellyfish +Jerboa +Jewel Beetle +John Dory +Jonah Crab +Joro Spider +Jumping Spider +Jungle Carpet Python +Junglefowl +Kai Ken +Kakapo +Kangal +Kangaroo +Kangaroo Rat +Keel-Billed Toucan +Keelback +Keeshond +Kenyan Sand Boa +Kerry Blue Terrier +Kestrel +Keta Salmon +Key Deer +Kiko Goat +Killdeer +Killer Whale +Kinabalu Giant Red Leech +Kinder Goat +King Cobra +King Crab +King Penguin +King Rat Snake +King Shepherd +King Snake +King 
Vulture +Kingfisher +Kinkajou +Kirtland’s snake +Kishu +Kit Fox +Kitefin Shark +Kiwi +Klipspringer +Koala +Kodkod +Koi Fish +Komodo Dragon +Kooikerhondje +Kookaburra +Koolie +Kori Bustard +Krait +Krill +Kudu +Kuvasz +Labahoula +Labmaraner +Labradane +Labradoodle +Labrador Retriever +Labraheeler +Ladybug +Lake Sturgeon +Lakeland Terrier +LaMancha Goat +Lamprey +Lancashire Heeler +Lappet-faced Vulture +Lapponian Herder +Larder Beetle +Lavender Albino Ball Python +Lawnmower Blenny +Lazarus Lizard +Leaf-Tailed Gecko +Leatherback Sea Turtle +Leech +Leedsichthys +Lemming +Lemon Shark +Lemur +Leonberger +Leopard +Leopard Cat +Leopard Frog +Leopard Gecko +Leopard Lizard +Leopard Seal +Leopard Shark +Leopard Tortoise +Lhasa Apso +Lhasapoo +Liger +Linnet +Lion +Lion’s Mane Jellyfish +Lionfish +Little Brown Bat +Little Penguin +Livyatan +Lizard +Lizardfish +Llama +Loach +Lobster +Locust +Lone Star Tick +Long-Eared Owl +Long-Haired Rottweiler +Long-Tailed Tit +Longfin Mako Shark +Longnose Gar +Lorikeet +Lowchen +Lumpfish +Lungfish +Lynx +Lyrebird +Macaque +Macaroni Penguin +Macaw +Machaeroides +Mackenzie Wolf +Madagascar Hissing Cockroach +Madagascar Tree Boa +Magellanic Penguin +Maggot +Magpie +Mahi Mahi (Dolphin Fish) +Maiasaura +Maine Coon +Malayan Civet +Malayan Krait +Malayan Tiger +Malchi +Mallard +Malteagle +Maltese +Maltese Shih Tzu +Maltipoo +Mamba +Mamushi Snake +Man of War Jellyfish +Manatee +Manchester Terrier +Mandarin Rat Snake +Mandrill +Maned Wolf +Mangrove Snake +Manta Ray +Mantella Frog +Marabou Stork +Marble Fox +Marine Iguana +Marine Toad +Markhor +Marmoset +Marmot +Marsh Frog +Masked Angelfish +Masked Palm Civet +Massasauga +Mastador +Mastiff +Mayfly +Meagle +Mealybug +Meerkat +Megalania +Megalodon +Megamouth Shark +Megatherium +Mekong Giant Catfish +Mexican Alligator Lizard +Mexican Black Kingsnake +Mexican Eagle (Northern crested caracara) +Mexican Free-Tailed Bat +Mexican Mole Lizard +Microraptor +Midget Faded Rattlesnake +Miki +Milk Snake +Milkfish +Millipede +Mini Labradoodle +Miniature Bull Terrier +Miniature Husky +Miniature Pinscher +Mink +Minke Whale +Mississippi Kite +Moccasin Snake +Mojarra +Mojave Rattlesnake +Mola mola (Ocean Sunfish) +Mole +Mole Cricket +Mole Snake +Molly +Monarch Butterfly +Mongoose +Mongrel +Monitor Lizard +Monkey +Monkfish +Monocled Cobra +Monte Iberia Eleuth +Moon Jellyfish +Moorhen +Moose +Moray Eel +Morkie +Mosasaurus +Moscow Watchdog +Mosquito +Moth +Mountain Bluebird +Mountain Cur +Mountain Feist +Mountain Gorilla +Mountain Lion +Mourning Dove +Mouse +Mozambique Spitting Cobra +Mud Snake +Mudi +Mudpuppy +Mule +Mulga Snake +Muntjac +Muskox +Muskrat +Mussurana Snake +Muttaburrasaurus +Myna Bird +Naked Mole Rat +Narwhal +Natterjack +Neanderthal +Neapolitan Mastiff +Nebelung +Needlefish +Neon Tetra +Neptune Grouper +Newfoundland +Newfypoo +Newt +Nicobar pigeon +Nigerian Goat +Night Adder +Night Heron +Night Snake +Nightingale +Nile Crocodile +Nilgai +No See Ums +Norfolk Terrier +North American Black Bear +Northern Alligator Lizard +Northern Cardinal +Northern Flicker +Northern Fur Seal +Northern Inuit Dog +Northern Water Snake +Norwegian Buhund +Norwegian Elkhound +Norwegian Forest +Norwegian Lundehund +Norwich Terrier +Nose-horned viper +Nova Scotia Duck Tolling Retriever +Nubian Goat +Nudibranch +Numbat +Nurse Shark +Nyala +Oak Toad +Oceanic Whitetip Shark +Ocelot +Octopus +Oenpelli python +Okapi +Old English Sheepdog +Oleander Hawk Moth +Olive Baboon +Olive python +Olive Sea Snake +Olm +Opah +Opossum +Orang-utan +Orange-Crowned Warbler +Orb 
Weaver +Oriental Cockroach +Ornithomimus +Ortolan Bunting +Oscar Fish +Ostrich +Otter +Otterhound +Oviraptor +Owl +Ox +Oyster +Oyster Toadfish +Pacific Sleeper Shark +Paddlefish +Pademelon +Painted Turtle +Pangolin +Panther +Papillon +Paradise Flying Snake +Parakeet +Parasaurolophus +Parrot +Parrot Snake +Parrotfish +Parrotlet +Parson Russell Terrier +Patagotitan +Patas Monkey +Patterdale Terrier +Pea Puffer +Peacock +Peacock Butterfly +Peacock Spider +Peagle +Peekapoo +Pekingese +Pelican +Pembroke Welsh Corgi +Penguin +Pennsylvania Wood Cockroach +Peppered Moth +Peppermint Angelfish +Pere Davids Deer +Peregrine Falcon +Peringuey’s Adder +Perro De Presa Canario +Persian +Peruvian Inca Orchid +Pesquet’s Parrot (Dracula Parrot) +Petit Basset Griffon Vendéen +Petite Goldendoodle +Pheasant +Philippine Cobra +Phorusrhacos +Pied Ball Python +Pied Tamarin +Pig +Pigeon +Pika +Pike Fish +Pileated Woodpecker +Pine Marten +Pine Snake +Pine Snake +Pink Fairy Armadillo +Pipe Snake +Pipefish +Piranha +Pit Bull +Pit Viper +Pitador +Pitsky +Plains Hognose Snake +Platinum Arowana +Platybelodon +Platypus +Pocket Beagle +Pocket Pitbull +Pointer +Poison Dart Frog +Polar Bear +Polecat +Polish Lowland Sheepdog +Polka Dot Stingray +Polyphemus moth +Pomapoo +Pomchi +Pomeagle +Pomeranian +Pomsky +Pond Skater +Poochon +Poodle +Poogle +Pool Frog +Porbeagle Shark +Porcupine +Porpoise +Possum +Potoo +Potoroo +Prairie Dog +Prairie Rattlesnake +Prawn +Praying Mantis +Proboscis Monkey +Procoptodon +Pronghorn +Psittacosaurus +Pteranodon +Pterodactyl +Pudelpointer +Puff Adder +Pufferfish +Puffin +Pug +Pugapoo +Puggle +Pugshire +Puma +Pumi +Purple Emperor Butterfly +Purple Finch +Puss Moth +Pygmy Hippopotamus +Pygmy Marmoset (Finger Monkey) +Pygmy python +Pygmy Shark +Pygora Goat +Pyjama Shark +Pyrador +Pyredoodle +Python +Quagga +Quail +Queen snake +Quetzal +Quokka +Quoll +Rabbit +Raccoon +Raccoon Dog +Racer Snake +Radiated Tortoise +Ragamuffin +Ragdoll +Raggle +Rainbow Boa +Rainbow Shark +Rat +Rat Snakes +Rat Terrier +Rattlesnake +Red Diamondback Rattlesnake +Red Finch +Red Fox +Red Kite +Red Knee Tarantula +Red Panda +Red Racer Snake +Red Spitting Cobra +Red Squirrel +Red Tail Boa (common boa) +Red Wolf +Red-Bellied Black Snake +Red-Billed Quelea Bird +Red-Eared Slider +Red-Eyed Tree Frog +Red-Footed Tortoise +Red-handed Tamarin +Red-Headed Vulture +Red-Lipped Batfish +Red-Shouldered Hawk +Red-winged blackbird +Redback Spider +Redbone Coonhound +Redtail Catfish +Reef Shark +Reindeer +Repenomamus +Reticulated python +Rhino Viper +Rhinoceros +Rhombic Egg-Eater Snake +Ribbon Eel +Ribbon Snake +Ring-billed Gull +Ringed Kingfisher +Rinkhals Snake +River Otter +River Turtle +Robin +Rock Crab +Rock Hyrax +Rock Python +Rockfish +Rockhopper Penguin +Rodents +Roe Deer +Rooster +Rose-breasted Grosbeak +Roseate Spoonbill +Rosy Boa +Rottsky +Rottweiler +Rough Green Snake +Royal Penguin +Rubber Boa +Ruby-Throated Hummingbird +Ruddy Turnstone +Russel’s Viper +Russell Terrier +Russian Bear Dog +Russian Blue +Russian Tortoise +Saanen Goat +Saarloos Wolfdog +Saber-Toothed Tiger +Sable +Sable Black German Shepherd +Sable Ferret +Saiga +Saint Berdoodle +Saint Bernard +Saint Shepherd +Salamander +Salmon +Salmon Shark +Saluki +Sambar +Samoyed +San Francisco Garter Snake +Sand Crab +Sand Lizard +Sand Tiger Shark +Sand Viper +Sandhill Crane +Saola +Sardines +Satanic leaf-tailed gecko +Sauropoda +Savanna Goat +Savannah Sparrow +Savu Python +Saw-scaled viper +Sawfish +Scarab Beetle +Scarlet Kingsnake +Scarlet Macaw +Schipperke +Schneagle 
+Schnoodle +Scimitar-horned Oryx +Scorpion +Scorpion Fish +Scottish Terrier +Scrotum Frog +Sculpin +Sea Anemone +Sea Dragon +Sea Eagle +Sea Lion +Sea Otter +Sea Roach +Sea Slug +Sea Snake +Sea Squirt +Sea Turtle +Sea Urchin +Seagull +Seahorse +Seal +Sealyham Terrier +Sei Whale +Senegal Parrot +Serval +Shark +Sharp-Shinned Hawk +Sharp-Tailed Snake +Sheep +Sheepadoodle +Shepadoodle +Shepkita +Shepweiler +Shiba Inu +Shih Poo +Shih Tzu +Shoebill Stork +Shollie +Short-Faced Bear +Shortfin Mako Shark +Shrew +Shrimp +Siamese +Siberian +Siberian Husky +Siberian Ibex +Siberian Retriever +Siberian Tiger +Siberpoo +Sidewinder +Silkie Chicken +Silky Shark +Silky Terrier +Silver Dollar +Silver Labrador +Sinosauropteryx +Sixgill shark +Skate Fish +Skink Lizard +Skipjack Tuna +Skunk +Skye Terrier +Sleeper Shark +Sloth +Slow Worm +Slug +Smokybrown Cockroach +Smooth Fox Terrier +Smooth Green Snake +Smooth Snake +Snail +Snake +Snapping Turtle +Snook Fish +Snorkie +Snouted Cobra +Snow Crab +Snow Leopard +Snowberry Clearwing Moth +Snowshoe +Snowshoe Hare +Snowy Owl +Sockeye Salmon +Somali +Song Sparrow +Song Thrush +South China Tiger +Southern Black Racer +Southern Hognose Snake +Southern Pacific Rattlesnake +Spadefoot Toad +Spanador +Spanish Goat +Spanish Mastiff +Sparrow +Sparrowhawk +Speckled Kingsnake +Spectacled Bear +Sperm Whale +Sphynx Traits: What to Know Before You Buy +Spider +Spider Ball Python +Spider Monkey +Spider Wasp +Spider-Tailed Horned Viper +Spinner Shark +Spinone Italiano +Spinosaurus +Spiny bush viper +Spiny Dogfish +Spiny Hill Turtle +Spitting Cobra +Spixs Macaw +Sponge +Spotted Gar +Spotted Lanternfly +Spotted python +Springador +Springbok +Springerdoodle +Squash Beetle +Squid +Squirrel +Squirrel Monkey +Squirrelfish +Sri Lankan Elephant +Stabyhoun +Staffordshire Bull Terrier +Stag Beetle +Standard Schnauzer +Star-nosed mole +Starfish +Stargazer Fish +Steelhead Salmon +Steller’s Sea Cow +Stick Insect +Stiletto Snake +Stingray +Stoat +Stone Crab +Stork +Strawberry Hermit Crab +Striped Hyena +Striped Rocket Frog +Sturgeon +Styracosaurus +Suchomimus +Sucker Fish +Sugar Glider +Sulcata Tortoise +Sumatran Elephant +Sumatran Orang-utan +Sumatran Rhinoceros +Sumatran Tiger +Sun Bear +Sunbeam Snake +Surgeonfish +Swai Fish +Swan +Swedish Vallhund +Syrian Hamster +Taco Terrier +Taipan +Takin +Tamaskan +Tang +Tapanuli Orang-utan +Tapir +Tarantula Hawk +Tarbosaurus +Tarpon +Tarsier +Tasmanian Devil +Tasmanian Tiger +Tawny Owl +Teddy Bear Hamster +Teddy Roosevelt Terrier +Telescope Fish +Tennessee Walking Horse +Tentacled Snake +Tenterfield Terrier +Termite +Terrier +Terror Bird +Tetra +Texas Blind Snake +Texas Coral Snake +Texas Garter Snake +Texas Heeler +Texas Indigo Snake +Texas Night Snake +Texas Rat Snake +Texas Spiny Lizard +Thai Ridgeback +Thorny Devil +Thresher Shark +Thrush +Thylacoleo +Tibetan Fox +Tibetan Mastiff +Tibetan Spaniel +Tibetan Terrier +Tick +Tiffany +Tiger +Tiger Beetle +Tiger Moth +Tiger Rattlesnake +Tiger Salamander +Tiger Shark +Timber Rattlesnake (Canebrake Rattlesnake) +Timor python +Tire Track Eel +Titanoboa +Toadfish +Tomato Hornworm +Torkie +Tortoise +Toucan +Toy Fox Terrier +Toy Poodle +Transylvanian Hound +Tree Frog +Tree Kangaroo +Tree Snake +Tree swallow +Tree Viper (Bamboo Viper) +Treecreeper +Treeing Tennessee Brindle +Treeing Walker Coonhound +Troodon +Tropicbird +Trout +Tsetse Fly +Tuatara +Tufted Titmouse +Tuna +Turkey +Turkish Angora +Turtles +Twig Snake +Tyrannosaurus Rex +Uakari +Uguisu +Uinta Ground Squirrel +Uintatherium +Umbrellabird +Unau (Linnaeus’s 
Two-Toed Sloth) +Ural owl +Urutu Snake +Utonagan +Vampire Bat +Vampire Crab +Vampire Squid +Vaquita +Velociraptor +Venus Flytrap +Vermilion Flycatcher +Vervet Monkey +Vicuña +Vine Snake +Vinegaroon +Viper +Viper Boa +Viper shark (dogfish) +Viperfish +Virgin Islands Dwarf Gecko +Vizsla +Volpino Italiano +Vulture +Wahoo Fish +Waimanu +Walking Catfish +Wallaby +Walleye Fish +Walrus +Wandering Albatross +Warthog +Wasp +Water Buffalo +Water Dragon +Water Vole +Wax Moth +Weasel +Weaver Bird +Weimaraner +Weimardoodle +Wels Catfish +Welsh Corgi +Welsh Terrier +West Highland Terrier +Western Blind Snake +Western Diamondback Rattlesnake +Western Gorilla +Western Green Mamba +Western Hognose Snake +Western Lowland Gorilla +Western Rat Snake +Westiepoo +Whale Shark +Wheaten Terrier +Whimbrel +Whinchat +Whippet +Whiptail Lizard +White Butterfly +White Ferret / Albino Ferrets +White Rhinoceros +White Shark +White Tiger +White-Eyed Vireo +White-Faced Capuchin +White-tail deer +White-Tailed Eagle +Whiting +Whoodle +Whooping Crane +Wild Boar +Wildebeest +Willow Warbler +Winter Moth +Wire Fox Terrier +Wirehaired Pointing Griffon +Wolf +Wolf Eel +Wolf Snake +Wolf Spider +Wolffish +Wolverine +Woma python +Wombat +Wood Bison +Wood Frog +Wood Tick +Wood Turtle +Woodlouse +Woodlouse Spider +Woodpecker +Woodrat +Woolly Mammoth +Woolly Monkey +Woolly Rhinoceros +Worm +Worm Snake +Wrasse +Writing Spider +Wrought Iron Butterflyfish +Wyoming Toad +X-Ray Tetra +Xerus +Xoloitzcuintli +Yak +Yakutian Laika +Yarara +Yellow Anaconda +Yellow Cobra +Yellow Sac Spider +Yellow Spotted Lizard +Yellow Tang +Yellow-Bellied Sea Snake +Yellow-Eyed Penguin +Yellowfin Tuna +Yeti Crab +Yoranian +Yorkie Bichon +Yorkie-poo +Yorkshire Terrier +Zebra +Zebra Finch +Zebra Mussels +Zebra Pleco +Zebra Shark +Zebra Snake +Zebra Spitting Cobra +Zebu +Zonkey +Zorse +Zuchon diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/animals/animals_categories.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/animals/animals_categories.txt" new file mode 100644 index 0000000000000000000000000000000000000000..bb9afa7659fcbc5bbbf1f92243d8531f7b3baeb4 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/animals/animals_categories.txt" @@ -0,0 +1,9 @@ +Bird +Bug +Cat +Crustacean +Dog +Fish +Bug +Reptile +Whale \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/animals/types/animals_types_aquatic_cambrian.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/animals/types/animals_types_aquatic_cambrian.txt" new file mode 100644 index 0000000000000000000000000000000000000000..b8875f46490dbf052237ec8a4eb37ba8bd0b4f95 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/animals/types/animals_types_aquatic_cambrian.txt" @@ -0,0 +1,5 @@ +Anomalocaris +Hallucigenia +Opabinia +Pambdelurion +Xidazoon \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/animals/types/animals_types_aquatic_crustacean.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/animals/types/animals_types_aquatic_crustacean.txt" new file mode 100644 index 0000000000000000000000000000000000000000..6ce40ce399f8137c38cb3f46165aa4aee8a727d7 --- /dev/null +++ 
"b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/animals/types/animals_types_aquatic_crustacean.txt" @@ -0,0 +1,8 @@ +Crab +Crayfish +Hermit crab +Horseshoe crab +Isopod +Cymothoa exigua +Lobster +Shrimp \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/animals/types/animals_types_aquatic_fish.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/animals/types/animals_types_aquatic_fish.txt" new file mode 100644 index 0000000000000000000000000000000000000000..60cdb7d17a6667f722a872ed72d7f6376621b165 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/animals/types/animals_types_aquatic_fish.txt" @@ -0,0 +1,44 @@ +Alligator gar +Angelfish +Anglerfish +Arapaima +Arowana +Bichir +Butterflyfish +Catfish +Carp +Goldfish +Koi +Clownfish +Coelacanth +Dunkleosteus +Eel +Leopard eel +Moray eel +Flying fish +Lamprey +Leafy Seadragon +Manta ray +Needlefish +Oarfish +Oilfish +Puffer fish +Sailfish +Saury +Sawfish +Seahorse +Shark +Edestus +Goblin shark +Great white shark +Hammerhead shark +Helicoprion +Saw shark +Whale shark +Siamese fighting fish +Swordfish +Sunfish +Surgeonfish +Regal blue tang +Yellow tang +Tuna \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/animals/types/animals_types_aquatic_invertebrate.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/animals/types/animals_types_aquatic_invertebrate.txt" new file mode 100644 index 0000000000000000000000000000000000000000..ba7673487b2593f8bf404e5c9f373ea06a3af4cb --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/animals/types/animals_types_aquatic_invertebrate.txt" @@ -0,0 +1,12 @@ +Abalone +Clam +Coral +Cuttlefish +Octopus +Oyster +Sea anemone +Sea cucumber +Sea slug +Sea urchin +Squid +Starfish \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/animals/types/animals_types_aquatic_whale.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/animals/types/animals_types_aquatic_whale.txt" new file mode 100644 index 0000000000000000000000000000000000000000..4308edb29b0a2cd564379533b4408bf2ad809b23 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/animals/types/animals_types_aquatic_whale.txt" @@ -0,0 +1,6 @@ +Beluga whale +Blue whale +Dolphin +Humpback whale +Narwhal +Orca \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/animals/types/animals_types_cat.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/animals/types/animals_types_cat.txt" new file mode 100644 index 0000000000000000000000000000000000000000..4e291c60edddc4e48a183994739012d209080471 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/animals/types/animals_types_cat.txt" @@ -0,0 +1,7 @@ +Black cat +Kitten +Cheetah +Leopard +Lion +Panther +Tiger \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/animals/types/animals_types_dog.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/animals/types/animals_types_dog.txt" new file mode 100644 index 
0000000000000000000000000000000000000000..e671960aecf2cc9efd6617629644fa05e90390cd --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/animals/types/animals_types_dog.txt" @@ -0,0 +1,2 @@ +Puppy +Wolf \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/animals/types/animals_types_insects.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/animals/types/animals_types_insects.txt" new file mode 100644 index 0000000000000000000000000000000000000000..793a23ead5afc1b6b26d5764ce9c0aa72e757221 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/animals/types/animals_types_insects.txt" @@ -0,0 +1,29 @@ +Ant +Bee +Beetle +Butterfly +Caterpillar +Centipede +Cicada +Cockroach +Cricket +Cave cricket +Dragonfly +Firefly +Fireflies +Fly +Grasshopper +Ladybug +Leucochloridium paradoxum +Millipede +Mosquito +Moth +Pillbug +Wasp +Arachnid +Spider +Scorpion +Camel spider +Amblypygi +Larva +Maggot \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/animals/types/animals_types_large.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/animals/types/animals_types_large.txt" new file mode 100644 index 0000000000000000000000000000000000000000..d316314707d43fc97dfcbdafcd60d4ee609fae4d --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/animals/types/animals_types_large.txt" @@ -0,0 +1,12 @@ +Alpaca +Bear +Bull +Cow +Deer +Elephant +Giraffe +Gorilla +Horse +Kangaroo +Mammoth +Panda \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/animals/types/animals_types_medium.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/animals/types/animals_types_medium.txt" new file mode 100644 index 0000000000000000000000000000000000000000..7170d13e89648fc2a207b96843d7871ec64937e1 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/animals/types/animals_types_medium.txt" @@ -0,0 +1,14 @@ +Boar +Echidna +Fox +Red fox +Goat +Hedgehog +Monkey +Platypus +Pig +Piglet +Rabbit +Angora rabbit +Sheep +Lamb \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/animals/types/animals_types_reptile_and_amphibian.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/animals/types/animals_types_reptile_and_amphibian.txt" new file mode 100644 index 0000000000000000000000000000000000000000..238e3702a37e7b34aec33a2857360923ce604327 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/animals/types/animals_types_reptile_and_amphibian.txt" @@ -0,0 +1,11 @@ +Crocodile +Frog +Tadpole +Iguana +Lizard +Frill-necked lizard +Snake +Turtle +Salamander +Axolotl +Hellbender \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/animals/types/animals_types_rodent.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/animals/types/animals_types_rodent.txt" new file mode 100644 index 0000000000000000000000000000000000000000..31fdc0f34a117b820df36a0c47253bdffb99ae4c --- /dev/null +++ 
"b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/animals/types/animals_types_rodent.txt" @@ -0,0 +1,10 @@ +Beaver +Capybara +Chipmunk +Ferret +Guinea pig +Hamster +Mouse +Naked mole rat +Squirrel +Weasel \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/artist-botanical.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/artist-botanical.txt" new file mode 100644 index 0000000000000000000000000000000000000000..82214cbfca3815df1e77852418d5b36a1fc16902 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/artist-botanical.txt" @@ -0,0 +1,30 @@ +Marianne North +Margaret Mee +Elizabeth Blackwell +Anne Pratt +Pierre Jean Francois Turpin +Georg Dionysius Ehret +Sydney Parkinson +Margaret Flockton +Wendy Hollender +Ferdinand Bauer +Olivia Marie Braida-Chiusano +Clarissa Munger Badger +Jenny Brasier +Giovanna Garzoni +Walter Hood Fitch +Alexander Marshal +Lise Cloquet +Pancrace Bessa +Claude Aubriet +Susannah Blaxill +Arthur Harry Church +Gillian Condy +Emilie von Buttner +Johanna Helena Herolt +Elsie Louise Shaw +Royal Charles Steadman +Mary Daisy Arnold +Margaret Warriner Buck +Ellen Isham Schutt +William Jackson Hooker diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/artist-surreal.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/artist-surreal.txt" new file mode 100644 index 0000000000000000000000000000000000000000..bbd5e9fe70ddb15d9305f7292cd317cc9277fc86 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/artist-surreal.txt" @@ -0,0 +1,68 @@ +Art et Liberte +Aase Berg +Yutaka Bito +Bodo +Victor Brauner +Emmy Bridgwater +Robert Campbell +Giorgio de Chirico +William Copley +Joao Artur da Silva +Elba Damast +Paul Delvaux +Karina Eibatova +Harold Elliott +Max Ernst +Ewa Juszkiewicz +Agustin Fernandez +Ichiro Fukuzawa +Abdel Hadi Al Gazzar +Emma Geary +Alberto Giacometti +Arshile Gorky +Jane Graverol +Henriette Grindat +Moosa Al Halyan +Irene Hamoir +Gilberto Hernandez Ortega +Kati Horna +Tetsuya Ishida +Alpo Jaakola +Frida Kahlo +Ida Kar +Yuki Katsura +Hiroshi Katsuragawa +Rita Kernn-Larsen +Greta Knutson +Harue Koga +Wifredo Lam +Rene Magritte +Joyce Mansour +Roberto Matta +Lise Mayer +Emila Medkova +Ahmed Morsi +Polly Nor +Meret Oppenheim +Olga Orozco +Kiyoji Otsuji +Wolfgang Paalen +Ihor Podolchak +Joyce Reopel +Aminollah Rezaei +Guia Risari +Erna Rosenstein +Pablo Runyan +Xanti Schawinsky +Eva Svankmajerova +Alina Szapocznikow +Enrique Tabara +Dorothea Tanning +Herve Telemaque +Blanca Varela +Rinat Voligamsi +Kikuji Yamashita +Ramses Younan +Haifa Zangana +Nele Zirnite +Unica Zurn diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/artist.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/artist.txt" new file mode 100644 index 0000000000000000000000000000000000000000..00056b07c8a7ef012f04e976e044c644bd6442bc --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/artist.txt" @@ -0,0 +1,2265 @@ +Serge Birault +A.J.Casson +Aaron Douglas +Aaron Horkey +Aaron Jasinski +Aaron Siskind +Abbott Fuller Graves +Abbott Handerson Thayer +Abdel Hadi Al Gazzar +Abed Abdi +Abigail Larson +Abraham Mintchine +Abraham Pether +Abram Efimovich Arkhipov +Adam Elsheimer +Adam Hughes +Adam Martinakis +Adam Paquette +Adi 
Granov +Adolf Hirémy Hirschl +Adolf Hirémy-Hirschl +Adolph Gottlieb +Adolph Menzel +Adonna Khare +Adriaen van Ostade +Adriaen van Outrecht +Adrian Donoghue +Adrian Ghenie +Adrian Paul Allinson +Adrian Smith +Adrian Tomine +Adrianus Eversen +Afarin Sajedi +Affandi +Aggi Erguna +Agnes Cecile +Agnes Lawrence Pelton +Agnes Martin +Agostino Arrivabene +Agostino Tassi +Ai Weiwei +Ai Yazawa +Akihiko Yoshida +Akira Toriyama +Akos Major +Akseli Gallen-Kallela +Al Capp +Al Feldstein +Al Williamson +Alain Laboile +Alan Bean +Alan Davis +Alan Kenny +Alan Lee +Alan Moore +Alan Parry +Alan Schaller +Alasdair McLellan +Alastair Magnaldo +Alayna Lemmer +Albert Benois +Albert Bierstadt +Albert Bloch +Albert Dubois-Pillet +Albert Eckhout +Albert Edelfelt +Albert Gleizes +Albert Goodwin +Albert Joseph Moore +Albert Koetsier +Albert Kotin +Albert Lynch +Albert Marquet +Albert Pinkham Ryder +Albert Robida +Albert Servaes +Albert Tucker +Albert Watson +Alberto Biasi +Alberto Burri +Alberto Giacometti +Alberto Magnelli +Alberto Seveso +Alberto Sughi +Alberto Vargas +Albrecht Anker +Albrecht Durer +Alec Soth +Alejandro Burdisio +Alejandro Jodorowsky +Aleksey Savrasov +Aleksi Briclot +Alena Aenami +Alessandro Allori +Alessandro Barbucci +Alessandro Gottardo +Alessio Albi +Alex Alemany +Alex Andreev +Alex Colville +Alex Figini +Alex Garant +Alex Grey +Alex Gross +Alex Hirsch +Alex Horley +Alex Howitt +Alex Katz +Alex Maleev +Alex Petruk +Alex Prager +Alex Ross +Alex Russell Flint +Alex Schomburg +Alex Timmermans +Alex Toth +Alexander Archipenko +Alexander Bogen +Alexander Fedosav +Alexander Jansson +Alexander Kanoldt +Alexander McQueen +Alexander Millar +Alexander Milne Calder +Alexandr Averin +Alexandre Antigna +Alexandre Benois +Alexandre Cabanel +Alexandre Calame +Alexandre Jacovleff +Alexandre-Evariste Fragonard +Alexandre-Évariste Fragonard +Alexei Harlamoff +Alexej von Jawlensky +Alexey Kurbatov +Alexis Gritchenko +Alfred Augustus Glendening +Alfred Cheney Johnston +Alfred Eisenstaedt +Alfred Guillou +Alfred Heber Hutty +Alfred Henry Maurer +Alfred Kelsner +Alfred Kubin +Alfred Munnings +Alfred Parsons +Alfred Sisley +Alfred Stevens +Alfredo Jaar +Algernon Blackwood +Alice Bailly +Alice Neel +Alice Pasquini +Alice Rahon +Alison Bechdel +Aliza Razell +Allen Williams +Allie Brosh +Allison Bechdel +Alma Thomas +Alois Arnegger +Alphonse Mucha +Alphonse Osbert +Alpo Jaakola +Alson Skinner Clark +Alvar Aalto +Alvaro Siza +Alvin Langdon Coburn +Alyssa Monks +Amadeo De Souza-Cardoso +Amadou Opa Bathily +Amanda Clark +Amanda Sage +Amandine Van Ray +Ambrosius Benson +Ambrosius Bosschaert +Amédée Guillemin +Amedee Ozenfant +Amedeo Modigliani +Amiet Cuno +Aminollah Rezaei +Amir Zand +Amy Earles +Amy Judd +Amy Sillman +Anato Finnstark +Anatoly Metlan +Anders Zorn +Ando Fuchs +Andre De Dienes +Andre Derain +Andre Kertesz +Andre Kohn +Andre Leblanc +André Lhote +André Masson +Andre Norton +Andre-Charles Boulle +Andrea Kowch +Andrea Mantegna +Andreas Achenbach +Andreas Franke +Andreas Gursky +Andreas Levers +Andreas Rocha +Andreas Vesalius +Andrei Arinouchkine +Andréi Arinouchkine +Andrei Markin +Andrew Atroshenko +Andrew Ferez +Andrew Hem +Andrew Macara +Andrew Robinson +Andrew Wyeth +Andrey Remnev +Android Jones +Andrzej Sykut +Andy Fairhurst +Andy Goldsworthy +Andy Kehoe +Andy Warhol +Angela Barrett +Angela Sung +Angus McKie +Anish Kapoor +Anita Malfatti +Anja Millen +Anja Percival +Anka Zhuravleva +Ann Stookey +Anna Ancher +Anna and Elena Balbusso +Anna Bocek +Anna Dittmann +Anna Razumovskaya +Anne Bachelier +Anne 
Brigman +Anne Dewailly +Anne Mccaffrey +Anne Packard +Anne Rothenstein +Anne Stokes +Anne Sudworth +Anne Truitt +Anne-Louis Girodet +Anni Albers +Annibale Carracci +Annick Bouvattier +Annie Soudain +Annie Swynnerton +Ansel Adams +Anselm Kiefer +Antanas Sutkus +Anthony Gerace +Anthony Thieme +Anthony van Dyck +Anto Carte +Antoine Blanchard +Antoine Verney-Carron +Anton Corbijn +Anton Domenico Gabbiani +Anton Fadeev +Anton Mauve +Anton Otto Fischer +Anton Pieck +Anton Raphael Mengs +Anton Semenov +Antonello da Messina +Antoni Gaudi +Antonio Canova +Antonio Donghi +Antonio J. Manzanedo +Antonio Mancini +Antonio Mora +Antonio Roybal +Antony Gormley +Apollinary Vasnetsov +Apollonia Saintclair +Aquirax Uno +Archibald Thorburn +Aries Moross +Arik Brauer +Aristarkh Lentulov +Aristide Maillol +Arkhyp Kuindzhi +Arman Manookian +Armand Guillaumin +Armand Point +Arnold Bocklin +Arnold Böcklin +Arnold Schoenberg +Aron Demetz +Aron Wiesenfeld +Arshile Gorky +Art Fitzpatrick +Art Frahm +Art Spiegelman +Artem Chebokha +Artemisia Gentileschi +Artgerm +Arthur Adams +Arthur Boyd +Arthur Dove +Arthur Garfield Dove +Arthur Hacker +Arthur Hughes +Arthur Lismer +Arthur Rackham +Arthur Radebaugh +Arthur Sarnoff +Arthur Stanley Wilkinson +Arthur Streeton +Arthur Tress +Arthur Wardle +Artur Bordalo +Arturo Souto +Artus Scheiner +Ary Scheffer +Asaf Hanuka +Asger Jorn +Asher Brown Durand +Ashley Willerton +Ashley Wood +Atay Ghailan +Atelier Olschinsky +Atey Ghailan +Aubrey Beardsley +Audrey Kawasaki +August Friedrich Schenck +August Macke +August Sander +August von Pettenkofen +Auguste Herbin +Auguste Mambour +Auguste Toulmouche +Augustus Edwin Mulready +Augustus Jansson +Augustus John +Austin Osman Spare +Axel Törneman +Ayami Kojima +Ayan Nag +Aykut Aydogdu +Bakemono Zukushi +Balthus +Banksy +Barbara Hepworth +Barbara Kruger +Barbara Stauffacher Solomon +Barbara Takenaga +Barclay Shaw +Barkley L. Hendricks +Barnett Newman +Barry McGee +Barry Windsor Smith +Bart Sears +Barthel Bruyn the Elder +Barthel Bruyn the Younger +Bartolome Esteban Murillo +Basil Gogos +Bastien Lecouffe-Deharme +Bayard Wu +Beatrix Potter +Beauford Delaney +Becky Cloonan +Beeple +Bella Kotak +Ben Aronson +Ben Goossens +Ben Hatke +Ben Nicholson +Ben Quilty +Ben Shahn +Ben Templesmith +Ben Wooten +Benedetto Caliari +Benedick Bana +Benoit B. Mandelbrot +Berend Strik +Bernard Aubertin +Bernard Buffet +Bernardo Bellotto +Bernardo Strozzi +Berndnaut Smilde +Bernie Wrightson +Bert Hardy +Bert Stern +Berthe Morisot +Bertil Nilsson +Bess Hamiti +Beth Conklin +Bettina Rheims +Bhupen Khakhar +Bijou Karman +Bill Brandt +Bill Brauer +Bill Carman +Bill Durgin +Bill Gekas +Bill Henson +Bill Jacklin +Bill Medcalf +Bill Sienkiewicz +Bill Traylor +Bill Viola +Bill Ward +Bill Watterson +Billy Childish +Bjarke Ingels +Blek Le Rat +Bo Bartlett +Bo Chen +Bob Byerley +Bob Eggleton +Bob Ross +Bojan Jevtic +Bojan Koturanovic +Bordalo II +Boris Grigoriev +Boris Groh +Boris Kustodiev +Boris Vallejo +Botero +Brad Kunkle +Brad Rigney +Brandon Mably +Brandon Woelfel +Brenda Zlamany +Brent Cotton +Brent Heighton +Brett Weston +Brett Whiteley +Brian Bolland +Brian Despain +Brian Froud +Brian K. Vaughan +Brian Kesinger +Brian M. 
Viveros +Brian Mashburn +Brian Oldham +Brian Stelfreeze +Brian Sum +Briana Mora +Brice Marden +Bridget Bate Tichenor +Bridget Riley +Briton Rivière +Brooke DiDonato +Brooke Shaden +Brothers Grimm +Brothers Hildebrandt +Bruce Coville +Bruce Munro +Bruce Nauman +Bruce Pennington +Bruce Timm +Bruno Catalano +Bruno Munari +Bruno Walpoth +Bryan Hitch +Burt Glinn +Butcher Billy +C. R. W. Nevinson +Cagnaccio Di San Pietro +Cai Guo-Qiang +Callie Fink +Camille Claudel +Camille Corot +Camille Pissarro +Camille Vivier +Camille Walala +Camille-Pierre Pambu Bodo +Canaletto +Candido Portinari +Caras Ionut +Carel Willink +Carl Barks +Carl Eugen Keel +Carl Gustav Carus +Carl Holsoe +Carl Kleiner +Carl Larsson +Carl Moll +Carl Rungius +Carl Spitzweg +Carlo Bocchio +Carlo Crivelli +Carlo Dolci +Carlo Scarpa +Carlos Cruz-Diez +Carlos Schwabe +Carmen Saldana +Carne Griffiths +Carolina Herrera +Carolyn Blish +Carrie Ann Baade +Carrie Graber +Carrie Mae Weems +Carson Grubaugh +Carsten Holler +Carsten Meyerdierks +Casey Baugh +Casey Childs +Casey Weldon +Caspar David Friedrich +Cassius Marcellus Coolidge +Catherine Hyde +Catherine Nolin +Cathy Wilkes +Catrin Welz-Stein +Cecil Beaton +Cecilia Beaux +Cecily Brown +Cedric Peyravernay +Cedric Seaut +Cerith Wyn Evans +CFA Voysey +Chad Knight +Chaïm Soutine +Chantal Joffe +Charles Addams +Charles Angrand +Charles Blackman +Charles Burns +Charles Camoin +Charles Courtney Curran +Charles Dana Gibson +Charles Demuth +Charles Dwyer +Charles E. Burchfield +Charles Ellison +Charles Filiger +Charles Ginner +Charles Gwathmey +Charles Le Brun +Charles Liu +Charles Maurice Detmold +Charles Mellin +Charles Philipon +Charles Reiffel +Charles Rennie Mackintosh +Charles Robinson +Charles Schridde +Charles Schulz +Charles Sheeler +Charles Spencelayh +Charles Tunnicliffe +Charles Vess +Charles Victor Thirion +Charles White +Charles Willson Peale +Charles Wysocki +Charles-Amable Lenoir +Charles-Francois Daubigny +Charley Harper +Charlie Bowater +Charline von Heyl +Charly Amani +Chase Stone +Chen Zhen +Chesley Bonestell +Chie Yoshii +Chiharu Shiota +Chiho Aoshima +Childe Hassam +Ching Yeh +Chip Zdarsky +Chris Bachalo +Chris Claremont +Chris Cold +Chris Cunningham +Chris Dyer +Chris Foss +Chris Friel +Chris LaBrooy +Chris Leib +Chris Mars +Chris Moore +Chris Ofili +Chris Riddell +Chris Samnee +Chris Saunders +Chris Turnham +Chris Uminga +Chris Van Allsburg +Chris Ware +Christian Boltanski +Christian Dimitrov +Christian Grajewski +Christian Griepenkerl +Christian Schad +Christian Schloe +Christian Wilhelm Allers +Christine Ellger +Christoffer Relander +Christophe Jacrot +Christophe Staelens +Christophe Vacher +Christopher Balaskas +Christopher Jin Baron +Christopher Ryan McKenney +Christopher Wool +Chuah Thean Teng +Chuanxing Zeng +Chuck Close +Chung Shek +Cicely Mary Barker +Cildo Meireles +Cindy Sherman +Claes Oldenburg +Claire Hummel +Clara Ledesma +Clara Miller Burd +Clara Peeters +Clarence Gagnon +Clarence Holbrook Carter +Claude Cahun +Claude Lefèbvre +Claude Lorrain +Claude Melan +Claude Monet +Claudia Tremblay +Clay Mann +Clayton Crain +Clemens Ascher +Cleon Peterson +Cliff Chiang +Clifford Coffin +Clint Cearley +Clive Barker +Clive Madgwick +Clovis Trouille +Clyde Caldwell +Clyfford Still +Coby Whitmore +Coles Phillips +Colin Campbell Cooper +Colin Elder +Colin Geller +Collin Elder +Conor Harrington +Conrad Felixmüller +Conrad Roset +Conrad Shawcross +Constant Permeke +Constantin Brancusi +Constantin Joffe +Cornelia Parker +Cornelis Springer +Cory Arcangel +Cory Loftis +Costa 
Dvorezky +Craig Davison +Craig McCracken +Craig Mullins +Craig Thompson +Craig Wylie +Craigie Aitchison +Craola +Cuno Amiet +Cyril Pontet +Cyril Rolando +Daan Roosegaarde +Daido Moriyama +Dain Yoon +Dale Chihuly +Damien Hirst +Dan Colen +Dan Flavin +Dan Hillier +Dan Kelby +Dan Matutina +Dan McPharlin +Dan Mumford +Dan Piraro +Dan Witz +Dana Schutz +Danh Víµ +Danh Vō +Daniel Arsham +Daniel Buren +Daniel Clowes +Daniel F. Gerhartz +Daniel Garber +Daniel Jaems +Daniel Libeskind +Daniel Lieske +Daniel Merriam +Daniel Ridgway Knight +Daniela Uhlig +Daniele Afferni +Danielle Eckhardt +Danny Flynn +Danny Lyon +Danny Roberts +Dante Gabriel Rossetti +Dao Le Trong +Darek Zabrocki +Daria Endresen +Dariusz Klimczak +Dariusz Zawadzki +Darren Bacon +Darwyn Cooke +Daryl Mandryk +Dave Coverly +Dave Dorman +Dave Gibbons +Dave Heath +Dave McKean +Dave Rapoza +Dave Stevens +David A. Hardy +David Aja +David Alfaro Siqueiros +David Aronson +David B. Mattingly +David Bailey +David Bates +David Bomberg +David Bowie +David Brayne +David Brown Milne +David Burdeny +David Burliuk +David Chipperfield +David Choe +David Cronenberg +David Downton +David Driskell +David Finch +David Goldblatt +David Hammons +David Hettinger +David Hockney +David Inshaw +David LaChapelle +David Ligare +David Lynch +David Malkin +David Mann +David McClellan +David Michael Bowers +David Mould +David Nordahl +David Normal +David Palumbo +David Plowden +David Renshaw +David Shrigley +David Sims +David Spriggs +David Teniers the Younger +David Tindle +David Tutwiler +David Walker +David Welker +David Wiesner +David Yarrow +Davide Sasselli +Davide Sorrenti +Davis Marc +Davis Stuart +De Es Schwertberger +Dean Cornwell +Dean Ellis +Death Burger +Debbie Criswell +Debbie Fleming Caffery +Deborah Azzopardi +Deborah Turbeville +Dee Nickerson +Deirdre Sullivan-Beeman +Del Kathryn Barton +Delphin Enjolras +Demizu Posuka +Denis Sarazhin +Dennis Stock +Denys Lasdun +Derek Boshier +Derek Gores +Desmond Morris +Diane Arbus +Diane Dillon +Dick Bickenbach +Didier Barra +Didier Lourenço +Diego Dayer +Diego Rivera +Diego Velázquez +Dima Dmitriev +Dimitra Milan +Dimitry Roulland +Dino Valls +Dmitri Danish +Dmitry Kustanovich +Dmitry Spiros +Dmitry Vishnevsky +Do Ho Suh +Dod Procter +Dom Qwek +Don Bergland +Don Blanding +Don Bluth +Don Lawrence +Don Maitz +Don Northrup +Donald Judd +Donald Pass +Donato Giancola +Donna Huanca +Dora Carrington +Dora Maar +Dorina Costras +Dorothea Lange +Dorothea Sharp +Dorothea Tanning +Dorothy Johnstone +Dorothy Lathrop +Doug Aitken +Doug Chiang +Douglas Smith +Dr. Seuss +Drew Struzan +Duffy Sheridan +Duncan Fegredo +Dusan Djukaric +Dustin Nguyen +Duy Huynh +E. H. 
Shepard +Eadweard Muybridge +Earl Norem +Eastman Johnson +Ebru Sidar +Echo Chernik +Ed Benedict +Ed Binkley +Ed Brubaker +Ed Emshwiller +Ed Freeman +Ed Mell +Ed Myers +Ed Piskor +Ed Roth +Ed Valigursky +Eddie Campbell +Eddie Colla +Eddie Jones +Eddie Mendoza +Edgar Degas +Edgar Maxence +Edith Rimmington +Edmund Dulac +Edmund Leighton +Edmund Tarbell +Edmundo Senatore +Edna Reindel +Edoardo Tresoldi +Edogawa Ranpo +Edouard Manet +Édouard Manet +Edouard Riou +Édouard Riou +Édouard Vuillard +Eduard Cortes +Eduard Gaertner +Eduard Veith +Eduard Wiiralt +Eduardo Kobra +Edvard Munch +Edward Atkinson Hornel +Edward Bawden +Edward Blair Wilkins +Edward Burne-Jones +Edward Cucuel +Edward Gorey +Edward Henry Potthast +Edward Hersey +Edward Hopper +Edward John Poynter +Edward Julius Detmold +Edward Lear +Edward Moran +Edward Okuń +Edward Poynter +Edward Robert Hughes +Edward Seago +Edward Steichen +Edward Wadsworth +Edward Weston +Edwin Austin Abbey +Edwin Deakin +Edwin Henry Landseer +Edwin Lord Weeks +Edwin Wurm +Eero Saarinen +Egon Schiele +Eiichiro Oda +Eiko Ojala +Eileen Agar +Eileen Gray +Eilif Peterssen +Einar Hákonarson +Einer Johansen +Ekaterina Panikanova +El Anatsui +El Greco +El Lissitzky +Elaine de Kooning +Elba Damast +Eleanor Fortescue-Brickdale +Eleanor Vere Boyle +Elena Paraskeva +Elenore Abbott +Elfriede Lohse-Wächtler +Elias Ravanetti +Elihu Vedder +Elina Karimova +Eliott Lilly +Elisa Breton +Elisabeth Sonrel +Élisabeth Vigée Le Brun +Élisabeth-Louise Vigée-Le Brun +Eliseu Visconti +Elizabeth Catlett +Elizabeth Gadd +Elizabeth Shippen Green +Elke Vogelsang +Ellen Gallagher +Ellen Jewett +Ellen von Unwerth +Elliot Lilly +Elliott Erwitt +Ellsworth Kelly +Elsa Beskow +Elsa Bleda +Elsie Driggs +Emek Golan +Emerico Imre Toth +Emerson Silva +Emery Hawkins +Emil Alzamora +Emil Carlsen +Emil Ferris +Emil Nolde +Emila Medková +Émile Bernard +Emile Claus +Emile Galle +Émile Gallé +Emilia Wilk +Emiliano Di Cavalcanti +Emiliano Ponzi +Emilio Giuseppe Dossena +Emily Balivet +Emily Carr +Emily Kame Kngwarreye +Emma Geary +Emma Ríos +Emmanuel de Critz +Emmanuel Shiu +Emmanuelle Moureaux +Emmy Bridgwater +Emory Douglas +Enki Bilal +Enoch Bolles +Enrique Tábara +Ephraim Moses Lilien +Eric Canete +Eric Carle +Eric Deschamps +Eric Fischl +Eric Ravilious +Eric Wallis +Eric Zener +Erica Hopper +Erich Heckel +Erik Johansson +Erik Jones +Erin Hanson +Erna Rosenstein +Ernest Crichlow +Ernest Lawson +Ernest Meissonier +Ernest Zacharevic +Ernesto Neto +Ernie Barnes +Ernst Barlach +Ernst Fuchs +Ernst Haas +Ernst Haeckel +Ernst Ludwig Kirchner +Ernst Wilhelm Nay +Erol Otus +Erwin Blumenfeld +Esaias van de Velde +Esao Andrews +Esteban Vicente +Esther Rolick +Etel Adnan +Ethan Van Sciver +Étienne Adolphe Piot +Etienne Hebinger +Étienne Maurice Falconet +Étienne-Louis Boullée +Étienne-Maurice Falconet +Ettore Sottsass +Ettore Tito +Euan Uglow +Eugen Bracht +Eugène Atget +Eugène Boudin +Eugene Delacroix +Eugene Galien-Laloue +Eugène Girardet +Eugène Grasset +Eugene Leroy +Eugene Tertychnyi +Eugene von Guerard +Eustache Le Sueur +Eva Å vankmajerová +Eva Hesse +Eva Rothschild +Eva Švankmajerová +Eve Arnold +Eve Ventrue +Evelyn De Morgan +Evelyn Dunbar +Everett Raymond Kinstler +Everett Shinn +Evgeni Gordiets +Evgeny Lushpin +Ewald Heinrich Rübsaamen +Ewald Rübsamen +Eyvind Earle +Ezra Stoller +F Scott Hess +Fabian Perez +Fabio Hurtado +Fairfield Porter +Faith 47 +Faith Ringgold +Fan Ho +Fang Lijun +Farel Dalrymple +Fay Godwin +Fay Helfer +Felice Casorati +Felicia Simion +Felicien Rops +Felipe Pantone +Felix 
Gonzalez-Torres +Felix Kelly +Félix Labisse +Felix Philipp Kanitz +Felix Vallotton +Fenghua Zhong +Ferdinand du Puigaudeau +Ferdinand Hodler +Ferdinand Keller +Ferdinand Knab +Ferdinand Van Kessel +Ferenc Pinter +Fern Isabel Coppedge +Fernand Cormon +Fernand Fonssagrives +Fernand Khnopff +Fernand Leger +Fernand Toussaint +Fernando Amorsolo +Fernando Botero +Fernando Herenu +Ferris Plock +Filip Hodas +Filippino Lippi +Filippo Balbi +Filippo Brunelleschi +Fintan Magee +Firmin Baes +Fletcher Sibthorp +Flora Borsi +Florence Harrison +Florian Nicolle +Florianne Becker +Floris Arntzenius +Floris Jespers +Ford Madox Brown +Fra Angelico +Frances MacDonald +Francesca Woodman +Francesco Albani +Francesco Borromini +Francesco Clemente +Francesco Guardi +Francesco Hayez +Francesco Solimena +Francine Van Hove +Francis Bacon +Francis Coates Jones +Francis Davis Millet +Francis Newton Souza +Francis Picabia +Francisco De Goya +Francisco de Zurbaran +Francisco Goya +Francisco Martín +Franciszek Starowieyski +Franck Tacito +Franco Fontana +François Boquet +François Boucher +François De Nomé +François Marcela-Froideval +Francois Schuiten +Françoise Basset +Frank Auerbach +Frank Bramley +Frank Cadogan Cowper +Frank Cho +Frank Frazetta +Frank Gehry +Frank Holl +Frank Lloyd Wright +Frank McCarthy +Frank Miller +Frank Quitely +Frank Stella +Frank Thorne +Frank Tinsley +Frank Weston Benson +Frank Xavier Leyendecker +Franklin Booth +Franklin Carmichael +Frans Floris +Frans Francken the Younger +Frans Hals +Frans Snyders +František Dvořák +Frantisek Kupka +Franz Kline +Franz Lefler +Franz Marc +Franz Sedlacek +Franz Stuck +Franz Vohwinkel +Franz West +Franz Xaver Winterhalter +Fred Calleri +Fred Stein +Fred Tomaselli +Frederic Bazille +Frederic Church +Frédéric Fiebig +Frederic Remington +Frederick Arthur Bridgman +Frederick Arthur Verner +Frederick Carl Frieseke +Frederick Cayley Robinson +Frederick Goodall +Frederick Judd Waugh +Frederick Lord Leighton +Frederick McCubbin +Frederick Sandys +Frida Kahlo +Friedensreich Hundertwasser +Frieke Janssens +Frits Thaulow +Frits Van den Berghe +Fritz Bleyl +Fritz Henle +Fritz Scholder +Fujii Yoshitoyo +Fujishima Takeji +Fuyuko Matsui +Gabriel Dawe +Gabriele Münter +Gaetano Pesce +Galan Pang +Gareth Pugh +Gary Larson +Gaston Bussière +Gediminas Pranckevicius +Genndy Tartakovsky +Geof Darrow +Georg Jensen +Georg Karl Pfahler +George Ault +George Cruikshank +George Dionysus Ehret +George Frederic Watts +George French Angas +George Grosz +George Herriman +George Inness +George Lucas +George Luks +George Stubbs +George Tooker +Georges de La Tour +Georges Rouault +Georges Seurat +Georgia O'Keeffe +Gerald Brom +Gerda Wegener +Gerhard Munthe +Gerhard Richter +Gertrude Abercrombie +Giacomo Balla +Gianluca Foli +Gifford Beal +Gil Elvgren +Gilbert Stuart +Gilberto "Soren" Zaragoza +Giorgio De Chirico +Giotto Di Bondone +Giovanni Battista Bracelli +Giovanni Battista Gaulli +Giovanni Battista Piranesi +Giovanni Battista Venanzi +Giovanni da Udina +Giuseppe Arcimboldo +Giuseppe de Nittis +Gjon Mili +Glen Orbik +Glenn Fabry +Gloria Stoll Karn +Go Nagai +Gordon Browne +Gordon Parks +Goro Fujita +Grace Cossington Smith +Grace Popp +Grandma Moses +Grant Wood +Grayson Perry +Greg Girard +Greg Hildebrandt +Greg Rutkowski +Greg Simkins +Gregory Crewdson +Guerrilla Girls +Guido Borelli Da Caluso +Guido Crepax +Guillermo del Toro +Guo Pei +Gustaf Tenggren +Gustav Klimt +Gustave Buchet +Gustave Courbet +Gustave Doré +Gustave Moreau +Gustave Van de Woestijne +Guy Billout +Gwen John +Gwenda Morgan 
+H. R. (Hans Ruedi) Giger +H. R. Giger +H.P. Lovecraft +Haddon Sundblom +Hajime Sorayama +Hal Foster +Hale Woodruff +Hanna-Barbera +Hannah Hoch +Hans Arnold +Hans Baldung +Hans Baluschek +Hans Bellmer +Harold McCauley +Haroon Mirza +Harriet Backer +Harry Clarke +Hasui Kawase +Hayao Miyazaki +Hayv Kahraman +Hein Gorny +Heinrich Kley +Heinrich Lefler +Heinz Edelmann +Helen Frankenthaler +Helene Knoop +Helene Schjerfbeck +Helio Oiticica +Helmut Newton +Hendrick Avercamp +Hendrick Cornelisz Vroom +Hendrick Goltzius +Hendrik Kerstens +Henri De Toulouse Lautrec +Henri Fantin Latour +Henri Matisse +Henri Rousseau +Henri-Edmond Cross +Henriette Grindat +Henry Asencio +Henry Fuseli +Henry Moore +Henry Moret +Henry Ossawa Tanner +Henry Raleigh +Herbert List +Herve Groussin +Hervé Guibert +Hethe Srodawa +Hieronymus Bosch +Hikari Shimoda +Hilma AF Klint +Hirohiko Araki +Hiroshi Nagai +Hiroshi Sugimoto +Hiroshi Yoshida +Honor C. Appleton +Honoré Daumier +Hope Gangloff +Horace Vernet +Hou China +Howard Chandler Christy +Howard Finster +Howard Hodgkin +Howard Pyle +Hsiao-Ron Cheng +Hubert Robert +Hugh Ferriss +Hugh Kretschmer +Hundertwasser +Hyacinthe Rigaud +Iain Faulkner +Ian McQue +Ian Miller +Ida Rentoul Outhwaite +Igor Morski +Igor Wolski +Igor Zenin +Ilya Kuvshinov +Ilya Repin +Incarcerated Jerkfaces +Ingrid Baars +Inio Asano +Irma Stern +Iryna Yermolova +Isaac Cordal +Isaac Levitan +Ismail Inceoglu +Issac Levitan +Istvan Banyai +Itō Jakuchū +Ivan Aivazovski +Ivan Aivazóvski +Ivan Albright +Ivan Bilibin +Ivan Shishkin +Iwan Baan +J. J. Grandville +J.C. Leyendecker +J.M.W. Turner +Jacek Yerka +Jack Butler Yeats +Jack Davis +Jack Gaughan +Jack Kirby +Jackson Pollock +Jacob Hashimoto +Jacob Lawrence +Jacob van Ruisdael +Jacques Le Moyne +Jacques Nathan-Garamond +Jake Parker +Jakub Rozalski +James Abbott McNeill Whistler +James C Christensen +James Ensor +James Gilleard +James Gillray +James Gurney +James Jean +James Montgomery Flagg +James Paick +James Stokoe +James Thomas Watts +James Tissot +James Turrell +Jamie Baldridge +Jamie Hawkesworth +Jamie Hewlett +Jamie McKelvie +Jamini Roy +Jan Brett +Jan Luyken +Jan Pietersz Saenredam +Jan Van Eyck +Jan van Kessel the Elder +Jane Graverol +Jane Newland +Janek Sedlar +Jasmine Becket-Griffith +Jason A. Engle +Jason Chan +Jason Edmiston +Jasper Johns +Jaume Plensa +Jaya Suberg +JC Leyendecker +Jean Arp +Jean Auguste Dominique Ingres +Jean Bourdichon +Jean Delville +Jean Dubuffet +Jean Fouquet +Jean Giraud +Jean Jullien +Jean Marc Nattier +Jean Metzinger +Jean Nouvel +Jean-Antoine Watteau +Jean-Baptiste Monge +Jean-François Millet +Jean-Honoré Fragonard +Jean-Léon Gérôme +Jean-Louis Prevost +Jean-Michel Basquiat +Jean-Paul Riopelle +Jeanloup Sieff +Jeannette Guichard-Bunel +Jed Henry +Jef Wu +Jeff Easley +Jeff Goldblum +Jeff Kinney +Jeff Koons +Jeff Legg +Jeff Lemire +Jeff Simpson +Jeff Wall +Jeffrey Catherine Jones +Jeffrey Cruz (theCHAMBA) +Jeffrey Smith art +Jeffrey T. 
Larson +Jenny Saville +JennyBird Alcantara +Jeremiah Ketner +Jeremy Geddes +Jeremy Lipking +Jeremy Mann +Jerry Pinkney +Jerry Siegel +Jerzy Duda-Gracz +Jesper Ejsing +Jessica Rossier +Jessica Woulfe +Jessie Willcox Smith +Jhonen Vasquez +Jillian Tamaki +Jim Burns +Jim Davis +Jim Lee +Jim Mahfood +Jim Woodring +Jimmy Ernst +Jimmy Lawlor +Joachim Brohm +Joan Miró +Joan Tuset +Joanna Kustra +Joao Ruas +Joaquín Sorolla +Joe Bowler +Joe De Mers +Joe Fenton +Joe Jusko +Joe Madureira +Joe Webb +Joel Meyerowitz +Joel Sternfeld +Joey Chou +Johann Wolfgang von Goethe +Johannes Itten +Johannes Vermeer +Johannes Voss +Johfra Bosschart +John Anster Fitzgerald +John Atherton +John Atkinson Grimshaw +John Bauer +John Berkey +John Blanche +John Bratby +John Cassaday +John Constable +John Currin +John Duncan +John Frederick Kensett +John French Sloan +John Harris +John Howe +John Hoyland +John James Audubon +John Kenn Mortensen +John La Farge +John Lavery +John Martin +John Perceval +John Philip Falter +John Salminen +John Singer Sargent +John Singleton Copley +John Stezaker +John Totleben +John Wayne Gacy +John Whitcomb +John Wilhelm +John William Waterhouse +Jon Klassen +Jon McCoy +Jon Whitcomb +Jordan Grimmer +Jorge Jacinto +Josan Gonzalez +José Clemente Orozco +Josef Albers +Joseph Cornell +Joseph Ducreux +Joseph Lorusso +Joseph Mallord William Turner +Joseph Stella +Josephine Wall +Josh Kao +Josh Keyes +Jovana Rikalo +Józef Mehoffer +Juan Gris +Judy Chicago +Juergen Teller +Jules Bastien-Lepage +Julia Contacessi +Julian Calle +Juliana Huxtable +Julie Bell +Julie Blackmon +Julie Mehretu +Julien Delval +Julius Horsthuis +Jun Kaneko +Junji Ito +Justin Gerard +Kadir Nelson +Kaethe Butcher +Kapwani Kiwanga +Karel Appel +Karel Thole +Karen Wallis +Karl Blossfeldt +Karl Schmidt-Rottluff +Karol Bak +Kasia Nowowiejska +Kate Beaton +Kate Greenaway +Käthe Kollwitz +Kathryn Morris Trotter +Kati Horna +Katsuhiro Otomo +Katsushika Hokusai +Kawanabe Kyōsai +Kaws +Kay Nielsen +Kay Sage +Kazimir Malevich +Kazuo Koike +Kehinde Wiley +Keith Haring +Keith Negley +Keith Parkinson +Kelly Freas +Kelly Mckernan +Kelly Sue Deconnick +Kelly Vivanco +Ken Fairclough +Ken Kelly +Ken Sugimori +Kengo Kuma +Kenne Gregoire +Kent Monkman +Kentaro Miura +Kevin Gnutzmans +Kevin Sloan +Kieron Gillen +Kilian Eng +Kim Jung Gi +Kim Keever +Kitagawa Utamaro +Kitty Lange Kielland +Klaus Burgle +Klaus Janson +Klaus Wittmann +Kobayashi Kiyochika +Konstantin Korovin +Konstantin Yuon +Koson Ohara +Krenz Cushart +Kris Kuksi +Kuang Hong +Kunisada +Kuno Veeber +Kurzgesagt +L. Birge Harrison +Lady Pink +Larry Elmore +Larry Poons +Larry Sultan +László Moholy-Nagy +Laurel Burch +Laurent Grasso +Laurie Greasley +Laurie Lipton +Lawren Harris +Lee Krasner +Lee Madgwick +Lee Quinones +Leiji Matsumoto +Leon Kossoff +Leonardo Da Vinci +Leonetto Cappiello +Leonid Afremov +Leonora Carrington +Les Edwards +Lesley Vance +Leticia Gillett +Liam Wong +Liang Mark +Lisa Frank +Lisa Keene +Liu Ye +Liubov Sergeevna Popova +Lois van Baarle +Loish +Lorena Alvarez Gómez +Lorenz Hideyoshi +Loretta Lux +Lori Earley +Louis Comfort Tiffany +Louis Glackens +Louis Icart +Louis Janmot +Louis Rhead +Louis Wain +Louise Bourgeois +Louise Dahl-Wolfe +Lovis Corinth +Luca Boni +Lucas Cranach the Elder +Lucian Freud +Lucy Madox Brown +Ludwig Mies van der Rohe +Luis Royo +Luisa Russo +Lynd Ward +Lynda Barry +Lynda Benglis +Lyonel Feininger +Lyubov Popova +M.C. Escher +M.W. 
Kaluta +Mab Graves +Maginel Wright Enright Barney +Magnus Enckell +Makoto Shinkai +Malcolm Liepke +Man Ray +Mandy Disher +Mao Hamaguchi +Marat Latypov +Marc Chagall +Marc Davis +Marc Samson +Marc Simonetti +Marcin Jakubowski +Marco Mazzoni +Marcus Selmer +Marek Okon +Margaret Brundage +Margaret Macdonald Mackintosh +Margaret Mee +Margaux Valonia +Maria Kreyn +Maria Pascual Alberich +Maria Sibylla Merian +Marianne North +Marianne von Werefkin +Marie Guillemine Benoist +Marie Spartali Stillman +Marina Abramović +Marius Borgeaud +Marjane Satrapi +Mark Arian +Mark Briscoe +Mark Brooks +Mark Keathley +Mark Lovett +Mark Rothko +Mark Ryden +Mark Seliger +Marsden Hartley +Martin Ansin +Martin Deschambault +Martin John Heade +Martin Johnson Heade +Martin Kippenberger +Martine Johanna +Martiros Saryan +Mary Anning +Mary Blair +Mary Cassatt +Masaaki Masamoto +Masaaki Sasamoto +Masamune Shirow +Mat Collishaw +Mati Klarwein +Matias Hannecke +Matt Bors +Matt Fraction +Matt Groening +Matthias Grünewald +Matthias Jung +Matti Suuronen +Maurice Sendak +Max Beckmann +Max Dupain +Max Ernst +Max Pechstein +Max Weber +Maxfield Parrish +Maximilian Pirner +Maximilien Luce +Maxwell Boas +Mead Schaeffer +Méret Oppenheim +Meryl McMaster +Michael Carson +Michael Cheval +Michael Deforge +Michael Heizer +Michael Hutter +Michael Parkes +Michael Sowa +Michael Whelan +Michal Karcz +Michal Lisowski +Michelangelo Buonarroti +Michelangelo Merisi Da Caravaggio +Mickalene Thomas +Miho Hirano +Mikalojus Konstantinas Ciurlionis +Mike Campau +Mike Deodato +Mike Mayhew +Mike Mignola +Mike Winkelmann (Beeple) +Mike Worrall +Mikhail Larionov +Mikhail Nesterov +Mikhail Vrubel +Mikko Lagerstedt +Milo Manara +Milton Avery +Milton Caniff +Milton Glaser +Miriam Schapiro +Moebius +Mordecai Ardon +Mort Kunstler +Muxxi +Nagel Patrick +Nan Goldin +Naoki Urasawa +Naoko Takeuchi +Naomi Okubo +Naoto Hattori +Natalia Goncharova +Nathan Coley +Nathan Wirth +NC Wyeth +Neil Boyle +Neil Welliver +Nele Zirnite +NHK Animation +Ni Chuanjing +Nicholas Roerich +Nick Knight +Nick Sharratt +Nick Silva +Nicola Samori +Nicolas de Stael +Nicolas Delort +Nicolas Mignard +Nikolai Ge +Nikolina Petolas +Noah Bradley +Nobuyoshi Araki +Noelle Stevenson +Noriyoshi Ohrai +Norman Ackroyd +Norman Bluhm +Norman Foster +Norman Rockwell +Octavio Ocampo +Odd Nerdrum +Odilon Redon +Ogawa Kazumasa +Ohara Koson +Olafur Eliasson +Oleg Oprisco +Olga Skomorokhova +Olivier Bonhomme +Olivier Valsecchi +Ollie Hoff +Os Gemeos +Os Gêmeos +Osamu Tezuka +Oscar Domínguez +Óscar Domínguez +OSGEMEOS +Oskar Fischinger +Oskar Kokoschka +Ossip Zadkine +Otto Dix +Otto Marseus van Schrieck +Pablo Picasso +Pamela Colman Smith +Paolo Roversi +Paolo Veronese +Pascal Blanche +Pascale Campion +Patrice Murciano +Patricia Polacco +Patrick Brown +Patrick Caulfield +Patrick Dougherty +Patrick Heron +Patrick Woodroffe +Paul Barson +Paul Cézanne +Paul Chadeisson +Paul Corfield +Paul Delvaux +Paul Gauguin +Paul Gustav Fischer +Paul Henry +Paul Klee +Paul Laffoley +Paul Lehr +Paul Ranson +Paul Strand +Paul Wonner +Paula Modersohn-Becker +Paulus Potter +Pawel Kuczynski +Peter Andrew Jones +Peter Bagge +Peter De Seve +Peter Doig +Peter Elson +Peter Gric +Peter Holme III +Peter Howson +Peter Kemp +Peter Max +Peter Milligan +Peter Mohrbacher +Peter Paul Rubens +Peter Sculthorpe +Peter Wileman +Peter Zumthor +Phil Foglio +Phil Jimenez +Phil Koch +Phil Noto +Philip Guston +Philippe Druillet +Philippe Parreno +Pierre Bonnard +Pierre Puvis de Chavannes +Pierre-Auguste Renoir +Piet Hein Eek +Piet Mondrian +Pieter 
Aertsen +Pieter Bruegel The Elder +Pieter Claesz +Pieter de Hooch +Pieter Jansz Saenredam +Piotr Jabłoński +Pipilotti Rist +Pixar Concept Artists +Pixar +Posuka Demizu +Qian Xuan +Qing Han +Quentin Blake +Quentin Tarantino +Quint Buchholz +Rafael Albuquerque +Rafał Olbiński +Raffaello Sanizo +Raina Telgemeier +Raja Ravi Varma +Ralph Horsley +Ralph McQuarrie +Ralph Steadman +Ramon Casas +Randolph Caldecott +Raphael Lacoste +Raphael +Raphaelle Peale +Ravi Zupa +Ray Caesar +Ray Donley +Raymond Briggs +Raymond Duchamp-Villon +Raymond Leech +Raymond Swanland +Rayner Alencar +Rebeca Saray +Rebecca Guay +Rebecca Louise Law +Rebecca Sugar +Reginald Marsh +Rembrandt Van Rijn +Remedios Varo +René Lalique +Rene Laloux +Rene Magritte +RETNA (Marquis Lewis) +Reylia Slaby +RHADS +Rich Davies +Richard Burlet +Richard Corben +Richard Dadd +Richard Deacon +Richard Diebenkorn +Richard Doyle +Richard Eurich +Richard Hamilton +Richard Lindner +Richard McGuire +Richard Misrach +Richard S. Johnson +Richard Scarry +Rick Guidice +ROA +Rob Gonsalves +Rob Liefeld +Robby Cavanaugh +Robert Antoine Pinchon +Robert Chew +Robert Childress +Robert Crumb +Robert Farkas +Robert Hagan +Robert Irwin +Robert M Cunningham +Robert Maguire +Robert McCall +Robert Mcginnis +Robert Motherwell +Robert Neubecker +Robert Rauschenberg +Robert S. Duncanson +Robert Stivers +Robert Vonnoh +Robert William Hume +Robert Williams +Roberto Ferri +Roberto Matta +Roberto Parada +Rockwell Kent +Rodney Matthews +Rodríguez ARS +Roger Ballen +Roger de La Fresnaye +Roger Dean +Rolf Armstrong +Romero Britto +Ron Mueck +Ron Walotsky +Ronald Balfour +Ross Tran +Roy Gjertson +Roy Lichtenstein +Roz Chast +Ruan Jia +Rudolf Freund +Rufino Tamayo +Rumiko Takahashi +Russ Mills +Russell Ayto +Ruth Bernhard +Ruxing Gao +Ryan Hewett +Ryan McGinley +Ryan Stegman +Ryohei Hase +Sacha Goldberger +Sailor Moon +Sakai HoÌ„itsu +Sally Mann +Salomon van Ruysdael +Salvador Dali +Sam Bosma +Sam Kieth +Sam Spratt +Samuel and Joseph Newsom +Samuel Earp +Samuel Melton Fisher +Sandra Chevrier +Sandro Botticelli +Sandy Skoglund +Saner Edgar +Sanford Kossin +Sangyeob Park +Santiago Calatrava +Santiago Caruso +Sara Wollfalk +Sarah Lucas +Satoshi Kon +Saturno Butto +Saul Bass +Saul Steinberg +Saul Tepper +Scarlett Hooft Graafland +Scott Brundage +Scott Listfield +Scott Naismith +Sean Scully +Sean Yoro +Seb Mckinnon +Sebastian Errazuriz +Serge Marshennikov +Shaddy Safadi +Shaun Tan +Shawn Coss +Sheilah Beckett +Shepard Fairey +Sherree Valentine Daines +Shin Jeongho +Shinji Aramaki +Shintaro Kago +Shohei Otomo +Shotaro Ishinomori +Shusei Nagaoko +Sidney Nolan +Silvestro Lega +Simeon Solomon +Simon Birch +Simon Bisley +Simon Stalenhag +Simone Martini +Sir James Guthrie +Siya Oum +Skottie Young +Slim Aarons +Sofonisba Anguissola +Sonia Delaunay +Sou Fujimoto +Sparth +Squeak Carnwath +Stan And Jan Berenstain +Stan Lee +Stanislav Poltavsky +Stanisław Szukalski +Stanley Donwood +Stephan Martiniere +Stephen Gammell +Stephen Oakley +Stephen Shore +Stevan Dohanos +Steve Argyle +Steve Dillon +Steve Ditko +Steve Henderson +Steve Lieber +Steve McCurry +Steven Belledin +Storm Thorgerson +Stuart Davis +Stuart Haygarth +Stuart Immonen +Studio Ghibli +Sue Bryce +Susan Luo +Susan Seddon Boulet +Sven Nordqvist +Syd Mead +Sydney Edmunds +Sydney Prior Hall +Tadao Ando +Taiyō Matsumoto +Takashi Murakami +Takato Yamamoto +Takeshi Obata +Tamara Lempicka +Tan Zhi Hui +Tara McPherson +Tari Márk Dávid +Tatsuro Kiuchi +teamLab +Ted Nasmith +Ted Wallace +Teophilus Tetteh +Terada Katsuya +Teresa Ramos +Terry 
Oakes +Terry Redlin +Tex Avery +theCHAMBA +Theo van Rysselberghe +Théodore Géricault +Thomas Allom +Thomas Benjamin Kennington +Thomas Blackshear +Thomas Cole +Thomas Dodd +Thomas Eakins +Thomas Gainsborough +Thomas Kinkade +Thomas Moran +Thomas Rowlandson +Thomas Saliot +Thomas Struth +Thomas Visscher +Thomas W Schaller +Thornton Oakley +Tibor Nagy +Till Freitag +Tim Burton +Tim Doyle +Tim Hildebrandt +Tim White +Tintoretto +Titian +Todd McFarlane +Todd Schorr +Toei Animations +Tokujin Yoshioka +tokyogenso +Tom Bagshaw +Tom Hammick +Tom Lovell +Tom Roberts +Tom Thomson +Tom Whalen +Tomasz Alen Kopera +Tomer Hanuka +Tomi Ungerer +Tomma Abts +Tomokazu Matsuyama +Tony DiTerlizzi +Tony Moore +Toshiharu Mizutani +Toumas Korpi +Tove Jansson +Tracey Emin +Travis Louie +Tristan Eaton +Tsutomu Nihei +Tyler Edlin +Tyler Shields +Tyler West +Ub Iwerks +Uemura Shoen +Ul Di Rico +Umberto Boccioni +Utagawa Hiroshige +Valerie Hegarty +Vhils +Victo Ngai +Victor Adame Minguez +Victor Brauner +Victor Medina +Victor Moscoso +Victor Nizovtsev +Victor Vasarely +Victoria Crowe +Viktor Vasnetsov +Viktoria Gavrilenko +Vincent Di Fate +Vincent Tanguay +Vincent Van Gogh +Virgil Finlay +Vito Acconci +Vittorio Matteo Corcos +Vivian Maier +Viviane Sassen +Vivienne Westwood +Vladimir Kush +W. Heath Robinson +W.W. Denslow +Wadim Kashin +Walt Disney +Walt Kelly +Walter Crane +Walter Kim +Walter Langley +Walter Percy Day +Wangechi Mutu +Warren Ellis +Warwick Globe +Wassily Kandinsky +Wayne Barlowe +Wendy Froud +Wes Anderson +Wilfredo Lam +Will Barnet +Will Eisner +Willem de Kooning +Willem van Haecht +William Blake +William Eggleston +William Etty +William Gropper +William Henry Hunt +William Hogarth +William Holman Hunt +William Kentridge +William Morris +William S. Burroughs +William Steig +William Stout +William Wegman +William Zorach +William-Adolphe Bouguereau +Wim Crouwel +Wim Wenders +Winslow Homer +Winsor McCay +Wojciech Ostrycharz +Wolf Kahn +Wolfgang Tillmans +Worthington Whittredge +Yaacov Agam +Yang Jialun +Yanjun Cheng +Yasuo Kuniyoshi +Yasushi Nirasawa +Yasutomo Oka +Yayi Morales +Yayoi Kusama +Yiannis Moralis +Yinka Shonibare +Yohann Schepacz +Yoji Shinkawa +Yoshitaka Amano +Yoshiyuki Tomino +Yue Minjun +Yuri Ivanovich Pimenov +Yuumei +Yves Klein +Yves Tanguy +Zack Snyder +Zaha Hadid +Zanele Muholi +Zdzisław Beksiński +Zeen Chin +Zhang Kechun +Zhelong Xu +Zhichao Cai +Zinaida Serebriakova diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/aspect-ratio.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/aspect-ratio.txt" new file mode 100644 index 0000000000000000000000000000000000000000..44b3d3cfc3f74ecd3863f420b290289130d39e0a --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/aspect-ratio.txt" @@ -0,0 +1,13 @@ +4:3 aspect ratio +5:4 aspect ratio +2:1 aspect ratio +1.33:1 aspect ratio +16:9 aspect ratio +1.85:1 aspect ratio +2.39:1 anamorphic aspect ratio +2.76:1 (70mm) aspect ratio +1.37:1 (Academy) aspect ratio +2.59:1 (Cinerama) aspect ratio +2.65:1 (Cinerama) aspect ratio +2.35:1 (Cinemascope) aspect ratio +2.66:1 (Cinemascope) aspect ratio diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/audio/audio_genre.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/audio/audio_genre.txt" new file mode 100644 index 0000000000000000000000000000000000000000..7ead14669f1ba3212284e1a87b912ee0c9cdea75 --- /dev/null +++ 
"b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/audio/audio_genre.txt" @@ -0,0 +1,4 @@ +Heavy Metal +Jazz +K-pop +Punk \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/audio/audio_misc.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/audio/audio_misc.txt" new file mode 100644 index 0000000000000000000000000000000000000000..1417ec3169d11a2ca6da89bb7e0f0b692ee6d102 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/audio/audio_misc.txt" @@ -0,0 +1,17 @@ +audio visualizer +horn +vuvuzela +instrument +instrument case +guitar case +playing instrument +music +notes +musical note +spoken musical note +treble clef +bass clef +sheet music +music stand +singing +karaoke \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/audio/audio_places.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/audio/audio_places.txt" new file mode 100644 index 0000000000000000000000000000000000000000..6547b73f8cde1bdd0273a6ce7407e3bcdb94ec90 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/audio/audio_places.txt" @@ -0,0 +1,4 @@ +concert +radio booth +recording studio +stage \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/audio/audio_professions.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/audio/audio_professions.txt" new file mode 100644 index 0000000000000000000000000000000000000000..93c8eb9e986bd43e44749ab85819c38b1c6566fb --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/audio/audio_professions.txt" @@ -0,0 +1,4 @@ +conductor +dj +guitar girl +idol \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/audio/instruments/audio_instruments_brass.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/audio/instruments/audio_instruments_brass.txt" new file mode 100644 index 0000000000000000000000000000000000000000..6a94c3345e05248469d9c6e0ddb5a880be7aa264 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/audio/instruments/audio_instruments_brass.txt" @@ -0,0 +1,9 @@ +bugle +euphonium +flugelhorn +french horn +sousaphone +trombone +bass trombone +trumpet +tuba \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/audio/instruments/audio_instruments_keyboard.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/audio/instruments/audio_instruments_keyboard.txt" new file mode 100644 index 0000000000000000000000000000000000000000..b421731fa150b6e04c88c49545eb1d5e5d76b555 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/audio/instruments/audio_instruments_keyboard.txt" @@ -0,0 +1,16 @@ +accordion +concertina +melodica +harpsichord +keyboard +keytar +synthesizer +organ +electric organ +pipe organ +piano +piano keys +clavia nord +fender rhodes +grand piano +upright piano \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/audio/instruments/audio_instruments_other.txt" 
"b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/audio/instruments/audio_instruments_other.txt" new file mode 100644 index 0000000000000000000000000000000000000000..7344e2bb95999849a34478adaddfd7cbc2b563da --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/audio/instruments/audio_instruments_other.txt" @@ -0,0 +1,2 @@ +jaw harp +theremin \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/audio/instruments/audio_instruments_percussion.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/audio/instruments/audio_instruments_percussion.txt" new file mode 100644 index 0000000000000000000000000000000000000000..dad6f1b135898a188b7cd2171af9688ab3d1b523 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/audio/instruments/audio_instruments_percussion.txt" @@ -0,0 +1,20 @@ +bell +cowbell +handbell +jingle bell +castanets +cymbals +drum +bongo drums +conga drums +drumsticks +taiko drum +timpani +tsuzumi +maracas +marimba +pyeongyeong +sistrum +tambourine +triangle +xylophone \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/audio/instruments/audio_instruments_strings.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/audio/instruments/audio_instruments_strings.txt" new file mode 100644 index 0000000000000000000000000000000000000000..cda526924b633557e997346599527d2588aef361 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/audio/instruments/audio_instruments_strings.txt" @@ -0,0 +1,33 @@ +banjo +balalaika +bass guitar +bow +cello +double bass +electric double bass +erhu +gayageum +guitar +acoustic guitar +electric guitar +guqin +guzheng +harp +lyre +kantele +kokyuu +koto +lute +biwa lute +mandolin +morin khuur +pipa +ruan +sitar +shamisen +tonkori +ukulele +viola +violin +wolgeum +yueqin \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/audio/instruments/audio_instruments_woodwinds.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/audio/instruments/audio_instruments_woodwinds.txt" new file mode 100644 index 0000000000000000000000000000000000000000..8916ef6ccfe28769a06086c4a8e93073d1454059 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/audio/instruments/audio_instruments_woodwinds.txt" @@ -0,0 +1,13 @@ +bagpipes +bassoon +clarinet +crumhorn +electric wind instrument +flute +bass flute +fife +harmonica +oboe +ocarina +recorder +saxophone \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/audio/playback/audio_playback_media.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/audio/playback/audio_playback_media.txt" new file mode 100644 index 0000000000000000000000000000000000000000..474bc9dc9e7516a527f4287d6157356f287df740 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/audio/playback/audio_playback_media.txt" @@ -0,0 +1,3 @@ +cassette tape +cd +record \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/audio/playback/audio_playback_other.txt" 
"b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/audio/playback/audio_playback_other.txt" new file mode 100644 index 0000000000000000000000000000000000000000..7af6aa64d65125b767c9b5bfa7c316387d9c02aa --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/audio/playback/audio_playback_other.txt" @@ -0,0 +1,10 @@ +amplifier +tube amplifier +megaphone +microphone +microphone stand +pop filter +studio microphone +vintage microphone +boom microphone +mixing console \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/audio/playback/audio_playback_players.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/audio/playback/audio_playback_players.txt" new file mode 100644 index 0000000000000000000000000000000000000000..0e51f6b0f4713b6aecacc63a8b08e1f0a9095c8b --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/audio/playback/audio_playback_players.txt" @@ -0,0 +1,9 @@ +cd player +digital media player +ipod +iphone +phonograph +turntable +radio +stereo +walkman \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/audio/playback/audio_playback_speakers_and_headphones.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/audio/playback/audio_playback_speakers_and_headphones.txt" new file mode 100644 index 0000000000000000000000000000000000000000..0d3baefdc68320ebdc4109da967ee6458763c865 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/audio/playback/audio_playback_speakers_and_headphones.txt" @@ -0,0 +1,12 @@ +earphones +earbuds +earpiece +ear protection +earmuffs +headphones +behind-the-head headphones +bunny headphones +cat ear headphones +headphones for animal ears +headset +speaker \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/bird.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/bird.txt" new file mode 100644 index 0000000000000000000000000000000000000000..1433cfc4fe949cdaef1fb0dbbc7266f24506b6a3 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/bird.txt" @@ -0,0 +1,237 @@ +Greater White-fronted Goose +Snow Goose +Cackling Goose +Brant +Canada Goose +Mute Swan +Trumpeter Swan +Tundra Swan +Wood Duck +Gadwall +American Wigeon +American Black Duck +Mallard +Blue-winged Teal +Northern Shoveler +Northern Pintail +Green-winged Teal +Canvasback +Redhead +Ring-necked Duck +Greater Scaup +Lesser Scaup +Long-tailed Duck +Bufflehead +Common Goldeneye +Hooded Merganser +Common Merganser +Red-breasted Merganser +Ruddy Duck +(Ring-necked Pheasant) +Ruffed Grouse +Wild Turkey +(Northern Bobwhite) +Common Loon +Pied-billed Grebe +Horned Grebe +Red-neck Grebe +Double-crested Cormorant +American Bittern +Great Blue Heron +Great Egret +Little Blue Heron +Green Heron +Black-crowned Night-Heron +Turkey Vulture +Osprey +Bald Eagle +Northern Harrier +Sharp-shinned Hawk +Cooper’s Hawk +Northern Goshawk +Red-shouldered Hawk +Broad-winged Hawk +Red-tailed Hawk +Rough-legged Hawk +Golden Eagle +American Kestrel +Merlin +Peregrine Falcon +Virginia Rail +Sora +American Coot +Sandhill Crane +Semipalmated Plover +Killdeer +Spotted Sandpiper +Solitary Sandpiper +Greater Yellowlegs +Lesser Yellowlegs +White-rumped Sandpiper +Upland 
Sandpiper +Ruddy Turnstone +Short-billed Dowitcher +Semipalmated Sandpiper +Least Sandpiper +Baird’s Sandpiper +Pectoral Sandpiper +Dunlin +Wilson’s Snipe +American Woodcock +Bonaparte’s Gull +Laughing Gull +Ring-billed Gull +Herring Gull +Thayer’s Gull +Iceland Gull +Lesser Black-backed Gull +Glaucous Gull +Great Black-backed Gull +Caspian Tern +Black Tern +Common Tern +Rock Pigeon +Mourning Dove +Yellow-billed Cuckoo +Black-billed Cuckoo +Eastern Screech-Owl +Great Horned Owl +Barred Owl +Short-eared Owl +Long-eared Owl +Northern Saw-whet Owl +Common Nighthawk +Eastern Whip-poor-will +Chimney Swift +Ruby-throated Hummingbird +Belted Kingfisher +Red-headed Woodpecker +Red-bellied Woodpecker +Yellow-bellied Sapsucker +Downy Woodpecker +Hairy Woodpecker +Black-backed Woodpecker +Northern Flicker +Pileated Woodpecker +Olive-sided Flycatcher +Eastern Wood-Pewee +Yellow-bellied Flycatcher +Alder Flycatcher +Willow Flycatcher +Least Flycatcher +Eastern Phoebe +Great Crested Flycatcher +Eastern Kingbird +Northern Shrike +White-eyed Vireo +Yellow-throated Vireo +Blue-headed Vireo +Warbling Vireo +Philadelphia Vireo +Red-eyed Vireo +Blue Jay +American Crow +Common Raven +Horned Lark +Northern Rough-winged Swallow +Tree Swallow +Bank Swallow +Barn Swallow +Cliff Swallow +Purple Martin +Black-capped Chickadee +Red-breasted Nuthatch +White-breasted Nuthatch +Brown Creeper +Carolina Wren +House Wren +Winter Wren +Marsh Wren +Blue-gray Gnatcatcher +Golden-crowned Kinglet +Ruby-crowned Kinglet +Eastern Bluebird +Veery +Gray-cheeked Thrush +Swainson’s Thrush +Hermit Thrush +Wood Thrush +American Robin +Gray Catbird +Northern Mockingbird +Brown Thrasher +European Starling +American Pipit +Cedar Waxwing +Lapland Longspur +Snow Bunting +Ovenbird +Northern Waterthrush +Blue-winged Warbler +Golden-winged Warbler +Black-and-white Warbler +Tennessee Warbler +Orange-crowned Warbler +Nashville Warbler +Connecticut Warbler +Mourning Warbler +Common Yellowthroat +Hooded Warbler +American Redstart +Cape May Warbler +Northern Parula +Magnolia Warbler +Bay-breasted Warbler +Blackburnian Warbler +Yellow Warbler +Chestnut-sided Warbler +Blackpoll Warbler +Black-throated Blue Warbler +Palm Warbler +Pine Warbler +Yellow-rumped Warbler +Prairie Warbler +Black-throated Green Warbler +Canada Warbler +Wilson’s Warbler +Yellow-breasted Chat +Spotted Towhee +Eastern Towhee +American Tree Sparrow +Chipping Sparrow +Clay-coloured Sparrow +Field Sparrow +Vesper Sparrow +Savannah Sparrow +Grasshopper Sparrow +LeConte’s Sparrow +Nelson’s Sparrow +Fox Sparrow +Song Sparrow +Lincoln’s Sparrow +Swamp Sparrow +White-throated Sparrow +White-crowned Sparrow +Dark-eyed Junco +Scarlet Tanager +Northern Cardinal +Rose-breasted Grosbeak +Indigo Bunting +Bobolink +Red-winged Blackbird +Eastern Meadowlark +Yellow-headed Blackbird +Rusty Blackbird +Common Grackle +Brown-headed Cowbird +Orchard Oriole +Baltimore Oriole +Purple Finch +House Finch +Red Crossbill +White-winged Crossbill +Common Redpoll +Pine Siskin +American Goldfinch +Evening Grosbeak +House Sparrow diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/birds/birds_main.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/birds/birds_main.txt" new file mode 100644 index 0000000000000000000000000000000000000000..016bc96260e06931be7986d570e362caa3a23ce2 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/birds/birds_main.txt" @@ -0,0 +1 @@ +bird \ No newline at end of file diff 
--git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/birds/birds_mythological.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/birds/birds_mythological.txt" new file mode 100644 index 0000000000000000000000000000000000000000..e9b2050ee8ab745b3745c9b4183ccf9d0d90f2b6 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/birds/birds_mythological.txt" @@ -0,0 +1 @@ +Phoenix \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/birds/birds_real.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/birds/birds_real.txt" new file mode 100644 index 0000000000000000000000000000000000000000..9590ac42697ce0fe8191118367a3dcc725a23277 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/birds/birds_real.txt" @@ -0,0 +1,179 @@ +Albatross +Southern Royal Albatross +Barred Buttonquail +Blue Jay +Bluebird +Blue-footed Booby +Cassowary +Southern Cassowary +Chick +Chicken +Rooster +Cockatiel +Cockatoo +Blue-eyed Cockatoo +Major Mitchell's Cockatoo +Palm Cockatoo +Sulphur-crested Cockatoo +Yellowtip Cockatoo +Condor +Andean Condor +Cormorant +Crane (animal) +Black Crowned Crane +Grey Crowned Crane +Red-crowned Crane +Siberian Crane +Crow +Carrion Crow +Cuckoo +Dodo (bird) +Duck +Duckling +Mallard +Dusky Thrush +Eagle +Bald Eagle +Haast's Eagle +Little Egret +Emu +Falcon +Peregrine Falcon +Finch +Eurasian Bullfinch +Japanese Grosbeak +Double-barred Finch +Gouldian Finch +Java Sparrow +Zebra Finch +Flamingo +Frogmouth +Blue-and-white Flycatcher +Hill Blue Flycatcher +Narcissus Flycatcher +Verditer Flycatcher +Yellow-rumped Flycatcher +Goose +Greater White-Fronted Goose +Canada goose +Hawk +Goshawk +Japanese Sparrowhawk +Harris's Hawk +Heron +Chinese Pond Heron +Great Blue Heron +Grey Heron +Hummingbird +Japanese Bush Warbler +Japanese Robin +Japanese White-eye +Common Kingfisher +Crested Kingfisher +Kookaburra +Kiwi +Southern Brown Kiwi +Kori Bustard +Lilac-breasted Roller +Long-tailed Widowbird +Magpie +Azure-winged Magpie +Black-throated Magpie-Jay +Taiwan Blue Magpie +Meadow Bunting +Moa (animal) +South Island Giant Moa +Nightingale (bird) +Northern Cardinal +Northern Lapwing +Nuthatch +Oriental Dollarbird +Black-headed Oriole +Black-naped Oriole +Osprey +Ostrich +Owl +Snowy Owl +Southern Boobook +Parakeet +Golden Parakeet +Jandaya Parakeet +Rose-ringed Parakeet +Parrot +African Grey Parrot +Black-headed Parrot +Eclectus Parrot +Kakapo +Kea (animal) +Lovebird +Macaw +Blue-and-yellow Macaw +Hyacinth Macaw +Red-and-green Macaw +Scarlet Macaw +Pacific Parrotlet +Yellow-headed Amazon +Peacock +Pelican +Penguin +Adelie Penguin +African Penguin +Emperor Penguin +Fiordland Penguin +Galapagos Penguin +Humboldt Penguin +Little Penguin +Rockhopper Penguin +Yellow-eyed Penguin +Anthropornis +Pheasant +Copper Pheasant +Ijima Copper Pheasant +Golden Pheasant +Green Pheasant +Lady Amherst's Pheasant +Pheasant-tailed Jacana +Pigeon +Dove +Fantail Pigeon +Passenger Pigeon +White-bellied Green Pigeon +Plover (animal) +Puffin +Quail +Japanese Quail +Raggiana Bird-of-paradise +Red-flanked Bluetail +Blue-fronted Redstart +Daurian Redstart +Resplendent Quetzal +Roadrunner (animal) +Ryukyu Minivet +Seagull +Secretarybird +Shoebill +Siberian Rubythroat +Sparrow +Eurasian Tree Sparrow +House Sparrow +Stork +Oriental Stork +Swallow +Tree Swallow +Swan +Black Swan (bird) +Takahe +Great Tit 
+Japanese Tit +Long-tailed Tit +Toucan +Vulture +Black-backed Wagtail +Grey Wagtail +White Wagtail +Japanese Wagtail +White-bellied Go-away-bird +White-cheeked Starling +Woodpecker +Great Spotted Woodpecker +Yellow Bittern \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/camera-manu.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/camera-manu.txt" new file mode 100644 index 0000000000000000000000000000000000000000..fd2d1f50a00d7a24298d7d5c70df5d66dedb94ff --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/camera-manu.txt" @@ -0,0 +1,10 @@ +Canon +Nikon +Sony +Fujifilm +Olympus +Panasonic +Pentax +Leica +Hasselblad +Kodak diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/camera.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/camera.txt" new file mode 100644 index 0000000000000000000000000000000000000000..3096f5a960159e99739936600305bbf7afb18da8 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/camera.txt" @@ -0,0 +1,73 @@ +Canon EOS Rebel T7 DSLR Camera +Sony a7 IV Mirrorless Camera +Canon EOS R6 Mirrorless Camera +Canon EOS 5D Mark IV DSLR Camera +Sony a6600 Mirrorless Camera +Canon EOS Rebel T7 DSLR Camera +Canon EOS R6 Mirrorless Camera +Canon EOS 5D Mark IV DSLR Camera +Canon EOS RP Mirrorless Camera +Canon EOS Rebel T8i DSLR Camera +Canon EOS R5 Mirrorless Camera +Canon EOS R Mirrorless Camera +Canon EOS M6 Mark II Mirrorless Camera +Canon EOS M50 Mark II Mirrorless Camera +Canon EOS 90D DSLR Camera +Sony a7 IV Mirrorless Camera +Sony a6600 Mirrorless Camera +Sony a7 III Mirrorless Camera +Sony a6400 Mirrorless Camera +Sony a7R IV Mirrorless Camera +Sony a7S III Mirrorless Camera +Sony ZV-E10 Mirrorless Camera +Sony a9 II Mirrorless Camera +Sony a7C Mirrorless Camera +Sony a6100 Mirrorless Camera +Nikon Z6 II Mirrorless Camera +Nikon Z7 II Mirrorless Camera +Nikon D3500 DSLR Camera +Nikon D850 DSLR Camera +Nikon Z fc Mirrorless Camera +Nikon D7500 DSLR Camera +Nikon Z5 Mirrorless Camera +Fujifilm X-H2S Mirrorless Camera +Fujifilm X-T30 II Mirrorless Camera +Panasonic Lumix G7 Mirrorless Camera +Fujifilm X-S10 Mirrorless Camera +Panasonic Lumix GH6 Mirrorless Camera +Fujifilm X-T4 Mirrorless Camera +Panasonic Lumix GH5 II Mirrorless Camera +Vest Pocket Autographic Kodak Camera +Kodak 616 Junior Camera +Kodak 35 Camera +Kodak PIXPRO FZ43 16MP Digital Camera +Kodak PIXPRO AZ252 16.15MP Digital Camera +Kodak PIXPRO FZ152 16MP Digital Camera +Kodak PIXPRO AZ241 16.15MP Digital Camera +Kodak PIXPRO AZ401 16MP Digital Camera +Kodak Smile 10MP Digital Camera +Yashica T4 Camera +Konica Hexar AF Camera +Nikon L35AF Camera +Olympus XA2 Camera +Olympus OM-1 Camera +Nikon F2 Camera +Nikon FE Camera +Pentax 67 Camera +Pentax K1000 Camera +Canon AE-1 Camera +Canon A2/A2e Camera +Plaubel Makina W67 Camera +Lubitel 2 +Kodak No 3A Folding Pocket Camera +Canon G III QL17 Camera +Polaroid SX-70 Alpha Camera +Graflex Century Graphic 2×3 Camera +Pentacon Six TL Camera +Zenza Bronica ETRSi Camera +Hasselblad 500C Camera +Contax G1/G2 Camera +Leica M3 Double Stroke Camera +Leica M6 Camera +Horseman 4×5 (45HD) Camera +Mamiya 7 II Camera diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/cats/cats_behavior.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/cats/cats_behavior.txt" 
new file mode 100644 index 0000000000000000000000000000000000000000..6a6fb7ed5862688636e60542df695b4e672ecb66 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/cats/cats_behavior.txt" @@ -0,0 +1,5 @@ +cat loaf +kneading +nuzzling +scratching +stretching \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/cats/cats_breeds.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/cats/cats_breeds.txt" new file mode 100644 index 0000000000000000000000000000000000000000..c87b0175e342a2df0e57b2731609b071a0e3655f --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/cats/cats_breeds.txt" @@ -0,0 +1,4 @@ +norwegian forest cat +persian cat +ragdoll (cat) +siamese cat \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/cats/cats_coat.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/cats/cats_coat.txt" new file mode 100644 index 0000000000000000000000000000000000000000..01c26e61a9c25838953c4dcd469746549d453734 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/cats/cats_coat.txt" @@ -0,0 +1,9 @@ +black cat +brown cat +calico +grey cat +orange cat +pink cat +purple cat +tortoiseshell cat +white cat \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/cats/cats_felines.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/cats/cats_felines.txt" new file mode 100644 index 0000000000000000000000000000000000000000..171daad5c06c6094860991416b5ca5be70f9e0e6 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/cats/cats_felines.txt" @@ -0,0 +1,10 @@ +black panther +cheetah +cougar +jaguar +leopard +leopard cat +chinese leopard cat +lion +serval +tiger \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/cats/cats_main.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/cats/cats_main.txt" new file mode 100644 index 0000000000000000000000000000000000000000..5d5b306e57150daee82d4324efba33bca060dbfe --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/cats/cats_main.txt" @@ -0,0 +1 @@ +cat \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/cats/cats_misc.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/cats/cats_misc.txt" new file mode 100644 index 0000000000000000000000000000000000000000..92dcca73e7867c878d7a5dbda921cdba5d218552 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/cats/cats_misc.txt" @@ -0,0 +1,4 @@ +cathead +giant cat +kitten +too many cats \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/cats/cats_places.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/cats/cats_places.txt" new file mode 100644 index 0000000000000000000000000000000000000000..97eade7ba6147700b4ab7e3bfc39f6373925b630 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/cats/cats_places.txt" @@ -0,0 +1,7 @@ +cat between legs +cat in mouth +cat on ass +cat on 
head +cat on lap +cat on person +cat on shoulder \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/cats/cats_related.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/cats/cats_related.txt" new file mode 100644 index 0000000000000000000000000000000000000000..3ac9fec530c8453baa8e5f6ab9728d6ec8552a91 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/cats/cats_related.txt" @@ -0,0 +1,45 @@ +cat bag +cat band +cat band footwear +cat band gloves +cat ear legwear +cat boy +cat bra +cat button +cat choker +cat costume +cat cutout +cat cutout panties +cat ear bikini +cat ear headphones +cat ear panties +cat earrings +cat ears +slit pupils +cat feet +cat girl +cat gloves +cat hair +cat hair ornament +cat hat +cat helmet +cat hood +cat lingerie +cat mask +cat nose +cat ornament +cat pajamas +cat panties +cat pasties +cat paw +cat paws +cat pendant +cat print +cat skull +cat slippers +cat symbol +cat tail +cat thigh strap +barbed tongue +cat zipper +whiskers \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/celeb.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/celeb.txt" new file mode 100644 index 0000000000000000000000000000000000000000..356feae3a110a37bc1136ba9e0090dc527b6aba4 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/celeb.txt" @@ -0,0 +1,525 @@ +Addison Rae +Adele +Adriana Lima +Adrianne Palicki +Adèle Exarchopoulos +Aimee Lou Wood +AJ Lee +Alanna Masterson +Alejandra Guilmant +Alessandra Ambrosio +Alex Morgan +Alexa Bliss +Alexandra Daddario +Alexis Ren +Alica Schmidt +Alice Eve +Alicia Vikander +Alison Brie +Allison Stokke +Allison Williams +Aly Michalka +Aly Raisman +Alycia Debnam-Carey +Alyson Hannigan +Alyssa Milano +Amanda Cerny +Amanda Seyfried +Amber Heard +Amy Adams +Amy Smart +Ana de Armas +Angela Kinsey +Angela Sarafyan +Angelina Jolie +Anitta +Anna Akana +Anna Faris +Anna Kendrick +Anna Kournikova +Anna Paquin +AnnaSophia Robb +Anne Hathaway +Anya Chalotra +Anya Taylor-Joy +April Bowlby +Ariana Grande +Ariel Winter +Ashley Benson +Ashley Graham +Ashley Greene +Ashley Park +Ashley Tisdale +Aubrey Plaza +Auli’i Cravalho +Avril Lavigne +Awkwafina +Bailee Madison +Bar Refaeli +Barbara Palvin +Bebe Rexha +Becky G +Becky Lynch +Bella Hadid +Bella Thorne +Beth Behrs +Betty Gilpin +Beyoncé +Billie Eilish +Blake Lively +Bonnie Wright +Brenda Song +Brianna Hildebrand +Brie Bella +Brie Larson +Brighton Sharbino +Britney Spears +Brittany Snow +Brooklyn Decker +Bryce Dallas Howard +Caity Lotz +Cameron Diaz +Camila Cabello +Camila Mendes +Camilla Luddington +Candice Swanepoel +Cara Delevingne +Cardi B +Carey Mulligan +Carice van Houten +Carla Gugino +Carly Rae Jepsen +Caroline Vreeland +Cate Blanchett +Caylee Cowan +Charisma Carpenter +Charli D’Amelio +Charli XCX +Charlize Theron +Charlotte Flair +Charlotte McKinney +Chelsea Handler +Chloe Bennet +Chloe Kim +Chloë Grace Moretz +Chloë Sevigny +Chrissy Teigen +Christen Harper +Christian Serratos +Christina Aguilera +Christina Hendricks +Christina Ricci +Cindy Crawford +CJ Perry +Cobie Smulders +Courteney Cox +Cristin Milioti +Daisy Edgar-Jones +Daisy Ridley +Dakota Fanning +Dakota Johnson +Danielle Panabaker +Debby Ryan +Deborah Ann Woll +Demi Lovato +Demi Moore +Demi Rose +Denise Richards +Diana Silvers +Diane Guerrero +Diane Kruger +Diora Baird +Disha 
Patani +Dixie D’Amelio +Doja Cat +Dominique McElligott +Dove Cameron +Drew Barrymore +Dua Lipa +Eiza González +Eleanor Tomlinson +Elisabeth Shue +Elisha Cuthbert +Eliza Dushku +Elizabeth Banks +Elizabeth Debicki +Elizabeth Gillies +Elizabeth Hurley +Elizabeth Olsen +Elizabeth Turner +Elle Fanning +Ellie Goulding +Ellie Kemper +Elsa Hosk +Elsa Pataky +Elsie Hewitt +Emilia Clarke +Emily Bett Rickards +Emily Blunt +Emily Kinney +Emily Osment +Emily Ratajkowski +Emily VanCamp +Emma Kenney +Emma Mackey +Emma Raducanu +Emma Roberts +Emma Stone +Emma Watson +Emmanuelle Chriqui +Emmy Rossum +Erin Moriarty +Eugenie Bouchard +Eva Amurri +Eva Green +Eva Longoria +Eva Mendes +Evan Rachel Wood +Evangeline Lilly +Felicity Jones +Florence Pugh +Freya Allan +Gal Gadot +Gemma Arterton +Giada De Laurentiis +Gigi Hadid +Gillian Anderson +Gillian Jacobs +Gina Carano +Gisele Bündchen +Gwen Stefani +Gwyneth Paltrow +Hailee Steinfeld +Hailey Bieber +Halle Berry +Halsey +Hannah Simone +Hayden Panettiere +Hayley Atwell +Heather Graham +Heidi Klum +Helen Mirren +Helene Fischer +Hilary Duff +Holly Willoughby +HoYeon Jung +Hunter King +Hunter Schafer +Iggy Azalea +Ilana Glazer +Iliza Shlesinger +Imogen Poots +Ireland Baldwin +Irina Shayk +Iris Apatow +Isabela Merced +Isidora Goreshter +Isla Fisher +Jackie Cruz +Jaime Pressly +Jaimie Alexander +Jamie Chung +Jamie Lee Curtis +Jane Levy +January Jones +Jenna Coleman +Jenna Dewan +Jenna Fischer +Jennette McCurdy +Jennifer Aniston +Jennifer Connelly +Jennifer Garner +Jennifer Holland +Jennifer Lawrence +Jennifer Lopez +Jennifer Love Hewitt +Jennifer Morrison +Jenny Slate +Jessica Alba +Jessica Biel +Jessica Chastain +Jessica Parker Kennedy +Jessica Paré +Jessica Simpson +Jewel Staite +Jodie Comer +Joey King +JoJo +Jordana Brewster +Jordyn Huitema +Jordyn Jones +Josephine Skriver +Judy Greer +Julia Fox +Julia Garner +Julianne Hough +Julianne Moore +Julie Bowen +Juno Temple +Jutta Leerdam +Kaia Gerber +Kaitlyn Dever +Kaley Cuoco +Karen Fukuhara +Karen Gillan +Karlie Kloss +Kat Dennings +Kate Beckinsale +Kate Hudson +Kate Mara +Kate Upton +Kate Winslet +Katelyn Nacon +Katelyn Ohashi +Katherine Heigl +Katherine Langford +Katherine McNamara +Katherine Waterston +Katheryn Winnick +Kathryn Hahn +Katie Holmes +Katy Perry +Kaya Scodelario +Keira Knightley +Kelly Brook +Kelly Rohrbach +Kendall Jenner +Kerry Washington +Kesha +Khloé Kardashian +Kiernan Shipka +Kim Kardashian +Kimberley Garner +Kimiko Glenn +Kira Kosarin +Kirsten Dunst +Kourtney Kardashian +Kristen Bell +Kristen Stewart +Kristen Wiig +Kristin Kreuk +Krysten Ritter +Kylie Jenner +Kylie Minogue +Lacey Chabert +Lady Gaga +Lais Ribeiro +Lake Bell +Lana Del Rey +Larsen Thompson +Laura Harrier +Laura Prepon +Lauren Cohan +Lea Seydoux +Leelee Sobieski +Leila Lowfire +Lena Gercke +Lena Headey +Lena Meyer-Landrut +Leslie Mann +Lili Reinhart +Lili Simmons +Lily Chee +Lily Collins +Lily James +Lily Sheen +Lily-Rose Depp +Linda Cardellini +Lindsay Lohan +Lindsey Pelas +Lindsey Vonn +Lisa Kudrow +Liv Tyler +Lizzy Caplan +Lorde +Lucy Hale +Lucy Lawless +Lucy Liu +Lucy Pinder +Lupita Nyong’o +Lyndsy Fonseca +Mackenzie Davis +Mackenzie Foy +Mackenzie Ziegler +Maddie Ziegler +Madelaine Petsch +Madison Beer +Madison Pettis +Madisyn Shipman +Madonna +Maggie Grace +Maggie Gyllenhaal +Maia Mitchell +Maisie Williams +Malin Åkerman +Mandy Moore +Margaret Qualley +Margot Robbie +Maria Bakalova +Maria Menounos +Maria Sharapova +Mariah Carey +Marion Cotillard +Marisa Tomei +Mary Elizabeth Winstead +Maryse +María Pedraza +Maude Apatow +Maya 
Hawke +McKayla Maroney +Meagan Good +Megan Fox +Megan Thee Stallion +Meghan Markle +Melissa Benoist +Melissa Fumero +Melissa Rauch +Mena Suvari +Michelle Jenneke +Michelle Monaghan +Michelle Rodriguez +Michelle Trachtenberg +Michelle Williams +Miesha Tate +Mila Kunis +Milana Vayntrub +Miley Cyrus +Milla Jovovich +Millie Bobby Brown +Minka Kelly +Miranda Cosgrove +Miranda Kerr +Monica Bellucci +Morena Baccarin +Naomi Campbell +Naomi Scott +Naomi Watts +Natalia Dyer +Natalie Alyn Lind +Natalie Dormer +Natalie Portman +Natasha Lyonne +Nathalie Emmanuel +Neve Campbell +Nicki Minaj +Nicole Kidman +Nicole Scherzinger +Nikki Bella +Nikki Glaser +Nina Agdal +Nina Dobrev +Normani +Olga Kurylenko +Olivia Culpo +Olivia Munn +Olivia Rodrigo +Olivia Taylor Dudley +Olivia Wilde +Padma Lakshmi +Paige +Paige Spiranac +Paige VanZant +Palina Rojinski +Pamela Anderson +Paris Hilton +Paulina Gaitán +Penélope Cruz +Peyton List +Phoebe Tonkin +Pokimane +Pom Klementieff +Priyanka Chopra +Rachel Bilson +Rachel Brosnahan +Rachel Cook +Rachel McAdams +Rachel Weisz +Rachel Zegler +Rashida Jones +Rebecca Black +Rebecca Ferguson +Reese Witherspoon +Rhea Seehorn +Rihanna +Rita Ora +Robin Wright +Ronda Rousey +Ronja Forcher +Rooney Mara +Rosamund Pike +Rosario Dawson +Rose Byrne +Rose Leslie +Rose McIver +Rosie Huntington-Whiteley +Ruby O. Fee +Ruby Rose +Sabrina Carpenter +Sadie Sink +Salma Hayek +Samara Weaving +Sammi Hanratty +Sandra Bullock +Saoirse Ronan +Sara Sampaio +Sara Underwood +Sarah Chalke +Sarah Hyland +Sarah McDaniel +Sarah Michelle Gellar +Sarah Paulson +Sarah Shahi +Sarah Silverman +Sasha Alexander +Sasha Banks +Saweetie +Scarlett Johansson +Selena Gomez +Serena Williams +Shailene Woodley +Shakira +Shanola Hampton +Shantel VanSanten +Sharon Stone +Sienna Miller +Simone Biles +Sofía Vergara +Sophia Lillis +Sophia Thomalla +Sophie Turner +Stacy Keibler +Stephanie Beatriz +Stephanie McMahon +Susan Sarandon +Sydney Sweeney +Taylor Hill +Taylor Schilling +Taylor Swift +Tessa Thompson +Thomasin McKenzie +Tiffani Thiessen +Tina Fey +Tinashe +Torrie Wilson +Tove Lo +Tyra Banks +Uma Thurman +Ursula Corberó +Vanessa Hudgens +Vanessa Kirby +Vica Kerekes +Victoria De Angelis +Victoria Justice +Whitney Cummings +Willa Fitzgerald +Willa Holland +Winona Ryder +Yanet Garcia +Yvonne Strahovski +Zara Larsson +Zazie Beetz +Zendaya +Zoe Saldana +Zoey Deutch +Zooey Deschanel +Zoë Kravitz diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/cosmic-galaxy.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/cosmic-galaxy.txt" new file mode 100644 index 0000000000000000000000000000000000000000..140c2a6f37f78bece3bd32e8087851544d4afe0d --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/cosmic-galaxy.txt" @@ -0,0 +1,49 @@ +Abell Galaxy +Alcyoneus Galaxy +Andromeda Galaxy +Antennae Galaxies +Antila 2 galaxy +Backward Galaxy +Black Eye Galaxy +Bode's Galaxy +Butterfly Galaxies +Canes Venatici I galaxy +Canis Major Dwarf Galaxy +Cartwheel Galaxy +Cigar Galaxy +Circinus Galaxy +Coma Pinwheel Galaxy +Comet Galaxy +Condor Galaxy +Condor Galaxy +Cosmos Redshift 7 Galaxy +Crater II galaxy +Draco Dwarf Galaxy +Eye of Sauron Galaxy +Fireworks Galaxy +Hercules A Galaxy +Hoag's Object +Hockey Stick Galaxies +Large Magellanic Cloud +Lindsay-Shapley Ring +Little Sombrero Galaxy +Malin 1 Galaxy +Mayall's Object +Medusa Merger Galaxy +Mice Galaxies +Milky Way +Needle Galaxy +Pinwheel Galaxy +Sculptor Dwarf Galaxy +Sculptor Galaxy 
+Sculptor Galaxy +Small Magellanic Cloud +Sombrero Galaxy +Southern Pinwheel Galaxy +Sunflower Galaxy +Tadpole Galaxy +Triangulum Galaxy +Ursa Major I Dwarf Galaxy +Ursa Minor Galaxy +Whirlpool Galaxy +Wolf-Lundmark-Melotte diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/cosmic-nebula.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/cosmic-nebula.txt" new file mode 100644 index 0000000000000000000000000000000000000000..2b1c3d37bf0b9a498341976526b8c8b97226d127 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/cosmic-nebula.txt" @@ -0,0 +1,96 @@ +Dandelion Puffball Nebula +Bow-Tie Nebula +Ghost of Jupiter Nebula +Clown-faced Nebula +Ring Nebula +Cats Eye Nebula +Little Ghost Nebula +Medusa Nebula +Helix Nebula +Little Dumbbell Nebula +Owl Nebula +Twin Jet Nebula +Footprint Nebula +Butterfly Nebula +Jones-Emberson 1 Nebula +Lemon Slice Nebula +Soap Bubble Nebula +Skull Nebula +Oyster Nebula +Blue Flash Nebula +Fetus Nebula +Blue Snowball Nebula +Soccer Ball Nebula +Southern Crab Nebula +Red Spider Nebula +Bug Nebula +Saturn Nebula +Eight-burst Nebula +Spare-Tyre Nebula +Stingray Nebula +Retina Nebula +Spiral Planetary Nebula +Ant Nebula +Engraved Hourglass Nebula +Fleming Nebula +Southern Owl Nebula +Robins Egg Nebula +Cleopatras Eye +Box Nebula +Phantom Streak Nebula +Little Gem Nebula +The Box Nebula +Eye of Sauron Nebula +Boomerang Nebula +Calabash Nebula +Egg Nebula +Frosty Leo Nebula +Red Rectangle Nebula +Cotton Candy Nebula +Water Lily Nebula +Westbrook Nebula +LL Pegasi Nebula +M1-92 Nebula +Minkowskis Butterfly Nebula +Bubble Nebula +California Nebula +Carina Nebula +Cave Nebula +Cone Nebula +Crescent Nebula +Double Helix Nebula +Eagle Nebula +Elephants Trunk Nebula +Eta Carinae Nebula +Flame Nebula +Gum Nebula +Heart Nebula +Homunculus Nebula +Horsehead Nebula +Lagoon Nebula +North America Nebula +Omega Nebula +Pistol Nebula +Rosette Nebula +Running Chicken Nebula +Soul Nebula +Tarantula Nebula +Trifid Nebula +Witch Head Nebula +Messier 43 Nebula +Messier 78 Nebula +Reflection Nebula +Pacman Nebula +Orion Nebula +Ghost Head Nebula +NGC 604 Nebula +Tarantula Nebula +Dragonfish Nebula +RCW 49 Nebula +Prawn Nebula +Serpens-Aquila Rift Nebula +Canis Major Overdensity Nebula +Large Magellanic Cloud +Barnards Loop Nebula +Coalsack Nebula +Veil Nebula diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/cosmic-star.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/cosmic-star.txt" new file mode 100644 index 0000000000000000000000000000000000000000..e674b8b39ec1e9c8e6963957750e244cd811bcbb --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/cosmic-star.txt" @@ -0,0 +1,88 @@ +Sol Star +Sirius Star +Canopus Star +Rigil Kentaurus Star +Toliman Star +Arcturus Star +Vega Star +Capella Star +Rigel Star +Procyon Star +Achernar Star +Betelgeuse Star +Hadar Star +Altair Star +Acrux Star +Aldebaran Star +Antares Star +Spica Star +Pollux Star +Fomalhaut Star +Deneb Star +Mimosa Star +Regulus Star +Adhara Star +Shaula Star +Castor Star +Gacrux Star +Bellatrix Star +Elnath Star +Miaplacidus Star +Alnilam Star +Gamma Velorum Star +Alnair Star +Alnitak Star +Alioth Star +Dubhe Star +Mirfak Star +Wezen Star +Sargas Star +Kaus Australis Star +Avior Star +Alkaid Star +Menkalinan Star +Atria Star +Alhena Star +Peacock Star +Alsephina Star +Mirzam Star +Alphard Star +Polaris Star 
+Hamal Star +Algieba Star +Diphda Star +Mizar Star +Nunki Star +Menkent Star +Mirach Star +Alpheratz Star +Rasalhague Star +Kochab Star +Saiph Star +Denebola Star +Algol Star +Tiaki Star +Muhlifain Star +Aspidiske Star +Suhail Star +Alphecca Star +Mintaka Star +Sadr Star +Eltanin Star +Schedar Star +Naos Star +Almach Star +Izar Star +Dschubba Star +Larawag Star +Merak Star +Ankaa Star +Enif Star +Scheat Star +Sabik Star +Phecda Star +Aludra Star +Markeb Star +Markab Star +Aljanah Star +Acrab Star diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/cosmic-terms.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/cosmic-terms.txt" new file mode 100644 index 0000000000000000000000000000000000000000..9026ae9b3341d152d510a94815684498bddda0d8 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/cosmic-terms.txt" @@ -0,0 +1,152 @@ +Giant GRB Ring +Keplers Law +Supermassive Black Hole +Supernova +Asteroid +Asteroid Belt +Asteroid Field +Asteroids +Astronaut +Atmosphere +Background Radiation +Big Bang Theory +Binary Star +Binary Star System +Binary System +Black Body +Celestial +Celestial Equator +Cislunar +Comet +Comets +Corona +Cosmic Dust +Cosmic Radiation +Cosmology +Cosmonaut +Cosmos +Crater +Dark Matter +Debris Field +Deep Space +Dwarf Planet +Dwarf Planets +Elliptical Orbit +Event Horizon +Exoplanet +Exoplanets +Extragalactic +Falling Star +Flares +Flare +Flyby +Galaxies +Galaxy +Gamma Ray +Gamma Rays +Geostationary +Gravitation +Gravitational Lens +Gravitational Lensing +Gravity +Heliosphere +Helium +Hubbles Law +Hydrogen +Hyperbolic Orbit +Hypercompact Stellar System +Hypernova +Ice Giant +Inferior Planet +Inferior Planets +Inner Planets +Interstellar +Interstellar Dust +Ionosphere +Kuiper Belt +Lagrange Point +Lagrange Points +Lens +Light-Years Across +Local Group +Lunar +Magnitude +Mass +Meteor +Meteor Shower +Meteoroid +Meteors +Microgravity +Minor Planet +Moon +Moons +Nebula +Nebulas +Neutron Star +Neutron Stars +Nova +Novas +Oort Cloud +Orbital Eccentricity +Orbital Inclination +Outer Planets +Parallax +Perigee +Phase +Planet +Planetary Nebula +Planetary Nursery +Planetoid +Planetoids +Planets +Pulsar +Quasar +Quasars +Radiant +Rings +Roche Limit +Rogue Planet +Rogue Planets +Satellite +Satellites +Shooting Star +Singularity +Sky +Solar +Solar System +Solar Systems +Solar Wind +Solstice +Space Lab +Space Station +Spectrum +Starlight +Stratosphere +Subspace +Sunspot +Superior Planets +Supernovas +Telemetry +Telescope +Terrestrial +Total Eclipse +Totality +Transit +Translunar +Transneptunian +Twinkling +Umbra +Universe +Vacuum +Van Allen Belt +Variable Star +Variable Stars +Vernal Equinox +Visual Magnitude +Waning +Waxing +White Hole +Weightlessness +Wormhole +X-Rays +Zenith diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/dinosaur.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/dinosaur.txt" new file mode 100644 index 0000000000000000000000000000000000000000..47230ff1b8c35d77ede23833b3b6f5be51d3a0d3 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/dinosaur.txt" @@ -0,0 +1,535 @@ +Aardonyx +Abelisaurus +Abrictosaurus +Abrosaurus +Abydosaurus +Acanthopholis +Achelousaurus +Achillobator +Acrocanthosaurus +Adamantisaurus +Adasaurus +Adeopapposaurus +Aegyptosaurus +Aeolosaurus +Aerosteon +Afrovenator +Agilisaurus +Agustinia +Alamosaurus +Alaskacephale 
+Albertonykus +Albertosaurus +Alectrosaurus +Aletopelta +Alioramus +Allosaurus +Altirhinus +Alvarezsaurus +Alxasaurus +Amargasaurus +Amazonsaurus +Ammosaurus +Ampelosaurus +Amphicoelias +Amurosaurus +Anabisetia +Anatosaurus +Anatotitan +Anchiornis +Anchisaurus +Andesaurus +Angaturama +Angolatitan +Angulomastacator +Animantarx +Ankylosaurus +Anserimimus +Antarctopelta +Antarctosaurus +Antetonitrus +Apatosaurus +Appalachiosaurus +Aragosaurus +Aralosaurus +Archaeoceratops +Archaeopteryx +Archaeornithomimus +Argentinosaurus +Argyrosaurus +Aristosuchus +Arrhinoceratops +Astrodon +Atlasaurus +Atlascopcosaurus +Atrociraptor +Aublysodon +Aucasaurus +Auroraceratops +Australodocus +Australovenator +Austroraptor +Austrosaurus +Avaceratops +Aviatyrannis +Avimimus +Bactrosaurus +Bagaceratops +Bagaraatan +Bahariasaurus +Balaur +Bambiraptor +Barapasaurus +Barosaurus +Baryonyx +Becklespinax +Beipiaosaurus +Beishanlong +Bellusaurus +Bistahieversor +Bonitasaura +Borogovia +Bothriospondylus +Brachiosaurus +Brachyceratops +Brachylophosaurus +Brachytrachelopan +Brontomerus +Bruhathkayosaurus +Buitreraptor +Byronosaurus +Camarasaurus +Camelotia +Camptosaurus +Carcharodontosaurus +Carnotaurus +Caudipteryx +Centrosaurus +Cerasinops +Ceratonykus +Ceratosaurus +Cetiosauriscus +Cetiosaurus +Chaoyangsaurus +Charonosaurus +Chasmosaurus +Chialingosaurus +Chilantaisaurus +Chindesaurus +Chirostenotes +Chubutisaurus +Chungkingosaurus +Citipati +Claosaurus +Coahuilaceratops +Coelophysis +Coelurus +Colepiocephale +Compsognathus +Concavenator +Conchoraptor +Condorraptor +Corythosaurus +Crichtonsaurus +Cryolophosaurus +Cryptovolans +Dacentrurus +Daemonosaurus +Daspletosaurus +Datousaurus +Deinocheirus +Deinonychus +Deltadromeus +Diabloceratops +Diamantinasaurus +Diceratops +Dicraeosaurus +Dilong +Dilophosaurus +Dimetrodon +Diplodocus +Dollodon +Dracopelta +Dracorex +Dravidosaurus +Drinker +Dromaeosaurus +Dromiceiomimus +Dryosaurus +Dryptosaurus +Dubreuillosaurus +Dyslocosaurus +Dystrophaeus +Echinodon +Edmarka +Edmontonia +Edmontosaurus +Efraasia +Einiosaurus +Ekrixinatosaurus +Elaphrosaurus +Elrhazosaurus +Enigmosaurus +Eocarcharia +Eocursor +Eodromaeus +Eolambia +Eoraptor +Eotyrannus +Epachthosaurus +Epidendrosaurus +Equiijubus +Erketu +Erlikosaurus +Euhelopus +Euoplocephalus +Europasaurus +Euskelosaurus +Eustreptospondylus +Fabrosaurus +Falcarius +Fruitadens +Fukuiraptor +Fukuisaurus +Fulgurotherium +Futalognkosaurus +Gallimimus +Gargoyleosaurus +Garudimimus +Gasosaurus +Gasparinisaura +Gastonia +Giganotosaurus +Gigantoraptor +Gigantspinosaurus +Gilmoreosaurus +Giraffatitan +Gobiceratops +Gojirasaurus +Gondwanatitan +Gorgosaurus +Goyocephale +Gryposaurus +Guanlong +Hadrosaurus +Hagryphus +Haplocanthosaurus +Haplocheirus +Harpymimus +Herrerasaurus +Hesperonychus +Hesperosaurus +Heterodontosaurus +Hexinlusaurus +Heyuannia +Hippodraco +Homalocephale +Hongshanosaurus +Huabeisaurus +Huaxiagnathus +Huayangosaurus +Hylaeosaurus +Hypacrosaurus +Hypselosaurus +Hypsilophodon +Ignavusaurus +Iguanacolossus +Iguanodon +Ilokelesia +Incisivosaurus +Indosuchus +Ingenia +Irritator +Isisaurus +Janenschia +Jaxartosaurus +Jeholosaurus +Jeyawati +Jinfengopteryx +Jingshanosaurus +Jinzhousaurus +Jobaria +Juravenator +Kaijiangosaurus +Kentrosaurus +Khaan +Kileskus +Kol +Koreaceratops +Kosmoceratops +Kotasaurus +Kritosaurus +Kryptops +Labocania +Lagosuchus +Lambeosaurus +Lanzhousaurus +Lapparentosaurus +Leaellynasaura +Leptoceratops +Lesothosaurus +Lessemsaurus +Lexovisaurus +Liaoceratops +Liliensternus +Limusaurus +Linhenykus +Linheraptor 
+Linhevenator +Lophostropheus +Lourinhanosaurus +Luanchuanraptor +Lufengosaurus +Lurdusaurus +Lycorhinus +Magnirostris +Magnosaurus +Magyarosaurus +Mahakala +Maiasaura +Majungatholus +Malawisaurus +Mamenchisaurus +Mantellisaurus +Mapusaurus +Marshosaurus +Masiakasaurus +Massospondylus +Maxakalisaurus +Medusaceratops +Megalosaurus +Megapnosaurus +Megaraptor +Mei +Melanorosaurus +Metriacanthosaurus +Microceratops +Micropachycephalosaurus +Microraptor +Minmi +Minotaurasaurus +Miragaia +Mirischia +Mojoceratops +Monkonosaurus +Monoclonius +Monolophosaurus +Mononykus +Mussaurus +Muttaburrasaurus +Mymoorapelta +Nanotyrannus +Nanshiungosaurus +Nedcolbertia +Neimongosaurus +Nemegtomaia +Nemegtosaurus +Neovenator +Neuquensaurus +Nigersaurus +Noasaurus +Nodocephalosaurus +Nodosaurus +Nomingia +Nothronychus +Notohypsilophodon +Nqwebasaurus +Ojoceratops +Olorotitan +Omeisaurus +Opisthocoelicaudia +Ornitholestes +Ornithomimus +Orodromeus +Oryctodromeus +Othnielia +Ouranosaurus +Oviraptor +Oxalaia +Ozraptor +Pachycephalosaurus +Pachyrhinosaurus +Paluxysaurus +Panoplosaurus +Panphagia +Pantydraco +Paralititan +Paranthodon +Parasaurolophus +Patagosaurus +Pawpawsaurus +Pedopenna +Pelecanimimus +Pelorosaurus +Pentaceratops +Phuwiangosaurus +Piatnitzkysaurus +Pinacosaurus +Pisanosaurus +Plateosaurus +Pleurocoelus +Podokesaurus +Poekilopleuron +Polacanthus +Prenocephale +Prenoceratops +Probactrosaurus +Proceratosaurus +Procompsognathus +Propanoplosaurus +Prosaurolophus +Protarchaeopteryx +Protoceratops +Psittacosaurus +Puertasaurus +Pyroraptor +Qantassaurus +Qiaowanlong +Quaesitosaurus +Rahonavis +Rajasaurus +Rapetosaurus +Raptorex +Rebbachisaurus +Regnosaurus +Rhabdodon +Rhoetosaurus +Richardoestesia +Rinchenia +Riojasaurus +Rugops +Saichania +Saltasaurus +Saltopus +Sanjuansaurus +Santanaraptor +Sarahsaurus +Sarcolestes +Sarcosaurus +Saturnalia +Saurolophus +Sauropelta +Saurophaganax +Sauroposeidon +Saurornithoides +Saurornitholestes +Scansoriopteryx +Scelidosaurus +Scipionyx +Scutellosaurus +Secernosaurus +Segisaurus +Segnosaurus +Seismosaurus +Seitaad +Sellosaurus +Shantungosaurus +Shenzhousaurus +Shunosaurus +Shuvosaurus +Shuvuuia +Siamotyrannus +Silvisaurus +Similicaudipteryx +Sinocalliopteryx +Sinornithoides +Sinornithomimus +Sinornithosaurus +Sinosauropteryx +Sinovenator +Sinraptor +Sinusonasus +Skorpiovenator +Sonorasaurus +Sphaerotholus +Spinophorosaurus +Spinosaurus +Spinostropheus +Staurikosaurus +Stegoceras +Stegosaurus +Stenopelix +Stokesosaurus +Struthiomimus +Struthiosaurus +Stygimoloch +Styracosaurus +Suchomimus +Supersaurus +Suuwassea +Suzhousaurus +Szechuanosaurus +Talarurus +Talenkauen +Talos +Tanius +Tanycolagreus +Tarascosaurus +Tarbosaurus +Tarchia +Tastavinsaurus +Tatankacephalus +Tawa +Technosaurus +Telmatosaurus +Tenontosaurus +Teratophoneus +Tethyshadros +Texacephale +Thecodontosaurus +Theiophytalia +Therizinosaurus +Thescelosaurus +Tianyulong +Tianyuraptor +Timimus +Titanoceratops +Titanosaurus +Torosaurus +Torvosaurus +Triceratops +Troodon +Tsaagan +Tsintaosaurus +Tuojiangosaurus +Tylocephale +Tyrannosaurus +Tyrannotitan +Uberabatitan +Udanoceratops +Unaysaurus +Unenlagia +Urbacodon +Utahceratops +Utahraptor +Valdosaurus +Variraptor +Velafrons +Velociraptor +Vulcanodon +Wannanosaurus +Wellnhoferia +Wintonotitan +Wuerhosaurus +Xenoposeidon +Xenotarsosaurus +Xiaosaurus +Xiaotingia +Xiongguanlong +Xixianykus +Xuanhanosaurus +Xuanhuaceratops +Yamaceratops +Yandusaurus +Yangchuanosaurus +Yaverlandia +Yimenosaurus +Yinlong +Yizhousaurus +Yunnanosaurus +Zalmoxes +Zanabazar +Zhuchengosaurus 
+Zhuchengtyrannus +Zuniceratops +Zupaysaurus diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/dog/dog_breeds.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/dog/dog_breeds.txt" new file mode 100644 index 0000000000000000000000000000000000000000..8159035c6a9f1d9e19471e219c472c32a935f759 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/dog/dog_breeds.txt" @@ -0,0 +1,38 @@ +Afghan hound +Akita Inu +Australian Kelpie +Beagle +Bernese mountain dog +Border collie +Borzoi +Boston terrier +Briard +Bulldog +Bull terrier +Chihuahua +Chinese crested dog +Chow Chow +Dachshund +Dalmatian +Dingo +Doberman +English springer spaniel +German shepherd +Golden retriever +Great Dane +Greyhound +Husky +Irish setter +Japanese spitz +Maltese +Mastiff +Old English Sheepdog +Papillon +Pomeranian +Poodle +Pug +Shar Pei +Shiba inu +St. Bernard +Welsh corgi +Yorkshire Terrier \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/dog/dog_canines.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/dog/dog_canines.txt" new file mode 100644 index 0000000000000000000000000000000000000000..699eedc3b343f4c1f7d939007e7e8f18bdd91d89 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/dog/dog_canines.txt" @@ -0,0 +1,2 @@ +Hyena +Wolf \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/dog/dog_misc.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/dog/dog_misc.txt" new file mode 100644 index 0000000000000000000000000000000000000000..99f2c4188996c266f2ab38c4dbbc6c29f4d59bb8 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/dog/dog_misc.txt" @@ -0,0 +1,9 @@ +animal +pet +animalization +dog on head +dog walking +guide dog +no humans +pack of dogs +puppy \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/dog/dog_related.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/dog/dog_related.txt" new file mode 100644 index 0000000000000000000000000000000000000000..0e5aefd7a8a9f5dd447559f77508ba850edd0de9 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/dog/dog_related.txt" @@ -0,0 +1,5 @@ +dog costume +dog ears +dog hat +dog paws +dog tail \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/dog/dogs_main.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/dog/dogs_main.txt" new file mode 100644 index 0000000000000000000000000000000000000000..edca36dbf175e05aa8a8ec594f22cbf219725111 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/dog/dogs_main.txt" @@ -0,0 +1 @@ +dog \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/f-stop.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/f-stop.txt" new file mode 100644 index 0000000000000000000000000000000000000000..66db1d091acc41a735be9aa018d687e75ea33aed --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/f-stop.txt" @@ -0,0 +1,10 @@ +ƒ/1 +ƒ/1.4 +ƒ2 +ƒ/2.8 +ƒ4 +ƒ/5.6 +ƒ/8 +ƒ/11 
+ƒ/16 +ƒ/22 diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/fantasy-creature.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/fantasy-creature.txt" new file mode 100644 index 0000000000000000000000000000000000000000..5eb4ff0bcfe34f3b32c13f7cdd3a2892056837f7 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/fantasy-creature.txt" @@ -0,0 +1,36 @@ +Centaurs +Demon +Dragons +Dwarves +Elves +Evil Spirit +Ghost +Gnomes +King +Knight +Magician +Minotaur +Pegasus +Phoenix +Prince +Princess +Queen +Siren +Sorceress +Sprite +Trolls +Valkyrie +Vampire +Warlock +Werewolf +Wizard +Zombie +goblins +gryphons +ogres +orcs +phylakes +trolls +unicorns +witches +wizards diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/fantasy-setting.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/fantasy-setting.txt" new file mode 100644 index 0000000000000000000000000000000000000000..70cded04e8a8f4027c71833967fc069eb528af95 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/fantasy-setting.txt" @@ -0,0 +1,36 @@ +Xanadu: A magical kingdom ruled by a wise and benevolent queen +neverland: A lost island where children can stay forever young +Shambhala: A hidden paradise where enlightened beings live in perfect harmony +Atlantis: A lost city of great wisdom and power, now drowned beneath the sea +Fairyland: A place where fairies and other magical creatures live +Oz: A land of enchantment, where anything is possible +Mordor: A dark and evil land ruled by the dark lord Sauron +Narnia: A land of talking animals and magical creatures, ruled by Aslan the lion +Middle Earth: A vast and ancient world, home to Elves, Dwarves, Hobbits, and Wizards +Wonderland: A wild and fantastical place, ruled by the Queen of Hearts +A magical kingdom where everything is perfect and everyone is happy +A dark, foreboding castle where a wicked queen reigns +A fairytale land where magical creatures live and thrive +A tropical paradise where the sun shines brightly every day +An enchanted forest where magical creatures and plants abound +The Vale of Shadows: A dark and foreboding valley where no sunlight ever penetrates +The Forest of Mirrors: A place where reality is distorted and nothing is as it seems +The Crystal Cave: A fantastical place where gemstones and crystals grow in abundance +The Dragon's Lair: A deadly place where fearsome dragons dwell +The Tower of Oblivion: A stronghold where forgotten secrets and dark magic are hidden +The Island of Dreams: A place where all your deepest desires come true +The City of Glas +The Enchanted Forest +The Kingdom of the Unicorn +The Isle of the Dragon +The Valley of the Lost +The City of Mirrors +The Palace of Dreams +The Tower of Destiny +The Castle of Shadows +The Mountain of Fire +The Shattering Isles +The Isle of Doom +The Shrine of the Lost +The Valley of the Damned +The Tower of Shadows diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/focal-length.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/focal-length.txt" new file mode 100644 index 0000000000000000000000000000000000000000..68588391e69ab44d8a62ff2ac1c27b3e0af1b2a4 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/focal-length.txt" @@ -0,0 +1,13 @@ +4mm - 14mm focal length +Fisheye focal 
length +14mm - 35mm focal length +Wide angle focal length +35mm - 85mm focal length +Standard focal length +85mm - 135mm +Short telephoto focal length +135mm+ focal length +Medium telephoto focal length +300mm+ focal length +Macro +Macro focal length diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/games.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/games.txt" new file mode 100644 index 0000000000000000000000000000000000000000..79c34c0349b8d94d5ac016c6c9cbd974df5e308b --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/games.txt" @@ -0,0 +1,157 @@ +8-bit JRPG +A Short Hike +Age of Empires +Albion +Alone in the Dark +Angry Birds +Animal Crossing +Baldur's Gate II +Baldurs Gate +Bastion +Battle Axe +BioShock +Blizzard +Cadillacs and Dinosaurs +Candy Crush +Carmen Sandiego +Castle Wolfenstein +Castlevania +Chrono Trigger +Civilization V +Command & Conquer Red Alert +Commander Keen +Crusaider No Regret +Dark Souls +Darkest Dungeon +Day of the Tentacle +Desert Strike +Diablo 2 +Diablo +Disco Elysium +Discworld +Divinity Original Sin +Doom +Dota 2 +Dune +Escape from Monkey Island +Factorio +Fall out +Final Fantasy +Full Throttle +Full Throttle +Game Boy Advance +Game Boy Color +Game Boy +God of War +Grand Theft Auto III +Grim dawn +Grim Fandango +GTA 1 +GTA 2 +GTA V +Half-Life +Harvest Moon +I Have No Mouth And I Must Scream +Indiana Jones and the Fate of Atlantis +JRPG +League of Legends +Legend of Mana +Leisure Suit Larry +Lemmings +Little Big Adventure +Loom +LucasArts adventure games +LucasArts +Magic Carpet +Maniac Mansion +Mario Galaxy +Mario Odissey +Marvel vs. Capcom +MDK +Metal Slug +Metroid Prime +Minecraft +Monster Rancher +Monument Valley +Mortal Kombat +Myst +Neverwinter Nights +Nintendo DS +Nintendo N64 +Nintendo +Overwatch +Path of Exile +Plants vs zombies +Playstation 1 +Playstation 2 +Pokemon +Portal 2 +Prince of Persia +Priston Tale +PS One +Quake +Ragnarok +Raptor Call of the Shadows +Red Dead Redemption 2 +Redneck Rampage +Resident Evil +Retro RPG +Rimworld +Rockstar Games +Roler Coaster Tycoon +Runescape +Sam & Max Hit the Road +Samorost +ScummVM +Shadowrun +Sid Meier’s Civilization +Simant +Simcity 2000 +Simcity 3000 +Simon the Sorcerer +Skyrim +Sony +Square Enix +Star wars KOTOR +Star wars yoda stories +Starcraft +Stardew Valley +Starfox +Street Fighter +Super Mario Bros 3 +Super Mario RPG +Super Mario World +Super Metroid +Super Nintendo +Superfrog +Syndicate +Team Fortress 2 +Tetris +The Chaos Engine +The Dig +The Elder Scrolls +The Elder Scrolls +The Incredible Machine +The Journey Down +The Last of Us +The Legend of Zelda +The Lord of the Rings +The Secret of Monkey Island +The Witcher +Theme Park +Thief +Tibia +Tomb Raider +Tony Hawk's Pro Skater 2 +Transport Tycoon +Tropico 5 +Tyrian +Ultima Online +Undertale +Warcraft +Wing Commander +World of Warcraft +X-Com UFO Defense +X-men +Yoshi's Island +Zak McKracken and the Alien Mindbenders diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/groups/by_gender/groups_by_gender_other.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/groups/by_gender/groups_by_gender_other.txt" new file mode 100644 index 0000000000000000000000000000000000000000..b342bc2be8407aeae1e8dd0cd812afc7cdcbb542 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/groups/by_gender/groups_by_gender_other.txt" @@ -0,0 +1,7 @@ 
+multiple others +1other +2others +3others +4others +5others +6+others \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/groups/groups_animals.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/groups/groups_animals.txt" new file mode 100644 index 0000000000000000000000000000000000000000..828b84779bac94d87ae72e4cd5e73901e2d4d0c7 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/groups/groups_animals.txt" @@ -0,0 +1,2 @@ +pack of dogs +too many cats \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/groups/groups_everyone.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/groups/groups_everyone.txt" new file mode 100644 index 0000000000000000000000000000000000000000..19cbe953d770811886b2f6a698016ce4798f757c --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/groups/groups_everyone.txt" @@ -0,0 +1,2 @@ +absolutely everyone +everyone \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/groups/groups_main.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/groups/groups_main.txt" new file mode 100644 index 0000000000000000000000000000000000000000..cda38e46331bc21545734fbd557e648b92f41547 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/groups/groups_main.txt" @@ -0,0 +1,5 @@ +dual persona +multiple persona +clone +crossover +look-alike \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/injury/injury_coverings.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/injury/injury_coverings.txt" new file mode 100644 index 0000000000000000000000000000000000000000..6761e3d2e80aed9825fed3825c0aca0022ccc154 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/injury/injury_coverings.txt" @@ -0,0 +1,15 @@ +bandaid +bandaid on knee +bandaid on arm +bandaid on leg +bandaid on face +bandaid on cheek +bandaid on ear +bandaid on forehead +bandaid on nose +bandages +bandage on face +bandage over one eye +eyepatch +sarashi +sling (medical) \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/injury/injury_misc.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/injury/injury_misc.txt" new file mode 100644 index 0000000000000000000000000000000000000000..4048f3a8ee1807c249bbba5c2a839c48a8cc5d64 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/injury/injury_misc.txt" @@ -0,0 +1,12 @@ +Doctor +Nurse +Hospital +first aid +intravenous drip +surgery +syringe +amputee +pill +drugs +guro +organs \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/injury/injury_visible.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/injury/injury_visible.txt" new file mode 100644 index 0000000000000000000000000000000000000000..902c5c4855b74dae115018cfea2e99121d6c627e --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/injury/injury_visible.txt" @@ -0,0 +1,20 @@ +blood +bleeding +blood on bandages 
+bruise +bruised eye +cuts +deep wound +head bump +missing eye +nosebleed +scar +scratches +stitches +scrape +sprain +stab +impaled +severed limb +amputee +track marks \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/iso-stop.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/iso-stop.txt" new file mode 100644 index 0000000000000000000000000000000000000000..912eceab60d13a4bb49ebb4587bb7aaeae8f6ab0 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/iso-stop.txt" @@ -0,0 +1,13 @@ +ISO 100 +ISO 200 +ISO 400 +ISO 800 +ISO 1600 +ISO 3200 +ISO 6400 +ISO 12500 +ISO 25600 +ISO 51200 +ISO 102400 +ISO 204800 +ISO 409600 diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/jobs/jobs.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/jobs/jobs.txt" new file mode 100644 index 0000000000000000000000000000000000000000..5200330647652dd4b1190622a8d38edcee93daa7 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/jobs/jobs.txt" @@ -0,0 +1,76 @@ +actor +actress +alchemist +artist painter +astronaut +athlete +bartender +bodyguard +burglar +butler +cashier +chef +chemist +conductor +construction worker +croupier +cyclist +dentist +dj +doctor +dominatrix +driver +engineer +farmer +firefighter +flight attendant +florist +geisha +guard +hacker +hikikomori +idol +janitor +judge +librarian +lifeguard +lumberjack +maid +mechanic +merchant +miko +miner +monk +musician +ninja +nun +nurse +school nurse +office lady +pilot +police +politician +priest +prisoner +prostitution +sailor +salaryman +samurai +scientist +shepherd +slave +soldier +officer +standard-bearer +spy +stripper +teacher +terrorist +trucker +train attendant +train conductor +waiter +waitress +warrior +wizard +witch \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/landscape-type.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/landscape-type.txt" new file mode 100644 index 0000000000000000000000000000000000000000..c892cea7ce16918d123a0477929041769daaa288 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/landscape-type.txt" @@ -0,0 +1,127 @@ +abstract landscape +aerial landscape +architectural landscape +botanical landscape +cityscapes +desertscapes +ecological landscape +fantasy landscape +flowering landscape +forestscapes +historical landscape +industrial landscape +landscape +landscape in black and white +landscape in color +landscape with Analogous Colors +landscape with animals +landscape with Asymmetry +landscape with Atmospheric Perspective +landscape with Balance +landscape with balance +landscape with Bright Colors +landscape with chroma +landscape with colors +landscape with Complementary Colors +landscape with contrast +landscape with Contrast of Brightness and Darkness +landscape with Contrast of Earth Tones and Non-Earth Tones +landscape with Contrast of Hue +landscape with Contrast of Intensity +landscape with Contrast of Lightness +landscape with Contrast of Saturation +landscape with Contrast of Visual Weight +landscape with Contrast of Warmth and Coolth +landscape with Cool Colors +landscape with Curvilinear +landscape with Dark Colors +landscape with depth +landscape with dimensions +landscape with Dimetric Perspective +landscape with 
Dominance +landscape with earth +landscape with Earth Colors +landscape with Emphasis +landscape with fire +landscape with Focal Point +landscape with form +landscape with Four-Point Perspective +landscape with Freeform +landscape with Geometric +landscape with Gradation +landscape with Hard Texture +landscape with Harmony +landscape with harmony +landscape with Horizontal Balance +landscape with hue +landscape with intensity +landscape with Isometric Perspective +landscape with light +landscape with lightness +landscape with Linear Perspective +landscape with moon +landscape with movement +landscape with Multi-Point Perspective +landscape with Neutral Colors +landscape with One-Point Perspective +landscape with Organic +landscape with Pattern +landscape with pattern +landscape with people +landscape with perspective +landscape with Proportion +landscape with proportion +landscape with Quadrilateral +landscape with Radial Balance +landscape with Rectilinear +landscape with Repetition +landscape with rhythm +landscape with Rough Texture +landscape with saturation +landscape with scale +landscape with shade +landscape with shadow +landscape with shapes +landscape with Simultaneous Contrast +landscape with Single-Point Perspective +landscape with sky +landscape with Smooth Texture +landscape with Soft Texture +landscape with space +landscape with stars +landscape with Subordination +landscape with sun +landscape with Symmetry +landscape with temperature +landscape with Texture +landscape with texture +landscape with Three-Point Perspective +landscape with time +landscape with tint +landscape with tone +landscape with Triangulation +landscape with Trimetric Perspective +landscape with Two-Point Perspective +landscape with Unity +landscape with unity +landscape with value +landscape with Variety +landscape with variety +landscape with Vertical Balance +landscape with Visual Weight +landscape with Warm Colors +landscape with water +landscape with wind +landscape withlines +maximalist landscape +minimalist landscape +mountainscapes +mythological landscape +panoramic landscape +political landscape +religious landscape +scientific landscape +seascapes +skyscapes +snowscapes +surreal landscape diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/meta_provenance/meta_provenance.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/meta_provenance/meta_provenance.txt" new file mode 100644 index 0000000000000000000000000000000000000000..eda7a070edc95dc7117bc4903532f344527c7871 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/meta_provenance/meta_provenance.txt" @@ -0,0 +1,51 @@ +md5 mismatch +ai-generated +ai-assisted +duplicate +pixel-perfect duplicate +derivative work +reference photo +photo-referenced +anime screencap +screencap-referenced +game screenshot +screenshot +redrawn +album cover redraw +manga panel redraw +screencap redraw +end card +game cg +novel illustration +official art +game model +key visual +promotional art +production art +animatic +disc menu +cd (source) +spoilers +check spoilers +paid reward +patreon reward +enty reward +fantia reward +fanbox reward +gumroad reward +paid reward available +line sticker available +scan +color halftone +magazine scan +scan artifacts +scan dust +self scan +bleed through +binding discoloration +fixed +avatar generator +off-topic +jav +self upload +revision \ No newline at end of file diff --git 
"a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/movement.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/movement.txt" new file mode 100644 index 0000000000000000000000000000000000000000..3117117ab9f835fa14e4c325bbdf80f0cdce7ef9 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/movement.txt" @@ -0,0 +1,189 @@ +afrofuturism +ascii art +abstract art +art brut +abstract expressionism +abstract illusionism +academic art +action painting +aestheticism +altermodern +american barbizon school +american impressionism +american realism +american scene painting +analytical art +antipodeans +arabesque +arbeitsrat für kunst +art & language +art deco +art informel +art nouveau +art photography +arte povera +arts and crafts movement +ashcan school +assemblage +australian tonalism +les automatistes +auto-destructive art +barbizon school +baroque +bauhaus +berlin secession +black arts movement +bengal school of art +brutalism +classical realism +cloisonnism +cobra +color field +context art +computer art +concrete art +conceptual art +constructivism +crystal cubism +cubo-futurism +cubism +cynical realism +dada +dansaekhwa +danube school +dau-al-set +de stijl (also known as neoplasticism) +deconstructivism +digital art +ecological art +environmental art +modern european ink painting +excessivism +expressionism +fantastic realism +fauvism +feminist art +figurative art +figuration libre +fine art +folk art +fluxus +funk art +futurism +geometric abstract art +glitch art +graffiti/street art +gutai group +gothic art +happening +harlem renaissance +heidelberg school +hudson river school +hurufiyya +hypermodernism +hyperrealism +impressionism +incoherents +interactive art +institutional critique +international gothic +international typographic style +kinetic art +kinetic pointillism +kitsch movement +land art +les nabis +letterism +light and space +lowbrow +lyco art +lyrical abstraction +magic realism +mail art +mannerism +massurrealism +maximalism +metaphysical painting +mingei +minimalism +modernism +modular constructivism +naive art +neoclassicism +neo-dada +neo-expressionism +neo-fauvism +neo-figurative +neogeo (art) +neoism +neo-primitivism +neo-romanticism +net art +new objectivity +new sculpture +northwest school +nuclear art +objective abstraction +op art +orphism +photorealism +panfuturism +paris school +pixel art +plasticien +plein air +pointillism +pop art +post-impressionism +postminimalism +precisionism +pre-raphaelitism +primitivism +private press +process art +psychedelic art +purism +qajar art +quito school +rasquache +rayonism +realism +regionalism +remodernism +renaissance +retrofuturism +rococo +romanesque +romanticism +samikshavad +serial art +shin hanga +shock art +sōsaku hanga +socialist realism +sots art +space art +street art +stuckism +sumatraism +superflat +suprematism +surrealism +symbolism +synchromism +synthetism +tachisme (aka informel) +temporary art +toyism +transgressive art +tonalism +ukiyo-e +underground comix +unilalianism +vancouver school +vanitas +verdadism +video art +visual art +viennese actionism +vorticism diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/nationality.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/nationality.txt" new file mode 100644 index 0000000000000000000000000000000000000000..e1a6d40e3757338a993538ee19987df1e66d29d4 --- /dev/null +++ 
"b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/nationality.txt" @@ -0,0 +1,228 @@ +Afghan +Albanian +Algerian +American +Andorran +Angolan +Anguillan +Citizen of Antigua +Barbuda +Argentine +Armenian +Australian +Austrian +Azerbaijani +Bahamian +Bahraini +Bangladeshi +Barbadian +Belarusian +Belgian +Belizean +Beninese +Bermudian +Bhutanese +Bolivian +Citizen of Bosnia +Herzegovina +Botswanan +Brazilian +British +British Virgin Islander +Bruneian +Bulgarian +Burkinan +Burmese +Burundian +Cambodian +Cameroonian +Canadian +Cape Verdean +Cayman Islander +Central African +Chadian +Chilean +Chinese +Colombian +Comoran +Congolese (Congo) +Congolese (DRC) +Cook Islander +Costa Rican +Croatian +Cuban +Cymraes +Cymro +Cypriot +Czech +Danish +Djiboutian +Dominican +Citizen of the Dominican Republic +Dutch +East Timorese +Ecuadorean +Egyptian +Emirati +English +Equatorial Guinean +Eritrean +Estonian +Ethiopian +Faroese +Fijian +Filipino +Finnish +French +Gabonese +Gambian +Georgian +German +Ghanaian +Gibraltarian +Greek +Greenlandic +Grenadian +Guamanian +Guatemalan +Citizen of Guinea-Bissau +Guinean +Guyanese +Haitian +Honduran +Hong Konger +Hungarian +Icelandic +Indian +Indonesian +Iranian +Iraqi +Irish +Israeli +Italian +Ivorian +Jamaican +Japanese +Jordanian +Kazakh +Kenyan +Kittitian +Citizen of Kiribati +Kosovan +Kuwaiti +Kyrgyz +Lao +Latvian +Lebanese +Liberian +Libyan +Liechtenstein citizen +Lithuanian +Luxembourger +Macanese +Macedonian +Malagasy +Malawian +Malaysian +Maldivian +Malian +Maltese +Marshallese +Martiniquais +Mauritanian +Mauritian +Mexican +Micronesian +Moldovan +Monegasque +Mongolian +Montenegrin +Montserratian +Moroccan +Mosotho +Mozambican +Namibian +Nauruan +Nepalese +New Zealander +Nicaraguan +Nigerian +Nigerien +Niuean +North Korean +Northern Irish +Norwegian +Omani +Pakistani +Palauan +Palestinian +Panamanian +Papua New Guinean +Paraguayan +Peruvian +Pitcairn Islander +Polish +ortuguese +Prydeinig +Puerto Rican +Qatari +Romanian +Russian +Rwandan +Salvadorean +Sammarinese +Samoan +Sao Tomean +Saudi Arabian +Scottish +Senegalese +Serbian +Citizen of Seychelles +Sierra Leonean +Singaporean +Slovak +Slovenian +Solomon Islander +Somali +South African +South Korean +South Sudanese +Spanish +Sri Lankan +St Helenian +St Lucian +Stateless +Sudanese +Surinamese +Swazi +Swedish +Swiss +Syrian +Taiwanese +Tajik +Tanzanian +Thai +Togolese +Tongan +Trinidadian +Tristanian +Tunisian +Turkish +Turkmen +Turks +Caicos Islander +Tuvaluan +Ugandan +Ukrainian +Uruguayan +Uzbek +Vatican citizen +Citizen of Vanuatu +Venezuelan +Vietnamese +Vincentian +Wallisian +Welsh +Yemeni +Zambian +Zimbabwean diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/natl-park.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/natl-park.txt" new file mode 100644 index 0000000000000000000000000000000000000000..5df114323ae19cb047e0deede7d0064c66d4c2a2 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/natl-park.txt" @@ -0,0 +1,55 @@ +Acadia National Park +Arches National Park +Badlands National Park +Big Bend National Park +Biscayne National Park +Black Canyon of the Gunnison National Park +Bryce Canyon National Park +Canyonlands National Park +Capitol Reef National Park +Carlsbad Caverns National Park +Channel Islands National Park +Congaree National Park +Crater Lake National Park +Cuyahoga Valley National Park +Death Valley National Park +Denali National Park and 
Preserve +Dry Tortugas National Park +Everglades National Park +Gates of the Arctic National Park and Preserve +Glacier National Park Glacier Bay National Park and Preserve +Grand Canyon National Park +Grand Teton National Park +Great Basin National Park +Great Sand Dunes National Park and Preserve +Great Smoky Mountains National Park +Guadalupe Mountains National Park +Haleakala National Park +Hawaii Volcanoes National Park +Hot Springs National Park +Isle Royale National Park +Joshua Tree National Park +Katmai National Park and Preserve +Kenai Fjords National Park +Kings Canyon National Park +Kobuk Valley National Park +Lake Clark National Park and Preserve +Lassen Volcanic National Park +Mammoth Cave National Park +Mesa Verde National Park +Mount Rainier National Park +North Cascades National Park +Olympic National Park +Petrified Forest National Park +Redwood National and State Parks +Rocky Mountain National Park +Saguaro National Park +Sequoia National Park +Shenandoah National Park +Theodore Roosevelt National Park +Voyageurs National Park +Wind Cave National Park +Wrangell-St. Elias National Park and Preserve +Yellowstone National Park +Yosemite National Park +Zion National Park diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/neg-weight.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/neg-weight.txt" new file mode 100644 index 0000000000000000000000000000000000000000..9e4336acf4d2f55536abdb331f45e9243e30cd5c --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/neg-weight.txt" @@ -0,0 +1,32 @@ +jpeg artifact +Image compression +Distorted +Grainy +Out of Focus +Blurry +DOF +Noisy +Watermark +Text +Copyright +low resolution +shaky +too bright +too dark +Poorly lit +Pixelated +Poor quality +low quality +Unclear +Blocked +Artifacts +Banding +Truncated +Out of Frame +disjointed +incoherent +asymmetry +disorganized +jumbled +tasteless +tacky diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/new.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/new.txt" new file mode 100644 index 0000000000000000000000000000000000000000..74c5a12e45302deef998c3772e25212976a6d306 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/new.txt" @@ -0,0 +1,2 @@ +dated +halloween \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/noun-beauty.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/noun-beauty.txt" new file mode 100644 index 0000000000000000000000000000000000000000..393e8007db476df45cfa9a93aed14172b9c18072 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/noun-beauty.txt" @@ -0,0 +1,71 @@ +a perfect 10 +a vision +admirer +allure +amazement +angel +appeal +appreciation +art +artistry +artwork +awe +beautiful creature +beauty +charm +charmer +composition +cutie pie +infatuation +inner beauty +jewel +knockout +longing +look +looker +loveliness +marvel +miracle +natural beauty +object of desire +objet d’art +perfection +purity +rarity +reverence +sculpture +dish +doll +dream +dream girl +dream guy +dreamboat +elegance +enchanter +exotic beauty +eye candy +eyeful +feast for the eyes +femininity +fox +glorification +good looks +good-looking person +grace +sex appeal +sex symbol +sight +sophistication +stunner +stupor +treasure +uncontested 
beauty +universal beauty +visual feast +visual poetry +visual treat +wonder +wonderment +work of art +worshiper +wow factor diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/noun-emote.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/noun-emote.txt" new file mode 100644 index 0000000000000000000000000000000000000000..54f1128988cd8e433a77d7e09bd57d299c4454c2 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/noun-emote.txt" @@ -0,0 +1,61 @@ +Happy +Afraid +Sad +Hot +Amused +Bored +Anxious +Confident +Cold +Suspicious +Surprised +Loving +Curious +Envious +Jealous +Miserable +Confused +Stupid +Angry +Sick +Ashamed +Withdrawn +Indifferent +Sorry +Determined +Crazy +Bashful +Depressed +Enraged +Frightened +Interested +Shy +Hopeful +Regretful +Scared +Stubborn +Thirsty +Guilty +Nervous +Embarrassed +Disgusted +Proud +Ecstatic +Lonely +Frustrated +Hurt +Hungry +Tired +Smug +Thoughtful +Pained +Optimistic +Relieved +Puzzled +Shocked +Joyful +Sleepy +Excited +Skeptical +Bad +Worried diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/noun-fantasy.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/noun-fantasy.txt" new file mode 100644 index 0000000000000000000000000000000000000000..e40fee58cace87bcfdeca9e0ac5d3383e76941dc --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/noun-fantasy.txt" @@ -0,0 +1,128 @@ +Alchemy +Apprentice +Archer +Armor +Battering Ram +Black Ice +Black Magic +Breaking Point +Burn Mark +Candlelight +Castle Gate +Castle Wall +Castle +Contempt +Countermeasure +Crossroads +Crypt +Dark Elf +Dark Elves +Defender +Dragon +Dreaming +Drow Elf +Drow Elves +Dungeon +Dwarf +Dwarves +Elf +Elves +Energy +Explosion +Fang +Figure +Folklore +Gems +Gemstones +Gentle Wind +Goblin +Goblins +Gold +Golum +Griffin +Ground +Gryphon +Guard +Guardhouse +Guild +Hippogriff +Hobbit +Hobbits +Imp +Imps +Interloper +King +Knight +Knight +Lich +Limit +Magic Spell +Magical Power +Magnetic North +Mayhem +Melancholy +Mercury +Monster +Moonbeam +Mystic +Necromancer +Necromancer +Nightmare +Pandemonium +Psychic +Queen +Radiation +Resistance +Retribution +Rogue +Royal +Scroll +Scrolls +Serpent +Shadows +Shard +Shoreline +Sixth Sense +Skeleton Warrior +Skeleton +Skeletons +Society +Sorcerer +Sorceress +Sorcery +Southeast +Spellcaster +Spirit World +Spirit +Squire +Stare Down +Strike Down +Stronghold +Summoner +Sunburst +Survivor +Sword +Tarot Card +Tavern +Templar +Third Eye +Throw Back +Thunderclap +Treasure +Turning +Undead +Undersurface +Underworld +Vampire +Viper +Warlock +Warrior +Watchman +Whisperer +Wind +Witch +Wizard +Wizardry +Wood Elf +Wood Elves +World diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/noun-horror.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/noun-horror.txt" new file mode 100644 index 0000000000000000000000000000000000000000..cdc17b8a9744faf511ff11cacb87319cb03a6ee9 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/noun-horror.txt" @@ -0,0 +1,114 @@ +apparition +enchanted +ghost +ghostly +ghoulish +goblin +haunted house +haunting +hobgoblin +levitation +medium +occult +orbs +the paranormal +paranormal activity +phantasm +phantasms +phantom +poltergeist +possessed +possession +seance +specter +spirit +unearthly +wraith +potion +tombstone 
+death +poltergeist +demon +crow +ghoul +ghost story +hell +graveyard +monster +full moon +mummy +devil incarnate +vampire +make-up +spider web +screams +scary story +vampire +sorceress +pumpkin +mask +headless horseman +skulls +fantasy +horror +creature +specter +moonlight +doom +nightmare +threat +eye of newt +brains +occult +worms +haunted house +goblin +witch +bonfire +fright +werewolf +noose +shadow +lightening +bogeyman +fire & brimstone +hell-hound +intestines +darkness +evil spirit +trick +superstition +grave +graves +nightfall +spider +fangs +phantom +magic +phenomena +skeleton +spirits +pillow case +werewolf +tale of horror +fog +tomb +corpse +bones +dead body +zombie +banshee +bats +blood +terror +cemetery +haunting +murder +lightning +thunder +midnight +raven +cobweb +cobwebs +owl +night of fright diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/occupation.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/occupation.txt" new file mode 100644 index 0000000000000000000000000000000000000000..ab2f55f97d373a34fbc0f280f97f1629ec438393 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/occupation.txt" @@ -0,0 +1,133 @@ +able-bodied seaman +Activist +Air Force Intelligence +Air Force Security Forces +Air Force Weather +Air Traffic Controllers +Aircraft Maintenance Technicians +Aircrew +Airman,annalist +Anarchist +Artillery +Artilleryman +artist +athlete +barber +bicyclist +bus driver +businessmen +Calvary +carpenter +Cashier +cashier +Cavalryman +chef +client +colonel +commander +Communications Specialist +Communications +Communist +commuter +consumer +corporal +Customer Service Representative +Democrat +Department Manager +Diplomat +doctor +driver +electrician +employer +Engineer +firefighter +garbage collector +general +gunner's mate +housekeeper +Infantry +Infantryman +Intelligence Analyst +Intelligence +janitor +Judge +landscaper +Lawyer +lawyer +lawyers +leader +Liaison Officer +Libertarian +lieutenant +listener +Lobbyist +Logisticics +mail carrier +Maintenance +Marine +mechanic +Medic +Merchandiser +musician +nanny +navigator +officer +Ordnance +Pararescuemen +passenger +patient +pedestrian +Pilot +pilot +plitician +plumber +police officer +private +Progressive +Protester +Psychic +quartermaster +radarman +Recon +Reconnaissance Specialist +registered nurse +Republican +Retail Associate +Retail Clerk +Retail Sales Associate +Retail Stocker +Retail Supervisor +Sailor +seaman +sergeant +shopper +Sniper +Socialist +software engineer +Soldier +sonarman +speaker +specialist +SpecOps +Spy +Store Manager +storekeeper +strategist +student +Supply +tactician +Tank Crewman +Tanker +taxi driver +teacher +Terrorist +Traditionalist +trooper +Uber driver +waiter +waitress +web developer +Wingman +worker +writer +yeoman diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/photo-term.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/photo-term.txt" new file mode 100644 index 0000000000000000000000000000000000000000..282d189b947e1f30fa3586588e09d395156ce8d4 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/photo-term.txt" @@ -0,0 +1,9 @@ +Bokeh +Depth of Field +DoF +Digital Camera +Optical Camera +Long Exposure +Short Exposure +Over Exposed +Focused diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/pop-culture.txt" 
"b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/pop-culture.txt" new file mode 100644 index 0000000000000000000000000000000000000000..13de1a9652d8d1c247aeafd0888e2f0d3885cb65 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/pop-culture.txt" @@ -0,0 +1,174 @@ +3DO Interactive Multiplayer +A Clockwork Orange (1971) +Adventure Time +Agent 007 +Ahh! Real Monsters +Alien (1979) +Aliens (1986) +Animaniacs +Atari +Attack on Titan +Batman +Batman: The Animated Series +BeastWars +Beetlejuice +Ben Hur +Black Lagoon +Blade Runner +Blazing Saddles +Bleach +Braveheart (1995) +Breakfast at Tiffany's +Call of Duty +Call of Duty Modern Warfare +Casablanca +CatDog +Chip 'n Dale: Rescue Rangers +Close Encounters of the Third Kind +Code Geass +Comic-Con +Contact (1997) +Courage the Cowardly Dog +Cowboy Bebop +Danger Mouse +Dark Souls +Darkwing Duck +Death Note +Demons Souls +Dexter's Laboratory +Die Hard +Digimon +Dragon Ball +Dragon Ball GT +Dragon Ball Super +Dragon Ball Z +E.T. the Extra-Terrestrial (1982) +Earthworm Jim +Elden Ring +Ewoks +FLCL +Fooly Cooly +Fortnite +Full Metal Jacket (1987) +Fullmetal Alchemist +Futurama +Game Boy +Game Boy Advance +Game Boy Color +Game Boy SP +Gargoyles +George Lucas +Ghostbusters +Goldfinger +Gone with the Wind (1939) +Gundam +He-Man and the Masters of the Universe +Hellsing +Inuyasha +James Bond +Johnny Bravo +Jurassic Park +Jurassic Park III +Jurassic World +Jurassic World Dominion +Jurassic World Fallen Kingdom +Justice League +King Kong +M.A.S.H. +MASH +Mad Max +Mad Max 2: The Road Warrior +Mad Max Beyond Thunderdome +Mad Max: Fury Road +Magic: The Gathering +MechWarrior +Mega Man +Megan Man X +Metabots +Michael Crichton +Mighty Max +Naruto +Neon Genesis Evangelion +Nintendo +Nintendo 3DS +Nintendo 64 +Nintendo DS +Nintendo Entertainment System +Nintendo Switch +Nintendo Wii +Nintendo WiiU +Overwatch +PAX +PC Engine +Pinky and the Brain +Planet of the Apes +Pokemon +Poltergeist (1982) +Pulp Fiction +Raiders of the Lost Ark (1981) +Rainbow Brite +ReBoot (1994) +Regular Show +Robotech +Rocko’s Modern Life +Rugrats +SEGA +Samurai Jack +Scary Stories to Tell in the Dark +She-Ra: Princess of Power +Singing in the Rain +Sony Playstation +Sony Playstation 2 +Sony Playstation 3 +Sony Playstation 4 +Sony Playstation 5 +Space Dandy +Spider-Man +Spiderman +Star Wars (1977) +Star Wars: Episode V - The Empire Strikes Back (1980) +Starcraft +Steven King +Steven Spielberg +Stranger Things +Street Sharks +Super Mario Bros. 
+Super Nintendo +Superman +T-1000 +T-600 +T-800 +TaleSpin +Teenage Mutant Ninja Turtles +Terminator 3 +Terminator Dark Fate +Terminator Genysis +Terminator Salvation +The Day the Earth Stood Still +The Day the Earth Stood Still (1951) +The Godfather +The Grim Adventures of Billy and Mandy +The Hulk +The Lord of the Rings: The Fellowship of the Ring (2001) +The Lost World +The Matrix +The New Adventures of Johnny Quest +The Poseidon Adventure (1972) +The Ren & Stimpy Show +The Simpsons +The Terminator +The Terminator 2: Judgement Day +The Thing from Another World (1951) +The Wizard of Oz (1939) +Thundercats +Titanic (1997) +Total Recall +Toxic Crusaders +Transformers +Trigun +War for the Planet of the Apes +WarCraft +Wizard of the Coast +X-Men +Xenomorphs +Yu Yu Hakusho diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/pop-location.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/pop-location.txt" new file mode 100644 index 0000000000000000000000000000000000000000..b3ce00494e854a0ff3d9a3b37d32502ede9c755d --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/pop-location.txt" @@ -0,0 +1,77 @@ +The Grand Canyon, Arizona +Rome, Italy +Maui, Hawaii +Masai Mara, Kenya +New Zealand +Istanbul, Turkey +Angkor Wat, Cambodia +London, England +Manhattan, New York +Paris, France +Montreal, Canada +San Francisco, California +Hong Kong +Marrakesh, Morocco +Cartagena, Columbia +Bali, Indonesia +New Orleans, Louisiana +Kerry, Ireland +Marrakesh, Morocco +The Maldives +Cape Town, South Africa +Dubai, U.A.E. +Bora Bora, French Polynesia +Dubrovnik, Croatia +Edinburgh, Scotland +Paro Valley, Bhutan +Jaipur, India +Waikato, New Zealand +Havana, Cuba +Tokyo, Japan +Antarctica +Vancouver, Canada +Los Angeles, California +Kruger National Park, South Africa +Santorini, Greece +Moscow, Russia +Singapore +London, England +Rio de Janeiro, Brazil +Petra, Jordan +Hong Kong +Barbabos +Amsterdam, Netherlands +Santiago, Chile +Cairo, Egypt +Copenhagen, Denmark +Seoul, Korea +Laucala Island Resort, Fiji +Providencia, Colombia +Machu Picchu, Peru +Virunga National Park, Democratic Republic of Congo +Lisbon, Portugal +Hanoi, Vietnam +Hawaii +Ibiza, Spain +Beijing, China +Budapest, Hungary +Cinque Terre, Italy +Buenos Aires, Argentina +Las Vegas +Matterhorn, Switzerland +Tahiti +Turks & Caicos, Dominican Republic +Phuket, Thailand +Glacier National Park, Montana +Banff, Alberta +Costa Rica +Machu Picchu, Peru +Sydney, Australia +Lucia, Caribbean +Great Barrier Reef, Australia +Florence, Italy +Santorini, Greece +Yellowstone +Yosemite National Park, California +Amalfi Coast, Italy +Victoria Falls diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/punk.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/punk.txt" new file mode 100644 index 0000000000000000000000000000000000000000..c32da119fe1d1db60a5fe76f5c553d7468b6d1ed --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/punk.txt" @@ -0,0 +1,20 @@ +atompunk +biopunk +clockpunk +cyberpunk +dieselpunk +fantasy punk +frostpunk +gothpunk +magicpunk +mythpunk +nanopunk +post-apocalypticpunk +slipstream +solarpunk +space punk +splatterpunk +steampunk +steelpunk +stonepunk +weirdpunk diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/pussy/pussy_adornments.txt" 
"b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/pussy/pussy_adornments.txt" new file mode 100644 index 0000000000000000000000000000000000000000..fca057a9b0d877413799d5dca2fe283b1cf2c076 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/pussy/pussy_adornments.txt" @@ -0,0 +1,2 @@ +Clitoris piercing +Clitoris ring \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/pussy/pussy_anatomy.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/pussy/pussy_anatomy.txt" new file mode 100644 index 0000000000000000000000000000000000000000..ce4711113c76d7d19e55a497223707ff791fcd69 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/pussy/pussy_anatomy.txt" @@ -0,0 +1,8 @@ +Clitoris +Erect clitoris +Cleft of venus +Labia +Fat mons +Pubic hair +Pussy +No pussy \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/pussy/pussy_attire.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/pussy/pussy_attire.txt" new file mode 100644 index 0000000000000000000000000000000000000000..42bed55ad5aa4b5f88a70582c52e3c633d880feb --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/pussy/pussy_attire.txt" @@ -0,0 +1,5 @@ +Bandaid on pussy +Maebari +Panties +Strapless bottom +Swimsuit \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/pussy/pussy_fluids.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/pussy/pussy_fluids.txt" new file mode 100644 index 0000000000000000000000000000000000000000..9beb83a50474868bc667262d4c0d4bcdee2ab975 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/pussy/pussy_fluids.txt" @@ -0,0 +1,5 @@ +Cum in pussy +Cum on pussy +Pussy juice +Pussy juice puddle +Pussy juice trail \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/pussy/pussy_hands_on.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/pussy/pussy_hands_on.txt" new file mode 100644 index 0000000000000000000000000000000000000000..3c86ae1d7150725b2710bfe72350a6c969540500 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/pussy/pussy_hands_on.txt" @@ -0,0 +1,4 @@ +Clitoris tweak +Covering crotch +Hand between legs +Spread pussy \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/pussy/pussy_sexual_objects_and_acts.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/pussy/pussy_sexual_objects_and_acts.txt" new file mode 100644 index 0000000000000000000000000000000000000000..c00237244756c17af333270743c353de7b59819c --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/pussy/pussy_sexual_objects_and_acts.txt" @@ -0,0 +1,3 @@ +Vaginal +Gun in pussy +Phone in pussy \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/pussy/pussy_under_clothes.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/pussy/pussy_under_clothes.txt" new file mode 100644 index 
0000000000000000000000000000000000000000..d36173a6a0d6f70af787ee11acf41535142e60d5 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/pussy/pussy_under_clothes.txt" @@ -0,0 +1,2 @@ +Cameltoe +Spread pussy under clothes \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/pussy/pussy_visible_parts.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/pussy/pussy_visible_parts.txt" new file mode 100644 index 0000000000000000000000000000000000000000..fea0ff0a70b9de494a7130466d7b7ff8c18fb9dd --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/pussy/pussy_visible_parts.txt" @@ -0,0 +1,3 @@ +Clitoris slip +Pussy peek +Pussy peek \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/put wildcards here.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/put wildcards here.txt" new file mode 100644 index 0000000000000000000000000000000000000000..e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/scenario-desc.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/scenario-desc.txt" new file mode 100644 index 0000000000000000000000000000000000000000..f221f3473f078222a6f3089d612af944a0f11583 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/scenario-desc.txt" @@ -0,0 +1,62 @@ +A calm and peaceful place with green grasses, a blue sky and lots of beauty everywhere +A city abandoned for years +A cramped space with a lot of rubble and little space to move around +A dark and evil land +A dry place like an empty desert +A frozen place +A futuristic place full of cables with lots of neon lights and led lighting +A Hawaiian-influenced environment and some surfboards +A hurricane-ravaged environment with lots of broken glass and scattered items +A land full of bones and skeletons of people who once fought there +A land full of slime moss stagnant water and insects +A land full of war trenches with lots of wreckage and remnants of war +A land where nothing makes sense and everything is out of place +A land with many riches and treasures +A maintenance environment with black and yellow stripes signaling construction +A mountainous location full of icy boulders and snow +A mountainous location full of jagged rocks and boulders +A musical concert environment +A place battered by time and natural disasters +A place flooded by the sea and with beach sand +A place forgotten by the world +A place full of gardens suspended by ropes and chains and butterflies with little details in everything +A place full of my ancestral past and several vessels with trapped souls +A place full of sand scattered with dry branches and very arid +A place infested with bugs and worms with decaying woods +A place located in the year 1900 with influence of Baroque and Neo-Gothic architecture +A place off planet earth in a galaxy far far away +A place taken by a flood with everyday objects floating and cars submerged in the water +A place taken by nature full of mosses and vegetation +A place taken over by mushrooms and tree roots +A place that was once inhabited by beings from another world +A place with a lot of glass details in silver and gold +A place with a morbid look that resembles a funeral a little foggy +A 
place with a very vast nature with big trees and lots of vegetation +A place with everything made of wood with a little old paint peeling off and lots of ingrown tribal details +A place with lots of leaves on the ground and autumn colors +A place with oil stains on the floor with remains of bodywork and screws and a lot of rusty things +A private club with a water slide and beach chairs by the pool where every table has nice drinks +A scene all frozen and full of snow with stalactites +A space environment with intergalactic and cosmic details +A space station with many technological resources and many drawers +A sunny place with a holiday and childhood vibe +A tropical and latin place with lots of green and beach +A tropical environment with a party atmosphere and a lot of wealth +A valley forgotten by time +A vast archaeological site located in the south region of Italy that was buried by meters of ash and pumice after a catastrophic eruption +A venue full of marble and gold detailing with reflective flooring +A very clean and well organized contemporary place +A very dirty and littered place. +A very humid place full of puddles and leaks +A very trendy and frequented place with footprints everywhere and wear on the floor +An environment that has just been waxed and carefully cleaned where everything is shiny and organized +An environment with a very vast and very humid fauna +An environment with an oriental look and solid design ornaments +An environment with Aztec tendencies and many sculptures with masks on the wall +An environment with glass windows and chandelier lighting +An environment with lights like a disco and dance floor +An environment with rustic furniture and a rustic country farm look +An environment with tombstones and a similar look to a cemetery with some bats and cockroaches +An environment with wabi-sabi trends and furniture made by famous designers +An old place full of cracks, misplaced things and broken pieces +An ornate venue with ornate floors and walls covered in tapestry diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/site.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/site.txt" new file mode 100644 index 0000000000000000000000000000000000000000..46e6b3325dd17667433dfb3653f3c07987c26da0 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/site.txt" @@ -0,0 +1,17 @@ +trending on ArtStation +trending on ConceptArtWorld +trending on Flickr +trending on Behance +trending on 500px +trending on Getty Images +trending on DeviantArt +trending on Pixiv +trending on IllustrationX +trending on CGSociety +trending on Unsplash +trending on Google Images +trending on Pinterest +trending on Sketchfab +trending on Artsy +National Gallery of Art Highlights +trending on Saatchi Art diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/style.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/style.txt" new file mode 100644 index 0000000000000000000000000000000000000000..93487152e70eb423afa37bca3334bd82b75f3c71 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/style.txt" @@ -0,0 +1,45 @@ +Tempera +Oil Paints +Acrylic Paints +Watercolors +Charcoals +Pastels +Chalks +Graphite Pencils +Color Pencils +Ink and Pen +Digital Paintings +Vector Art +3D Renderings +3D Path Tracing +3D Ray Tracing +3D Sculpting +Matte Paintings +Digital Matte Paintings +2D 
Animation +Concept Artist +Fractal Art +Algorithmic Art +Data-Moshing +Dynamic Paintings +Digital Collage +2D Digital Paintings +3D Digital Paintings +Integrated Arts +Mixed-Medium Arts +Raster Paintings +Miniature +Diorama +Terrarium +Done in LEGO +Paper Cut Out +Voxel +Pixel Art +Low Poly +PSX GRAPHICS +Retro Graphics +Generative Art +Play-Doh +Botanical Drawing +Technical drawing +Optical Illusion diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/trippy.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/trippy.txt" new file mode 100644 index 0000000000000000000000000000000000000000..bcb884993efa5460b68e6c6891c590ee26a98ec5 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/trippy.txt" @@ -0,0 +1,73 @@ +abstract fractals +abstract fractals +astral +c4d abstract fractals +celestial +chroma +consciousness-expanding +consciousness-expanding +coruscating +crazy-looking +dazed +delirium +effulgence +eidetic +empyrean +entoptic fractals +entoptic patterns +entoptic phenomena +ether +euneirophrenia +experimental +fractal mandelbrot +fractals +fractoids +freaky +freaky-looking +geometric fractals +geometric patterns +geometric shapes +gorgonize +hallucinatory +hallucinogen +hallucinogenic +iridescent +kaleidoid +kaleidoscopic +lambent +lustrous +mind-bending +mind-blowing +mind-changing +mind-expanding +mind-expanding drug +multi-dimensional +multicolored +multidimensional +mystical fractals +nacreous +nebulous +nimbus +nubivagant +numinous +oneironaut +opalescent +phantasmagoria +phantasmal +prismatic +prisms +psychedelic +psychoactive +psychotomimetic +psychotomimetic +psychotropic +spaced-out +spacy +specious +sublunary +supernal +supramundane +transient +triangular polygonal +trippy +verisimilitude diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/wh-site.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/wh-site.txt" new file mode 100644 index 0000000000000000000000000000000000000000..274eec2546e13e3be72467a8a93e1953342fc4ee --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\205\266\344\273\226/wh-site.txt" @@ -0,0 +1,1244 @@ +Minaret and Archaeological Remains of Jam +Cultural Landscape and Archaeological Remains of the Bamiyan Valley +Natural and Cultural Heritage of the Ohrid region +Butrint +Historic Centres of Berat and Gjirokastra +Al Qal'a of Beni Hammad +Djémila +M'Zab Valley +Tassili n'Ajjer +Timgad +Tipasa +Kasbah of Algiers +Madriu-Perafita-Claror Valley +Mbanza Kongo, Vestiges of the Capital of the former Kingdom of Kongo +Antigua Naval Dockyard and Related Archaeological Sites +Los Glaciares National Park +Jesuit Missions of the Guaranis: San Ignacio Mini, Santa Ana, Nuestra Señora de Loreto and Santa Maria Mayor (Argentina), Ruins of Sao Miguel das Missoes (Brazil) +Iguazu National Park +Cueva de las Manos, Río Pinturas +Península Valdés +Ischigualasto / Talampaya Natural Parks +Jesuit Block and Estancias of Córdoba +Quebrada de Humahuaca +Qhapaq Ñan, Andean Road System +The Architectural Work of Le Corbusier +Los Alerces National Park +Monasteries of Haghpat and Sanahin +Cathedral and Churches of Echmiatsin and the Archaeological Site of Zvartnots +Monastery of Geghard and the Upper Azat Valley +Great Barrier Reef +Kakadu National Park +Willandra Lakes Region +Lord Howe Island Group +Tasmanian Wilderness +Gondwana Rainforests of Australia +Uluru-Kata Tjuta National 
Park +Wet Tropics of Queensland +Shark Bay, Western Australia +K’gari (Fraser Island) +Australian Fossil Mammal Sites (Riversleigh / Naracoorte) +Heard and McDonald Islands +Macquarie Island +Greater Blue Mountains Area +Purnululu National Park +Royal Exhibition Building and Carlton Gardens +Sydney Opera House +Australian Convict Sites +Ningaloo Coast +Budj Bim Cultural Landscape +Historic Centre of the City of Salzburg +Palace and Gardens of Schönbrunn +Hallstatt-Dachstein / Salzkammergut Cultural Landscape +Semmering Railway +City of Graz – Historic Centre and Schloss Eggenberg +Wachau Cultural Landscape +Historic Centre of Vienna +Fertö / Neusiedlersee Cultural Landscape +Ancient and Primeval Beech Forests of the Carpathians and Other Regions of Europe +Prehistoric Pile Dwellings around the Alps +Frontiers of the Roman Empire – The Danube Limes (Western Segment) +The Great Spa Towns of Europe +Walled City of Baku with the Shirvanshah's Palace and Maiden Tower +Gobustan Rock Art Cultural Landscape +Historic Centre of Sheki with the Khan’s Palace +Qal’at al-Bahrain – Ancient Harbour and Capital of Dilmun +Pearling, Testimony of an Island Economy +Dilmun Burial Mounds +Historic Mosque City of Bagerhat +Ruins of the Buddhist Vihara at Paharpur +The Sundarbans +Historic Bridgetown and its Garrison +Białowieża Forest +Mir Castle Complex +Architectural, Residential and Cultural Complex of the Radziwill Family at Nesvizh +Struve Geodetic Arc +Flemish Béguinages +La Grand-Place, Brussels +The Four Lifts on the Canal du Centre and their Environs, La Louvière and Le Roeulx (Hainaut) +Belfries of Belgium and France +Historic Centre of Brugge +Major Town Houses of the Architect Victor Horta (Brussels) +Neolithic Flint Mines at Spiennes (Mons) +Notre-Dame Cathedral in Tournai +Plantin-Moretus House-Workshops-Museum Complex +Ancient and Primeval Beech Forests of the Carpathians and Other Regions of Europe +Stoclet House +Major Mining Sites of Wallonia +The Architectural Work of Le Corbusier +Colonies of Benevolence +The Great Spa Towns of Europe +Belize Barrier Reef Reserve System +Royal Palaces of Abomey +W-Arly-Pendjari Complex +City of Potosí +Jesuit Missions of the Chiquitos +Historic City of Sucre +Fuerte de Samaipata +Noel Kempff Mercado National Park +Tiwanaku: Spiritual and Political Centre of the Tiwanaku Culture +Qhapaq Ñan, Andean Road System +Old Bridge Area of the Old City of Mostar +Ancient and Primeval Beech Forests of the Carpathians and Other Regions of Europe +Mehmed Paša Sokolović Bridge in Višegrad +Stećci Medieval Tombstone Graveyards +Tsodilo +Okavango Delta +Historic Town of Ouro Preto +Historic Centre of the Town of Olinda +Jesuit Missions of the Guaranis: San Ignacio Mini, Santa Ana, Nuestra Señora de Loreto and Santa Maria Mayor (Argentina), Ruins of Sao Miguel das Missoes (Brazil) +Historic Centre of Salvador de Bahia +Sanctuary of Bom Jesus do Congonhas +Iguaçu National Park +Brasilia +Serra da Capivara National Park +Historic Centre of São Luís +Atlantic Forest South-East Reserves +Discovery Coast Atlantic Forest Reserves +Historic Centre of the Town of Diamantina +Central Amazon Conservation Complex +Pantanal Conservation Area +Brazilian Atlantic Islands: Fernando de Noronha and Atol das Rocas Reserves +Cerrado Protected Areas: Chapada dos Veadeiros and Emas National Parks +Historic Centre of the Town of Goiás +São Francisco Square in the Town of São Cristóvão +Rio de Janeiro: Carioca Landscapes between the Mountain and the Sea +Pampulha Modern Ensemble +Valongo Wharf 
Archaeological Site +Paraty and Ilha Grande – Culture and Biodiversity +Sítio Roberto Burle Marx +Boyana Church +Madara Rider +Rock-Hewn Churches of Ivanovo +Thracian Tomb of Kazanlak +Ancient City of Nessebar +Pirin National Park +Rila Monastery +Srebarna Nature Reserve +Thracian Tomb of Sveshtari +Ancient and Primeval Beech Forests of the Carpathians +W-Arly-Pendjari Complex * +Ruins of Loropéni +Ancient Ferrous Metallurgy Sites of Burkina Faso +Cidade Velha, Historic Centre of Ribeira Grande +Angkor +Temple of Preah Vihear +Temple Zone of Sambor Prei Kuk, Archaeological Site of Ancient Ishanapura +Dja Faunal Reserve +Sangha Trinational +L’Anse aux Meadows National Historic Site +Nahanni National Park +Dinosaur Provincial Park +Kluane / Wrangell-St. Elias / Glacier Bay / Tatshenshini-Alsek +Head-Smashed-In Buffalo Jump +SGang Gwaay +Wood Buffalo National Park +Canadian Rocky Mountain Parks +Historic District of Old Québec +Gros Morne National Park +Old Town Lunenburg +Waterton Glacier International Peace Park +Miguasha National Park +Rideau Canal +Joggins Fossil Cliffs +Landscape of Grand Pré +Red Bay Basque Whaling Station +Mistaken Point +Pimachiowin Aki +Writing-on-Stone / Áísínai’pi +Manovo-Gounda St Floris National Park +Sangha Trinational +Lakes of Ounianga +Ennedi Massif: Natural and Cultural Landscape +Rapa Nui National Park +Churches of Chiloé +Historic Quarter of the Seaport City of Valparaíso +Humberstone and Santa Laura Saltpeter Works +Sewell Mining Town +Qhapaq Ñan, Andean Road System +Settlement and Artificial Mummification of the Chinchorro Culture in the Arica and Parinacota Region +Imperial Palaces of the Ming and Qing Dynasties in Beijing and Shenyang +Mausoleum of the First Qin Emperor +Mogao Caves +Mount Taishan +Peking Man Site at Zhoukoudian +The Great Wall +Mount Huangshan +Huanglong Scenic and Historic Interest Area +Jiuzhaigou Valley Scenic and Historic Interest Area +Wulingyuan Scenic and Historic Interest Area +Ancient Building Complex in the Wudang Mountains +Historic Ensemble of the Potala Palace, Lhasa +Mountain Resort and its Outlying Temples, Chengde +Temple and Cemetery of Confucius and the Kong Family Mansion in Qufu +Lushan National Park +Mount Emei Scenic Area, including Leshan Giant Buddha Scenic Area +Ancient City of Ping Yao +Classical Gardens of Suzhou +Old Town of Lijiang +Summer Palace, an Imperial Garden in Beijing +Temple of Heaven: an Imperial Sacrificial Altar in Beijing +Dazu Rock Carvings +Mount Wuyi +Ancient Villages in Southern Anhui – Xidi and Hongcun +Imperial Tombs of the Ming and Qing Dynasties +Longmen Grottoes +Mount Qingcheng and the Dujiangyan Irrigation System +Yungang Grottoes +Three Parallel Rivers of Yunnan Protected Areas +Capital Cities and Tombs of the Ancient Koguryo Kingdom +Historic Centre of Macao +Sichuan Giant Panda Sanctuaries - Wolong, Mt Siguniang and Jiajin Mountains +Yin Xu +Kaiping Diaolou and Villages +South China Karst +Fujian Tulou +Mount Sanqingshan National Park +Mount Wutai +China Danxia +Historic Monuments of Dengfeng in The Centre of Heaven and Earth +West Lake Cultural Landscape of Hangzhou +Chengjiang Fossil Site +Site of Xanadu +Cultural Landscape of Honghe Hani Rice Terraces +Xinjiang Tianshan +Silk Roads: the Routes Network of Chang'an-Tianshan Corridor +The Grand Canal +Tusi Sites +Hubei Shennongjia +Zuojiang Huashan Rock Art Cultural Landscape +Kulangsu, a Historic International Settlement +Qinghai Hoh Xil +Fanjingshan +Archaeological Ruins of Liangzhu City +Migratory Bird Sanctuaries along the 
Coast of Yellow Sea-Bohai Gulf of China +Quanzhou: Emporium of the World in Song-Yuan China +Port, Fortresses and Group of Monuments, Cartagena +Los Katíos National Park +Historic Centre of Santa Cruz de Mompox +National Archeological Park of Tierradentro +San Agustín Archaeological Park +Malpelo Fauna and Flora Sanctuary +Coffee Cultural Landscape of Colombia +Qhapaq Ñan, Andean Road System +Chiribiquete National Park The Maloca of the Jaguar +Sangha Trinational +Talamanca Range-La Amistad Reserves / La Amistad National Park +Cocos Island National Park +Area de Conservación Guanacaste +Precolumbian Chiefdom Settlements with Stone Spheres of the Diquís +Mount Nimba Strict Nature Reserve +Taï National Park +Comoé National Park +Historic Town of Grand-Bassam +Sudanese style mosques in northern Côte d’Ivoire +Historical Complex of Split with the Palace of Diocletian +Old City of Dubrovnik +Plitvice Lakes National Park +Episcopal Complex of the Euphrasian Basilica in the Historic Centre of Poreč +Historic City of Trogir +The Cathedral of St James in Šibenik +Stari Grad Plain +Stećci Medieval Tombstone Graveyards +Venetian Works of Defence +Old Havana and its Fortification System +Trinidad and the Valley de los Ingenios +San Pedro de la Roca Castle, Santiago de Cuba +Desembarco del Granma National Park +Viñales Valley +Archaeological Landscape of the First Coffee Plantations in the South-East of Cuba +Alejandro de Humboldt National Park +Urban Historic Centre of Cienfuegos +Historic Centre of Camagüey +Paphos +Painted Churches in the Troodos Region +Choirokoitia +Historic Centre of Český Krumlov +Historic Centre of Prague +Historic Centre of Telč +Pilgrimage Church of St John of Nepomuk at Zelená Hora +Kutná Hora: Historical Town Centre with the Church of St Barbara and the Cathedral of Our Lady at Sedlec +Lednice-Valtice Cultural Landscape +Gardens and Castle at Kroměříž +Holašovice Historic Village +Litomyšl Castle +Holy Trinity Column in Olomouc +Tugendhat Villa in Brno +Jewish Quarter and St Procopius' Basilica in Třebíč +Ancient and Primeval Beech Forests of the Carpathians and Other Regions of Europe +Erzgebirge/Krušnohoří Mining Region +Landscape for Breeding and Training of Ceremonial Carriage Horses at Kladruby nad Labem +The Great Spa Towns of Europe +Complex of Koguryo Tombs +Historic Monuments and Sites in Kaesong +Virunga National Park +Kahuzi-Biega National Park +Garamba National Park +Salonga National Park +Okapi Wildlife Reserve +Jelling Mounds, Runic Stones and Church +Roskilde Cathedral +Kronborg Castle +Ilulissat Icefjord +Wadden Sea +Stevns Klint +Christiansfeld, a Moravian Church Settlement +The par force hunting landscape in North Zealand +Kujataa Greenland: Norse and Inuit Farming at the Edge of the Ice Cap +Aasivissuit – Nipisat. 
Inuit Hunting Ground between Ice and Sea +Morne Trois Pitons National Park +Colonial City of Santo Domingo +City of Quito +Galápagos Islands +Sangay National Park +Historic Centre of Santa Ana de los Ríos de Cuenca +Qhapaq Ñan, Andean Road System +Abu Mena +Ancient Thebes with its Necropolis +Historic Cairo +Memphis and its Necropolis – the Pyramid Fields from Giza to Dahshur +Nubian Monuments from Abu Simbel to Philae +Saint Catherine Area +Wadi Al-Hitan (Whale Valley) +Joya de Cerén Archaeological Site +Asmara: A Modernist African City +Historic Centre (Old Town) of Tallinn +Struve Geodetic Arc +Rock-Hewn Churches, Lalibela +Simien National Park +Fasil Ghebbi, Gondar Region +Aksum +Lower Valley of the Awash +Lower Valley of the Omo +Tiya +Harar Jugol, the Fortified Historic Town +Konso Cultural Landscape +Levuka Historical Port Town +Fortress of Suomenlinna +Old Rauma +Petäjävesi Old Church +Verla Groundwood and Board Mill +Bronze Age Burial Site of Sammallahdenmäki +High Coast / Kvarken Archipelago +Struve Geodetic Arc +Chartres Cathedral +Mont-Saint-Michel and its Bay +Palace and Park of Versailles +Prehistoric Sites and Decorated Caves of the Vézère Valley +Vézelay, Church and Hill +Amiens Cathedral +Arles, Roman and Romanesque Monuments +Cistercian Abbey of Fontenay +Palace and Park of Fontainebleau +Roman Theatre and its Surroundings and the Triumphal Arch of Orange +From the Great Saltworks of Salins-les-Bains to the Royal Saltworks of Arc-et-Senans +Abbey Church of Saint-Savin sur Gartempe +Gulf of Porto: Calanche of Piana, Gulf of Girolata, Scandola Reserve +Place Stanislas, Place de la Carrière and Place d'Alliance in Nancy +Pont du Gard (Roman Aqueduct) +Strasbourg, Grande-Île and Neustadt +Cathedral of Notre-Dame, Former Abbey of Saint-Rémi and Palace of Tau, Reims +Paris, Banks of the Seine +Bourges Cathedral +Historic Centre of Avignon: Papal Palace, Episcopal Ensemble and Avignon Bridge +Canal du Midi +Historic Fortified City of Carcassonne +Pyrénées - Mont Perdu +Historic Site of Lyon +Routes of Santiago de Compostela in France +Belfries of Belgium and France +Jurisdiction of Saint-Emilion +The Loire Valley between Sully-sur-Loire and Chalonnes +Provins, Town of Medieval Fairs +Le Havre, the City Rebuilt by Auguste Perret +Ancient and Primeval Beech Forests of the Carpathians +Bordeaux, Port of the Moon +Fortifications of Vauban +Lagoons of New Caledonia: Reef Diversity and Associated Ecosystems +Episcopal City of Albi +Pitons, cirques and remparts of Reunion Island +Prehistoric Pile Dwellings around the Alps +The Causses and the Cévennes, Mediterranean agro-pastoral Cultural Landscape +Nord-Pas de Calais Mining Basin +Decorated Cave of Pont d’Arc, known as Grotte Chauvet-Pont d’Arc, Ardèche +Champagne Hillsides, Houses and Cellars +The Climats, terroirs of Burgundy +The Architectural Work of Le Corbusier, an Outstanding Contribution to the Modern Movement +Taputapuātea +Chaîne des Puys - Limagne fault tectonic arena +French Austral Lands and Seas +Cordouan Lighthouse +Nice, Winter Resort Town of the Riviera +Ecosystem and Relict Cultural Landscape of Lopé-Okanda +Ivindo National Park +Kunta Kinteh Island and Related Sites +Stone Circles of Senegambia +Gelati Monastery +Historical Monuments of Mtskheta +Upper Svaneti +Colchic Rainforests and Wetlands +Aachen Cathedral +Speyer Cathedral +Würzburg Residence with the Court Gardens and Residence Square +Pilgrimage Church of Wies +Castles of Augustusburg and Falkenlust at Brühl +St Mary's Cathedral and St Michael's Church at 
Hildesheim +Roman Monuments, Cathedral of St Peter and Church of Our Lady in Trier +Frontiers of the Roman Empire +Hanseatic City of Lübeck +Palaces and Parks of Potsdam and Berlin +Abbey and Altenmünster of Lorsch +Mines of Rammelsberg, Historic Town of Goslar and Upper Harz Water Management System +Maulbronn Monastery Complex +Town of Bamberg +Collegiate Church, Castle and Old Town of Quedlinburg +Völklingen Ironworks +Messel Pit Fossil Site +Bauhaus and its Sites in Weimar, Dessau and Bernau +Cologne Cathedral +Luther Memorials in Eisleben and Wittenberg +Classical Weimar +Museumsinsel (Museum Island), Berlin +Wartburg Castle +Garden Kingdom of Dessau-Wörlitz +Monastic Island of Reichenau +Zollverein Coal Mine Industrial Complex in Essen +Historic Centres of Stralsund and Wismar +Upper Middle Rhine Valley +Muskauer Park / Park Mużakowski +Town Hall and Roland on the Marketplace of Bremen +Old town of Regensburg with Stadtamhof +Ancient and Primeval Beech Forests of the Carpathians +Berlin Modernism Housing Estates +Wadden Sea +Fagus Factory in Alfeld +Prehistoric Pile Dwellings around the Alps +Margravial Opera House Bayreuth +Bergpark Wilhelmshöhe +Carolingian Westwork and Civitas Corvey +Speicherstadt and Kontorhaus District with Chilehaus +The Architectural Work of Le Corbusier, an Outstanding Contribution to the Modern Movement +Caves and Ice Age Art in the Swabian Jura +Archaeological Border complex of Hedeby and the Danevirke +Naumburg Cathedral +Erzgebirge/Krušnohoří Mining Region +Water Management System of Augsburg +Frontiers of the Roman Empire – The Danube Limes (Western Segment) +Frontiers of the Roman Empire – The Lower German Limes +Mathildenhöhe Darmstadt +ShUM Sites of Speyer, Worms and Mainz +The Great Spa Towns of Europe +Forts and Castles, Volta, Greater Accra, Central and Western Regions +Asante Traditional Buildings +Temple of Apollo Epicurius at Bassae +Acropolis, Athens +Archaeological Site of Delphi +Medieval City of Rhodes +Meteora +Mount Athos +Paleochristian and Byzantine Monuments of Thessalonika +Sanctuary of Asklepios at Epidaurus +Archaeological Site of Mystras +Archaeological Site of Olympia +Delos +Monasteries of Daphni, Hosios Loukas and Nea Moni of Chios +Pythagoreion and Heraion of Samos +Archaeological Site of Aigai (modern name Vergina) +Archaeological Sites of Mycenae and Tiryns +The Historic Centre (Chorá) with the Monastery of Saint-John the Theologian and the Cave of the Apocalypse on the Island of Pátmos +Old Town of Corfu +Archaeological Site of Philippi +Antigua Guatemala +Tikal National Park +Archaeological Park and Ruins of Quirigua +Mount Nimba Strict Nature Reserve +National History Park – Citadel, Sans Souci, Ramiers +Historic Centre of Rome, the Properties of the Holy See in that City Enjoying Extraterritorial Rights and San Paolo Fuori le Mura +Vatican City +Maya Site of Copan +Río Plátano Biosphere Reserve +Budapest, including the Banks of the Danube, the Buda Castle Quarter and Andrássy Avenue +Old Village of Hollókő and its Surroundings +Caves of Aggtelek Karst and Slovak Karst +Millenary Benedictine Abbey of Pannonhalma and its Natural Environment +Hortobágy National Park - the Puszta +Early Christian Necropolis of Pécs (Sopianae) +Fertö / Neusiedlersee Cultural Landscape +Tokaj Wine Region Historic Cultural Landscape +Þingvellir National Park +Surtsey +Vatnajökull National Park - Dynamic Nature of Fire and Ice +Agra Fort +Ajanta Caves +Ellora Caves +Taj Mahal +Group of Monuments at Mahabalipuram +Sun Temple, Konârak +Kaziranga 
National Park +Keoladeo National Park +Manas Wildlife Sanctuary +Churches and Convents of Goa +Fatehpur Sikri +Group of Monuments at Hampi +Khajuraho Group of Monuments +Elephanta Caves +Great Living Chola Temples +Group of Monuments at Pattadakal +Sundarbans National Park +Nanda Devi and Valley of Flowers National Parks +Buddhist Monuments at Sanchi +Humayun's Tomb, Delhi +Qutb Minar and its Monuments, Delhi +Mountain Railways of India +Mahabodhi Temple Complex at Bodh Gaya +Rock Shelters of Bhimbetka +Champaner-Pavagadh Archaeological Park +Chhatrapati Shivaji Terminus (formerly Victoria Terminus) +Red Fort Complex +The Jantar Mantar, Jaipur +Western Ghats +Hill Forts of Rajasthan +Great Himalayan National Park Conservation Area +Rani-ki-Vav (the Queen’s Stepwell) at Patan, Gujarat +Archaeological Site of Nalanda Mahavihara at Nalanda, Bihar +Khangchendzonga National Park +The Architectural Work of Le Corbusier, an Outstanding Contribution to the Modern Movement +Historic City of Ahmadabad +Victorian Gothic and Art Deco Ensembles of Mumbai +Jaipur City, Rajasthan +Dholavira: a Harappan City +Kakatiya Rudreshwara (Ramappa) Temple, Telangana +Borobudur Temple Compounds +Komodo National Park +Prambanan Temple Compounds +Ujung Kulon National Park +Sangiran Early Man Site +Lorentz National Park +Tropical Rainforest Heritage of Sumatra +Cultural Landscape of Bali Province: the Subak System as a Manifestation of the Tri Hita Karana Philosophy +Ombilin Coal Mining Heritage of Sawahlunto +Meidan Emam, Esfahan +Persepolis +Tchogha Zanbil +Takht-e Soleyman +Bam and its Cultural Landscape +Pasargadae +Soltaniyeh +Bisotun +Armenian Monastic Ensembles of Iran +Shushtar Historical Hydraulic System +Sheikh Safi al-din Khānegāh and Shrine Ensemble in Ardabil +Tabriz Historic Bazaar Complex +The Persian Garden +Gonbad-e Qābus +Masjed-e Jāmé of Isfahan +Golestan Palace +Shahr-i Sokhta +Cultural Landscape of Maymand +Susa +Lut Desert +The Persian Qanat +Historic City of Yazd +Sassanid Archaeological Landscape of Fars Region +Hyrcanian Forests +Cultural Landscape of Hawraman/Uramanat +Trans-Iranian Railway +Hatra +Ashur (Qal'at Sherqat) +Samarra Archaeological City +Erbil Citadel +The Ahwar of Southern Iraq: Refuge of Biodiversity and the Relict Landscape of the Mesopotamian Cities +Babylon +Brú na Bóinne - Archaeological Ensemble of the Bend of the Boyne +Sceilg Mhichíl +Masada +Old City of Acre +White City of Tel-Aviv – the Modern Movement +Biblical Tels - Megiddo, Hazor, Beer Sheba +Incense Route - Desert Cities in the Negev +Bahá’i Holy Places in Haifa and the Western Galilee +Sites of Human Evolution at Mount Carmel: The Nahal Me’arot / Wadi el-Mughara Caves +Caves of Maresha and Bet-Guvrin in the Judean Lowlands as a Microcosm of the Land of the Caves +Necropolis of Bet She’arim: A Landmark of Jewish Renewal +Rock Drawings in Valcamonica +Church and Dominican Convent of Santa Maria delle Grazie with “The Last Supper” by Leonardo da Vinci +Historic Centre of Rome, the Properties of the Holy See in that City Enjoying Extraterritorial Rights and San Paolo Fuori le Mura * +Historic Centre of Florence +Piazza del Duomo, Pisa +Venice and its Lagoon +Historic Centre of San Gimignano +The Sassi and the Park of the Rupestrian Churches of Matera +City of Vicenza and the Palladian Villas of the Veneto +Crespi d'Adda +Ferrara, City of the Renaissance, and its Po Delta +Historic Centre of Naples +Historic Centre of Siena +Castel del Monte +Early Christian Monuments of Ravenna +Historic Centre of the City of Pienza 
+The Trulli of Alberobello +18th-Century Royal Palace at Caserta with the Park, the Aqueduct of Vanvitelli, and the San Leucio Complex +Archaeological Area of Agrigento +Archaeological Areas of Pompei, Herculaneum and Torre Annunziata +Botanical Garden (Orto Botanico), Padua +Cathedral, Torre Civica and Piazza Grande, Modena +Costiera Amalfitana +Portovenere, Cinque Terre, and the Islands (Palmaria, Tino and Tinetto) +Residences of the Royal House of Savoy +Su Nuraxi di Barumini +Villa Romana del Casale +Archaeological Area and the Patriarchal Basilica of Aquileia +Cilento and Vallo di Diano National Park with the Archeological Sites of Paestum and Velia, and the Certosa di Padula +Historic Centre of Urbino +Villa Adriana (Tivoli) +Assisi, the Basilica of San Francesco and Other Franciscan Sites +City of Verona +Isole Eolie (Aeolian Islands) +Villa d'Este, Tivoli +Late Baroque Towns of the Val di Noto (South-Eastern Sicily) +Sacri Monti of Piedmont and Lombardy +Monte San Giorgio +Etruscan Necropolises of Cerveteri and Tarquinia +Val d'Orcia +Syracuse and the Rocky Necropolis of Pantalica +Genoa: Le Strade Nuove and the system of the Palazzi dei Rolli +Ancient and Primeval Beech Forests of the Carpathians and Other Regions of Europe +Mantua and Sabbioneta +Rhaetian Railway in the Albula / Bernina Landscapes +The Dolomites +Longobards in Italy. Places of the Power +Prehistoric Pile Dwellings around the Alps +Medici Villas and Gardens in Tuscany +Mount Etna +Vineyard Landscape of Piedmont: Langhe-Roero and Monferrato +Arab-Norman Palermo and the Cathedral Churches of Cefalú and Monreale +Venetian Works of Defence between the 16th and 17th Centuries: Stato da Terra – Western Stato da Mar +Ivrea, industrial city of the 20th century +Le Colline del Prosecco di Conegliano e Valdobbiadene +Padua’s fourteenth-century fresco cycles +The Great Spa Towns of Europe +The Porticoes of Bologna +Blue and John Crow Mountains +Buddhist Monuments in the Horyu-ji Area +Himeji-jo +Shirakami-Sanchi +Yakushima +Historic Monuments of Ancient Kyoto (Kyoto, Uji and Otsu Cities) +Historic Villages of Shirakawa-go and Gokayama +Hiroshima Peace Memorial (Genbaku Dome) +Itsukushima Shinto Shrine +Historic Monuments of Ancient Nara +Shrines and Temples of Nikko +Gusuku Sites and Related Properties of the Kingdom of Ryukyu +Sacred Sites and Pilgrimage Routes in the Kii Mountain Range +Shiretoko +Iwami Ginzan Silver Mine and its Cultural Landscape +Hiraizumi – Temples, Gardens and Archaeological Sites Representing the Buddhist Pure Land +Ogasawara Islands +Fujisan, sacred place and source of artistic inspiration +Tomioka Silk Mill and Related Sites +Sites of Japan’s Meiji Industrial Revolution: Iron and Steel, Shipbuilding and Coal Mining +The Architectural Work of Le Corbusier, an Outstanding Contribution to the Modern Movement +Sacred Island of Okinoshima and Associated Sites in the Munakata Region +Hidden Christian Sites in the Nagasaki Region +Mozu-Furuichi Kofun Group: Mounded Tombs of Ancient Japan +Amami-Oshima Island, Tokunoshima Island, Northern part of Okinawa Island, and Iriomote Island +Jomon Prehistoric Sites in Northern Japan +Old City of Jerusalem and its Walls +Petra +Quseir Amra +Um er-Rasas (Kastrom Mefa'a) +Wadi Rum Protected Area +Baptism Site ‘Bethany Beyond the Jordan’(Al-Maghtas) +As-Salt - The Place of Tolerance and Urban Hospitality +Mausoleum of Khoja Ahmed Yasawi +Petroglyphs of the Archaeological Landscape of Tanbaly +Saryarka – Steppe and Lakes of Northern Kazakhstan +Silk Roads: the Routes 
Network of Chang'an-Tianshan Corridor +Western Tien-Shan +Lake Turkana National Parks +Mount Kenya National Park/Natural Forest +Lamu Old Town +Sacred Mijikenda Kaya Forests +Fort Jesus, Mombasa +Kenya Lake System in the Great Rift Valley +Thimlich Ohinga Archaeological Site +Phoenix Islands Protected Area +Sulaiman-Too Sacred Mountain +Silk Roads: the Routes Network of Chang'an-Tianshan Corridor +Western Tien-Shan +Town of Luang Prabang +Vat Phou and Associated Ancient Settlements within the Champasak Cultural Landscape +Megalithic Jar Sites in Xiengkhuang – Plain of Jars +Historic Centre of Riga +Struve Geodetic Arc +Anjar +Baalbek +Byblos +Tyre +Ouadi Qadisha (the Holy Valley) and the Forest of the Cedars of God (Horsh Arz el-Rab) +Maloti-Drakensberg Park +Archaeological Site of Cyrene +Archaeological Site of Leptis Magna +Archaeological Site of Sabratha +Rock-Art Sites of Tadrart Acacus +Old Town of Ghadamès +Vilnius Historic Centre +Curonian Spit +Kernavė Archaeological Site (Cultural Reserve of Kernavė) +Struve Geodetic Arc +City of Luxembourg: its Old Quarters and Fortifications +Tsingy de Bemaraha Strict Nature Reserve +Royal Hill of Ambohimanga +Rainforests of the Atsinanana +Lake Malawi National Park +Chongoni Rock-Art Area +Gunung Mulu National Park +Kinabalu Park +Melaka and George Town, Historic Cities of the Straits of Malacca +Archaeological Heritage of the Lenggong Valley +Old Towns of Djenné +Timbuktu +Cliff of Bandiagara (Land of the Dogons) +Tomb of Askia +City of Valletta +Ħal Saflieni Hypogeum +Megalithic Temples of Malta +Bikini Atoll Nuclear Test Site +Banc d'Arguin National Park +Ancient Ksour of Ouadane, Chinguetti, Tichitt and Oualata +Aapravasi Ghat +Le Morne Cultural Landscape +Historic Centre of Mexico City and Xochimilco +Historic Centre of Oaxaca and Archaeological Site of Monte Albán +Historic Centre of Puebla +Pre-Hispanic City and National Park of Palenque +Pre-Hispanic City of Teotihuacan +Sian Ka'an +Historic Town of Guanajuato and Adjacent Mines +Pre-Hispanic City of Chichen-Itza +Historic Centre of Morelia +El Tajin, Pre-Hispanic City +Historic Centre of Zacatecas +Rock Paintings of the Sierra de San Francisco +Whale Sanctuary of El Vizcaino +Earliest 16th-Century Monasteries on the Slopes of Popocatepetl +Historic Monuments Zone of Querétaro +Pre-Hispanic Town of Uxmal +Hospicio Cabañas, Guadalajara +Archaeological Zone of Paquimé, Casas Grandes +Historic Monuments Zone of Tlacotalpan +Archaeological Monuments Zone of Xochicalco +Historic Fortified Town of Campeche +Ancient Maya City and Protected Tropical Forests of Calakmul, Campeche +Franciscan Missions in the Sierra Gorda of Querétaro +Luis Barragán House and Studio +Islands and Protected Areas of the Gulf of California +Agave Landscape and Ancient Industrial Facilities of Tequila +Central University City Campus of the Universidad Nacional Autónoma de México (UNAM) +Monarch Butterfly Biosphere Reserve +Protective town of San Miguel and the Sanctuary of Jesús Nazareno de Atotonilco +Camino Real de Tierra Adentro +Prehistoric Caves of Yagul and Mitla in the Central Valley of Oaxaca +El Pinacate and Gran Desierto de Altar Biosphere Reserve +Aqueduct of Padre Tembleque Hydraulic System +Archipiélago de Revillagigedo +Tehuacán-Cuicatlán Valley: originary habitat of Mesoamerica +Nan Madol: Ceremonial Centre of Eastern Micronesia +Uvs Nuur Basin +Orkhon Valley Cultural Landscape +Petroglyphic Complexes of the Mongolian Altai +Great Burkhan Khaldun Mountain and its surrounding sacred landscape +Landscapes 
of Dauria +Natural and Culturo-Historical Region of Kotor +Durmitor National Park +Stećci Medieval Tombstone Graveyards +Venetian Works of Defence between the 16th and 17th Centuries: Stato da Terra – Western Stato da Mar +Medina of Fez +Medina of Marrakesh +Ksar of Ait-Ben-Haddou +Historic City of Meknes +Archaeological Site of Volubilis +Medina of Tétouan (formerly known as Titawin) +Medina of Essaouira (formerly Mogador) +Portuguese City of Mazagan (El Jadida) +Rabat, Modern Capital and Historic City: a Shared Heritage +Island of Mozambique +Pyu Ancient Cities +Bagan +Twyfelfontein or /Ui-//aes +Namib Sand Sea +Kathmandu Valley +Sagarmatha National Park +Chitwan National Park +Lumbini, the Birthplace of the Lord Buddha +Schokland and Surroundings +Dutch Water Defence Lines +Historic Area of Willemstad, Inner City and Harbour, Curaçao +Mill Network at Kinderdijk-Elshout +Ir.D.F. Woudagemaal (D.F. Wouda Steam Pumping Station) +Droogmakerij de Beemster (Beemster Polder) +Rietveld Schröderhuis (Rietveld Schröder House) +Wadden Sea +Seventeenth-Century Canal Ring Area of Amsterdam inside the Singelgracht +Van Nellefabriek +Colonies of Benevolence +Frontiers of the Roman Empire – The Lower German Limes +Te Wahipounamu – South West New Zealand +Tongariro National Park +New Zealand Sub-Antarctic Islands +Ruins of León Viejo +León Cathedral +Air and Ténéré Natural Reserves +W-Arly-Pendjari Complex +Historic Centre of Agadez +Sukur Cultural Landscape +Osun-Osogbo Sacred Grove +Natural and Cultural Heritage of the Ohrid region +Ancient and Primeval Beech Forests of the Carpathians and Other Regions of Europe +Bryggen +Urnes Stave Church +Røros Mining Town and the Circumference +Rock Art of Alta +Vegaøyan – The Vega Archipelago +Struve Geodetic Arc +West Norwegian Fjords – Geirangerfjord and Nærøyfjord +Rjukan-Notodden Industrial Heritage Site +Bahla Fort +Archaeological Sites of Bat, Al-Khutm and Al-Ayn +Land of Frankincense +Aflaj Irrigation Systems of Oman +Ancient City of Qalhat +Archaeological Ruins at Moenjodaro +Buddhist Ruins of Takht-i-Bahi and Neighbouring City Remains at Sahr-i-Bahlol +Taxila +Fort and Shalamar Gardens in Lahore +Historical Monuments at Makli, Thatta +Rohtas Fort +Rock Islands Southern Lagoon +Birthplace of Jesus: Church of the Nativity and the Pilgrimage Route, Bethlehem +Palestine: Land of Olives and Vines – Cultural Landscape of Southern Jerusalem, Battir +Hebron/Al-Khalil Old Town +Fortifications on the Caribbean Side of Panama: Portobelo-San Lorenzo +Darien National Park +Talamanca Range-La Amistad Reserves / La Amistad National Park +Archaeological Site of Panamá Viejo and Historic District of Panamá +Coiba National Park and its Special Zone of Marine Protection +Kuk Early Agricultural Site +Jesuit Missions of La Santísima Trinidad de Paraná and Jesús de Tavarangue +City of Cuzco +Historic Sanctuary of Machu Picchu +Chavin (Archaeological Site) +Huascarán National Park +Chan Chan Archaeological Zone +Manú National Park +Historic Centre of Lima +Río Abiseo National Park +Lines and Geoglyphs of Nasca and Palpa +Historical Centre of the City of Arequipa +Sacred City of Caral-Supe +Qhapaq Ñan, Andean Road System +Chankillo Archaeoastronomical Complex +Baroque Churches of the Philippines +Tubbataha Reefs Natural Park +Rice Terraces of the Philippine Cordilleras +Historic City of Vigan +Puerto-Princesa Subterranean River National Park +Mount Hamiguitan Range Wildlife Sanctuary +Historic Centre of Krakow +Wieliczka and Bochnia Royal Salt Mines +Auschwitz Birkenau German 
Nazi Concentration and Extermination Camp (1940-1945) +Bialowieza Forest +Historic Centre of Warsaw +Old City of Zamosc +Castle of the Teutonic Order in Malbork +Medieval Town of Torun +Kalwaria Zebrzydowska: the Mannerist Architectural and Park Landscape Complex and Pilgrimage Park +Churches of Peace in Jawor and Swidnica +Wooden Churches of Southern Malopolska +Muskauer Park / Park Muzakowski +Centennial Hall in Wroclaw +Ancient and Primeval Beech Forests of the Carpathians and Other Regions of Europe +Wooden Tserkvas of the Carpathian Region in Poland and Ukraine +Tarnowskie Gory Lead-Silver-Zinc Mine and its Underground Water Management System +Krzemionki Prehistoric Striped Flint Mining Region +Central Zone of the Town of Angra do Heroismo in the Azores +Convent of Christ in Tomar +Monastery of Batalha +Monastery of the Hieronymites and Tower of Belem in Lisbon +Historic Centre of Evora +Monastery of Alcobaca +Cultural Landscape of Sintra +Historic Centre of Oporto, Luiz I Bridge and Monastery of Serra do Pilar +Prehistoric Rock Art Sites in the Côa Valley and Siega Verde +Laurisilva of Madeira +Alto Douro Wine Region +Historic Centre of Guimarães +Landscape of the Pico Island Vineyard Culture +Garrison Border Town of Elvas and its Fortifications +University of Coimbra – Alta and Sofia +Royal Building of Mafra – Palace, Basilica, Convent, Cerco Garden and Hunting Park (Tapada) +Sanctuary of Bom Jesus do Monte in Braga +Al Zubarah Archaeological Site +Haeinsa Temple Janggyeong Panjeon, the Depositories for the Tripitaka Koreana Woodblocks +Jongmyo Shrine +Seokguram Grotto and Bulguksa Temple +Changdeokgung Palace Complex +Hwaseong Fortress +Gochang, Hwasun and Ganghwa Dolmen Sites +Gyeongju Historic Areas +Jeju Volcanic Island and Lava Tubes +Royal Tombs of the Joseon Dynasty +Historic Villages of Korea: Hahoe and Yangdong +Namhansanseong +Baekje Historic Areas +Sansa, Buddhist Mountain Monasteries in Korea +Seowon, Korean Neo-Confucian Academies +Getbol, Korean Tidal Flats +Struve Geodetic Arc +Danube Delta +Churches of Moldavia +Monastery of Horezu +Villages with Fortified Churches in Transylvania +Dacian Fortresses of the Orastie Mountains +Historic Centre of Sighişoara +Wooden Churches of Maramureş +Ancient and Primeval Beech Forests of the Carpathians and Other Regions of Europe +Rosia Montana Mining Landscape +Historic Centre of Saint Petersburg and Related Groups of Monuments +Kizhi Pogost +Kremlin and Red Square, Moscow +Cultural and Historic Ensemble of the Solovetsky Islands +Historic Monuments of Novgorod and Surroundings +White Monuments of Vladimir and Suzdal +Architectural Ensemble of the Trinity Sergius Lavra in Sergiev Posad +Church of the Ascension, Kolomenskoye +Virgin Komi Forests +Lake Baikal +Volcanoes of Kamchatka +Golden Mountains of Altai +Western Caucasus +Curonian Spit +Ensemble of the Ferapontov Monastery +Historic and Architectural Complex of the Kazan Kremlin +Central Sikhote-Alin +Citadel, Ancient City and Fortress Buildings of Derbent +Uvs Nuur Basin +Ensemble of the Novodevichy Convent +Natural System of Wrangel Island Reserve +Historical Centre of the City of Yaroslavl +Struve Geodetic Arc +Putorana Plateau +Lena Pillars Nature Park +Bolgar Historical and Archaeological Complex +Assumption Cathedral and Monastery of the town-island of Sviyazhsk +Landscapes of Dauria +Churches of the Pskov School of Architecture +Petroglyphs of Lake Onega and the White Sea +Brimstone Hill Fortress National Park +Pitons Management Area +San Marino Historic Centre and 
Mount Titano +Hegra Archaeological Site +At-Turaif District in ad-Dir'iyah +Historic Jeddah, the Gate to Makkah +Rock Art in the Hail Region of Saudi Arabia +Al-Ahsa Oasis, an Evolving Cultural Landscape +Hima Cultural Area +Island of Goree +Niokolo-Koba National Park +Djoudj National Bird Sanctuary +Island of Saint-Louis +Stone Circles of Senegambia +Saloum Delta +Bassari Country: Bassari, Fula and Bedik Cultural Landscapes +Stari Ras and Sopocani +Studenica Monastery +Medieval Monuments in Kosovo +Gamzigrad-Romuliana, Palace of Galerius +Stecci Medieval Tombstone Graveyards +Aldabra Atoll +Vallee de Mai Nature Reserve +Singapore Botanic Gardens +Historic Town of Banská Štiavnica and the Technical Monuments in its Vicinity +Levoca, Spissky Hrad and the Associated Cultural Monuments +Vlkolinec +Caves of Aggtelek Karst and Slovak Karst +Bardejov Town Conservation Reserve +Ancient and Primeval Beech Forests of the Carpathians and Other Regions of Europe +Wooden Churches of the Slovak part of the Carpathian Mountain Area +Frontiers of the Roman Empire – The Danube Limes (Western Segment) +Skocjan Caves +Ancient and Primeval Beech Forests of the Carpathians and Other Regions of Europe +Prehistoric Pile Dwellings around the Alps +Heritage of Mercury. Almadén and Idrija +The works of Jože Plečnik in Ljubljana +East Rennell +Fossil Hominid Sites of South Africa +iSimangaliso Wetland Park +Robben Island +Maloti-Drakensberg Park +Mapungubwe Cultural Landscape +Cape Floral Region Protected Areas +Vredefort Dome +Richtersveld Cultural and Botanical Landscape +Khomani Cultural Landscape +Barberton Makhonjwa Mountains +Alhambra, Generalife and Albayzin, Granada +Burgos Cathedral +Historic Centre of Cordoba +Monastery and Site of the Escurial, Madrid +Works of Antoni Gaudi +Cave of Altamira and Paleolithic Cave Art of Northern Spain +Monuments of Oviedo and the Kingdom of the Asturias +Old Town of Ávila with its Extra-Muros Churches +Old Town of Segovia and its Aqueduct +Santiago de Compostela (Old Town) +Garajonay National Park +Historic City of Toledo +Mudejar Architecture of Aragon +Old Town of Caceres +Cathedral, Alcazar and Archivo de Indias in Seville +Old City of Salamanca +Poblet Monastery +Archaeological Ensemble of Merida +Routes of Santiago de Compostela: Camino Frances and Routes of Northern Spain +Royal Monastery of Santa María de Guadalupe +Donana National Park +Historic Walled Town of Cuenca +La Lonja de la Seda de Valencia +Las Medulas +Palau de la Musica Catalana and Hospital de Sant Pau, Barcelona +Pyrenees - Mont Perdu +San Millan Yuso and Suso Monasteries +Prehistoric Rock Art Sites in the Côa Valley and Siega Verde +Rock Art of the Mediterranean Basin on the Iberian Peninsula +University and Historic Precinct of Alcala de Henares +Ibiza, Biodiversity and Culture +San Cristóbal de La Laguna +Archaeological Ensemble of Tarraco +Archaeological Site of Atapuerca +Catalan Romanesque Churches of the Vall de Boi +Palmeral of Elche +Roman Walls of Lugo +Aranjuez Cultural Landscape +Renaissance Monumental Ensembles of Ubeda and Baeza +Vizcaya Bridge +Teide National Park +Tower of Hercules +Cultural Landscape of the Serra de Tramuntana +Heritage of Mercury. 
Almadén and Idrija +Antequera Dolmens Site +Caliphate City of Medina Azahara +Risco Caido and the Sacred Mountains of Gran Canaria Cultural Landscape +Paseo del Prado and Buen Retiro, a landscape of Arts and Sciences +Ancient City of Polonnaruwa +Ancient City of Sigiriya +Sacred City of Anuradhapura +Old Town of Galle and its Fortifications +Sacred City of Kandy +Sinharaja Forest Reserve +Rangiri Dambulla Cave Temple +Central Highlands of Sri Lanka +Gebel Barkal and the Sites of the Napatan Region +Archaeological Sites of the Island of Meroe +Sanganeb Marine National Park and Dungonab Bay – Mukkawar Island Marine National Park +Central Suriname Nature Reserve +Historic Inner City of Paramaribo +Royal Domain of Drottningholm +Birka and Hovgarden +Engelsberg Ironworks +Rock Carvings in Tanum +Skogskyrkogarden +Hanseatic Town of Visby +Church Town of Gammelstad, Lulea +Laponian Area +Naval Port of Karlskrona +Agricultural Landscape of Southern Oland +High Coast / Kvarken Archipelago +Mining Area of the Great Copper Mountain in Falun +Grimeton Radio Station, Varberg +Struve Geodetic Arc +Decorated Farmhouses of Halsingland +Abbey of St Gall +Benedictine Convent of St John at Mustair +Old City of Berne +Three Castles, Defensive Wall and Ramparts of the Market-Town of Bellinzona +Swiss Alps Jungfrau-Aletsch +Monte San Giorgio +Ancient and Primeval Beech Forests of the Carpathians +Lavaux, Vineyard Terraces +Rhaetian Railway in the Albula / Bernina Landscapes +Swiss Tectonic Arena Sardona +La Chaux-de-Fonds / Le Locle, Watchmaking Town Planning +Prehistoric Pile Dwellings around the Alps +The Architectural Work of Le Corbusier +Ancient City of Damascus +Ancient City of Bosra +Site of Palmyra +Ancient City of Aleppo +Crac des Chevaliers and Qal’at Salah El-Din +Ancient Villages of Northern Syria +Proto-urban Site of Sarazm +Tajik National Park (Mountains of the Pamirs) +Historic City of Ayutthaya +Historic Town of Sukhothai and Associated Historic Towns +Thungyai-Huai Kha Khaeng Wildlife Sanctuaries +Ban Chiang Archaeological Site +Dong Phayayen-Khao Yai Forest Complex +Kaeng Krachan Forest Complex +Koutammakou, the Land of the Batammariba +Amphitheatre of El Jem +Archaeological Site of Carthage +Medina of Tunis +Ichkeul National Park +Punic Town of Kerkuane and its Necropolis +Kairouan +Medina of Sousse +Dougga / Thugga +Goreme National Park and the Rock Sites of Cappadocia +Great Mosque and Hospital of Divriği +Historic Areas of Istanbul +Hattusha: the Hittite Capital +Nemrut Dag +Hierapolis-Pamukkale +Xanthos-Letoon +City of Safranbolu +Archaeological Site of Troy +Selimiye Mosque and its Social Complex +Neolithic Site of Catalhoyuk +Bursa and Cumalıkızık: the Birth of the Ottoman Empire +Pergamon and its Multi-Layered Cultural Landscape +Diyarbakır Fortress and Hevsel Gardens Cultural Landscape +Ephesus +Archaeological Site of Ani +Aphrodisias +Gobekli Tepe +Arslantepe Mound +State Historical and Cultural Park “Ancient Merv” +Kunya-Urgench +Parthian Fortresses of Nisa +Bwindi Impenetrable National Park +Rwenzori Mountains National Park +Tombs of Buganda Kings at Kasubi +Kyiv: Saint-Sophia Cathedral and Related Monastic Buildings, Kyiv-Pechersk Lavra +L'viv – the Ensemble of the Historic Centre +Struve Geodetic Arc +Ancient and Primeval Beech Forests of the Carpathians and Other Regions of Europe +Residence of Bukovinian and Dalmatian Metropolitans +Ancient City of Tauric Chersonese and its Chora +Wooden Tserkvas of the Carpathian Region in Poland and Ukraine +Cultural Sites of Al Ain (Hafit, 
Hili, Bidaa Bint Saud and Oases Areas) +Castles and Town Walls of King Edward in Gwynedd +Durham Castle and Cathedral +Giant's Causeway and Causeway Coast +Ironbridge Gorge +St Kilda +Stonehenge, Avebury and Associated Sites +Studley Royal Park including the Ruins of Fountains Abbey +Blenheim Palace +City of Bath +Frontiers of the Roman Empire +Palace of Westminster and Westminster Abbey including Saint Margaret’s Church +Canterbury Cathedral, St Augustine's Abbey, and St Martin's Church +Henderson Island +Tower of London +Gough and Inaccessible Islands +Old and New Towns of Edinburgh +Maritime Greenwich +Heart of Neolithic Orkney +Blaenavon Industrial Landscape +Historic Town of St George and Related Fortifications, Bermuda +Derwent Valley Mills +Dorset and East Devon Coast +New Lanark +Saltaire +Royal Botanic Gardens, +Cornwall and West Devon Mining Landscape +Pontcysyllte Aqueduct and Canal +The Forth Bridge +Gorham's Cave Complex +The English Lake District +Jodrell Bank Observatory +The Great Spa Towns of Europe +The Slate Landscape of Northwest Wales +Ngorongoro Conservation Area +Ruins of Kilwa Kisiwani and Ruins of Songo Mnara +Serengeti National Park +Selous Game Reserve +Kilimanjaro National Park +Stone Town of Zanzibar +Kondoa Rock-Art Sites +Mesa Verde National Park +Yellowstone National Park +Everglades National Park +Grand Canyon National Park +Independence Hall +Kluane / Wrangell-St. Elias / Glacier Bay / Tatshenshini-Alsek +Redwood National and State Parks +Mammoth Cave National Park +Olympic National Park +Cahokia Mounds State Historic Site +Great Smoky Mountains National Park +La Fortaleza and San Juan National Historic Site in Puerto Rico +Statue of Liberty +Yosemite National Park +Chaco Culture +Hawaii Volcanoes National Park +Monticello and the University of Virginia in Charlottesville +Taos Pueblo +Carlsbad Caverns National Park +Waterton Glacier International Peace Park +Papahānaumokuākea +Monumental Earthworks of Poverty Point +San Antonio Missions +The 20th-Century Architecture of Frank Lloyd Wright +Historic Quarter of the City of Colonia del Sacramento +Fray Bentos Industrial Landscape +The work of engineer Eladio Dieste: Church of Atlantida +Itchan Kala +Historic Centre of Bukhara +Historic Centre of Shakhrisyabz +Samarkand Crossroad of Cultures +Western Tien-Shan +Chief Roi Matas Domain +Coro and its Port +Canaima National Park +Ciudad Universitaria de Caracas +Complex of Hue Monuments +Ha Long Bay +Hoi An Ancient Town +My Son Sanctuary +Phong Nha-Ke Bang National Park +Central Sector of the Imperial Citadel of Thang Long - Hanoi +Citadel of the Ho Dynasty +Trang An Landscape Complex +Old Walled City of Shibam +Old City of Sana'a +Historic Town of Zabid +Socotra Archipelago +Mosi-oa-Tunya / Victoria Falls +Mana Pools National Park, Sapi and Chewore Safari Areas +Great Zimbabwe National Monument +Khami Ruins National Monument +Mosi-oa-Tunya / Victoria Falls +Matobo Hills diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/Actions.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/Actions.txt" new file mode 100644 index 0000000000000000000000000000000000000000..a151752205c1d29ba64664df5b841c77dc1c1226 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/Actions.txt" @@ -0,0 +1,20 @@ +Standing +Sitting +Kneeling +Thinking +Looking into distance +Looking towards us +Being concerned +Riding a horse +Riding a beast +Riding a motorcycle +Holding a sword 
+Holding flowers +Running +Riding a dragon +Riding a robot +Holding a gun +Shooting +Action pose +Relaxed +Symmetric \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/eyes_gazes.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/eyes_gazes.txt" new file mode 100644 index 0000000000000000000000000000000000000000..fc2efb9f58828fcc7ce1d1f8033ac3dd25b54f66 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/eyes_gazes.txt" @@ -0,0 +1,16 @@ +eye contact +looking afar +looking at another +looking at breasts +looking at hand +looking at mirror +looking at phone +looking at viewer +looking away +looking back +looking down +looking outside +looking over eyewear +looking through legs +looking to the side +looking up \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/gestures/gestures_other.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/gestures/gestures_other.txt" new file mode 100644 index 0000000000000000000000000000000000000000..ecb4b05971b71dc28a49b34dc703427baafb7048 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/gestures/gestures_other.txt" @@ -0,0 +1,10 @@ +air guitar +bowing +curtsey +heart tail +heart tail duo +kuji-in +shrugging +toe-point +victory pose +orchid fingers \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/gestures/gestures_two_hands.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/gestures/gestures_two_hands.txt" new file mode 100644 index 0000000000000000000000000000000000000000..8be2a9aaae070392d4d1c4ed45149ed974ec7c0a --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/gestures/gestures_two_hands.txt" @@ -0,0 +1,27 @@ +air quotes +bunny pose +carry me +clenched hands +cupping hands +double v +fidgeting +finger counting +finger frame +fist bump +fist in hand +hand glasses +own hands clasped +heart arms +heart hands +heart hands duo +heart hands trio +high five +horns pose +noogie +palm-fist greeting +palm-fist tap +penetration gesture +pinky swear +shadow puppet +tsuki ni kawatte oshioki yo +steepled fingers \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/gestures/one_hand/gestures_one_hand_one_open_finger.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/gestures/one_hand/gestures_one_hand_one_open_finger.txt" new file mode 100644 index 0000000000000000000000000000000000000000..83d79e02a25eb81817a7da21113d2ee7a88a3a63 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/gestures/one_hand/gestures_one_hand_one_open_finger.txt" @@ -0,0 +1,17 @@ +akanbe +slit throat +index finger raised +middle finger +pinky out +beckoning +pointing +pointing at self +pointing at viewer +pointing down +pointing forward +pointing up +kamina pose +saturday night fever +shushing +thumbs down +thumbs up \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/gestures/one_hand/gestures_one_hand_three_open_fingers.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/gestures/one_hand/gestures_one_hand_three_open_fingers.txt" new file 
mode 100644 index 0000000000000000000000000000000000000000..1c970f6d40ba590e51a31c4402b7524b00d5e04f --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/gestures/one_hand/gestures_one_hand_three_open_fingers.txt" @@ -0,0 +1,4 @@ +\m/ +middle w +ok sign +money gesture \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/gestures/one_hand/gestures_one_hand_two_open_fingers.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/gestures/one_hand/gestures_one_hand_two_open_fingers.txt" new file mode 100644 index 0000000000000000000000000000000000000000..6c62f25831c72884efcc7d290564601b0824a245 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/gestures/one_hand/gestures_one_hand_two_open_fingers.txt" @@ -0,0 +1,8 @@ +\n/ +crossed fingers +fox shadow puppet +finger gun +finger heart +inward v +shaka sign +two-finger salute \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/gestures/one_hand/gestures_one_hand_variable_number_of_open_fingers.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/gestures/one_hand/gestures_one_hand_variable_number_of_open_fingers.txt" new file mode 100644 index 0000000000000000000000000000000000000000..21dccd5f61ed6d137215f9050b3d8544bafd7db8 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/gestures/one_hand/gestures_one_hand_variable_number_of_open_fingers.txt" @@ -0,0 +1,13 @@ +beckoning +stroking own chin +cunnilingus gesture +fellatio gesture +handjob gesture +hat tip +oral invitation +oral simulation +shocker (gesture) +v +v over eye +v over mouth +w \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/gestures/one_hand/gestures_one_hand_whole_closed_hand.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/gestures/one_hand/gestures_one_hand_whole_closed_hand.txt" new file mode 100644 index 0000000000000000000000000000000000000000..49a95f007e071cc0abad620c5e02af0630409534 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/gestures/one_hand/gestures_one_hand_whole_closed_hand.txt" @@ -0,0 +1,5 @@ +clenched hand +fig sign +paw pose +power fist +raised fist \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/gestures/one_hand/gestures_one_hand_whole_open_hand.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/gestures/one_hand/gestures_one_hand_whole_open_hand.txt" new file mode 100644 index 0000000000000000000000000000000000000000..a1ce26bd9411e5d354ca93df98708194c036143d --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/gestures/one_hand/gestures_one_hand_whole_open_hand.txt" @@ -0,0 +1,11 @@ +\||/ +claw pose +open hand +ohikaenasutte +facepalm +reaching +salute +spread fingers +straight-arm salute +vulcan salute +waving \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/hands/hands_strange.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/hands/hands_strange.txt" new file mode 100644 index 
0000000000000000000000000000000000000000..c8f666edd3b53f32fa5dee119ed4db558f10ae84 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/hands/hands_strange.txt" @@ -0,0 +1 @@ +hand mouth \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/hands/where/hands_where_above_neck.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/hands/where/hands_where_above_neck.txt" new file mode 100644 index 0000000000000000000000000000000000000000..855d0108529c35bf0cab72f6e099512d3c006f53 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/hands/where/hands_where_above_neck.txt" @@ -0,0 +1,12 @@ +hand on ear +adjusting eyewear +hand on another's head +hand on own head +hand on own forehead +hand on another's face +hand on own face +hands on another's face +hands on own face +hand on another's cheek +hand on own cheek +hand on headwear \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/hands/where/hands_where_breasts.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/hands/where/hands_where_breasts.txt" new file mode 100644 index 0000000000000000000000000000000000000000..7a0b0134a8c7e64923d96b8017eafe4789ab0e91 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/hands/where/hands_where_breasts.txt" @@ -0,0 +1,13 @@ +breast hold +breast grab +flat chest grab +guided breast grab +breast lift +breast poke +breast press +breast squeeze +breast suppress +groping +arm between breasts +hand in bra +nipple tweak \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/hands/where/hands_where_lower_body.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/hands/where/hands_where_lower_body.txt" new file mode 100644 index 0000000000000000000000000000000000000000..864f127276b53b629430039846da48d930f20cb1 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/hands/where/hands_where_lower_body.txt" @@ -0,0 +1,6 @@ +hand on own ass +hands on ass +hands on hips +hand on own knee +hands on own knees +hand between legs \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/hands/where/hands_where_members.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/hands/where/hands_where_members.txt" new file mode 100644 index 0000000000000000000000000000000000000000..e53516884f3597031308424a1140e2b41d3f8b84 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/hands/where/hands_where_members.txt" @@ -0,0 +1,3 @@ +hands on feet +holding hands +hand on another's hand \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/hands/where/hands_where_somewhere_else.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/hands/where/hands_where_somewhere_else.txt" new file mode 100644 index 0000000000000000000000000000000000000000..d3050b3db1e05adb0406050e43e5d0dbea3beb87 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/hands/where/hands_where_somewhere_else.txt" @@ -0,0 +1,2 @@ +hand in pocket +hands in pockets \ No newline at 
end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/hands/where/hands_where_upper_body.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/hands/where/hands_where_upper_body.txt" new file mode 100644 index 0000000000000000000000000000000000000000..02131bf64b9090108364f0c9ade153741a45ea32 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/hands/where/hands_where_upper_body.txt" @@ -0,0 +1,5 @@ +hand on another's chest +hand on own chest +hands on hips +hand on another's shoulder +hand on own shoulder \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/new.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/new.txt" new file mode 100644 index 0000000000000000000000000000000000000000..09c625720b29412b64d920a7a38fa13044840d1e --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/new.txt" @@ -0,0 +1,2 @@ +holding smoking pipe +smoke \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/nudity/nudity_by_gender.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/nudity/nudity_by_gender.txt" new file mode 100644 index 0000000000000000000000000000000000000000..5bdba1f8c983d1dcbe4528b376f80effaf240cab --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/nudity/nudity_by_gender.txt" @@ -0,0 +1,4 @@ +CFNF +CFNM +CMNF +CMNM \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/nudity/nudity_complete.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/nudity/nudity_complete.txt" new file mode 100644 index 0000000000000000000000000000000000000000..0223a252adaca9e429b96aa19a8f345c6e553ecd --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/nudity/nudity_complete.txt" @@ -0,0 +1,2 @@ +completely nude +nude \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/nudity/nudity_dressing_covering_body_parts.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/nudity/nudity_dressing_covering_body_parts.txt" new file mode 100644 index 0000000000000000000000000000000000000000..d9c3a518432a1121d5033827c378628579127d11 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/nudity/nudity_dressing_covering_body_parts.txt" @@ -0,0 +1,12 @@ +covering +covering anus +covering ass +covering breasts +covering crotch +covering head +covering ears +covering one eye +covering eyes +covering face +covering mouth +nude cover \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/nudity/nudity_misc.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/nudity/nudity_misc.txt" new file mode 100644 index 0000000000000000000000000000000000000000..a37ec772b0fff5f7d678566ca0a7c8b7d1626d7d --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/nudity/nudity_misc.txt" @@ -0,0 +1,3 @@ +nude modeling +tan +tanlines \ No newline at end of file diff --git 
"a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/nudity/nudity_naughty_points_of_view.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/nudity/nudity_naughty_points_of_view.txt" new file mode 100644 index 0000000000000000000000000000000000000000..fd8409d25dd12d5cf55ee35a062b54f97b094816 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/nudity/nudity_naughty_points_of_view.txt" @@ -0,0 +1,4 @@ +downblouse +downpants +pantyshot +upskirt \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/nudity/nudity_touching_clothes.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/nudity/nudity_touching_clothes.txt" new file mode 100644 index 0000000000000000000000000000000000000000..4ce491df2d6825c88e1dc64f515e1432f511ad17 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/nudity/nudity_touching_clothes.txt" @@ -0,0 +1,11 @@ +adjusting clothes +clothes grab +apron grab +collar grab +necktie grab +skirt grab +collar tug +dress tug +shirt tug +skirt tug +wringing clothes \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/nudity/partial/nudity_partial_any_clothes.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/nudity/partial/nudity_partial_any_clothes.txt" new file mode 100644 index 0000000000000000000000000000000000000000..2f87fb8742f855c0d15a4964e55ae5b9a8fce7e8 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/nudity/partial/nudity_partial_any_clothes.txt" @@ -0,0 +1,12 @@ +clothing aside +clothes down +open clothes +revealing clothes +see-through +unbuttoned +undressing +unfastened +untied +untying +unzipped +unzipping \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/nudity/partial/nudity_partial_exposed_breasts.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/nudity/partial/nudity_partial_exposed_breasts.txt" new file mode 100644 index 0000000000000000000000000000000000000000..8657ce440eba2caa2132d5f4c57c4690966b6b2f --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/nudity/partial/nudity_partial_exposed_breasts.txt" @@ -0,0 +1,8 @@ +breastless clothes +one breast out +breast slip +breasts out +bra lift +no bra +bra pull +topless \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/nudity/partial/nudity_partial_exposed_breasts_parts_of.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/nudity/partial/nudity_partial_exposed_breasts_parts_of.txt" new file mode 100644 index 0000000000000000000000000000000000000000..435753d753dd4078c97aa225dd429c217ce6c879 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/nudity/partial/nudity_partial_exposed_breasts_parts_of.txt" @@ -0,0 +1,6 @@ +backboob +cleavage +cleavage cutout +sideboob +underboob +underboob cutout \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/nudity/partial/nudity_partial_exposed_chest.txt" 
"b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/nudity/partial/nudity_partial_exposed_chest.txt" new file mode 100644 index 0000000000000000000000000000000000000000..f8cb436f37b7bce1fc49ce00fed52dd34958a195 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/nudity/partial/nudity_partial_exposed_chest.txt" @@ -0,0 +1,21 @@ +center opening +open coat +open collar +dress pull +open hoodie +open jacket +leotard pull +kimono down +kimono pull +pajamas pull +open robe +shirt aside +topless male +no shirt +open shirt +shirt lift +shirt pull +shirt slip +sweater lift +top pull +open vest \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/nudity/partial/nudity_partial_exposed_head_or_neck.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/nudity/partial/nudity_partial_exposed_head_or_neck.txt" new file mode 100644 index 0000000000000000000000000000000000000000..4e8956803c09bb327985c9e75bd0276cad649989 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/nudity/partial/nudity_partial_exposed_head_or_neck.txt" @@ -0,0 +1 @@ +hood down \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/nudity/partial/nudity_partial_exposed_nipples.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/nudity/partial/nudity_partial_exposed_nipples.txt" new file mode 100644 index 0000000000000000000000000000000000000000..c6e1388bd040227e93f14c8f7f69bc1f22b72fe8 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/nudity/partial/nudity_partial_exposed_nipples.txt" @@ -0,0 +1,4 @@ +areola slip +nipple slip +nippleless clothes +nipples \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/nudity/partial/nudity_partial_exposed_shoulders_and_arms.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/nudity/partial/nudity_partial_exposed_shoulders_and_arms.txt" new file mode 100644 index 0000000000000000000000000000000000000000..1b091305c93a1b52f0f06bb6875c07b4ee1840d7 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/nudity/partial/nudity_partial_exposed_shoulders_and_arms.txt" @@ -0,0 +1,12 @@ +armpits +armpit cutout +bare arms +arm cutout +bare shoulders +off shoulder +shoulder cutout +sleeves rolled up +sleeveless +low-cut armhole +sleeves pushed up +sleeves rolled up \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/nudity/partial/nudity_partial_exposed_torso.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/nudity/partial/nudity_partial_exposed_torso.txt" new file mode 100644 index 0000000000000000000000000000000000000000..d1c1fc8996302ced3db6bed7608aa3f0c5c625d0 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/nudity/partial/nudity_partial_exposed_torso.txt" @@ -0,0 +1,11 @@ +barefoot +bare legs +dress lift +hip vent +leg cutout +thigh cutout +side slit +no pants +shoe pull +sock pull +zettai ryouiki \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/nudity/partial/nudity_partial_focus_on_exposed_ass_or_crotch.txt" 
"b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/nudity/partial/nudity_partial_focus_on_exposed_ass_or_crotch.txt" new file mode 100644 index 0000000000000000000000000000000000000000..113e40637a39963aa1ebf03bb27750779cbcd63d --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/nudity/partial/nudity_partial_focus_on_exposed_ass_or_crotch.txt" @@ -0,0 +1,32 @@ +ass cutout +bottomless +buruma pull +buruma aside +clitoris slip +clothing aside +crotch cutout +dress aside +leotard aside +hakama pull +kimono lift +yukata lift +no panties +panties aside +pants pull +open pants +pants pull +panty lift +panty pull +pussy peek +pussy peek +pantyhose pull +shorts aside +open shortsn +shorts pull +skirt around one leg +skirt around ankles +open skirt +skirt pull +skirt lift +swimsuit aside +bikini aside \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/nudity/partial/nudity_partial_focus_on_exposed_legs_or_feet.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/nudity/partial/nudity_partial_focus_on_exposed_legs_or_feet.txt" new file mode 100644 index 0000000000000000000000000000000000000000..d1c1fc8996302ced3db6bed7608aa3f0c5c625d0 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/nudity/partial/nudity_partial_focus_on_exposed_legs_or_feet.txt" @@ -0,0 +1,11 @@ +barefoot +bare legs +dress lift +hip vent +leg cutout +thigh cutout +side slit +no pants +shoe pull +sock pull +zettai ryouiki \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/nudity/partial/nudity_partial_misc.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/nudity/partial/nudity_partial_misc.txt" new file mode 100644 index 0000000000000000000000000000000000000000..1c789f9097f26f60b0347fc2aa163f36e67418e1 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/nudity/partial/nudity_partial_misc.txt" @@ -0,0 +1,16 @@ +bikini aside +bikini pull +cape lift +lifting covers +open bikini +open bra +open kimono +robe slip +strap lift +strap pull +strap slip +swimsuit aside +swimsuit pull +open towel +towel slip +male underwear pull \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/nudity/partial/nudity_partial_specific_clothes_or_ornaments_being_worn_as_exceptions.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/nudity/partial/nudity_partial_specific_clothes_or_ornaments_being_worn_as_exceptions.txt" new file mode 100644 index 0000000000000000000000000000000000000000..d1c1fc8996302ced3db6bed7608aa3f0c5c625d0 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/nudity/partial/nudity_partial_specific_clothes_or_ornaments_being_worn_as_exceptions.txt" @@ -0,0 +1,11 @@ +barefoot +bare legs +dress lift +hip vent +leg cutout +thigh cutout +side slit +no pants +shoe pull +sock pull +zettai ryouiki \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/portrait-type.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/portrait-type.txt" new file mode 100644 index 0000000000000000000000000000000000000000..c7216e7996f7dc52614580c859e14a4eb9e80166 --- /dev/null +++ 
"b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/portrait-type.txt" @@ -0,0 +1,25 @@ +Sitting +Standing +Lying down +Leaning +Against a wall +Looking straight at the camera +Looking off to the side +Looking down +Close-up +Mid-range +Wide shot +Headshot +3/4 shot +Full body shot +Environmental portrait +Sitting down with legs crossed +Laying down on one's back +propped up on one elbow (resting) +propped up on one hand (sitting or standing) +leaning against a wall +leaning forward +arms crossed +head tilted to the side +looking over one shoulder +looking off into the distance diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/posture/arms/posture_arms_basic.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/posture/arms/posture_arms_basic.txt" new file mode 100644 index 0000000000000000000000000000000000000000..14081afe632920f4df09fb131a8660bf12bc10ad --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/posture/arms/posture_arms_basic.txt" @@ -0,0 +1,13 @@ +arm behind back +arms behind back +arm up +arm behind head +victory pose +arms up +\o/ +arms behind head +outstretched arm +outstretched arms +spread arms +arm at side +arms at side \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/posture/arms/posture_arms_specific.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/posture/arms/posture_arms_specific.txt" new file mode 100644 index 0000000000000000000000000000000000000000..449a4ece979342bf637214bc4ed27f581adb31c5 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/posture/arms/posture_arms_specific.txt" @@ -0,0 +1,10 @@ +airplane arms +crossed arms +flexing +praise the sun +reaching +shrugging +t-pose +a-pose +v arms +w arms \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/posture/hugging/posture_hugging_main.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/posture/hugging/posture_hugging_main.txt" new file mode 100644 index 0000000000000000000000000000000000000000..5067a77ae1c29d912970b2e5b8bf19348b102f0c --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/posture/hugging/posture_hugging_main.txt" @@ -0,0 +1 @@ +hug \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/posture/hugging/posture_hugging_one_character.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/posture/hugging/posture_hugging_one_character.txt" new file mode 100644 index 0000000000000000000000000000000000000000..5d79957bfd3c49f455cabdc142877111794070ba --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/posture/hugging/posture_hugging_one_character.txt" @@ -0,0 +1,4 @@ +hugging own legs +object hug +tail hug +wing hug \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/posture/hugging/posture_hugging_two_characters.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/posture/hugging/posture_hugging_two_characters.txt" new file mode 100644 index 0000000000000000000000000000000000000000..4e82b418be813db617df9aaed38b28d55cdfc209 --- /dev/null +++ 
"b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/posture/hugging/posture_hugging_two_characters.txt" @@ -0,0 +1,3 @@ +arm hug +hug from behind +waist hug \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/posture/posture_basic.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/posture/posture_basic.txt" new file mode 100644 index 0000000000000000000000000000000000000000..271068ba9d02108cc20ff19b396bbc65995a2d1d --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/posture/posture_basic.txt" @@ -0,0 +1,28 @@ +kneeling +one knee +lying +crossed legs +fetal position +on back +on side +on stomach +sitting +butterfly sitting +crossed legs +figure four sitting +indian style +hugging own legs +lotus position +seiza +sitting on lap +sitting on person +straddling +thigh straddling +upright straddle +wariza +yokozuwari +standing +balancing +crossed legs +legs apart +standing on one leg \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/posture/posture_carrying.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/posture/posture_carrying.txt" new file mode 100644 index 0000000000000000000000000000000000000000..580570393ed7ee75b595487afc8db470d3fbccfd --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/posture/posture_carrying.txt" @@ -0,0 +1,12 @@ +baby carry +carrying +carried breast rest +carrying over shoulder +carrying under arm +child carry +fireman's carry +piggyback +princess carry +shoulder carry +sitting on shoulder +standing on shoulder \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/posture/posture_foot_position.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/posture/posture_foot_position.txt" new file mode 100644 index 0000000000000000000000000000000000000000..bd51838cd1358bf3a361ebdf111bb67c4b625c7a --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/posture/posture_foot_position.txt" @@ -0,0 +1,5 @@ +dorsiflexion +pigeon-toed +plantar flexion +tiptoes +tiptoe kiss \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/posture/posture_hand_position.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/posture/posture_hand_position.txt" new file mode 100644 index 0000000000000000000000000000000000000000..18ccc01cca0f3f98ad978c9f2605f138b47342db --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/posture/posture_hand_position.txt" @@ -0,0 +1,2 @@ +stroking own chin +outstretched hand \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/posture/posture_hands_touching_each_other.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/posture/posture_hands_touching_each_other.txt" new file mode 100644 index 0000000000000000000000000000000000000000..df27af7c4941bb0a87a8b88adbc46eb4ac3ff330 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/posture/posture_hands_touching_each_other.txt" @@ -0,0 +1,4 @@ +interlocked fingers +own hands clasped +own hands together +star hands \ No 
newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/posture/posture_head.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/posture/posture_head.txt" new file mode 100644 index 0000000000000000000000000000000000000000..0d8e77bf70caa54d0c886b59390dfebb2d279e9f --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/posture/posture_head.txt" @@ -0,0 +1,2 @@ +head down +head tilt \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/posture/posture_hips.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/posture/posture_hips.txt" new file mode 100644 index 0000000000000000000000000000000000000000..88e44493cad1969e1258599c89db2fc17d092816 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/posture/posture_hips.txt" @@ -0,0 +1 @@ +contrapposto \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/posture/posture_knee_location.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/posture/posture_knee_location.txt" new file mode 100644 index 0000000000000000000000000000000000000000..398e12c5423c7a133a61f8afddd97f950cf73ee5 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/posture/posture_knee_location.txt" @@ -0,0 +1,2 @@ +knees together feet apart +knees apart feet together \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/posture/posture_leg_location.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/posture/posture_leg_location.txt" new file mode 100644 index 0000000000000000000000000000000000000000..107923db6bb72b5b7d4ca9d645f25e1213ad3db6 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/posture/posture_leg_location.txt" @@ -0,0 +1,13 @@ +crossed ankles +folded +leg up +legs up +knees to chest +legs over head +leg lift +outstretched leg +split +pigeon pose +standing split +spread legs +watson cross \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/posture/posture_movement_of_the_body.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/posture/posture_movement_of_the_body.txt" new file mode 100644 index 0000000000000000000000000000000000000000..f548bb16fef6541117e4e71bdbb666d14269ab8e --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/posture/posture_movement_of_the_body.txt" @@ -0,0 +1,6 @@ +balancing +crawling +jumping +running +walking +wallwalking \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/posture/posture_other_whole_body.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/posture/posture_other_whole_body.txt" new file mode 100644 index 0000000000000000000000000000000000000000..37a219c8014b42ae0198c0b7bab0fb60578e63f9 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/posture/posture_other_whole_body.txt" @@ -0,0 +1,19 @@ +all fours +top-down bottom-up +prostration +chest stand +cowering +crucifixion +faceplant +fighting stance +battoujutsu stance +full 
scorpion +reclining +squatting +stretching +superhero landing +upside-down +handstand +headstand +yoga +scorpion pose \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/posture/posture_poses.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/posture/posture_poses.txt" new file mode 100644 index 0000000000000000000000000000000000000000..0a61d0972da54337d07d1f75c3276483d768cdf4 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/posture/posture_poses.txt" @@ -0,0 +1,20 @@ +animal pose +bunny pose +horns pose +paw pose +claw pose +archer pose +bras d'honneur +body bridge +contrapposto +dojikko pose +gendou pose +inugami-ke no ichizoku pose +jojo pose +letter pose +ojou-sama pose +saboten pose +symmetrical hand pose +victory pose +villain pose +zombie pose \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/posture/posture_rest_points.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/posture/posture_rest_points.txt" new file mode 100644 index 0000000000000000000000000000000000000000..733f00fc8b6e624f1fccccfc7822ef569bf6aa52 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/posture/posture_rest_points.txt" @@ -0,0 +1,2 @@ +arm support +head rest \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/posture/posture_three_characters.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/posture/posture_three_characters.txt" new file mode 100644 index 0000000000000000000000000000000000000000..639cef8d5c0b8d8e39ce4a820dfcb7689b51b3a1 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/posture/posture_three_characters.txt" @@ -0,0 +1,2 @@ +circle formation +group hug \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/posture/posture_torso.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/posture/posture_torso.txt" new file mode 100644 index 0000000000000000000000000000000000000000..8a9eaa9c0678fd7e68348b5b3c3c3bf6579c7867 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/posture/posture_torso.txt" @@ -0,0 +1,6 @@ +arched back +bent over +leaning back +leaning forward +slouching +twisted torso \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/posture/posture_two_characters.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/posture/posture_two_characters.txt" new file mode 100644 index 0000000000000000000000000000000000000000..0c2722d6a81075726cc5f21a87b8162e9220fd34 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/posture/posture_two_characters.txt" @@ -0,0 +1,7 @@ +back-to-back +cheek-to-cheek +eye contact +forehead-to-forehead +heads together +holding hands +leg lock \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/sex_acts/fetishes/extreme/sex_acts_fetishes_extreme_mutilation.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/sex_acts/fetishes/extreme/sex_acts_fetishes_extreme_mutilation.txt" new file 
mode 100644 index 0000000000000000000000000000000000000000..841e7c82c64f229ad09bf0e86871fea02553e799 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/sex_acts/fetishes/extreme/sex_acts_fetishes_extreme_mutilation.txt" @@ -0,0 +1,5 @@ +guro +amputee +double amputee +quadruple amputee +triple amputee \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/sex_acts/fetishes/extreme/sex_acts_fetishes_extreme_scat_and_urination.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/sex_acts/fetishes/extreme/sex_acts_fetishes_extreme_scat_and_urination.txt" new file mode 100644 index 0000000000000000000000000000000000000000..3c67f903d7b80d68082cc6fe46b94b15a95e6848 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/sex_acts/fetishes/extreme/sex_acts_fetishes_extreme_scat_and_urination.txt" @@ -0,0 +1,8 @@ +scat +enema +peeing +drinking pee +golden shower +peeing on viewer +have to pee +peeing self \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/sex_acts/fetishes/sex_acts_fetishes_ageplay.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/sex_acts/fetishes/sex_acts_fetishes_ageplay.txt" new file mode 100644 index 0000000000000000000000000000000000000000..533abbe9ad79687c2573e5839416e895e0c86d6e --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/sex_acts/fetishes/sex_acts_fetishes_ageplay.txt" @@ -0,0 +1,6 @@ +diaper +adult baby +loli +shota +toddlercon +mature female \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/sex_acts/fetishes/sex_acts_fetishes_animal_play.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/sex_acts/fetishes/sex_acts_fetishes_animal_play.txt" new file mode 100644 index 0000000000000000000000000000000000000000..eb664b2ddef114fb151cce50cc03186bc54611b4 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/sex_acts/fetishes/sex_acts_fetishes_animal_play.txt" @@ -0,0 +1,4 @@ +leash pull +pet play +human dog +pony play \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/sex_acts/fetishes/sex_acts_fetishes_body_types.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/sex_acts/fetishes/sex_acts_fetishes_body_types.txt" new file mode 100644 index 0000000000000000000000000000000000000000..32427ccb3674d3c5f43cac2cd064ad301d39570f --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/sex_acts/fetishes/sex_acts_fetishes_body_types.txt" @@ -0,0 +1,10 @@ +giant +giantess +miniboy +minigirl +muscular +muscular female +plump +curvy +fat +skinny \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/sex_acts/fetishes/sex_acts_fetishes_bondage.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/sex_acts/fetishes/sex_acts_fetishes_bondage.txt" new file mode 100644 index 0000000000000000000000000000000000000000..c25b01c0259acf73520271eb7b2d705345a79d8a --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/sex_acts/fetishes/sex_acts_fetishes_bondage.txt" @@ -0,0 
+1,28 @@ +bdsm +asphyxiation +strangling +bondage +breast bondage +predicament bondage +shibari +spreader bar +suspension +femdom +assertive female +clothed female nude male +sitting on face +humiliation +body writing +neglect play +public use +tally +rope walking +slave +spanked +torture +clitoris torture +nipple torture +nipple pull +ball busting +tickle torture +wax play \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/sex_acts/fetishes/sex_acts_fetishes_breasts.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/sex_acts/fetishes/sex_acts_fetishes_breasts.txt" new file mode 100644 index 0000000000000000000000000000000000000000..a1afd5dc0bc564919efca2077945a47761395b1a --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/sex_acts/fetishes/sex_acts_fetishes_breasts.txt" @@ -0,0 +1,8 @@ +breasts +breast expansion +gigantic breasts +breast smother +multiple breast smother +lactation +breastfeeding +lactation through clothes \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/sex_acts/fetishes/sex_acts_fetishes_cum_play.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/sex_acts/fetishes/sex_acts_fetishes_cum_play.txt" new file mode 100644 index 0000000000000000000000000000000000000000..6f9c96d70ededa16849958925bee3b48c550fbaa --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/sex_acts/fetishes/sex_acts_fetishes_cum_play.txt" @@ -0,0 +1,31 @@ +cum +bukkake +cumdump +cum bath +cumdrip +cum pool +cum inflation +cum in ass +cum in clothes +cum in cup +cum in mouth +cum in throat +cum on tongue +cum in pussy +cum in urethra +cum on body +cum on hair +cum on ass +cum on pussy +cum on breasts +cum on fingers +cum on clothes +cum on glasses +cum on food +ejaculation +ejaculating while penetrated +pull out +facial +autofacial +felching +gokkun \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/sex_acts/fetishes/sex_acts_fetishes_exhibitionism.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/sex_acts/fetishes/sex_acts_fetishes_exhibitionism.txt" new file mode 100644 index 0000000000000000000000000000000000000000..4449421bf5019e4d3e18288f45f543a3e555d07a --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/sex_acts/fetishes/sex_acts_fetishes_exhibitionism.txt" @@ -0,0 +1,17 @@ +exhibitionism +clothed sex +public indecency +public nudity +zenra +public vibrator +stealth masturbation +stealth sex +cheating +cuckolding +netorare +prostitution +public use +voyeurism +caught +peeping +walk-in \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/sex_acts/fetishes/sex_acts_fetishes_facial_expressions.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/sex_acts/fetishes/sex_acts_fetishes_facial_expressions.txt" new file mode 100644 index 0000000000000000000000000000000000000000..b131a72459d19f95e7b721ab739c1f114397ebd6 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/sex_acts/fetishes/sex_acts_fetishes_facial_expressions.txt" @@ -0,0 +1,5 @@ +:>= +fucked silly +ahegao +torogao +naughty face \ No newline at end of file diff --git 
"a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/sex_acts/fetishes/sex_acts_fetishes_gender_play.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/sex_acts/fetishes/sex_acts_fetishes_gender_play.txt" new file mode 100644 index 0000000000000000000000000000000000000000..9e47fac07caef90719713531f1fe5493011b45bd --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/sex_acts/fetishes/sex_acts_fetishes_gender_play.txt" @@ -0,0 +1,11 @@ +crossdressing +otoko no ko +reverse trap +futanari +implied futanari +full-package futanari +futa with female +futa with futa +futa with male +male futanari +no testicles \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/sex_acts/fetishes/sex_acts_fetishes_incest.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/sex_acts/fetishes/sex_acts_fetishes_incest.txt" new file mode 100644 index 0000000000000000000000000000000000000000..c43755260444503e684949c7e4341ce0d9730cf9 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/sex_acts/fetishes/sex_acts_fetishes_incest.txt" @@ -0,0 +1,2 @@ +incest +twincest \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/sex_acts/fetishes/sex_acts_fetishes_misc.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/sex_acts/fetishes/sex_acts_fetishes_misc.txt" new file mode 100644 index 0000000000000000000000000000000000000000..fc7484b2d2dadcab50ae469299ab79491109a65f --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/sex_acts/fetishes/sex_acts_fetishes_misc.txt" @@ -0,0 +1,42 @@ +anatomy +conjoined +extra arms +extra breasts +extra penises +bestiality +Knotting +condom +used condom on penis +condom left inside +used condom +defloration +nantaimori +nyotaimori +wakamezake +bust cup +female ejaculation +Pussy juice +human furniture +forniphilia +human toilet +Mecha on girl +pregnant +birth +impregnation +unbirthing +selfcest +smelling +smelling feet +smelling clothes +smelling underwear +smelling pantyhose +tentacles +consensual tentacles +tentacle gagged +tentaclejob +tentacle sex +tentacles in thighhighs +tentacles under clothes +tentacles on male +too many tentacles +vore \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/sex_acts/fetishes/sex_acts_fetishes_rape.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/sex_acts/fetishes/sex_acts_fetishes_rape.txt" new file mode 100644 index 0000000000000000000000000000000000000000..ca40c1303b6c543f2407b3446dcc7a646cf3c91d --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/sex_acts/fetishes/sex_acts_fetishes_rape.txt" @@ -0,0 +1,6 @@ +rape +femdom rape +imminent rape +molestation +chikan +compensated molestation \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/sex_acts/fetishes/sex_acts_fetishes_smother.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/sex_acts/fetishes/sex_acts_fetishes_smother.txt" new file mode 100644 index 0000000000000000000000000000000000000000..35f6afbf0f133d3ed2845517ae5b7a7dc40f4239 --- /dev/null +++ 
"b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/sex_acts/fetishes/sex_acts_fetishes_smother.txt" @@ -0,0 +1,4 @@ +smother +asphyxiation +breast smother +sitting on face \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/sex_acts/fetishes/sex_acts_fetishes_views.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/sex_acts/fetishes/sex_acts_fetishes_views.txt" new file mode 100644 index 0000000000000000000000000000000000000000..837f733ea3a5225fa204c963e1fdca1718e06785 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/sex_acts/fetishes/sex_acts_fetishes_views.txt" @@ -0,0 +1,3 @@ +cross-section +internal cumshot +x-ray \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/sex_acts/sex_acts_before.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/sex_acts/sex_acts_before.txt" new file mode 100644 index 0000000000000000000000000000000000000000..bdcfbc6597086dfd36877217f4d54f2766087cd7 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/sex_acts/sex_acts_before.txt" @@ -0,0 +1 @@ +presenting \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/sex_acts/sex_acts_group_sex.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/sex_acts/sex_acts_group_sex.txt" new file mode 100644 index 0000000000000000000000000000000000000000..e19d8861d274afb15f217d18d32273a3fc33f4e4 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/sex_acts/sex_acts_group_sex.txt" @@ -0,0 +1,17 @@ +group sex +bisexual +bisexual female +bisexual male +daisy chain +gangbang +double penetration +triple penetration +love train +cooperative fellatio +cooperative footjob +multiple breast smother +orgy +reverse spitroast +spitroast +teamwork +threesome \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/sex_acts/sex_acts_penetration_and_insertion.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/sex_acts/sex_acts_penetration_and_insertion.txt" new file mode 100644 index 0000000000000000000000000000000000000000..be0c367d545e4a265628ae7cb0a16b5c27cd09fe --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/sex_acts/sex_acts_penetration_and_insertion.txt" @@ -0,0 +1,60 @@ +object insertion +vaginal object insertion +anal object insertion +large insertion +stomach bulge +food insertion +animal insertion +multiple insertions +urethral insertion +sounding +cervical penetration +nipple penetration +nosejob +fingering +anal fingering +prostate milking +fingering through clothes +fingering through panties +implied fingering +fisting +anal fisting +self fisting +sex +after sex +after anal +after buttjob +after fellatio +after fingering +after frottage +after insertion +after masturbation +after oral +after paizuri +after rape +after urethral +after vaginal +afterglow +clothed after sex +anal +double anal +imminent anal +pegging +triple anal +clothed sex +guided penetration +happy sex +imminent penetration +implied sex +navel sex +sex from behind +skull fucking +ear sex +penis in eye +tentacle sex +underwater sex +vaginal +after vaginal +double vaginal +imminent 
vaginal +triple vaginal \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/sex_acts/sex_acts_same_sex.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/sex_acts/sex_acts_same_sex.txt" new file mode 100644 index 0000000000000000000000000000000000000000..22031804f16175107e86223a0c925fa3dbfd8490 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/sex_acts/sex_acts_same_sex.txt" @@ -0,0 +1,9 @@ +yaoi +bara +implied yaoi +bulges touching +penises touching +testicles touching +yuri +implied yuri +tribadism \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/sex_acts/sex_acts_stimulation.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/sex_acts/sex_acts_stimulation.txt" new file mode 100644 index 0000000000000000000000000000000000000000..cf004a24a573aef96025ef8c13531b8920fb76ea --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/sex_acts/sex_acts_stimulation.txt" @@ -0,0 +1,77 @@ +footjob +double footjob +implied footjob +cooperative footjob +shoejob +feet +licking foot +foot worship +smelling feet +frottage +armpit sex +grinding +backjob +buttjob +double grinding +kneepit sex +paizuri +Pecjob +cooperative paizuri +naizuri +paizuri over clothes +paizuri on lap +paizuri under clothes +thigh sex +groping +ass grab +breast grab +guided breast grab +pectoral grab +guided pectoral grab +nipple tweak +crotch grab +guided crotch grab +grabbing own breast +torso grab +hairjob +handjob +caressing testicles +double handjob +cooperative handjob +nursing handjob +reach-around +two-handed handjob +masturbation +clothed masturbation +crotch rub +building sex +pillow sex +teddy bear sex +table sex +implied masturbation +male masturbation +mutual masturbation +stealth masturbation +tail masturbation +tail +tail insertion +tail masturbation +tailjob +oral +anilingus +trombone +breast sucking +cunnilingus +autocunnilingus +implied cunnilingus +fellatio +autofellatio +cum swap +deepthroat +implied fellatio +irrumatio +cooperative fellatio +multiple penis fellatio +sitting on face +testicle sucking +licking testicle \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/sports/sports_equpment.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/sports/sports_equpment.txt" new file mode 100644 index 0000000000000000000000000000000000000000..400dac76fbf1ad3adf7ed375121bb3746de7de2d --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/sports/sports_equpment.txt" @@ -0,0 +1,34 @@ +Ball +Beachball +Bowling ball +Billiard ball +American football +Golf ball +Marble +Rugby ball +Soccer ball +Tennis ball +Jersey +Uniform +Sportswear +Baseball cap +baseball jersey +baseball uniform +basketball uniform +hockey sweater +Tennis uniform +Baseball bat +Bowling glove +Bowling pin +Cue stick +Golf club +Hoop +Juggling club +Pool table +Racket +Badminton racket +Table tennis paddle +Tennis racket +Ribbon +Ribbon baton +Rope \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/sports/sports_list.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/sports/sports_list.txt" new file mode 100644 index 
0000000000000000000000000000000000000000..71d7cf7f378ef4b11f77a38fa2739e986b932bb1 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/sports/sports_list.txt" @@ -0,0 +1,23 @@ +American football +Badminton +Baseball +Basketball +Billiards +Boxing +Bowling +Croquet +Field Hockey +Golf +Gymnastics +Rhythmic Gymnastics +Ice hockey +Kabaddi +Kemari +Lacrosse +Rugby +Sepak Takraw +Soccer +Table tennis +Tennis +Volleyball +Water polo \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/sports/sports_main.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/sports/sports_main.txt" new file mode 100644 index 0000000000000000000000000000000000000000..8432ac833f9795ddd1a8e5fb3cd9f6bbf53a19a5 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/sports/sports_main.txt" @@ -0,0 +1 @@ +Playing sports \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/tree_action.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/tree_action.txt" new file mode 100644 index 0000000000000000000000000000000000000000..7b87d34d9adcb38dcfd1a4a7dd412f05bd6ee497 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/tree_action.txt" @@ -0,0 +1,5 @@ +against tree +in tree +sitting in tree +outdoors +under tree \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/verbs_and_gerunds/verbs_and_gerunds_ambiguous.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/verbs_and_gerunds/verbs_and_gerunds_ambiguous.txt" new file mode 100644 index 0000000000000000000000000000000000000000..9b186729c8dc2734b6cd29db18104bc391dc9bd0 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/verbs_and_gerunds/verbs_and_gerunds_ambiguous.txt" @@ -0,0 +1,12 @@ +bullying +egg laying +pegging +penises touching +pipelining +sounding +symmetrical docking +asymmetrical docking +prostate milking +squatting +taunting +teasing \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/verbs_and_gerunds/verbs_and_gerunds_gerunds.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/verbs_and_gerunds/verbs_and_gerunds_gerunds.txt" new file mode 100644 index 0000000000000000000000000000000000000000..502c0570762b9b0d5e7a329f3236da69eed75f52 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/verbs_and_gerunds/verbs_and_gerunds_gerunds.txt" @@ -0,0 +1,87 @@ +battle +bleeding +blowing +bowling +boxing +mind control +breast padding +building +model building +bullfighting +burning +camping +carving +cheating +clapping +cling +breast clinging +crossdressing +dating +2D dating +dissolving +drawing +child drawing +drowning +face stretching +fading +fencing +fishing +flashing +mooning +streaking +flexing +foreshortening +frottage +goldfish scooping +grilling +hiking +hitchhiking +horseback riding +kidnapping +kiss +necking +knitting +marching +masturbation +misunderstanding +mixed bathing +mooning +painting +drawing on another's face +panicking +pose +pouting +racing +filming +recording +relationshipping +rotting +rowing +scolding +lecturing +sewing +shoplifting +skating +figure skating +ice 
skating +skiing +slapping +smother +stalking +straddling +pillow straddling +thigh straddling +swimming +freediving +tail wagging +theft +tribadism +tucking +unbirthing +voyeurism +wedding +welding +wrestling +arm wrestling +writing +yawning \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/verbs_and_gerunds/verbs_and_gerunds_verbs.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/verbs_and_gerunds/verbs_and_gerunds_verbs.txt" new file mode 100644 index 0000000000000000000000000000000000000000..b5c2eba8c82b4971c8d8fec1d973c06433687f99 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\213\225\344\275\234/verbs_and_gerunds/verbs_and_gerunds_verbs.txt" @@ -0,0 +1,334 @@ +aiming +aiming at viewer +animal milking +apologizing +applying makeup +balancing +bathing +beckoning +begging +biting +ass biting +breast biting +cheek biting +ear biting +finger biting +foreskin biting +glove biting +hand biting +head biting +lip biting +nail biting +penis biting +tail biting +thumb biting +blinking +blocking +bouncing +bouncing breasts +bowing +breaking +breastfeeding +breathing +heavy breathing +panting +bubble blowing +bumping +carrying +catching +charging +chasing +cheering +stroking own chin +cleaning +dishwashing +sweeping +climbing +comforting +concentrating +cooking +baking +chocolate making +roasting +stirring +coughing +covering +cowering +crawling +crushing +crying +sobbing +cuddling +cupping +cutting +wrist cutting +dancing +pole dancing +diaper changing +digging +diving +dodging +dragging +dreaming +dressing +forced dressing +undressing +drinking +drinking +dripping +driving +drooling +dropping +drying +dual wielding +quadruple wielding +triple wielding +dying +ear cleaning +eating +chewing +swallowing +tasting +eavesdropping +elbowing +drawing on another's face +fainting +falling +fanning +fan speaking +fanning crotch +fanning face +feeding +pov feeding +fidgeting +fighting +fingering +anal fingering +firing +fisting +anal fisting +flailing +flapping +flirting +floating +flying +gardening +watering +giving +glaring +glowing +grabbing +grinding +double grinding +groping +hairdressing +brushing hair +hair tucking +hair twirling +hanging +hatching +healing +hiding +hitting +holding +humping +dry humping +imagining +imitating +juggling +jumping +hopping +pouncing +kicking +kneeing +knocking +landing +laughing +giggling +launching +leaning +licking +licking armpit +licking breast +licking another's cheek +licking cum +licking ear +licking eye +licking another's face +licking finger +licking floor +licking foot +licking blade +licking leg +licking navel +licking nipple +licking panties +licking testicle +licking thigh +lifting +looking +looking at viewer +looking away +looking back +looking down +looking up +lying +measuring +bust measuring +penis measuring +waist measuring +melting +mind reading +moaning +multitasking +nose picking +opening +pantsing +patting +peeing +bedwetting +peeking +peeling +peeping +headpat +petting +pinching +cheek pinching +pitching +playing +pointing +pointing at viewer +poking +cheek poking +polishing +pouring +decantering +praying +presenting +programming +protecting +pulling +punching +pushing +rappelling +reaching +reading +reloading +repairing +resisting +resting +riding +broom riding +umbrella riding +ripping +roaring +rolling +rubbing +crotch rub +running +scratching +screaming +searching +shading eyes +shaking +sharing +shaving +shopping 
+shouting +shushing +singing +sinking +sitting +reclining +sketching +skipping +slashing +sleeping +sliding +slipping +smelling +smelling clothes +smelling pantyhose +smelling underwear +smoking +sneezing +snowing +spanking +spilling +spinning +pen spinning +spitting +splashing +spooning +spraying +squeezing +squinting +squirting liquid +stacking +standing +staring +stepping +stomping +strangling +struggling +studying +sucking +blood sucking +breast sucking +self breast sucking +finger sucking +thumb sucking +toe sucking +sulking +summoning +sunbathing +surfing +broom surfing +sky surfing +swaying +swinging +tail fondling +talking +sleep talking +tapping +teaching +text messaging +thinking +throwing +tickling +foot tickling +touching +training +trembling +trolling +tugging +twitching +tying +scarf tying +shoe tying +untying +typing +unsheathing +unzipping +vomiting +wading +skinny dipping +waiting +walking +dog walking +rope walking +wallwalking +warming +washing +body soaping +watching +watching television +waving +whipping +whisking +whispering +whistling +working +wringing +zipping +unzipping \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\234\260\351\273\236/Locations.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\234\260\351\273\236/Locations.txt" new file mode 100644 index 0000000000000000000000000000000000000000..eb9279f41db289b0f8cdf7dbf03b52a3dc9e4e44 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\234\260\351\273\236/Locations.txt" @@ -0,0 +1,40 @@ +In Postapocalyptic wasteland +In Deep forest +On Snowy mountaintop +In Dark cave +In Futuristic city +In Cyberpunk city +In Abandoned factory +In Windy desert +On Tropical island +In Military base +In Underground facility +In Ruined city +On Beautiful lake +In Dreamy landscape +Underwater +In space +On an asteroid +On the Moon +On Mars +Inside spaceship +In Warzone +In Mystical area +In Elegant room +In forest +In Jungle +In Enchanted Forest +In Alleyway +In Spaceship +In Space +Inside Temple +Inside Castle +In Cathedral +In Cyberspace +On Battlefield +On Ice +On Coastline +In Office +In front of Flower shop +In front of Armor shop +In front of Gun shop \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\234\260\351\273\236/adj-architecture.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\234\260\351\273\236/adj-architecture.txt" new file mode 100644 index 0000000000000000000000000000000000000000..6ea4442a6537eec53898bfcbe407c53588af9730 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\234\260\351\273\236/adj-architecture.txt" @@ -0,0 +1,129 @@ +Minimalist +Eclectic architecture +Reflective +Congested +Streamlined +Opaque +Pleasant +Gothic architecture +Gothic Revival architecture +Fabric +Structure +Suspended +Facade +Monolithic +Diminutive +Resemble +Revivalist +Streamline +Economical +Sparse +Luxurious +Compact +Pervasive +Effusive +Tatami +Shoji +Fusuma +Engawa +Genkan +Byzantine architecture +Roman architecture +Romanesque architecture +Islamic architecture +Persian architecture +Renaissance architecture +Baroque architecture +Churrigueresque architecture +Churrigueresco architecture +Rococo architecture +Bauhaus architecture +Art Deco +Art Nouveau +Brutalism architecture +Colonial architecture +Composite architecture +Corinthian architecture +Decorated style +Doric architecture +Early Christian architecture +Early English architecture +Edwardian 
architecture +Empire architecture +Federation architecture +functionalism +Georgian architecture +International Style +Modernist +Ionic +Jacobean architecture +Louis Quatorze +Louis Quinze +Louis Seize +Louis Treize +Mannerism +Moderne +Moorish architecture +Morisco architecture +Mudéjar architecture +Neoclassical +Palladian +perpendicular +postmodernism +post-modernism +Queen-Anne architecture +Regency architecture +Saracen architecture +transition +transitional +Tudor architecture +Tuscan architecture +Victorian architecture +Cambodian Khmer architecture +Buddhist architecture +New Indies architecture +French Colonial architecture +Spanish Colonial architecture +Vernacular +Neo-Tropical +Bilateral Symmetry +Horizontal emphasis +Cosmological concept architecture +Imperial architecture +Religious architecture +Confucius era +Hui Style architecture +Shanxi architecture +Lingnan architecture +Cantonese architecture +Minnan architecture +Hokkien architecture +Hakka architecture +Gan architecture +Yaodong architecture +Tibetan architecture +Xinjiang architecture +Hutong architecture +Longtang architecture +Shikumen architecture +Shanghainese architecture +Meiji architecture +Taishō architecture +Shōwa architecture +Heisei architecture +Meiji period +Taishō period +Shōwa period +Heisei period +Japanese aesthetic +Chinese aesthetic +Korean aesthetic +Russian aesthetic +Eastern Roman +Pagan Roman +Kievan Rus architecture +Muscovite architecture +Imperial Russia architecture +Soviet Union architecture +Postwar Soviet architecture +Modern Russian architecture diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\234\260\351\273\236/locations/buildings/locations_buildings.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\234\260\351\273\236/locations/buildings/locations_buildings.txt" new file mode 100644 index 0000000000000000000000000000000000000000..c2b6057ee9cbed719f508982026fed00f4b83aaf --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\234\260\351\273\236/locations/buildings/locations_buildings.txt" @@ -0,0 +1,86 @@ +airport +control tower +hangar +runway +apartment +aquarium +arcade +bar +izakaya +tavern +barn +bunker +bus stop +bowling alley +cafe +casino +castle +floating castle +church +construction site +convention +garage +gas station +gazebo +greenhouse +gym +hospital +house +hotel +hut +industrial +factory +refinery +powerplant +nuclear powerplant +cooling tower +warehouse +library +lighthouse +mall +megastructure +space elevator +military base +mosque +museum +art gallery +nightclub +observatory +office +onsen +pagoda +planetarium +prison +restaurant +rooftop +ruins +school +skating rink +shack +shrine +shop +bakery +bookstore +convenience store +flower shop +pharmacy +salon +sex shop +supermarket +weapon shop +skyscraper +stadium +arena +stilt house +temple +theater +amphitheater +movie theater +tower +bell tower +clock tower +guard tower +minaret +train station +treehouse +windmill \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\234\260\351\273\236/locations/buildings/locations_buildings_main.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\234\260\351\273\236/locations/buildings/locations_buildings_main.txt" new file mode 100644 index 0000000000000000000000000000000000000000..dbbef1748a01b6b93187facd78d9a927f0a16321 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\234\260\351\273\236/locations/buildings/locations_buildings_main.txt" @@ -0,0 
+1 @@ +Building \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\234\260\351\273\236/locations/indoors/locations_indoors.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\234\260\351\273\236/locations/indoors/locations_indoors.txt" new file mode 100644 index 0000000000000000000000000000000000000000..c32b1e877605012355d9126c78b98c1a935aefd6 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\234\260\351\273\236/locations/indoors/locations_indoors.txt" @@ -0,0 +1,36 @@ +bathroom +bathtub +toilet stall +shower +bedroom +hotel room +messy room +otaku room +cafeteria +changing room +classroom +clubroom +conservatory +courtroom +dining room +dressing room +dungeon +prison cell +fitting room +gym +locker room +gym storeroom +infirmary +kitchen +laboratory +library +living room +office +cubicle +pool +stage +staff room +storage room +armory +closet +workshop \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\234\260\351\273\236/locations/indoors/locations_indoors_main.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\234\260\351\273\236/locations/indoors/locations_indoors_main.txt" new file mode 100644 index 0000000000000000000000000000000000000000..e6f70ffa61c8c1b58685cefba5e34fbebd8d5a12 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\234\260\351\273\236/locations/indoors/locations_indoors_main.txt" @@ -0,0 +1 @@ +Indoors \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\234\260\351\273\236/locations/locations_other.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\234\260\351\273\236/locations/locations_other.txt" new file mode 100644 index 0000000000000000000000000000000000000000..ef139264f97b2c28d5e41f124052353d7d26a895 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\234\260\351\273\236/locations/locations_other.txt" @@ -0,0 +1,16 @@ +architecture +cityscape +landscape +space +asteroid +moon +planet +space station +vehicle interior +airplane interior +bus interior +car interior +cockpit +spacecraft interior +tank interior +train interior \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\234\260\351\273\236/locations/outdoors/locations_outdoors_main.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\234\260\351\273\236/locations/outdoors/locations_outdoors_main.txt" new file mode 100644 index 0000000000000000000000000000000000000000..80805b450d67df71c8785b7a8386708a62766d2a --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\234\260\351\273\236/locations/outdoors/locations_outdoors_main.txt" @@ -0,0 +1 @@ +Outdoors \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\234\260\351\273\236/locations/outdoors/locations_outdoors_man_made.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\234\260\351\273\236/locations/outdoors/locations_outdoors_man_made.txt" new file mode 100644 index 0000000000000000000000000000000000000000..3449944f017dad887a45588c4a5a8e54d174faa4 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\234\260\351\273\236/locations/outdoors/locations_outdoors_man_made.txt" @@ -0,0 +1,53 @@ +airfield +runway +amusement park +carousel +ferris wheel +roller coaster +aqueduct +bridge +rope bridge +wooden bridge +canal +city +floating city +rural +town +village +dam +dock 
+drydock +shipyard +garden +graveyard +jetty +field +flower field +rice paddy +wheat field +fountain +harbor +market +market stall +pier +phone booth +pool +poolside +waterpark +railroad tracks +railroad crossing +road +alley +crosswalk +dirt road +highway +path +sidewalk +stone walkway +street +soccer field +running track +trench +tunnel +well +zoo \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\234\260\351\273\236/locations/outdoors/locations_outdoors_natural.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\234\260\351\273\236/locations/outdoors/locations_outdoors_natural.txt" new file mode 100644 index 0000000000000000000000000000000000000000..7b92d3edba3cd8a3b77314535db532b0f5b75c20 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\234\260\351\273\236/locations/outdoors/locations_outdoors_natural.txt" @@ -0,0 +1,34 @@ +beach +shore +canyon +cave +cliff +desert +oasis +forest +bamboo forest +glacier +hill +island +floating island +jungle +meadow +mountain +volcano +nature +park +playground +parking lot +plain +savannah +wetland +water +geyser +lake +ocean +ocean bottom +pond +river +stream +waterfall +wasteland \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\234\260\351\273\236/sexual_positions/sexual_positions_bondage.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\234\260\351\273\236/sexual_positions/sexual_positions_bondage.txt" new file mode 100644 index 0000000000000000000000000000000000000000..6d1361d636ec349ebc542a2760bca09f62369851 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\234\260\351\273\236/sexual_positions/sexual_positions_bondage.txt" @@ -0,0 +1,17 @@ +box tie +bound arms +bound feet +bound legs +bound calves +bound knees +bound thighs +bound torso +bound wrists +frogtie +hogtie +separated arms +separated legs +separated wrists +shrimp tie +strappado +suspension \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\234\260\351\273\236/sexual_positions/sexual_positions_main.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\234\260\351\273\236/sexual_positions/sexual_positions_main.txt" new file mode 100644 index 0000000000000000000000000000000000000000..a5addb6aef414cf4ab51e8836436abf5b07101a7 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\234\260\351\273\236/sexual_positions/sexual_positions_main.txt" @@ -0,0 +1,22 @@ +doggystyle +top-down bottom-up +spitroast +spooning +girl on top +amazon position +cowgirl position +reverse cowgirl position +reverse upright straddle +upright straddle +boy on top +anvil position +folded +knees to chest +legs over head +legs up +missionary +mating press +suspended congress +reverse suspended congress +full nelson +piledriver diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/attire/attire_headwear.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/attire/attire_headwear.txt" new file mode 100644 index 0000000000000000000000000000000000000000..b405aae427e305326f9a25ca0990e465d6a0c5fa --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/attire/attire_headwear.txt" @@ -0,0 +1,15 @@ +balaclava +crown +hair bow +hair ribbon +hairband +headband +forehead protector +sweatband +hachimaki +mongkhon +headdress +maid headdress +tiara +veil +honggaitou \ No newline at end of 
file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/attire/attire_jewelry_and_accessories/attire_jewelry_and_accessories_head_and_face.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/attire/attire_jewelry_and_accessories/attire_jewelry_and_accessories_head_and_face.txt" new file mode 100644 index 0000000000000000000000000000000000000000..faeafb9b734c872f36baf576ea6fdb6ffbc818b4 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/attire/attire_jewelry_and_accessories/attire_jewelry_and_accessories_head_and_face.txt" @@ -0,0 +1,12 @@ +earrings +hoop earrings +stud earrings +earclip +glasses +monocle +hair ornament +hairclip +hairpin +mask +plague doctor mask +surgical mask \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/attire/attire_jewelry_and_accessories/attire_jewelry_and_accessories_limbs.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/attire/attire_jewelry_and_accessories/attire_jewelry_and_accessories_limbs.txt" new file mode 100644 index 0000000000000000000000000000000000000000..77141889a7f329c6592dcef30d2a819d7ccda3c0 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/attire/attire_jewelry_and_accessories/attire_jewelry_and_accessories_limbs.txt" @@ -0,0 +1,29 @@ +anklet +arm belt +armband +armlet +bracelet +bangle +spiked bracelet +bridal gauntlets +detached sleeves +arm warmers +fingernails +gloves +boxing gloves +elbow gloves +fingerless gloves +bridal gauntlets +spiked gloves +yugake +mittens +leg belt +ring +claw ring +wedding ring +wedding ring +thighlet +wide sleeves +wristband +wrist cuffs +wrist scrunchie \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/attire/attire_jewelry_and_accessories/attire_jewelry_and_accessories_neck_and_shoulders.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/attire/attire_jewelry_and_accessories/attire_jewelry_and_accessories_neck_and_shoulders.txt" new file mode 100644 index 0000000000000000000000000000000000000000..4c4ba7117387b637b482cf9cc925be0165fed393 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/attire/attire_jewelry_and_accessories/attire_jewelry_and_accessories_neck_and_shoulders.txt" @@ -0,0 +1,17 @@ +ascot +bowtie +choker +collar +epaulettes +feather boa +lapels +lapel pin +neck ruff +neckerchief +necklace +necktie +tie clip +neck ribbon +scarf +shawl +stole \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/attire/attire_jewelry_and_accessories/attire_jewelry_and_accessories_torso_and_misc.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/attire/attire_jewelry_and_accessories/attire_jewelry_and_accessories_torso_and_misc.txt" new file mode 100644 index 0000000000000000000000000000000000000000..4638299e1672aa25d2cccfd52ac2f19b197ccc5d --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/attire/attire_jewelry_and_accessories/attire_jewelry_and_accessories_torso_and_misc.txt" @@ -0,0 +1,20 @@ +aiguillette +badge +belly chain +belt +buckle +sam browne belt +boutonniere +brooch +buttons +large buttons +button badge +corsage +pentacle +piercing +sarong +suspenders +tassel +watch 
+pocket watch +zipper \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/attire/attire_legs_and_feet.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/attire/attire_legs_and_feet.txt" new file mode 100644 index 0000000000000000000000000000000000000000..e47dd4be39112e7bfa0abdd0ae62122966767915 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/attire/attire_legs_and_feet.txt" @@ -0,0 +1,18 @@ +garter belt +garter straps +legwear garter +kneehighs +leggings +leg warmers +over-kneehighs +pantyhose +thighband pantyhose +socks +ankle socks +bobby socks +loose socks +tabi +toe socks +tube socks +thighhighs +thigh strap \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/attire/attire_other.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/attire/attire_other.txt" new file mode 100644 index 0000000000000000000000000000000000000000..bbb655b7f622cb3ac3f878a806a2d324814a3340 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/attire/attire_other.txt" @@ -0,0 +1,12 @@ +dress flower +flower trim +frills +fur trim +gathers +gold trim +lace trim +ribbon trim +see-through +silver trim +taut shirt +torn clothes \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/attire/attire_pants_and_bottomwear.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/attire/attire_pants_and_bottomwear.txt" new file mode 100644 index 0000000000000000000000000000000000000000..35260324e6b6fce9a66ac70984c8ede1f5c864fa --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/attire/attire_pants_and_bottomwear.txt" @@ -0,0 +1,36 @@ +bloomers +buruma +chaps +kilt +pants +bell-bottoms +capri pants +detached pants +jeans +cutoff jeans +lowleg pants +pants rolled up +pelvic curtain +petticoat +sarong +shorts +bike shorts +denim shorts +dolphin shorts +gym shorts +lowleg shorts +micro shorts +short shorts +shorts under skirt +skirt +bubble skirt +high-waist skirt +high-low skirt +long skirt +lowleg skirt +microskirt +miniskirt +overall skirt +plaid skirt +pleated skirt +suspender skirt \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/attire/attire_shirts_and_topwear.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/attire/attire_shirts_and_topwear.txt" new file mode 100644 index 0000000000000000000000000000000000000000..298b7e13aca58ab027c9bbd05cee0b954bf99294 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/attire/attire_shirts_and_topwear.txt" @@ -0,0 +1,66 @@ +blouse +frilled shirt +sleeveless shirt +bustier +crop top +camisole +cardigan +cardigan vest +coat +duffel coat +fur coat +fur-trimmed coat +long coat +overcoat +peacoat +raincoat +yellow raincoat +transparent raincoat +trench coat +winter coat +compression shirt +corset +dress +halterneck +criss-cross halter +halterneck +hoodie +jacket +blazer +cropped jacket (bolero) +letterman jacket +safari jacket +suit jacket +sukajan +trench coat +poncho +raglan sleeves +sash +shoulder sash +stole +shirt +collared shirt +dress shirt +off-shoulder shirt +sleeveless shirt +striped shirt +t-shirt +shrug +surcoat +sweater +pullover 
+turtleneck +sleeveless turtleneck +sweater dress +ribbed sweater +aran sweater +tabard +tailcoat +tank top +stringer +tube top +bandeau +underbust +vest +sweater vest +waistcoat \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/attire/attire_shoes_and_footwear.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/attire/attire_shoes_and_footwear.txt" new file mode 100644 index 0000000000000000000000000000000000000000..7b861399920eec2cf4fe1aecddc2419e24343a0b --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/attire/attire_shoes_and_footwear.txt" @@ -0,0 +1,44 @@ +boots +ankle boots +armored boots +knee boots +high heel boots +lace-up boots +rubber boots +thigh boots +cowboy boots +spurs +cross-laced footwear +flats +footwear ribbon +high heels +loafers +kiltie loafers +mary janes +platform footwear +pointy footwear +pumps +sandals +clog sandals +cross-laced sandals +flip-flops +gladiator sandals +geta +okobo +waraji +zouri +oxfords +saddle shoes +monk shoes +slippers +animal slippers +ballet slippers +crocs +uwabaki +sneakers +high tops +converse +toeless footwear +uwabaki +winged footwear +wedge heels \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/attire/attire_styles_and_patterns/attire_styles_and_patterns_patterns.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/attire/attire_styles_and_patterns/attire_styles_and_patterns_patterns.txt" new file mode 100644 index 0000000000000000000000000000000000000000..3dbaa386206aec8685f5486ed712e3cb03b4cb39 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/attire/attire_styles_and_patterns/attire_styles_and_patterns_patterns.txt" @@ -0,0 +1,10 @@ +argyle +camouflage +checkered +floral print +pinstripe pattern +plaid +polka dot +striped +multicolored stripes +double vertical stripe \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/attire/attire_styles_and_patterns/attire_styles_and_patterns_prints.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/attire/attire_styles_and_patterns/attire_styles_and_patterns_prints.txt" new file mode 100644 index 0000000000000000000000000000000000000000..41d3a05619e602fbf3b3b3a1247815d184f2dc86 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/attire/attire_styles_and_patterns/attire_styles_and_patterns_prints.txt" @@ -0,0 +1,40 @@ +animal print +bat print +bear print +butterfly print +cow print +leopard print +tiger print +snake print +clover print +crescent print +floral print +rose print +cherry blossom print +sunflower print +morning glory print +food print +fruit pattern +apple print +blueberry print +cherry print +kiwi print +lemon print +pineapple print +orange print +strawberry print +watermelon print +leaf print +maple leaf print +moon print +musical note print +paw print +petal print +piano print +sparkle print +triangle print +space print +starry sky print +star print +wave print +wing print \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/attire/attire_swimsuits_and_bodysuits.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/attire/attire_swimsuits_and_bodysuits.txt" new file 
mode 100644 index 0000000000000000000000000000000000000000..70c98b2cc8f857e63e275f8f651696aba982b572 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/attire/attire_swimsuits_and_bodysuits.txt" @@ -0,0 +1,36 @@ +bikesuit +racing suit +bodystocking +bodysuit +jumpsuit +short jumpsuit +leotard +strapless leotard +playboy bunny +swimsuit +competition swimsuit +slingshot swimsuit +school swimsuit +bikini +leaf bikini +string bikini +micro bikini +side-tie bikini bottom +lowleg bikini +thong bikini +venus bikini +sports bikini +tankini +criss-cross halter +swim briefs +jammers +legskin +rash guard +robe +bathrobe +open robe +kesa +romper +sarong +tunic +unitard \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/attire/attire_traditional_clothing.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/attire/attire_traditional_clothing.txt" new file mode 100644 index 0000000000000000000000000000000000000000..22a18844b20e87b6dc252f6db40a6c060e2da06f --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/attire/attire_traditional_clothing.txt" @@ -0,0 +1,42 @@ +chinese clothes +changpao +china dress +fengguan +hanfu +longpao +tangzhuang +dirndl +japanese clothes +fundoshi +yamakasa +geta +hakama +hakama skirt +hakama short skirt +hakama pants +kimono +furisode +layered kimono +short kimono +uchikake +yukata +haori +happi +chanchanko +dotera +hanten +kimono skirt +miko +nontraditional miko +sarashi +Midriff sarashi +Chest sarashi +Budget sarashi +Undone sarashi +straw cape +mino boushi +tabi +tasuki +korean clothes +hanbok +vietnamese dress \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/attire/attire_uniforms_and_costumes.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/attire/attire_uniforms_and_costumes.txt" new file mode 100644 index 0000000000000000000000000000000000000000..bfcddd4f0d56333ac39a734fe9dd69249599b0b1 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/attire/attire_uniforms_and_costumes.txt" @@ -0,0 +1,64 @@ +apron +armor +armored dress +bikini armor +band uniform +cape +capelet +hood +shoulder cape +cassock +cheerleader +costume +ghost costume +gym uniform +buruma +habit +harem outfit +loincloth +hazmat suit +hev suit +kigurumi +animal costume +bear costume +boar costume +cat costume +cow costume +dog costume +monkey costume +mouse costume +panda costume +penguin costume +pig costume +rabbit costume +reindeer costume +seal costume +sheep costume +tiger costume +maid +miko +nontraditional miko +military uniform +overalls +pajamas +pilot suit +plugsuit +sailor +santa costume +school uniform +serafuku +sailor dress +gakuran +meiji schoolgirl uniform +shosei +suit +business suit +pant suit +skirt suit +tuxedo +track suit +sweatpants +sweatshirt +tutu +waitress +cowboy western \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/bra/bra_colors.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/bra/bra_colors.txt" new file mode 100644 index 0000000000000000000000000000000000000000..07fa0cd32d935301d8c710f8557f50aea9ee7321 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/bra/bra_colors.txt" @@ -0,0 +1,12 @@ +aqua bra +black 
bra +blue bra +brown bra +green bra +grey bra +orange bra +pink bra +purple bra +red bra +white bra +yellow bra \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/bra/bra_main.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/bra/bra_main.txt" new file mode 100644 index 0000000000000000000000000000000000000000..4e77327a92f76e9fd95a1c1d23ddd557d3354aab --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/bra/bra_main.txt" @@ -0,0 +1 @@ +bra \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/bra/bra_misc.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/bra/bra_misc.txt" new file mode 100644 index 0000000000000000000000000000000000000000..f34e13a0ee63582e0a9a8720cc49332b09a75be7 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/bra/bra_misc.txt" @@ -0,0 +1,11 @@ +adjusting bra +bra lift +bra on ass +bra on head +bra pull +bra removed +bra strap +holding bra +no bra +open bra +torn bra \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/bra/bra_models.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/bra/bra_models.txt" new file mode 100644 index 0000000000000000000000000000000000000000..a511f6275aa0d04c79009de184cc7daa5f39c9fe --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/bra/bra_models.txt" @@ -0,0 +1,12 @@ +adhesive bra +beltbra +bow bra +bridgeless bra +cupless bra +frilled bra +shelf bra +sports bra +strapless bra +training bra +front-tie bra +front-hook bra \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/bra/bra_multiple_colors.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/bra/bra_multiple_colors.txt" new file mode 100644 index 0000000000000000000000000000000000000000..52fb22985dcb5ebd5b7f0535b5a67950cd379a6b --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/bra/bra_multiple_colors.txt" @@ -0,0 +1 @@ +multicolored bra \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/bra/bra_patterns_and_prints.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/bra/bra_patterns_and_prints.txt" new file mode 100644 index 0000000000000000000000000000000000000000..056a4ada2c1eae97f0a60bf10555fb4e0c4c2ee7 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/bra/bra_patterns_and_prints.txt" @@ -0,0 +1,9 @@ +checkered bra +lace bra +lace-trimmed bra +plaid bra +polka dot bra +print bra +strawberry bra +striped bra +vertical-striped bra \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/dress/appearance/dress_appearance_colors.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/dress/appearance/dress_appearance_colors.txt" new file mode 100644 index 0000000000000000000000000000000000000000..674e3b137ed33e4b335a42aa32acded46a801638 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/dress/appearance/dress_appearance_colors.txt" @@ -0,0 +1,13 @@ +aqua dress 
+black dress +blue dress +brown dress +green dress +grey dress +orange dress +pink dress +purple dress +red dress +sepia dress +white dress +yellow dress \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/dress/appearance/dress_appearance_models.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/dress/appearance/dress_appearance_models.txt" new file mode 100644 index 0000000000000000000000000000000000000000..83fb7963b211cbec1a51b1b531308237386378ad --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/dress/appearance/dress_appearance_models.txt" @@ -0,0 +1,24 @@ +cake dress +china dress +coat dress +cocktail dress +denim dress +dirndl +evening gown +flowing dress +funeral dress +gown +mermaid dress +negligee +nightgown +pencil dress +pinafore dress +sailor dress +santa dress +sundress +sweater dress +tennis dress +trapeze dress +tube dress +vietnamese dress +wedding dress \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/dress/appearance/dress_appearance_multiple_colors.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/dress/appearance/dress_appearance_multiple_colors.txt" new file mode 100644 index 0000000000000000000000000000000000000000..0052bbd317cef7f11f06e0f73515e0ef5dbd7e8d --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/dress/appearance/dress_appearance_multiple_colors.txt" @@ -0,0 +1,2 @@ +multicolored dress +two-tone dress \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/dress/appearance/dress_appearance_other.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/dress/appearance/dress_appearance_other.txt" new file mode 100644 index 0000000000000000000000000000000000000000..93d7df64fa17fb186acd42b3f6e4a22e68f2c122 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/dress/appearance/dress_appearance_other.txt" @@ -0,0 +1,26 @@ +armored dress +backless dress +collared dress +frilled dress +fur-trimmed dress +half-dress +halter dress +highleg dress +high-low skirt +hobble dress +impossible dress +lace-trimmed dress +latex dress +layered dress +long dress +off-shoulder dress +pleated dress +plunging neckline +ribbed dress +ribbon-trimmed dress +short dress +side slit +taut dress +see-through dress +sleeveless dress +strapless dress \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/dress/appearance/dress_appearance_patterns_and_prints.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/dress/appearance/dress_appearance_patterns_and_prints.txt" new file mode 100644 index 0000000000000000000000000000000000000000..1bf59fba7f467ad520bf38917bf055978c756530 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/dress/appearance/dress_appearance_patterns_and_prints.txt" @@ -0,0 +1,9 @@ +argyle dress +checkered dress +flag dress +plaid dress +polka dot dress +print dress +striped dress +vertical-striped dress +pinstripe dress \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/dress/dress_misc.txt" 
"b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/dress/dress_misc.txt" new file mode 100644 index 0000000000000000000000000000000000000000..ebbdd07dc1e8a985de3f6475912bb3a22498d26a --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/dress/dress_misc.txt" @@ -0,0 +1,11 @@ +crinoline +dress flip +dress grab +dress lift +dress pull +dress tug +open dress +skirt basket +tied dress +torn dress +wet dress \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/headwear/hats/headwear_hats_brimless.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/headwear/hats/headwear_hats_brimless.txt" new file mode 100644 index 0000000000000000000000000000000000000000..473d5faef49ca20e6da400a9252e01386e4ea2ac --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/headwear/hats/headwear_hats_brimless.txt" @@ -0,0 +1,37 @@ +baseball cap +budenovka +cabbie hat +deerstalker +kepi +mao cap +peaked cap +shako capbashlik +dixie cup hat +dunce cap +chef hat +flat top chef hat +toque blanche +beret +beanie +fez hat +fur hat +bearskin cap +papakha +ushanka +garrison cap +kippah +mian guan +mitre +mortarboard +nightcap +nurse cap +party hat +sailor hat +sajkaca +santa hat +mini santa hat +songkok +tam o' shanter +tate eboshi +tokin hat +tsunokakushi \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/headwear/hats/headwear_hats_misc.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/headwear/hats/headwear_hats_misc.txt" new file mode 100644 index 0000000000000000000000000000000000000000..96ff5e4ede3cca01fc3b397543e4768cf9b240cb --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/headwear/hats/headwear_hats_misc.txt" @@ -0,0 +1,14 @@ +animal hat +hat with ears +deviruchi hat +bear hat +bunny hat +cat hat +dog hat +fox hat +pelt +penguin hat +wolf hat +eggshell hat +pumpkin hat +Helmets \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/headwear/hats/headwear_hats_with_brim.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/headwear/hats/headwear_hats_with_brim.txt" new file mode 100644 index 0000000000000000000000000000000000000000..5c60b460e401c51c3e4efb53fbbacff84efe3fdc --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/headwear/hats/headwear_hats_with_brim.txt" @@ -0,0 +1,28 @@ +bicorne +boater hat +bowler hat +bucket hat +campaign hat +cloche hat +cowboy hat +straw hat +ajirogasa +rice hat +fedora +female service cap +flat cap +gat +mob cap +pillow hat +pirate hat +pith helmet +porkpie hat +shampoo hat +sombrero +sun hat +top hat +mini top hat +tricorne +witch hat +mini witch hat +wizard hat \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/headwear/hats/headwear_hats_with_ear_flaps.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/headwear/hats/headwear_hats_with_ear_flaps.txt" new file mode 100644 index 0000000000000000000000000000000000000000..765ae51425b53b7553a6af53435bc7a3ae5d312f --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/headwear/hats/headwear_hats_with_ear_flaps.txt" @@ -0,0 
+1,4 @@ +deerstalker +budenovka +papakha +ushanka \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/headwear/hats/headwear_hats_with_visor.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/headwear/hats/headwear_hats_with_visor.txt" new file mode 100644 index 0000000000000000000000000000000000000000..32ab18e8bf27f22c14b06ce8c6e1ac2344e10181 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/headwear/hats/headwear_hats_with_visor.txt" @@ -0,0 +1,8 @@ +baseball cap +budenovka +cabbie hat +deerstalker +kepi +mao cap +peaked cap +shako cap \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/headwear/headwear_accessories.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/headwear/headwear_accessories.txt" new file mode 100644 index 0000000000000000000000000000000000000000..2750539ec4ab446123193cc9968d5b7faf5a4b5f --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/headwear/headwear_accessories.txt" @@ -0,0 +1,18 @@ +adrian helmet +american football helmet +baseball helmet +bicycle helmet +brodie helmet +diving helmet +dragoon helmet +hardhat +headlamp +helm +horned helmet +kabuto +motorcycle helmet +pickelhaube +pith helmet +stahlhelm +tank helmet +winged helmet \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/headwear/headwear_actions.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/headwear/headwear_actions.txt" new file mode 100644 index 0000000000000000000000000000000000000000..10b8267b2befdf00796d98d2c1849c7f437c0538 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/headwear/headwear_actions.txt" @@ -0,0 +1,18 @@ +adjusting headwear +backwards hat +sideways hat +hat removed +hand on headwear +hands on headwear +hat basket +hat loss +hat on chest +hat over eyes +hat over one eye +hat removed +hat tip +holding hat +torn hat +no headwear +putting on headwear +tilted headwear \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/headwear/headwear_colors.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/headwear/headwear_colors.txt" new file mode 100644 index 0000000000000000000000000000000000000000..c4660d7ab55fd3fbf9e18c43459c3b5010eca3ae --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/headwear/headwear_colors.txt" @@ -0,0 +1,11 @@ +black headwear +blue headwear +brown headwear +green headwear +grey headwear +orange headwear +pink headwear +purple headwear +red headwear +white headwear +yellow headwear \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/headwear/headwear_crowns.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/headwear/headwear_crowns.txt" new file mode 100644 index 0000000000000000000000000000000000000000..cc589f52877afd7af3028455990c6396092e747e --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/headwear/headwear_crowns.txt" @@ -0,0 +1,6 @@ +circlet +diadem +mian guan +mini crown +saishi +tiara \ No newline at end of file diff --git 
"a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/headwear/headwear_helmets.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/headwear/headwear_helmets.txt" new file mode 100644 index 0000000000000000000000000000000000000000..2750539ec4ab446123193cc9968d5b7faf5a4b5f --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/headwear/headwear_helmets.txt" @@ -0,0 +1,18 @@ +adrian helmet +american football helmet +baseball helmet +bicycle helmet +brodie helmet +diving helmet +dragoon helmet +hardhat +headlamp +helm +horned helmet +kabuto +motorcycle helmet +pickelhaube +pith helmet +stahlhelm +tank helmet +winged helmet \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/headwear/headwear_non_specific_styles.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/headwear/headwear_non_specific_styles.txt" new file mode 100644 index 0000000000000000000000000000000000000000..1812d56a82e285cc3b1541427795329811dbc9a8 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/headwear/headwear_non_specific_styles.txt" @@ -0,0 +1,9 @@ +checkered headwear +frilled hat +military hat +mini hat +multicolored headwear +police hat +print headwear +school hat +straw hat \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/headwear/headwear_not_headwear.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/headwear/headwear_not_headwear.txt" new file mode 100644 index 0000000000000000000000000000000000000000..8978052c5d0803468ef943920d2f5caeb48031b8 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/headwear/headwear_not_headwear.txt" @@ -0,0 +1,12 @@ +hair bow +hair ribbon +hairband +headband +forehead protector +sweatband +hachimaki +nejiri hachimaki +mongkhon +headdress +maid headdress +veil \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/headwear/headwear_other.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/headwear/headwear_other.txt" new file mode 100644 index 0000000000000000000000000000000000000000..473398218b1141373e918d207100e1ae02806f16 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/headwear/headwear_other.txt" @@ -0,0 +1,24 @@ +aviator cap +balaclava +bandana +bonnet +dalachi +habit +head scarf +hijab +jester cap +keffiyeh +okosozukin +shower cap +shufa guan +visor cap +veil +object on head +bowl hat +bloomers on head +book on head +bra on head +bucket on head +food on head +panties on head +traffic cone on head \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/headwear/headwear_types.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/headwear/headwear_types.txt" new file mode 100644 index 0000000000000000000000000000000000000000..429bb577c85d0041c73b4b467ec7405259c1ba34 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/headwear/headwear_types.txt" @@ -0,0 +1,4 @@ +crown +hat +helmet +Colors \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/legwear/legwear_action.txt" 
"b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/legwear/legwear_action.txt" new file mode 100644 index 0000000000000000000000000000000000000000..3bab988efb218041ff673bf2d8d02a4f8161cf06 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/legwear/legwear_action.txt" @@ -0,0 +1,5 @@ +adjusting legwear +holding legwear +pulling off legwear +putting on legwear +removing legwear \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/legwear/legwear_bands.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/legwear/legwear_bands.txt" new file mode 100644 index 0000000000000000000000000000000000000000..037eb1df012d9917b86d2677718a8c24fe27a136 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/legwear/legwear_bands.txt" @@ -0,0 +1,6 @@ +animal ear legwear +bat legwear +bear band legwear +bunny ear legwear +cat ear legwear +horn band legwear \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/legwear/legwear_colors.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/legwear/legwear_colors.txt" new file mode 100644 index 0000000000000000000000000000000000000000..6f1b7f4297763af8743731292014d6757173927c --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/legwear/legwear_colors.txt" @@ -0,0 +1,15 @@ +aqua legwear +beige legwear +black legwear +blue legwear +brown legwear +green legwear +grey legwear +lavender legwear +orange legwear +pink legwear +purple legwear +red legwear +tan legwear +white legwear +yellow legwear \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/legwear/legwear_main.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/legwear/legwear_main.txt" new file mode 100644 index 0000000000000000000000000000000000000000..4b34ada851423e6d8e9b207224f0667e6f2a6cdf --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/legwear/legwear_main.txt" @@ -0,0 +1,7 @@ +socks +kneehighs +over-kneehighs +thighhighs +pantyhose +leggings +detached leggings \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/legwear/legwear_misc.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/legwear/legwear_misc.txt" new file mode 100644 index 0000000000000000000000000000000000000000..176fa42d59fac28956e954c022086d75a74902ef --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/legwear/legwear_misc.txt" @@ -0,0 +1,13 @@ +alternate legwear +uneven legwear +dirty legwear +in legwear +layered legwear +legwear past ankles +legwear removed +legwear under shorts +mismatched legwear +no legwear +taut legwear +torn legwear +wet legwear \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/legwear/legwear_multiple_colors.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/legwear/legwear_multiple_colors.txt" new file mode 100644 index 0000000000000000000000000000000000000000..7948a089d68230088151c407df575bb55d102d3c --- /dev/null +++ 
"b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/legwear/legwear_multiple_colors.txt" @@ -0,0 +1,3 @@ +gradient legwear +multicolored legwear +two-tone legwear \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/legwear/legwear_pattern.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/legwear/legwear_pattern.txt" new file mode 100644 index 0000000000000000000000000000000000000000..090348e0206308e8880f7ec47949d49b61801d35 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/legwear/legwear_pattern.txt" @@ -0,0 +1,13 @@ +american flag legwear +argyle legwear +camouflage legwear +checkered legwear +gingham legwear +plaid legwear +polka dot legwear +print legwear +rainbow legwear +striped legwear +diagonal-striped legwear +vertical-striped legwear +pinstripe legwear \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/legwear/legwear_style.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/legwear/legwear_style.txt" new file mode 100644 index 0000000000000000000000000000000000000000..d9cc6c6387d1231df55dfa282ccd6b3632a1428e --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/legwear/legwear_style.txt" @@ -0,0 +1,33 @@ +aran legwear +armored legwear +bow legwear +cross-laced legwear +fishnet legwear +fluffy legwear +frilled legwear +knit legwear +lace legwear +lace-up legwear +latex legwear +legwear bell +o-ring legwear +pleated legwear +ribbed legwear +ribbon legwear +seamed legwear +back-seamed legwear +front-seamed legwear +side-seamed legwear +see-through legwear +shiny legwear +side-tie legwear +spiked legwear +studded legwear +toeless legwear +bridal legwear +stirrup legwear +trimmed legwear +fur-trimmed legwear +lace-trimmed legwear +ribbon-trimmed legwear +zipper legwear \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/new.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/new.txt" new file mode 100644 index 0000000000000000000000000000000000000000..471bb1ed5c2783d8ed213083ca9ce9a51c3c0bfa --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/new.txt" @@ -0,0 +1,10 @@ +horns +demon girl +demon horns +tattoo +heart earrings +strapless +nail polish +silk +red nails +jewelry \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/panties/and_body_parts/panties_and_body_parts_breasts.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/panties/and_body_parts/panties_and_body_parts_breasts.txt" new file mode 100644 index 0000000000000000000000000000000000000000..eca174d53acc231a2fc324cc0c716d5c8d8df15d --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/panties/and_body_parts/panties_and_body_parts_breasts.txt" @@ -0,0 +1 @@ +panties on breasts \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/panties/and_body_parts/panties_and_body_parts_hands.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/panties/and_body_parts/panties_and_body_parts_hands.txt" new file mode 100644 index 
0000000000000000000000000000000000000000..ad4a64409035345ad12473d2b25bfdebf5f57f8d --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/panties/and_body_parts/panties_and_body_parts_hands.txt" @@ -0,0 +1,3 @@ +adjusting panties +hand in panties +holding panties \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/panties/and_body_parts/panties_and_body_parts_head.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/panties/and_body_parts/panties_and_body_parts_head.txt" new file mode 100644 index 0000000000000000000000000000000000000000..a868cf943bbc0f7d25707dc5cab297afde996ba5 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/panties/and_body_parts/panties_and_body_parts_head.txt" @@ -0,0 +1,4 @@ +panties in mouth +panties on head +panty gag +licking panties \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/panties/and_body_parts/panties_and_body_parts_legs.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/panties/and_body_parts/panties_and_body_parts_legs.txt" new file mode 100644 index 0000000000000000000000000000000000000000..8b5d83db7f229ee85baf003c2cabb165e39d157b --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/panties/and_body_parts/panties_and_body_parts_legs.txt" @@ -0,0 +1,3 @@ +panties around ankles +panties around one leg +panty pull \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/panties/and_body_parts/panties_and_body_parts_misc.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/panties/and_body_parts/panties_and_body_parts_misc.txt" new file mode 100644 index 0000000000000000000000000000000000000000..06277438f6abe228272c98e4ab028b7dc42baeb9 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/panties/and_body_parts/panties_and_body_parts_misc.txt" @@ -0,0 +1,6 @@ +panties around tail +panties aside +panties removed +panty lift +trefoil +smelling underwear \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/panties/and_body_parts/panties_and_body_parts_penis.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/panties/and_body_parts/panties_and_body_parts_penis.txt" new file mode 100644 index 0000000000000000000000000000000000000000..1390267d82c8c7d2a8e89bae1bbdf1448778f502 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/panties/and_body_parts/panties_and_body_parts_penis.txt" @@ -0,0 +1,3 @@ +panties on penis +panties on penis +penis in panties \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/panties/appearance/panties_appearance_additional.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/panties/appearance/panties_appearance_additional.txt" new file mode 100644 index 0000000000000000000000000000000000000000..c545eabfa37dc39cdd687bb9c9b02789d35b34b1 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/panties/appearance/panties_appearance_additional.txt" @@ -0,0 +1,5 @@ +bow panties +crotch seam +frilled panties +side-tie 
panties +string panties \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/panties/appearance/panties_appearance_colors.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/panties/appearance/panties_appearance_colors.txt" new file mode 100644 index 0000000000000000000000000000000000000000..cd554840bb14320beeeb202b06eccc090160d216 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/panties/appearance/panties_appearance_colors.txt" @@ -0,0 +1,12 @@ +aqua panties +black panties +blue panties +brown panties +green panties +grey panties +orange panties +pink panties +purple panties +red panties +white panties +yellow panties \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/panties/appearance/panties_appearance_incomplete.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/panties/appearance/panties_appearance_incomplete.txt" new file mode 100644 index 0000000000000000000000000000000000000000..9550a25f55149f4b0ee4cf1c02182a43dd2d2ed4 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/panties/appearance/panties_appearance_incomplete.txt" @@ -0,0 +1,4 @@ +backless panties +c-string +crotchless panties +strapless bottom \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/panties/appearance/panties_appearance_lowleg_highleg.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/panties/appearance/panties_appearance_lowleg_highleg.txt" new file mode 100644 index 0000000000000000000000000000000000000000..6f9caf0b2c52fbda13fb1f8dd3dd1d531d5d9d65 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/panties/appearance/panties_appearance_lowleg_highleg.txt" @@ -0,0 +1,2 @@ +lowleg panties +highleg panties \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/panties/appearance/panties_appearance_materials.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/panties/appearance/panties_appearance_materials.txt" new file mode 100644 index 0000000000000000000000000000000000000000..bf2944ec59df7ff11f538ccd7078da59a236ef9a --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/panties/appearance/panties_appearance_materials.txt" @@ -0,0 +1,2 @@ +latex panties +leaf panties \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/panties/appearance/panties_appearance_multiple_colors.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/panties/appearance/panties_appearance_multiple_colors.txt" new file mode 100644 index 0000000000000000000000000000000000000000..8e25e949125aeaca294567e93a86d575e6faf46f --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/panties/appearance/panties_appearance_multiple_colors.txt" @@ -0,0 +1 @@ +multicolored panties \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/panties/appearance/panties_appearance_patterns_and_prints.txt" 
"b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/panties/appearance/panties_appearance_patterns_and_prints.txt" new file mode 100644 index 0000000000000000000000000000000000000000..d1b6d237f065c9e0dd44606f4ca8f31dce628163 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/panties/appearance/panties_appearance_patterns_and_prints.txt" @@ -0,0 +1,12 @@ +bear panties +bunny panties +cat panties +checkered panties +lace panties +lace-trimmed panties +plaid panties +polka dot panties +print panties +strawberry panties +striped panties +vertical-striped panties \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/panties/appearance/panties_appearance_sizes.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/panties/appearance/panties_appearance_sizes.txt" new file mode 100644 index 0000000000000000000000000000000000000000..4d5683f01cb17bdd3a58cd099285d38739b76cf4 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/panties/appearance/panties_appearance_sizes.txt" @@ -0,0 +1,2 @@ +loose panties +micro panties \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/panties/panties_and_fluids.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/panties/panties_and_fluids.txt" new file mode 100644 index 0000000000000000000000000000000000000000..12a83c142ee9219949fa535646a15aab857728ae --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/panties/panties_and_fluids.txt" @@ -0,0 +1,5 @@ +blood on panties +cumdrip onto panties +cum in panties +stained panties +wet panties \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/panties/panties_and_objects.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/panties/panties_and_objects.txt" new file mode 100644 index 0000000000000000000000000000000000000000..d20c78267218644727974bb1d519a2526e30e810 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/panties/panties_and_objects.txt" @@ -0,0 +1 @@ +vibrator under panties \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/panties/panties_main.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/panties/panties_main.txt" new file mode 100644 index 0000000000000000000000000000000000000000..58bf36cdfdd0292d4e04b855d61df8210024f91e --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/panties/panties_main.txt" @@ -0,0 +1 @@ +panties \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/panties/panties_number.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/panties/panties_number.txt" new file mode 100644 index 0000000000000000000000000000000000000000..09a79ce48a3bf70041e72b0d110c2b84e225a270 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/panties/panties_number.txt" @@ -0,0 +1,2 @@ +multiple panties +no panties \ No newline at end of file diff --git 
"a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/panties/panties_with_other_clothes.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/panties/panties_with_other_clothes.txt" new file mode 100644 index 0000000000000000000000000000000000000000..a701a8e34c8c3d69af9caf7b1c848c6a9aaa634b --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/panties/panties_with_other_clothes.txt" @@ -0,0 +1,10 @@ +panties over bike shorts +panties under bike shorts +panties over pantyhose +panties under pantyhose +panties under buruma +pantylines +pantyshot +implied pantyshot +upskirt +Panties a \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/prints/prints_patterns.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/prints/prints_patterns.txt" new file mode 100644 index 0000000000000000000000000000000000000000..309837b24fb9b60aebf55e3e7f99fa84ceadaaa0 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/prints/prints_patterns.txt" @@ -0,0 +1,22 @@ +Argyle +Asa No Ha +Camouflage +Checkered +Honeycomb +Houndstooth +Pinstripe pattern +Plaid +Gingham +Polka dot +Sayagata +Seigaiha +Striped +Vertical stripes +Diagonal stripes +Horizontal Stripes +Colored stripes +Multicolored stripes +Tiger stripes +Double vertical stripe +Invasion stripes +Yagasuri \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/prints/prints_patterns_things.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/prints/prints_patterns_things.txt" new file mode 100644 index 0000000000000000000000000000000000000000..7cf84f9c133c0f95ed8d9189f3fde58a2c3092c1 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/prints/prints_patterns_things.txt" @@ -0,0 +1,4 @@ +patterned hair +patterned clothing +patterned +patterned background \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/prints/prints_print_items.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/prints/prints_print_items.txt" new file mode 100644 index 0000000000000000000000000000000000000000..ae476e294436270748fcba3d7c1a69c5cbc9a5f5 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/prints/prints_print_items.txt" @@ -0,0 +1,14 @@ +Print eyepatch +Print footwear +Print gloves +Print hakama +Print headwear +Print kimono +Print legwear +Print neckwear +Print panties +Print shirt +Print shorts +Print swimsuit +Print bikini +Print umbrella \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/prints/prints_specific.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/prints/prints_specific.txt" new file mode 100644 index 0000000000000000000000000000000000000000..e85750cc229843a44af34e8b0b3f87bf5f3a27dd --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/prints/prints_specific.txt" @@ -0,0 +1,17 @@ +Animal print +Bat print +Bear print +Bird print +Cow print +Leopard print +Tiger print +Snake print +Zebra print +Flag print +Floral print +Cherry blossom print +Hand print +Leaf print +Musical note print +Piano print +Watermelon print \ No newline 
at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/tail/tail_and_objects.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/tail/tail_and_objects.txt" new file mode 100644 index 0000000000000000000000000000000000000000..282c41e3da5808f71873cde578675e4a6ef6dac2 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/tail/tail_and_objects.txt" @@ -0,0 +1,6 @@ +Tail bell +Tail bow +Tail ornament +Tail piercing +Tail ribbon +Tail ring \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/tail/tail_and_sex.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/tail/tail_and_sex.txt" new file mode 100644 index 0000000000000000000000000000000000000000..050fc58b8c1c1dbbcdd8684da831da38c317a4ee --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/tail/tail_and_sex.txt" @@ -0,0 +1,6 @@ +Anal tail +Butt plug tail +Penis tail +Tailjob +Tail masturbation +Tail rape \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/tail/tail_main.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/tail/tail_main.txt" new file mode 100644 index 0000000000000000000000000000000000000000..9c2a8bac7ca350a08ad68e582c2d132161c12b43 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/tail/tail_main.txt" @@ -0,0 +1 @@ +Tail \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/tail/tail_misc.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/tail/tail_misc.txt" new file mode 100644 index 0000000000000000000000000000000000000000..a786f93e62a3e47391f4e44146fb071fd621b47d --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/tail/tail_misc.txt" @@ -0,0 +1,20 @@ +Fake tail +Heart tail +Heart tail duo +Holding another's tail +Holding own tail +Holding with tail +Intertwined tails +Panties around tail +Prehensile tail +Stiff tail +Tail between legs +Tail biting +Tail censor +Tail grab +Tail fondling +Tail pull +Tail raised +Tail stand +Tail wagging +Tail wrap \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/tail/tail_number.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/tail/tail_number.txt" new file mode 100644 index 0000000000000000000000000000000000000000..fb6474b0102f215f0a9df03432dc7c78aee276e7 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/tail/tail_number.txt" @@ -0,0 +1,2 @@ +Multiple tails +No tail \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/tail/types/tail_types_elemental.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/tail/types/tail_types_elemental.txt" new file mode 100644 index 0000000000000000000000000000000000000000..dd8f61fa58bbe377b09555bebf7f898dd8a91402 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/tail/types/tail_types_elemental.txt" @@ -0,0 +1 @@ +Fiery tail \ No newline at end of file diff --git 
"a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/tail/types/tail_types_mammals.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/tail/types/tail_types_mammals.txt" new file mode 100644 index 0000000000000000000000000000000000000000..16dc3b7f5cc51de05fc73a0d9c7e14068487db7b --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/tail/types/tail_types_mammals.txt" @@ -0,0 +1,18 @@ +Bear tail +Rabbit tail +Cat tail +Cow tail +Deer tail +Dog tail +Ermine tail +Fox tail +Horse tail +Leopard tail +Lion tail +Monkey tail +Mouse tail +Pig tail +Sheep tail +Squirrel tail +Tiger tail +Wolf tail \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/tail/types/tail_types_other_animals.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/tail/types/tail_types_other_animals.txt" new file mode 100644 index 0000000000000000000000000000000000000000..07fa7ec34135d008f18c6b58a59f5e2c87d7db2f --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/tail/types/tail_types_other_animals.txt" @@ -0,0 +1,4 @@ +Fish tail +Scorpion tail +Snake tail +Tadpole tail \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/tail/types/tail_types_supernatural.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/tail/types/tail_types_supernatural.txt" new file mode 100644 index 0000000000000000000000000000000000000000..5cabc158bd2614e228e130a27fbc75a2148ceb72 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\345\244\226\350\247\200/tail/types/tail_types_supernatural.txt" @@ -0,0 +1,4 @@ +Demon tail +Dragon tail +Ghost tail +Pikachu tail \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\346\247\213\345\234\226/backgrounds/backgrounds_colors.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\346\247\213\345\234\226/backgrounds/backgrounds_colors.txt" new file mode 100644 index 0000000000000000000000000000000000000000..4b4a541b9c16c39e0cbfdfba04a9b19724b36100 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\346\247\213\345\234\226/backgrounds/backgrounds_colors.txt" @@ -0,0 +1,16 @@ +aqua background +beige background +black background +blue background +brown background +green background +grey background +lavender background +orange background +pink background +purple background +red background +sepia background +tan background +white background +yellow background \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\346\247\213\345\234\226/backgrounds/backgrounds_multiple_colors.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\346\247\213\345\234\226/backgrounds/backgrounds_multiple_colors.txt" new file mode 100644 index 0000000000000000000000000000000000000000..a812871b4fe19c0ba00ff009027d27825dae4cfd --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\346\247\213\345\234\226/backgrounds/backgrounds_multiple_colors.txt" @@ -0,0 +1,5 @@ +gradient background +multicolored background +rainbow background +heaven condition +two-tone background \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\346\247\213\345\234\226/backgrounds/backgrounds_other.txt" 
"b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\346\247\213\345\234\226/backgrounds/backgrounds_other.txt" new file mode 100644 index 0000000000000000000000000000000000000000..29789fc24d17b2490fe190185b51398af958a8cf --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\346\247\213\345\234\226/backgrounds/backgrounds_other.txt" @@ -0,0 +1,22 @@ +background text +backlighting +blending +blurry background +card background +chibi inset +drama layer +fiery background +flag background +floral background +fruit background +heart background +imageboard colors +lace background +mosaic background +paneled background +photo background +projected inset +simple background +starry background +transparent background +zoom layer \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\346\247\213\345\234\226/backgrounds/backgrounds_patterns.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\346\247\213\345\234\226/backgrounds/backgrounds_patterns.txt" new file mode 100644 index 0000000000000000000000000000000000000000..7ddfc64b07e2918ed5c583f495bb130889174395 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\346\247\213\345\234\226/backgrounds/backgrounds_patterns.txt" @@ -0,0 +1,14 @@ +argyle background +checkered background +food-themed background +halftone background +honeycomb background +paw print background +plaid background +polka dot background +simple background +snowflake background +spiral background +strawberry background +striped background +sunburst background \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\346\247\213\345\234\226/color-palette.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\346\247\213\345\234\226/color-palette.txt" new file mode 100644 index 0000000000000000000000000000000000000000..ed5d7ee20d07c69e99b43de8ad628f66cb3828ac --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\346\247\213\345\234\226/color-palette.txt" @@ -0,0 +1,42 @@ +Teletext palette +Apple II palette +Commodore VIC-20 palette +CGA for IBM-PC palette +Commodore 64 palette +ZX Spectrum palette +Mattel Aquarius palette +MSX systems color palette +Thomson MO5 color palette +Commodore Plus/4 color palette +Amstrad CPC color palette +MSX2 systems color palette +Fujitsu FM-77 AV 40 color palette +EGA for IBM PC/AT color palette +Atari ST color palette +Commodore Amiga OCS color palette +Apple IIgs color palette +Sharp X68000 color palette +Atari 2600 color palette +Famicom/NES color palette +Sega Master System color palette +NEC PC-Engine color palette +TurboGrafx-16 color palette +Sega Mega Drive color palette +Sega Genesis color palette +Nintendo Game Boy color palette +Sega Game Gear color palette +Super Famicom/SNES color palette +Nintendo Game Boy Color palette +Nintendo Game Boy Advance color palette +Monochrome color palette +3-bit color palette +4-bit color palette +6-bit color palette +8-bit color palette +9-bit color palette +12-bit color palette +16-bit color palette +18-bit color palette +24-bit color palette +30-bit color palette +32-bit color palette diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\346\247\213\345\234\226/color.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\346\247\213\345\234\226/color.txt" new file mode 100644 index 0000000000000000000000000000000000000000..303f42862265499df29279f2a248ac6f3d07e271 --- /dev/null +++ 
"b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\346\247\213\345\234\226/color.txt" @@ -0,0 +1,12 @@ +aqua +black +blue +brown +green +grey +orange +pink +purple +red +white +yellow \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\346\247\213\345\234\226/colors/colors_collections.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\346\247\213\345\234\226/colors/colors_collections.txt" new file mode 100644 index 0000000000000000000000000000000000000000..694cbd73060f90a580b0764bde70424d36557dbd --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\346\247\213\345\234\226/colors/colors_collections.txt" @@ -0,0 +1,2 @@ +rainbow +rainbow order \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\346\247\213\345\234\226/colors/colors_misc.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\346\247\213\345\234\226/colors/colors_misc.txt" new file mode 100644 index 0000000000000000000000000000000000000000..9601836e5f608f0d2898cfb8392d088a703ea686 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\346\247\213\345\234\226/colors/colors_misc.txt" @@ -0,0 +1,4 @@ +color connection +colorized +colorful +spot color \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\346\247\213\345\234\226/colors/dominant/colors_dominant_palettes.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\346\247\213\345\234\226/colors/dominant/colors_dominant_palettes.txt" new file mode 100644 index 0000000000000000000000000000000000000000..999b1d184a261502c56c537929d9c7b4f7bb03c4 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\346\247\213\345\234\226/colors/dominant/colors_dominant_palettes.txt" @@ -0,0 +1,16 @@ +anime coloring +flat color +gradient +ff gradient +greyscale +high contrast +inverted colors +monochrome +color drain +multiple monochrome +spot color +muted color +pale color +partially colored +pastel colors +sepia \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\346\247\213\345\234\226/colors/dominant/colors_dominant_specific.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\346\247\213\345\234\226/colors/dominant/colors_dominant_specific.txt" new file mode 100644 index 0000000000000000000000000000000000000000..2f02e665f36f9b09566922b41fe3d7fa92aeee0d --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\346\247\213\345\234\226/colors/dominant/colors_dominant_specific.txt" @@ -0,0 +1,12 @@ +aqua theme +black theme +blue theme +brown theme +green theme +grey theme +orange theme +pink theme +purple theme +red theme +white theme +yellow theme \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\346\247\213\345\234\226/details.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\346\247\213\345\234\226/details.txt" new file mode 100644 index 0000000000000000000000000000000000000000..7eebe2cf29cfb937c72e43a813b495e13ee551fc --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\346\247\213\345\234\226/details.txt" @@ -0,0 +1,30 @@ +Elaborate +Ornate +Intricate +Detailed +Fine +Refined +Complicated +Convoluted +Extensive +Exhaustive +Far-reaching +Informative +In-depth +Thorough +Step-by-step +Exact +Precise +Exquisite +Lavish +Sumptuous +opulent +luxurious +Intricately designed +Grandiose +Stately +Splendid +Magnificent +Precious +Dazzling +Imperceptible detail diff --git 
"a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\346\247\213\345\234\226/focus/focus_body_parts.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\346\247\213\345\234\226/focus/focus_body_parts.txt" new file mode 100644 index 0000000000000000000000000000000000000000..d382914753597505e190984344dc5719b1af22fb --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\346\247\213\345\234\226/focus/focus_body_parts.txt" @@ -0,0 +1,11 @@ +armpit focus +ass focus +back focus +breast focus +eye focus +foot focus +hand focus +hip focus +navel focus +pectoral focus +thigh focus \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\346\247\213\345\234\226/focus/focus_gender.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\346\247\213\345\234\226/focus/focus_gender.txt" new file mode 100644 index 0000000000000000000000000000000000000000..2ff6e82ca69a078217080e55073c631639fbfe38 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\346\247\213\345\234\226/focus/focus_gender.txt" @@ -0,0 +1,2 @@ +male focus +other focus \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\346\247\213\345\234\226/focus/focus_objects.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\346\247\213\345\234\226/focus/focus_objects.txt" new file mode 100644 index 0000000000000000000000000000000000000000..59e4432ac6d11bbcd9be4069de3ec63dd0940a6f --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\346\247\213\345\234\226/focus/focus_objects.txt" @@ -0,0 +1,9 @@ +animal focus +book focus +cloud focus +food focus +text focus +vehicle focus +tank focus +weapon focus +wind chime focus \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\346\247\213\345\234\226/focus/focus_other.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\346\247\213\345\234\226/focus/focus_other.txt" new file mode 100644 index 0000000000000000000000000000000000000000..a13173a36cc840a8d2d0e8906b89862318b05457 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\346\247\213\345\234\226/focus/focus_other.txt" @@ -0,0 +1,2 @@ +solo focus +text focus \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\346\247\213\345\234\226/hd.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\346\247\213\345\234\226/hd.txt" new file mode 100644 index 0000000000000000000000000000000000000000..57a646e0f4ccd4d0be6e4ce1163f4f1fde42eea8 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\346\247\213\345\234\226/hd.txt" @@ -0,0 +1,12 @@ +2k resolution +4k resolution +8k resolution +12k resolution +16k resolution +32k resolution +High resolution +High definition +HD +HDR +Blu-ray +HD-DVD diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\346\247\213\345\234\226/image_composition/image_composition_angle_perspective_depth.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\346\247\213\345\234\226/image_composition/image_composition_angle_perspective_depth.txt" new file mode 100644 index 0000000000000000000000000000000000000000..c7132a96d10ab3859c1275a35a98d706dc1e2083 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\346\247\213\345\234\226/image_composition/image_composition_angle_perspective_depth.txt" @@ -0,0 +1,13 @@ +atmospheric perspective +dutch angle +fisheye +from above +from behind +from below +panorama +perspective +pov +rotated +sideways 
+upside-down +vanishing point \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\346\247\213\345\234\226/image_composition/image_composition_composition.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\346\247\213\345\234\226/image_composition/image_composition_composition.txt" new file mode 100644 index 0000000000000000000000000000000000000000..f81484247450678318d730e639e84a38f068020d --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\346\247\213\345\234\226/image_composition/image_composition_composition.txt" @@ -0,0 +1,41 @@ +afterimage +border +framed +outside border +fading border +rounded corners +viewfinder +chart +bust chart +character chart +reference sheet +diagram +move chart +relationship graph +seating chart +stats +collage +column lineup +cropped +fake scrollbar +head out of frame +out-of-frame censoring +outside border +glitch +isometric +letterboxed +pillarboxed +lineup +mosaic art +photomosaic +negative space +omake +partially underwater shot +social media composition +symmetry +polar opposites +rotational symmetry +tachi-e +trim marks +zoom layer +projected inset \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\346\247\213\345\234\226/image_composition/image_composition_flaws.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\346\247\213\345\234\226/image_composition/image_composition_flaws.txt" new file mode 100644 index 0000000000000000000000000000000000000000..aec7ecea0b9449dfc0a17dc9ab564a644ef22c8a --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\346\247\213\345\234\226/image_composition/image_composition_flaws.txt" @@ -0,0 +1,53 @@ +Artifacts +gif artifacts +jpeg artifacts +aliasing +scan artifacts +bleed through +crease +vhs artifacts +bad aspect ratio +blurry +censored +uncensored +decensored +derivative work +photo-referenced +reference photo +reference work +screencap redraw +duplicate +pixel-perfect duplicate +error +bad reflection +extra digits +typo +lossless-lossy +lossy-lossless +image macro +md5 mismatch +moire +third-party edit +detexted +hard translated +nude filter +poorly drawn +anatomical nonsense +bad anatomy +bad feet +bad hands +bad proportions +quality +ranguage +resized +downscaled +image sample +upscaled +waifu2x +reversed +solid color thumbnail +unmoving pattern +watermark +third-party watermark +web address +extraction \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\346\247\213\345\234\226/image_composition/image_composition_focus.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\346\247\213\345\234\226/image_composition/image_composition_focus.txt" new file mode 100644 index 0000000000000000000000000000000000000000..473935c7d0e032d3a49fa12e01da40936f731c07 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\346\247\213\345\234\226/image_composition/image_composition_focus.txt" @@ -0,0 +1,23 @@ +animal focus +armpit focus +ass focus +back focus +book focus +breast focus +cloud focus +eye focus +food focus +foot focus +hand focus +hip focus +male focus +navel focus +other focus +pectoral focus +solo focus +tank focus +text focus +thigh focus +weapon focus +vehicle focus +wind chime focus \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\346\247\213\345\234\226/image_composition/image_composition_format.txt" 
"b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\346\247\213\345\234\226/image_composition/image_composition_format.txt" new file mode 100644 index 0000000000000000000000000000000000000000..2e34dde65bbadac786b24ad6a343dabbe2dcbbbe --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\346\247\213\345\234\226/image_composition/image_composition_format.txt" @@ -0,0 +1,86 @@ +3d +animated +animated png +flash +video +music video +animated gif +easytoon +non-repeating animation +looping animation +roulette animation +live2d +archived file +artbook +calendar (medium) +card (medium) +comic +1koma +2koma +3koma +4koma +multiple 4koma +5koma +borderless panels +manga +eromanga +left-to-right manga +right-to-left comic +silent comic +corrupted file +cover +album cover +character single +cover page +doujin cover +dvd cover +fake cover +video game cover +magazine cover +manga cover +fake screenshot +game cg +gyotaku (medium) +highres +absurdres +incredibly absurdres +lowres +icon +logo +kirigami +lineart +no lineart +outline +tall image +wide image +mosaic art +photomosaic +oekaki +official art +phonecard +photo +papercraft +paper child +paper cutout +pixel art +postcard +poster +revision +has bad revision +has artifacted revision +has censored revision +has cropped revision +has downscaled revision +has lossy revision +has watermarked revision +scan +anime screencap +shitajiki +tegaki +transparent background +triptych (art) +vector trace +tileable +wallpaper forced +widescreen +official wallpaper \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\346\247\213\345\234\226/image_composition/image_composition_framing_body.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\346\247\213\345\234\226/image_composition/image_composition_framing_body.txt" new file mode 100644 index 0000000000000000000000000000000000000000..8048f0a808cfc2dccbae5cbe89fa858d4b696f25 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\346\247\213\345\234\226/image_composition/image_composition_framing_body.txt" @@ -0,0 +1,18 @@ +face +portrait +upper body +lower body +cowboy shot +feet out of frame +full body +wide shot +very wide shot +close-up +cut-in +cropped legs +head out of frame +cropped torso +cropped arms +cropped shoulders +profile +group profile \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\346\247\213\345\234\226/image_composition/image_composition_other_patterns.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\346\247\213\345\234\226/image_composition/image_composition_other_patterns.txt" new file mode 100644 index 0000000000000000000000000000000000000000..19fa73f8b3fee267cb4edddf94c7de68e1983447 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\346\247\213\345\234\226/image_composition/image_composition_other_patterns.txt" @@ -0,0 +1,17 @@ +animal print +cow print +leopard print +tiger print +zebra print +shark print +argyle +checkered +fleur de lis +floral print +gingham +houndstooth +meandros +paisley +plaid +polka dot +striped \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\346\247\213\345\234\226/image_composition/image_composition_styles.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\346\247\213\345\234\226/image_composition/image_composition_styles.txt" new file mode 100644 index 0000000000000000000000000000000000000000..b784a257d4e2c3462ad6a595402278733daa9625 --- /dev/null +++ 
"b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\346\247\213\345\234\226/image_composition/image_composition_styles.txt" @@ -0,0 +1,21 @@ +abstract +art deco +art nouveau +fine art parody +flame painter +impressionism +ligne claire +nihonga +sumi-e +ukiyo-e +minimalism +realistic +photorealistic +sketch +style parody +List of style parodies +surreal +traditional media +* (medium) +faux traditional media +unfinished \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\346\247\213\345\234\226/image_composition/image_composition_subject_matter.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\346\247\213\345\234\226/image_composition/image_composition_subject_matter.txt" new file mode 100644 index 0000000000000000000000000000000000000000..0f60b89c86b6eec3a909a6de906bf0c742451304 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\346\247\213\345\234\226/image_composition/image_composition_subject_matter.txt" @@ -0,0 +1,9 @@ +cityscape +everyone +absolutely everyone +landscape +nature +no humans +off-topic +scenery +still life \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\346\247\213\345\234\226/image_composition/image_composition_techniques.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\346\247\213\345\234\226/image_composition/image_composition_techniques.txt" new file mode 100644 index 0000000000000000000000000000000000000000..ce8e481b2cc1cb263e1463838518cba542f007bb --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\346\247\213\345\234\226/image_composition/image_composition_techniques.txt" @@ -0,0 +1,33 @@ +backlighting +blending +bloom +bokeh +caustics +chiaroscuro +chromatic aberration +chromatic aberration abuse +diffraction spikes +depth of field +dithering +drop shadow +emphasis lines +film grain +foreshortening +gradient +halftone +image fill +lens flare +lens flare abuse +motion blur +motion lines +multiple monochrome +optical illusion +anaglyph +exif thumbnail surprise +open in internet explorer +open in winamp +stereogram +scanlines +silhouette +speed lines +vignetting \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\346\247\213\345\234\226/image_composition/image_composition_traditional_japanese_patterns.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\346\247\213\345\234\226/image_composition/image_composition_traditional_japanese_patterns.txt" new file mode 100644 index 0000000000000000000000000000000000000000..18ca8d8f76a2e03cf999f895b11796955762f27a --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\346\247\213\345\234\226/image_composition/image_composition_traditional_japanese_patterns.txt" @@ -0,0 +1,19 @@ +asa no ha (pattern) +egasumi +genjiguruma +goshoguruma +igeta (pattern) +kagome (pattern) +kanoko (pattern) +karakusa (pattern) +kikumon +kikkoumon +kojitsunagi (pattern) +sakuramon +sayagata +seigaiha +shima (pattern) +shippou (pattern) +uchiwa design +uroko (pattern) +yagasuri \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\346\247\213\345\234\226/quantity.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\346\247\213\345\234\226/quantity.txt" new file mode 100644 index 0000000000000000000000000000000000000000..5c25804bc3abe9b44ea8e3263f40864598ee3ec6 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\346\247\213\345\234\226/quantity.txt" @@ -0,0 +1,18 @@ +1 +2 +3 +4 +5 
+6 +7 +8 +9 +10 +100 +300 +500 +1000 +Millions +Hundreds +Thousands +Way too much diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/board_games/board_games_main.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/board_games/board_games_main.txt" new file mode 100644 index 0000000000000000000000000000000000000000..59b063c0838f34dc52d494426b3594944bf44e87 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/board_games/board_games_main.txt" @@ -0,0 +1,10 @@ +chess +game of life +go +kai-awase +mahjong +monopoly +reversi +settlers of catan +shogi +tic-tac-toe \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/board_games/board_games_pieces.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/board_games/board_games_pieces.txt" new file mode 100644 index 0000000000000000000000000000000000000000..5880c585df251586987b7da439e9f42071fdd930 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/board_games/board_games_pieces.txt" @@ -0,0 +1,6 @@ +board +chessboard +cards +chess piece +dice +living piece \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/eyewear/eyewear/eyewear_eyewear_frame_colors.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/eyewear/eyewear/eyewear_eyewear_frame_colors.txt" new file mode 100644 index 0000000000000000000000000000000000000000..4534ce24e23c74aaab3f771bc28acec34819ad02 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/eyewear/eyewear/eyewear_eyewear_frame_colors.txt" @@ -0,0 +1,12 @@ +aqua-framed eyewear +black-framed eyewear +blue-framed eyewear +brown-framed eyewear +green-framed eyewear +grey-framed eyewear +orange-framed eyewear +pink-framed eyewear +purple-framed eyewear +red-framed eyewear +white-framed eyewear +yellow-framed eyewear \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/eyewear/eyewear/eyewear_eyewear_lens_colors.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/eyewear/eyewear/eyewear_eyewear_lens_colors.txt" new file mode 100644 index 0000000000000000000000000000000000000000..84615288e1e063c5c8870b353a1b4163e7c079d2 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/eyewear/eyewear/eyewear_eyewear_lens_colors.txt" @@ -0,0 +1,8 @@ +blue-tinted eyewear +brown-tinted eyewear +green-tinted eyewear +orange-tinted eyewear +pink-tinted eyewear +purple-tinted eyewear +red-tinted eyewear +yellow-tinted eyewear \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/eyewear/eyewear/eyewear_eyewear_misc.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/eyewear/eyewear/eyewear_eyewear_misc.txt" new file mode 100644 index 0000000000000000000000000000000000000000..b375c970e12050fe416cd90d48696667c2845c64 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/eyewear/eyewear/eyewear_eyewear_misc.txt" @@ -0,0 +1,16 @@ +adjusting eyewear +crooked eyewear +eyewear on head +eyewear on headwear +eyewear removed +eyewear hang +eyewear in mouth +holding eyewear +looking for glasses +eyewear strap 
+eyewear switch +eyewear view +hand on eyewear +looking over eyewear +no eyewear +removing eyewear \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/eyewear/eyewear/eyewear_eyewear_types.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/eyewear/eyewear/eyewear_eyewear_types.txt" new file mode 100644 index 0000000000000000000000000000000000000000..7cc756c0a3a4db3e864c02cdf8ebacab6706fa79 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/eyewear/eyewear/eyewear_eyewear_types.txt" @@ -0,0 +1,9 @@ +heart-shaped eyewear +round eyewear +rectangular eyewear +rimless eyewear +semi-rimless eyewear +over-rim eyewear +under-rim eyewear +teardrop-framed glasses +triangular eyewear \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/eyewear/eyewear_main.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/eyewear/eyewear_main.txt" new file mode 100644 index 0000000000000000000000000000000000000000..86c02c8bf892091ec22b845e286eeb89c4bc529e --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/eyewear/eyewear_main.txt" @@ -0,0 +1,4 @@ +glasses +monocle +sunglasses +goggles \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/eyewear/eyewear_sunglasses.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/eyewear/eyewear_sunglasses.txt" new file mode 100644 index 0000000000000000000000000000000000000000..033c35521a03788a4bbb1c1d67dbbad34f2d47d5 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/eyewear/eyewear_sunglasses.txt" @@ -0,0 +1,3 @@ +aviator sunglasses +kamina shades +simon shades \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/eyewear/glasses/eyewear_glasses_misc.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/eyewear/glasses/eyewear_glasses_misc.txt" new file mode 100644 index 0000000000000000000000000000000000000000..8e4c9a9c3c8e6b52ded7da1aa29af7a88f55e809 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/eyewear/glasses/eyewear_glasses_misc.txt" @@ -0,0 +1 @@ +bespectacled \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/eyewear/glasses/eyewear_glasses_types.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/eyewear/glasses/eyewear_glasses_types.txt" new file mode 100644 index 0000000000000000000000000000000000000000..12b9b9ab501757c2b78e8e417fb7fc9189e06e65 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/eyewear/glasses/eyewear_glasses_types.txt" @@ -0,0 +1,20 @@ +aviator sunglasses +3d glasses +coke-bottle glasses +diving mask +fancy glasses +heart-shaped eyewear +star-shaped eyewear +teardrop-framed glasses +flight goggles +funny glasses +lorgnette +nodoka glasses +opaque glasses +opera glasses +pince-nez +safety glasses +shooting glasses +ski goggles +x-ray glasses +scouter \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/fire/fire_actions.txt" 
"b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/fire/fire_actions.txt" new file mode 100644 index 0000000000000000000000000000000000000000..0558c03298b7f8b216ef4e67e21c7b508419210e --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/fire/fire_actions.txt" @@ -0,0 +1,4 @@ +breathing fire +burning +explosion +pyrokinesis \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/fire/fire_body_parts.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/fire/fire_body_parts.txt" new file mode 100644 index 0000000000000000000000000000000000000000..86b8699eff35feed470b1d5e8a90a0279acd44ff --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/fire/fire_body_parts.txt" @@ -0,0 +1,5 @@ +fire body +fiery hair +fiery tail +fiery wings +flame-tipped tail \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/fire/fire_colors.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/fire/fire_colors.txt" new file mode 100644 index 0000000000000000000000000000000000000000..dcf22e2e0d5c827a2429ae235a642d2bbc286b99 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/fire/fire_colors.txt" @@ -0,0 +1,6 @@ +black fire +blue fire +green fire +pink fire +purple fire +white fire \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/fire/fire_forms.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/fire/fire_forms.txt" new file mode 100644 index 0000000000000000000000000000000000000000..8aae3abd334d48d78905ee669d0ad336062d70ae --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/fire/fire_forms.txt" @@ -0,0 +1,5 @@ +campfire +fireball +fireworks +fireflies +flame \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/fire/fire_main.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/fire/fire_main.txt" new file mode 100644 index 0000000000000000000000000000000000000000..69b21656c0ecea94332352c454ba30bf1e0586bb --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/fire/fire_main.txt" @@ -0,0 +1 @@ +fire \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/fire/fire_misc.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/fire/fire_misc.txt" new file mode 100644 index 0000000000000000000000000000000000000000..a358b90cadb44b50d817ac119f75fd731e9c6085 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/fire/fire_misc.txt" @@ -0,0 +1,2 @@ +fire flower +no fire \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/fire/fire_objects.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/fire/fire_objects.txt" new file mode 100644 index 0000000000000000000000000000000000000000..ed6b40c4985d1eb1cdf104ec86a595477d6dc830 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/fire/fire_objects.txt" @@ -0,0 +1,4 @@ +fire extinguisher +fire hydrant +fireplace 
+flamethrower \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/fish.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/fish.txt" new file mode 100644 index 0000000000000000000000000000000000000000..26f83f8e8700bc877c9409264212937c0382da38 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/fish.txt" @@ -0,0 +1,1083 @@ +African glass catfish +African lungfish +Aholehole +Airbreathing catfish +Airsac catfish +Alaska blackfish +Albacore +Alewife +Alfonsino +Algae eater +Alligatorfish +Alligator gar +Amberjack +American sole +Amur pike +Anchovy +Anemonefish +Angelfish +Angler +Angler catfish +Anglerfish +Antarctic cod +Antarctic icefish +Antenna codlet +Arapaima +Archerfish +Arctic char +Armored gurnard +Armored searobin +Armorhead +Armorhead catfish +Armoured catfish +Arowana +Arrowtooth eel +Asiatic glassfish +Atka mackerel +Atlantic bonito +Atlantic cod +Atlantic herring +Atlantic salmon +Atlantic sharpnose shark +Atlantic saury +Atlantic silverside +Australasian salmon +Australian grayling +Australian herring +Australian lungfish +Australian prowfish +Ayu +Baikal oilfish +Bala shark +Ballan wrasse +Bamboo shark +Banded killifish +Bandfish +Banjo +Bangus +Banjo catfish +Barb +Barbel +Barbeled dragonfish +Barbeled houndshark +Barbel-less catfish +Barfish +Barracuda +Barracudina +Barramundi +Barred danio +Barreleye +Basking shark +Bass +Basslet +Batfish +Bat ray +Beachsalmon +Beaked salmon +Beaked sandfish +Beardfish +Beluga sturgeon +Bengal danio +Betta +Bichir +Bicolor goat fish +Bigeye +Bigeye squaretail +Bighead carp +Bigmouth buffalo +Bigscale +Bigscale pomfret +Billfish +Bitterling +Black angelfish +Black bass +Black dragonfish +Blackchin +Blackfin Tuna +Blackfish +Black neon tetra +Blacktip reef shark +Black mackerel +Black scalyfin +Black sea bass +Black scabbardfish +Black swallower +Black tetra +Black triggerfish +Bleak +Blenny +Blind goby +Blind shark +Blobfish +Blowfish +Blue catfish +Blue danio +Blue-redstripe danio +Blue eye trevalla +Bluefin tuna +Bluefish +Bluegill +Blue gourami +Blue shark +Blue triggerfish +Blue whiting +Bluntnose knifefish +Bluntnose minnow +Boafish +Boarfish +Bobtail snipe eel +Bocaccio +Boga +Bombay duck +Bonefish +Bonito +Bonnethead shark +Bonnetmouth +Bonytail +Bonytongue +Bowfin +Boxfish +Bramble shark +Bream +Brill +Bristlemouth +Bristlenose catfish +Broadband dogfish +Bronze corydoras +Brook lamprey +Brook stickleback +Brook trout +Brotula +Brown trout +Buffalo fish +Bullhead +Bullhead shark +Bull shark +Bull trout +Burbot +Bumblebee goby +Buri +Burma danio +Burrowing goby +Butterfish +Butterfly ray +Butterflyfish +California flyingfish +California halibut +Canary rockfish +Candiru +Candlefish +Capelin +Cardinalfish +Cardinal tetra +Carp +Carpetshark +Carpsucker +Catalufa +Catfish +Catla +Cat shark +Cavefish +Celebes rainbowfish +Central mudminnow +Chain pickerel +Channel bass +Channel catfish +Char +Cherry salmon +Chimaera +Chinook salmon +Cherubfish +Chub +Chubsucker +Chum salmon +Cichlid +Cisco +Climbing catfish +Climbing gourami +Climbing perch +Clingfish +Clownfish +Clown loach +Clown triggerfish +Cobbler +Cobia +Cod +Codlet +Codling +Coelacanth +Coffinfish +Coho salmon +Coley +Collared carpetshark +Collared dogfish +Colorado squawfish +Combfish +Combtail gourami +Combtooth blenny +Common carp +Common tunny +Conger eel +Convict blenny +Convict cichlid +Cookie-cutter shark +Coolie loach +Cornetfish 
+Cowfish +Cownose ray +Cow shark +Crappie +Creek chub +Crestfish +Crevice kelpfish +Croaker +Crocodile icefish +Crocodile shark +Crucian carp +Cuckoo wrasse +Cusk +Cusk-eel +Cutlassfish +Cutthroat eel +Cutthroat trout +Dab +Dace +Daggertooth pike conger +Damselfish +Danio +Darter +Dartfish +Dealfish +Death Valley pupfish +Deep sea eel +Deep sea smelt +Deepwater cardinalfish +Deepwater flathead +Deepwater stingray +Delta smelt +Demoiselle +Denticle herring +Desert pupfish +Devario +Devil ray +Dhufish +Discus +Dogfish +Dogfish shark +Dogteeth tetra +Dojo loach +Dolly Varden trout +Dolphin fish +Dorab wolf-herring +Dorado +Dory +Dottyback +Dragonet +Dragonfish +Dragon goby +Driftfish +Driftwood catfish +Drum +Duckbill +Duckbill eel +Dusky grouper +Dusky shark +Dwarf gourami +Dwarf loach +Eagle ray +Earthworm eel +Eel +Eel cod +Eel-goby +Eelpout +Eeltail catfish +Elasmobranch +Electric catfish +Electric eel +Electric knifefish +Electric ray +Elephant fish +Elephantnose fish +Elver +Ember parrotfish +Emerald catfish +Emperor +Emperor angelfish +Emperor bream +Escolar +Eucla cod +Eulachon +European chub +European eel +European flounder +European minnow +European perch +False brotula +False cat shark +False moray +False trevally +Fangtooth +Fathead sculpin +Featherback +Fierasfer +Fire goby +Filefish +Finback cat shark +Fingerfish +Fire bar danio +Firefish +Flabby whale fish +Flagblenny +Flagfin +Flagfish +Flagtail +Flashlight fish +Flatfish +Flathead +Flathead catfish +Flier +Flounder +Flying gurnard +Flying fish +Footballfish +Forehead brooder +Four-eyed fish +French angelfish +Freshwater eel +Freshwater hatchetfish +Freshwater shark +Frigate mackerel +Frilled shark +Frogfish +Frogmouth catfish +Fusilier fish +Galjoen fish +Ganges shark +Gar +Garden eel +Garibaldi +Garpike +Ghost fish +Ghost flathead +Ghost knifefish +Ghost pipefish +Ghost shark +Ghoul +Giant danio +Giant gourami +Giant sea bass +Gibberfish +Gila trout +Gizzard shad +Glass catfish +Glassfish +Glass knifefish +Glowlight danio +Goatfish +Goblin shark +Goby +Golden dojo +Golden loach +Golden shiner +Golden trout +Goldeye +Goldfish +Gombessa +Goosefish +Gopher rockfish +Gourami +Grass carp +Graveldiver +Grayling +Gray mullet +Gray reef shark +Great white shark +Green swordtail +Greeneye +Greenling +Grenadier +Green spotted puffer +Ground shark +Grouper +Grunion +Grunt +Grunter +Grunt sculpin +Gudgeon +Guitarfish +Gulf menhaden +Gulper eel +Gulper +Gunnel +Guppy +Gurnard +Haddock +Hagfish +Hairtail +Hake +Halfbeak +Halfmoon +Halibut +Halosaur +Hamlet +Hammerhead shark +Hammerjaw +Handfish +Hardhead catfish +Harelip sucker +Hatchetfish +Hawkfish +Herring +Herring smelt +Hickory Shad +Hillstream loach +Hog sucker +Hoki +Horn shark +Horsefish +Houndshark +Huchen +Humuhumunukunukuapua'a +Hussar +Icefish +Ide +Ilish, Hilsha +Inanga +Inconnu +Jack +Jackfish +Jack Dempsey +Japanese eel +Javelin +Jawfish +Jellynose fish +Jewelfish +Jewel tetra +Jewfish +John Dory +Kafue pike +Kahawai +Kaluga +Kanyu +Kelp perch +Kelpfish +Killifish +King of the herrings +Kingfish +King-of-the-salmon +Kissing gourami +Knifefish +Knifejaw +Koi +Kokanee +Kokopu +Kuhli loach +Labyrinth fish +Ladyfish +Lake chub +Lake trout +Lake whitefish +Lampfish +Lamprey +Lancetfish +Lanternfish +Largemouth bass +Leaffish +Leatherjacket +Lefteye flounder +Lemon shark +Lemon sole +Lemon tetra +Lenok +Leopard danio +Lightfish +Limia +Lined sole +Ling +Ling cod +Lionfish +Livebearer +Lizardfish +Loach +Loach catfish +Loach goby +Loach minnow +Longfin +Longfin dragonfish +Longfin 
escolar +Longfin smelt +Long-finned char +Long-finned pike +Long-finned sand diver +Longjaw mudsucker +Longneck eel +Longnose chimaera +Longnose dace +Longnose lancetfish +Longnose sucker +Longnose whiptail catfish +Long-whiskered catfish +Loosejaw +Lost River sucker +Louvar +Loweye catfish +Luderick +Luminous hake +Lumpsucker +Lungfish +Mackerel +Mackerel shark +Madtom +Mahi-mahi +Mahseer +Mail-cheeked fish +Mako shark +Mandarinfish +Manefish +Man-of-war fish +Manta ray +Marblefish +Marine hatchetfish +Marlin +Masu salmon +Medaka +Medusafish +Megamouth shark +Menhaden +Merluccid hake +Mexican golden trout +Midshipman fish +Milkfish +Minnow +Minnow of the deep +Modoc sucker +Mojarra +Mola mola +Monkeyface prickleback +Monkfish +Mooneye +Moonfish +Moorish idol +Mora +Moray eel +Morid cod +Morwong +Moses sole +Mosquitofish +Mouthbrooder +Mozambique tilapia +Mrigal +Mud catfish +Mudfish +Mud minnow +Mudskipper +Mudsucker +Mullet +Mummichog +Murray cod +Muskellunge +Mustache triggerfish +Mustard eel +Naked-back knifefish +Nase +Needlefish +Neon tetra +New World rivuline +New Zealand sand diver +New Zealand smelt +Nibble fish +Noodlefish +North American darter +North American freshwater catfish +North Pacific daggertooth +Northern anchovy +Northern clingfish +Northern lampfish +Northern pike +Northern sea robin +Northern squawfish +Northern stargazer +Notothen +Nurseryfish +Nurse shark +Oarfish +Ocean perch +Ocean sunfish +Oceanic whitetip shark +Oilfish +Oldwife +Old World knifefish +Olive flounder +Opah +Opaleye +Orange roughy +Orangespine unicorn fish +Orangestriped triggerfish +Orbicular batfish +Orbicular velvetfish +Oregon chub +Orfe +Oriental loach +Oscar +Owens pupfish +Pacific albacore +Pacific cod +Pacific hake +Pacific herring +Pacific lamprey +Pacific salmon +Pacific saury +Pacific trout +Pacific viperfish +Paddlefish +Pancake batfish +Panga +Paradise fish +Parasitic catfish +Parore +Parrotfish +Peacock flounder +Peamouth +Pearleye +Pearlfish +Pearl danio +Pearl perch +Pelagic cod +Pelican eel +Pelican gulper +Pencil catfish +Pencilfish +Pencilsmelt +Peppered corydoras +Perch +Peters elephantnose fish +Pickerel +Pigfish +Pike conger +Pike eel +Pike +Pikeblenny +Pikeperch +Pilchard +Pilot fish +Pineapplefish +Pineconefish +Pink salmon +Píntano +Pipefish +Piranha +Pirarucu +Pirate perch +Plaice +Platy +Platyfish +Pleco +Plownose chimaera +Poacher +Pollyfish +Pollock +Pomfret +Pompano +Pompano dolphinfish +Ponyfish +Popeye catalufa +Porbeagle shark +Porcupinefish +Porgy +Port Jackson shark +Powen +Prickleback +Pricklefish +Prickly shark +Prowfish +Pufferfish +Pumpkinseed +Pupfish +Pygmy sunfish +Queen danio +Queen parrotfish +Queen triggerfish +Quillback +Quillfish +Rabbitfish +Raccoon butterfly fish +Ragfish +Rainbow trout +Rainbowfish +Rasbora +Ratfish +Rattail +Ray +Razorback sucker +Razorfish +Red grouper +Red salmon +Red snapper +Redfin perch +Redfish +Redhorse sucker +Redlip blenny +Redmouth whalefish +Redtooth triggerfish +Red velvetfish +Red whalefish +Reedfish +Reef triggerfish +Remora +Requiem shark +Ribbon eel +Ribbon sawtail fish +Ribbonfish +Rice eel +Ricefish +Ridgehead +Riffle dace +Righteye flounder +Rio Grande perch +River loach +River shark +River stingray +Rivuline +Roach +Roanoke bass +Rock bass +Rock beauty +Rock cod +Rocket danio +Rockfish +Rockling +Rockweed gunnel +Rohu +Ronquil +Roosterfish +Ropefish +Rough scad +Rough sculpin +Roughy +Roundhead +Round herring +Round stingray +Round whitefish +Rudd +Rudderfish +Ruffe +Russian sturgeon +Sabalo +Sabertooth 
+Saber-toothed blenny +Sabertooth fish +Sablefish +Sacramento blackfish +Sacramento splittail +Sailfin silverside +Sailfish +Salamanderfish +Salmon +Salmon shark +Sandbar shark +Sandburrower +Sand dab +Sand diver +Sand eel +Sandfish +Sand goby +Sand knifefish +Sand lance +Sandperch +Sandroller +Sand stargazer +Sand tiger +Sand tilefish +Sandbar shark +Sarcastic fringehead +Sardine +Sargassum fish +Sauger +Saury +Sawfish +Saw shark +Sawtooth eel +Scabbard fish +Scaly dragonfish +Scat +Scissortail rasbora +Scorpionfish +Sculpin +Scup +Sea bass +Sea bream +Sea catfish +Sea chub +Sea devil +Sea dragon +Sea lamprey +Sea raven +Sea snail +Sea toad +Seahorse +Seamoth +Searobin +Sevan trout +Sergeant major +Shad +Shark +Sharksucker +Sharpnose puffer +Sheatfish +Sheepshead +Sheepshead minnow +Shiner +Shortnose chimaera +Shortnose sucker +Shovelnose sturgeon +Shrimpfish +Siamese fighting fish +Sillago +Silver carp +Silver dollar +Silver dory +Silver hake +Silverside +Silvertip tetra +Sind danio +Sixgill ray +Sixgill shark +Skate +Skilfish +Skipjack tuna +Slender mola +Slender snipe eel +Sleeper +Sleeper shark +Slickhead +Slimehead +Slimy mackerel +Slimy sculpin +Slipmouth +Smalleye squaretail +Smalltooth sawfish +Smelt +Smelt-whiting +Smooth dogfish +Snailfish +Snake eel +Snakehead +Snake mackerel +Snapper +Snipe eel +Snipefish +Snook +Snubnose eel +Snubnose parasitic eel +Sockeye salmon +Soldierfish +Sole +South American darter +South American lungfish +Southern Dolly Varden +Southern flounder +Southern hake +Southern sandfish +Southern smelt +Spadefish +Spaghetti eel +Spanish mackerel +Spearfish +Speckled trout +Spiderfish +Spikefish +Spinefoot +Spiny basslet +Spiny dogfish +Spiny dwarf catfish +Spiny eel +Spinyfin +Splitfin +Spookfish +Spotted climbing perch +Spotted danio +Spottail pinfish +Sprat +Springfish +Squarehead catfish +Squaretail +Squawfish +Squeaker +Squirrelfish +Staghorn sculpin +Stargazer +Starry flounder +Steelhead +Stickleback +Stingfish +Stingray +Stonecat +Stonefish +Stoneroller minnow +Stream catfish +Striped bass +Striped burrfish +Sturgeon +Sucker +Suckermouth armored catfish +Summer flounder +Sundaland noodlefish +Sunfish +Surf sardine +Surfperch +Surgeonfish +Swallower +Swamp-eel +Swampfish +Sweeper +Swordfish +Swordtail +Tadpole cod +Tadpole fish +Tailor +Taimen +Tang +Tapetail +Tarpon +Tarwhine +Telescopefish +Temperate bass +Temperate ocean-bass +Temperate perch +Tench +Tenpounder +Tenuis +Tetra +Thorny catfish +Thornfish +Threadfin +Threadfin bream +Thread-tail +Three spot gourami +Threespine stickleback +Three-toothed puffer +Thresher shark +Tidewater goby +Tiger barb +Tigerperch +Tiger shark +Tiger shovelnose catfish +Tilapia +Tilefish +Titan triggerfish +Toadfish +Tommy ruff +Tompot blenny +Tonguefish +Tope +Topminnow +Torpedo +Torrent catfish +Torrent fish +Trahira +Treefish +Trevally +Triggerfish +Triplefin blenny +Triplespine +Tripletail +Tripod fish +Trout +Trout cod +Trout-perch +Trumpeter +Trumpetfish +Trunkfish +Tubeblenny +Tube-eye +Tube-snout +Tubeshoulder +Tui chub +Tuna +Turbot +Two spotted goby +Uaru +Unicorn fish +Upside-down catfish +Vanjaram +Velvet belly lanternshark +Velvet catfish +Velvetfish +Vendace +Vermilion snapper +Vimba +Viperfish +Wahoo +Walking catfish +Wallago +Walleye +Walleye pollock +Walu +Warmouth +Warty angler +Waryfish +Waspfish +Weasel shark +Weatherfish +Weever +Weeverfish +Wels catfish +Whale catfish +Whalefish +Whale shark +Whiff +Whitebait +White croaker +Whitefish +White marlin +White shark +Whitetip reef shark +Whiting 
+Wobbegong +Wolf-eel +Wolffish +Wolf-herring +Worm eel +Wormfish +Wrasse +Wrymouth +X-ray tetra +Yellow-and-black triplefin +Yellowback fusilier +Yellowbanded perch +Yellow bass +Yellowedge grouper +Yellow-edged moray +Yellow-eye mullet +Yellowhead jawfish +Yellowfin croaker +Yellowfin cutthroat trout +Yellowfin grouper +Yellowfin tuna +Yellowfin pike +Yellowfin surgeonfish +Yellowfin tuna +Yellow jack +Yellowmargin triggerfish +Yellow moray +Yellow perch +Yellowtail +Yellowtail amberjack +Yellowtail barracuda +Yellowtail clownfish +Yellowtail horse mackerel +Yellowtail kingfish +Yellowtail snapper +Yellow tang +Yellow weaver +Yellowtail catfish +Zander +Zebra bullhead shark +Zebra danio +Zebrafish +Zebra lionfish +Zebra loach +Zebra oto +Zebra pleco +Zebra shark +Zebra tilapia diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/flower.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/flower.txt" new file mode 100644 index 0000000000000000000000000000000000000000..bb3e984b6f2a9e3d661e084627001f3d42209d5c --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/flower.txt" @@ -0,0 +1,250 @@ +Acacia +Achillea +Adam's-needle +African Boxwood +African Lily +Agapanthus +Ageratum +Ageratum houstonim +Allium +Alpina +Alstroemeria +Amaranthus hypochondriacus +Amaryllis +Ammi majus +Anconitum +Anemone +Anigozanthus +Annual Delphinium +Anthurium +Antirrhinum majus +Artichoke thistle +Asparagus +Aster +Astilbe +Baby's Breath +Bachelor's Button +Banksia +Bellflower +Big Flax +Bighead Knapweed +Billy Buttons +Bird of Paradise +Blazing Star +Blue Lace Flower +Boronia +Bouvardia +Boxwood African +Diosma +Buckthorn Variegated +Buddleia +Bupleurum +Butterfly Bush +Butterfly Orchid +Calla Lily +Campanula +Candytuft +Canterbury Bells +Carnation +Carthamus +Casa Blanca +Caspia +Cattleya +Celosia +Celosia argenta +Centaurea cyanus +Chamelaucium +Chimney Bells +Chrysanthemum +Chrysanthemum x morifolium +Clarkia +Cockscomb Crested +Coffee Bean Berry +Common Myrtle +Common Yarrow +Cone Flower +Consolida ambigua +Convallaria +Cordyline +Cosmos +Cornflower +Craspedia +Curly Willow +Cymbidium +Cymbidium Orchid +Daffodil +Dahlia +Daisy Mums +Delphinium Belladonna +Delphinium Pacific Giant +Dendrobium +Dendrobium Orchid +Dianthus barbatus +Dianthus caryophyllus +Dianthus caryophyllus nana +Erica spp +Eucalyptus seeded +Eucalyptus silver dollar +Eustoma grandiflorum +False Bird of Paradise +False Spirea +Farewell-To-Spring +Fernleaf Yarrow +Feverfew +Flamingo Flower +Flax New Zealand +Floss Flower +Foxtail Fern +Freesia +Freesia x hybrida +Fuji Mums +Gardenia +Gay Feather +Genista +Gerbera +Gerbera Ruby Red +Ginger +Gladiolus +Gladiolus hybrid nanus +Goat's Beard +Godetia +Golden Rod +Guersney Lily +Gyp +Gypsophila paniculata +Hanging Helicona +Heath +Heather +Helianthus annuus +Heliconia spp. 
+Hippeastrum +Hydrangea +Iberis amara +Inca Lily +Iris +Japhette Orchid +Jonquil +Knapweed +Lace fern +Larkspur +Lathyrus odoratus +Lavandula +Lavender +Liatris +Lilac +Lily +Lilly-of-the-Valley +Lily Casa Blanca +Lily of the Field +Lily of the Nile +Lily Stargazer +Limonium +Lisianthus +Marguerite daisy +Mattholia incana +Melaleuca +Memosa +Misty Blue Limonium +Moluccella laevis +Monkshood +Montbretia +Monte Cassino +Moon orchid +Musa +Myrsine +Myrtle +Myrtus +Nephrolepis +Nerine +Nerine Lily +Nigella +Ornithogalum +Paeonia +Painted Tongue +Paper Reed +Papyrus lion's head +Peony +Peruvian Lily +Phalaenopsis +Philodendron +Phlox +Pincushion Flower +Pink Mink +Pitt +Pittosporum +Pixie Carnation +Polianthes tuberosa +Pompon Chrysanthemum +Poppy Anemone +Porium +Pussy Willow +Queen Anne's Lace +Ranunculus +Red Ribbons +Rice flower +Rose +Rose Bridal Pink +Rose Bridal White +Rose Champagne +Rose Diadem +Rose Emblem +Rose Kardinal +Rose Lady Liberty +Rose Lavanda +Rose Osiana +Rose Royalty +Safari Sunset +Safflower +Sage Perennial +Salix +Salmon Reagan +Sansevieria +Saponaria +Satin Flowers +Saxicola +Scabiosa +Schinus +Sea lavender +Shell Flowers +Snake Plant +Snapdragon +Solidago +Solidaster +Speedwell +Spider Lily +Spider Mums +Spray Carnation +Sprengeri Fern +Star of Bethlehem +Statice +Stenamezon +Stephanotis +Strawberry banksia +Strawflower +Summer poinsettia +Summer's Darling +Sunflower +Sweet Pea +Sweet William +Sword Fern +Syringa vulgaris +Tailflowers +Tassel flower +Thouroughwax +Throatwort +Tracelium +Tree Fern +Trumpet Lily +Tuberose +Tulip +Tulipa +Veronica +Wattle +Waxflower +Wild Plantain +Willow curly +Windflower +Wolfsbane +Zantedeschia +Zinna +Zinnia elegans diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/flowers/flowers_main.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/flowers/flowers_main.txt" new file mode 100644 index 0000000000000000000000000000000000000000..770b409536b76dd37863f6dd86562939f7b63f93 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/flowers/flowers_main.txt" @@ -0,0 +1 @@ +flower \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/flowers/flowers_misc.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/flowers/flowers_misc.txt" new file mode 100644 index 0000000000000000000000000000000000000000..8762ca98f3e152edcaee5e83c980b19f207b4d4e --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/flowers/flowers_misc.txt" @@ -0,0 +1,29 @@ +bouquet +coughing flowers +field +flower field +meadow +flower (symbol) +flower bed +flower bracelet +flower in mouth +flower necklace +flower on head +flower on liquid +flower trim +floral print +hair flower +hanafuda +hat flower +holding flower +garden +gardening +lei +on flower +outdoors +petals +rose petals +petals on liquid +smelling flower +wreath +head wreath \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/flowers/flowers_species.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/flowers/flowers_species.txt" new file mode 100644 index 0000000000000000000000000000000000000000..4af186a501cdf2bf0192d3ac1bb4622af9d42956 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/flowers/flowers_species.txt" @@ -0,0 +1,151 @@ 
+Alstroemeria (flower) +Amaryllis (flower) +Anemone +Angel's trumpet +Anthurium +Azalea +Baby's-breath +Begonia +Bellflower +Chinese bellflower +Bird of paradise flower +Black-eyed Susan +Bleeding heart +Blueberry blossoms +Bougainvillea +Calla lily +Camellia +Carnation +Red carnation +Chamomile +Cherry blossoms +Chinese lantern +Chrysanthemum +Clivia +Clematis +Clover +Columbine +Cockscomb (flower) +Cornflower +Cosmos +Crocus +Cymbidium +Daffodil +Dahlia +Daisy +Dimorphotheca +Dandelion +Dianthus +Dipsacaceae +Dusty miller +Edelweiss +Epiphyllum +Eustoma +Flax +Forget-me-not +Forsythia +Four o'clock (flower) +Foxglove +Freesia +Fuchsia +Hawthorn (plant) +Hellebore +Hibiscus +Heliconia +Hollyhock +Hyacinth +Hydrangea +Genista (flower) +Gentiana (flower) +Geranium +Gerbera +Gladiolus +Gloriosa (flower) +Grass lily +Great burnet +Jasmine +Iris +Kalanchoe +Kerria japonica +Lantana (flower) +Larkspur +Lavender +Lilac +Lily +Tiger lily +Lily of the valley +Lisianthus (flower) +Magnolia +Marigold +Marsh marigold +Mimosa (flower) +Morning glory +Moonflower +Gourd blossom +Nemophila +Nigella +Oleander +Orange blossoms +Orchid +Oncidium +Osmanthus +Gold osmanthus +Pansy +Passion flower +Paulownia +Peach blossom +Pentas (flower) +Peony +Perennial +Periwinkle +Petunia (flower) +Pieris japonica +Plant +Clover +Lotus +Lily pad +Potted plant +Plum blossoms +Plumeria +Poinsettia +Pomegranate flower +Poppy +Portulaca +Potato flower +Potentilla +Rafflesia +Ranunculus +Buttercup +Rapeseed blossoms +Red ginger +Reimu +Rhododendron +Rose +Black rose +Blue rose +Brown rose +Green rose +Grey rose +Orange rose +Pink rose +Purple rose +Red rose +White rose +Yellow rose +Multicolored rose +Rudbeckia +Snapdragon +Snowdrop +Spathiphyllum +Spider lily +Strawberry blossoms +Sunflower +Thistle +Trumpet creeper +Tulip +Water lily flower +Wax flower +Wisteria +Wolfsbane +Violet +Zinnia \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/food/food_actions.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/food/food_actions.txt" new file mode 100644 index 0000000000000000000000000000000000000000..e7108c10e3e4eafcca57c925f19da64400b6cef3 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/food/food_actions.txt" @@ -0,0 +1,36 @@ +cannibalism +cooking +baking +barbecue +chocolate making +grilling +roasting +cupping glass +dishwashing +drinking +drunk +eating +feeding +force-feeding +incoming food +flipping food +food fight +foodgasm +full mouth +hanging food +hungry +hunger hallucination +in food +licking +picnic +pouring +decantering +shop +shopping +shinkon santaku +spill +tasting +toast in mouth +the last supper +vore +whisking \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/food/food_breads.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/food/food_breads.txt" new file mode 100644 index 0000000000000000000000000000000000000000..01c591ee65c943428ec8b9404e1b8f699610e909 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/food/food_breads.txt" @@ -0,0 +1,45 @@ +bread +baguette +biscuit +bread bun +bread crust +bread eating race +japari bun +melon bread +toast +cereal +cracker +arare +senbei +croissant +empanada +noodles +pasta +lasagne +spaghetti +black spaghetti +spaghetti and meatballs +ramen +cup ramen +jirou +soba +yakisoba +italian 
(niigata) +yakisobapan +soumen +udon +kitsune udon +pancake +pancake stack +rice +curry rice +fried rice +omurice +onigiri +rice on face +rice paddy +rice porridge +nanakusa-no-sekku +risotto +scone +waffle \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/food/food_condiments.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/food/food_condiments.txt" new file mode 100644 index 0000000000000000000000000000000000000000..a8f75e5f67310d49e4e36c6def68f007444d1224 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/food/food_condiments.txt" @@ -0,0 +1,19 @@ +furikake +honey +jam +marmalade +ketchup +mayonnaise +mustard +peanut butter +pepper shaker +salt shaker +sauce +hot sauce +soy sauce +sugar (food) +sugar cube +syrup +chocolate syrup +strawberry syrup +wasabi \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/food/food_dairy.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/food/food_dairy.txt" new file mode 100644 index 0000000000000000000000000000000000000000..12a0b0f914be4d4771174fb3c1c0d3759f5a18e9 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/food/food_dairy.txt" @@ -0,0 +1,20 @@ +butter +cheese +smoked cheese +swiss cheese +cream +whipped cream +eggnog +kefir +milk +baby bottle +condensed milk +chocolate milk +milk bottle +milk carton +milk churn +milk mustache +milkshake +strawberry milk +yakult +yogurt \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/food/food_drink_containers.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/food/food_drink_containers.txt" new file mode 100644 index 0000000000000000000000000000000000000000..67c18128409ab0471acb1eaed4c597f5341d1725 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/food/food_drink_containers.txt" @@ -0,0 +1,32 @@ +bottle +soda bottle +can +cup +mug +sakazuki +yunomi +drinking glass +champagne coupe +champagne flute +cocktail glass +hurricane glass +shot glass +snifter +wine glass +coffee cup +coffee mug +coffee pot +coffee press +gourd +jar +jug +juice box +beer mug +canned tea +saucer +teacup +teapot +long spout teapot +side handle teapot +tea set +thermos \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/food/food_drink_main.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/food/food_drink_main.txt" new file mode 100644 index 0000000000000000000000000000000000000000..9a7ddc33ac7d5da29c8b6877196d473847929cdd --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/food/food_drink_main.txt" @@ -0,0 +1,57 @@ +alcohol +beer +non-alcoholic beer +guinness (beer) +champagne +cocktail +caesar (drink) +cocktail umbrella +margarita +martini +liquor +sake +tequila +vodka +whiskey +wine +coffee +cafe au lait +canned coffee +coffee beans +dandelion coffee +georgia max coffee +latte art +starbucks +juice +amazake +lemonade +orange juice +milk +soda +7up +calpis +coca-cola +dr pepper +fanta +melon soda +mountain dew +pepsi +pepsi ice cucumber +ramune +root beer +soda can +tea +barley tea +black tea +bubble tea +darjeeling tea +earl grey tea +green tea +herbal tea +iced tea 
+milk tea +rose hip tea +spear mint tea +tropical drink +blue hawaii +water \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/food/food_fruit.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/food/food_fruit.txt" new file mode 100644 index 0000000000000000000000000000000000000000..a134017fadaf3e4ddd57c5991e89b3115ce7811f --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/food/food_fruit.txt" @@ -0,0 +1,69 @@ +akebia fruit +apple +apple bunny +apple core +apple peel +bitten apple +candy apple +golden apple +green apple +red apple +avocado +banana +banana boat +banana peel +banana popsicle +banana slice +chocolate banana +gel banana +on banana +berry +blackberry +blueberry +raspberry +strawberry +cacao fruit +cherry +chinese lantern +dragon fruit +fig +fruit bowl +fruit cup +gooseberry +grapefruit +grapes +currant +grape stomping +kiwi +kiwi slice +lemon +lemon slice +lemonade +lime +lime slice +lychee +mango +mangosteen +melon +cantaloupe +honeydew +muskmelon +watermelon +olive +orange +mandarin orange +orangette +orange juice +orange slice +papaya +peach +pear +nashi pear +persimmon +pineapple +plum +pomegranate +rambutan +starfruit +umeboshi +yuzu (fruit) \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/food/food_main.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/food/food_main.txt" new file mode 100644 index 0000000000000000000000000000000000000000..393b93cfcc4ec8bf0db1ba8524c7534747cf6b4d --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/food/food_main.txt" @@ -0,0 +1 @@ +food \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/food/food_meal.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/food/food_meal.txt" new file mode 100644 index 0000000000000000000000000000000000000000..ff01db081c656cb64b17e612b38a24348b54a4eb --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/food/food_meal.txt" @@ -0,0 +1,73 @@ +canned food +cooking oil +crumbs +curry +curry rice +dumpling +jiaozi +shumai +tang yuan +croquette +fish and chips +flour +fondue +french fries +french toast +burger +megamac +hot dog +corn dog +katsu +katsudon +konnyaku +meal +breakfast +dim sum +dinner +feast +lunch +bento +okosama lunch +party +birthday party +tea party +baozi +nabe +oden +sukiyaki +narutomaki +okonomiyaki +omelet +pizza +holding pizza +pizza box +pizza delivery +pizza slice +ribs (food) +salad +salt +salt shaker +sandwich +soup +miso soup +zouni soup +sushi +conveyor belt sushi +gunkanmaki +inarizushi +makizushi +nigirizushi +sushi geta +taco +takoyaki +takuan +tamagokake gohan +tempura +shrimp tempura +tofu +aburaage +mapo tofu +stinky tofu +twice cooked pork +unadon (food) +zongzi \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/food/food_meat.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/food/food_meat.txt" new file mode 100644 index 0000000000000000000000000000000000000000..73a2eca75dabf361f490a46ad9cd40f3e5415c2b --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/food/food_meat.txt" @@ -0,0 +1,41 @@ +bacon +boned meat +chicken 
+chicken leg +chicken nuggets +fried chicken +yakitori +crab +egg +broken egg +egg laying +egg yolk +eggshell +eggshell hat +fried egg +century egg +hardboiled egg +scrambled egg +roe +caviar +ikura +fish +fish bone +sashimi +katsuo no tataki +gyuudon +hamburger steak +kamaboko +kebab +lobster +meatball +pork +ham +sausage +shrimp +shrimp tempura +steak +tako-san wiener +kani-san wiener +turkey +turkey leg \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/food/food_misc.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/food/food_misc.txt" new file mode 100644 index 0000000000000000000000000000000000000000..8261863d56008f4e527df0e2604228a9689cb510 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/food/food_misc.txt" @@ -0,0 +1,6 @@ +bad food +food +food-themed clothes +food-themed hair ornament +food girls +oversized food \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/food/food_professions_and_establishments.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/food/food_professions_and_establishments.txt" new file mode 100644 index 0000000000000000000000000000000000000000..b68c311fbf63ce116b0548da4cac963c87e9e9a4 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/food/food_professions_and_establishments.txt" @@ -0,0 +1,38 @@ +apron +naked apron +bakery +butler +chef +chef hat +toque blanche +flat top chef hat +food stand +ice cream stand +yatai +kitchen +maid +restaurant +bar +bartender +cafe +barista +maid cafe +cafeteria +conveyor belt sushi +diner +fast food +izakaya +menu +angel mort +anna miller +bronze parrot +hooters +koubeya +yoshinoya +tavern +yakiniku +supermarket +market +groceries +waitress +waiter \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/food/food_sexual.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/food/food_sexual.txt" new file mode 100644 index 0000000000000000000000000000000000000000..76d05aacc719d40b7ce2fd6a9e5337838b68ecee --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/food/food_sexual.txt" @@ -0,0 +1,14 @@ +cum on food +food on body +bust cup +chocolate on body +chocolate on breasts +naked chocolate +cream on body +food as clothes +chocolate clothes +food between breasts +food on breasts +nyotaimori +nantaimori +wakamezake \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/food/food_snacks.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/food/food_snacks.txt" new file mode 100644 index 0000000000000000000000000000000000000000..9fb6f86672d0d5a5090f2cb6ba885ae771c9b5c1 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/food/food_snacks.txt" @@ -0,0 +1,9 @@ +chikuwa +chips +bugles +doritos +potato chips +pringles +popcorn +pretzel +umaibou \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/food/food_sweets.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/food/food_sweets.txt" new file mode 100644 index 0000000000000000000000000000000000000000..ef556c7b1685684383b48bd06810a32464d7e612 --- 
/dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/food/food_sweets.txt" @@ -0,0 +1,156 @@ +anmitsu +mitsumame +batter +bread +anpan +cinnamon roll +gumball +hot cross bun +melon bread +noppo bread +pastry box +cake +baumkuchen +birthday cake +black forest cake +charlotte cake +cheesecake +rare cheesecake +brownie +chocolate cake +chocolate framboise +christmas cake +cupcake +fondant au chocolat +layer cake +madeleine +mille-feuille +mont blanc +opera cake +pound cake +red velvet cake +cake slice +stollen +strawberry shortcake +swiss roll +strawberry swiss roll +tiramisu +wedding cake +yule log +candy +candy cane +chitose ame +cigarette candy +creme egg +jelly bean +kitkat +konpeitou +lollipop +chupa chups +shaped lollipop +swirl lollipop +m&m's +marble chocolate +nerunerunerune +pinata +tootsweets +candy apple +caramel +chewing gum +chocolate +apollo chocolate +chocolate bar +chocolate cake +chocolate fountain +heart-shaped chocolate +chocolate marquise +chocolate syrup +kinoko no yama +takenoko no sato +white chocolate +churro +cookie +checkerboard cookie +chocolate chip cookie +country ma'am +gingerbread cookie +gingerbread house +gingerbread man +pocky +sandwich cookie +oreo +thumbprint cookie +thumbprint cookie +toppo +wafer +wafer stick +cotton candy +cream +bavarois +icing +crepe +dorayaki +dough +doughnut +chocolate doughnut +french cruller +sakura french +old-fashioned doughnut +gelatin +ice cream +banana split +coolish +ice cream cone +waffle cone +ice cream float +ice cream sandwich +parfait +strawberry parfait +single scoop +double scoop +triple scoop +quadruple scoop +too many scoops +soft serve +sprinkles +sundae +imagawayaki +marshmallow +mooncake +muffin +pastry +chocolate cornet +cream puff +eclair +macaron +pie +apple pie +popsicle +popsicle stick +tupet +watermelon bar +pudding +custard +pudding a la mode +shaved ice +slushie +tanghulu +tart +blueberry tart +fruit tart +strawberry tart +wagashi +baozi +momiji manjuu +mochi +dango +ichigo daifuku +kashiwa mochi +namagashi +sakura mochi +songpyeon +suama +taiyaki +uirou +warabimochi +youkan \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/food/food_sweets_main.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/food/food_sweets_main.txt" new file mode 100644 index 0000000000000000000000000000000000000000..83b5603b4c14a816dee9576013225d6f74652ff5 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/food/food_sweets_main.txt" @@ -0,0 +1,3 @@ +Sweets +Snacks +Dessert \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/food/food_utensils.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/food/food_utensils.txt" new file mode 100644 index 0000000000000000000000000000000000000000..6e260480f6db5ccb8de8abddfe5d51d00f363969 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/food/food_utensils.txt" @@ -0,0 +1,84 @@ +baking sheet +bamboo steamer +basket +picnic basket +shopping basket +shopping cart +bottle +squeeze bottle +bowl +plate +saucer +sakazuki +cake pan +chopsticks +chopstick rest +coffee grinder +cookie cutter +cutting board +dishes +drinking straw +bendy straw +crazy straw +heart straw +egg carton +food wrapper +fork +spork +frying pan +gravy boat +griddle +grill +kabayaki +shichirin +honey dipper +hot 
plate +ice cream scoop +kitchen scale +knife +butter knife +cleaver +kettle +ladle +lunchbox +bento +makisu +measuring cup +mess kit +microwave +milking machine +mini flag +mixer +napkin +oven +brick oven +oven mitts +pastry bag +peel +plate +holding plate +plate stack +pot +honeypot +cauldron +recipe +refrigerator +rice cooker +rolling pin +serving dome +sink +spatula +serving spatula +spoon +shamoji +chinese spoon +stove +takeout container +oyster pail +toaster +tongs +tray +cake stand +tiered tray +whisk +wok \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/food/food_vegetable.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/food/food_vegetable.txt" new file mode 100644 index 0000000000000000000000000000000000000000..f517f226b4273ee309985f42d180b4d6fd276fa7 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/food/food_vegetable.txt" @@ -0,0 +1,52 @@ +asparagus +beans +red bean paste +nattou +bitter melon +broccoli +cabbage +bok choy +carrot +cauliflower +coconut +corn +cucumber +pickle +eggplant +garlic +kimchi +lettuce +mint +mushroom +onion +onion rings +spring onion +parsley +plant +almond +chestnut +peanut +pepper +bell pepper +green pepper +red pepper +yellow pepper +chili pepper +habanero pepper +jalapeno pepper +pepper shaker +potato +radish +daikon +seaweed +hijiki +nori +squash +pumpkin +jack-o'-lantern +sweet potato +yakiimo +tomato +cherry tomato +turnip +warabi \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/foods.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/foods.txt" new file mode 100644 index 0000000000000000000000000000000000000000..181251200c411c5e9cfcdee0ae4a1277092317fe --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/foods.txt" @@ -0,0 +1,378 @@ +Abalone +Açaí berries +Açaí juice +Acorn squash +Adzuki bean paste +Adzuki beans +Aged Japanese kurozu +Albacore tuna +Alcohol +Ale +Alfalfa sprouts +Algae +Almond milk +Almond paste +Almonds +Ancho chili powder +Anchovies +Anchovy paste +Angus beef +Apple juice +Apples +Apricots, Japanese +Apricots +Arborio rice +Arctic char +Artichoke, Jerusalem +Artichokes +Arugula +Asian greens +Asparagus, Chinese +Asparagus +Autumn crocus +Avocado oil +Avocados +Bacon +Balsamic vinegar +Banana Split +Bananas +Barbecued meat +Basil +Bean curd +Beef +Beer +Beets +Bell peppers +Bitter almond oil +Bitter cucumber +Bitter melon +Black beans +Black cumin +Black currants +Black or purple rice +Black pepper +Black tea +Blackberries +Blueberries +Boiled Egg +Bok choy +Boysenberries +Brazil nuts +Bread +Broccoli sprouts +Broccoli +Brown mustard +Brown rice syrup +Brown rice +Brussels sprouts +Buckwheat +Butter +Butternut squash +Cabbage +Canola oil +Cantaloupe +Carrots +Cashews +Cauliflower +Caviar +Celeriac +Celery seed +Celery +Chamomile +Cheese +Cheese +Cherries +Chicken +Chickpeas +Chilli peppers +Chives +Chocolate +Cilantro +Cinnamon +Clams +Coconut oil +Coconut +Coffee +Collard greens +Corn oil +Corn +Crab +Cranberries +Cream +Cucumbers +Cumin +Curcumin +Currants +Daidzein +Dal +Dill +Dried herring +Dried mackerel +Dry beans +Edamame +Eggs +Escargot +Fennel seed +Fennel +Fermented bean paste +Fermented milk +Flaxseed oil +Flaxseed +Fried potatoes +Garbanzo beans +Garden cress +Garlic +Genistein +Ghee +Ginger +Grape seed oil +Grapefruit +Grapes 
+Gravy +Green beans +Green onions +Green papaya +Green peas +Green tea +Greens +Guacamole +Halibut +Ham +Hamburger +Herring +Holy basil +Honey +Honeydew melon +Horseradish +Hot peppers +Hotdog +Hummus +Indian mustard +Kale +Kefir +Kelp +Kidney beans +King mackerel +Kiwifruit +Kohlrabi +Kumquats +Lake trout +Lamb +Lard +Lavender +Leeks +Lemons +Lentils +Lettuce +Lima beans +Limes +Lingonberries +Liquor +Liver +Lobster +Loganberries +Long pepper +Low-fat yogurt +Macadamia nut oil +Macadamia nuts +Mackerel +Maitake mushrooms +Mangoes +Mangosteen +Margarine +Marionberrries +Maté +Mayonnaise +Melons +Melted Cheese +Mexican oregano +Milk +Milkshake +Mint tea +Mint +Mung beans +Mushrooms +Muskmelon +Mussels +Mustard greens +Mustard oil +Mustard +Mutton +Navy beans +Nutella +Oatmeal +Oats +Octopus +Ohyo +Oil +Olive oil +Olives +Omelet +Onions +Oranges +Oregano +Oysters +Papaya seeds +Papaya +Paprika +Parsley +Parsnips +Passion fruit +Pâtés +Paw paw +Peaches +Peanut Butter and Jelly +Peanut Butter +Peanut oil +Peanut +Peanuts +Pears +Peas +Pecans +Peppermint +Persipan +Pesto sauce +Pickled papaya +Pickled watermel. rind +Pickles +Pineapple +Pinto beans +Pistachio nuts +Pizza +Plantago +Plantains +Plums +Pomegranate juice +Pomegranates +Popcorn +Pork +Portobello mushrooms +Potatoes +Prunes +Pumpkin seeds +Pumpkins +Radicchio +Radish +Raisins +Rapeseed oil +Rapini +Raspberries +Red bean paste +Red beans +Red cabbage +Red currants +Red onions +Red pepper flakes +Red pepper paste +Red rice +Red spinach +Red wine vinegar +Reishi mushrooms +Rhubarb +Ribs +Rice bran oil +Rice bran +Rice wine vinegar +Rice +Risotto +Roast beef +Roasted almonds +Roasted pork +Roe +Rolled oats +Romaine lettuce +Rosemary +Rutabagas +Rye +Safflower oil +Saffron +Sage tea +Sage +Salmon, wild +Salt +Sardines +Sauerkraut +Sausages +Scallions +Scallops +Seaweed +Sesame oil +Sesame seeds +Shallots +Shellfish +Shiitake mushrooms +Shrimp +Smoked mackerel +Snails +Snickers Chocolate Bar +Snow peas +Soba noodles +Soy infant formula +Soy milk +Soy protein bars +Soy protein isolate +Soybean curd +Soybean oil +Soybeans +Spaghetti squash +Spearmint +Spinach +Split peas +Squash +Steak +Strawberries +String beans +Subtropical ginger +Sugar beets +Sugar snap peas +Sugar +Summer squash +Sunflower oil +Sunflower seeds +Sweet peas +Sweet potatoes +Tabasco sauce +Tacos +Tahini +Tallow +Tangerines +Tartary buckwheat +Tea +Thyme +Tofu +Tomato paste +Tomatoes +Tropical ginger +Turkey bacon +Turkey +Turmeric +Turnip greens +Turnips +Veal +Vinegar +Wakame +Walnut oil +Walnuts +Wasabi +Watercress +Watermelon seeds +Watermelon +Well-done meat +Wheat bran +Wheat germ +Wheat grass +White beans +White bread +White button mush. 
+White pepper +White tea +White vinegar +Whole wheat bread +Wild ginger +Wild rice +Wine +Winter squash +Yams +Yerba maté +Yogurt +Zucchini diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/fruit.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/fruit.txt" new file mode 100644 index 0000000000000000000000000000000000000000..1d1a078918c91aa17a4120f912fdd925e0464e35 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/fruit.txt" @@ -0,0 +1,372 @@ +Abiu +Acai +Acerola +Ackee +African Cherry Orange +Alligator Apple +Amazon Grape +Ambarella +American Mayapple +Apple +Apricot +Araza +Arhat +Avocado +Babaco +Bael +Banana +Barbadine +Barbados Cherry +Barberry +Bayberry +Beach Plum +Bearberry +Beechnut +Berry +Betel Nut +Bignay +Bilberry +Bilimbi +Bitter Gourd +Black Apple +Black Cherry +Black Mulberry +Black Raspberry +Black Sapote +Blackberry +Blackcurrant +Blood Orange +Blue Tongue +Blueberry +Bolwarra +Bottle Gourd +Boysenberry +Bramble +Brazil Nut +Breadfruit +Broadleaf Bramble +Buffaloberry +Burdekin Plum +Burmese Grape +Cacao +Caimito +Cajamanga +Calabashtree +Camucamu +Canistel +Cantaloupe +Cape Gooseberry +Carambola +Cardon +Carob +Cashew +Cedar Bay Cherry +Cempedak +Ceylon Gooseberry +Che +Chenet +Cherimoya +Cherry +Chinese Bayberry +Chinese Mulberry +Chokeberry +Citron +Clementine +Cloudberry +Cluster Fig +Coconut +Cocoplum +Coffee +Common Apple-Berry +Conkerberry +Cornelian Cherry +Crabapple +Cranberry +Crowberry +Cudrang +Cudrania +Cupuacu +Currant +Custard Apple +Damson +Date +Date Palm +Date-Plum +Davidson's Plum +Desert Fig +Desert Lime +Dewberry +Doubah +Dragonfruit +Durain +Eastern May Hawthorn +Eggfruit +Eggplant +Elderberry +Elephant Apple +Emblic +Emu Apple +Entawak +Etrog +Feijoa +Fibrous Satinash +Fig +Fiji Longan +Finger Lime +Galendar +Galia +Gandaria +Genip +Genipap +Giant Granadilla +Golden Apple +Gooseberry +Goumi +Gourds +Grape +Grapefruit +Grapple +Greengage +Grenadilla +Guanabana +Guarana +Guava +Guavaberry +Hackberry +Hardy Kiwi +Hawthorn +Hog plum +Honeycrisp Apple +Honeydew Melon +Honeysuckle +Horned Melon +Huckleberry +Huito +Illawarra Plum +Indian Almond +Indian Fig +Indian Jujube +Indian Prune +Indian Strawberry +Ita Palm +Jaboticaba +Jackfruit +Jagua +Jamaica Cherry +Jambul +Japanese Bayberry +Japanese Raisin +Jasmine +Jatoba +Jenipapo +Jocote +Jujube +June Plum +Kaffir Lime +Kahikatea +Kakadu Lime +Kakadu Plum +Kandis Fruit +Karkalla +Keppel Fruit +Key Lime +Kiwi +Kumquat +Kundong +Kutjera +Lablab +Lady Apple +Langsat +Lanzones +Lapsi +Legume +Lemon +Lemon Aspen +Leucaena +Lillipilli +Lilly Pilly +Lime +Lingonberry +Loganberry +Longan +Loquat +Lucuma +Lulo +Lychee +Mabolo +Macadamia +Malay Apple +Mamey Sapote +Mamoncillo +Mandarin +Mango +Mangosteen +Manila Tamarind +Manoao +Marang +Marion Berry +Mayapple +Mayhaw +Maypop +Medlar +Melinjo +Melon +Melon Pear +Midyim +Mock Buckthorn +Mock strawberry +Monkey Apple +Monstera +Morinda +Mountain Soursop +Mulberry +Mundu +Muntries +Muskmelons +Myrtle +Nageia +Nance +Nannyberry +Naranja +Naranjilla +Native Cherry +Native Currant +Native Gooseberry +Nectarine +Neem +Nungu +Nutmeg +Oil Palm +Olallieberry +Old World Sycomore +Olive +Orange +Orangelo +Oregon Grape +Otaheite Apple +Papaya +Passion Fruit +Pawpaw +Peach +Peanut +Pear +Pequi +Persimmon +Pewa +Pigeon Plum +Pigface +Pili Nut +Pineapple +Pitaya +Pitomba +Plantain +Plum +Podocarpus +Poha +Pois Doux +Pomcite +Pomegranate +Pomelo +Pommecythere +Pommerac 
+Pond Apple +Prickly Pear +Prumnopitys +Prune +Pulasan +Pummelo +Pumpkin +Pupunha +Purple Apple-Berry +Quandong +Quenepa +Quince +Raisin +Rambutan +Rangpur +Raspberry +Red Bayberry +Red Mombin +Red Mulberry +Redcurrant +Rhubarb +Riberry +Ridged Gourd +Rimu +Rose Apple +Rose Hip +Rose Myrtle +Rose-Leaf Bramble +Rowan +Sageretia +Saguaro +Salak +Salal Berry +Salmonberry +Sandpaper Fig +Santol +Sapodilla +Sapote +Saskatoon +Saskatoonberry +Satsuma +Sea Grape +Sea-Buckthorn +Serviceberry +Shipova +Siberian Kiwi +Silkworm Thorn +Snow berry +Soncoya +Soursop +Star Apple +Strawberry +Strawberry-Guava +Strawberry-Pear +Sugar Apple +Surinam Cherry +Sweet Apple-Berry +Sweet Lemon +Sweetsop +Sycamore Fig +Sycomore +Tamarillo +Tamarind +Tangelo +Tangerine +Tanjong +Taxus Baccata +Texas Persimmon +Thimbleberry +Tomato +Toyon +Ugli Fruit +Ugn +Uva/Grape +Vanilla +Velvet Tamarind +Voavanga +Water Apple +Watermelon +Wax Apple +Wax Gourd +White Aspen +White Mulberry +White Sapote +Wild Orange +Wineberry +Winter Melon +Wolfberry +Wongi +Wood Apple +Xigua +Xylocarp +Yali Pear +Yamamomo +Yangmei +Yellow Plum +Yumberry +Zhe +Zigzag Vine +Ziziphus +Zucchini diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/game_activities/game_activities_board.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/game_activities/game_activities_board.txt" new file mode 100644 index 0000000000000000000000000000000000000000..376a29af5241d6daa16f75c48a46811f97594f52 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/game_activities/game_activities_board.txt" @@ -0,0 +1,9 @@ +board game +chess +go +mahjong +monopoly +reversi +shogi +tabletop rpg +tic-tac-toe \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/game_activities/game_activities_card.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/game_activities/game_activities_card.txt" new file mode 100644 index 0000000000000000000000000000000000000000..69bce0edd31093b26b43660e38b92870fbe8a68b --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/game_activities/game_activities_card.txt" @@ -0,0 +1,6 @@ +card game +hanafuda +playing card +solitaire +tarot +uno \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/game_activities/game_activities_group.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/game_activities/game_activities_group.txt" new file mode 100644 index 0000000000000000000000000000000000000000..e73ff2465f97231e64eab3cc508272a08ced4b10 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/game_activities/game_activities_group.txt" @@ -0,0 +1,21 @@ +airsoft +Daruma-san ga koronda +goldfish scooping +hanetsuki +hide and seek +hopscotch +jump rope +double dutch +kendama +kibasen +king game +leap frog +obi spin +paintball +playing games +playing sports +olympics +racing +scavenger hunt +tug of war +twister \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/game_activities/game_activities_hand.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/game_activities/game_activities_hand.txt" new file mode 100644 index 0000000000000000000000000000000000000000..95916125ee9d23904f2e4b3f041cd29d10bc5cb5 
--- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/game_activities/game_activities_hand.txt" @@ -0,0 +1,4 @@ +cat's cradle +forehead flick +rock paper scissors +pen spinning \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/game_activities/game_activities_mechanical.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/game_activities/game_activities_mechanical.txt" new file mode 100644 index 0000000000000000000000000000000000000000..b7a843ed6afe5b7e5bf08a8bba0e9ebad978c285 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/game_activities/game_activities_mechanical.txt" @@ -0,0 +1,5 @@ +crane game +pachinko +pop-up pirate +video game +whac-a-mole \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/game_activities/game_activities_places.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/game_activities/game_activities_places.txt" new file mode 100644 index 0000000000000000000000000000000000000000..a233476b4c5205639231e8d56ca51e51ae2a6539 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/game_activities/game_activities_places.txt" @@ -0,0 +1,5 @@ +arcade +card +casino +casino token +dice \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/game_activities/game_activities_puzzle.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/game_activities/game_activities_puzzle.txt" new file mode 100644 index 0000000000000000000000000000000000000000..de9cf9c536b74593c1c9b6d53e73b87174253923 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/game_activities/game_activities_puzzle.txt" @@ -0,0 +1,7 @@ +crossword puzzle +jenga +metal puzzle +puzzle +jigsaw puzzle +sliding puzzle +rubik's cube \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/hair_objects_accessories.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/hair_objects_accessories.txt" new file mode 100644 index 0000000000000000000000000000000000000000..debbd657f606fa4c667331eebf8ebed08dfe2235 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/hair_objects_accessories.txt" @@ -0,0 +1,20 @@ +bun cover +hairpods +chopsticks +comb +hair ornament +hair bell +hair bobbles +hair bow +hair ribbon +hairclip +hairpin +hair flower +hair tubes +kanzashi +hair tie +hairband +hair weapon +headband +scrunchie +wig \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/neck_and_neckwear/neck_and_neckwear_actions.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/neck_and_neckwear/neck_and_neckwear_actions.txt" new file mode 100644 index 0000000000000000000000000000000000000000..e390fbe24c3b818299e674a779404dd66b957959 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/neck_and_neckwear/neck_and_neckwear_actions.txt" @@ -0,0 +1,7 @@ +arm around neck +arms around neck +hand on another's neck +hand on own neck +neck biting +necking +kissing neck \ No newline at end of file diff --git 
"a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/neck_and_neckwear/neck_and_neckwear_anatomy.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/neck_and_neckwear/neck_and_neckwear_anatomy.txt" new file mode 100644 index 0000000000000000000000000000000000000000..f7a863f8d83fc356a5bba174cd4ebb7e45d5839f --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/neck_and_neckwear/neck_and_neckwear_anatomy.txt" @@ -0,0 +1,4 @@ +collarbone +nape +neck +long neck \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/neck_and_neckwear/neck_and_neckwear_attire_accessories_objects.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/neck_and_neckwear/neck_and_neckwear_attire_accessories_objects.txt" new file mode 100644 index 0000000000000000000000000000000000000000..5c55092bfa9c1757c5c41ec0d99a1eb48324ad4b --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/neck_and_neckwear/neck_and_neckwear_attire_accessories_objects.txt" @@ -0,0 +1,38 @@ +ascot +bolo tie +bowtie +choker +ribbon choker +collar +neck bell +animal collar +spiked collar +studded collar +pet cone +feather boa +goggles around neck +headphones around neck +jabot +lanyard +neckerchief +necklace +chain necklace +flower necklace +lei +friendship charm +pearl necklace +pendant +amulet +locket +magatama +pentacle +neck ribbon +neck ruff +necktie +necktie between breasts +necktie on head +necktie grab +necktie removed +scarf +stole +cross tie \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/neck_and_neckwear/neck_and_neckwear_collar.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/neck_and_neckwear/neck_and_neckwear_collar.txt" new file mode 100644 index 0000000000000000000000000000000000000000..a66bd2d56d5c9c7d7448477a35eb2905c1b1fbd1 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/neck_and_neckwear/neck_and_neckwear_collar.txt" @@ -0,0 +1,14 @@ +collar grab +collar tug +adjusting collar +collared shirt +detached collar +fur collar +high collar +open collar +popped collar +sailor collar +turtleneck +sleeveless turtleneck +v-neck +wing collar \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/neck_and_neckwear/styles/neck_and_neckwear_styles_colors.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/neck_and_neckwear/styles/neck_and_neckwear_styles_colors.txt" new file mode 100644 index 0000000000000000000000000000000000000000..6d0fbe0a3e701936424a98d92f6c667642aa8017 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/neck_and_neckwear/styles/neck_and_neckwear_styles_colors.txt" @@ -0,0 +1,12 @@ +aqua neckwear +black neckwear +blue neckwear +brown neckwear +green neckwear +grey neckwear +orange neckwear +pink neckwear +purple neckwear +red neckwear +white neckwear +yellow neckwear \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/neck_and_neckwear/styles/neck_and_neckwear_styles_patterns.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/neck_and_neckwear/styles/neck_and_neckwear_styles_patterns.txt" new file mode 100644 index 
0000000000000000000000000000000000000000..bc167d592c3e8e3ba6c017f1376dee8c7b895041 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/neck_and_neckwear/styles/neck_and_neckwear_styles_patterns.txt" @@ -0,0 +1,3 @@ +checkered neckwear +plaid neckwear +striped neckwear \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/new.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/new.txt" new file mode 100644 index 0000000000000000000000000000000000000000..09594ff17642ebc812b2856cb343abd816d89ecd --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/new.txt" @@ -0,0 +1,4 @@ +obi +white kimono +kiseru +smoking pipe \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/piercings/general_locations/piercings_general_locations_body.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/piercings/general_locations/piercings_general_locations_body.txt" new file mode 100644 index 0000000000000000000000000000000000000000..f31623571c77c46a0888534125d043d84ad15f9e --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/piercings/general_locations/piercings_general_locations_body.txt" @@ -0,0 +1,11 @@ +Areola piercing +Back piercing +Corset piercing +Collarbone piercing +Navel piercing +Nipple piercing +Nipple bar +Nipple chain +Nipple lock +Nipple rings +Nipple stretcher \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/piercings/general_locations/piercings_general_locations_head_and_facial.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/piercings/general_locations/piercings_general_locations_head_and_facial.txt" new file mode 100644 index 0000000000000000000000000000000000000000..149031477271b76a18949c034501b553ff2e3e17 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/piercings/general_locations/piercings_general_locations_head_and_facial.txt" @@ -0,0 +1,11 @@ +Ear piercing +Industrial piercing +Eyebrow piercing +Anti-eyebrow piercing +Eyelid piercing +Lip piercing +Labret piercing +Neck Piercing +Nose piercing +Bridge piercing +Tongue piercing \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/piercings/piercings_genital_female.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/piercings/piercings_genital_female.txt" new file mode 100644 index 0000000000000000000000000000000000000000..08209a5d322ded94a2f79f7c274b36cceb87820a --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/piercings/piercings_genital_female.txt" @@ -0,0 +1,4 @@ +Clitoris piercing +Clitoris ring +Labia piercing +Fourchette piercing \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/piercings/piercings_genital_male.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/piercings/piercings_genital_male.txt" new file mode 100644 index 0000000000000000000000000000000000000000..c38a384fb1b9cd4dc4d4ba3ba78c1ca2e2ab67e7 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/piercings/piercings_genital_male.txt" @@ 
-0,0 +1,9 @@ +Penis piercing +Ampallang +Dydoe +Foreskin piercing +Frenulum piercing +Lorum piercing +Prince Albert +Guiche piercing +Scrotum piercing \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/piercings/piercings_main.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/piercings/piercings_main.txt" new file mode 100644 index 0000000000000000000000000000000000000000..82858b5916090a35da527ecc5197788c4fb45054 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/piercings/piercings_main.txt" @@ -0,0 +1,2 @@ +Piercing +Piercing through clothes \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/piercings/piercings_misc.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/piercings/piercings_misc.txt" new file mode 100644 index 0000000000000000000000000000000000000000..77fdf366ddea54586e78ac6cb54fbf066ac9172b --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/piercings/piercings_misc.txt" @@ -0,0 +1,3 @@ +Captive bead ring +Linked piercing +Plug (piercing) \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/plant/plant_actions.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/plant/plant_actions.txt" new file mode 100644 index 0000000000000000000000000000000000000000..947ae0da4f52f557203420841b2f769c3e5cbdab --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/plant/plant_actions.txt" @@ -0,0 +1,2 @@ +farming +gardening \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/plant/plant_locations.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/plant/plant_locations.txt" new file mode 100644 index 0000000000000000000000000000000000000000..3a0e86ba079b57610913013160bc3952bb83a933 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/plant/plant_locations.txt" @@ -0,0 +1,6 @@ +farm +field +garden +greenhouse +nature +outdoors \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/plant/plant_misc.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/plant/plant_misc.txt" new file mode 100644 index 0000000000000000000000000000000000000000..d6170e134aa124025a30645231b4beadcba44f12 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/plant/plant_misc.txt" @@ -0,0 +1,13 @@ +bare tree +floral arch +foliage +piranha plant +plant boy +plant girl +plant monster +plant wings +potted plant +flower pot +hanging plant +planter +topiary \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/plant/plant_parts.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/plant/plant_parts.txt" new file mode 100644 index 0000000000000000000000000000000000000000..67b62f5e539d571a4fe08971f80ee54f478102ac --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/plant/plant_parts.txt" @@ -0,0 +1,11 @@ +branch +branch +flower +fruit +petals +rose petals +plant cell +roots +aerial root +seed +tree stump \ No 
newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/plant/plant_types.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/plant/plant_types.txt" new file mode 100644 index 0000000000000000000000000000000000000000..ec2ea4b0bc5cf95a51e302be897930f563bf33ad --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/plant/plant_types.txt" @@ -0,0 +1,21 @@ +bush +rose bush +carnivorous plant +pitcher plant +venus flytrap +clover +flower +grass +bamboo +herb +cattail +marijuana +holly +moss +seaweed +succulent plant +agave +cactus +tree +vines +ivy \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/rpg-Item.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/rpg-Item.txt" new file mode 100644 index 0000000000000000000000000000000000000000..5e5df464817927ba8504af312b060731298fc1b4 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/rpg-Item.txt" @@ -0,0 +1,732 @@ +A Page of the Bible +A Page of the Unholy Book +A Small Piece of Wrath +Abyss Fall +Abyssal Pickaxe +Acorn Bomb +Adamantium +Aegis of Earth +Aegis of Flame +Aegis of Sea +Aegis of Storm +Agareth +Amulet +Ancient Branch +Ancient Ent +Ancient Fragment of Ruins +Ancient Hourglass +Ancient Leaf Circlet +Ancient Leaf +Ancient Token +Ancient Tree Bow +Ancient Tree Rifle +Ancient Tree Staff +Ancient Tree Sword +Ancient Writings +Antimatter +Archangel Halo +Archangel Token +Archangel Wings +Archangel +Armageddon +Armor of Corruption +Armor of Darkness +Armor of Deep Sea +Armor of Flame +Armor of Grudge +Armor of Poison +Arrows +Assassin Katars +Bag of coins +Bag of Corruption +Bag of Demise +Bag of Despair +Bag of Destruction +Bag of Explosion +Bag of Heaven +Bag of Hell +Bag of Life +Bag of Mighty Strength +Bag of Mother Nature +Bag of Strength +Bag of Universe +Bag +Bat Wings +Bear Fur Cloak +Beast Hunter +Beast Token +Bell of Light +Berserker Axe +Berserker Mask +Big Golem Token +Black Pearl +Blade of Requiem +Blaster +Blazing Heart +Blazing Soul +Blood Cannon Tepes +Blood Essence +Blood Orb +Bloodseeker Plates +Bloodsoul Plates +Bloodstone Plates +Bloodstone +Bloodstorm +Bloodsword Tepethia +Bloody Armor +Bloody Fang +Bloody Hatred +Bloody Helm +Bloody Robe +Bolts +Bone Dragon Token +Bone Dragon +Bone Necklace +Book of fireball +Book of Inferno +Boots +Bow of Corruption +Bow of Crimson Flame +Bow of Darkness +Bow of Deep Sea +Bow of Flame +Bow of Poison +Brave Heart +Brooch of Darkness +Bullet of Destruction +Bullet of Ruin +Burning Bone Fragments +Cage of Twisted Souls +Candy Bag +Candy-filled Bag +Cape of Corrupt Flame +Cape of Deep Abyss +Cape of Insatiable Hunger +Cape of the Count +Cape of the Dead +Cauldron +Chaos Dagger +Chaos Magic Stone +Chaos Pickaxe +Chaos +Charm +Chaser +Chest of tresures +Chrono Chaser +Chronos, the Cannon of Time +Claws of Beriel +Claws of Destruction +Cloak of the Fallen Shadows +Clown Token +Coin of Effort +Cold Ring of Death +Corrupt Angel Token +Corrupt Angel +Corrupt Creations +Corrupt Crystal Fragment +Corrupt Crystal Wings +Corrupt Token +Corruptor +Cosmic Reaper +Cosmic Star +Count Token +Count's Memento +Count's Old Cape +Count +Crab Armor +Creation of the Sky and Earth +Crescendo of Madness +Crimson Cross +Crimson Curse +Crimson Hatred +Crimson Pumpkin Rod +Crimson Reaper +Crimson Rose +Crimson Star +Crimson Stone +Crossbow of 
Desolation +Crossbow of Requiem +Crowbar +Crown of Absolute Zero +Crown of Cataclysm +Crown of Cleansing +Crown of Composure +Crown of Extreme Anger +Crown of Fervor +Crown of Great Rage +Crown of Rage +Crown of Serenity +Crown of Transcendance +Crystal Golem Token +Curse of Greed +Cursed Mask +Dagger of Night +Dagger +Dark Crystal +Dark Matter +Dark Wings +Death Fiend +Death Scythe +Death Visage +Deathbringer +Deep Armor +Deep Bow +Deep Helm +Deep Rifle +Deep Robe +Deep Staff +Deep Sword +Demon Hunter +Demon Lord Token +Demon Lord +Demonic Fist +Devil Blaster +Devil Chaser +Divine Slayer +Draconic Skin Rusalka +Dragon Armor Magron +Dragon Blaster +Dragon Bone +Dragon Buster +Dragon Cannon Belenus +Dragon Heart +Dragon Mask +Dragon Scale +Dragon Wings +Dragon's Eye +Dragon's Helm Agron +Dragonbow Aience +Dragonic Orb +Dragonstaff Irave +Dragonsword Lainhart +Droplet Orb +Druid's Mantle +Duke Token +Eana, the Holy Cannon +Eana, the True Holy Cannon +Earthen Plates +Emblem of Sacrifice +Embrace of Nature +Endless Snow +Ent Token +Essence of Anger +Essence of Corruption +Essence of Darkness +Essence of Earth +Essence of Flame +Essence of Light +Essence of Sea +Essence of Storm +Eternal Flame +Eternal Rose +Eternal Winter +Ethereal Dagger +Everfrost Gauntlet +Everfrost Token +Everfrost +Everfrostbane +Eye of the Abyss +Eye of the Flame Lord +Eye of the Reaper +Eye of Wallachia +Eyes of the Watcher +Fairy Garment +Fairy Robe +Fairy Wings +Fang Necklace +Felspike Armor +Fiend Chaser +Fiend Token +Final Omen +Flame Cloak +Flame Explosion +Flame Fairy Stone +Flame God Token +Flame Nightmare +Flame Soul +Flame Token +Flamereaver +Flameshard Armor +Forest Armor +Forest Helm +Forest Robe +Fragment of Angel's Feather +Fragment of Chaos +Fragment of Corrupt Angel's Feather +Fragment of Hell +Fragment of Oblivion +Fragment of Soul +Fragment of the Wind +Frorist +Frost Fairy Wings +Frost Herald's Garment +Frost Web Brooch +Frostbane +Frostshard Armor +Frostspider Lord Token +Frosty Armor +Frosty Bow +Frosty Helm +Frosty Rifle +Frosty Robe +Frosty Staff +Frosty Sword +Frozen Cobweb +Frozen Crystal +Frozen Debris +Frozen Heart +Frozen Robe +Fruit Candy Pouch +Fruit-filled Bag +Full Potion +Gaia Token +Gate of Abyss +Gatekeeper +Ghostly Thread +Giant Golem +Glacier Guard +Glasses +God's Page +Gown of Life +Grail of Eternity +Grandine, the Plates of Wraith +Grandmaster +Green Crystal Golem Token +Green Magic Stone +Grim Heart +Grim Visage +Grimoire of Darkness +Guardian Angel Token +Guardian Angel +Guardian of the Skies +Guardian Ring +Guardian Token +Guardian Vest +Halo of Judgment +Hard Turtle Shell +Harmonia, the Sparkle of Life +Harmony of Sky and Earth +Hatred +Heart of Fire +Heart of Ifrit +Heart of Life +Heart of Volcano +Heartrender +Heartseeker +Heaven's Door +Heaven's Fist +Heirloom of Lazarus +Hell Riser +Hellflame +Hellspike Armor +Helm of Battle +Helm of Corruption +Helm of Darkness +Helm of Deep Sea +Helm of Everfrost +Helm of Flame +Helm of Passion +Helm of Poison +Helm of the Dead +Hood of Apocalypse +Hood of Apocrypha +Hood of Everfrost +Hood of Grim Reaper +Hood of Mage +Hood of Reaper +Hood of the Fallen +Horn of Deep Sea +Horn of Demon God +Horn of Demon Lord +Horn of Devil +Horn of Dragon +Horn of God +Horn of Lightning God +Horns of Destruction +Hungry Bat Wings +Hydra Poison Tooth +Hydra Scale +Hydra Token +Hydra +Hydro Buster +Icicle Bow +Icicle Storm +Ifrit +Infected Crab Armor +Infected Tentacle +Inferno +Irbeth, the Sword of Lightning +Ispion, the Runed Ring +Ivory Cane +Jack 
Pumpkin Helm +Jack Pumpkin Hood +Jack Pumpkin Token +Javelins +Jewels +Katars +Key of Flame +Key of Sea +Key of Storm +King Kong's Claws +Last Word +Lifestone +Lightning God Token +Living Dragon Heart +Lord of Revolver +Lucky Acorn +Mad Clown +Mad Symbol +Mage Lord +Magic Stone +Magic Sword +Magical Horn +Magical Token +Mana Ancient +Mana Heart Frost +Mana Heart Large +Mana Heart +Mana Orb +Mark of Lord +Mark of Sin +Mark of Storm +Mask of Blood +Mask of Demigod +Master +Maw of Beriel +Maw of Greed +Maw of Monster +Meat Bag +Mithril Armor +Mithril Pickaxe +Monster Blaster +Monster Hunter +Nebula Rose +Nereid +Niflheim +Nightmare Rod +Nightmare Token +Oath of Courage +Orb of Oblivion +Orb of the Deep Sea +Orb +Orichalcum +Parasitic Staff +Parasitic Sword +Parchment of Doom +Pickaxe +Plates of Lightning +Plates of Regeneration +Plates of Sanguine Fury +Potion of Corruption +Potion of Healing +Potion of Mana +Power of Beriel +Precious Wind-Protection Clothing +Prius Gold Coin +Prius Silver Coin +Proof of Purification +Protection of the Fairies +Protector's Will +Pumpkin Rod +Punisher +Pure Crystal Wings +Pure Steel +Purefrost Gauntlet +Purelight Armor +Purified Water of Nature +Purified Water +Radiance of the One +Ragnaar +Rainbow Crystal Golem Token +Rainmaker +Raven Bow +Raven Rifle +Raven Staff +Raven Sword +Reactive Armor +Red Cross +Red Crystal Golem Token +Red Magic Stone +Rejuvenation Potion +Rifle of Corruption +Rifle of Crimson Flame +Rifle of Darkness +Rifle of Deep Sea +Rifle of Flame +Rifle of Poison +Ring of Composure +Ring of Count +Ring of Darkness +Ring of Deep Darkness +Ring of Desolate Soul +Ring of Despair +Ring of Destruction +Ring of Divine Winds +Ring of Eternity +Ring of Fate +Ring of Fervor +Ring of Frozen Soul +Ring of Holy Light +Ring of Hope +Ring of Insanity +Ring of Intrusion +Ring of Light +Ring of Nightmare +Ring of Poison +Ring of Resonance +Ring of Revival +Ring of Strike +Ring of the Dead +Ring of the One +Ring of the Reaper +Ring of the Sage +Ring of the Storm +Ring of Time +Ring +Ripple Snow +Robe of Archlich +Robe of Corruption +Robe of Darkness +Robe of Deep Sea +Robe of Everfrost +Robe of Flame +Robe of Poison +Robe of Storm +Robe of the Mage +Robe of the Necromancer +Robe of the Reaper +Robe of the Sage +Ruinbringer +Ruler of the Skies +Sacred Armor +Sacred Helm +Sanguis, the Robe of Sacrifice +Scale Protector +Scarab Bone +Scroll Oblivion +Sea God Token +Sea Jewel +Seal Breaking Gemstone +Seal of Corruption +Seal of Deceiver +Sealed Weapon +Seed of Malice +Shackles of Heaven +Shadow Dragon +Shell Sword +Shining Fragment of Ruins +Shroud of Perseverance +Silky Bag +Silver Fang +Silverfang Vest +Skeletal King +Skeleton Bone +Skeleton Token +Skeris, the Unholy Helm +Skull Seal +Slayer Boots +Snowman Pendant +Soul Blossom +Soul Devourer +Soul Drinker +Soul Eater +Soul of Abyss +Soul of the Dead +Soul Stone +Soulbane +Soulstalker +Spear of Judgment +Spearhead of Death +Spider Lord +Spider Token +Spirit Beast +Spirit Dagger +Spirit Flame +Spirit of Abyss +Spirit of Forest +Spirit of Nightmare +Spirit of Protector +Spirit Orb +Spiritus, the Sword of Elements +Splitting of the Sky and Earth +Staff of Abyss +Staff of Contagion +Staff of Corruption +Staff of Crimson Flame +Staff of Darkness +Staff of Deep Sea +Staff of Domination +Staff of Flame +Staff of Madness +Staff of Plague +Staff of Poison +Staff of Souls +Staff of Tentacles +Staff of Void +Stalker +Star of Apotheosis +Stash of Gold +Steel of God +Stone Plates +Storage Expansion Coupon +Storm Headdress 
+Storm Reaver +Storm +Sturdy Bag +Sword of Corruption +Sword of Crimson Flame +Sword of Darkness +Sword of Deep Sea +Sword of Flame +Sword of Poison +Tentacle Horn +Tentacle Lord +Tentacle Token +The Ancient Sword of Poison +The Archdevil's Bow +The Archdevil's Cannon +The Archdevil's Greatsword +The Archdevil's Staff +The Armor of Blood +The Armor of Demonblood +The Armor of Despair +The Armor of Lamentation +The Armor of Ruin +The Armor of the Reaper +The Bloodspear of Domination +The Bloodspear of Requiem +The Bloodspear +The Bloodsword of Domination +The Bow of Certain Destruction +The Bow of Divine Fury +The Bow of Frost +The Bow of God +The Bow of Skies +The Bow of Wind +The Bringer of Storms +The Bullet of Judgment +The Crimson Armor of Blood +The Crimson Sword of Requiem +The Devil Lord's Ring +The Earthen Heart +The Earthen Pillar +The Earthen Soul +The Ender +The Engraved Ring +The Greater Source of Mana +The Greatsword of Explosion +The Greatsword of Flame +The Holy Staff +The Holy Sword +The Illusional Sword of Poison +The Iron Bullet +The Origin of Knowledge +The Pipe of Chaos +The Reigning Sword of Poison +The Ring of Blood +The Robe of Heavens +The Robe of Hell +The Robe of Holy Maiden +The Robe of the Witch +The Rod of Earth +The Source of Mana +The Spear of Frost +The Spear of Heavenly Flows +The Spear of Lightning +The Staff of Chaos +The Staff of Duality +The Staff of Space +The Sword of Chaos +The Sword of Dreams +The Sword of Eternity +the Sword of Illusion +The Sword of Light +The Sword of Love and Hate +The Sword of Moonlight +The Sword of Nightmares +The Sword of Protection +The Sword of Slaughter +The Sword of Thousand Illusions +The True Holy Staff +The True Holy Sword +The True Staff of Space +The True Sword of Eternity +the True Sword of Illusion +The True Unholy Bow +The True Unholy Sword +The Unholy Armor +The Unholy Bow +The Unholy Sword +Tidal Robe +Titan's Plates +Token of Darkness +Token of Death +Token of Underlord +Tresure Keys +Trident +Triple Snow +Troll Axe +Troll Battlewear +Troll Blood +True Blood Cannon Tepes +True Bloodstone Plates +True Dragon Armor Magron +True Dragon Cannon Belenus +True Dragon Helm Agron +True Dragonbow Aience +True Dragonstaff Irave +True Dragonsword Lainhart +True Guardian Ring +Trueblood Ring +Tthe Ring of Chaos +Tthe Ring of Overlord +Tunic of Subtlety +Tunic of the Wraith +Turtle Lord +Turtle Token +Tusk Hunter +Twin Snow +Twisted Fragment of Ruins +Valtora +Very Silky Bag +Very Sturdy Bag +Vest of Cyclone +Vest of Divine Winds +Vest of Protection +Vest of Storm +Visage of Vengeance +Visvire, the Hammer of Storm +Void Eye +Voidshard Armor +Walrus Skin Cloak +Wand +Wanderer's Cape +Wands +Warm Fur +Water Cannon +White Crystal Golem Token +White Magic Stone +Wind-Protection Clothing +Wing Frame +Wing of Sea +Wings of Blood Lord +Wings of Cosmos +Wings of Death Token +Wings of Death +Wings of Flame Lord +Wings of Frost Lord +Wings of Fury +Wings of Immortal +Wings of Nature +Wings of Night Lord +Wings of Phoenix +Wings of Purification +Wings of Sacred Provenance +Wings of Slayer +Winter Robe +Winter Rose +Winter's Heart +Winter's Tear +Yellow Crystal Golem Token +Zombie Lord +Zombie Token diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/sex_objects/sex_objects_BDSM.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/sex_objects/sex_objects_BDSM.txt" new file mode 100644 index 
0000000000000000000000000000000000000000..c2ed45ac364e58b11d1425545c9be9aede2a7e6f --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/sex_objects/sex_objects_BDSM.txt" @@ -0,0 +1,41 @@ +blindfold +bondage outfit +chain +nipple chain +chastity belt +chastity bra +chastity cage +Clamp +clitoris clamp +nipple clamps +tongue clamp +collar +cuffs +shackles +handcuffs +hobble +gag +ball gag +bit gag +cleave gag +panty gag +plug gag +ring gag +tape gag +gimp suit +leash +clitoris leash +leash on penis +nipple leash +nose hook +piercing +pillory +stocks +rope +crotch rope +sounding +spreader bar +whip +riding crop +wooden horse +judas cradle \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/sex_objects/sex_objects_fluids.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/sex_objects/sex_objects_fluids.txt" new file mode 100644 index 0000000000000000000000000000000000000000..9d96910615462678e0183947929c91d4a8565f84 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/sex_objects/sex_objects_fluids.txt" @@ -0,0 +1,5 @@ +aphrodisiac +enema +lotion +lotion bottle +lube \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/sex_objects/sex_objects_sex_toys.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/sex_objects/sex_objects_sex_toys.txt" new file mode 100644 index 0000000000000000000000000000000000000000..0021f0e8a2f8105f4179e80a216b5c253e728da4 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/sex_objects/sex_objects_sex_toys.txt" @@ -0,0 +1,48 @@ +anal beads +anal tail +artificial vagina +tenga +butt plug +aneros +butt plug tail +cock ring +dildo +double dildo +huge dildo +dragon dildo +horse dildo +spiked dildo +strap-on +suction cup dildo +dildo riding +food insertion +dildo gag +mask challenge (meme) +dildo harness +dildo under panties +prostate massager +breast pump +clitoris pump +sex machine +too many sex toys +sounding +catheter +urethral beads +vibrator +bunny vibrator +butterfly vibrator +egg vibrator +hitachi magic wand +remote control vibrator +riding machine +sybian +public vibrator +vibrator under clothes +vibrator in leg garter +vibrator in thighhighs +vibrator in thigh strap +vibrator under panties +vibrator under pantyhose +vibrator on nipple +vibrator in anus +vibrator on penis \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/shoulders/attire/clothing_desgined_to_leave_shoulders_bare/shoulders_attire_clothing_desgined_to_leave_shoulders_bare_asymmetrical_necklines.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/shoulders/attire/clothing_desgined_to_leave_shoulders_bare/shoulders_attire_clothing_desgined_to_leave_shoulders_bare_asymmetrical_necklines.txt" new file mode 100644 index 0000000000000000000000000000000000000000..48b519e212d65d6cb1caef0f02bbfc19281c34e6 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/shoulders/attire/clothing_desgined_to_leave_shoulders_bare/shoulders_attire_clothing_desgined_to_leave_shoulders_bare_asymmetrical_necklines.txt" @@ -0,0 +1,3 @@ +Single-shoulder shirt +Single-shoulder dress +Single-shoulder sweater \ No newline at end of file diff --git 
"a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/shoulders/attire/clothing_desgined_to_leave_shoulders_bare/shoulders_attire_clothing_desgined_to_leave_shoulders_bare_low_and_wide_cut_necklines.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/shoulders/attire/clothing_desgined_to_leave_shoulders_bare/shoulders_attire_clothing_desgined_to_leave_shoulders_bare_low_and_wide_cut_necklines.txt" new file mode 100644 index 0000000000000000000000000000000000000000..a06f62f0f9be4a2a9befb0ccf5d47bca02924966 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/shoulders/attire/clothing_desgined_to_leave_shoulders_bare/shoulders_attire_clothing_desgined_to_leave_shoulders_bare_low_and_wide_cut_necklines.txt" @@ -0,0 +1,8 @@ +Off-shoulder shirt +Off-shoulder dress +Off-shoulder sweater +Off-shoulder jacket +Off-shoulder coat +Off-shoulder bikini +Off-shoulder one-piece swimsuit +Off-shoulder leotard \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/shoulders/attire/clothing_desgined_to_leave_shoulders_bare/shoulders_attire_clothing_desgined_to_leave_shoulders_bare_sleeveless.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/shoulders/attire/clothing_desgined_to_leave_shoulders_bare/shoulders_attire_clothing_desgined_to_leave_shoulders_bare_sleeveless.txt" new file mode 100644 index 0000000000000000000000000000000000000000..705e00f8282747880ce6b56850e26d56c3fbac7f --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/shoulders/attire/clothing_desgined_to_leave_shoulders_bare/shoulders_attire_clothing_desgined_to_leave_shoulders_bare_sleeveless.txt" @@ -0,0 +1,5 @@ +Sleeveless shirt +Sleeveless dress +Sleeveless sweater +Sleeveless jacket +Sleeveless coat \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/shoulders/attire/clothing_desgined_to_leave_shoulders_bare/shoulders_attire_clothing_desgined_to_leave_shoulders_bare_strapless.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/shoulders/attire/clothing_desgined_to_leave_shoulders_bare/shoulders_attire_clothing_desgined_to_leave_shoulders_bare_strapless.txt" new file mode 100644 index 0000000000000000000000000000000000000000..be07be8fc75b9d240933c87ed841f64682b46df9 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/shoulders/attire/clothing_desgined_to_leave_shoulders_bare/shoulders_attire_clothing_desgined_to_leave_shoulders_bare_strapless.txt" @@ -0,0 +1,7 @@ +Tube top +Strapless shirt +Strapless dress +Strapless bra +Strapless bikini +Strapless swimsuit +Strapless leotard \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/shoulders/attire/shoulders_attire_exposed.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/shoulders/attire/shoulders_attire_exposed.txt" new file mode 100644 index 0000000000000000000000000000000000000000..39198be18bd2ac1f7027efc1e4728a26bbd03087 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/shoulders/attire/shoulders_attire_exposed.txt" @@ -0,0 +1,3 @@ +Bare shoulders +Off shoulder +Strap slip \ No newline at end of file diff --git 
"a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/shoulders/attire/shoulders_attire_for.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/shoulders/attire/shoulders_attire_for.txt" new file mode 100644 index 0000000000000000000000000000000000000000..cc0293d82dafb588d0299f714b38198a4e3d69ee --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/shoulders/attire/shoulders_attire_for.txt" @@ -0,0 +1,4 @@ +Epaulettes +Pauldrons +Puffy sleeves +Shoulder pads \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/shoulders/attire/shoulders_attire_other.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/shoulders/attire/shoulders_attire_other.txt" new file mode 100644 index 0000000000000000000000000000000000000000..5fcae6554b1e6183d493307fbda7bbbbb190e7ab --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/shoulders/attire/shoulders_attire_other.txt" @@ -0,0 +1,2 @@ +Jacket on shoulders +Shirt on shoulders \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/shoulders/shoulders_anatomy.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/shoulders/shoulders_anatomy.txt" new file mode 100644 index 0000000000000000000000000000000000000000..bfafbd7fc6943ab1381e812a2d45ad40c5d55260 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/shoulders/shoulders_anatomy.txt" @@ -0,0 +1,3 @@ +Broad shoulders +Nape +Shoulder blades \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/shoulders/with_something_else/shoulders_with_something_else_animals.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/shoulders/with_something_else/shoulders_with_something_else_animals.txt" new file mode 100644 index 0000000000000000000000000000000000000000..546b22cb22a728385d8fe18a7eb735046343bde0 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/shoulders/with_something_else/shoulders_with_something_else_animals.txt" @@ -0,0 +1,2 @@ +Bird on shoulder +Cat on shoulder \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/shoulders/with_something_else/shoulders_with_something_else_hands.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/shoulders/with_something_else/shoulders_with_something_else_hands.txt" new file mode 100644 index 0000000000000000000000000000000000000000..7823e7734a4c547ff5b03176ce729145c28c470b --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/shoulders/with_something_else/shoulders_with_something_else_hands.txt" @@ -0,0 +1,6 @@ +Arm around neck +Arm on shoulder +Hand on shoulder +Hands on shoulder +Hands on shoulders +Shoulder massage \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/shoulders/with_something_else/shoulders_with_something_else_misc.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/shoulders/with_something_else/shoulders_with_something_else_misc.txt" new file mode 100644 index 
0000000000000000000000000000000000000000..6c73455e6e9030c62c715b0d0b0fdf4d3474c9e8 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/shoulders/with_something_else/shoulders_with_something_else_misc.txt" @@ -0,0 +1,3 @@ +Hair over shoulder +Over shoulder +Shoulder bag \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/shoulders/with_something_else/shoulders_with_something_else_objects.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/shoulders/with_something_else/shoulders_with_something_else_objects.txt" new file mode 100644 index 0000000000000000000000000000000000000000..d45c72448a6e86769f52e37041ffa2754de12311 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/shoulders/with_something_else/shoulders_with_something_else_objects.txt" @@ -0,0 +1,3 @@ +Towel around neck +Weapon over shoulder +Sword over shoulder \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/shoulders/with_something_else/shoulders_with_something_else_touching.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/shoulders/with_something_else/shoulders_with_something_else_touching.txt" new file mode 100644 index 0000000000000000000000000000000000000000..f49a65732508bf88bb4ee07db8e543570dde043d --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/shoulders/with_something_else/shoulders_with_something_else_touching.txt" @@ -0,0 +1,5 @@ +Carrying over shoulder +Fireman's carry +Piggyback +Shoulder carry +Standing on shoulder \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/sleeves/sleeves_actions.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/sleeves/sleeves_actions.txt" new file mode 100644 index 0000000000000000000000000000000000000000..1b40543bcc6a3b25609675c3beb2647875c67b4b --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/sleeves/sleeves_actions.txt" @@ -0,0 +1,5 @@ +arm out of sleeve +hands in opposite sleeves +pinching sleeves +pinching sleeves +sleeve grab \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/sleeves/sleeves_colors.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/sleeves/sleeves_colors.txt" new file mode 100644 index 0000000000000000000000000000000000000000..f361100767e36242a5ec8d1af39afc430d34f411 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/sleeves/sleeves_colors.txt" @@ -0,0 +1,12 @@ +aqua sleeves +black sleeves +blue sleeves +brown sleeves +green sleeves +grey sleeves +orange sleeves +pink sleeves +purple sleeves +red sleeves +white sleeves +yellow sleeves \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/sleeves/sleeves_lack_thereof.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/sleeves/sleeves_lack_thereof.txt" new file mode 100644 index 0000000000000000000000000000000000000000..1dcf794f7c87b7be31120658c991f517a1be5850 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/sleeves/sleeves_lack_thereof.txt" @@ -0,0 +1,9 @@ +sleeveless 
+sleeveless shirt +sleeveless dress +sleeveless duster +sleeveless hoodie +sleeveless jacket +sleeveless kimono +sleeveless shirt +sleeveless turtleneck \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/sleeves/sleeves_lenght.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/sleeves/sleeves_lenght.txt" new file mode 100644 index 0000000000000000000000000000000000000000..d5ceaded8dacc560d5abb797da01c3c4597196d5 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/sleeves/sleeves_lenght.txt" @@ -0,0 +1,5 @@ +short sleeves +long sleeves +sleeves past wrists +sleeves past fingers +sleeves past elbows \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/sleeves/sleeves_misc.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/sleeves/sleeves_misc.txt" new file mode 100644 index 0000000000000000000000000000000000000000..2a37bd71a02fcd261b0693d11fc34ce3ea1c0211 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/sleeves/sleeves_misc.txt" @@ -0,0 +1,6 @@ +nipple sleeves +no detached sleeves +single sleeve +sleeve cuffs +up sleeve +See also \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/sleeves/sleeves_prints.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/sleeves/sleeves_prints.txt" new file mode 100644 index 0000000000000000000000000000000000000000..c8a46440a11c12ed7b9c5a58303e461e54cd4563 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/sleeves/sleeves_prints.txt" @@ -0,0 +1,5 @@ +print sleeves +checkered sleeves +plaid sleeves +polka dot sleeves +striped sleeves \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/sleeves/sleeves_styled.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/sleeves/sleeves_styled.txt" new file mode 100644 index 0000000000000000000000000000000000000000..a914e2a55e5ece52c0e010cc9a7a6d2a092df19d --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/sleeves/sleeves_styled.txt" @@ -0,0 +1,3 @@ +sleeves rolled up +sleeves pushed up +sleeves rolled up \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/sleeves/sleeves_styles.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/sleeves/sleeves_styles.txt" new file mode 100644 index 0000000000000000000000000000000000000000..bf152e2c8a895fd10c7863b71df14eaa47a9bbfd --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/sleeves/sleeves_styles.txt" @@ -0,0 +1,22 @@ +uneven sleeves +bell sleeves +detached sleeves +compression sleeve +elbow sleeve +single detached sleeve +lace-up sleeves +layered sleeves +long sleeves +mismatched sleeves +puff and slash sleeves +puffy sleeves +puffy detached sleeves +puffy long sleeves +puffy short sleeves +raglan sleeves +ribbed sleeves +see-through sleeves +short sleeves +short over long sleeves +torn sleeves +wide sleeves \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/sleeves/sleeves_trims.txt" 
"b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/sleeves/sleeves_trims.txt" new file mode 100644 index 0000000000000000000000000000000000000000..b22ba94310a6d545e1e3eed29a94165d35c2f386 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/sleeves/sleeves_trims.txt" @@ -0,0 +1,5 @@ +feather-trimmed sleeves +frilled sleeves +fur-trimmed sleeves +lace-trimmed sleeves +ribbon-trimmed sleeves \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/swimsuit/swimsuit_colors.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/swimsuit/swimsuit_colors.txt" new file mode 100644 index 0000000000000000000000000000000000000000..c450e05a67187a4d87e706d57fabf3ae7eb72835 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/swimsuit/swimsuit_colors.txt" @@ -0,0 +1,14 @@ +aqua one-piece swimsuit +black one-piece swimsuit +blue one-piece swimsuit +brown one-piece swimsuit +green one-piece swimsuit +grey one-piece swimsuit +orange one-piece swimsuit +pink one-piece swimsuit +red one-piece swimsuit +purple one-piece swimsuit +white one-piece swimsuit +yellow one-piece swimsuit +gold one-piece swimsuit +silver one-piece swimsuit \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/swimsuit/swimsuit_male.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/swimsuit/swimsuit_male.txt" new file mode 100644 index 0000000000000000000000000000000000000000..59dda0c4a8a3969bcc5a1f879018bf468d2b8c90 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/swimsuit/swimsuit_male.txt" @@ -0,0 +1,6 @@ +jammers +legskin +swim briefs +swim trunks +loincloth +fundoshi \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/swimsuit/swimsuit_misc.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/swimsuit/swimsuit_misc.txt" new file mode 100644 index 0000000000000000000000000000000000000000..83a4a333d26e020e6c324610756bf6d7a70a1dbd --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/swimsuit/swimsuit_misc.txt" @@ -0,0 +1,16 @@ +adjusting swimsuit +bikini lift +bikini bottom only +bikini top only +hand under swimsuit +impossible swimsuit +swim cap +swimsuit aside +bikini aside +swimsuit costume +swimsuit pull +bikini pull +swimsuit under clothes +swimsuit cover-up +torn swimsuit +sarong \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/swimsuit/swimsuit_styles.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/swimsuit/swimsuit_styles.txt" new file mode 100644 index 0000000000000000000000000000000000000000..52cb1db4466183bbf7ce8f1bc93be1af59bbc447 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/swimsuit/swimsuit_styles.txt" @@ -0,0 +1,13 @@ +bow swimsuit +frilled swimsuit +frilled bikini +highleg swimsuit +highleg bikini +polka dot swimsuit +polka dot bikini +striped one-piece swimsuit +striped bikini +white one-piece swimsuit +white bikini +g-string +thong \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/swimsuit/swimsuit_types.txt" 
"b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/swimsuit/swimsuit_types.txt" new file mode 100644 index 0000000000000000000000000000000000000000..dfaaa867c222e323c70c3f70861d6cf826f8da1e --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/swimsuit/swimsuit_types.txt" @@ -0,0 +1,23 @@ +bikini +bikini skirt +checkered bikini +leaf bikini +lowleg bikini +micro bikini +o-ring bikini +o-ring top +o-ring bottom +shell bikini +side-tie bikini bottom +sports bikini +string bikini +tankini +thong bikini +rash guard +old-fashioned swimsuit +one-piece swimsuit +casual one-piece swimsuit +dress swimsuit +competition swimsuit +school swimsuit +slingshot swimsuit \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/technology/armor/technology_armor_futuristic.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/technology/armor/technology_armor_futuristic.txt" new file mode 100644 index 0000000000000000000000000000000000000000..af4da36513da35c490b9e82f922cdf18d9b83cfb --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/technology/armor/technology_armor_futuristic.txt" @@ -0,0 +1,7 @@ +Beam shield +Exoskeleton +Power armor +Power suit +Gravity Suit +SEVA Suit +Power suit \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/technology/armor/technology_armor_modern.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/technology/armor/technology_armor_modern.txt" new file mode 100644 index 0000000000000000000000000000000000000000..fa817cae266b964cc970612f1a99121007059a7c --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/technology/armor/technology_armor_modern.txt" @@ -0,0 +1,3 @@ +Bulletproof vest +Riot shield +Shin guards \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/technology/parts/technology_parts_electronic.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/technology/parts/technology_parts_electronic.txt" new file mode 100644 index 0000000000000000000000000000000000000000..97e93c240b4c4172dd45cd88f2dbcecc94188529 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/technology/parts/technology_parts_electronic.txt" @@ -0,0 +1,9 @@ +Bolt +Crank +Gears +Mechanical parts +Nut +Piston +Rotary switch +Screw +Tube \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/technology/parts/technology_parts_main.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/technology/parts/technology_parts_main.txt" new file mode 100644 index 0000000000000000000000000000000000000000..6ead7cb304d004b4c81b25ba54daaf9f9a77e16f --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/technology/parts/technology_parts_main.txt" @@ -0,0 +1 @@ +Machine \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/technology/parts/technology_parts_mechanical.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/technology/parts/technology_parts_mechanical.txt" new file mode 100644 index 
0000000000000000000000000000000000000000..97e93c240b4c4172dd45cd88f2dbcecc94188529 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/technology/parts/technology_parts_mechanical.txt" @@ -0,0 +1,9 @@ +Bolt +Crank +Gears +Mechanical parts +Nut +Piston +Rotary switch +Screw +Tube \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/technology/technology_artificial_life.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/technology/technology_artificial_life.txt" new file mode 100644 index 0000000000000000000000000000000000000000..9820400bf6270d37113e7cebc15afbe1322b4a29 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/technology/technology_artificial_life.txt" @@ -0,0 +1,10 @@ +Cyber elves +Mecha-drago +Transformers +Autobot +Aerialbot +Protectobot +Decepticon +Combaticon +Constructicon +Stunticon \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/technology/technology_computers.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/technology/technology_computers.txt" new file mode 100644 index 0000000000000000000000000000000000000000..896c9cc998d0698c4efe7e4ca7916e63058ec680 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/technology/technology_computers.txt" @@ -0,0 +1,18 @@ +Computer +Laptop +Computer chip +Keyboard +Mouse +Mousepad (object) +Ass mousepad +Breast mousepad +Monitor +Curved monitor +Multiple monitors +RAM +Tablet +Trackball +Screen +Floating screen +Touchscreen +USB \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/technology/technology_costumes.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/technology/technology_costumes.txt" new file mode 100644 index 0000000000000000000000000000000000000000..92d1724f85ba687f4bd20d67dd4e1576caf9fb74 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/technology/technology_costumes.txt" @@ -0,0 +1,2 @@ +Neko cyber +Transforming clothes \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/technology/technology_descriptions.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/technology/technology_descriptions.txt" new file mode 100644 index 0000000000000000000000000000000000000000..97e81d30ca43a3ccfadcf076a35f6acab2a65343 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/technology/technology_descriptions.txt" @@ -0,0 +1,6 @@ +Clockwork +Glowing +Mechanical +Neon trim +Radioactive +Transformation \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/technology/technology_holograms.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/technology/technology_holograms.txt" new file mode 100644 index 0000000000000000000000000000000000000000..e5eec798e4554fce3a05de905303c9d5d7e83eb0 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/technology/technology_holograms.txt" @@ -0,0 +1,5 @@ +Hologram +Holographic interface +Holographic keyboard +Holographic touchscreen +Holographic monitor \ No newline at end of 
file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/technology/technology_jobs.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/technology/technology_jobs.txt" new file mode 100644 index 0000000000000000000000000000000000000000..37a607662e96febf1cc1056fcc54508a2d2444d3 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/technology/technology_jobs.txt" @@ -0,0 +1,4 @@ +Engineer +Mechanic +Pilot +Programming \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/technology/technology_mecha.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/technology/technology_mecha.txt" new file mode 100644 index 0000000000000000000000000000000000000000..ff4e5d9a08ea9a05f0adff03e10e7ade8259f461 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/technology/technology_mecha.txt" @@ -0,0 +1,15 @@ +Power suit +Plugsuit +Exoskeleton +Mecha +Gundam +Mechanization +Sand mecha +Mechanical broom +Power glove +Projector +Jetpack +Striker unit +Helicopter ears +Helicopter hair +Utility belt \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/technology/technology_medical.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/technology/technology_medical.txt" new file mode 100644 index 0000000000000000000000000000000000000000..a0c5b0101713abbed09704ad0d9d945fb87cbbcc --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/technology/technology_medical.txt" @@ -0,0 +1,7 @@ +Bandages +Electrodes +Gauze +Pill +Stretcher +Syringe +X-ray \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/technology/technology_personal_communication.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/technology/technology_personal_communication.txt" new file mode 100644 index 0000000000000000000000000000000000000000..2792573d562d61bc0631be6396728a3ced9978e9 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/technology/technology_personal_communication.txt" @@ -0,0 +1,22 @@ +Microphone +Microphone stand +Studio microphone +Vintage microphone +Phone +Cellphone +Cordless phone +Cellphone radio bar +Cellphone strap +Corded phone +Payphone +Phone booth +Smartphone +iPhone +Phonecard +Speaker +Earpiece +Robot ears +Earphones +Headphones +Walkie-talkie +Headset \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/technology/technology_places.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/technology/technology_places.txt" new file mode 100644 index 0000000000000000000000000000000000000000..0c9798006e44c11cbd35eda2bb7df9967e27a2b0 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/technology/technology_places.txt" @@ -0,0 +1,4 @@ +City +Cityscape +Cybertron +Death Star \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/technology/technology_prosthetics.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/technology/technology_prosthetics.txt" new file mode 100644 index 
0000000000000000000000000000000000000000..aa9c69c960669ef1323e703808c90e29cda59385 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/technology/technology_prosthetics.txt" @@ -0,0 +1,9 @@ +Automail +Cybernetic +Doll joints +Mechanical foot +Mechanical parts +Mechanical wings +Single mechanical arm +Nobiiru arm +Severed limb \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/technology/technology_radio.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/technology/technology_radio.txt" new file mode 100644 index 0000000000000000000000000000000000000000..1a90d1a5ed9df69aa18fe58acb2e209bdb503ccd --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/technology/technology_radio.txt" @@ -0,0 +1,8 @@ +Radio +Stereo +Radio tower +Satellite dish +Radio booth +Radio telescope +Field radio +Radio antenna \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/technology/technology_robots.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/technology/technology_robots.txt" new file mode 100644 index 0000000000000000000000000000000000000000..e330854a8e9424503d57f177f4c8bfafc32eda9f --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/technology/technology_robots.txt" @@ -0,0 +1,10 @@ +Android +Robot joints +Mechanical parts +Robot ears +Cyborg +Mecha +Mechanical dragon +Replicant +Robot +Robot animal \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/technology/technology_scientific.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/technology/technology_scientific.txt" new file mode 100644 index 0000000000000000000000000000000000000000..ca42dafa5c0009eae38760f5e3b20d4f926a244e --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/technology/technology_scientific.txt" @@ -0,0 +1,2 @@ +Electricity +Plasma \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/technology/technology_space.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/technology/technology_space.txt" new file mode 100644 index 0000000000000000000000000000000000000000..e67cf457eff69b14b635a9ba1b19a4e67751c644 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/technology/technology_space.txt" @@ -0,0 +1,18 @@ +Satellite +Satellite cannon +Space station +International space station +Space Elevator +Space fountain +Black Hole +Space helmet +Space sword +Space colony +Spacecraft +Spacecraft interior +Space capsule +Space train +Spacesuit +Space uniform +Space shuttle +Space \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/technology/technology_themes.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/technology/technology_themes.txt" new file mode 100644 index 0000000000000000000000000000000000000000..5c5c422ce2dbd648f00ebbf953fa9ce22db701ab --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/technology/technology_themes.txt" @@ -0,0 +1,8 @@ +Science fiction +Cyberpunk +Steampunk +Fantasy +Mechanization 
+Personification +Mecha musume +OS-tan \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/technology/technology_tools.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/technology/technology_tools.txt" new file mode 100644 index 0000000000000000000000000000000000000000..342e65a26ffede4f5e55935105eb353456a15cfe --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/technology/technology_tools.txt" @@ -0,0 +1,10 @@ +Drill +Hammer +Mallet +Pliers +Screwdriver +Electric screwdriver +Wrench +Adjustable wrench +Combination wrench +Double sided wrench \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/technology/vehicles/technology_vehicles_aerospace.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/technology/vehicles/technology_vehicles_aerospace.txt" new file mode 100644 index 0000000000000000000000000000000000000000..0c454feac51c9c71ce1b08e918f3fec2a192badd --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/technology/vehicles/technology_vehicles_aerospace.txt" @@ -0,0 +1,12 @@ +Airplane +Bomber +Helicopter +Jet +Fighter jet +Spacecraft +Rocket ship +Space Shuttle +Tie Fighter +Tie Interceptor +UFO +Flying saucer \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/technology/vehicles/technology_vehicles_land.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/technology/vehicles/technology_vehicles_land.txt" new file mode 100644 index 0000000000000000000000000000000000000000..48b3daebd90447c6cc2787c6189fd72de459e074 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/technology/vehicles/technology_vehicles_land.txt" @@ -0,0 +1,23 @@ +Ambulance +Bus +Car +Police car +Taxi +Jeep +Tank +Trailer +Train +Streetcar +Aerial tram +Truck +Concrete mixer truck +Dump truck +Haul truck +Flatbed truck +Fire truck +Garbage truck +Monster truck +Pickup truck +Semi truck +Tank truck +Van \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/technology/vehicles/technology_vehicles_naval.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/technology/vehicles/technology_vehicles_naval.txt" new file mode 100644 index 0000000000000000000000000000000000000000..a8ed7cd8c78768e2b17a555ba2f429e9aced4a3a --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/technology/vehicles/technology_vehicles_naval.txt" @@ -0,0 +1,3 @@ +Ship +Battleship +Submarine \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/technology/vehicles/technology_vehicles_power_sources.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/technology/vehicles/technology_vehicles_power_sources.txt" new file mode 100644 index 0000000000000000000000000000000000000000..17b8c2df2311a54e2dfbc2b76a7b636db18bff43 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/technology/vehicles/technology_vehicles_power_sources.txt" @@ -0,0 +1,2 @@ +Engine +Jet engine \ No newline at end of file diff --git 
"a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/tree.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/tree.txt" new file mode 100644 index 0000000000000000000000000000000000000000..acd70f41365f930e2c02e067c0e4a9ab6903e5f9 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/tree.txt" @@ -0,0 +1,75 @@ +acacia +alder +apple +arborvitae +ash +aspen +avocado +balsam +bamboo +banana +baobab +bay +beech +birch +blackthorn +bluegum +buckeye +butternut +cedar +cherry +chestnut +chinkapin +cork +cottonwood +cucumber +cypress +dogwood +elder +elm +fir +guava +hackberry +hawthorn +hemlock +hickory +holly +hophornbean +hornbean +jacaranda +juniper +larch +laurel +lemon +lime +locust +madrone +magnolia +maple +moosewood +mulberry +myrtle +nutmeg +oak +orange +palm +pear +persimmon +pine +pineapple +piñon +pomegranate +poplar +redwood +sassafras +sequoia +shadbush +spruce +sumac +sycamore +tamarind +torreya +tupelo +walnut +willow +yew diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/tree/tree_habitats.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/tree/tree_habitats.txt" new file mode 100644 index 0000000000000000000000000000000000000000..ea69f84575e6f82a1895a797a0f7782a8450c3a8 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/tree/tree_habitats.txt" @@ -0,0 +1,3 @@ +forest +rainforest +jungle \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/tree/tree_misc.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/tree/tree_misc.txt" new file mode 100644 index 0000000000000000000000000000000000000000..608170d2dc19c1eae2ba40788e6a698aa3f878e7 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/tree/tree_misc.txt" @@ -0,0 +1,5 @@ +autumn +bare tree +giant tree +treehouse +tree shade \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/tree/tree_parts.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/tree/tree_parts.txt" new file mode 100644 index 0000000000000000000000000000000000000000..ee3a0c2d75906eaf8158032d28a8da51d773d45b --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/tree/tree_parts.txt" @@ -0,0 +1,7 @@ +branch +leaf +log +petals +roots +stick +tree stump \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/tree/tree_types.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/tree/tree_types.txt" new file mode 100644 index 0000000000000000000000000000000000000000..22bf48239ab0bf647184625888e348ccde9252a1 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/tree/tree_types.txt" @@ -0,0 +1,14 @@ +baobab +bonsai +cherry blossoms +christmas tree +fir tree +fruit tree +ginkgo +ginkgo nut +maple leaf +palm tree +coconut tree +pine tree +plum blossoms +willow \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/water.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/water.txt" new file mode 100644 index 
0000000000000000000000000000000000000000..b7aaab6bb6393a38f79b9771097d3e3723da5171 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/water.txt" @@ -0,0 +1,51 @@ +Dead Sea +Caspian Sea +Atlantic Ocean +Pacific Ocean +The Great Lakes +Indian Ocean +Salton Sea +Arctic Ocean +Mediterranean Sea +Baltic Sea +Black Sea +Beaufort Sea +Sea of Marmara +Sea of Azov +Red Sea +Lake Natron +Lake Michigan +Lake Superior +Great Salt Lake +Don Juan Pond +Lac Assal +Garabogazkol Basin +Aral Sea +Lacul Sarat +Lake Pikrolimni +Issyk Kul +Lake Techirghiol +Qinghai Lake +Lough Hyne +Little Manitou Lake +Great Bitter Lake +Lake Urmia +Chilika Lake +Kati Thanda-Lake Eyre +Lake Retba +Lake Vanda +Lake Tuz +Sarygamysh Lake +Lake Abert +Van Lake +Mono Lake +Lake Baskunchak +Nile +Missouri River +Lake Baikal +Lake Huron +Great Bear Lake +Lake Tanganyika +Mackenzie River +Yangtze River +Lena River diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/water/geography/water_geography_bodies.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/water/geography/water_geography_bodies.txt" new file mode 100644 index 0000000000000000000000000000000000000000..3be4aedc98f36a12f6a7e8fe4da5c97d6da821bd --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/water/geography/water_geography_bodies.txt" @@ -0,0 +1,4 @@ +Lake +Ocean +River +Stream \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/water/geography/water_geography_landforms.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/water/geography/water_geography_landforms.txt" new file mode 100644 index 0000000000000000000000000000000000000000..721230eedfff259f257c2f337d62e2b2c25a9d35 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/water/geography/water_geography_landforms.txt" @@ -0,0 +1,2 @@ +Beach +Island \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/water/water_clothes.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/water/water_clothes.txt" new file mode 100644 index 0000000000000000000000000000000000000000..07a13fd69638d654ec7c3cbe12150bb1e68ed390 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/water/water_clothes.txt" @@ -0,0 +1,3 @@ +Bikini +Goggles +Swimsuit \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/water/water_ice.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/water/water_ice.txt" new file mode 100644 index 0000000000000000000000000000000000000000..c6ed9caa9c6a2a3c1a8efd324377193de70bb234 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/water/water_ice.txt" @@ -0,0 +1,3 @@ +Ice +Ice sculpture +Ice flower \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/water/water_main.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/water/water_main.txt" new file mode 100644 index 0000000000000000000000000000000000000000..d42f85e4bb994b2246e6cd0a9569cec38756594a --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/water/water_main.txt" @@ 
-0,0 +1,2 @@ +Water +Underwater \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/water/water_meteorology.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/water/water_meteorology.txt" new file mode 100644 index 0000000000000000000000000000000000000000..8b1e55b2c195add86c1a0a90e33778429521269a --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/water/water_meteorology.txt" @@ -0,0 +1,2 @@ +Rain +Snow \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/water/water_misc.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/water/water_misc.txt" new file mode 100644 index 0000000000000000000000000000000000000000..5935690350a0c77381d7404af43b3f2cbccd3376 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/water/water_misc.txt" @@ -0,0 +1,20 @@ +Bath +Bathtub +Mixed bathing +Drinking +Freediving +Hydrokinesis +Pool +Poolside +Glass +Bottle +Cup +Mug +Shower (place) +Showering +Spilling +Splashing +Swimming +Umbrella +Wading +Washing \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/water/water_steam.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/water/water_steam.txt" new file mode 100644 index 0000000000000000000000000000000000000000..7dfcb8af3d6df35d9757f31076de9f4f7366f21d --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/water/water_steam.txt" @@ -0,0 +1 @@ +Steam \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/water/water_vehicles.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/water/water_vehicles.txt" new file mode 100644 index 0000000000000000000000000000000000000000..8a3a024f3bfb6b533113c35f2ee2585443fac3b1 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/water/water_vehicles.txt" @@ -0,0 +1,6 @@ +Boat +Rowboat +Sailboat +Ship +Battleship +Submarine \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/water/water_wet.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/water/water_wet.txt" new file mode 100644 index 0000000000000000000000000000000000000000..59421aeb87f6672ad6a0fcddcf119736872a20d4 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/water/water_wet.txt" @@ -0,0 +1,10 @@ +Wet +Wet clothes +Wet dress +Wet skirt +Wet shirt +Wet shirt +Wet panties +Wet swimsuit +Wet towel +Wet hair \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/wings/types/wings_types_elemental.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/wings/types/wings_types_elemental.txt" new file mode 100644 index 0000000000000000000000000000000000000000..b12bf07fa1a7aa860b5cc64c485d03ee01a8ba0e --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/wings/types/wings_types_elemental.txt" @@ -0,0 +1,6 @@ +Bat wings +energy wings +Fiery wings +Ice wings +light hawk wings +liquid wings \ No newline at end of file diff --git 
"a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/wings/types/wings_types_false.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/wings/types/wings_types_false.txt" new file mode 100644 index 0000000000000000000000000000000000000000..9b10fcbaf5aeb7df2bb39ee2516b2a501b46f03d --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/wings/types/wings_types_false.txt" @@ -0,0 +1,2 @@ +artificial wings +Fake wings \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/wings/types/wings_types_insects.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/wings/types/wings_types_insects.txt" new file mode 100644 index 0000000000000000000000000000000000000000..efa1e28d952b816d5d1ddf081d580bb31531bcfd --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/wings/types/wings_types_insects.txt" @@ -0,0 +1,4 @@ +insect wings +Butterfly wings +ladybug wings +moth wings \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/wings/types/wings_types_other.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/wings/types/wings_types_other.txt" new file mode 100644 index 0000000000000000000000000000000000000000..94b8707e9da42ea1e7f034e5833f1dcc6f27c2dd --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/wings/types/wings_types_other.txt" @@ -0,0 +1,5 @@ +Hair wings +Mechanical wings +metal wings +Plant wings +feathered wings \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/wings/types/wings_types_supernatural.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/wings/types/wings_types_supernatural.txt" new file mode 100644 index 0000000000000000000000000000000000000000..fa4e62319c297f18db49c0695022c8d3ab1357ba --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/wings/types/wings_types_supernatural.txt" @@ -0,0 +1,4 @@ +Angel wings +Demon wings +Dragon wings +Fairy wings \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/wings/wings_colors.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/wings/wings_colors.txt" new file mode 100644 index 0000000000000000000000000000000000000000..9d8b9d8a41d08e197c968ad5b3eb7b7e4ca0faed --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/wings/wings_colors.txt" @@ -0,0 +1,4 @@ +Black wings +Gradient wings +Red wings +white wings \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/wings/wings_main.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/wings/wings_main.txt" new file mode 100644 index 0000000000000000000000000000000000000000..8607ab3b490d4e28432e28d8229fcfd74e5724d3 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/wings/wings_main.txt" @@ -0,0 +1,2 @@ +Wings +Alternate wings \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/wings/wings_misc.txt" 
"b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/wings/wings_misc.txt" new file mode 100644 index 0000000000000000000000000000000000000000..821189eda458a33e76b38952ae02e54bf831e3a4 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/wings/wings_misc.txt" @@ -0,0 +1,12 @@ +bloody wings +Bowed wings +Flapping +Glowing wings +Heart wings +Torn wings +Wing censor +Wing grab +Wing hug +Wing umbrella +Wingjob +wing ribbon \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/wings/wings_number.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/wings/wings_number.txt" new file mode 100644 index 0000000000000000000000000000000000000000..b8566a65ae09b6e3b9704190ec02f28d8632a7d5 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/wings/wings_number.txt" @@ -0,0 +1,3 @@ +Multiple wings +No wings +Single wing \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/wings/wings_sizes.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/wings/wings_sizes.txt" new file mode 100644 index 0000000000000000000000000000000000000000..055b924e3d3e7f2790655aef9cc7728a0a5c007c --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/wings/wings_sizes.txt" @@ -0,0 +1,2 @@ +Large wings +mini wings \ No newline at end of file diff --git "a/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/wings/wings_where_attached.txt" "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/wings/wings_where_attached.txt" new file mode 100644 index 0000000000000000000000000000000000000000..bf57acbc005313ab689971276072ef4b14dd5462 --- /dev/null +++ "b/gf/sd-webui-split-prompt/\345\210\206\351\241\236/\347\211\251\345\223\201/wings/wings_where_attached.txt" @@ -0,0 +1,10 @@ +Ankle wings +Detached wings +Head wings +Low wings +Leg wings +wing ears +Winged bag +winged hat +Winged helmet +Winged footwear \ No newline at end of file diff --git a/gf/sd_webui_prompt_translator_architecture/README.md b/gf/sd_webui_prompt_translator_architecture/README.md new file mode 100644 index 0000000000000000000000000000000000000000..4d2b2d8689597355c97188fef50bef68189a2ab3 --- /dev/null +++ b/gf/sd_webui_prompt_translator_architecture/README.md @@ -0,0 +1,20 @@ +# prompt translator architecture + +The main models currently used for offline translation are MarianMT and Mbart-50, with the latter providing somewhat +better results. This plugin uses the latter as the offline translation to translate the above sentence into English. + +The offline translation performance of mBART-50 is superior to that of MarianMT, and it supports predefined translation words, incorporating a large vocabulary of architectural terms. Translate the above sentence into English. + + +Extension for [webui](https://github.com/AUTOMATIC1111/stable-diffusion-webui). Translate non-english to english in +stable diffusion webui. + +![](preview.png) + +Find the prompt translator tab after installing the extension. + +# Installation + +Install from webui's Extensions tab. 
+ + diff --git a/gf/sd_webui_prompt_translator_architecture/preview.png b/gf/sd_webui_prompt_translator_architecture/preview.png new file mode 100644 index 0000000000000000000000000000000000000000..e5bd194cccfb614dccb3cb591e1958bef2aadb49 Binary files /dev/null and b/gf/sd_webui_prompt_translator_architecture/preview.png differ diff --git a/gf/sd_webui_prompt_translator_architecture/scripts/__pycache__/prompt_translator.cpython-310.pyc b/gf/sd_webui_prompt_translator_architecture/scripts/__pycache__/prompt_translator.cpython-310.pyc new file mode 100644 index 0000000000000000000000000000000000000000..93963fd5fe614885cfd37a1f4e8303a8626346dc Binary files /dev/null and b/gf/sd_webui_prompt_translator_architecture/scripts/__pycache__/prompt_translator.cpython-310.pyc differ diff --git a/gf/sd_webui_prompt_translator_architecture/scripts/prompt_translator.py b/gf/sd_webui_prompt_translator_architecture/scripts/prompt_translator.py new file mode 100644 index 0000000000000000000000000000000000000000..ba68797c1883160c4fd13acdab2b581f6868882d --- /dev/null +++ b/gf/sd_webui_prompt_translator_architecture/scripts/prompt_translator.py @@ -0,0 +1,517 @@ +import os +import re +import csv +import modules.scripts as scripts +import gradio as gr + +from transformers import MBart50TokenizerFast, MBartForConditionalGeneration + +# The directory to store the models +cache_dir = os.path.join(os.path.dirname(os.path.abspath(__file__)), 'models') + + +class MBartTranslator: + """MBartTranslator class provides a simple interface for translating text using the MBart language model. + + The class can translate between 50 languages and is based on the "facebook/mbart-large-50-many-to-many-mmt" + pre-trained MBart model. However, it is possible to use a different MBart model by specifying its name. + + Attributes: + model (MBartForConditionalGeneration): The MBart language model. + tokenizer (MBart50TokenizerFast): The MBart tokenizer. + """ + + def __init__(self, model_name="facebook/mbart-large-50-many-to-many-mmt", src_lang=None, tgt_lang=None): + + self.supported_languages = [ + "ar_AR", + "de_DE", + "en_XX", + "es_XX", + "fr_XX", + "hi_IN", + "it_IT", + "ja_XX", + "ko_XX", + "pt_XX", + "ru_RU", + "zh_XX", + "af_ZA", + "bn_BD", + "bs_XX", + "ca_XX", + "cs_CZ", + "da_XX", + "el_GR", + "et_EE", + "fa_IR", + "fi_FI", + "gu_IN", + "he_IL", + "hi_XX", + "hr_HR", + "hu_HU", + "id_ID", + "is_IS", + "ja_XX", + "jv_XX", + "ka_GE", + "kk_XX", + "km_KH", + "kn_IN", + "ko_KR", + "lo_LA", + "lt_LT", + "lv_LV", + "mk_MK", + "ml_IN", + "mr_IN", + "ms_MY", + "ne_NP", + "nl_XX", + "no_XX", + "pl_XX", + "ro_RO", + "si_LK", + "sk_SK", + "sl_SI", + "sq_AL", + "sr_XX", + "sv_XX", + "sw_TZ", + "ta_IN", + "te_IN", + "th_TH", + "tl_PH", + "tr_TR", + "uk_UA", + "ur_PK", + "vi_VN", + "war_PH", + "yue_XX", + "zh_CN", + "zh_TW", + ] + print("Building translator") + print("Loading generator (this may take few minutes the first time as I need to download the model)") + self.model = MBartForConditionalGeneration.from_pretrained(model_name, cache_dir=cache_dir) + print("Loading tokenizer") + self.tokenizer = MBart50TokenizerFast.from_pretrained(model_name, src_lang=src_lang, tgt_lang=tgt_lang, + cache_dir=cache_dir) + print("Translator is ready") + + def translate(self, text: str, input_language: str, output_language: str) -> str: + """Translate the given text from the input language to the output language. + + Args: + text (str): The text to translate. + input_language (str): The input language code (e.g. "hi_IN" for Hindi). 
+ output_language (str): The output language code (e.g. "en_XX" for English). + + Returns: + str: The translated text. + """ + if input_language not in self.supported_languages: + raise ValueError(f"Input language not supported. Supported languages: {self.supported_languages}") + if output_language not in self.supported_languages: + raise ValueError(f"Output language not supported. Supported languages: {self.supported_languages}") + + self.tokenizer.src_lang = input_language + encoded_input = self.tokenizer(text, return_tensors="pt") + generated_tokens = self.model.generate( + **encoded_input, forced_bos_token_id=self.tokenizer.lang_code_to_id[output_language] + ) + translated_text = self.tokenizer.batch_decode(generated_tokens, skip_special_tokens=True) + + return translated_text[0] + + +class LanguageOption: + """ + A class representing a language option in a language selector. + + Attributes: + label (str): The display label for the language option. + language_code (str): The mBART-50 language code for the language option. + """ + + def __init__(self, label, language_code): + """ + Initializes a new LanguageOption instance. + + Args: + label (str): The display label for the language option. + language_code (str): The mBART-50 language code for the language option. + """ + self.label = label + self.language_code = language_code + + +# This is a list of LanguageOption objects that represent the various language options available. +# Each LanguageOption object contains a label that represents the display name of the language and +# a language code that represents the code for the language that will be used by the translation model. +# The language codes follow a format of "xx_YY" where "xx" represents the language code and "YY" represents the +# country or region code. If the language code is not specific to a country or region, then "XX" is used instead. +# For example, "en_XX" represents English in general, while "hi_IN" represents Hindi as used in India. +# These LanguageOption objects will be used to display the language options to the user and to retrieve the +# corresponding language code when the user selects a language.
+language_options = [ + LanguageOption("Arabic", "ar_AR"), + LanguageOption("Deutsch", "de_DE"), + LanguageOption("English", "en_XX"), + LanguageOption("Spanish", "es_XX"), + LanguageOption("French", "fr_XX"), + LanguageOption("Hindi", "hi_IN"), + LanguageOption("Italian", "it_IT"), + LanguageOption("Japanese", "ja_XX"), + LanguageOption("Korean", "ko_XX"), + LanguageOption("Portuguese", "pt_XX"), + LanguageOption("Russian", "ru_RU"), + LanguageOption("Chinese", "zh_XX"), + LanguageOption("Afrikaans", "af_ZA"), + LanguageOption("Bengali", "bn_BD"), + LanguageOption("Bosnian", "bs_XX"), + LanguageOption("Catalan", "ca_XX"), + LanguageOption("Czech", "cs_CZ"), + LanguageOption("Danish", "da_XX"), + LanguageOption("Greek", "el_GR"), + LanguageOption("Estonian", "et_EE"), + LanguageOption("Persian", "fa_IR"), + LanguageOption("Finnish", "fi_FI"), + LanguageOption("Gujarati", "gu_IN"), + LanguageOption("Hebrew", "he_IL"), + LanguageOption("Croatian", "hr_HR"), + LanguageOption("Hungarian", "hu_HU"), + LanguageOption("Indonesian", "id_ID"), + LanguageOption("Icelandic", "is_IS"), + LanguageOption("Javanese", "jv_XX"), + LanguageOption("Georgian", "ka_GE"), + LanguageOption("Kazakh", "kk_XX"), + LanguageOption("Khmer", "km_KH"), + LanguageOption("Kannada", "kn_IN"), + LanguageOption("Korean", "ko_KR"), + LanguageOption("Lao", "lo_LA"), + LanguageOption("Lithuanian", "lt_LT"), + LanguageOption("Latvian", "lv_LV"), + LanguageOption("Macedonian", "mk_MK"), + LanguageOption("Malayalam", "ml_IN"), + LanguageOption("Marathi", "mr_IN"), + LanguageOption("Malay", "ms_MY"), + LanguageOption("Nepali", "ne_NP"), + LanguageOption("Dutch", "nl_XX"), + LanguageOption("Norwegian", "no_XX"), + LanguageOption("Polish", "pl_XX"), + LanguageOption("Romanian", "ro_RO"), + LanguageOption("Sinhala", "si_LK"), + LanguageOption("Slovak", "sk_SK"), + LanguageOption("Slovenian", "sl_SI"), + LanguageOption("Albanian", "sq_AL"), + LanguageOption("Serbian", "sr_XX"), + LanguageOption("Swedish", "sv_XX"), + LanguageOption("Swahili", "sw_TZ"), + LanguageOption("Tamil", "ta_IN"), + LanguageOption("Telugu", "te_IN"), + LanguageOption("Tamil ", "ta_IN"), + LanguageOption("Telugu", "te_IN"), + LanguageOption("Thai", "th_TH"), + LanguageOption("Tagalog", "tl_PH"), + LanguageOption("Turkish", "tr_TR"), + LanguageOption("Ukrainian", "uk_UA"), + LanguageOption("Urdu", "ur_PK"), + LanguageOption("Vietnamese", "vi_VN"), + LanguageOption("Waray", "war_PH"), + LanguageOption("Cantonese", "yue_XX"), + LanguageOption("Chinese", "zh_CN"), + LanguageOption("Chinese", "zh_TW"), +] + + +def remove_unnecessary_spaces(text): + """Removes unnecessary spaces between characters.""" + pattern = r"\)\s*\+\+|\)\+\+\s*" + replacement = r")++" + return re.sub(pattern, replacement, text) + + +def extract_plus_positions(text): + """ + Given a string of text, extracts the positions of all sequences of one or more '+' characters. + + Args: + - text (str): the input text + + Returns: + - positions (list of lists): a list of [start, end, count] for each match, where start is the index of the + first '+' character, end is the index of the last '+' character + 1, and count is the number of '+' characters + in the match. 
+ """ + # Match any sequence of one or more '+' characters + pattern = re.compile(r'\++') + + # Find all matches of the pattern in the text + matches = pattern.finditer(text) + + # Loop through the matches and add their positions to the output list + positions = [] + last_match_end = None + for match in matches: + if last_match_end is not None and match.start() != last_match_end: + # If there is a gap between the current match and the previous one, add a new position + j = last_match_end - 1 + while text[j] == "+": + j -= 1 + j += 1 + positions.append([j, last_match_end, last_match_end - j]) + + last_match_end = match.end() + + # If the final match extends to the end of the string, add its position to the output list + if last_match_end is not None and last_match_end == len(text): + j = last_match_end - 1 + while text[j] == "+": + j -= 1 + j += 1 + positions.append([j, last_match_end, last_match_end - j]) + + return positions + + +def match_pluses(original_text, translated_text): + """ + Given two strings of text, replaces sequences of '+' characters in the second string with the corresponding + sequences of '+' characters in the first string. + + Args: + - original_text (str): the original text + - translated_text (str): the translated text with '+' characters + + Returns: + - output (str): the translated text with '+' characters replaced by those in the original text + """ + in_positions = extract_plus_positions(original_text) + out_positions = extract_plus_positions(translated_text) + + out_vals = [] + out_current_pos = 0 + + if len(in_positions) == len(out_positions): + # Iterate through the positions and replace the sequences of '+' characters in the translated text + # with those in the original text + for in_, out_ in zip(in_positions, out_positions): + out_vals.append(translated_text[out_current_pos:out_[0]]) + out_vals.append(original_text[in_[0]:in_[1]]) + out_current_pos = out_[1] + + # Check that the number of '+' characters in the original and translated sequences is the same + if in_[2] != out_[2]: + print("detected different + count") + + # Add any remaining text from the translated string to the output + out_vals.append(translated_text[out_current_pos:]) + + # Join the output values into a single string + output = "".join(out_vals) + return output + + +def post_process_prompt(original, translated): + """Applies post-processing to the translated prompt such as removing unnecessary spaces and extra plus signs.""" + clean_prompt = remove_unnecessary_spaces(translated) + clean_prompt = match_pluses(original, clean_prompt) + # clean_prompt = remove_extra_plus(clean_prompt) + return clean_prompt + + +# 读取 csv 文件到内存中缓存起来 +def load_csv(csv_file): + with open(csv_file, 'r', encoding='utf-8') as f: + reader = csv.reader(f) + cache = dict(reader) + return cache + + +def custom_translate(text, cache): + if text in cache: + return cache[text] + else: + return None + + +class Script(scripts.Script): + def __init__(self) -> None: + """Initializes the Script class and sets the default value for enable_translation attribute.""" + super().__init__() + self.enable_translation = False + self.is_negative_translate_active = False + + def title(self): + """Returns the title of the script.""" + return "Translate prompt to english" + + def show(self, is_img2img): + """Returns the visibility status of the script in the interface.""" + return scripts.AlwaysVisible + + def set_active(self, active): + """Sets the is_active attribute and initializes the translator object if not already created. 
Also, sets the visibility of the language dropdown to True.""" + self.is_active = active + if not hasattr(self, "translator"): + self.translator = MBartTranslator() + if self.is_active: + return "# I am ready", self.options.update(visible=True) + else: + return "After enabling translation, please wait until I am ready", self.options.update(visible=False) + + def set_negative_translate_active(self, negative_translate_active): + """Sets the is_negative_translate_active attribute, which controls whether + the negative prompt should also be translated.""" + self.is_negative_translate_active = negative_translate_active + + def ui(self, is_img2img): + """Sets up the user interface of the script.""" + self.is_active = False + self.current_axis_options = [x for x in language_options] + + with gr.Row(): + with gr.Column(scale=19): + with gr.Accordion("Prompt Translator", open=False): + with gr.Accordion("Help", open=False): + md = gr.Markdown(""" + # Description + This script translates your prompt from another language to English before generating the image, allowing you to write prompts in your native language. + # How to use + Select Enable translation and wait until the label shows Ready. + Once the label shows Ready, select the prompt language, write the prompt in the prompt field, then press Generate. The script will translate the prompt and generate the image. + # Note + The first time you enable the script it may take a long time (around a minute) to download and load the model, but once loaded it will be faster. + """) + with gr.Column(): + self.enable_translation = gr.Checkbox(label="Enable translation") + with gr.Column() as options: + self.options = options + self.translate_negative_prompt = gr.Checkbox(label="Translate negative prompt") + self.enable_translation.value = False + self.language = gr.Dropdown( + label="Source language", + choices=[x.label for x in self.current_axis_options], + value="Chinese", + type="index", + elem_id=self.elem_id("x_type") + ) + self.output = gr.Markdown(value="After enabling translation, please wait until I am ready", + visible=True) + self.enable_translation.change( + self.set_active, + [self.enable_translation], + [self.output, self.options], + show_progress=True + ) + self.translate_negative_prompt.change( + self.set_negative_translate_active, + [self.translate_negative_prompt], + ) + + self.options.visible = False + return [self.language] + + def get_prompts(self, p): + """Returns the original prompts and negative prompts associated with a Prompt object.""" + original_prompts = p.all_prompts if len(p.all_prompts) > 0 else [p.prompt] + original_negative_prompts = ( + p.all_negative_prompts + if len(p.all_negative_prompts) > 0 + else [p.negative_prompt] + ) + + return original_prompts, original_negative_prompts + + def is_english(self, text): + # 判断字符串是否只包含英文字符和空格 + return all(c.isascii() or c.isspace() for c in text) + + def transfer(self, text, ln_code): + # 加载 csv 文件并缓存到内存中 + csv_path = os.path.join(os.path.dirname(__file__), 'translations.csv') + cache = load_csv(csv_path) + # 自定义翻译 + result = custom_translate(text, cache) + if result is not None: + return result + else: + # 调用 API 进行翻译 + en_prompt = self.translator.translate(text, ln_code, "en_XX") + return en_prompt + + def process_text(self, text, ln_code): + # 将中文全角标点符号替换为半角标点符号 + text = text.translate(str.maketrans(',。!?;:‘’“”()【】', ',.!?;:\'\'\"\"()[]')) + # 使用正则表达式来分割尖括号内外的内容 + parts = re.split(r'(<[^>]*>)', text) + + # 遍历分割后的字符串列表,对尖括号外的内容进行翻译 + translated_parts = [] + for part in
parts: + if part.startswith('<') and part.endswith('>'): + translated_parts.append(part) + else: + # 使用逗号分割字符串,并对每个字符串进行翻译(如果不是英文) + translated_segments = [self.transfer(segment, ln_code) if not self.is_english(segment) else segment for + segment in part.split(',')] + translated_parts.append(','.join(translated_segments)) + + # 将翻译后的字符串拼接成一个字符串 + return ''.join(translated_parts) + + def process(self, p, language, **kwargs): + """Translates the prompts from a non-English language to English using the MBartTranslator object.""" + + if hasattr(self, "translator") and self.is_active: + original_prompts, original_negative_prompts = self.get_prompts(p) + translated_prompts = [] + previous_prompt = "" + previous_translated_prompt = "" + + for original_prompt in original_prompts: + if previous_prompt != original_prompt: + print(f"Translating prompt to English from {language_options[language].label}") + print(f"Initial prompt:{original_prompt}") + + ln_code = language_options[language].language_code + translated_prompt = self.process_text(original_prompt, ln_code) + #translated_prompt = self.translator.translate(original_prompt, ln_code, "en_XX") + + translated_prompt = post_process_prompt(original_prompt, translated_prompt) + print(f"Translated prompt:{translated_prompt}") + translated_prompts.append(translated_prompt) + + previous_prompt = original_prompt + previous_translated_prompt = translated_prompt + else: + translated_prompts.append(previous_translated_prompt) + + if p.negative_prompt != '' and self.is_negative_translate_active: + previous_negative_prompt = "" + previous_translated_negative_prompt = "" + translated_negative_prompts = [] + for negative_prompt in original_negative_prompts: + if previous_negative_prompt != negative_prompt: + print(f"Translating negative prompt to English from {language_options[language].label}") + print(f"Initial negative prompt:{negative_prompt}") + ln_code = language_options[language].language_code + #translated_negative_prompt = self.translator.translate(negative_prompt, ln_code, "en_XX") + translated_negative_prompt = self.process_text(negative_prompt, ln_code) + translated_negative_prompt = post_process_prompt(negative_prompt, translated_negative_prompt) + print(f"Translated negative prompt:{translated_negative_prompt}") + translated_negative_prompts.append(translated_negative_prompt) + + previous_negative_prompt = negative_prompt + previous_translated_negative_prompt = translated_negative_prompt + else: + translated_negative_prompts.append(previous_translated_negative_prompt) + + p.negative_prompt = translated_negative_prompts[0] + p.all_negative_prompts = translated_negative_prompts + p.prompt = translated_prompts[0] + p.prompt_for_display = translated_prompts[0] + p.all_prompts = translated_prompts diff --git a/gf/sd_webui_prompt_translator_architecture/scripts/translations.csv b/gf/sd_webui_prompt_translator_architecture/scripts/translations.csv new file mode 100644 index 0000000000000000000000000000000000000000..b3b690683ee83e0a93e62f886f61ef736e536897 --- /dev/null +++ b/gf/sd_webui_prompt_translator_architecture/scripts/translations.csv @@ -0,0 +1,5810 @@ +低分辨率,lowres +高分辨率,highres +超高分辨率,absurdres +极高分辨率,incredibly_absurdres +超级高分辨率/大文件,huge_filesize +壁纸,wallpaper +写实,realistic +照片,photo +真实,real +高画质,RAW +最佳质量,best quality +照片写实,photorealistic +虚幻5引擎风格,unreal engine 5 +oc引擎风格,octane render +杰作,masterpiece + +体积照明,volumetric lighting +冷光,cold light +氛围光,mood lighting +明亮的,bright +全局柔光/局部柔光,Soft illuminaotion/ soft lights +荧光灯,fluorescent lighting 
+闪光/晨光,rays of shimmering light/ morning light +顶光,top light +轮廓光,rim lights +云隙光,黄昏射线,Crepuscular Ray +外太空观,outer space view +电影灯光/戏剧灯光,cinematic lighting/ Dramatic lighting +双性照明,bisexual lighting +伦勃朗照明,Rembrandt Lighting +分体照明,Split Lighting +前灯,front lighting +背光照明,Back lighting +干净的背景趋势,clean background trending +边缘灯,rim lights +霓虹灯冷光,neon cold lighting +强光hard lighting +生物光,黑暗中黄火虫发出的光,海洋里发光水母的光等发出的光,bioluminescence + +白天,daylight +夜晚,night +傍晚,evening +清晨,Early morning +正午,High Noon +冷的,Cold +暖的,Warm +阴天,cloudy +雾天,fog +阳光明媚,sunny +雨天,rainy + +古典,classical +中世纪,Middle Ages +文艺复兴,Renaissance +古罗马,Ancient Rome +当代,contemporary style +汉代,Han dynasty +唐代,Tang dynasty +宋代,Song dynasty +明代,Ming dynasty +清代,Qing dynasty + +中式古典,Chinese Classical +日式古典,Japanese clasical +伊斯兰,Islamic style +北欧,Nodic style/Scandinavia style +地中海,Mediterranean style +非洲民族风,Ethnic African style +东南亚,Southeast Asia +现代,modern +中古时期,mid-century modern +未来/未来主义,future/Futurism +高技派,High-Tech +洛可可,Rococo style +巴洛克,Baroque +新中式,New Chinese +新古典,Neo-classic +工业风,Industrial style +新装饰主义,Art deco +极简主义,Minimalism +粗野主义/野兽派,Brutalism +解构主义,Deconstructionism +风格派,De Stijl/neoplasticism +国际风格,International style +新陈代谢派,Metabolism style +表现主义,Expressionism +哥特,Gothic +伊东丰雄,Toyo Ito +安藤忠雄,ando tadao +妹岛和世,Toyo Ito +西泽立卫,Nishizawa Ryue +阿尔瓦罗·西扎,Álvaro Siza +安托万·普雷多克,Antoine Predock +丹尼尔·里伯斯金,Daniel Libeskind +弗兰克·盖里,Frank Gehry +努维尔,Jean Nouvel +隈研吾,Kengo Kuma +贝聿铭,Ieoh Ming Pei +马岩松,MAD +卒姆托,Peter Zumthor +库哈斯,Rem Koolhass +理查德·迈耶,Richard Meier +理查德·罗杰斯,Richard Rogers +安东尼奥·高迪,Antonio Gaudi +圣地亚哥·卡拉特拉瓦,Antonio Gaudi +坂茂,Shigeru Ban +史蒂文·霍尔,Steven Holl +路易斯 康,louis Kahn +密斯·凡·德·罗,Mies van der Rohe +石上纯也,Junya Ishigami +诺曼福斯特,Norman foster +彼得·艾森曼,Peter Eisenman +罗伯特·文丘里,Robert Venturi +拉斐尔·莫内欧,Rafael Moneo +扎哈哈迪德,Zaha Hadid + +超高层,supertall building +洋房住宅/多层建筑,multi-storey block +公寓,flat apartment +别墅,bungalow cottage +办公,office +酒店,hotel +民宿,homestay +剧院,theater +图书馆,library +商业,commercial +综合体,building complex +医院,hospital +学校,school +幼儿园,kindergarten +商业建筑,business premises +纪念碑,monument +博物馆,museum +中式合院,Chinese courtyard house +高铁站,High speed railway station HRS station + +玻璃,glass +铝板,aluminium sheet +清玻璃,clear glass +冲孔板,Punching board +大理石,marble +花岗岩,granite +洞石,Travertine +陶板,Ceramic plate +杜邦纸,DuPont paper +碳纤维,carbon fiber +塑料,plastic +石材,Stone +竹,bamboo +木材,timber/wood +原木,timber/wood +刨花板,strand board +多层板,Plywood/Laminated wood +不锈钢,stainless steel +砖,bricks +瓦,tiles +砂浆,mortar +石子,Stones +水泥,cement +混凝土,concrete +沥青,asphalt +金属,metallic +油漆,paint +卵石,pebble +塑料,plastic + +U型玻璃,U-shaped glass +飘窗,bay window +壁炉,fireplace +天花,ceiling +地板,floor +楼梯,stair +自动扶梯,escalator +天井,atrium +旋转楼梯,spiral stair +坡道,ramp +道闸,barrier gate +卷闸门,roller shutter +游乐园,amusement park +秋千,swing +跷跷板,seesaw +滑梯,slide +遮阳篷,sunshade/awning/canopy +太阳伞,parasol +花架,flower stand +花圃,flower bed +水幕,water curtain +灌木,bush/shrub +绿篱,hedge + +人视图,Human perspective +侧视图,Side view +前视图,front view +上视图,top view +鸟瞰透视图,bird's-eye perspective view/aerial perspective view +自上而下,无人机,A top down drone visual +鸟瞰图,bird's eye perspective/ aerial view +一点透视,one-point perspective +两点透视,two-point perspective +三点透视,three-point perspective +五点透视/鱼眼,Fisheye Perspective +森林,forest +道路,road +小径,path +公园,park +花园,garden +停车场,parking lot +滨海,Coastal/Lakeside +草原,grassland + +雪山,snow - capped mountains +河流,river +海湾,bight/bay +雪地,snowfield +环绕,surrounding +上,over +下,under +左,left +右,right +远处,distant +近处,near +前景,front view +背景,background +漂浮,floating + 
+透明的,transparent +半透明的,translucent +朦胧的,hazy +不确定的,uncertain +流动的,flowing +永恒的,eternal +反射的,reflective +磨砂的,frosted +多孔的,porous +粗糙的,rough +光滑的,smooth +镜面的,mirror-like +锋利的,sharp +曲线的,curve +直线的,straight +宽的,wide +窄的,narrow +长的,long +短的,short +冷的,cold +暖的,warm +浅色的,light color +写实的,realistic +干净的,clearing + +明亮的,bright +白色,white +黑色,black +灰色,grey +赭石色,Ochre color +砖红色,brick red +柠檬黄,lemon yellow + +圆柱体,cylinder +棱柱体,prism +球形,spherical +立方体,cube +锥形,cone + +客厅,living room +卧室,bed room +走廊/过道,hallway +通道/游廊,corridors +大堂/大厅,lobby +浴室,bathroom +厕所/盥洗室,lavatory +厕所/洗手间,toilet +公卫,restroom +洗手间,washroom +踢脚线,skirting board +铰链,hinges +拉手,handle +脚轮,castor +抽屉,drawer +滑轨,drawer sliders/runner +水龙头,faucet +开关,switches +插座,sockets +长凳/长榻,bench +单人有扶手沙发,armchair +懒人椅/休闲椅/躺椅,lounger chair +带两翼的单人沙发,wing chair +懒人沙发,bean bag +长沙发/长榻,couch +圆背椅/圈椅,tub chair +吊椅/躺椅,sling chair +活动躺椅,reclining chair +折叠躺椅,deck chair +沙发垫,cushion +垫子,mat +沙发组,group sofa +餐椅,dining chair +转椅,rotary chair +便椅/普通单椅,easy chair +藤编椅,rattan chair +高背的,highback +矮背的,lowback +折叠椅,folding chair +茶几/咖啡桌,coffee table +吧椅,bar stool/stool +床架,bedstead +双层床,bunk bed +双人床,double-bed +沙发床,divan +儿童床,child cot/children’s bed +婴儿床,cot +带蚊帐的床,canopy bed +床背板,headboard +床头柜,bedside cabint/table +储物柜,storage +靠墙桌/边柜,console table +衣柜,wardrobe +带锁更衣柜,locker +洗手台,lavabo +书架,bookshelf +挂衣杆,clothes rail +书柜,bookcase +柜台,counter +行李架,luggage rack +酒柜,cocktail cabinet +玻璃陈设柜,glass case/glass cabinet +碗橱,aparador /hutch +橱柜,cabinet/cupboard +洗手盆,basin +厨房洗手槽,sink +冰箱,refrigeator +集成灶,range +烤箱,oven +整体橱柜,cabinet combination +厨房岛台,island counter +电炉/烤盘,hotplate +炉灶,cooktop/cooker/stove +抽油烟机,cooker hood/range hood +地毯,carpet +地毯(小),rug +窗帘,curtain +工艺品,crafts /artwork +枕头,pillow +屏风,screen +棉,cotton +麻,hemp +黄麻,jute +亚麻,linen +羊毛,wool +丝绸,silk +尼龙,nylon +香云纱,Xiangyun yarn + + +射灯/筒灯,spotlight +吊灯,pendant lamp/light +台灯,table lamp/light +吸顶灯,ceiling lamp/light +格栅灯,grille light +导轨灯,track lamp/light +柱灯,pillar lamp/light +落地灯,floor lamp/light +壁灯,wall lamp/light +枝形吊灯,chandeliers +灯泡,lamp bulb +户外灯,outdoor lamp/light +庭院灯,garden lamp/light +草地灯,lawn lamp/ligh + + + +一个女孩,1girl +一个男孩,1boy +1个男性,1boy +1个女性,1girl +单格漫画,1koma +1个非人类,1other +2个男性,2boys +2个女性,2girls +双分镜漫画,2koma +2个非人类,2others +3个男性,3boys +3D,3d +3个女性,3girls +三格漫画,3koma +4个男性,4boys +4个女性,4girls +四格漫画,4koma +5个男性,5boys +5个女性,5girls +五格漫画,5koma +6个以上的男性,6+boys +6个以上的女性,6+girls +初音未来,39 +69(体位),69 +20世纪70年代(风格),1970s_(style) +20世纪80年代(风格),1980s_(style) +90年代(风格),1990s_(style) +腹肌,abs +大合影|全家福,absolutely_everyone +抽象,abstract +抽象背景,abstract_background +超长的头发,absurdly_long_hair +虐待,abuse +原声吉他,acoustic_guitar +亚克力画风,acrylic_paint_(medium) +原设服装改编,adapted_costume +改装制服,adapted_uniform +拨弄衣服,adjusting_clothes +扶眼镜,adjusting_eyewear +拨弄手套,adjusting_gloves +理头发,adjusting_hair +整理帽子,adjusting_hat +整理头饰,adjusting_headwear +整理裤袜,adjusting_legwear +拨内裤,adjusting_panties +拨弄泳衣,adjusting_swimsuit +空中烟火,aerial_fireworks +原初之火(武器),aestus_estus +漂浮,afloat +鸟窝头|爆炸头,afro +肛交后,after_anal +口交后,after_fellatio +接吻后,after_kiss +乳交后,after_paizuri +事后,after_sex +事后阴道特写,after_vaginal +事后余韵,afterglow +残像,afterimage +压在玻璃上,against_glass +压在墙上,against_wall +年龄差,age_difference +年龄段展现,age_progression +幼化,age_regression +阿嘿颜,ahegao +呆毛,ahoge +绶带,aiguillette +瞄准,aiming +瞄准了读者(的视角),aiming_at_viewer +阿伊努人的服饰,ainu_clothes +气泡,air_bubble +空调,air_conditioner +飞行器,aircraft +飞机,airplane +飞艇,airship +笠帽,ajirogasa +拉下眼睑的鬼脸,akanbe +新年伊始,akeome +闹钟,alarm_clock +白化病,albino +专辑封面,album_cover +酒,alcohol +外星人,alien +母豹pose,all_fours 
+小巷,alley +阿罗拉形态(宝可梦),alolan_form +阿尔法签名(推特用),alpha_signature +阿尔卑斯风格,alphes_(style) +贫乳化|爆乳化,alternate_breast_size +异色,alternate_color +角色换装,alternate_costume +异瞳,alternate_eye_color +形态变化,alternate_form +发色变化,alternate_hair_color +头发长度变化,alternate_hair_length +发型变化,alternate_hairstyle +头饰变化,alternate_headwear +腿部穿着变化,alternate_legwear +肤色变化,alternate_skin_color +武器变化,alternate_weapon +不同的翅膀,alternate_wings +性别暧昧,ambiguous_gender +美国,america +星条旗,american_flag +星条旗比基尼,american_flag_bikini +弹药,ammunition +弹药带,ammunition_belt +放大器(音响),amplifier_(instrument) +欠损|独脚|肢体残缺|欠损少女,amputee +护身符,amulet +肛门|肛交|着衣肛交|骑乘位肛交,anal +肛珠,anal_beads +手指刺激肛门,anal_fingering +肛毛,anal_hair +肛门物体插入,anal_object_insertion +肛尾,anal_tail +锚,anchor +锚形项圈,anchor_choker +锚形发饰,anchor_hair_ornament +锚形印花,anchor_print +锚形符号,anchor_symbol +可男可女,androgynous +人造人,android +天使,angel +天使与恶魔,angel_and_devil +天使的翅膀,angel_wings +青筋,anger_vein +鮟鱇,anglerfish +安哥拉兔,angora_rabbit +发怒,angry +舔肛门,anilingus +动物,animal +动物系装扮,animal_costume +耳绒毛,animal_ear_fluff +兽耳,animal_ears +以动物为焦点,animal_focus +动物主题帽饰,animal_hat +兽耳头罩,animal_hood +抱着动物,animal_hug +头上有动物,animal_on_head +膝盖上有动物,animal_on_lap +肩上有动物,animal_on_shoulder +动物印花,animal_print +动物化,animalization +动画上色,anime_coloring +安库(假面骑士OOO),ankh +踝靴,ankle_boots +脚踝系带,ankle_lace-up +踝带,ankle_ribbon +踝袜,ankle_socks +脚踝丝环,ankle_strap +脚踝套,ankle_wrap +脚镯,anklet +纪念日,anniversary +贺图,announcement_celebration +恼怒的,annoyed +呆毛,antenna_hair +触角,antennae +反器材步枪,anti-materiel_rifle +旧式火器,antique_firearm +鹿角,antlers +肛门,anus +吸睛的肛门,anus_peek +安齐奥军服,anzio_military_uniform +安齐奥校服,anzio_school_uniform +苹果,apple +化妆,applying_makeup +愚人节,april_fools +围裙,apron +水蓝色背景,aqua_background +水蓝色蝴蝶结,aqua_bow +水蓝色连衣裙,aqua_dress +水蓝色眼睛,aqua_eyes +水蓝色头发,aqua_hair +水蓝色的裤袜,aqua_legwear +水蓝色指甲,aqua_nails +水蓝色领结,aqua_neckwear +水蓝色内裤,aqua_panties +水蓝色丝带,aqua_ribbon +水蓝色衬衫,aqua_shirt +水蓝色短裙,aqua_skirt +水蓝色,aqua_theme +水族馆,aquarium +AR-15,ar-15 +阿拉伯服饰,arabian_clothes +蜘蛛属性,arachne +荒木飞吕彦风,araki_hirohiko_(style) +爱尔兰毛衣,aran_sweater +拱背,arched_back +射箭,archery +建筑,architecture +乳晕走光,areola_slip +乳晕,areolae +菱形花纹,argyle +菱形背景,argyle_background +含菱形花纹的裤子,argyle_legwear +亚利亚公司制服,aria_company_uniform +单手搂着脖子,arm_around_neck +单手搂腰,arm_around_waist +单手垂放,arm_at_side +单手背到身后,arm_behind_back +单手托在脑后,arm_behind_head +臂环,arm_belt +手臂刀刃,arm_blade +手炮,arm_cannon +手臂袜带,arm_garter +抓住手臂,arm_grab +护臂,arm_guards +手臂往后拉,arm_held_back +挽手,arm_hug +手臂丝带,arm_ribbon +手臂支撑动作,arm_support +手臂有纹身,arm_tattoo +托胸,arm_under_breasts +举着手,arm_up +暖袖,arm_warmers +袖套,arm_wrap +腕标,armband +扶手椅,armchair +臂章,armlet +盔甲,armor +装甲靴,armored_boots +战甲裙,armored_dress +腋窝皱褶,armpit_crease +衣服有能露出腋下的开口,armpit_cutout +腋毛,armpit_hair +吸睛的腋窝,armpit_peek +腋下,armpits +双手搂着脖子,arms_around_neck +双手垂放,arms_at_sides +双手背到身后,arms_behind_back +双手抱头,arms_behind_head +双手托胸,arms_under_breasts +举起手,arms_up +军队,army +破面(死神系列),arrancar +箭矢,arrow_(projectile) +箭头,arrow_(symbol) +画笔,art_brush +新艺术,art_nouveau +美术集,artbook +飞机杯,artificial_vagina +作者署名,artist_name +以作者本人为蓝本,artist_self-insert +麻叶纹,asa_no_ha_(pattern) +阿斯科特领巾,ascot +阿什福特学院制服,ashford_academy_uniform +亚洲人,asian +窒息,asphyxiation +屁股,ass +屁股贴屁股,ass-to-ass +露出屁股的服饰,ass_cutout +屁股为焦点,ass_focus +抓着屁股,ass_grab +晃屁股,ass_shake +托着屁股,ass_support +正面可见的屁股,ass_visible_through_thighs +突击步枪,assault_rifle +剥开衣服,assisted_exposure +不对称的刘海,asymmetrical_bangs +不对称的服饰,asymmetrical_clothes +胸部互碰|胸顶着胸,asymmetrical_docking +不对称的手套,asymmetrical_gloves +不对称的发型,asymmetrical_hair +不对称的角,asymmetrical_horns +不对称的裤子,asymmetrical_legwear 
+不对称的袖子,asymmetrical_sleeves +不对称的翅膀,asymmetrical_wings +体操服,athletic_leotard +攻击,attack +有音乐,audible_music +有人声,audible_speech +观众,audience +灵气,aura +汽车人,autobot +秋天,autumn +秋叶,autumn_leaves +斧头,axe +宝宝,baby +性感女性睡衣,babydoll +后背,back +背面印花的内裤,back-print_panties +袜缝,back-seamed_legwear +背靠背,back-to-back +(服饰)打在背后的结,back_bow +露出部分后背的衣服,back_cutout +背部纹身,back_tattoo +从背后可看到的胸部,backboob +背景文字,background_text +露背连衣裙,backless_dress +露背装,backless_outfit +逆光,backlighting +背包,backpack +卸下的包,backpack_removed +反扣的帽子,backwards_hat +错误的人体结构,bad_anatomy +坏结局,bad_end +脚的画法错误,bad_feet +手的画法错误,bad_hands +错误的(画法)透视,bad_perspective +人体比例不协调,bad_proportions +徽章,badge +包,bag +背包挂件,bag_charm +金鱼水袋,bagged_fish +眼袋,bags_under_eyes +法国面包,baguette +(保持)平衡的姿势,balancing +秃头,bald +球,ball +口球,ball_gag +芭蕾舞女演员,ballerina +芭蕾舞鞋,ballet_slippers +气球,balloon +圆珠笔,ballpoint_pen_(medium) +竹子,bamboo +竹扫帚,bamboo_broom +竹林,bamboo_forest +蒸笼,bamboo_steamer +香蕉,banana +乐队制服,band_uniform +贴有绷带的脸,bandage_on_face +缠着绷带的单眼,bandage_over_one_eye +缠着绷带的手臂,bandaged_arm +缠着绷带的手,bandaged_hands +绑着绷带的头,bandaged_head +绑着绷带的腿,bandaged_leg +绷带,bandages +创可贴,bandaid +手臂上贴着创可贴,bandaid_on_arm +脸上贴着创可贴,bandaid_on_face +额头贴着创可贴,bandaid_on_forehead +膝盖贴着创可贴,bandaid_on_knee +腿上贴着创可贴,bandaid_on_leg +鼻梁上贴着创可贴,bandaid_on_nose +阴部贴着创可贴,bandaid_on_pussy +乳头贴着创可贴,bandaids_on_nipples +头巾,bandana +抹胸比基尼,bandeau +子弹带,bandolier +手镯,bangle +刘海,bangs +掀起的刘海,bangs_pinned_back +包子,baozi +酒吧,bar +黑条打码,bar_censor +大叔,bara +条形码,barcode +条形码纹身,barcode_tattoo +裸臂,bare_arms +裸背,bare_back +裸臀,bare_hips +裸腿,bare_legs +露肩,bare_shoulders +枯木,bare_tree +裸足,barefoot +裸足凉鞋,barefoot_sandals +桶子,barrel +酒保,bartender +棒球,baseball +棒球棍,baseball_bat +棒球帽,baseball_cap +棒球手套,baseball_mitt +棒球服,baseball_uniform +篮子,basket +篮球,basketball +低音符,bass_clef +贝斯,bass_guitar +蝙蝠,bat +蝙蝠耳朵,bat_ears +蝙蝠发饰,bat_hair_ornament +蝙蝠印花,bat_print +蝙蝠翅膀,bat_wings +洗澡,bath +浴凳,bath_stool +浴衣,bath_yukata +入浴,bathing +浴室,bathroom +浴缸,bathtub +指挥棒,baton +战斗|比试|战争|游戏,battle +战斧,battle_axe +战术步枪,battle_rifle +战舰,battleship +刺刀,bayonet +BC自由学园制服,bc_freedom_military_uniform +SM,bdsm +海滩,beach +沙滩椅,beach_chair +沙滩巾,beach_towel +沙滩伞,beach_umbrella +沙滩球,beachball +珠子手链,bead_bracelet +珠子项链,bead_necklace +珠子,beads +光束,beam +激光步枪,beam_rifle +八分音符,beamed_eighth_notes +十六分音符,beamed_sixteenth_notes +无檐帽,beanie +豆,beans +熊,bear +熊耳朵,bear_ears +熊印花头饰,bear_hair_ornament +熊印花内裤,bear_panties +熊印花,bear_print +胡须,beard +招手,beckoning +床,bed +床单,bed_sheet +卧室,bedroom +啤酒,beer +听装啤酒,beer_can +啤酒杯,beer_mug +从后脑戴上的耳机,behind-the-head_headphones +米色背景,beige_background +米色毛衣,beige_sweater +钟|铃,bell +铃铛项圈,bell_choker +腹部,belly +脐链,belly_chain +捏着肚子,belly_grab +皮带|腰带,belt +皮带靴,belt_boots +皮带扣,belt_buckle +皮项圈,belt_collar +腰包,belt_pouch +皮带束胸,beltbra +皮带裙,beltskirt +长凳,bench +弯折的吸管,bendy_straw +弯腰,bent_over +贝雷帽,beret +戴眼镜的,bespectacled +兽交|兽奸|异种奸,bestiality +乳沟里,between_breasts +(突出)指间,between_fingers +(强调)两腿之间,between_legs +(强调)大腿之间,between_thighs +围兜,bib +自行车,bicycle +自行车篮,bicycle_basket +大肚子,big_belly +头发很多的,big_hair +紧身裤,bike_shorts +自行车运动服,biker_clothes +骑行套装,bikesuit +比基尼,bikini +比基尼盔甲,bikini_armor +从三点剥开的比基尼,bikini_aside +比基尼泳裤,bikini_bottom +往上剥开的比基尼,bikini_lift +扯着比基尼,bikini_pull +比基尼裙,bikini_skirt +泳装晒痕,bikini_tan +只穿着比基尼上衣,bikini_top +脱下了比基尼上衣,bikini_top_removed +比基尼当内衣,bikini_under_clothes +双筒望远镜,binoculars +鸟|拟鸟化,bird +手上的鸟,bird_on_hand +头上的鸟,bird_on_head +肩上的鸟,bird_on_shoulder +鸟尾巴,bird_tail +(哔哩哔哩)的电流,biribiri +生日,birthday +生日蛋糕,birthday_cake +胎记,birthmark +双性恋(女性),bisexual_(female) +毘沙门天宝塔,bishamonten's_pagoda 
+毘沙门天之矛,bishamonten's_spear +主教(国际象棋),bishop_(chess) +美少年,bishounen +勒住嘴,bit_gag +咬痕,bite_mark +咬,biting +琵琶,biwa_lute +黑框眼镜,black-framed_eyewear +黑色围裙,black_apron +黑色盔甲,black_armor +黑色背景,black_background +黑色腰带,black_belt +黑色比基尼,black_bikini +黑色眼罩,black_blindfold +黑色上衣,black_blouse +黑色紧身衣,black_bodysuit +黑色边框,black_border +黑蝴蝶结,black_bow +黑色内衣,black_bra +黑色吊带背心,black_camisole +黑色斗篷,black_cape +黑色披肩,black_capelet +黑色开口衬衫,black_cardigan +黑猫,black_cat +黑色项链,black_choker +黑色外套,black_coat +黑色外套(王国之心),black_coat_(kingdom_hearts) +黑色项圈,black_collar +黑色连衣裙,black_dress +黑色眼睛,black_eyes +黑色羽毛,black_feathers +黑色的花,black_flower +黑色的鞋,black_footwear +黑色手套,black_gloves +黑发,black_hair +黑色发带,black_hairband +黑色的袴,black_hakama +黑色头巾,black_headband +黑色头饰,black_headwear +黑色连帽衫,black_hoodie +黑色的角,black_horns +黑色夹克,black_jacket +黑色和服,black_kimono +黑丝袜,black_legwear +黑色连身衣,black_leotard +黑色指甲,black_nails +黑色领子,black_neckwear +黑色内裤,black_panties +黑色长裤,black_pants +黑色丝带,black_ribbon +黑玫瑰,black_rose +黑色水手领,black_sailor_collar +黑色围巾,black_scarf +黑色巩膜,black_sclera +黑色发束,black_scrunchie +黑色水手服,black_serafuku +黑色衬衫,black_shirt +黑色短裤,black_shorts +黑色皮肤,black_skin +黑色短裙,black_skirt +黑色袖子,black_sleeves +黑色西装,black_suit +黑色毛衣,black_sweater +黑色泳衣,black_swimsuit +黑色尾巴,black_tail +黑色无袖紧身背心,black_tank_top +黑色主题,black_theme +黑色的伞,black_umbrella +黑色背心,black_vest +黑色翅膀,black_wings +剑,blade +无神的双眼,blank_eyes +空话泡泡,blank_speech_bubble +无主的眼神,blank_stare +毯子,blanket +西装夹克,blazer +出血,bleeding +遮眼|眼罩,blindfold +百叶窗,blinds +眨眼睛,blinking +金发,blonde_hair +金色阴毛,blonde_pubic_hair +血,blood +嘴角流血,blood_from_mouth +脸上有血,blood_on_face +溅血,blood_splatter +血迹,blood_stain +衣服沾血,bloody_clothes +血泪,bloody_tears +沾血的武器,bloody_weapon +盛开,bloom +灯笼裤,bloomers +罩衫,blouse +吹,blowing +蓝框眼镜,blue-framed_eyewear +蓝色盔甲,blue_armor +蓝色背景,blue_background +蓝色饰带,blue_belt +蓝色比基尼,blue_bikini +蓝色上衣,blue_blouse +蓝色紧身衣,blue_bodysuit +蓝色蝴蝶结,blue_bow +蓝色胸罩,blue_bra +蓝色短裤,blue_buruma +蓝色的蝴蝶,blue_butterfly +蓝色斗篷,blue_cape +蓝色披肩,blue_capelet +蓝色开襟衫,blue_cardigan +蓝色项圈,blue_choker +蓝色外套,blue_coat +蓝色洋装,blue_dress +蓝眼睛,blue_eyes +蓝色的火焰,blue_fire +蓝色的花,blue_flower +蓝色鞋类,blue_footwear +蓝色手套,blue_gloves +蓝色头发,blue_hair +蓝色发带,blue_hairband +蓝色的袴,blue_hakama +蓝色头巾,blue_headband +蓝色头饰,blue_headwear +蓝色连帽衫,blue_hoodie +蓝色的角,blue_horns +蓝色夹克,blue_jacket +蓝色和服,blue_kimono +蓝色紧身裤,blue_legwear +蓝色连身衣,blue_leotard +蓝色指甲,blue_nails +蓝色的领饰,blue_neckwear +蓝色内裤,blue_panties +蓝色长裤,blue_pants +蓝色丝带,blue_ribbon +蓝玫瑰,blue_rose +蓝色水手领,blue_sailor_collar +蓝色沙滩裙,blue_sarong +蓝色围巾,blue_scarf +蓝色发束,blue_scrunchie +蓝色水手服,blue_serafuku +蓝色衬衫,blue_shirt +蓝色短裤,blue_shorts +蓝色皮肤,blue_skin +蓝色短裙,blue_skirt +蓝天,blue_sky +蓝色袖子,blue_sleeves +蓝色毛衣,blue_sweater +蓝色泳衣,blue_swimsuit +蓝色,blue_theme +蓝色背心,blue_vest +蓝色翅膀,blue_wings +蓝莓,blueberry +直刘海,blunt_bangs +模糊打码,blur_censor +模糊的,blurry +模糊的背景,blurry_background +前景模糊,blurry_foreground +脸红,blush +表情贴纸,blush_stickers +猪,boar +棋盘游戏,board_game +船,boat +波波头,bob_cut +美式短袜,bobby_socks +身体泛红,body_blush +体毛,body_hair +在身上写字,body_writing +人体彩绘,bodypaint +连体裤袜,bodystocking +连体紧身衣,bodysuit +紧身衣裤,bodysuit_under_clothes +焦外成像,bokeh +木剑,bokken +《如今的我们》,bokura_wa_ima_no_naka_de +波洛领带,bolo_tie +栓式枪机,bolt_action +炸弹,bomb +飞行员夹克,bomber_jacket +捆绑play,bondage +束缚服,bondage_outfit +骨头,bone +骨头状饰品,bone_hair_ornament +带骨肉,boned_meat +bonnet女帽,bonnet +书,book +抱着书,book_hug +书堆,book_stack +书包,bookbag +书签,bookmark +书架,bookshelf +长靴,boots +脱下的靴子,boots_removed +边框,border +无聊的,bored +借鉴的角色,borrowed_character +借鉴的设计,borrowed_design +男友的衣服,borrowed_garments +瓶子,bottle +裸下半身,bottomless 
+乳摇,bouncing_breasts +绑住的,bound +手臂被束缚,bound_arms +腿被束缚,bound_legs +手腕被束缚,bound_wrists +花束,bouquet +蝴蝶结,bow +弦类乐器,bow_(instrument) +弓,bow_(weapon) +带蝴蝶结的比基尼,bow_bikini +带蝴蝶结的胸罩,bow_bra +带蝴蝶结的发带,bow_hairband +带蝴蝶结的裤袜,bow_legwear +带蝴蝶结的内裤,bow_panties +鞠躬,bowing +碗,bowl +锅盖头,bowl_cut +碗状帽子,bowl_hat +领结,bowtie +箱子,box +刻刀,boxcutter +拳击手,boxers +拳套,boxing_gloves +男上体位,boy_on_top +胸罩,bra +掀上去的胸罩,bra_lift +微微露出胸罩,bra_peek +拉下的胸罩,bra_pull +摘下的胸罩,bra_removed +内衣肩带,bra_strap +透过衣服可以看到胸罩,bra_through_clothes +手镯,bracelet +护腕,bracer +三股辫,braid +辫子刘海,braided_bangs +辫式发髻,braided_bun +麻花辫马尾,braided_ponytail +分支,branch +名牌neta,brand_name_imitation +面包,bread +乳房绑缚,breast_bondage +对胸部在意,breast_conscious +乳帘,breast_curtains +胸围歧视,breast_envy +丰胸,breast_expansion +揉胸,breast_grab +抱着胸部,breast_hold +托着胸部,breast_lift +胸垫,breast_padding +乳袋,breast_pocket +戳胸,breast_poke +挤压胸部,breast_press +搁着胸部,breast_rest +胸部走光,breast_slip +洗面奶,breast_smother +从两侧挤胸,breast_squeeze +吸奶,breast_sucking +手放在乳房上方,breast_suppress +胸部纹身,breast_tattoo +露胸的衣服,breastless_clothes +胸甲,breastplate +乳房,breasts +向两侧摊开的乳房,breasts_apart +胸压玻璃,breasts_on_glass +压在头上的乳房,breasts_on_head +露出的乳房,breasts_outside +呼吸,breath +喷火,breathing_fire +砖,brick +砖墙,brick_wall +婚纱手套,bridal_gauntlets +婚纱头纱,bridal_veil +新娘,bride +桥,bridge +公文包,briefcase +男式内裤,briefs +明亮的瞳孔,bright_pupils +破碎的,broken +碎玻璃,broken_glass +断角,broken_horn +破碎的武器,broken_weapon +胸针,brooch +扫帚,broom +骑扫帚,broom_riding +姐弟|兄妹,brother_and_sister +兄弟,brothers +棕色镜框眼镜,brown-framed_eyewear +棕色背景,brown_background +棕色皮带,brown_belt +棕色比基尼,brown_bikini +棕色蝴蝶结,brown_bow +棕色斗篷,brown_cape +棕色开襟衫,brown_cardigan +棕色外套,brown_coat +棕色连衣裙,brown_dress +棕色的眼睛,brown_eyes +棕色鞋类,brown_footwear +棕色手套,brown_gloves +棕色头发,brown_hair +棕色发带,brown_hairband +棕色头饰,brown_headwear +棕色夹克,brown_jacket +棕色和服,brown_kimono +棕色的裤袜,brown_legwear +棕色领饰,brown_neckwear +棕色内裤,brown_panties +棕色长裤,brown_pants +褐色丝带,brown_ribbon +棕色水手领,brown_sailor_collar +棕色围巾,brown_scarf +棕色衬衫,brown_shirt +棕色短裤,brown_shorts +棕色短裙,brown_skirt +棕色毛衣,brown_sweater +棕色主题,brown_theme +棕色马甲,brown_vest +李小龙款连身衣,bruce_lee's_jumpsuit +挫伤,bruise +刷子,brush +刷牙,brushing_teeth +泡泡,bubble +吹泡泡,bubble_blowing +蓬蓬裙,bubble_skirt +珍珠奶茶,bubble_tea +不用手喝珍珠奶茶挑战,bubble_tea_challenge +龅牙,buck_teeth +水桶,bucket +扣,buckle +绷带裹胸,budget_sarashi +建筑,building +射满脸,bukkake +裆部隆起,bulge +子弹,bullet +牛犊式枪械,bullpup +欺负,bullying +发髻盖,bun_cover +手持辫子,bunching_hair +兔子,bunny +兔子服装,bunny_costume +兔耳,bunny_ears +兔娘|兔女郎,bunny_girl +兔子饰品,bunny_hair_ornament +兔耳帽,bunny_hood +兔耳pose,bunny_pose +兔子图案,bunny_print +兔尾巴,bunny_tail +烧伤疤痕,burn_scar +燃烧,burning +烧焦的衣服,burnt_clothes +包不住的巨乳,bursting_breasts +女式灯笼裤,buruma +内裤掰到一边,buruma_aside +褪下一部分的内裤,buruma_pull +公共汽车,bus +公交站,bus_stop +灌木,bush +西装,business_suit +紧身胸衣,bustier +管家,butler +股沟,butt_crack +肛塞,butt_plug +蝴蝶,butterfly +蝴蝶发饰,butterfly_hair_ornament +捕虫网,butterfly_net +蝴蝶印花,butterfly_print +蝴蝶坐,butterfly_sitting +蝴蝶翅膀,butterfly_wings +臀交,buttjob +纽扣徽章,button_badge +纽扣式画法的眼睛,button_eyes +天使的小窗,button_gap +纽扣,buttons +c:,c: +报童帽,cabbie_hat +储藏柜,cabinet +线缆,cable +仙人掌,cactus +咖啡馆,cafe +笼子,cage +蛋糕,cake +蛋糕块,cake_slice +日历,calendar_(medium) +图中有日历,calendar_(object) +毛笔,calligraphy_brush +毛笔(中号),calligraphy_brush_(medium) +摄像机,camcorder +山茶花,camellia +骆驼趾,cameltoe +精彩片段,cameo +照相机,camera +无袖紧身背心,camisole +迷彩,camouflage +缶,can +俺妹元素,can't_be_this_cute +蜡烛,candle +烛台,candlestand +糖果,candy +苹果糖,candy_apple +棒棒糖,candy_cane +杖,cane +炮,cannon +画布,canvas_(object) +披风,cape +披肩,capelet +七分裤,capri_pants +汽车,car +汽车内饰,car_interior +卡片,card +卡片(中号图),card_(medium) 
+玩牌,card_parody +纸板箱,cardboard_box +开襟衫,cardigan +爱抚睾丸,caressing_testicles +地毯,carpet +胡萝卜,carrot +胡萝卜发饰,carrot_hair_ornament +胡萝卜项链,carrot_necklace +抱起,carrying +肩扛,carrying_over_shoulder +夹在腋下,carrying_under_arm +载货推车,cart +弹夹,cartridge +子弹抛壳,casing_ejection +包扎着绷带,cast +城堡,castle +私服,casual +连体泳衣,casual_one-piece_swimsuit +猫,cat +猫系男子,cat_boy +猫系服装,cat_costume +胸部开口出猫头剪影的服装,cat_cutout +猫之日,cat_day +猫耳式耳机,cat_ear_headphones +猫耳造型内裤,cat_ear_panties +猫耳,cat_ears +猫娘,cat_girl +猫系发饰,cat_hair_ornament +猫耳帽子,cat_hat +猫耳兜帽,cat_hood +猫系内衣,cat_lingerie +头上趴着猫,cat_on_head +猫爪,cat_paws +猫元素印花,cat_print +猫尾巴,cat_tail +逗猫,cat_teaser +标志性台词,catchphrase +履带,caterpillar_tracks +被抓现行,caught +焦散,caustics +警示带,caution_tape +洞穴,cave +光碟,cd +天花板,ceiling +顶灯,ceiling_light +手机,cellphone +手机相片,cellphone_picture +打码的,censored +人马|马娘,centaur +胸口带花边的服装,center_frills +正中间开口的服装,center_opening +子宫口,cervix +锁链,chain +铁网,chain-link_fence +项链,chain_necklace +链锯,chainsaw +椅子,chair +迦勒底制服,chaldea_uniform +粉笔,chalk +黑板,chalkboard +香槟,champagne +香槟杯,champagne_flute +吊灯,chandelier +皮套裤,chaps +用人脸打码,character_censor +角色公仔,character_doll +含角色名,character_name +印着某角色,character_print +角色简介,character_profile +人设表,character_sheet +含该角色的签名,character_signature +强调角色魅力的,charm_(object) +图表,chart +追逐,chasing +聊天框风格,chat_log +出轨,cheating +棋盘曲奇,checkerboard_cookie +方格图案,checkered +方格背景,checkered_background +格子地板,checkered_floor +格子和服,checkered_kimono +格子领口,checkered_neckwear +格子围巾,checkered_scarf +格子衬衫,checkered_shirt +格子裙,checkered_skirt +脸贴脸,cheek-to-cheek +鼓着腮帮,cheek_bulge +捏脸颊,cheek_pinching +戳脸颊,cheek_poking +扯脸颊,cheek_pull +助威,cheering +啦啦队,cheerleader +奶酪,cheese +厨师,chef +厨师帽,chef_hat +厨师工装,chef_uniform +女式无袖宽内衣,chemise +樱花|樱桃,cherry +樱花印花,cherry_blossom_print +樱花,cherry_blossoms +棋子,chess_piece +胸部,chest +胸毛,chest_hair +胸前有宝石,chest_jewel +抽屉,chest_of_drawers +胸部有纹身,chest_tattoo +栗子嘴,chestnut_mouth +口香糖,chewing_gum +知波单学院制服,chi-hatan_military_uniform +迷你角色,chibi +内含Q版形象,chibi_inset +小鸡,chick +鸡,chicken +小孩,child +儿童画作,child_drawing +孩子们,child_on_child +烟囱,chimney +抬下巴,chin_grab +托着下巴,chin_rest +颐革,chin_strap +摸下巴,chin_stroking +中国服饰,china_dress +中国风,chinese_clothes +中文,chinese_text +十二生肖,chinese_zodiac +薯片,chips +巧克力,chocolate +巧克力板,chocolate_bar +心形巧克力,chocolate_heart +胸上的巧克力,chocolate_on_breasts +项圈,choker +猪口杯,choko_(cup) +筷子,chopsticks +圣诞节,christmas +圣诞彩灯,christmas_lights +圣诞树,christmas_tree +色差,chromatic_aberration +教堂,church +雪茄,cigar +香烟,cigarette +圆圈,circle +单行本概念预告图,circle_cut +封绘署名,circle_name +头冠,circlet +城市,city +城市灯光背景,city_lights +城市背景,cityscape +教室,classroom +爪(武器),claw_(weapon) +嘎哦,claw_pose +爪,claws +打扫,cleaning +乳沟,cleavage +露出的乳沟,cleavage_cutout +被封住嘴,cleave_gag +砍肉刀,cleaver +蜜缝,cleft_of_venus +握紧的手,clenched_hand +紧握的双手,clenched_hands +牙齿紧咬,clenched_teeth +悬崖,cliff +攀爬,climbing +写字板,clipboard +阴蒂包皮,clitoral_hood +玩弄阴蒂,clitoral_stimulation +阴核,clitoris +阴核环,clitoris_piercing +斗篷,cloak +时钟,clock +钟楼,clock_tower +克隆,clone +特写镜头,close-up +闭上的眼睛,closed_eyes +折扇,closed_fan +闭着的嘴,closed_mouth +收着的伞,closed_umbrella +布,cloth +有衣着的动物,clothed_animal +穿着衣服的女性和裸女,clothed_female_nude_female +穿着衣服的女性和裸男,clothed_female_nude_male +穿着衣服的男性和裸女,clothed_male_nude_female +穿着衣服自慰,clothed_masturbation +有衣着的口袋妖怪,clothed_pokemon +着衣性交,clothed_sex +衣服,clothes +衣服系在腰上,clothes_around_waist +衣服勒在乳沟里,clothes_between_breasts +衣物挂到大腿之间,clothes_between_thighs +褪下衣物,clothes_down +扯着衣物,clothes_grab +衣架,clothes_hanger +掀起衣物,clothes_lift +拉开衣物,clothes_pull +脱下的衣服,clothes_removed +偷衣服,clothes_theft +拖拽衣服,clothes_tug +印着字的T恤,clothes_writing 
+晒衣服,clothesline +云,cloud +云絮状发型,cloud_hair +云状印花,cloud_print +多云天气,cloudy_sky +三叶草,clover +三叶草发饰,clover_hair_ornament +棍棒,club +梅花(扑克),club_(shape) +外套,coat +燕尾,coattails +可口可乐,coca-cola +外套,cockpit +鸡尾酒,cocktail +(社交场合穿的)短裙,cocktail_dress +鸡尾酒杯,cocktail_glass +咖啡,coffee +咖啡杯,coffee_cup +马克杯,coffee_mug +棺材,coffin +硬币,coin +厚如玻璃瓶底的圆眼镜,coke-bottle_glasses +寒冷的,cold +拼贴画,collage +项圈,collar +锁骨,collarbone +翻领短衫,collared_blouse +有领连衣裙,collared_dress +翻领夹克,collared_jacket +翻领衬衫,collared_shirt +色彩关联,color_connection +色标卡,color_guide +原画,color_trace +彩色的睫毛,colored_eyelashes +内层挑染,colored_inner_hair +彩色铅笔画,colored_pencil_(medium) +阴毛颜色,colored_pubic_hair +多彩条纹,colored_stripes +五彩斑斓的,colorful +柱子,column +立绘并列,column_lineup +梳子,comb +军用靴,combat_boots +诱受,come_hither +安慰,comforting +漫画,comic +令咒,command_spell +联动,company_connection +有版权署名,company_name +对比图,comparison +竞泳死库水,competition_school_swimsuit +赛用泳衣,competition_swimsuit +计算机,computer +概念艺术,concept_art +飞机云,condensation_trail +避孕套,condom +口叼避孕套,condom_in_mouth +戴避孕套,condom_on_penis +避孕套包装,condom_packet_strip +避孕套包装,condom_wrapper +表白,confession +彩色纸屑(祝贺),confetti +非强行的触手侵犯,consensual_tentacles +星座,constellation +收缩的瞳孔,constricted_pupils +现代改编,contemporary +对立式平衡,contrapposto +控制器,controller +便利店,convenience_store +遮住关键部位的手臂,convenient_arm +简单巧妙的打码,convenient_censoring +挡住关键部位的腿,convenient_leg +曲奇,cookie +烹饪,cooking +冷藏器,cooler +版权,copyright +版权署名,copyright_name +珊瑚,coral +有线电话,corded_phone +玉米,corn +尸体,corpse +恶堕|婊子化,corruption +束腰,corset +角色扮演,cosplay +服装,costume +服饰互换,costume_switch +棉花糖,cotton_candy +沙发,couch +倒计时,countdown +柜台,counter +情侣,couple +表亲,cousins +封面,cover +封面图片,cover_image +封面,cover_page +有衣物裹住的锁骨,covered_collarbone +蒙住的眼,covered_eyes +蒙脸,covered_face +蒙住的嘴,covered_mouth +隐约入目的肚脐,covered_navel +激凸的乳头|隐约可见的乳头,covered_nipples +遮盖的动作,covering +遮着臀部,covering_ass +遮着乳房,covering_breasts +遮着裆部,covering_crotch +遮住眼睛,covering_eyes +挡住脸,covering_face +挡住嘴巴,covering_mouth +挡住乳头,covering_nipples +牛,cow +牛耳朵,cow_ears +牛娘,cow_girl +牛角,cow_horns +奶牛系印花,cow_print +牛尾,cow_tail +牛仔靴,cowboy_boots +牛仔帽,cowboy_hat +七分身镜头,cowboy_shot +女上位,cowgirl_position +螃蟹,crab +裂痕,crack +概念礼装,craft_essence +起重机,crane_(machine) +大木箱,crate +男式领巾,cravat +病娇,crazy +坏掉的眼神,crazy_eyes +奶油,cream +同个原作者的角色们的同人,creator_connection +创造物,creature +拟人化(含对比),creature_and_personification +书脊,credits_page +可丽饼,crepe +月牙,crescent +月牙耳环,crescent_earrings +月牙发饰,crescent_hair_ornament +新月,crescent_moon +新月玫瑰,crescent_rose +交叉吊带,criss-cross_halter +露腹短上衣,crop_top +乳帘,crop_top_overhang +短夹克,cropped_jacket +只画了一部分腿,cropped_legs +短衬衫,cropped_shirt +只画了一部分躯干,cropped_torso +小款背心,cropped_vest +十字,cross +交叉花边服饰,cross-laced_clothes +交叉系带鞋,cross-laced_footwear +截面图,cross-section +十字耳环,cross_earrings +十字发饰,cross_hair_ornament +十字架项链,cross_necklace +十字架元素图案,cross_print +弩,crossbow +(变装)女装|男装|女装男子|女装,crossdressing +交叉脚踝,crossed_ankles +交叉双臂,crossed_arms +交叉的刘海,crossed_bangs +交叉的腿,crossed_legs +跨界作品,crossover +人行横道,crosswalk +胯部,crotch +裆部有特别覆盖物,crotch_plate +裆部勒着的绳子,crotch_rope +摩擦裆部,crotch_rub +裆部的缝,crotch_seam +开裆的,crotchless +开档内裤,crotchless_panties +乌鸦,crow +人群,crowd +王冠|皇冠,crown +法式冠编发,crown_braid +哭,crying +睁着眼落泪,crying_with_eyes_open +矿石,crystal +水晶球,crystal_ball +水晶耳环,crystal_earrings +水晶状的头发,crystal_hair +立方体,cube +黄瓜,cucumber +拥抱,cuddling +手铐,cuffs +精液,cum +肛门内射,cum_in_ass +口内射精,cum_in_mouth +内射|中出,cum_in_pussy +射在屁股上,cum_on_ass +射在身上,cum_on_body +射在男性身上,cum_on_boy +射在胸上,cum_on_breasts +射在衣服上,cum_on_clothes +射在头发上,cum_on_hair +射在下半身上,cum_on_lower_body +射在阴部上,cum_on_pussy 
+射在自己身上,cum_on_self +射在嘴里,cum_on_stomach +射在舌头上,cum_on_tongue +射在上半身,cum_on_upper_body +淌成一滩的精液,cum_pool +拉丝的精液,cum_string +射在后庭里,cum_while_penetrated +溢出的精液(已拔出),cumdrip +舔阴,cunnilingus +杯子,cup +杯面,cup_ramen +纸杯蛋糕,cupcake +高露出度胸罩,cupless_bra +弯曲的角,curled_horns +卷发,curly_hair +咖喱,curry +咖喱饭,curry_rice +光标,cursor +窗帘,curtains +(女子的)屈膝礼,curtsey +展现魅力的姿势,curvy +坐垫,cushion +cute&girly (偶像大师),cute_&_girly_(idolmaster) +热裤,cutoffs +割伤,cuts +砧板,cutting_board +剪头发,cutting_hair +赛博朋克,cyberpunk +赛博人类,cyborg +独眼巨人,cyclops +斯拉夫文,cyrillic +方向键,d-pad +方向键发饰,d-pad_hair_ornament +d:,d: +匕首,dagger +雏菊,daisy +抱枕,dakimakura +抱枕(物体),dakimakura_(object) +受损的,damaged +舞者,dancer +跳舞,dancing +蒲公英,dandelion +《危险的野兽》FGO礼装,dangerous_beast +团子,dango +弹幕,danmaku +透过树叶洒下的阳光,dappled_sunlight +黑暗,dark +深色的乳晕,dark_areolae +黑暗的光环|黑暗的气场,dark_aura +深蓝色头发,dark_blue_hair +暗精灵,dark_elf +深色的乳头,dark_nipples +深色的阴茎,dark_penis +黑化,dark_persona +褐色皮肤,dark_skin +黑暗,darkness +付上日期的,dated +白天,day +死,death +碎片|残骸,debris +负债的,debt +霸天虎,decepticon +捅到子宫,deep_penetration +角色的另一面(多指色情),deep_skin +深喉,deepthroat +鹿,deer +败北,defeat +恶魔,demon +恶魔男孩,demon_boy +魅魔,demon_girl +恶魔的角,demon_horns +恶魔尾巴,demon_tail +恶魔翅膀,demon_wings +牛仔布,denim +牛仔短裤,denim_shorts +牛仔裙,denim_skirt +景深(画法),depth_of_field +沙漠,desert +书桌,desk +台灯,desk_lamp +甜点,dessert +破坏,destruction +假领,detached_collar +分离袖子,detached_sleeves +不与本体相连的翅膀,detached_wings +冠饰,diadem +斜纹领结,diagonal-striped_neckwear +斜刘海,diagonal_bangs +斜纹,diagonal_stripes +钻石(宝石),diamond_(gemstone) +菱形(形状),diamond_(shape) +菱形(符号),diamond_(symbol) +尿布,diaper +骰子,dice +映出的另一面,different_reflection +衍射十字星,diffraction_spikes +化成数据般消失,digital_dissolve +随身听,digital_media_player +假阴茎,dildo +女性的腰窝|维纳斯之眼,dimples_of_venus +恐龙,dinosaur +阿尔卑斯村姑裙,dirndl +肮脏的,dirty +弄脏的衣服,dirty_clothes +弄脏的脸,dirty_face +弄脏的脚,dirty_feet +分离的头,disembodied_head +断肢,disembodied_limb +断掉的阴茎,disembodied_penis +一次性纸杯,disposable_cup +溶解的,dissolving +溶解掉的衣服,dissolving_clothes +潜水,diving +潜水面罩,diving_mask +头戴潜水面罩,diving_mask_on_head +狗盆帽,dixie_cup_hat +船坞,dock +医生,doctor +狗,dog +狗项圈,dog_collar +狗耳朵,dog_ears +犬娘,dog_girl +狗牌,dog_tags +狗尾巴,dog_tail +后入式,doggystyle +花边元素,doily +人偶|玩偶,doll +抱着玩偶,doll_hug +玩偶球形关节,doll_joints +海豚,dolphin +海豚短裤,dolphin_shorts +女王(SM中),dominatrix +SM面具,domino_mask +《don't_say_"lazy"》,don't_say_"lazy" +募捐箱,donation_box +门,door +小圆点鼻,dot_nose +虚线,dotted_line +双排纽扣,double-breasted +团子头,double_bun +双头假阴茎,double_dildo +双手各握着一根阴茎,double_handjob +二穴,double_penetration +啊嘿颜比V,double_v +服饰上有两条平行条纹,double_vertical_stripe +甜甜圈,doughnut +武道服,dougi +同人封面,doujin_cover +同人志,doujinshi +白鸽,dove +胸部走光,downblouse +占星杖,dowsing_rod +得意脸,doyagao +角色在拖某物,dragging +龙,dragon +龙娘,dragon_girl +龙角,dragon_horns +龙尾,dragon_tail +龙翼,dragon_wings +蜻蜓,dragonfly +牛角族,draph +抽屉,drawer +绘画,drawing +拉弓,drawing_bow +手绘板,drawing_tablet +源drawr不可寻,drawr +衣服的抽绳,drawstring +做梦,dreaming +连衣裙,dress +连衣裙上的蝴蝶结,dress_bow +掀起裙子,dress_lift +剥下裙子胸口的部分,dress_pull +脱下裙子,dress_removed +西服衬衫,dress_shirt +压住裙摆,dress_tug +穿衣,dressing +着装,dressing_another +钻头,drill +钻头发型,drill_hair +饮料,drink +喝,drinking +酒杯,drinking_glass +吸管,drinking_straw +液体滴落,dripping +驾驶,driving +无人机,drone +流口水,drooling +立绘阴影,drop_shadow +(意外)掉落,dropping +药物,drugs +鼓,drum +桶(容器),drum_(container) +架子鼓,drum_set +鼓槌,drumsticks +喝醉的,drunk +弄干(浴后),drying +弄干头发,drying_hair +双重人格,dual_persona +双持,dual_wielding +鸭子,duck +肚兜,dudou +决斗,duel +游戏王决斗怪兽,duel_monster +行李包,duffel_bag +粗呢大衣,duffel_coat +哑铃,dumbbell +黄昏,dusk +灰尘,dust +烟尘,dust_cloud +掸子,duster +德式倾斜镜头,dutch_angle +DVD封面,dvd_cover +鹰,eagle +咬耳朵,ear_biting 
+耳红,ear_blush +穿耳洞,ear_piercing +护耳罩,ear_protection +耳标,ear_tag +耳塞,earbuds +耳罩,earmuffs +耳机,earphones +耳环,earrings +多双耳朵,ears +下垂的耳朵,ears_down +为耳朵留洞的头饰,ears_through_headwear +画架,easel +东方风格建筑,east_asian_architecture +复活节彩蛋,easter_egg +东方龙,eastern_dragon +咀嚼,eating +工霞,egasumi +蛋,egg +跳蛋,egg_vibrator +茄子,eggplant +埃及,egyptian +埃及风格服饰,egyptian_clothes +惠方卷,ehoumaki +八分音符,eighth_note +射精,ejaculation +长手套,elbow_gloves +护肘,elbow_pads +电风扇,electric_fan +电吉他,electric_guitar +插座,electric_socket +电,electricity +大象,elephant +精灵,elf +四天王,elite_four +绘马,ema +害羞,embarrassed +经润饰的服装,embellished_costume +余烬,embers +纹章,emblem +翡翠(宝石),emerald_(gemstone) +集中线,emphasis_lines +空洞的眼神,empty_eyes +能量球,energy_ball +镭射枪,energy_gun +光剑,energy_sword +英语,english_text +L和R不分的英文,engrish_text +女仆装,enmaided +(围巾束成的)乌贼鳍发型,enpera +信封,envelope +肩章,epaulettes +壮丽且经典的故事(的同人),epic +橡皮擦,eraser +膨胀的阴核,erect_clitoris +勃起,erection +老二立帐篷,erection_under_clothes +色情漫画,eromanga +有问题的作画,error +兽耳族,erune +傍晚,evening +晚会礼服,evening_gown +集合绘,everyone +邪恶的露齿笑,evil_grin +邪恶的笑容,evil_smile +超大的射精量,excessive_cum +极浓密的阴毛,excessive_pubic_hair +做运动,exercise +露出,exhibitionism +爆炸,explosion +易爆的,explosive +角色的多表情展示,expression_chart +无口,expressionless +表情,expressions +富有表现力的衣服,expressive_clothes +富有表现力的头发,expressive_hair +衍生角色(形象),extra +多只手臂,extra_arms +四只耳朵,extra_ears +多只眼睛,extra_eyes +有多张嘴巴,extra_mouth +眼神交流,eye_contact +荷鲁斯之眼,eye_of_horus +眼球,eyeball +眉毛,eyebrows +挡在头发下的眉毛,eyebrows_behind_hair +透过头发可以看到的眉毛,eyebrows_visible_through_hair +睫毛,eyelashes +眼线,eyeliner +眼罩,eyepatch +眼罩比基尼,eyepatch_bikini +摘眼罩,eyepatch_removed +眼睛,eyes +透过头发可以看到的眼睛,eyes_visible_through_hair +眼影,eyeshadow +眼镜别在头上,eyewear_on_head +摘眼镜,eyewear_removed +脸部,face +脸贴脸,face-to-face +没有脸部细节的,faceless +无脸部女性,faceless_female +无脸部男性,faceless_male +脸部画有图案,facepaint +以手掩面,facepalm +颜射,facial +胡须,facial_hair +脸部有图案,facial_mark +脸部有纹身,facial_tattoo +二人面对面(脸贴得很近),facing_another +面向别处,facing_away +面向镜头,facing_viewer +失败,failure +妖精,fairy +妖精的翅膀,fairy_wings +仿制的动物耳朵,fake_animal_ears +(同人向玩梗向的)封面,fake_cover +假胡须,fake_facial_hair +装饰性的角,fake_horns +假胡子,fake_mustache +(同人向玩梗向的)截图,fake_screenshot +仿造的尾巴,fake_tail +仿造的翅膀,fake_wings +法尔西昂(火焰之纹章),falchion_(fire_emblem) +倒地的,fallen_down +正在下坠的,falling +落叶,falling_leaves +落花,falling_petals +一家子,family +家徽,family_crest +扇子|电风扇,fan +虎牙,fang +露出虎牙|露出尖牙,fang_out +尖牙,fangs +腰包,fanny_pack +幻想,fantasy +时尚,fashion +肥胖的|大号美女,fat +胖子,fat_man +丰满的,fat_mons +父女,father_and_daughter +父子,father_and_son +水龙头,faucet +腹甲,faulds +仿手办风格,faux_figurine +传统手绘风格,faux_traditional_media +羽毛装饰的袖子,feather-trimmed_sleeves +羽毛围巾,feather_boa +羽翼,feathered_wings +羽毛,feathers +软呢帽,fedora +喂食,feeding +脚,feet +双脚不在图内,feet_out_of_frame +翘起脚,feet_up +口交,fellatio +摆着口交的手势,fellatio_gesture +潮吹,female_ejaculation +女性高潮,female_orgasm +痴女,female_pervert +女性视角,female_pov +逆强奸,femdom +栅栏,fence +貂,ferret +摩天轮,ferris_wheel +节日|祭典,festival +胎儿状的蜷缩,fetal_position +女女女3P,fff_threesome +女女男3P,ffm_threesome +原创人物,fictional_persona +田野,field +燃烧着的翅膀,fiery_wings +战斗机,fighter_jet +战斗中的,fighting +战斗姿态|摆着架势的,fighting_stance +手办,figure +胶片颗粒感|老电影滤镜,film_grain +对某美术品的模仿|模仿名画,fine_art_parody +精细的面料感|超还原的丝袜,fine_fabric_emphasis +咬手指,finger_biting +手指作手枪状,finger_gun +口内指交,finger_in_another's_mouth +手指伸进嘴里,finger_in_mouth +准备扣扳机的手势,finger_on_trigger +吮吸手指,finger_sucking +手抵在嘴唇边,finger_to_mouth +指交,fingering +手指隔着衣物摸来摸去,fingering_through_clothes +无指手套,fingerless_gloves +指甲,fingernails +手指,fingers +手指并拢,fingers_together +鱼鳍,fins +火,fire +萤火虫,fireflies +壁炉,fireplace +烟花,fireworks +射击,firing +鱼,fish 
+鱼人,fish_girl +鱼形发饰,fish_hair_ornament +鱼形印花,fish_print +鱼眼镜头,fisheye +钓鱼,fishing +钓鱼竿,fishing_rod +渔网袜,fishnet_legwear +渔网,fishnets +未勃起的,flaccid +旗,flag +旗帜印花,flag_print +梿枷,flail +火焰,flame +火焰图案,flame_print +焰目,flaming_eye +拍打动作,flapping +回忆场景,flashback +露出,flashing +手电筒,flashlight +烧瓶,flask +贫瘠的屁股,flat_ass +平顶帽|司机帽,flat_cap +平胸|贫乳,flat_chest +抓着贫乳,flat_chest_grab +平涂风格,flat_color +无背景|平铺的背景,flats +在逃跑的,fleeing +点兔女仆装,fleur_de_lapin_uniform +柔韧性好的,flexible +秀肌肉,flexing +人字拖,flip-flops +翻盖手机,flip_phone +外卷发型,flipped_hair +浮动的|在漂浮的,floating +飘动的头发,floating_hair +浮岛,floating_island +漂浮物,floating_object +悠远|恢弘感|一大群,flock +地板,floor +大耳朵|下垂的耳朵|飞机耳发型,floppy_ears +花朵点缀的背景|有花香的感觉,floral_background +花朵图案(印花),floral_print +花,flower +花(符号),flower_(symbol) +花手镯,flower_bracelet +花耳环,flower_earrings +花田,flower_field +绳结,flower_knot +花环|鲜花项链,flower_necklace +毛茸茸的,fluffy +慌乱的,flustered +笛,flute +飞行,flying +飞踢,flying_kick +飞扬的汗珠,flying_sweatdrops +泡沫,foam +雾,fog +抬腿露阴,folded +蜷起来的马尾,folded_ponytail +折叠椅,folding_chair +扇子,folding_fan +食物,food +嘴里有食物,food_in_mouth +身体上有食物,food_on_body +脸上有食物,food_on_face +食物印花,food_print +食品摊,food_stand +单脚不在图内,foot_out_of_frame +足交,footjob +脚印,footprints +强制高潮,forced_orgasm +额头,forehead +额头贴额头,forehead-to-forehead +额前有宝石,forehead_jewel +亲吻额头,forehead_kiss +额前有图案,forehead_mark +护额,forehead_protector +正前缩距透视法,foreshortening +包皮,foreskin +森林,forest +餐叉,fork +正装,formal +喷泉,fountain +四叶草,four-leaf_clover +虚实穿插|连接到现实的感觉|次元壁,fourth_wall +狐狸,fox +狐耳,fox_ears +狐娘,fox_girl +狐狸面具,fox_mask +狐狸手势,fox_shadow_puppet +狐狸尾巴,fox_tail +极具立体感的乳房,framed_breasts +雀斑,freckles +潜水(泳姿不限),freediving +法式辫子,french_braid +法国国旗,french_flag +炸薯条,french_fries +舌吻,french_kiss +法文,french_text +煎蛋|荷包蛋,fried_egg +朋友,friends +花边围裙,frilled_apron +花边比基尼,frilled_bikini +花边蝴蝶结,frilled_bow +花边胸罩,frilled_bra +花边披肩,frilled_capelet +花边项圈,frilled_choker +花边衣领,frilled_collar +花边连衣裙,frilled_dress +花边手套,frilled_gloves +花边发带,frilled_hairband +花边和服,frilled_kimono +花边的裤袜,frilled_legwear +花边内裤,frilled_panties +花边缎带,frilled_ribbon +花边衬衫,frilled_shirt +花边衬衫领,frilled_shirt_collar +花边裙子,frilled_skirt +花边袖子,frilled_sleeves +花边泳衣,frilled_swimsuit +花边,frills +碎边饰物|边缘装饰物,fringe_trim +青蛙,frog +青蛙娘,frog_girl +青蛙发饰,frog_hair_ornament +青蛙捆绑,frogtie +俯视镜头,from_above +背影,from_behind +仰视镜头,from_below +室外看向室内(的镜头),from_outside +角色的侧面,from_side +正面有图案的内裤,front-print_panties +从正面打结的比基尼,front-tie_bikini +从正面打结的衣物,front-tie_top +束在正面的马尾,front_ponytail +拉链在正面的泳衣,front_zipper_swimsuit +摩擦性器官|用性器官摩擦别人的身体,frottage +皱眉,frown +被冻住的,frozen +水果,fruit +平底锅,frying_pan +被肏到恍惚|被干得不要不要,fucked_silly +整张脸泛红,full-face_blush +有整件衣物长的拉链,full-length_zipper +有完整两性器官的扶她,full-package_futanari +穿着全套盔甲的,full_armor +全身|画出了全身,full_body +满月,full_moon +双肩下握颈体位,full_nelson +气得冒烟(含一定漫画元素),fume +兜裆布,fundoshi +毛皮,fur +皮草靴,fur-trimmed_boots +毛皮镶边斗篷,fur-trimmed_cape +毛皮装饰披肩,fur-trimmed_capelet +镶边皮草大衣,fur-trimmed_coat +毛皮镶边连衣裙,fur-trimmed_dress +毛皮镶边手套,fur-trimmed_gloves +毛皮镶边风帽,fur-trimmed_hood +皮草夹克,fur-trimmed_jacket +皮草修身长裤,fur-trimmed_legwear +毛皮镶边短裙,fur-trimmed_skirt +毛皮镶边袖子,fur-trimmed_sleeves +矮跟休闲皮草靴,fur_boots +毛皮大衣,fur_coat +毛皮衣领,fur_collar +毛皮帽子,fur_hat +皮草款式|毛皮镶边,fur_trim +振袖,furisode +毛茸茸|兽人,furry +融合|合体,fusion +扶她和女性,futa_with_female +两个扶她,futa_with_futa +扶她和男性,futa_with_male +扶她,futanari +干扶她|玩弄扶她,futasub +被窝,futon +封印之杖,fuuin_no_tsue +超细款丁字裤,g-string +猿辔|塞口物,gag +被塞口的,gagged +学兰,gakuran +游戏机,game_console +游戏控制器,game_controller +游戏手柄,gamepad +(融入了)游戏机制元素,gameplay_mechanics +轮奸,gangbang +黑皮,ganguro +隙间(东方),gap_(touhou) +敞开的|撑开的(阴部和屁眼),gaping +花园,garden 
+加尔格·马可大修道院制服,garreg_mach_monastery_uniform +船形帽|卫戍帽|军便帽,garrison_cap +吊带袜的腰带,garter_belt +吊带袜的吊带,garter_straps +吊带袜的袜带,garters +防毒面具,gas_mask +门,gate +褶裥(衣物),gathers +加特林机枪,gatling_gun +护腕,gauntlets +齿轮,gears +宝石,gem +宝石之国的制服,gem_uniform_(houseki_no_kuni) +第1代口袋妖怪,gen_1_pokemon +第2代口袋妖怪,gen_2_pokemon +第3代口袋妖怪,gen_3_pokemon +第4代口袋妖怪,gen_4_pokemon +第5代口袋妖怪,gen_5_pokemon +第6代口袋妖怪,gen_6_pokemon +第7代口袋妖怪,gen_7_pokemon +第8代口袋妖怪,gen_8_pokemon +性转,genderswap +男体化,genderswap_(ftm) +女体化,genderswap_(mtf) +德国服装,german_clothes +德国国旗,german_flag +德国国旗比基尼,german_flag_bikini +德语,german_text +德国,germany +木屐,geta +幽灵,ghost +幽灵尾巴,ghost_tail +巨人,giant +巨大少女,giantess +礼物,gift +礼品袋,gift_bag +礼品盒,gift_box +超大乳房,gigantic_breasts +女上位,girl_on_top +被两个女孩夹在中间,girl_sandwich +递|赠送,giving +灵魂出窍,giving_up_the_ghost +长柄刀,glaive +轻蔑的眼神|怒视,glaring +玻璃,glass +眼镜|眼镜娘,glasses +眼中闪现强烈的情感,glint +地球仪,globe +热情的拥抱|扑上来抱住,glomp +消沉(表情),gloom_(expression) +手套,gloves +脱下手套,gloves_removed +发光的,glowing +发光的眼睛(单眼),glowing_eye +发光的双眼,glowing_eyes +发光的武器,glowing_weapon +荧光棒|打call棒,glowstick +山羊胡,goatee +哥布林,goblin +女神,goddess +护目镜,goggles +脖子挂着护目镜,goggles_around_neck +头上别着护目镜,goggles_on_head +帽子上别着护目镜,goggles_on_headwear +御币,gohei +金,gold +黄金盔甲,gold_armor +金链子,gold_chain +金色镶边,gold_trim +金鱼,goldfish +高尔夫俱乐部,golf_club +好结局,good_end +护喉|护喉甲|颈甲,gorget +哥特式,gothic +哥特萝莉,gothic_lolita +葫芦,gourd +女裙|女长服|女礼服,gown +抓住|抓着,grabbing +抓着对方屁股,grabbing_another's_ass +抓着对方的头发,grabbing_another's_hair +从背后揉胸|从背后抓住胸部,grabbing_from_behind +抓着自己的屁股,grabbing_own_ass +(身体或肢体)倾斜,gradient +渐变色(效果)的背景,gradient_background +渐变色的头发,gradient_hair +倾斜的天空,gradient_sky +涂鸦,graffiti +葡萄,grapes +铅笔画(中号),graphite_(medium) +草,grass +坟墓,grave +墓地,graveyard +胫甲,greaves +绿框眼镜,green-framed_eyewear +绿色背景,green_background +绿色比基尼,green_bikini +绿色紧身衣,green_bodysuit +绿色蝴蝶结,green_bow +绿色胸罩,green_bra +绿色斗篷,green_cape +绿色项圈,green_choker +绿色外套,green_coat +绿色连衣裙,green_dress +绿色的眼镜,green_eyes +绿色鞋类,green_footwear +绿色的手套,green_gloves +绿发,green_hair +绿色发带,green_hairband +绿色的袴,green_hakama +绿色头饰,green_headwear +绿色夹克,green_jacket +绿色和服,green_kimono +绿色的裤袜,green_legwear +绿色连身衣,green_leotard +绿色的指甲,green_nails +绿色的领结,green_neckwear +绿色的内裤,green_panties +绿色的裤子,green_pants +绿色丝带,green_ribbon +绿色水手领,green_sailor_collar +绿色围巾,green_scarf +绿色衬衫,green_shirt +绿色短裤,green_shorts +绿色皮肤,green_skin +绿色的裙子,green_skirt +绿色的毛衣,green_sweater +绿色的泳衣,green_swimsuit +绿茶,green_tea +绿色主题,green_theme +绿色背心,green_vest +绿色翅膀,green_wings +手榴弹,grenade +榴弹发射器,grenade_launcher +灰色背景,grey_background +灰色比基尼,grey_bikini +灰色蝴蝶结,grey_bow +灰色开口衬衫,grey_cardigan +灰色外套,grey_coat +灰色连衣裙,grey_dress +灰色的眼睛,grey_eyes +灰色鞋类,grey_footwear +灰色手套,grey_gloves +灰色的头发|白发|银发,grey_hair +灰色头饰,grey_headwear +灰色连帽衫,grey_hoodie +灰色夹克,grey_jacket +灰色的裤袜,grey_legwear +灰色的领结,grey_neckwear +灰色的内裤,grey_panties +灰色的裤子,grey_pants +灰色丝带,grey_ribbon +灰色水手领,grey_sailor_collar +灰色上衣,grey_shirt +灰色短裤,grey_shorts +灰色皮肤,grey_skin +灰色短裙,grey_skirt +灰色袖子,grey_sleeves +灰色毛衣,grey_sweater +灰色,grey_theme +灰色背心,grey_vest +纯黑白色|黑白照片化,greyscale +烤架,grill +(因痛苦、厌恶等)做怪相,grimace +魔法书,grimoire +爱丽丝的魔法书(东方),grimoire_of_alice +露齿笑,grin +素股|臀推,grinding +食材,groceries +购物袋,grocery_bag +腹股沟|耻丘,groin +腹股沟肌腱,groin_tendon +猥亵,groping +地面载具,ground_vehicle +团团抱,group_hug +团队名|组名,group_name +群P|群交,group_sex +把对方的手抓过来摸自己的胸,guided_breast_grab +抓住男性器插入自己,guided_penetration +吉他,guitar +吉他套,guitar_case +枪,gun +枪刃,gunblade +R-18G,guro +(缝在衣服上以放大、加固等的)衬料,gusset +辣妹,gyaru +体操队队长,gym_leader +体操服,gym_shirt +体操短裤,gym_shorts +体育仓库,gym_storeroom +体操服,gym_uniform +HK416,h&k_hk416 +修道服,habit 
+扎头巾(名词),hachimaki +光膀子,hadanugi_dousa +羽子板,hagoita +羽衣,hagoromo +发珠,hair_beads +耳后有头发,hair_behind_ear +头发上系着铃铛,hair_bell +延伸到两眼之间的刘海,hair_between_eyes +头绳,hair_bobbles +蝴蝶结发圈,hair_bow +梳子,hair_brush +梳头,hair_brushing +发髻|团子头,hair_bun +头发遮住了乳头(三点),hair_censor +捆发珠,hair_cubes +披下来的头发,hair_down +在摆动的头发,hair_flaps +拨头发,hair_flip +头发上别着花,hair_flower +抓着头发,hair_grab +嘴里有头发,hair_in_mouth +进气口发型,hair_intakes +发饰,hair_ornament +头发披在胸上,hair_over_breasts +头发遮着双眼,hair_over_eyes +头发遮住了一只眼睛,hair_over_one_eye +披肩发,hair_over_shoulder +扯着头发,hair_pull +头发往后扎,hair_pulled_back +发绳,hair_ribbon +飞仙髻,hair_rings +发圈|发束,hair_scrunchie +头发往后捋|往后梳的头发|背头发型,hair_slicked_back +散开的头发,hair_spread_out +发簪,hair_stick +强调一缕一缕感的发型|发丝,hair_strand +发带,hair_tie +将系头发的东西叼在嘴里,hair_tie_in_mouth +束发套,hair_tubes +撩头发,hair_tucking +手指卷着头发,hair_twirling +盘起来的头发,hair_up +发箍,hairband +发夹,hairclip +梳理头发|扎辫子,hairdressing +头发缠在男性器上|卷发「棒」,hairjob +脏辫,hairlocs +一种发卡夹成的发型特征,hairpin +头发上成对的像无线蓝牙的发饰,hairpods +袴|腿衣,hakama +袴裙,hakama_skirt +角色cos灵梦,hakurei_reimu_(cosplay) +戟,halberd +半闭的眼睛(单眼),half-closed_eye +半闭的双眼,half-closed_eyes +很短的手套|半截手套,half_gloves +女神式发型|公主辫,half_updo +一种漫画中常见的网点状的组色组图画法,halftone +网点图背景,halftone_background +万圣夜,halloween +万圣节装扮,halloween_costume +天使光环,halo +绕颈裙|衬衣式连衣裙|露背吊带裙,halter_dress +吊带式的上身的服饰,halter_top +绕颈系带,halterneck +破魔矢,hamaya +锤子,hammer +锤子和镰刀,hammer_and_sickle +仓鼠,hamster +花咲川女子学园,hanasakigawa_school_uniform +朝鲜服饰,hanbok +手放在胸前,hand_between_breasts +手放在两腿之间,hand_between_legs +做手势,hand_gesture +手放在对方的头发上,hand_in_another's_hair +手埋在头发里,hand_in_hair +手伸进内裤里,hand_in_panties +手插在口袋里,hand_in_pocket +手镜,hand_mirror +手里拿着捕虫网,hand_net +手摸着对方的脸颊,hand_on_another's_cheek +手放在对方的胸前,hand_on_another's_chest +手托着对方的下巴,hand_on_another's_chin +手接触到对方的脸,hand_on_another's_face +手放在对方的头上,hand_on_another's_head +手放在对方的肩上,hand_on_another's_shoulder +手放在对方的肚子上,hand_on_another's_stomach +手放在屁股上,hand_on_ass +手放在头上,hand_on_head +手触碰帽子,hand_on_headwear +手搁在刀柄上,hand_on_hilt +手叉着腰|手搁在腰上,hand_on_hip +把手放在自己的脸颊上,hand_on_own_cheek +把手放在自己的胸口,hand_on_own_chest +手碰着自己的下巴,hand_on_own_chin +把手放在自己脸上,hand_on_own_face +手放在自己的膝盖上,hand_on_own_knee +把手放在自己的肚子上,hand_on_own_stomach +手放在肩上,hand_on_shoulder +手掩着嘴,hand_over_own_mouth +手上套着玩偶,hand_puppet +手朝向自己嘴,hand_to_own_mouth +手伸进衬衫下面,hand_under_shirt +抬起手,hand_up +手提包,handbag +手铐,handcuffs +手枪,handgun +手里拿着游戏机,handheld_game_console +手淫,handjob +手帕,handkerchief +手|双手,hands +双手手指交叉|双手紧握,hands_clasped +双手交叉伸进袖子里,hands_in_opposite_sleeves +双手插在兜里,hands_in_pockets +双手放在对方的脸颊上,hands_on_another's_cheeks +双手放在别人的脸上,hands_on_another's_face +双手放在别人的肩膀上,hands_on_another's_shoulders +双手放在屁股上,hands_on_ass +双手放在脚上,hands_on_feet +双手碰到头上的饰物,hands_on_headwear +双手搭在刀柄上,hands_on_hilt +双手叉腰|双手搁在腰上,hands_on_hips +双手放在两腿之间,hands_on_lap +双手放在自己的脸颊上,hands_on_own_cheeks +双手放在自己的胸口,hands_on_own_chest +双手放在自己的脸上,hands_on_own_face +双手放在自己的头上,hands_on_own_head +双手放在自己的膝盖上,hands_on_own_knees +双手放在自己的肚子上,hands_on_own_stomach +双手放在自己的大腿上,hands_on_own_thighs +双手搭在两肩,hands_on_shoulders +双手合十|自己的双手碰一起,hands_together +双手朝上|双手抬起|双手举起,hands_up +倒立,handstand +板羽球,hanetsuki +汉服,hanfu +吊起来的,hanging +(俯身)双乳下垂,hanging_breasts +袢缠,hanten_(clothes) +羽织,haori +法被,happi +快乐|幸福,happy +生日快乐,happy_birthday +万圣节快乐,happy_halloween +新年快乐,happy_new_year +幸福的做爱,happy_sex +开心的眼泪,happy_tears +情人节快乐,happy_valentine +后宫,harem +舞娘服,harem_outfit +灯笼裤,harem_pants +背带,harness +竖琴,harp +有翼人|半人半鸟|鸟人族,harpy +矮人族,harvin +帽子,hat +带有蝴蝶结的帽子,hat_bow +带有羽毛的帽子,hat_feather +带着花的帽子,hat_flower +带有饰物的帽子,hat_ornament +帽子遮住了一只眼,hat_over_one_eye +帽子被摘下|摘下帽子,hat_removed 
+带有缎带的帽子,hat_ribbon +捏着帽檐,hat_tip +自带耳朵的帽子,hat_with_ears +短斧,hatchet +线影法(纹理),hatching_(texture) +cos成初音未来,hatsune_miku_(cosplay) +天狗叶扇,hauchiwa +憋尿,have_to_pee +夏威夷衫,hawaiian_shirt +头,head +弓背高潮,head_back +头埋在胸里|洗面奶,head_between_breasts +头上起包,head_bump +低着头,head_down +鱼人耳|人鱼耳|鳍状耳朵,head_fins +戴着头戴显示设备,head_mounted_display +一部分头部没画进框里,head_out_of_frame +枕着头|托着头,head_rest +包头巾(名词),head_scarf +歪着头,head_tilt +头上有翅膀,head_wings +头上戴着花冠,head_wreath +扎头巾(名词),headband +头部穿戴物(偏笼统),headdress +头部饰品(含一定科幻元素),headgear +无头,headless +耳机,headphones +耳机挂在脖子上,headphones_around_neck +装饰性头饰(偏幻想和民族风饰品),headpiece +状态条(游戏和科幻风格),heads-up_display +头戴式耳机,headset +头饰(偏衣物类),headwear +头饰被摘下|摘下头饰,headwear_removed +血量槽|HP|血条,health_bar +心|心形|爱心,heart +心形盒子,heart-shaped_box +心形眼镜,heart-shaped_eyewear +心形锁,heart-shaped_lock +心形瞳孔|心形眼,heart-shaped_pupils +心形呆毛,heart_ahoge +含心形的背景|桃色背景,heart_background +心形马赛克,heart_censor +心形项圈,heart_choker +衣服上有心形的开口,heart_cutout +心形耳环,heart_earrings +心形发饰,heart_hair_ornament +双手比心,heart_hands +两个人一起用手比心,heart_hands_duo +眼里冒爱心,heart_in_eye +嘴里画着爱心|心形化的舌头,heart_in_mouth +舰娘锁(舰C),heart_lock_(kantai_collection) +心形项链,heart_necklace +串成心形的绳子(或线条),heart_of_string +心形乳贴,heart_pasties +心形枕头,heart_pillow +心形图案|心形印花,heart_print +心形尾巴|心形尾尖|尾巴摆出心形,heart_tail +心形纹身,heart_tattoo +心跳|表现出心动的文字,heartbeat +喘粗气,heavy_breathing +黑克勒-科赫制系列武器,heckler_&_koch +身高差,height_difference +直升机,helicopter +戴着头盔的,helm +头盔|头部装甲|头部防护罩,helmet +摘下头盔|被摘下的头盔,helmet_removed +变身,henshin +英灵节日装,heroic_spirit_festival_outfit +英灵旅装,heroic_spirit_traveling_outfit +异性恋,hetero +虹膜异色,heterochromia +六边形,hexagon +六角星,hexagram +扶桑花,hibiscus +没有露脸|被挡住的脸,hidden_face +躲藏,hiding +高腰裙,high-waist_skirt +高领,high_collar +高对比度,high_contrast +高跟靴,high_heel_boots +高跟鞋,high_heels +高踢|高抬腿,high_kick +高马尾,high_ponytail +高开衩的衣物,highleg +高开衩的比基尼,highleg_bikini +高开衩的连身衣,highleg_leotard +高开衩的内裤,highleg_panties +高开衩的泳衣,highleg_swimsuit +挑染,highlights +私立光坂高等学校校服,hikarizaka_private_high_school_uniform +引眉,hikimayu +小山坡,hill +公主发型,hime_cut +髋骨,hip_bones +臀部为焦点,hip_focus +服装露出髋部侧面,hip_vent +八木天线,hirschgeweih_antennas +舀子,hishaku +震动棒|按摩棒,hitachi_magic_wand +人魂,hitodama +点击量、阅览数、关注数(贺图),hits +击打,hitting +拿着某物,holding +抱着动物,holding_animal +拉着弓,holding_arrow +握着斧头,holding_axe +提着包,holding_bag +抱着球,holding_ball +提着篮子,holding_basket +捧着书,holding_book +拿着瓶子,holding_bottle +手捧花束,holding_bouquet +拿着弓(武器),holding_bow_(weapon) +端着碗,holding_bowl +端着箱子,holding_box +憋气,holding_breath +手持扫帚,holding_broom +手持摄像机,holding_camera +拿着易拉罐,holding_can +手持糖果,holding_candy +手持卡片,holding_card +抱着猫,holding_cat +拿着筷子,holding_chopsticks +手叼香烟,holding_cigarette +拿着衣服,holding_clothes +拿着避孕套,holding_condom +手持杯子,holding_cup +手持匕首,holding_dagger +抱着玩偶,holding_doll +拿着眼镜|正在摘眼镜,holding_eyewear +拿着扇子,holding_fan +手持旗帜,holding_flag +拿着花,holding_flower +拿着食物,holding_food +拿着餐叉,holding_fork +拿着水果,holding_fruit +拿着礼物,holding_gift +拿着枪,holding_gun +握着头发|手指绕着头发,holding_hair +牵手,holding_hands +拿着帽子,holding_hat +手里捧着头,holding_head +拿着头盔,holding_helmet +拿着泳圈,holding_innertube +拿着乐器,holding_instrument +拿着刀,holding_knife +拿着叶子,holding_leaf +拿着棒棒糖,holding_lollipop +拿着面具,holding_mask +拿着麦克风,holding_microphone +拿着缝衣针,holding_needle +握着自己的脚,holding_own_foot +手握画笔,holding_paintbrush +拿着内裤,holding_panties +拿着纸,holding_paper +握笔,holding_pen +手持铅笔,holding_pencil +拿着手机,holding_phone +抱着枕头,holding_pillow +拿着烟斗,holding_pipe +拿着披萨,holding_pizza +端着碟子,holding_plate +拿着精灵球,holding_poke_ball +抱着宝可梦,holding_pokemon +手持长柄武器,holding_polearm +拎着包袱,holding_sack +手持镰刀,holding_scythe +握着刀鞘,holding_sheath +手持盾牌,holding_shield 
+拿着鞋子,holding_shoes +手持招牌,holding_sign +手持长矛,holding_spear +拿着汤勺,holding_spoon +手持法杖,holding_staff +手拉着吊带,holding_strap +抱着毛绒玩具,holding_stuffed_animal +手持手写笔,holding_stylus +手持剑,holding_sword +手持注射器,holding_syringe +拿着毛巾,holding_towel +托着盘子,holding_tray +撑伞,holding_umbrella +手持内衣,holding_underwear +握着棒子,holding_wand +手持武器,holding_weapon +握着鞭子,holding_whip +冬青,holly +全息图片,hologram +全息投影,holographic_interface +武装带,holster +嘟嘴|抿嘴,homu +穗群原学园制服,homurahara_academy_uniform +蜂蜜,honey +蜂窝风格图案,honeycomb_(pattern) +蜂窝风格背景,honeycomb_background +兜帽,hood +放下的兜帽,hood_down +戴起来的兜帽,hood_up +连帽斗篷,hooded_cape +连帽披巾,hooded_cloak +连帽外套,hooded_coat +连帽夹克,hooded_jacket +连帽毛衣,hooded_sweater +连帽运动夹克,hooded_track_jacket +连帽衫,hoodie +铁环,hoop +环状耳环,hoop_earrings +蹄子,hooves +地平线,horizon +一字型瞳孔|蛙眼,horizontal_pupils +水平条纹,horizontal_stripes +系着蝴蝶结的角,horn_bow +角上有饰物,horn_ornament +角上系着缎带,horn_ribbon +角状头饰,horned_headwear +带角头盔,horned_helmet +角|有角的,horns +为角留了洞的头饰,horns_through_headwear +恐怖(主题),horror_(theme) +马,horse +马耳,horse_ears +马娘,horse_girl +马尾,horse_tail +骑马,horseback_riding +软管,hose +病床,hospital_bed +热的,hot +热狗,hot_dog +千鸟格纹,houndstooth +沙漏,hourglass +房子,house +画法示范,how_to +拥抱,hug +从背后抱上来,hug_from_behind +大呆毛,huge_ahoge +大屁股,huge_ass +爆乳,huge_breasts +大乳头|长乳头,huge_nipples +大阴茎,huge_penis +大尺寸武器,huge_weapon +拟人化,humanization +羞辱,humiliation +饥饿的,hungry +夫妻,husband_and_wife +紫阳花,hydrangea +催眠|催眠术,hypnosis +冰,ice +冰淇淋,ice_cream +冰淇淋筒,ice_cream_cone +冰块,ice_cube +冰翼,ice_wings +糖霜,icing +学生证|身份证,id_card +眼部打码,identity_censor +偶像,idol +假如这两个角色有孩子了,if_they_mated +在想象的,imagining +快亲上了,imminent_kiss +正要被强奸,imminent_rape +正要进行性行为,imminent_sex +正要被插入的小穴,imminent_vaginal +小恶魔,imp +被刺穿的,impaled +暗示其是扶她,implied_futanari +其实插进去了吧|遮挡了性交的结合处,implied_sex +暗示其是女同性恋,implied_yuri +人体彩绘般的紧身衣,impossible_bodysuit +有不现实的包裹程度的衣服,impossible_clothes +过分紧身的衣服,impossible_dress +人体彩绘般的连身衣,impossible_leotard +贴合程度不合逻辑的衬衫,impossible_shirt +人体彩绘般的泳衣,impossible_swimsuit +受精图,impregnation +在盒子里的|箱内女孩,in_box +(角色)在桶里,in_bucket +(角色)在容器里,in_container +(角色)在杯子里,in_cup +(角色)在食物里,in_food +有某物冲到脸上,in_the_face +乱伦,incest +递食物,incoming_food +递礼物,incoming_gift +等待接吻|献吻,incoming_kiss +食指抬起,index_finger_raised +盘腿坐,indian_style +室内,indoors +无穷|无限符号,infinity +充气筏,inflatable_raft +充气玩具,inflatable_toy +隆起的腹部,inflation +有伤的,injury +墨水|颜料,ink +颜料罐(喷射战士),ink_tank_(splatoon) +直排轮溜冰鞋,inline_skates +泳圈,innertube +昆虫,insect +昆虫翅膀,insect_wings +插入其他镜头或图片,inset +证章,insignia +即堕2格漫画,instant_loss_2koma +乐器,instrument +乐器盒,instrument_case +EVA神经连接器,interface_headset +紧扣的双手,interlocked_fingers +内射断面图,internal_cumshot +跨人种的,interracial +异种奸|异种爱,interspecies +静脉滴注,intravenous_drip +内陷乳头,inverted_nipples +隐形椅子,invisible_chair +隐去阴茎,invisible_penis +iPhone,iphone +iPod,ipod +铁十字架,iron_cross +深喉,irrumatio +so-nanoka(是这样啊的口癖),is_that_so +岛屿,island +意大利国旗,italian_flag +意大利语,italian_text +南瓜灯,jack-o'-lantern +夹克,jacket +用夹克束腰,jacket_around_waist +用夹克披肩,jacket_on_shoulders +脱下夹克|脱下的夹克,jacket_removed +美洲豹尾巴,jaguar_tail +日本,japan +日本铠甲,japanese_armor +日式服装,japanese_clothes +日本国旗,japanese_flag +加帕里馒头,japari_bun +加帕里符号,japari_symbol +罐子,jar +嫉妒的,jealous +牛仔裤,jeans +水母,jellyfish +运动衫,jersey +小丑帽,jester_cap +喷气式飞机,jet +漆黑的骑士王ver新宿1999,jet_black_king_of_knights_ver._shinjuku_1999 +蓬莱玉枝,jeweled_branch_of_hourai +珠宝,jewelry +僵尸,jiangshi +铃铛,jingle_bell +轻蔑的眼神,jitome +关节,joints +JOJO立,jojo_pose +盒装饮料,juice_box +朱丽叶袖,juliet_sleeves +跳跃的,jumping +(尤指女式)连衫裤,jumpsuit +KDA组合(英雄联盟),k/da_(league_of_legends) +头盔|甲胄,kabuto +镜饼,kagami_mochi +怪兽,kaijuu +卡米纳墨镜,kamina_shades 
+神山高中校服,kamiyama_high_school_uniform +鬼金棒,kanabou +cos成鹿目圆,kaname_madoka_(cosplay) +汉字,kanji +簪子,kanzashi +河童,kappa +(烹饪时穿的)罩衫,kappougi +唐伞妖怪,karakasa_obake +卡拉OK,karaoke +乌天狗,karasu_tengu +狩衣,kariginu +日本刀,katana +KEEP-OUT警戒线,keep_out +继续高中军服,keizoku_military_uniform +继续高中校服,keizoku_school_uniform +兽耳模式,kemonomimi_mode +方头巾,kerchief +袈裟,kesa +番茄酱,ketchup +水壶,kettle +钥匙,key +电脑键盘,keyboard_(computer) +键盘乐器,keyboard_(instrument) +钥匙扣,keychain +锁扣,keyhole +要石,keystone +提水板,kickboard +踢,kicking +玩偶装,kigurumi +和服,kimono +掀起和服,kimono_lift +剥开和服,kimono_pull +和服裙,kimono_skirt +幼儿园制服,kindergarten_uniform +杵,kine +国王(国际象棋),king_(chess) +cos成雾雨魔理沙,kirisame_marisa_(cosplay) +烟斗,kiseru +吻|亲吻|接吻,kiss +县立北高校制服|县立北高校(凉宫春日的忧郁),kita_high_school_uniform +厨房,kitchen +妖狐,kitsune +清澄高中校服,kiyosumi_school_uniform +及膝靴,knee_boots +护膝,knee_pads +顶起膝盖,knee_up +及膝袜,kneehighs +跪着,kneeling +膝窝,kneepits +膝盖,knees +膝盖蜷到胸前,knees_on_chest +膝盖顶到胸部,knees_to_chest +双膝并拢双脚分开,knees_together_feet_apart +顶起双膝,knees_up +小刀,knife +骑士,knight +骑士(国际象棋),knight_(chess) +小辣妹,kogal +木叶村标志,konohagakure_symbol +韩服,korean_clothes +韩文,korean_text +被炉,kotatsu +小手,kote +新年问候,kotoyoro +香霖堂天狗装束,kourindou_tengu_costume +苦无,kunai +黑森峰女子学园军服,kuromorimine_military_uniform +黑森峰女子学园校服,kuromorimine_school_uniform +腹当|草摺,kusazuri +九尾|九尾狐,kyuubi +弓道,kyuudou +白大褂,labcoat +阴唇,labia +蕾丝,lace +蕾丝边胸罩,lace-trimmed_bra +蕾丝边连衣裙,lace-trimmed_dress +蕾丝边手套,lace-trimmed_gloves +蕾丝边发带,lace-trimmed_hairband +蕾丝边裤袜,lace-trimmed_legwear +蕾丝边内裤,lace-trimmed_panties +蕾丝边短裙,lace-trimmed_skirt +蕾丝边袖子,lace-trimmed_sleeves +系带靴,lace-up_boots +蕾丝胸罩,lace_bra +蕾丝内裤,lace_panties +蕾丝镶边,lace_trim +母乳|搾乳|喷乳,lactation +隔着衣服泌乳,lactation_through_clothes +梯子,ladder +长柄勺,ladle +瓢虫,ladybug +禁忌“莱瓦汀”,laevatein +湖,lake +蛇娘,lamia +灯,lamp +路灯,lamppost +(古骑兵的)长矛|长枪,lance +景观,landscape +灯笼,lantern +挂绳,lanyard +膝枕,lap_pillow +笔记本电脑,laptop +大乳晕,large_areolae +大蝴蝶结,large_bow +巨乳,large_breasts +大帽子,large_hat +巨根插入,large_insertion +大阴茎,large_penis +大尾巴,large_tail +大翅膀,large_wings +激光,laser +胶衣,latex +乳胶手套,latex_gloves +拉丁式十字架,latin_cross +在笑的,laughing +洗衣店,laundry +桂冠,laurel_crown +分层比基尼,layered_bikini +分层服装,layered_clothing +分层连衣裙,layered_dress +分层的半裙,layered_skirt +分层的衣袖,layered_sleeves +叶子,leaf +叶子发饰,leaf_hair_ornament +头上的叶子,leaf_on_head +叶子印花,leaf_print +叶伞|当伞用的叶子,leaf_umbrella +倚靠|身体倾斜,leaning +身体往后靠,leaning_back +身体前倾,leaning_forward +靠在物体上,leaning_on_object +向一侧倾斜身体,leaning_to_the_side +皮带,leash +皮革,leather +皮靴,leather_boots +皮手套,leather_gloves +皮夹克,leather_jacket +左撇子,left-handed +从左往右读的漫画,left-to-right_manga +腿部系着带子,leg_belt +腿顶在大腿之间,leg_between_thighs +裤袜有开口,leg_cutout +腿环|袜带,leg_garter +抓着腿,leg_grab +腿毛,leg_hair +单腿抬高,leg_lift +用双腿夹住,leg_lock +腿部丝带,leg_ribbon +腿部纹身,leg_tattoo +抬起腿,leg_up +腿套|暖腿袜,leg_warmers +传说宝可梦,legendary_pokemon +紧身裤,leggings +双腿,legs +两腿分开,legs_apart +大腿小腿折起来,legs_folded +双腿抬过头,legs_over_head +双腿并拢,legs_together +双腿抬起,legs_up +短裤穿在裤袜外面,legwear_under_shorts +柠檬,lemon +柠檬片,lemon_slice +镜头光晕,lens_flare +豹纹,leopard_print +豹尾,leopard_tail +紧身连身衣裤|芭蕾服,leotard +把连衣裤裆部剥到一边,leotard_aside +剥开连衣裤,leotard_pull +内层穿着连衣裤,leotard_under_clothes +信,letter +宽银幕格式,letterboxed +棒球服,letterman_jacket +图书馆,library +舔,licking +舔嘴唇,licking_lips +舔阴茎,licking_penis +救生圈,lifebuoy +被对方掀起衣物,lifted_by_another +掀起自己的衣物,lifted_by_self +将人抱起,lifting_person +光,light +有光泽的乳晕,light_areolae +浅蓝色的眼睛,light_blue_eyes +浅蓝色头发,light_blue_hair +淡淡的腮红,light_blush +浅棕色眼睛,light_brown_eyes +茶色头发,light_brown_hair +灯泡,light_bulb +圣光,light_censor +浅绿色头发,light_green_hair +光粒子,light_particles +光照,light_rays 
+浅笑,light_smile +光线移动的轨迹,light_trail +打火机,lighter +灯塔,lighthouse +闪电,lightning +光剑,lightsaber +百合花,lily_(flower) +铃兰,lily_of_the_valley +睡莲,lily_pad +有限色盘作画,limited_palette +素线画,lineart +一排人,lineup +女式内衣,lingerie +狮子,lion +狮子耳朵,lion_ears +狮子尾巴,lion_tail +咬嘴唇,lip_biting +唇钉,lip_piercing +嘴唇,lips +口红|唇膏,lipstick +口红印,lipstick_mark +口红管,lipstick_tube +液体,liquid +液态头发,liquid_hair +《Listen!!》,listen!! +活体衣服,living_clothes +蜥蜴,lizard +蜥蜴尾巴,lizard_tail +承重背心,load_bearing_vest +乐福鞋,loafers +锁,lock +互挽手臂,locked_arms +储物柜,locker +更衣室,locker_room +枯树干,log +标志,logo +遮裆布,loincloth +洛丽塔时装,lolita_fashion +洛丽塔发带,lolita_hairband +棒棒糖,lollipop +长辫子,long_braid +长外套,long_coat +长连衣裙,long_dress +长指甲,long_fingernails +长发,long_hair +长腿,long_legs +尖尖的长耳朵,long_pointy_ears +长马尾,long_ponytail +长裙,long_skirt +长袖,long_sleeves +长脚趾甲,long_toenails +长舌头,long_tongue +撞脸,look-alike +遥望,looking_afar +看着另一个,looking_at_another +瞄着乳房,looking_at_breasts +看着阴茎,looking_at_penis +看手机,looking_at_phone +看向阅图者,looking_at_viewer +移开视线,looking_away +回眸,looking_back +俯视,looking_down +看向侧面,looking_to_the_side +仰视,looking_up +松散的带子(衣物),loose_belt +松散的领带,loose_necktie +泡泡袜,loose_socks +防晒霜|身体乳,lotion +乳液瓶子,lotion_bottle +莲花,lotus +躺椅,lounge_chair +休闲服,loungewear +情书,love_letter +低扎长发,low-tied_long_hair +低扎马尾,low_ponytail +低扎双尾,low_twintails +腰间的翅膀,low_wings +下半身特写,lower_body +低腰款式,lowleg +低腰比基尼,lowleg_bikini +低腰内裤,lowleg_panties +低腰裤子,lowleg_pants +润滑油,lube +行李,luggage +鲁纳诺娃魔法学校校服,luna_nova_school_uniform +琵琶(乐器),lute_(instrument) +躺着的,lying +躺在人身上,lying_on_person +歌词,lyrics +m4卡宾枪,m4_carbine +m1911型枪,m1911 +M字摆腿,m_legs +马卡龙,macaron +权杖,mace +机械,machine +机关枪,machine_gun +机械武装,machinery +三点贴纸,maebari +勾玉,magatama +杂志,magazine +弹匣(武器),magazine_(weapon) +杂志封面,magazine_cover +魔法,magic +魔法阵,magic_circle +魔法少年,magical_boy +魔法少女,magical_girl +魔能枪,magical_musket +魔术师,magician +放大镜,magnifying_glass +麻将,mahjong +麻将牌,mahjong_tile +女仆,maid +女仆围裙,maid_apron +女仆比基尼,maid_bikini +女仆头饰,maid_headdress +化妆,makeup +海苔卷寿司,makizushi +男性特写,male_focus +男性自慰,male_masturbation +男性的阴毛,male_pubic_hair +男用泳裤,male_swimwear +男用内裤,male_underwear +木槌,mallet +麻美断头梗,mami_mogu_mogu +龙人族,manakete +蜜柑,mandarin_orange +图中有漫画,manga_(object) +强壮的,manly +地图,map +枫叶,maple_leaf +记号笔|记号笔写的标记,marker +用记号笔作的画,marker_(medium) +♂,mars_symbol +珍妮鞋,mary_janes +睫毛膏,mascara +吉祥物,mascot +面具,mask +掀到头上的面具,mask_on_head +拉着口罩,mask_pull +摘下的面具,mask_removed +抖M,masochism +大师之剑,master_sword +自慰,masturbation +隔着衣服自慰,masturbation_through_clothing +情侣装,matching_outfit +授精体位,mating_press +肉,meat +机甲,mecha +机娘,mecha_musume +机械臂,mechanical_arms +机械式天使光环,mechanical_halo +机械义足,mechanical_legs +露出机损,mechanical_parts +出芯笔,mechanical_pencil +机械翼,mechanical_wings +机械化,mechanization +医用眼罩,medical_eyepatch +普通罩杯的胸,medium_breasts +中等长发,medium_hair +中等长裙子,medium_skirt +Mega形态宝可梦,mega_pokemon +扩音器,megaphone +明治年间女学生制服,meiji_schoolgirl_uniform +meka(守望先锋),meka_(overwatch) +蜜瓜,melon +菠萝包,melon_bread +融化,melting +meme|模因,meme +ACG流行款服装,meme_attire +菜单,menu +美人鱼,mermaid +圣诞快乐,merry_christmas +邮差包,messenger_bag +身上有凌乱的污渍,messy +凌乱的头发,messy_hair +金属项圈,metal_collar +迷你比基尼,micro_bikini +布料很少的内裤,micro_panties +小尺寸短裤,micro_shorts +小尺寸连衣裙,microdress +麦克风,microphone +麦克风杆,microphone_stand +超短裙,microskirt +中指,middle_finger +腹部,midriff +微微露出腹部,midriff_peek +巫女,miko +军事,military +军帽,military_hat +军装夹克,military_jacket +军事作战员,military_operator +军服,military_uniform +军用载具,military_vehicle +牛奶,milk +奶瓶,milk_bottle +牛奶盒,milk_carton +挤奶器,milking_machine +毫厘笔作画(中号),millipen_(medium) +采耳,mimikaki +洗脑,mind_control +迷你八卦炉,mini-hakkero 
+迷你皇冠,mini_crown +迷你帽子,mini_hat +迷你礼帽,mini_top_hat +迷你翅膀,mini_wings +小人儿,miniboy +迷你女孩,minigirl +迷你裙,miniskirt +万宝槌,miracle_mallet +镜子,mirror +双色比基尼,mismatched_bikini +双色鞋子,mismatched_footwear +双色手套,mismatched_gloves +双色裤袜,mismatched_legwear +头发和阴毛颜色不一,mismatched_pubic_hair +导弹,missile +正常位,missionary +误会,misunderstanding +见泷原中学校制服,mitakihara_school_uniform +巴纹,mitsudomoe_(shape) +连指手套,mittens +混浴,mixed_bathing +手绘和数绘并存风格,mixed_media +拌料盆,mixing_bowl +二男加一女,mmf_threesome +呻吟,moaning +头巾式室内女帽,mob_cap +麻薯,mochi +中分的阴毛,moderate_pubic_hair +莫霍克发型,mohawk +痣,mole +乳房上有痣,mole_on_breast +脖子上有痣,mole_on_neck +大腿上有痣,mole_on_thigh +眼角有痣,mole_under_eye +嘴角有痣|美人痣,mole_under_mouth +性骚扰,molestation +熔岩,molten_rock +钱,money +显示器,monitor +猴子,monkey +猴耳,monkey_ears +猴尾,monkey_tail +单色画,monochrome +单色背景,monochrome_background +单片眼镜,monocle +怪物,monster +亚人,monster_boy +人外娘,monster_girl +魔物化,monsterification +月亮,moon +月光,moonlight +拖把,mop +早晨,morning +牵牛花,morning_glory +打了马赛克,mosaic_censoring +苔藓,moss +母女,mother_and_daughter +母子,mother_and_son +母性,motherly +运动导致的模糊,motion_blur +体现运动的线,motion_lines +机动载具,motor_vehicle +摩托车,motorcycle +摩托头盔,motorcycle_helmet +富士山,mount_fuji +山,mountain +山地地平线,mountainous_horizon +老鼠,mouse +鼠标,mouse_(computer) +老鼠耳朵,mouse_ears +鼠娘,mouse_girl +鼠尾巴,mouse_tail +嘴,mouth +嘴角画着口水滴形状的缺口,mouth_drool +用嘴叼着,mouth_hold +口罩,mouth_mask +面纱,mouth_veil +带柄的杯子|马克杯,mug +姆Q,mukyuu +细带款比基尼,multi-strapped_bikini +头发绑着多条系带,multi-tied_hair +多彩的,multicolored +多色背景,multicolored_background +多色款比基尼,multicolored_bikini +多色款连体衣,multicolored_bodysuit +多色款斗篷,multicolored_cape +多色款衣服,multicolored_clothes +多色款连衣裙,multicolored_dress +有多种颜色的眼睛,multicolored_eyes +多色款手套,multicolored_gloves +多色的头发,multicolored_hair +多色款腿部服饰,multicolored_legwear +多彩指甲,multicolored_nails +多彩皮肤,multicolored_skin +多色款裙子,multicolored_skirt +多彩的条纹,multicolored_stripes +有多种颜色的翅膀,multicolored_wings +多列四格漫画,multiple_4koma +多只蝴蝶结,multiple_bows +多个男孩,multiple_boys +多股(麻花)辫,multiple_braids +多部作品的角色或元素同框,multiple_crossover +多个耳环,multiple_earrings +多人口交,multiple_fellatio +多个女孩,multiple_girls +头发上有多只蝴蝶结,multiple_hair_bows +多足角色,multiple_legs +多个单色结构拼接成的,multiple_monochrome +多个性别暧昧的角色,multiple_others +两个女性(两对胸)进行乳交,multiple_paizuri +阴茎包围网,multiple_penises +多重形象的角色,multiple_persona +多人骑着车,multiple_riders +多条尾巴,multiple_tails +多角度|多姿势,multiple_views +多对翅膀,multiple_wings +木乃伊,mummy +超能力或魔法用于日常使用,mundane_utility +胸甲,muneate +肌肉发达的女性,muscular_female +蘑菇,mushroom +音乐,music +音符,musical_note +音符发饰,musical_note_hair_ornament +胡子,mustache +哑光,muted_color +枪口火光,muzzle_flash +神兽(口袋妖怪),mythical_pokemon +薙刀,naginata +钉子|指甲,nail +美甲,nail_art +指甲油,nail_polish +裸体围裙,naked_apron +裸体且裹着绷带,naked_bandage +裸体且披着斗篷,naked_cape +裸体且披着大衣,naked_coat +裸体且穿着连帽衫,naked_hoodie +裸体且穿着和服,naked_kimono +裸体且穿着工作服,naked_overalls +裸体且缠着丝带,naked_ribbon +裸体且裹着围巾,naked_scarf +裸体且裹着被子,naked_sheet +裸体且穿着衬衫,naked_shirt +裸体毛衣,naked_sweater +裸体且穿着搭肩衫,naked_tabard +裸体且裹着毛巾,naked_towel +记名牌,name_tag +人与人的同名梗,namesake +波路中学校服,nami_junior_high_school_uniform +七色丘中学校服,nanairogaoka_middle_school_uniform +七森中学校服,nanamori_school_uniform +nanodesu(口癖),nanodesu_(phrase) +私立直江津高校制服,naoetsu_high_school_uniform +脖颈,nape +妖娆的腰,narrow_waist +新大岛学园校服,national_shin_ooshima_school_uniform +印第安人,native_american +"《夏色えがおで1,2,Jump!》",natsuiro_egao_de_1_2_jump! 
+大自然,nature +下流的表情,naughty_face +海军制服,naval_uniform +肚脐,navel +服装上有露出肚脐的开口,navel_cutout +脐钉,navel_piercing +海军蓝腿部服饰,navy_blue_legwear +只穿着内衣(或泳衣)和围裙,nearly_naked_apron +脖子,neck +颈部系着缎带,neck_ribbon +颈环,neck_ring +波浪褶边的领子,neck_ruff +领巾,neckerchief +项链,necklace +领带,necktie +领带贴(夹)在乳沟里,necktie_between_breasts +抓着领带,necktie_grab +抓着颈部的衣服,neckwear_grab +针,needle +睡袍,negligee +猫又,nekomata +年贺状,nengajou +霓虹灯效果,neon_trim +不安的,nervous +不安地微笑,nervous_smile +网,net +NTR,netorare +新式死库水,new_school_swimsuit +新年,new_year +人妖,newhalf +报纸,newspaper +蘸水笔画风,nib_pen_(medium) +夜,night +夜空,night_sky +睡帽,nightcap +睡衣,nightgown +日本画,nihonga +耐克,nike +忍者,ninja +任天堂3ds,nintendo_3ds +任天堂ds,nintendo_ds +任天堂switch,nintendo_switch +乳头打了码,nipple_censor +乳头夹,nipple_clamps +乳头穿刺,nipple_piercing +乳环,nipple_rings +乳头走光,nipple_slip +玩弄乳头,nipple_tweak +能露出乳头的衣物,nippleless_clothes +乳头,nipples +去掉了(原设有的)蒙眼要素,no_blindfold +没穿胸罩,no_bra +去掉了(原设有的)眼罩,no_eyepatch +没画出眼睛,no_eyes +去掉了(原设有的)眼镜,no_eyewear +去掉了(原设有的)手套,no_gloves +去掉了(原设有的)帽子,no_hat +去掉了(原设有的)头饰,no_headwear +不含人类,no_humans +没画出腿,no_legs +裆胯以下裸着,no_legwear +几乎没用勾线的画,no_lineart +没画出嘴,no_mouth +没画出乳头,no_nipples +没画出鼻子,no_nose +下半身真空,no_panties +没穿着裤子和裙子,no_pants +没画出瞳孔,no_pupils +裸着裆部但没画小穴,no_pussy +没穿衬衫,no_shirt +没穿鞋,no_shoes +没穿袜子,no_socks +没画睾丸,no_testicles +去掉了(原设有的)翅膀,no_wings +能面,noh_mask +改款过的日本服饰,nontraditional_miko +面条,noodles +套索,noose +鼻子,nose +鼻子发红,nose_blush +鼻涕泡,nose_bubble +柴刀,nose_hatchet +鼻子有穿孔,nose_piercing +流鼻血,nosebleed +笔记本,notebook +便条簿,notepad +表示角色注意到了什么的线条,notice_lines +小说封面,novel_cover +恶搞的马赛克,novelty_censor +裸体,nude +裸体封面,nude_cover +数字,number +编号纹身,number_tattoo +编号,numbered +序印者(魔法少女奈叶),numbers_(nanoha) +修女,nun +护士,nurse +护士帽,nurse_cap +喂奶手交,nursing_handjob +o3o,o3o +带O型环的衣物,o-ring +带O型环的比基尼,o-ring_bikini +带O型环的下装,o-ring_bottom +带O型环的上衣,o-ring_top +o_o,o_o +便当,obentou +和服腰带,obi +抱着某物,object_hug +异物插入,object_insertion +人与物的同名梗,object_namesake +头上有非头饰类的物体,object_on_head +物品化,objectification +海,ocean +章鱼士兵,octarian +章鱼,octopus +钉宫信长,oda_uri +有个异类,odd_one_out +简朴的画作,oekaki +露肩连衣裙,off-shoulder_dress +露肩夹克,off-shoulder_jacket +露肩衬衫,off-shoulder_shirt +露肩毛衣,off-shoulder_sweater +露肩式,off_shoulder +办公室,office +办公椅,office_chair +办公室女职员,office_lady +原作画风|官方画风,official_style +符纸,ofuda +衣服上贴着符纸,ofuda_on_clothes +OK手势,ok_sign +尤指米斯蒂娅·萝蕾拉的形象,okamisty +老人,old_man +旧式死库水,old_school_swimsuit +老妇,old_woman +老化,older +守旧派画风,oldschool +煎蛋卷,omelet +蛋包饭,omurice +仰躺,on_back +在床上,on_bed +坐在椅子上,on_chair +在沙发上,on_couch +在书桌上,on_desk +在地板上,on_floor +在地上,on_ground +在头上,on_head +在膝盖上,on_lap +在人身上,on_person +在肩上,on_shoulder +侧卧,on_side +趴着,on_stomach +在桌子上,on_table +在交通工具上,on_vehicle +御柱,onbashira +独眼,one-eyed +一体式泳衣,one-piece_swimsuit +泳衣晒痕,one-piece_tan +露出一只乳房,one_breast_out +闭着一只眼,one_eye_closed +一只眼被遮住,one_eye_covered +单膝跪,one_knee +只扎了一边的头发,one_side_up +鬼,oni +鬼角,oni_horns +鬼面,oni_mask +饭团,onigiri +阴阳师,onmyouji +拟声词,onomatopoeia +温泉,onsen +大洗联队徽章,ooarai_(emblem) +大洗联队军装,ooarai_military_uniform +大洗联队军服,ooarai_school_uniform +祓串,oonusa +大太刀|野太刀,ootachi +没透出眼睛的眼镜,opaque_glasses +胸部开口的毛衣,open-chest_sweater +翻开的书,open_book +(前面)解开的胸罩,open_bra +敞开的衣服,open_clothes +敞开式外套,open_coat +解开的衣领,open_collar +开门,open_door +解开的裤子拉链,open_fly +张开的手,open_hand +敞开的连帽衫,open_hoodie +敞开的夹克,open_jacket +敞开的和服,open_kimono +张嘴的,open_mouth +敞开的衬衫,open_shirt +敞开的短裤,open_shorts +敞开的裙子,open_skirt +敞开的背心,open_vest +打开的窗户,open_window +开门,opening_door +巨乳萝莉,oppai_loli +用嘴,oral +迎接射精而伸出舌头,oral_invitation +橙色背景,orange_background +橙色比基尼,orange_bikini +橙色紧身衣,orange_bodysuit +橙色蝴蝶结,orange_bow 
+橙色项圈,orange_choker +橙色连衣裙,orange_dress +橙色的眼镜,orange_eyes +橙色的花,orange_flower +橙色鞋类,orange_footwear +橙色的手套,orange_gloves +橙色头发,orange_hair +橙色发带,orange_hairband +橙色头饰,orange_headwear +橙色夹克,orange_jacket +橙色和服,orange_kimono +橙色裤袜,orange_legwear +橙色指甲,orange_nails +橙色领子,orange_neckwear +橙色内裤,orange_panties +橙色丝带,orange_ribbon +橙色水手领,orange_sailor_collar +橙色围巾,orange_scarf +橙色发束,orange_scrunchie +橙色衬衫,orange_shirt +橙色短裤,orange_shorts +橙色短裙,orange_skirt +橙色的天空,orange_sky +片橙,orange_slice +橙色,orange_theme +球体,orb +半兽人,orc +高潮|绝顶,orgasm +乱交,orgy +折纸,origami +阿宅|御宅族,otaku +性别不定的非人角色,other_focus +女装少年|伪娘,otoko_no_ko +国立音乃木坂学院女子制服,otonokizaka_school_uniform +关键部位在画面外,out-of-frame_censoring +不符角色原本性格,out_of_character +画里只有一部分,out_of_frame +户外|野战,outdoors +角色形象有描边,outline +一部分画到了背景框外面,outside_border +伸出手臂,outstretched_arm +伸出双臂,outstretched_arms +伸出的手,outstretched_hand +隔热手套,oven_mitts +过膝,over-kneehighs +下半无框眼镜,over-rim_eyewear +扛在肩上,over_shoulder +工作服,overalls +大衣,overcoat +精液溢出(未拔出),overflow +杂草丛生,overgrown +巨大生物,oversized_animal +过大号的衣服,oversized_clothes +大件物体,oversized_object +过大号衬衫,oversized_shirt +外裙|套裙,overskirt +卵子,ovum +猫头鹰,owl +双手合十,own_hands_together +P90,p90 +奶嘴,pacifier +桨,paddle +挂锁,padlock +页码,page_number +宝塔,pagoda +疼痛,pain +油漆|涂料|颜料,paint +溅上了涂料,paint_splatter +画笔,paintbrush +绘画,painting +绘画,painting_(object) +乳交,paizuri +着衣乳交,paizuri_under_clothes +睡衣,pajamas +苍白的颜色,pale_color +苍白的皮肤,pale_skin +调色板,palette +椰树,palm_tree +手掌,palms +薄饼,pancake +熊猫,panda +女性穿着西服,pant_suit +内裤|短裤|裤袜,panties +一只脚上挂着内裤,panties_around_one_leg +内裤挪到一边,panties_aside +头上套着内裤,panties_on_head +内裤已被脱下,panties_removed +穿在灯笼裤里的内裤,panties_under_buruma +穿在裤袜里的内裤,panties_under_pantyhose +裤子,pants +褪下裤子,pants_down +拉着裤子,pants_pull +卷起裤脚,pants_rolled_up +把内裤往上扯,panty_lift +微微露出内裤,panty_peek +拉下来的内裤,panty_pull +连裤袜|裤袜,pantyhose +褪下的裤袜,pantyhose_pull +紧致的裤袜|略透明的裤袜,pantylines +内裤走光,pantyshot +四号战车,panzerkampfwagen_iv +高加索元素,papakha +纸张,paper +纸飞机,paper_airplane +纸袋,paper_bag +纸鹤,paper_crane +纸扇,paper_fan +提灯,paper_lantern +帕拉迪岛军装,paradis_military_uniform +阳伞,parasol +冷糕,parfait +公园,park +风雪外套,parka +模仿捏他别的作品及人物,parody +鹦鹉,parrot +分开的刘海,parted_bangs +张开的嘴唇,parted_lips +部分上色,partially_colored +部分淹没,partially_submerged +解开部分纽扣,partially_unbuttoned +只拉上部分拉链,partially_unzipped +可以看见一部分肛门,partially_visible_anus +可以看见一部分外阴,partially_visible_vulva +派对帽,party_hat +派对拉炮,party_popper +意大利面食,pasta +像粉笔蜡笔作画的,pastel_(medium) +乳贴,pasties +油酥糕点,pastry +补丁,patch +路径,path +打了patreon的logo,patreon_logo +有patreon用户名,patreon_username +护肩,pauldrons +路面,pavement +兽爪手套,paw_gloves +爪子姿势,paw_pose +爪印图案,paw_print +兽爪鞋,paw_shoes +爪子,paws +☮(东方仗助衣服上有),peace_symbol +桃,peach +类似警帽的帽子,peaked_cap +珍珠,pearl_(gemstone) +珍珠项链,pearl_necklace +胸肌,pectorals +尿,pee +撒尿|失禁,peeing +尿到了自己(身体上)衣服上,peeing_self +偷窥,peeking +偷窥(的姿态),peeking_out +偷窥(强调视角),peeping +天马,pegasus +天马骑士,pegasus_knight +旗袍类衣物的前摆,pelvic_curtain +笔,pen +铅笔,pencil +铅笔裙,pencil_dress +短铅笔裙,pencil_skirt +吊坠,pendant +企鹅,penguin +企鹅帽,penguin_hood +阴茎,penis +亮“枪”,penis_awe +抓着阴茎,penis_grab +阴茎勾着内裤,penis_in_panties +阴茎贴着脸,penis_on_face +三角旗,pennant +五芒星印,pentacle +五角星,pentagram +会阴,perineum +丰满的乳房,perky_breasts +垂直位乳交,perpendicular_paizuri +小人儿在头上,person_on_head +人格替换,personality_switch +拟人化,personification +用了透视画法,perspective +变态,pervert +宠物,pet +宠物碗,pet_bowl +宠物play,pet_play +花瓣,petals +水面上漂着花瓣,petals_on_liquid +娇小的,petite +衬裙,petticoat +抚摸,petting +用来暗示阴茎,phallic_symbol +包茎,phimosis +电话|手机,phone +手机屏幕,phone_screen +留声机,phonograph +照片,photo +照片,photo_(object) +照片为背景,photo_background 
+很像照片的画,photorealistic +钢琴,piano +野餐,picnic +野餐篮,picnic_basket +图中有照片或其他图片,picture_(object) +相框,picture_frame +馅饼,pie +突码头,pier +穿洞,piercing +猪,pig +萌向的内八腿,pigeon-toed +驮着,piggyback +药丸,pill +药丸样式的耳环,pill_earrings +柱子,pillar +柱状画布背景,pillarboxed +枕头,pillow +抓着枕头,pillow_grab +睡帽,pillow_hat +抱枕,pillow_hug +驾驶员,pilot +驾驶员专用服,pilot_suit +背心裙|无袖连衣裙,pinafore_dress +夹鼻眼镜|无脚眼镜,pince-nez +掐,pinching +掐袖子,pinching_sleeves +粉色镜框眼镜,pink-framed_eyewear +粉色围裙,pink_apron +粉色背景,pink_background +粉色比基尼,pink_bikini +粉色上衣,pink_blouse +粉色紧身衣,pink_bodysuit +粉色蝴蝶结,pink_bow +粉色胸罩,pink_bra +粉色开口衬衫,pink_cardigan +粉色项圈,pink_choker +粉色外套,pink_coat +粉色颈圈,pink_collar +粉色连衣裙,pink_dress +粉红色的眼睛,pink_eyes +粉红色的花,pink_flower +粉色鞋类,pink_footwear +粉色手套,pink_gloves +粉红色的头发,pink_hair +粉色发带,pink_hairband +粉色袴,pink_hakama +粉色头饰,pink_headwear +粉色连帽衫,pink_hoodie +粉色夹克,pink_jacket +粉色和服,pink_kimono +粉色裤袜,pink_legwear +粉色连身衣,pink_leotard +粉色的嘴唇,pink_lips +粉红唇膏,pink_lipstick +粉色指甲,pink_nails +粉色领子,pink_neckwear +粉色内裤,pink_panties +粉色裤子,pink_pants +粉色丝带,pink_ribbon +粉红玫瑰,pink_rose +粉色水手领,pink_sailor_collar +粉色围巾,pink_scarf +粉色发束,pink_scrunchie +粉色衬衫,pink_shirt +粉色短裤,pink_shorts +粉色皮肤,pink_skin +粉色短裙,pink_skirt +粉色袖子,pink_sleeves +粉色毛衣,pink_sweater +粉色泳衣,pink_swimsuit +粉色主题,pink_theme +粉色毛巾,pink_towel +粉色背心,pink_vest +粉色翅膀,pink_wings +伸出小拇指,pinky_out +细条纹图案,pinstripe_pattern +管子,pipe +管道,pipes +海盗,pirate +海盗帽,pirate_hat +手枪,pistol +木髓遮阳帽,pith_helmet +像素绘,pixel_art +像素化,pixelated +披萨,pizza +格子呢图案,plaid +格子呢背景,plaid_background +格子呢比基尼,plaid_bikini +格子呢蝴蝶结,plaid_bow +格子呢连衣裙,plaid_dress +格子呢领子,plaid_neckwear +格子呢内裤,plaid_panties +格子呢裤子,plaid_pants +格子呢围巾,plaid_scarf +格子呢衬衫,plaid_shirt +格子呢短裙,plaid_skirt +格子呢背心,plaid_vest +行星,planet +植物,plant +娘化植物,plant_girl +足底屈曲,plantar_flexion +往地上插的剑,planted_sword +往地上插的武器,planted_weapon +塑料袋,plastic_bag +盘子,plate +板甲,plate_armor +平台鞋,platform_footwear +白金色头发,platinum_blonde_hair +兔女郎连身衣,playboy_bunny_leotard +2P玩家角色风格颜色替换,player_2 +在玩耍的,playing +扑克牌,playing_card +打游戏,playing_games +演奏乐器,playing_instrument +掌机,playstation_portable +百褶连衣裙,pleated_dress +百褶裙,pleated_skirt +拨片,plectrum +EVA驾驶员服,plugsuit +身材丰满的,plump +深领,plunging_neckline +口袋,pocket +怀表,pocket_watch +百奇,pocky +百奇日,pocky_day +百奇游戏,pocky_kiss +指示棒,pointer +用手指着,pointing +指着自己,pointing_at_self +指向看图的人,pointing_at_viewer +向上指,pointing_up +敷衍的打码,pointless_censoring +有套子但还是内射了,pointless_condom +妖精耳,pointy_ears +尖头鞋,pointy_footwear +带着尖角的发型,pointy_hair +精灵球,poke_ball +基础红白精灵球,poke_ball_(basic) +精灵球图案,poke_ball_print +精灵球主题,poke_ball_theme +宝可梦图鉴编号,pokedex_number +口袋妖怪(生物)|宝可梦,pokemon_(creature) +头上的口袋妖怪,pokemon_on_head +肩上的口袋妖怪,pokemon_on_shoulder +扑克筹码,poker_chip +戳,poking +极地迦勒底制服,polar_chaldea_uniform +杆子,pole +钢管舞,pole_dancing +长柄武器,polearm +警察,police +警服,police_uniform +女警,policewoman +圆斑点,polka_dot +圆斑背景,polka_dot_background +圆斑比基尼,polka_dot_bikini +圆斑蝴蝶结,polka_dot_bow +圆斑胸罩,polka_dot_bra +圆斑裤袜,polka_dot_legwear +圆斑内裤,polka_dot_panties +圆斑发束,polka_dot_scrunchie +圆斑泳衣,polka_dot_swimsuit +马球衫,polo_shirt +波洛斯皇冠,polos_crown +小绒球(衣物挂件),pom_pom_(clothes) +蓬帕杜发型,pompadour +穗饰披巾|雨披,poncho +池塘,pond +马尾,ponytail +泳池,pool +泳池扶梯,pool_ladder +在泳池边,poolside +爆米花,popcorn +竖起来的衣领,popped_collar +冰棒|棒冰,popsicle +门廊,porch +道士帽,porkpie_hat +色情刊物|色情片,pornography +半身像,portrait +姿势,pose +海报,poster_(object) +锅,pot +土豆,potato +薯片,potato_chips +药水,potion +盆栽,potted_plant +(贴身的)小袋,pouch +浇注|倾倒,pouring +撅嘴,pout +第一人称视角,pov +看向裤裆的视角,pov_crotch +手伸来方向的视角,pov_hands +动力装甲,power_armor +电线,power_lines +动能服,power_suit +电源符号,power_symbol +真理学院军装,pravda_military_uniform 
+真理学院校服,pravda_school_uniform +念珠,prayer_beads +祈祷,praying +先走汁,precum +怀孕,pregnant +活体头发|可念动的头发,prehensile_hair +(性诱惑)展示,presenting +牧师,priest +公主,princess +公主抱,princess_carry +印花比基尼,print_bikini +印花蝴蝶结,print_bow +印花胸罩,print_bra +印花连衣裙,print_dress +印花手套,print_gloves +印花头饰,print_headwear +印花和服,print_kimono +印花裤袜,print_legwear +印花马克杯,print_mug +印花领带,print_neckwear +印花内裤,print_panties +印花沙滩裙,print_sarong +印花衬衫,print_shirt +印花短裙,print_skirt +植入广告,product_placement +亵渎,profanity +侧颜,profile +类似海报或杂志的插图效果,projected_inset +精液射到空中,projectile_cum +俯卧后入,prone_bone +螺旋桨,propeller +假肢,prosthesis +卖春,prostitution +阴毛,pubic_hair +从衣物里露出一部分阴毛,pubic_hair_peek +淫纹,pubic_tattoo +公开场所(性交),public +野外露出,public_nudity +公用|肉便器|公交车,public_use +撅起的嘴唇,puckered_lips +布丁,pudding +水坑,puddle +斜纹泡泡袖,puff_and_slash_sleeves +蓬蓬的袖子,puffy_detached_sleeves +蓬蓬的长袖,puffy_long_sleeves +勃起的乳头,puffy_nipples +蓬蓬的短袖,puffy_short_sleeves +蓬蓬的短裤,puffy_shorts +蓬蓬的袖子,puffy_sleeves +被另一个人拉(或拉衣物),pulled_by_another +拉下自己的衣物,pulled_by_self +拉,pulling +南瓜,pumpkin +南瓜发饰,pumpkin_hair_ornament +低胸装高跟鞋打扮,pumps +双关语,pun +挥拳,punching +木偶|手偶,puppet +小狗,puppy +紫框眼镜,purple-framed_eyewear +紫色背景,purple_background +紫色比基尼,purple_bikini +紫色紧身衣,purple_bodysuit +紫色蝴蝶结,purple_bow +紫色胸罩,purple_bra +紫色斗篷,purple_cape +紫色项圈,purple_choker +紫色外套,purple_coat +紫色连衣裙,purple_dress +紫色的眼睛,purple_eyes +紫色的花,purple_flower +紫色鞋类,purple_footwear +紫色手套,purple_gloves +紫色头发,purple_hair +紫色发带,purple_hairband +紫色的袴,purple_hakama +紫色头饰,purple_headwear +紫色夹克,purple_jacket +紫色和服,purple_kimono +紫色裤袜,purple_legwear +紫色连身衣,purple_leotard +紫色的嘴唇,purple_lips +紫色指甲,purple_nails +紫色领子,purple_neckwear +紫色内裤,purple_panties +紫色长裤,purple_pants +紫色丝带,purple_ribbon +紫玫瑰,purple_rose +紫色水手领,purple_sailor_collar +紫色围巾,purple_scarf +紫色发束,purple_scrunchie +紫色衬衫,purple_shirt +紫色短裤,purple_shorts +紫色皮肤,purple_skin +紫色裙子,purple_skirt +紫色的天空,purple_sky +紫色的袖子,purple_sleeves +紫色毛衣,purple_sweater +紫色泳衣,purple_swimsuit +紫色,purple_theme +紫色雨伞,purple_umbrella +紫色背心,purple_vest +紫色翅膀,purple_wings +推搡,pushing +小穴,pussy +爱液|淫水,pussy_juice +爱液流成的滩,pussy_juice_puddle +衣物上有爱液的水渍,pussy_juice_stain +拉丝的爱液,pussy_juice_trail +微微渗出的爱液,pussy_peek +清代官帽,qing_guanmao +四股辫子的发型,quad_tails +四分之一音符,quarter_note +皇后(国际象棋),queen_(chess) +羽毛笔,quill +箭筒,quiver +点兔兔之家制服,rabbit_house_uniform +浣熊耳朵,raccoon_ears +浣熊娘,raccoon_girl +浣熊尾巴,raccoon_tail +赛车女郎,race_queen +球拍,racket +辐射符号,radiation_symbol +收音机,radio +无线电天线,radio_antenna +连肩衣袖,raglan_sleeves +用手扶着,railing +轨道,railroad_tracks +雷门高中,raimon +雷门足球队服,raimon_soccer_uniform +雨,rain +彩虹,rainbow +彩虹背景,rainbow_background +彩虹发,rainbow_hair +彩虹色序排列,rainbow_order +雨衣,raincoat +扬起的眉毛,raised_eyebrow +扬起的双眉,raised_eyebrows +举起的拳头,raised_fist +拉面,ramen +波子汽水,ramune +小学书包,randoseru +日式英语,ranguage +强奸|轮奸|凌辱,rape +一副要强奸人的表情,rape_face +西洋剑,rapier +镭射枪,ray_gun +从背后伸手过来刺激性伴侣,reach-around +(有目的地)伸手,reaching +朝画外伸手,reaching_out +阅读,reading +准备拔刀的姿势,ready_to_draw +有具体现实的画,real_life_insert +画有现实世界地标,real_world_location +写实的,realistic +斜倒斜躺姿势,reclining +竖笛,recorder +●REC,recording +红框眼镜,red-framed_eyewear +红色盔甲,red_armor +红色背景,red_background +红色腰带,red_belt +红色比基尼,red_bikini +红色紧身衣,red_bodysuit +红色边框,red_border +红色蝴蝶结,red_bow +红色胸罩,red_bra +红色灯笼裤,red_buruma +红色斗篷,red_cape +红色披肩,red_capelet +红色项圈,red_choker +红色披风,red_cloak +红色外套,red_coat +红色颈圈,red_collar +红十字,red_cross +红色连衣裙,red_dress +红眼睛,red_eyes +红色的花,red_flower +红色鞋类,red_footwear +红色手套,red_gloves +红发,red_hair +红色发带,red_hairband +绯袴,red_hakama +红色头带,red_headband +红色头饰,red_headwear +红帽,red_hood +红色的角,red_horns +红色夹克,red_jacket +红色和服,red_kimono 
+红色裤袜,red_legwear +红色连身衣,red_leotard +朱唇,red_lips +红色唇膏,red_lipstick +血月,red_moon +红色指甲,red_nails +红色领子,red_neckwear +红色内裤,red_panties +红色裤子,red_pants +红色瞳孔,red_pupils +红色丝带,red_ribbon +红色绳子,red_rope +红玫瑰,red_rose +红色水手领,red_sailor_collar +红色束腰,red_sash +红色围巾,red_scarf +红色巩膜,red_sclera +红色发束,red_scrunchie +红色衬衫,red_shirt +红色短裤,red_shorts +红色皮肤,red_skin +红色裙子,red_skirt +红色的天空,red_sky +红色的袖子,red_sleeves +红色的毛衣,red_sweater +红色泳衣,red_swimsuit +红色|红色主题,red_theme +红色的伞,red_umbrella +红色背心,red_vest +红色翅膀,red_wings +反射|倒映,reflection +反光的眼睛,reflective_eyes +反光地板,reflective_floor +冰箱,refrigerator +(乳头)激凸,rei_no_himo +驯鹿,reindeer +驯鹿角,reindeer_antlers +缰绳,reins +改三|改二|改,remodel_(kantai_collection) +遥控器,remote_control +震动遥控器,remote_control_vibrator +餐厅,restaurant +拘束,restrained +改造(碧蓝航线),retrofit_(azur_lane) +暴露的穿着,revealing_clothes +逆兔女郎,reverse_bunnysuit +背面骑乘位,reverse_cowgirl_position +反握,reverse_grip +对原款服装反向使用布料,reverse_outfit +腿下握颈肛交,reverse_suspended_congress +男装|男装丽人,reverse_trap +背面座体位,reverse_upright_straddle +左轮手枪,revolver +棱纹紧身衣,ribbed_bodysuit +棱纹连衣裙,ribbed_dress +棱纹连身衣,ribbed_leotard +棱纹衬衫,ribbed_shirt +棱纹袖子,ribbed_sleeves +棱纹毛衣,ribbed_sweater +丝带|缎带,ribbon +缎带饰边短裙,ribbon-trimmed_skirt +缎带饰边袖子,ribbon-trimmed_sleeves +被丝带捆住|人捆成礼物的样子,ribbon_bondage +缎带项圈,ribbon_choker +嘴里叼着丝带,ribbon_in_mouth +缎带饰边,ribbon_trim +肋骨,ribs +米,rice +碗,rice_bowl +脸上沾着米饭,rice_on_face +盛饭勺,rice_spoon +骑,riding +马术马鞭,riding_crop +步枪,rifle +舰装,rigging +无框眼镜,rimless_eyewear +戒指,ring +将受虐者的嘴固定成O字,ring_gag +圈圈眼,ringed_eyes +垂下的长鬈发,ringlets +涟漪|波纹,ripples +旭日旗,rising_sun +笏板,ritual_baton +河,river +リヨ画风,riyo_(lyomsnpmp)_(style) +路,road +路标,road_sign +礼袍,robe +机器人,robot +机械耳,robot_ears +机械关节,robot_joints +岩石,rock +火箭,rocket +火箭筒,rocket_launcher +杆,rod +悔悟棒,rod_of_remorse +角色转换,role_reversal +旱冰鞋,roller_skates +翻白眼,rolling_eyes +罗马音,romaji_text +屋顶,rooftop +战车(国际象棋),rook_(chess) +房间,room +绳,rope +玫瑰|蔷薇,rose +玫瑰花瓣,rose_petals +玫瑰印花,rose_print +对称旋转|《少女革命》女同转圈圈,rotational_symmetry +圆底烧瓶,round-bottom_flask +圆形眼镜,round_eyewear +动漫里的没有缝的牙齿,round_teeth +圆窗,round_window +背景或画框是圆角,rounded_corners +橡胶靴,rubber_boots +浴缸小黄鸭,rubber_duck +揉眼睛,rubbing_eyes +瓦砾,rubble +红宝石,ruby_(gemstone) +舰C舰娘专用鞋,rudder_footwear +小地毯,rug +废墟,ruins +尺子,ruler +符文,runes +奔跑,running +俄语,russian_text +被重度(性)虐待的女性,ryona +陵樱学园制服,ryouou_school_uniform +军刀(武器),saber_(weapon) +麻布,sack +悲伤的,sad +马鞍,saddle +曲别针,safety_pin +(因俯身而)下垂的乳房,sagging_breasts +水手服,sailor +水手服款比基尼,sailor_bikini +水手领,sailor_collar +水手服款裙子,sailor_dress +水手帽,sailor_hat +水冰月重绘挑战,sailor_moon_redraw_challenge +美少女战士,sailor_senshi +美少女战士制服,sailor_senshi_uniform +水手服款衬衫,sailor_shirt +战斗服(龙珠),saiyan_armor +盃,sakazuki +酒,sake +酒瓶,sake_bottle +栅川中学校服,sakugawa_school_uniform +樱丘女子高等学校校服,sakuragaoka_high_school_uniform +唾液,saliva +唾液拉丝,saliva_trail +敬礼,salute +样图,sample +武士,samurai +沙,sand +沙雕,sand_sculpture +凉鞋,sandals +脱下凉鞋,sandals_removed +三明治,sandwich +三人夹三明治,sandwiched +铁血工造(少女前线),sangvis_ferri +三白眼,sanpaku +三花团子,sanshoku_dango +圣诞老人靴,santa_boots +圣诞老人服装,santa_costume +圣诞帽,santa_hat +蓝宝石,sapphire_(gemstone) +裹胸布,sarashi +莎笼|沙滩浴巾,sarong +饰带|日式束腰,sash +背包,satchel +茶杯托,saucer +桑德斯军装,saunders_military_uniform +桑德斯附高校服,saunders_school_uniform +香肠,sausage +纱绫形,sayagata +鞘,scabbard +鳞片,scales +伤痕,scar +眼睛上的疤痕,scar_across_eye +脸颊上的疤痕,scar_on_cheek +害怕的,scared +围巾,scarf +红魔馆,scarlet_devil_mansion +风景,scenery +笏,scepter +学校,school +书包,school_bag +课堂椅,school_chair +课桌,school_desk +通学帽,school_hat +校内泳衣|死库水,school_swimsuit +校服,school_uniform +近未来|sf|科幻,science_fiction +片太刀铗,scissor_blade +剪刀,scissors 
+小型摩托车,scooter +瞄准镜,scope +怒视|嫌弃的眼神|不满的眼神,scowl +划痕|皮肤上的浅伤,scratches +尖叫|大声喊,screaming +屏幕,screen +屏幕色调纹理滤镜,screentones +螺丝钉,screw +螺丝刀,screwdriver +卷轴,scroll +发束,scrunchie +大镰刀,scythe +密封罐|封印,seal +有接缝的裤袜,seamed_legwear +探照灯,searchlight +海贝,seashell +同季度动画角色模仿,season_connection +海藻,seaweed +诱人的微笑,seductive_smile +半透明的|可透过看到里层衣物|因沾湿而变透明,see-through +通过透光的织物看到人体轮廓,see-through_silhouette +半透的袖子,see-through_sleeves +青海波(日本古画素材),seigaiha +《星间飞行》,seikan_hikou +圣祥大附小学校校服,seishou_elementary_school_uniform +正坐,seiza +女性自慰,self_fondle +(同一人但不同版本或形象)自攻自受,selfcest +无上框眼镜,semi-rimless_eyewear +深棕色|乌贼墨色,sepia +水手服,serafuku +严肃的,serious +薮猫尾巴,serval_tail +节分习俗,setsubun +断下来的头,severed_head +缝纫,sewing +性交,sex +泛指从身后插入的体位,sex_from_behind +机械奸,sex_machine +性爱玩具,sex_toy +六人组,sextuplets +性暗示,sexually_suggestive +手铐|镣铐|枷锁,shackles +日荫,shade +阴沉脸,shaded_face +遮阳手势,shading_eyes +阴影,shadow +乐队礼帽,shako_cap +浅水,shallow_water +共浴|鸳鸯浴,shared_bathing +共享食物|用嘴递食物,shared_food +共用一条围巾,shared_scarf +相合伞,shared_umbrella +鲨鱼,shark +锐利的手指甲,sharp_fingernails +鲨鱼牙|锯齿牙,sharp_teeth +锐利的脚趾甲,sharp_toenails +刨冰,shaved_ice +披巾|披帛,shawl +刀鞘|护套,sheath +戴着护套的|入鞘的,sheathed +羊,sheep +羊耳,sheep_ears +羊女孩,sheep_girl +羊角,sheep_horns +透明的衣服,sheer_clothes +透明的裤袜,sheer_legwear +抓着床单,sheet_grab +乐谱,sheet_music +架子,shelf +贝壳,shell +贝壳比基尼,shell_bikini +弹药匣,shell_casing +柴犬,shiba_inu +绳缚,shibari +着衣绑缚,shibari_over_clothes +纸垂,shide +盾,shield +式神,shikigami +色纸,shikishi +注连绳|七五三绳,shimenawa +胫甲,shin_guards +竹刀,shinai +死后世界战线制服,shinda_sekai_sensen_uniform +二次元女体化梗,shindan_maker +新选组,shinsengumi +光泽,shiny +有光泽的衣服,shiny_clothes +有光泽的 头发,shiny_hair +闪光宝可梦,shiny_pokemon +有光泽的皮肤,shiny_skin +船,ship +衬衫,shirt +拉起掀起卷起衬衫,shirt_lift +拉开衬衫,shirt_pull +脱下衬衫,shirt_removed +衬衫外有其他衣服,shirt_tucked_in +扯住衬衫,shirt_tug +赤膊的(男性),shirtless +用脚趾吊着鞋,shoe_dangle +鞋底,shoe_soles +鞋带,shoelaces +鞋子,shoes +脱下的鞋,shoes_removed +流星,shooting_star +店铺,shop +购物,shopping +购物袋,shopping_bag +岸,shore +短裙,short_dress +豆豆眉|短眉毛,short_eyebrows +短发,short_hair +扎起或卷起一部分长发,short_hair_with_long_locks +短裤腿连体衣,short_jumpsuit +短下摆和服,short_kimono +短领带,short_necktie +内侧穿着长袖外面套着短袖衣服,short_over_long_sleeves +短马尾,short_ponytail +热裤,short_shorts +短袖,short_sleeves +短剑,short_sword +短双尾发型,short_twintails +短裤,shorts +褪下短裤,shorts_pull +短裙里穿着短裤,shorts_under_skirt +霰弹枪,shotgun +将棋,shougi +障子,shouji +少女祈祷中,shoujo_kitou-chuu +肩甲,shoulder_armor +挎包|单肩背包,shoulder_bag +肩胛骨,shoulder_blades +肩扛,shoulder_carry +露肩装,shoulder_cutout +肩带式枪套,shoulder_holster +垫肩,shoulder_pads +肩部纹身,shoulder_tattoo +喊叫,shouting +铲子,shovel +淋浴喷头,shower_head +淋浴,showering +展会女郎装束,showgirl_skirt +虾,shrimp +神社,shrine +只包肩的女式上衣,shrug_(clothing) +手里剑,shuriken +嘘(手势),shushing +秀知院学园制服,shuuchiin_academy_uniform +秀尽学院制服,shuujin_academy_uniform +害羞的,shy +兄弟姐妹,siblings +病,sick +镰刀,sickle +人物并排,side-by-side +侧边系带比基尼,side-tie_bikini +侧边系带连身衣,side-tie_leotard +侧边系带内裤,side-tie_panties +披在一侧的单条辫子,side_braid +披在两侧的两条辫子,side_braids +披在两侧的发髻,side_bun +侧边开口的服饰,side_cutout +侧马尾,side_ponytail +侧面有缝的服饰,side_slit +侧乳,sideboob +(男子的)鬓角,sideburns +侧面没有布料的服饰,sideless_outfit +披在耳前的发绺,sidelocks +侧坐在鞍上,sidesaddle +轻轻向侧面瞥,sideways_glance +帽舌划到侧面,sideways_hat +把嘴画在侧脸,sideways_mouth +画出叹气的标记,sigh +写上去的字,sign +签名,signature +默剧漫画,silent_comic +轮廓,silhouette +蛛网|虫茧,silk +银色的眼睛,silver_eyes +银发,silver_hair +银色镶边,silver_trim +朴素的背景,simple_background +简体字,simplified_chinese_text +唱歌,singing +露出单边肩膀,single_bare_shoulder +一只脚没穿靴子,single_boot +单股辫,single_braid +单边没脱掉的袖子,single_detached_sleeve +只一边戴着耳环,single_earring +单手戴着过肘的手套,single_elbow_glove +单手穿着护臂,single_gauntlet 
+单手穿着手套,single_glove +单侧进气口发型,single_hair_intake +单侧头上有翅膀,single_head_wing +单侧长着角,single_horn +单边穿着过膝服饰,single_kneehigh +单边穿着连裤袜,single_leg_pantyhose +单边长裤,single_pantsleg +一只脚没穿鞋子,single_shoe +单边没有袖管,single_sleeve +单边穿着长袜,single_sock +单边有肩带,single_strap +只有一条腿有穿着,single_thighhigh +露出单边服饰上的垂直条纹,single_vertical_stripe +单翼,single_wing +水槽,sink +姐妹,sisters +坐,sitting +坐在床上,sitting_on_bed +坐在课桌上,sitting_on_desk +颜面骑乘|阴部贴着脸,sitting_on_face +坐在大腿上,sitting_on_lap +坐在人身上,sitting_on_person +十六分音符,sixteenth_note +体格差,size_difference +滑板,skateboard +溜冰鞋,skates +骨架,skeleton +素描,sketch +素描本,sketchbook +签子,skewer +被皮肤包着的角,skin-covered_horns +把嘴的一部分画成虎牙状,skin_fang +衣服很紧显露出身体曲线或肢体特征,skin_tight +皮肤的凹陷,skindentation +很瘦的身材,skinny +短裙,skirt +一条腿上挂着短裙,skirt_around_one_leg +指掀起裙子时形成的篮子形状,skirt_basket +被掀起裙子(含突发情况意义),skirt_flip +优雅地提着裙子,skirt_hold +将裙子掀上去,skirt_lift +拉开裙子,skirt_pull +脱下的短裙,skirt_removed +与上衣搭配的短裙|特制的裙子套装,skirt_set +正装短裙,skirt_suit +扯住裙摆|按住裙摆,skirt_tug +骷髅,skull +头骨和交叉的骨头,skull_and_crossbones +骷髅耳环,skull_earrings +骷髅发饰,skull_hair_ornament +骷髅项链,skull_necklace +骷髅印花,skull_print +天空,sky +天际线,skyline +摩天大楼,skyscraper +掌印,slap_mark +掌掴,slapping +砍,slashing +奴隶,slave +睡奸,sleep_molestation +睡觉,sleeping +(不躺着)睡觉|直立睡觉,sleeping_upright +睡衣,sleepwear +犯困的,sleepy +袖口,sleeve_cuffs +无袖,sleeveless +无袖连衣裙,sleeveless_dress +无袖连帽衫,sleeveless_hoodie +无袖夹克,sleeveless_jacket +无袖和服,sleeveless_kimono +无袖衬衫,sleeveless_shirt +无袖高领毛衣,sleeveless_turtleneck +叠起来的袖子,sleeves_folded_up +袖子长过手指,sleeves_past_fingers +袖子长过手腕,sleeves_past_wrists +袖子往上拉起,sleeves_pushed_up +卷起的袖子,sleeves_rolled_up +移动门,sliding_doors +修长的腿,slim_legs +史莱姆,slime +史莱姆娘,slime_girl +枪支的吊带,sling +弹弓泳衣|吊带泳衣|V字泳衣,slingshot_swimsuit +拖鞋,slippers +猫瞳,slit_pupils +贫乳|娇小的乳房,small_breasts +娇小的乳头,small_nipples +智能手机,smartphone +有气味的,smell +闻,smelling +微笑|笑容,smile +笑脸,smiley_face +傻笑|得意的笑,smirk +烟,smoke +烟囱,smokestack +吸烟,smoking +枪口冒烟,smoking_gun +得意脸,smug +小吃|零食,snack +蜗牛,snail +蛇,snake +蛇发,snake_hair +蛇形发饰|蛙头发饰,snake_hair_ornament +蛇尾,snake_tail +运动鞋,sneakers +打喷嚏,sneezing +狙击步枪,sniper_rifle +潜水设备的通气管,snorkel +鼻涕,snot +动物口鼻部,snout +雪|雪原,snow +雪兔,snow_bunny +雪球,snowball +雪花发饰,snowflake_hair_ornament +雪花印花,snowflake_print +雪花,snowflakes +下雪,snowing +雪人,snowman +泡脚,soaking_feet +肥皂,soap +泡泡,soap_bubbles +肥皂泡马赛克,soap_censor +足球运动,soccer +足球,soccer_ball +足球服,soccer_uniform +袜子吊袜带,sock_garters +拉着袜子,sock_pull +袜子,socks +脱袜子,socks_removed +汽水,soda +汽水瓶,soda_bottle +汽水罐,soda_can +当世袖,sode +士兵,soldier +脚底,soles +实心圆眼睛,solid_circle_eyes +实心椭圆眼睛,solid_oval_eyes +单独人物,solo +以单独人物为焦点,solo_focus +歌名,song_name +大空卷轴,sorcerer's_sutra_scroll +灵魂之石,soul_gem +拟声词,sound_effects +汤,soup +苏联,soviet +宇宙,space +宇宙船,space_craft +宇航服,spacesuit +黑桃,spade_(shape) +意大利面,spaghetti +细肩带|细吊带|肩带式,spaghetti_strap +弹性纤维(衣物面料),spandex +被拍打过的屁股,spanked +打屁股,spanking +菱形或星形的闪光,sparkle +闪着星光的背景,sparkle_background +星星眼,sparkling_eyes +火星子,sparks +麻雀,sparrow +铲(厨具),spatula +护肩,spaulders +发言者,speaker +矛,spear +神枪朗基努斯,spear_the_gungnir +雪中打伞的恋人梗,special_feeling_(meme) +对话框|会话气泡,speech_bubble +速度线,speed_lines +魔咒卡,spell_card +法球|水晶球,sphere +蜘蛛,spider +蜘蛛娘,spider_girl +彼岸花,spider_lily +蜘蛛网,spider_web +蛛网印花,spider_web_print +带钉手镯,spiked_bracelet +狼牙棒,spiked_club +带钉项圈,spiked_collar +刺刺的头发,spiked_hair +带刺的龟壳,spiked_shell +带刺的尾巴,spiked_tail +尖刺|带钉的,spikes +液体从容器中漫出来,spill +从容器中倒出液体的动作,spilling +脊柱,spine +有中心的旋转,spinning +漩涡眼,spiral_eyes +还魂|伴灵|精神体,spirit +惊讶或无语到喷了,spit_take +女性的嘴和下半身被夹攻的体位,spitroast +从嘴里吐出液体的动作,spitting +飞溅,splashing +水球小绿(喷射战士),splattershot_(splatoon) +分开,split +尾部散开的单马尾发型,split_ponytail 
+分屏,split_screen +表发怒的会话气泡,spoken_anger_vein +表脸红害羞的会话气泡,spoken_blush +会话气泡里带省略号,spoken_ellipsis +会话气泡里带感叹号,spoken_exclamation_mark +会话气泡里带爱心,spoken_heart +会话气泡里出现"!?",spoken_interrobang +表哼歌的会话气泡,spoken_musical_note +会话气泡里出现指代物,spoken_object +表疑惑的会话气泡,spoken_question_mark +表不满和发牢骚的会话气泡,spoken_squiggle +表(流汗)无语的会话气泡,spoken_sweatdrop +海绵|搓澡棉,sponge +汤勺,spoon +嘴里含着勺子,spoon_in_mouth +运动比基尼,sports_bikini +运动内衣,sports_bra +运动服,sportswear +在黑白的画中特意少量使用一些颜色,spot_color +聚光灯,spotlight +掰开肛门,spread_anus +伸展双臂,spread_arms +摊开屁股,spread_ass +张开手指,spread_fingers +张开腿|M字张腿|桃色蹲姿|V字张腿,spread_legs +掰开阴部|掰穴|掰开的阴部,spread_pussy +隔着衣物掰穴,spread_pussy_under_clothes +张开脚趾,spread_toes +调教用的撑杆,spreader_bar +春季,spring_(season) +大葱,spring_onion +方块四格漫画,square_4koma +蹲|工口蹲,squatting +乌贼|鱿鱼,squid +乌贼军团(喷射战士),squidbeak_splatoon +表不满和发牢骚的符号,squiggle +松鼠,squirrel +松鼠耳朵,squirrel_ears +松鼠尾巴,squirrel_tail +圣葛罗莉安娜女学园校徽,st._gloriana's_(emblem) +圣葛罗莉安娜军装,st._gloriana's_military_uniform +圣葛罗莉安娜女学园校服,st._gloriana's_school_uniform +工作人员,staff +工作人员(音乐),staff_(music) +舞台,stage +舞台灯光,stage_lights +污渍,stain +彩色玻璃,stained_glass +台阶,stairs +替身(JOJO),stand_(jojo) +站立的,standing +站在水上或液体上,standing_on_liquid +单腿站立,standing_on_one_leg +站立后入,standing_sex +站立劈叉|站立高抬腿,standing_split +星星眼,star-shaped_pupils +星星(天空中的),star_(sky) +星形,star_(symbol) +星形项圈,star_choker +星形耳环,star_earrings +星之守护者(英雄联盟),star_guardian_(league_of_legends) +星形发饰,star_hair_ornament +星星眼,star_in_eye +星形项链,star_necklace +六芒星,star_of_david +星形乳贴,star_pasties +星形图案|星形印花,star_print +海星,starfish +凝视|盯,staring +星光学园制服,starlight_academy_uniform +星空背景,starry_background +星空,starry_sky +星空印花,starry_sky_print +宝可梦御三家,starter_pokemon +三只宝可梦御三家,starter_pokemon_trio +用装置强行固定人体,stationary_restraints +属性栏|状态表,stats +雕像,statue +隐奸,stealth_sex +蒸汽,steam +蒸汽马赛克|雾气马赛克,steam_censor +发情,steaming_body +蒸汽朋克,steampunk +指尖抵着指间,steepled_fingers +方向盘,steering_wheel +踩着,stepped_on +听诊器,stethoscope +木条|木棍样式的东西,stick +海报,stick_poster +贴纸,sticker +女侍从的制服,stile_uniform +细跟高跟鞋,stiletto_heels +马镫袜,stirrup_legwear +缝合的缝线,stitches +小腹,stomach +鼓起的小腹,stomach_bulge +腹部纹身,stomach_tattoo +石头,stone +石灯笼,stone_lantern +凳子,stool +店面,storefront +灶台,stove +跨坐,straddling +直发,straight_hair +拘束服,straitjacket +绞首,strangling +大件物品的吊带,strap +戴着假阴茎,strap-on +勒在乳房之间的带子,strap_between_breasts +泛指覆盖面少的吊带胸罩,strap_gap +往上掰衣物的吊带,strap_lift +衣物吊带滑落(导致走光),strap_slip +无肩带款式,strapless +无肩带比基尼,strapless_bikini +无肩带胸罩,strapless_bra +无肩带连衣裙,strapless_dress +无肩带连身衣,strapless_leotard +带束带的高跟鞋,strappy_heels +草帽,straw_hat +草莓,strawberry +草莓发饰,strawberry_hair_ornament +草莓内裤,strawberry_panties +草莓印花,strawberry_print +草莓蛋糕|镶草莓的糕点,strawberry_shortcake +沾着阴毛|掉落的阴毛,stray_pubic_hair +有其他颜色条纹的头发,streaked_hair +流动的,stream +流泪,streaming_tears +街道,street +做伸展运动|伸懒腰,stretch +强袭兵装,striker_unit +绳子,string +系绳比基尼,string_bikini +命运的红线,string_of_fate +彩旗串,string_of_flags +系绳内裤,string_panties +条纹样式的,striped +条纹背景,striped_background +条纹比基尼,striped_bikini +条纹蝴蝶结,striped_bow +条纹胸罩,striped_bra +条纹连衣裙,striped_dress +条纹手套,striped_gloves +条纹连帽衫,striped_hoodie +条纹和服,striped_kimono +条纹裤袜,striped_legwear +条纹领子,striped_neckwear +条纹内裤,striped_panties +条纹裤,striped_pants +条纹缎带,striped_ribbon +条纹围巾,striped_scarf +条纹衬衫,striped_shirt +条纹短裤,striped_shorts +条纹裙,striped_skirt +条纹袖子,striped_sleeves +条纹毛衣,striped_sweater +条纹泳衣,striped_swimsuit +条纹尾巴,striped_tail +钢管舞用的钢管,stripper_pole +胡茬,stubble +壁尻,stuck +耳钉,stud_earrings +镶钉皮带,studded_belt +毛绒玩具,stuffed_animal +兔子玩偶,stuffed_bunny +填料布偶玩具,stuffed_toy +画风模仿,style_parody +数绘笔,stylus +冲锋枪,submachine_gun +被淹没的,submerged 
+带字幕的,subtitled +魅魔,succubus +触手吸盘,suction_cups +方糖,sugar_cube +暗示性的白色液体,suggestive_fluid +西装,suit +西装外套,suit_jacket +手提箱,suitcase +白泽球,sukusuku_hakutaku +夏天,summer +夏祭,summer_festival +太阳,sun +太阳帽,sun_hat +阳光,sunbeam +日出时突现的光照,sunburst +圣代,sundae +太阳裙|背心裙,sundress +向日葵,sunflower +向日葵发饰,sunflower_hair_ornament +墨镜,sunglasses +阳光,sunlight +日出,sunrise +防晒霜,sunscreen +日落,sunset +库巴大王梗,super_crown +超级机器人,super_robot +超级赛亚人,super_saiyan +超级水枪,super_soaker +超级英雄,superhero +消音器,suppressor +苏尔外套,surcoat +冲浪板,surfboard +医用口罩,surgical_mask +惊喜之吻|意外之吻,surprise_kiss +惊讶,surprised +怪诞的|超现实主义,surreal +调查兵团徽章,survey_corps_(emblem) +寿司,sushi +火车便当式体位,suspended_congress +吊带短裤,suspender_shorts +吊带裙|挂肩宽阔外长裤,suspender_skirt +背带,suspenders +悬吊绑缚,suspension +万字符,swastika +汗水,sweat +吸汗带,sweatband +汗珠,sweatdrop +毛衣,sweater +束在腰上的毛衣,sweater_around_waist +毛衣连衣裙,sweater_dress +毛衣背心,sweater_vest +大汗淋漓,sweating_profusely +长袖运动卫衣,sweatshirt +甘薯,sweet_potato +糖果,sweets +扫把状刘海,swept_bangs +泳裤,swim_briefs +泳帽,swim_cap +泳裤,swim_trunks +游泳,swimming +泳装,swimsuit +把泳衣的裆部挪到一边,swimsuit_aside +常夏的泳衣(fgo学妹灵衣),swimsuit_of_perpetual_summer +扯下泳衣,swimsuit_pull +里面穿着泳衣,swimsuit_under_clothes +泳衣,swimwear +摆动,swing +漩涡棒棒糖,swirl_lollipop +剑,sword +绯想之剑,sword_of_hisou +靠在肩上的剑,sword_over_shoulder +符号形瞳孔,symbol-shaped_pupils +眼睛里有符号,symbol_in_eye +胸部贴着胸部,symmetrical_docking +左右对称,symmetry +注射器,syringe +T恤,t-shirt +搭肩衫,tabard +足袋,tabi +桌子,table +台布,tablecloth +平板电脑,tablet_pc +立绘,tachi-e +尾巴,tail +尾巴系着铃铛,tail_bell +尾巴蜷到两腿之间,tail_between_legs +尾巴系着蝴蝶结,tail_bow +尾巴遮住三点|用尾巴打码,tail_censor +抓着尾巴,tail_grab +抱着尾巴,tail_hug +尾巴系着缎带,tail_ribbon +摇尾巴,tail_wagging +燕尾服,tailcoat +鲷鱼烧,taiyaki +选哪个好呢,take_your_pick +拍照|自拍,taking_picture +章鱼烧,takoyaki +画符,talisman +有台词的,talking +打电话,talking_on_phone +计数|画正字,tally +鸟爪,talons +苏格兰平顶帽,tam_o'_shanter +手鼓,tambourine +小麦色(皮肤),tan +棕褐色背景,tan_background +七夕,tanabata +唐装,tangzhuang +坦克,tank +无袖紧身背心,tank_top +坦基尼,tankini +短刀,tantou +狸,tanuki +长条诗笺,tanzaku +胶布,tape +嘴里贴着胶布并被绑起来,tape_gag +貘尾,tapir_tail +无辜的眼神|眼神柔弱,tareme +塔罗牌,tarot +流苏,tassel +襷,tasuki +榻榻米,tatami +乌纱,tate_eboshi +纹身,tattoo +绷紧的衣服,taut_clothes +绷紧的连衣裙,taut_dress +绷紧的衬衫,taut_shirt +茶,tea +茶具,tea_set +教师,teacher +茶杯,teacup +⑨之组,team_9 +火箭队,team_rocket +团队合作|多名女性服侍,teamwork +茶壶,teapot +泪珠,teardrop +撕破衣服,tearing_clothes +要哭的表情,tearing_up +眼泪,tears +戏弄,teasing +泰迪熊,teddy_bear +青少年|青少年化,teenage +牙齿,teeth +来自tegaki博客,tegaki +念力,telekinesis +弱电柱|通信电塔,telephone_pole +望远镜,telescope +电视,television +泰事达牌足球,telstar +天妇罗,tempura +天狗,tengu +天狗木屐,tengu-geta +天狗面具,tengu_mask +网球,tennis_ball +网球拍,tennis_racket +网球衫,tennis_uniform +帐篷,tent +触手头发,tentacle_hair +触手奸,tentacle_sex +触手,tentacles +触手伸进衣服里,tentacles_under_clothes +晴天娃娃,teruterubouzu +EVA驾驶服试作款,test_plugsuit +试管,test_tube +睾丸,testicles +文字元素,text_focus +纯文字的图,text_only_page +泰文,thai_text +感谢绘,thank_you +趴在地上并翘起脚,the_pose +盗窃,theft +突出主题,themed_object +体温计,thermometer +魔法瓶,thermos +之后干了个爽,they_had_lots_of_sex_afterwards +浓眉,thick_eyebrows +肉腿,thick_thighs +高帮靴,thigh_boots +大腿之间(绝对领域),thigh_gap +抓着大腿,thigh_grab +腿上系着带子或工具包或枪套,thigh_holster +腿上绑着工具包,thigh_pouch +大腿系着缎带,thigh_ribbon +素股|腿夹,thigh_sex +大腿缠着带子,thigh_strap +有腿环的裤袜,thighband_pantyhose +含过膝袜和裤袜,thighhighs +拉着长袜|拉着过膝袜|拉着裤袜,thighhighs_pull +长袜、过膝袜等外面再穿着靴子,thighhighs_under_boots +大腿,thighlet +大腿,thighs +思考,thinking +第三只眼,third_eye +丁字内裤,thong +丁字式比基尼,thong_bikini +丁字式连身衣,thong_leotard +荆棘,thorns +体现内心想法的会话气泡,thought_bubble +线(物体),thread +立体机动装置,three-dimensional_maneuver_gear +3P,threesome +宝座,throne +隔着衣物爱抚,through_clothes +壁尻,through_wall +推进器,thrusters 
+拇指指环,thumb_ring +竖大拇指,thumbs_up +皇冠头饰,tiara +挠痒痒,tickling +领带夹,tie_clip +扎好的头发,tied_hair +把衣服(衣角和下摆)打结,tied_shirt +捆绑,tied_up +分层托盘架,tiered_tray +老虎,tiger +虎耳,tiger_ears +虎纹,tiger_print +虎尾,tiger_tail +衣物紧紧的,tight +紧身裤,tight_pants +瓷砖,tiles +歪着的头饰,tilted_headwear +不在同时间线的角色,time_paradox +时间戳,timestamp +有色眼镜,tinted_eyewear +踮起脚尖,tiptoes +轮胎,tire +纸巾|手帕,tissue +纸巾盒,tissue_box +标题,title +模仿标题风格,title_parody +烤吐司,toast +叼着面包,toast_in_mouth +趾尖|脚尖,toe-point +蜷着脚趾,toe_scrunch +露趾靴,toeless_boots +露趾裤袜,toeless_legwear +指甲油,toenail_polish +脚趾甲,toenails +脚趾,toes +厕所|坐便器,toilet +厕纸,toilet_paper +上厕所,toilet_use +头襟|兜巾,tokin_hat +时定高校校服,tokisadame_school_uniform +常盘台中学校服,tokiwadai_school_uniform +清酒瓶,tokkuri +番茄,tomato +假小子,tomboy +墓碑,tombstone +巴纹,tomoe_(symbol) +友枝小学校服,tomoeda_elementary_school_uniform +健壮的,toned +健美男,toned_male +东方棍,tonfa +舌头,tongue +吐舌头,tongue_out +打舌钉,tongue_piercing +太多了,too_many +工具,tools +牙,tooth +牙刷,toothbrush +脸朝下屁股朝上,top-down_bottom-up +礼帽,top_hat +赤裸上身,topless +火炬,torch +鸟居,torii +破损的紧身衣,torn_bodysuit +破损的衣服,torn_clothes +破损的手套,torn_gloves +破损的牛仔裤,torn_jeans +破损的裤袜,torn_legwear +破损的连身衣,torn_leotard +破损的内裤,torn_panties +破损的裤子,torn_pants +破损的围巾,torn_scarf +破损的衬衫,torn_shirt +破损的短裤,torn_shorts +破损的袖子,torn_sleeves +破损的泳衣,torn_swimsuit +淫荡的表情,torogao +鱼雷,torpedo +抓着躯干,torso_grab +拷问,torture +毛巾,towel +披着毛巾的脖子,towel_around_neck +塔,tower +小镇,town +玩具,toy +运动夹克,track_jacket +运动裤,track_pants +运动服,track_suit +集换卡片,trading_card +繁体字,traditional_chinese_text +传统服装|民族服装,traditional_clothes +纯手绘,traditional_media +路障,traffic_cone +信号灯,traffic_light +列车|电车,train +列车内部|电车内部,train_interior +车站,train_station +运动内衣,training_bra +训练兵团徽章,training_corps_(emblem) +角色特征有关联,trait_connection +变形|变型,transformation +透明的,transparent +透明背景,transparent_background +透明雨伞,transparent_umbrella +垃圾桶,trash_can +托盘,tray +宝箱,treasure_chest +高音符号,treble_clef +树,tree +树枝,tree_branch +树荫,tree_shade +树桩,tree_stump +一种女性展示臀部的姿势,trefoil +颤抖,trembling +长摆风衣,trench_coat +绺发发带,tress_ribbon +有三股辫子,tri_tails +道馆馆主,trial_captain +三角形,triangle +三角嘴,triangle_mouth +三角头巾,triangular_headpiece +磨豆腐,tribadism +部落,tribal +不给糖就捣蛋,trick_or_treat +紫绿相间的胶衣比基尼,trick_or_treatment +三叉戟,trident +三角神力,triforce +手指没放在扳机上,trigger_discipline +八卦,trigram +三穴插入,triple_penetration +被绊倒,tripping +嚣张脸,troll_face +恶搞行为,trolling +热带饮料|热带风情鸡尾酒,tropical_drink +小洋号,trumpet +真相吐槽,truth +时空管理局军服,tsab_ground_military_uniform +傲娇,tsundere +吊梢眼,tsurime +管子,tube +直筒式抹胸,tubetop +郁金香,tulip +束腰外衣,tunic +穆斯林头巾,turban +脸色苍白,turn_pale +前后身对比图,turnaround +舰炮,turret +龟,turtle +高领服饰,turtleneck +高领毛衣,turtleneck_sweater +象牙,tusks +男式晚间小礼服,tuxedo +黄昏,twilight +两条辫子,twin_braids +双钻头发型,twin_drills +双胞胎调情|双胞胎乱伦,twincest +双胞胎,twins +双马尾,twintails +躯干夸张地扭曲,twisted_torso +抽搐,twitching +推特用户名,twitter_username +用双脚腿交,two-footed_footjob +用双手,two-handed +用双手手交,two-handed_handjob +双层样式的服饰画法,two-sided_fabric +双色调背景,two-tone_background +双色头发,two-tone_hair +双色发带,two-tone_hairband +双色皮肤,two-tone_skin +小型双股辫,two_side_up +有两条尾巴,two_tails +打结(动作),tying +扎头发,tying_hair +印刷错误,typo +雄英高中校服,u.a._school_uniform +u_u,u_u +白无垢|日式嫁衣,uchikake +团扇,uchiwa +不明飞行物,ufo +丑男,ugly_man +超级怪兽,ultra_beast +雨伞,umbrella +左右乳房动态不一,unaligned_breasts +解开纽扣后,unbuttoned +解开纽扣的衬衫,unbuttoned_shirt +无修正,uncensored +失神,unconscious +无上框眼镜,under-rim_eyewear +盖着被子的,under_covers +钻进被炉的,under_kotatsu +在桌子下面,under_table +下乳|南半球,underboob +衣服有下半乳开口,underboob_cutout +勒着下胸围的衣服,underbust +帽盔式发型,undercut +汗衫,undershirt +尺寸过小的衣服,undersized_clothes +水下,underwater +内衣,underwear +只穿着内衣,underwear_only +解开领带,undone_necktie 
+脱衣,undressing +大小眼,uneven_eyes +两股辫子大小不一,uneven_twintails +独角兽,unicorn +制服,uniform +米字旗,union_jack +弹力紧身衣,unitard +连续重复的花纹,unmoving_pattern +拔出鞘的,unsheathing +解开的衣物|没穿好的衣物,untied +解开的比基尼,untied_bikini +解开的衬衫,untucked_shirt +解开的,untying +拉开拉链的,unzipped +拉开拉链(动作),unzipping +头发往上蜷的发型,updo +上半身,upper_body +露出上排牙齿,upper_teeth +对坐体位,upright_straddle +从衬衫下方瞥,upshirt +从裙底瞥,upshorts +倒挂的,upside-down +从人的下方往上瞥,upskirt +浦之星女学院校服,uranohoshi_school_uniform +尿道,urethra +尿道插入,urethral_insertion +用过的避孕套,used_condom +用过的纸巾,used_tissue +俄罗斯冬帽,ushanka +子宫,uterus +战术皮带,utility_belt +小舌头|口盖垂|悬雍垂,uvula +室内鞋,uwabaki +比V|剪刀手,v +v字领,v-neck +眉毛皱成V形(表高傲或愤怒),v-shaped_eyebrows +拘谨的手势,v_arms +在眼边比剪刀手,v_over_eye +吸尘器,vacuum_cleaner +阴道,vaginal +阴道异物插入,vaginal_object_insertion +情人节,valentine +女武神,valkyrie +下臂护甲,vambraces +吸血鬼,vampire +远景透视画法,vanishing_point +强化服,varia_suit +可变形战机|超时空要塞机体,variable_fighter +(变装、变化、变色、成长)对比图,variations +花瓶,vase +瓦斯塔亚人,vastaya +跳马(器材),vaulting_horse +蔬菜,vegetable +载具的特写,vehicle_focus +面纱,veil +静脉|血管凸起,veins +静脉凸起的阴茎,veiny_penis +自动售货机,vending_machine +♀,venus_symbol +外廊|缘侧,veranda +竖条纹比基尼,vertical-striped_bikini +竖条纹连衣裙,vertical-striped_dress +竖条纹裤袜,vertical-striped_legwear +竖条纹内裤,vertical-striped_panties +竖条纹衬衫,vertical-striped_shirt +竖条纹裙子,vertical-striped_skirt +竖条纹,vertical_stripes +黑皮,very_dark_skin +很长的头发,very_long_hair +很短的头发,very_short_hair +背心,vest +试管,vial +跳蛋,vibrator +大腿贴着跳蛋控制器,vibrator_in_thighhighs +塞着跳蛋且穿着衣服,vibrator_under_clothes +塞着跳蛋且穿着内裤,vibrator_under_panties +电子游戏,video_game +相机取景框,viewfinder +藤曼,vines +小提琴,violin +处男杀手服,virgin_killer_outfit +处男杀手毛衣,virgin_killer_sweater +虚拟Youtuber,virtual_youtuber +被画出来的空气,visible_air +面罩|遮阳帽舌|遮阳板,visor +遮阳帽,visor_cap +视觉小说,visual_novel +V家歌手的头戴设备,vocaloid_append +薄纱|巴里纱,voile +排球,volleyball +排球服,volleyball_uniform +丸吞,vore +弱音(歌手),voyakiloid +偷窥癖,voyeurism +手指比W,w +双臂摆出W,w_arms +和风萝莉,wa_lolita +和风女仆,wa_maid +涉水,wading +充气泳池,wading_pool +日式点心,wagashi +女服务员围裙,waist_apron +系在前腰的蝴蝶结,waist_bow +穿在腰部的小披风,waist_cape +马甲,waistcoat +服务员,waiter +女服务员,waitress +起床,waking_up +日本短剑|介错刀,wakizashi +进错房间|意外进来看到了不该看的,walk-in +走路,walking +在液体上行走,walking_on_liquid +墙,wall +文字背景,wall_of_text +壁咚,wall_slam +钱包,wallet +魔杖,wand +战争,war +穿衣方式错了,wardrobe_error +衣服出了意外(如崩开),wardrobe_malfunction +鸭子坐,wariza +军舰,warship +洗涤,washing +手表,watch +水,water +水瓶,water_bottle +水滴,water_drop +水枪,water_gun +水彩画风,watercolor_(medium) +水彩笔作画,watercolor_pencil_(medium) +船艇,watercraft +瀑布,waterfall +水壶,watering_can +水印,watermark +西瓜,watermelon +西瓜棒,watermelon_bar +双腿交叉站姿,watson_cross +波浪,waves +挥手,waving +挥舞着手臂,waving_arms +卷发,wavy_hair +薯片嘴型,wavy_mouth +武器|兵器,weapon +背上的武器,weapon_on_back +肩上的武器,weapon_over_shoulder +网址,web_address +结婚|婚礼,wedding +左手无名指,wedding_band +婚纱,wedding_dress +结婚戒指,wedding_ring +嵌入(拉裆部衣物所致),wedgie +秤,weighing_scale +人物在意体重,weight_conscious +狼人,werewolf +西部剧,western +湿透的|沾湿的,wet +湿衣服,wet_clothes +湿头发,wet_hair +湿内裤,wet_panties +湿衬衫,wet_shirt +湿泳衣,wet_swimsuit +潜水服,wetsuit +鲸鱼,whale +作者犯病,what +另一条世界线,what_if +车轮,wheel +独轮小推车,wheelbarrow +轮椅,wheelchair +把梗元素藏在图里,when_you_see_it +鞭子,whip +鞭痕,whip_marks +生奶油,whipped_cream +搅拌,whisk +脸颊有胡须状痕迹(如狐妖脸上),whisker_markings +动物胡须|络腮胡,whiskers +讲悄悄话,whispering +口哨,whistle +脖子上挂着口哨,whistle_around_neck +白框眼镜,white-framed_eyewear +白色围裙,white_apron +白色背景,white_background +白色腰带,white_belt +白色比基尼,white_bikini +白色灯笼裤,white_bloomers +白色上衣,white_blouse +白色紧身衣,white_bodysuit +白色边框,white_border +白色蝴蝶结,white_bow +白色胸罩,white_bra +白色女式背心,white_camisole +白色斗篷,white_cape +白色披风,white_capelet +白色开襟衫,white_cardigan +白猫,white_cat 
+白色项圈,white_choker +白色外套,white_coat +白色领子,white_collar +白色情人节,white_day +白色连衣裙,white_dress +白色的眼睛,white_eyes +白色的花,white_flower +白色鞋类,white_footwear +白手套,white_gloves +白发,white_hair +白色发带,white_hairband +白色头带,white_headband +白色头饰,white_headwear +白色连帽衫,white_hoodie +白色夹克,white_jacket +白色和服,white_kimono +白丝袜,white_legwear +白色连身衣,white_leotard +白色指甲,white_nails +白色领子,white_neckwear +白色轮廓,white_outline +白色内裤,white_panties +白色裤子,white_pants +白色瞳孔,white_pupils +白色丝带,white_ribbon +白玫瑰,white_rose +白色水手领,white_sailor_collar +白色围巾,white_scarf +白色发束,white_scrunchie +白色水手服,white_serafuku +白衬衫,white_shirt +白色短裤,white_shorts +白色的皮肤,white_skin +白色短裙,white_skirt +白色袖子,white_sleeves +白色西装,white_suit +白色毛衣,white_sweater +白色泳衣,white_swimsuit +白色无袖紧身背心,white_tank_top +白色主题,white_theme +白虎(动物),white_tiger +白色毛巾,white_towel +白色背心,white_vest +白色翅膀,white_wings +邪龙之魔女ver新宿1999,wicked_dragon_witch_ver._shinjuku_1999 +睁大眼睛,wide-eyed +宽脸化,wide_face +沙漏型身材|安产型,wide_hips +宽马尾,wide_ponytail +广角镜头,wide_shot +宽大的袖子,wide_sleeves +丰盈的大腿,wide_spread_legs +百合夫妇,wife_and_wife +假发,wig +尤指女性刚性交时畏缩的表情,wince +风,wind +风铃,wind_chime +起风效果|上升气流,wind_lift +发条钥匙,winding_key +风车,windmill +窗,window +透过窗户照进来的光,window_shade +套着画框(背景),windowboxed +风向带|鲤鱼旗,windsock +葡萄酒,wine +酒瓶,wine_bottle +红酒杯,wine_glass +燕子领,wing_collar +带翅膀的鞋子,winged_footwear +带翅膀的头盔,winged_helmet +翅膀,wings +冬季,winter +冬装,winter_clothes +擦眼泪,wiping_tears +电线|数据线,wire +紫藤,wisteria +魔女,witch +魔女(魔法少女小圆),witch_(madoka_magica) +女巫帽,witch_hat +法师帽,wizard_hat +狼,wolf +狼耳朵,wolf_ears +狼女孩,wolf_girl +狼尾巴,wolf_tail +木材,wood +木桶,wooden_bucket +木地板,wooden_floor +木剑,wooden_sword +木墙,wooden_wall +第二次世界大战,world_war_ii +花环,wreath +扳手,wrench +摔角(运动),wrestling +摔角服,wrestling_outfit +摔角场,wrestling_ring +拧干衣服,wringing_clothes +和袖子分开的手腕的袖口,wrist_cuffs +抓着手腕,wrist_grab +腕带,wrist_ribbon +束腕,wrist_scrunchie +护腕(布类),wrist_wrap +腕带,wristband +手表,wristwatch +写作,writing +脚部作画错误,wrong_feet +X3,x3 +透视,x-ray +X形发饰,x_hair_ornament +X_X,x_x +XD,xd +箭羽纹,yagasuri +山形花车,yamakasa +病娇,yandere +㚻|BL|腐|腐向,yaoi +毛线,yarn +毛线球,yarn_ball +八十神高中校服,yasogami_school_uniform +打哈欠,yawning +狗年,year_of_the_dog +猪年,year_of_the_pig +鼠年,year_of_the_rat +鸡年,year_of_the_rooster +黄框眼镜,yellow-framed_eyewear +黄色围裙,yellow_apron +黄色背景,yellow_background +黄色比基尼,yellow_bikini +黄色紧身衣,yellow_bodysuit +黄色的蝴蝶结,yellow_bow +黄色胸罩,yellow_bra +黄色项圈,yellow_choker +黄色连衣裙,yellow_dress +黄色的眼睛,yellow_eyes +黄色的花,yellow_flower +黄色鞋类,yellow_footwear +黄色手套,yellow_gloves +黄色发带,yellow_hairband +黄色头饰,yellow_headwear +黄色夹克,yellow_jacket +黄色和服,yellow_kimono +黄色裤袜,yellow_legwear +黄色连身衣,yellow_leotard +黄色指甲,yellow_nails +黄色领子,yellow_neckwear +黄色内裤,yellow_panties +黄丝带,yellow_ribbon +黄玫瑰,yellow_rose +黄色围巾,yellow_scarf +黄色巩膜,yellow_sclera +黄色发束,yellow_scrunchie +黄色衬衫,yellow_shirt +黄色短裤,yellow_shorts +黄色皮肤,yellow_skin +黄色裙子,yellow_skirt +黄色毛衣,yellow_sweater +黄色主题,yellow_theme +黄色背心,yellow_vest +黄色的翅膀,yellow_wings +对,yes +Yes/No枕头,yes-no_pillow +太极图,yin_yang +侧身坐,yokozuwari +约德尔人,yordle +使用方法错误,you're_doing_it_wrong +要被强奸的flag,you_gonna_get_raped +妖怪,youkai +幼儿化,younger +弽,yugake +浴衣,yukata +油库里(网络表情),yukkuri_shiteitte_ne +茶碗,yunomi +百合,yuri +Z手环,z-ring +褥垫,zabuton +Zero装束,zero_suit +绝对领域,zettai_ryouiki +拉链,zipper +拉链的拉片,zipper_pull_tab +僵尸,zombie +僵尸姿势,zombie_pose +人物立绘缩放(剪影)图层,zoom_layer +日式草鞋,zouri diff --git a/gf/sdxl-webui-refiner-fixed/.gitattributes b/gf/sdxl-webui-refiner-fixed/.gitattributes new file mode 100644 index 0000000000000000000000000000000000000000..dfe0770424b2a19faf507a501ebfc23be8f54e7b --- /dev/null +++ 
b/gf/sdxl-webui-refiner-fixed/.gitattributes
@@ -0,0 +1,2 @@
+# Auto detect text files and perform LF normalization
+* text=auto
diff --git a/gf/sdxl-webui-refiner-fixed/.gitignore b/gf/sdxl-webui-refiner-fixed/.gitignore
new file mode 100644
index 0000000000000000000000000000000000000000..e43b0f988953ae3a84b00331d0ccf5f7d51cb3cf
--- /dev/null
+++ b/gf/sdxl-webui-refiner-fixed/.gitignore
@@ -0,0 +1 @@
+.DS_Store
diff --git a/gf/sdxl-webui-refiner-fixed/README.md b/gf/sdxl-webui-refiner-fixed/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..b69235f55860e7156ac462f4ed561b81509b4a44
--- /dev/null
+++ b/gf/sdxl-webui-refiner-fixed/README.md
@@ -0,0 +1,52 @@
+# SDXL Refiner fixed (stable-diffusion-webui Extension)
+## Extension for integration of the SDXL refiner into Automatic1111
+
+This extension makes the SDXL Refiner available in [Automatic1111 stable-diffusion-webui](https://github.com/AUTOMATIC1111/stable-diffusion-webui).
+
+The implementation follows the approach [described by Stability AI](https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0): *an ensemble of experts pipeline for latent diffusion: In a first step, the base model is used to generate (noisy) latents, which are then further processed with a refinement model specialized for the final denoising steps.*
+
+This extension is heavily based on [sd-webui-refiner](https://github.com/wcde/sd-webui-refiner), so thanks go to [wcde](https://github.com/wcde) for his great work.
+
+Nevertheless, this extension has some modifications and enhancements over sd-webui-refiner:
+
+* a simplified UI to avoid the most common misunderstanding of the relation between refiner steps and total steps (for more details see Handover from Base to Refiner)
+* a fixed point where the diffusion is handed over from the base model to the refiner (for more details see Handover from Base to Refiner)
+* some code refactoring, removing unnecessary tests and variables
+
+## Installation
+
+In Automatic1111 go to the **Extensions** tab and there to **Install from URL**. Paste this URL
+
+`https://github.com/lisanet/sdxl-webui-refiner-fixed.git`
+
+into the field **URL for extension's git repository**, leave the other fields blank and hit the **Install** button.
+
+Now head over to the **Installed** tab, mark the extension as enabled and hit **Apply and restart UI**.
+
+The extension is now loaded and you can access it on the txt2img and img2img tabs.
+
+## Usage
+
+To use the refiner you need to have already downloaded the SDXL 1.0 refiner. You can [find it on Hugging Face](https://huggingface.co/stabilityai/stable-diffusion-xl-refiner-1.0/tree/main).
+
+On the **txt2img** tab scroll down, open the Refiner panel, check **Enable Refiner** and select the refiner checkpoint from the **Model** dropdown.
+
+That's it. Now you can generate an image as usual. The extension will take care of handing over the latent image from the base checkpoint to the refiner, without the need or hassle of defining the correct handover point yourself.
+
+The extension is available on the img2img tab too.
+
+## Handover from Base to Refiner
+
+As [described here](https://huggingface.co/docs/diffusers/api/pipelines/stable_diffusion/stable_diffusion_xl), *Stable Diffusion XL base is trained on timesteps 0-999 and Stable Diffusion XL refiner is finetuned from the base model on low noise timesteps 0-199 inclusive, so we use the base model for the first 800 timesteps (high noise) and the refiner for the last 200 timesteps (low noise). Hence, high_noise_frac is set to 0.8, so that all steps 200-999 (the first 80% of denoising timesteps) are performed by the base model and steps 0-199 (the last 20% of denoising timesteps) are performed by the refiner model.*
+
+To avoid one of the most common misunderstandings in the usage of the refiner, this extension sets the handover point at 80% of the denoising process and uses the refiner for the last 20%. Since this extension has to use steps instead of timesteps, the 80% timestep mark is not met exactly, but it is never undercut.
+
+This fixed handover point is the main advantage: if the handover point is lower than 80%, it is very likely that the refiner will introduce distortions into the final image. And there are a lot of articles and videos out there that have misunderstood the former steps slider in the extension and the relation between that value and the total steps.
+
+If you still prefer to play around (with the risk of distorted images) and set the handover value on your own, please have a look at the above mentioned sd-webui-refiner extension.
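+
+For illustration only, here is a minimal sketch of how such a fixed 80% point can be mapped onto whole sampler steps. This is not the extension's actual code; the function name and the exact rounding are assumptions, and the real switching logic lives in `scripts/refiner.py`:
+
+```python
+def refiner_handover_step(total_steps: int, refiner_fraction: float = 0.2) -> int:
+    """Return the first sampling step that the refiner would handle (illustrative sketch)."""
+    refiner_steps = int(total_steps * refiner_fraction)  # floored, so the base share never drops below 80%
+    return total_steps - refiner_steps
+
+for steps in (20, 30, 50):
+    print(steps, "->", refiner_handover_step(steps))
+# 20 -> 16, 30 -> 24, 50 -> 40
+```
+
+Because the refiner's share is rounded down, the base model always covers at least 80% of the steps, which is exactly the behaviour described above.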
+
+## Technical notes
+
+The extension loads only the UNET from the refiner checkpoint and replaces the base UNET with it at the handover point for the last steps of denoising.
+
+Use Tiled VAE if you have 12GB or less VRAM.
diff --git a/gf/sdxl-webui-refiner-fixed/scripts/__pycache__/refiner.cpython-310.pyc b/gf/sdxl-webui-refiner-fixed/scripts/__pycache__/refiner.cpython-310.pyc
new file mode 100644
index 0000000000000000000000000000000000000000..138330c350fce97dfa991366a4eab24c9699ce09
Binary files /dev/null and b/gf/sdxl-webui-refiner-fixed/scripts/__pycache__/refiner.cpython-310.pyc differ
diff --git a/gf/sdxl-webui-refiner-fixed/scripts/refiner.py b/gf/sdxl-webui-refiner-fixed/scripts/refiner.py
new file mode 100644
index 0000000000000000000000000000000000000000..c74ed2aba7d544e6da162a71200c68428045ffa9
--- /dev/null
+++ b/gf/sdxl-webui-refiner-fixed/scripts/refiner.py
@@ -0,0 +1,133 @@
+import torch
+from modules import scripts, script_callbacks, devices, sd_models, sd_models_config, shared
+import gradio as gr
+import sgm.modules.diffusionmodules.discretizer
+from sgm.modules.encoders.modules import ConcatTimestepEmbedderND
+from safetensors.torch import load_file, load
+from sgm.modules.diffusionmodules.wrappers import OPENAIUNETWRAPPER
+from omegaconf import OmegaConf
+from sgm.util import (
+    disabled_train,
+    get_obj_from_str,
+    instantiate_from_config,
+)
+
+class Refiner(scripts.Script):
+    def __init__(self):
+        super().__init__()
+        self.model = None
+        self.base = None
+        self.model_name = ''
+        self.embedder = ConcatTimestepEmbedderND(256)
+        self.c_ae = None
+        self.uc_ae = None
+
+    def title(self):
+        return "Refiner"
+
+    def show(self, is_img2img):
+        return scripts.AlwaysVisible
+
+    def build_model(self):
+        refiner_config = OmegaConf.load(sd_models_config.config_sdxl_refiner).model.params.network_config
+        self.model = instantiate_from_config(refiner_config)
+        self.model = get_obj_from_str(OPENAIUNETWRAPPER)(
+            self.model, compile_model=False
+        ).eval()
+        dev = 'mps' if devices.device.type == 'mps' else 'cpu'  # on Apple Silicon, mps is more efficient than switching back and forth
+        self.model = self.model.to(dev, devices.dtype_unet)
+        self.model.train = disabled_train
+        self.model.diffusion_model.dtype = devices.dtype_unet
+        self.model.conditioning_key = 'crossattn'
+        self.model.cond_stage_key = 'txt'
+        self.model.parameterization = 'v'
+        discretization = sgm.modules.diffusionmodules.discretizer.LegacyDDPMDiscretization()
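+        # the discretization provides the DDPM alphas_cumprod schedule, which is copied onto the refiner model below +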
self.model.alphas_cumprod = torch.asarray(discretization.alphas_cumprod, device=devices.device, dtype=devices.dtype_unet) + for param in self.model.parameters(): + param.requires_grad = False + + def load_model(self, model_name): + if not shared.opts.disable_mmap_load_safetensors: + ckpt = load_file(sd_models.checkpoints_list[model_name].filename) + else: + ckpt = load(open(sd_models.checkpoints_list[model_name].filename, 'rb').read()) + model_type = '' + for key in ckpt.keys(): + if 'conditioner' in key: + model_type = 'Refiner' + if 'input_blocks.7.1.transformer_blocks.4.attn1.to_k.weight' in key: + model_type = 'Base' + break + if model_type != 'Refiner': + self.enable = False + script_callbacks.remove_current_script_callbacks() + print('\nNot refiner, extension disabled!\n') + return False + + print('\nLoading refiner...\n') + self.build_model() + + state_dict = dict() + for key in ckpt.keys(): + if 'model.diffusion_model' in key: + state_dict[key.replace('model.d', 'd')] = ckpt[key].half() + self.model.load_state_dict(state_dict) + self.model_name = model_name + return True + + def ui(self, is_img2img): + with gr.Accordion(label='Refiner', open=False): + enable = gr.Checkbox(label='Enable Refiner', value=False) + with gr.Row(): + checkpoint = gr.Dropdown(choices=['None', *sd_models.checkpoints_list.keys()], label='Model') + + ui = [enable, checkpoint] + return ui + + def process(self, p, enable, checkpoint): + if not enable or checkpoint == 'None': + script_callbacks.remove_current_script_callbacks() + self.base = None + self.model_name = '' + self.model = None + devices.torch_gc() + return + if self.model_name != checkpoint: + if not self.load_model(checkpoint): return + self.c_ae = self.embedder(torch.tensor(shared.opts.sdxl_refiner_high_aesthetic_score).unsqueeze(0).to(devices.device).repeat(p.batch_size, 1)) + self.uc_ae = self.embedder(torch.tensor(shared.opts.sdxl_refiner_low_aesthetic_score).unsqueeze(0).to(devices.device).repeat(p.batch_size, 1)) + p.extra_generation_params['Refiner model'] = checkpoint.rsplit('.', 1)[0] + p.extra_generation_params['Refiner steps'] = '20 %' + + + def denoiser_callback(params: script_callbacks.CFGDenoiserParams): + if params.sampling_step > params.total_sampling_steps - int(params.total_sampling_steps * 0.2) - 2: + params.text_cond['vector'] = torch.cat((params.text_cond['vector'][:, :2304], self.c_ae), 1) + params.text_uncond['vector'] = torch.cat((params.text_uncond['vector'][:, :2304], self.uc_ae), 1) + params.text_cond['crossattn'] = params.text_cond['crossattn'][:, :, -1280:] + params.text_uncond['crossattn'] = params.text_uncond['crossattn'][:, :, -1280:] + if self.base is None: + self.base, self.model = self.switch_model(p, self.base, self.model) + + def denoised_callback(params: script_callbacks.CFGDenoiserParams): + if params.sampling_step == params.total_sampling_steps - 2: + self.model, self.base = self.switch_model(p, self.model, self.base) + + script_callbacks.on_cfg_denoiser(denoiser_callback) + script_callbacks.on_cfg_denoised(denoised_callback) + + def switch_model(self, p, oldmodel, newmodel): + if devices.device.type == 'mps': # on Apple Silicon mps is more efficient than switch back and forth + oldmodel = p.sd_model.model + else: + oldmodel = p.sd_model.model.to('cpu', devices.dtype_unet) + devices.torch_gc() + p.sd_model.model = newmodel.to(devices.device, devices.dtype_unet) + newmodel = None + return oldmodel, newmodel + + def postprocess(self, p, processed, *args): + if self.base is not None: + self.model, self.base = 
self.switch_model(p, self.model, self.base) + script_callbacks.remove_current_script_callbacks() + diff --git a/gf/stable-diffusion-webui-sched/.gitignore b/gf/stable-diffusion-webui-sched/.gitignore new file mode 100644 index 0000000000000000000000000000000000000000..b6e47617de110dea7ca47e087ff1347cc2646eda --- /dev/null +++ b/gf/stable-diffusion-webui-sched/.gitignore @@ -0,0 +1,129 @@ +# Byte-compiled / optimized / DLL files +__pycache__/ +*.py[cod] +*$py.class + +# C extensions +*.so + +# Distribution / packaging +.Python +build/ +develop-eggs/ +dist/ +downloads/ +eggs/ +.eggs/ +lib/ +lib64/ +parts/ +sdist/ +var/ +wheels/ +pip-wheel-metadata/ +share/python-wheels/ +*.egg-info/ +.installed.cfg +*.egg +MANIFEST + +# PyInstaller +# Usually these files are written by a python script from a template +# before PyInstaller builds the exe, so as to inject date/other infos into it. +*.manifest +*.spec + +# Installer logs +pip-log.txt +pip-delete-this-directory.txt + +# Unit test / coverage reports +htmlcov/ +.tox/ +.nox/ +.coverage +.coverage.* +.cache +nosetests.xml +coverage.xml +*.cover +*.py,cover +.hypothesis/ +.pytest_cache/ + +# Translations +*.mo +*.pot + +# Django stuff: +*.log +local_settings.py +db.sqlite3 +db.sqlite3-journal + +# Flask stuff: +instance/ +.webassets-cache + +# Scrapy stuff: +.scrapy + +# Sphinx documentation +docs/_build/ + +# PyBuilder +target/ + +# Jupyter Notebook +.ipynb_checkpoints + +# IPython +profile_default/ +ipython_config.py + +# pyenv +.python-version + +# pipenv +# According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control. +# However, in case of collaboration, if having platform-specific dependencies or dependencies +# having no cross-platform support, pipenv may install dependencies that don't work, or not +# install all needed dependencies. +#Pipfile.lock + +# PEP 582; used by e.g. github.com/David-OConnor/pyflow +__pypackages__/ + +# Celery stuff +celerybeat-schedule +celerybeat.pid + +# SageMath parsed files +*.sage.py + +# Environments +.env +.venv +env/ +venv/ +ENV/ +env.bak/ +venv.bak/ + +# Spyder project settings +.spyderproject +.spyproject + +# Rope project settings +.ropeproject + +# mkdocs documentation +/site + +# mypy +.mypy_cache/ +.dmypy.json +dmypy.json + +# Pyre type checker +.pyre/ diff --git a/gf/stable-diffusion-webui-sched/LICENSE b/gf/stable-diffusion-webui-sched/LICENSE new file mode 100644 index 0000000000000000000000000000000000000000..89a316444f791f06fb819b7e708698271ac2d91b --- /dev/null +++ b/gf/stable-diffusion-webui-sched/LICENSE @@ -0,0 +1,21 @@ +MIT License + +Copyright (c) 2023 opparco + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. diff --git a/gf/stable-diffusion-webui-sched/README.md b/gf/stable-diffusion-webui-sched/README.md new file mode 100644 index 0000000000000000000000000000000000000000..c60a2ec4d5823264b59792e1b7272a3aba797e68 --- /dev/null +++ b/gf/stable-diffusion-webui-sched/README.md @@ -0,0 +1,2 @@ +# stable-diffusion-webui-sched +This extension allows you to visually see at which step the parsed prompt switches. diff --git a/gf/stable-diffusion-webui-sched/scripts/__pycache__/sched.cpython-310.pyc b/gf/stable-diffusion-webui-sched/scripts/__pycache__/sched.cpython-310.pyc new file mode 100644 index 0000000000000000000000000000000000000000..78ed2b4382660f68b390d10544ac3fe0893e8b89 Binary files /dev/null and b/gf/stable-diffusion-webui-sched/scripts/__pycache__/sched.cpython-310.pyc differ diff --git a/gf/stable-diffusion-webui-sched/scripts/sched.py b/gf/stable-diffusion-webui-sched/scripts/sched.py new file mode 100644 index 0000000000000000000000000000000000000000..b7649ff9c904d22862ddb082318f006ef9cb8463 --- /dev/null +++ b/gf/stable-diffusion-webui-sched/scripts/sched.py @@ -0,0 +1,189 @@ +from functools import partial, reduce + +import torch + +from modules import extra_networks, script_callbacks + +from modules import prompt_parser +from modules.devices import device, dtype +from modules.sd_hijack import model_hijack + +import gradio as gr +import matplotlib.pyplot as ax + +ax.switch_backend("agg") + +sd_model = None +sd_model_betas = None +sd_model_alphas_cumprod = None +sd_model_alphas_cumprod_prev = None + + +def do_restore_model_params(): + sd_model.betas = sd_model_betas + sd_model.alphas_cumprod = sd_model_alphas_cumprod + sd_model.alphas_cumprod_prev = sd_model_alphas_cumprod_prev + + values = sd_model_alphas_cumprod.tolist() + + x_values = list(range(sd_model.num_timesteps)) + + ax.clf() # clear current figure. + ax.plot(x_values, values, label="original") + ax.legend() + ax.title("Alphas Cumulative Product") + ax.xlabel("step") + ax.ylabel("alphas cumprod") + + return ax + + +def do_update_model_params(beta_start_mil: int, beta_end_mil: int): + + torch.set_printoptions(precision=8, threshold=50) + + values = sd_model_alphas_cumprod.tolist() + + beta_start = beta_start_mil * 1.e-5 + beta_end = beta_end_mil * 1.e-5 + # beta_schedule = "linear" + # num_train_timesteps = 1000 # default = 1000 + + betas = torch.linspace(beta_start, beta_end, sd_model.num_timesteps, device=device, dtype=dtype) + + alphas = 1.0 - betas + alphas_cumprod = torch.cumprod(alphas, dim=0) + alphas_cumprod_prev = torch.cat( + (torch.tensor([1.0], device=device, dtype=dtype), alphas_cumprod[:-1])) + + new_values = alphas_cumprod.tolist() + + x_values = list(range(sd_model.num_timesteps)) + + ax.clf() # clear current figure. 
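+    # plot the original alphas_cumprod schedule and the recomputed one on the same axes for comparison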
+ ax.plot(x_values, values, label="original") + ax.plot(x_values, new_values, label="update") + ax.legend() + ax.title("Alphas Cumulative Product") + ax.xlabel("step") + ax.ylabel("alphas cumprod") + + sd_model.betas = betas + sd_model.alphas_cumprod = alphas_cumprod + sd_model.alphas_cumprod_prev = alphas_cumprod_prev + + return ax + + +def do_schedule(text, steps, current_step): + + # + # update_token_counter in modules/ui.py + # + try: + text, _ = extra_networks.parse_prompt(text) + + _, prompt_flat_list, _ = prompt_parser.get_multicond_prompt_list([text]) + prompt_schedules = prompt_parser.get_learned_conditioning_prompt_schedules(prompt_flat_list, steps) + + except Exception: + # a parsing error can happen here during typing, and we don't want to bother the user with + # messages related to it in console + prompt_schedules = [[[steps, text]]] + + flat_prompts = reduce(lambda list1, list2: list1+list2, prompt_schedules) + + ht = [] + md = '' + + current_prompt = None + for when, prompt in flat_prompts: + if current_step <= when: + current_prompt = prompt + break + + if current_prompt is not None: + # + # in modules/sd_hijack_clip.py + # + clip = model_hijack.clip + + batch_chunks, token_count = clip.process_texts([current_prompt]) + + # used_embeddings = {} + chunk_count = max([len(x) for x in batch_chunks]) + + for i in range(chunk_count): + batch_chunk = [chunks[i] if i < len(chunks) else clip.empty_chunk() for chunks in batch_chunks] + for x in batch_chunk: + for token in clip.tokenizer.convert_ids_to_tokens(x.tokens): + if token.startswith('<|'): + if token == '<|startoftext|>': + ht.append(['.', 'B']) + elif token == '<|endoftext|>': + ht.append(['.', 'E']) + else: + ht.append([token[:-4] if token.endswith('') else token, None]) + + md += f'{token_count} tokens at step {current_step}\n' + + for when, prompt in flat_prompts: + md += f'### step {when}\n' + md += prompt + md += f'\n' + + return ht, md + + +def on_model_loaded(sd_model_): + global sd_model + global sd_model_betas, sd_model_alphas_cumprod, sd_model_alphas_cumprod_prev + + if sd_model_ == sd_model: + return + + sd_model = sd_model_ + sd_model_betas = sd_model_.betas.to(device, dtype) + sd_model_alphas_cumprod = sd_model_.alphas_cumprod.to(device, dtype) + sd_model_alphas_cumprod_prev = sd_model_.alphas_cumprod_prev.to(device, dtype) + + +def on_ui_tabs(): + with gr.Blocks(analytics_enabled=False, variant="compact") as demo: + with gr.Row(): + with gr.Column(): + plot = gr.Plot(value=ax) + with gr.Row(): + with gr.Column(): + beta_start = gr.Slider(minimum=5, maximum=125, step=1, label="Beta start * 1.e+5", value=85) # 85.020 + with gr.Column(): + beta_end = gr.Slider(minimum=400, maximum=2000, step=20, label="Beta end * 1.e+5", value=1200) # 1200.104 + with gr.Row(): + with gr.Column(): + restore_button = gr.Button(value="Restore") + with gr.Column(): + update_button = gr.Button(value="Update", variant="primary") + + with gr.Row(): + with gr.Column(): + prompt = gr.Textbox(label="Prompt", show_label=False, lines=3, placeholder="Prompt") + with gr.Row(): + with gr.Column(): + steps = gr.Slider(minimum=1, maximum=150, step=1, label="Sampling steps", value=20) + with gr.Column(): + current_step = gr.Slider(minimum=1, maximum=150, step=1, label="Count tokens at this step", value=1) + with gr.Row(): + with gr.Column(): + schedule_button = gr.Button(value="Schedule", variant="primary") + report_ht = gr.HighlightedText(combine_adjacent=True, adjacent_separator=' ', label="CLIP").style(color_map={'B': 'green', 'E': 'red'}) + 
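# the 'B' and 'E' categories colored above mark the start/end tokens of each CLIP chunk emitted by do_schedule +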
report_md = gr.Markdown() + + restore_button.click(fn=do_restore_model_params, inputs=[], outputs=[plot]) + update_button.click(fn=do_update_model_params, inputs=[beta_start, beta_end], outputs=[plot]) + schedule_button.click(fn=do_schedule, inputs=[prompt, steps, current_step], outputs=[report_ht, report_md]) + + return (demo, "Sched.", "sched"), + + +script_callbacks.on_model_loaded(on_model_loaded) +script_callbacks.on_ui_tabs(on_ui_tabs) diff --git a/gf/stable-diffusion-webui-state/.gitignore b/gf/stable-diffusion-webui-state/.gitignore new file mode 100644 index 0000000000000000000000000000000000000000..3f1d0552ebfc3f24669484822a7029caa35177bf --- /dev/null +++ b/gf/stable-diffusion-webui-state/.gitignore @@ -0,0 +1,2 @@ +.idea +__pycache__/ diff --git a/gf/stable-diffusion-webui-state/CODE_OF_CONDUCT.md b/gf/stable-diffusion-webui-state/CODE_OF_CONDUCT.md new file mode 100644 index 0000000000000000000000000000000000000000..e92d51c87163c4d1e9a01de7f11b71162a8bf0be --- /dev/null +++ b/gf/stable-diffusion-webui-state/CODE_OF_CONDUCT.md @@ -0,0 +1,128 @@ +# Contributor Covenant Code of Conduct + +## Our Pledge + +We as members, contributors, and leaders pledge to make participation in our +community a harassment-free experience for everyone, regardless of age, body +size, visible or invisible disability, ethnicity, sex characteristics, gender +identity and expression, level of experience, education, socio-economic status, +nationality, personal appearance, race, religion, or sexual identity +and orientation. + +We pledge to act and interact in ways that contribute to an open, welcoming, +diverse, inclusive, and healthy community. + +## Our Standards + +Examples of behavior that contributes to a positive environment for our +community include: + +* Demonstrating empathy and kindness toward other people +* Being respectful of differing opinions, viewpoints, and experiences +* Giving and gracefully accepting constructive feedback +* Accepting responsibility and apologizing to those affected by our mistakes, + and learning from the experience +* Focusing on what is best not just for us as individuals, but for the + overall community + +Examples of unacceptable behavior include: + +* The use of sexualized language or imagery, and sexual attention or + advances of any kind +* Trolling, insulting or derogatory comments, and personal or political attacks +* Public or private harassment +* Publishing others' private information, such as a physical or email + address, without their explicit permission +* Other conduct which could reasonably be considered inappropriate in a + professional setting + +## Enforcement Responsibilities + +Community leaders are responsible for clarifying and enforcing our standards of +acceptable behavior and will take appropriate and fair corrective action in +response to any behavior that they deem inappropriate, threatening, offensive, +or harmful. + +Community leaders have the right and responsibility to remove, edit, or reject +comments, commits, code, wiki edits, issues, and other contributions that are +not aligned to this Code of Conduct, and will communicate reasons for moderation +decisions when appropriate. + +## Scope + +This Code of Conduct applies within all community spaces, and also applies when +an individual is officially representing the community in public spaces. 
+Examples of representing our community include using an official e-mail address, +posting via an official social media account, or acting as an appointed +representative at an online or offline event. + +## Enforcement + +Instances of abusive, harassing, or otherwise unacceptable behavior may be +reported to the community leaders responsible for enforcement at +ilian6806@gmail.com. +All complaints will be reviewed and investigated promptly and fairly. + +All community leaders are obligated to respect the privacy and security of the +reporter of any incident. + +## Enforcement Guidelines + +Community leaders will follow these Community Impact Guidelines in determining +the consequences for any action they deem in violation of this Code of Conduct: + +### 1. Correction + +**Community Impact**: Use of inappropriate language or other behavior deemed +unprofessional or unwelcome in the community. + +**Consequence**: A private, written warning from community leaders, providing +clarity around the nature of the violation and an explanation of why the +behavior was inappropriate. A public apology may be requested. + +### 2. Warning + +**Community Impact**: A violation through a single incident or series +of actions. + +**Consequence**: A warning with consequences for continued behavior. No +interaction with the people involved, including unsolicited interaction with +those enforcing the Code of Conduct, for a specified period of time. This +includes avoiding interactions in community spaces as well as external channels +like social media. Violating these terms may lead to a temporary or +permanent ban. + +### 3. Temporary Ban + +**Community Impact**: A serious violation of community standards, including +sustained inappropriate behavior. + +**Consequence**: A temporary ban from any sort of interaction or public +communication with the community for a specified period of time. No public or +private interaction with the people involved, including unsolicited interaction +with those enforcing the Code of Conduct, is allowed during this period. +Violating these terms may lead to a permanent ban. + +### 4. Permanent Ban + +**Community Impact**: Demonstrating a pattern of violation of community +standards, including sustained inappropriate behavior, harassment of an +individual, or aggression toward or disparagement of classes of individuals. + +**Consequence**: A permanent ban from any sort of public interaction within +the community. + +## Attribution + +This Code of Conduct is adapted from the [Contributor Covenant][homepage], +version 2.0, available at +https://www.contributor-covenant.org/version/2/0/code_of_conduct.html. + +Community Impact Guidelines were inspired by [Mozilla's code of conduct +enforcement ladder](https://github.com/mozilla/diversity). + +[homepage]: https://www.contributor-covenant.org + +For answers to common questions about this code of conduct, see the FAQ at +https://www.contributor-covenant.org/faq. Translations are available at +https://www.contributor-covenant.org/translations. diff --git a/gf/stable-diffusion-webui-state/CONTRIBUTING.md b/gf/stable-diffusion-webui-state/CONTRIBUTING.md new file mode 100644 index 0000000000000000000000000000000000000000..f8e3c8cc0e302d7d1a44a3928d9c6eaace72e683 --- /dev/null +++ b/gf/stable-diffusion-webui-state/CONTRIBUTING.md @@ -0,0 +1,43 @@ +# Contributing + +First off, thank you for considering contributing. It’s people like you who make Stable Diffusion such a great tool. 
+ +## Getting Started + +Before you begin, make sure you have a GitHub account and that you have forked the repository to your own GitHub account. Clone it to your local machine to begin making changes. + +## Making Changes + +Here's a quick rundown of how you should go about making a change: + +1. **Create your own branch** from `develop`: +Avoid making changes in the `develop` branch directly. Instead, create a new branch for your changes. Use a short, descriptive name for your branch, like `feature-enhance-ui` or `bugfix-ui-state-save`. + +2. **Make your changes**: +Implement your changes, adhering to the coding conventions and standards. + +3. **Commit your changes**: +Make sure your commit messages are clear and follow the best practices. E.g., "Add feature to ..." or "Fix issue with ..." + +4. **Pull the latest changes from `develop`**: +Before submitting a pull request, make sure your branch is up to date with the latest changes in the `develop` branch. + +5. **Push your changes**: +Push your changes to your fork on GitHub. + +6. **Submit a pull request**: +Open a pull request from your branch to the `develop` branch of the original repository. Provide a concise and informative title and description for your pull request. Link any relevant issues. + +## After Your Pull Request is Merged + +Once your pull request is merged, you might want to clean up your branches and pull the latest changes from the main repository to ensure your local copy is up-to-date. + +## Additional Guidance + +- If you're not sure where to start, look for open issues labeled "good first issue" or "help wanted." + +- Feedback and constructive criticism are always welcome. We’re looking to learn and improve. + +- If you're looking for a specific way to contribute, reviewing pull requests is a great place to start. + +Thank you for contributing! Your efforts help improve the experience for everyone using the extension. diff --git a/gf/stable-diffusion-webui-state/LICENSE b/gf/stable-diffusion-webui-state/LICENSE new file mode 100644 index 0000000000000000000000000000000000000000..a5102d8de6021c9d36e072b7a6d1294e5e082d30 --- /dev/null +++ b/gf/stable-diffusion-webui-state/LICENSE @@ -0,0 +1,21 @@ +MIT License + +Copyright (c) 2023 Ilian Iliev + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. 
diff --git a/gf/stable-diffusion-webui-state/README.md b/gf/stable-diffusion-webui-state/README.md new file mode 100644 index 0000000000000000000000000000000000000000..38ebc5a67d9292deffb619241ace0ca68a254295 --- /dev/null +++ b/gf/stable-diffusion-webui-state/README.md @@ -0,0 +1,46 @@ +
+
+# stable-diffusion-webui-state
+
+This extension is for AUTOMATIC1111's [Stable Diffusion web UI](https://github.com/AUTOMATIC1111/stable-diffusion-webui).
+
+### Capabilities
+
+* Preserve web UI parameters (inputs, sliders, checkboxes, etc.) after page reload.
+* It can be extended to preserve basically everything in the UI.
+* Import / Export and Reset state options
+* Supported extensions:
+  - Control Net
+  - ADetailer
+  - Dynamic Prompting
+  - Multidiffusion upscale/Tiled Diffusion & VAE
+
+### Demo
+
+Demo video: [https://www.youtube.com/watch?v=hXsxBKYMGaA](https://www.youtube.com/watch?v=hXsxBKYMGaA)
+
+### Install
+
+Use the **Install from URL** option with this repo URL.
+
+### Requirements
+
+None at all.
+
+### Usage
+
+Go to **Settings->State** and check all parameters that you want to be preserved after page reload.
+
+### Contributing
+
+Feel free to submit PRs to the `develop` branch!
+
+ ...and you can always buy me a :beer:!
+
+ Donate with PayPal
diff --git a/gf/stable-diffusion-webui-state/javascript/state.app.js b/gf/stable-diffusion-webui-state/javascript/state.app.js new file mode 100644 index 0000000000000000000000000000000000000000..2933141264853123e9c723d3df126e75f54564ee --- /dev/null +++ b/gf/stable-diffusion-webui-state/javascript/state.app.js @@ -0,0 +1,4 @@ + +document.addEventListener('DOMContentLoaded', function() { + onUiLoaded(state.core.init); +}); diff --git a/gf/stable-diffusion-webui-state/javascript/state.constants.js b/gf/stable-diffusion-webui-state/javascript/state.constants.js new file mode 100644 index 0000000000000000000000000000000000000000..8a954a6759fdcbe94c931b285114519864742594 --- /dev/null +++ b/gf/stable-diffusion-webui-state/javascript/state.constants.js @@ -0,0 +1,6 @@ +window.state = window.state || {}; +state = window.state; + +state.constants = { + LS_PREFIX: 'state-' +}; \ No newline at end of file diff --git a/gf/stable-diffusion-webui-state/javascript/state.core.js b/gf/stable-diffusion-webui-state/javascript/state.core.js new file mode 100644 index 0000000000000000000000000000000000000000..d09a77fa6556947b0ffd192812035c130bf42849 --- /dev/null +++ b/gf/stable-diffusion-webui-state/javascript/state.core.js @@ -0,0 +1,440 @@ +window.state = window.state || {}; +state = window.state; + +state.core = (function () { + + const TABS = ['txt2img', 'img2img']; + + const ELEMENTS = { + 'prompt': 'prompt', + 'negative_prompt': 'neg_prompt', + 'sampling_steps': 'steps', + 'hires_steps': 'hires_steps', + 'hires_scale': 'hr_scale', + 'hires_resize_x': 'hr_resize_x', + 'hires_resize_y': 'hr_resize_y', + 'hires_denoising_strength': 'denoising_strength', + 'refiner_switch': 'switch_at', + 'width': 'width', + 'height': 'height', + 'batch_count': 'batch_count', + 'batch_size': 'batch_size', + 'cfg_scale': 'cfg_scale', + 'denoising_strength': 'denoising_strength', + 'seed': 'seed', + }; + + const ELEMENTS_WITHOUT_PREFIX = { + 'resize_mode': 'resize_mode', + }; + + const ELEMENTS_WITH_DUPLICATE_IDS = { + INPUTS: { + 'upscaler_2_visibility': 'extras_upscaler_2_visibility', + 'upscaler_scale_by_resize': 'extras_upscaling_resize', + 'upscaler_scale_by_max_side_length': 'extras_upscale_max_side_length', + 'upscaler_scale_to_w': 'extras_upscaling_resize_w', + 'upscaler_scale_to_h': 'extras_upscaling_resize_h', + 'upscaler_scale_to_crop': 'extras_upscaling_crop', + }, + SELECTS: { + 'upscaler_1': 'extras_upscaler_1', + 'upscaler_2': 'extras_upscaler_2', + } + }; + + const SELECTS = { + 'sampling': 'sampling', + 'scheduler': 'scheduler', + 'hires_upscaler': 'hr_upscaler', + 'refiner_checkpoint': 'checkpoint', + 'script': '#script_list', + }; + + const MULTI_SELECTS = { + 'styles': 'styles' + }; + + const TOGGLE_BUTTONS = { + 'hires_fix': 'hr', + 'refiner': 'enable', + }; + + let store = null; + + function hasSetting(id, tab) { + const suffix = tab ? 
`_${tab}` : ''; + return this[`state${suffix}`] && this[`state${suffix}`].indexOf(id) > -1; + } + + function init() { + fetch('/state/config.json?_=' + (+new Date())) + .then(response => response.json()) + .then(config => { + try { + config.hasSetting = hasSetting + load(config); + } catch (error) { + console.error('[state]: Error:', error); + } + }) + .catch(error => console.error('[state]: Error getting JSON file:', error)); + } + + function forEachElement(list, config, action) { + for (const [settingId, element] of Object.entries(list)) { + TABS.forEach(tab => { + if (config.hasSetting(settingId, tab)) { + action(element, tab); + } + }); + } + } + + function load(config) { + + store = new state.Store(); + + loadUI(config); + restoreTabs(config); + + forEachElement(ELEMENTS, config, (element, tab) => { + handleSavedInput(`${tab}_${element}`); + }); + + forEachElement(ELEMENTS_WITHOUT_PREFIX, config, (element, tab) => { + handleSavedInput(`${element}`); + }); + + forEachElement(SELECTS, config, (element, tab) => { + handleSavedSelects(`${tab}_${element}`); + }); + + forEachElement(MULTI_SELECTS, config, (element, tab) => { + handleSavedMultiSelects(`${tab}_${element}`); + }); + + forEachElement(TOGGLE_BUTTONS, config, (element, tab) => { + handleToggleButton(`${tab}_${element}`); + }); + + forEachElement(ELEMENTS_WITH_DUPLICATE_IDS.INPUTS, config, (element, tab) => { + handleSavedInput(`${element}`, true); + }); + + forEachElement(ELEMENTS_WITH_DUPLICATE_IDS.SELECTS, config, (element, tab) => { + handleSavedSelects(`${element}`, true); + }); + + handleExtensions(config); + handleSettingsPage(); + } + + function createHeaderButton(title, text, className, style, action) { + + const button = state.utils.html.create('button', { + title: title, + innerHTML: text, + className: className, + }, style); + + if (action) { + button.addEventListener('click', action); + } + + return button; + } + + function createHeaderFileInput(title, text, className) { + + let inputId = 'state-import-file-inline'; + + let importBtn = createHeaderButton(title,text, className, { + display: 'none' + }, () => { + actions.importState(inputId); + }); + + let label = state.utils.html.create('label', {}, { cursor: 'pointer' }); + label.appendChild(state.utils.html.create('input', { + type: 'file', + id: inputId, + accept: 'application/json', + }, { + display: 'none' + })); + label.appendChild(document.createTextNode(text)); + label.addEventListener('change', () => { + importBtn.dispatchEvent(new Event('click')); + }); + + let button = createHeaderButton(title, '', className, {}); + button.appendChild(label); + + return { + hiddenButton: importBtn, + button: button + }; + } + + function loadUI(config) { + + let quickSettings = gradioApp().getElementById("quicksettings"); + let className = quickSettings.querySelector('button').className; + let uiConfig = config['state_ui']; + + if (!uiConfig || uiConfig.indexOf('Reset Button') > -1) { + quickSettings.appendChild(createHeaderButton('State: Reset', "*️⃣", className, {}, actions.resetAll)); + } + + if (!uiConfig || uiConfig.indexOf('Export Button') > -1) { + quickSettings.appendChild(createHeaderButton('State: Export',"📤", className, {}, actions.exportState)); + } + + if (!uiConfig || uiConfig.indexOf('Import Button') > -1) { + let fileInput = createHeaderFileInput('State: Import',"📥", className); + quickSettings.appendChild(fileInput.hiddenButton); + quickSettings.appendChild(fileInput.button); + } + } + + function restoreTabs(config) { + + if (! 
config.hasSetting('tabs')) { + return; + } + + const tabs = gradioApp().querySelectorAll('#tabs > div:first-child button'); + const value = store.get('tab'); + + if (value) { + for (var i = 0; i < tabs.length; i++) { + if (tabs[i].textContent === value) { + state.utils.triggerEvent(tabs[i], 'click'); + break; + } + } + } + // Use this when onUiTabChange is fixed + // onUiTabChange(function () { + // store.set('tab', gradioApp().querySelector('#tabs .tab-nav button.selected').textContent); + // }); + bindTabClickEvents(); + } + + function bindTabClickEvents() { + Array.from(gradioApp().querySelectorAll('#tabs .tab-nav button')).forEach(tab => { + tab.removeEventListener('click', storeTab); + tab.addEventListener('click', storeTab); + }); + } + + function storeTab() { + store.set('tab', gradioApp().querySelector('#tabs .tab-nav button.selected').textContent); + bindTabClickEvents(); // dirty hack here... + } + + function getElement(id) { + for (let i = 0; i < TABS.length; i++) { + if (id.startsWith(`${TABS[i]}_#`)) { + // handle elements with same ids in different tabs... + return gradioApp().querySelector('#tab_' + id.replace(`${TABS[i]}_#`, `${TABS[i]} #`)); + } + } + return gradioApp().getElementById(id); + } + + function handleSavedInput(id, duplicateIds) { + + let elements = null; + + if (duplicateIds) { + elements = gradioApp().querySelectorAll(`[id="${id}"] textarea, [id="${id}"] input`); + } else { + elements = gradioApp().querySelectorAll(`#${id} textarea, #${id} input`); + } + + const events = ['change', 'input']; + + if (! elements || ! elements.length) { + state.logging.warn(`Input not found: ${id}`); + return; + } + + let forEach = function (action) { + events.forEach(function(event) { + elements.forEach(function (element) { + action.call(element, event); + }); + }); + }; + + forEach(function (event) { + this.addEventListener(event, function () { + let value = this.value; + if (this.type && this.type === 'checkbox') { + value = this.checked; + } + store.set(id, value); + }); + }); + + TABS.forEach(tab => { + const seedInput = gradioApp().querySelector(`#${tab}_seed input`); + ['random_seed', 'reuse_seed'].forEach(id => { + const btn = gradioApp().querySelector(`#${tab}_${id}`); + btn.addEventListener('click', () => { + setTimeout(() => { + state.utils.triggerEvent(seedInput, 'change'); + }, 100); + }); + }); + }); + + let value = store.get(id); + + if (! value) { + return; + } + + forEach(function (event) { + state.utils.setValue(this, value, event); + }); + } + + function handleSavedSelects(id, duplicateIds) { + if (duplicateIds) { + const elements = gradioApp().querySelectorAll(`[id="${id}"]`); + if (! elements || ! elements.length) { + state.logging.warn(`Select not found: ${id}`); + return; + } + elements.forEach(function (element) { + state.utils.handleSelect(element, id, store); + }); + } else { + state.utils.handleSelect(getElement(id), id, store); + } + } + + function handleSavedMultiSelects(id) { + const select = gradioApp().getElementById(`${id}`); + state.utils.handleMultipleSelect(select, id, store); + } + + function handleToggleButton(id) { + let btn = gradioApp().querySelector(`button#${id}`); + if (! btn) { // New gradio version + btn = gradioApp().querySelector(`.input-accordion#${id}`); + } + if (! 
btn) { + state.logging.warn(`Button not found: ${id}`); + return; + } + if (store.get(id) === 'true') { + state.utils.clickToggleMenu(btn); + } + btn.addEventListener('click', function () { + let classList = Array.from(this.classList); + if (btn.tagName === 'BUTTON') { // Old gradio version + store.set(id, classList.indexOf('secondary-down') === -1); + } else { + store.set(id, classList.indexOf('input-accordion-open') > -1); + } + }); + } + + function handleExtensions(config) { + if (config['state_extensions']) { + config['state_extensions'].forEach(function (ext) { + if (ext in state.extensions) { + state.extensions[ext].init(); + } + }); + } + } + + function handleSettingsPage() { + + const page = gradioApp().querySelector('#settings_state'); + state.utils.html.setStyle(page.querySelectorAll('fieldset'), { + 'marginTop': '20px', + 'marginBottom': '10px' + }); + + let buttonsContainer = gradioApp().querySelector('#settings_state_buttons'); + if (buttonsContainer) { + buttonsContainer.parentNode.removeChild(buttonsContainer); + } + buttonsContainer = document.createElement('div'); + buttonsContainer.id = 'settings_state_buttons'; + + let setCheckboxes = function (value, checkFunc) { + checkFunc = checkFunc || function () { return true; }; + Array.from(page.querySelectorAll('input[type="checkbox"]')).forEach(function (el) { + if (checkFunc(el)) { + if (el.checked !== value) { + el.checked = value; + state.utils.triggerEvent(el, 'change'); + } + } else if (el.checked === value) { + el.checked = !value; + state.utils.triggerEvent(el, 'change'); + } + }); + }; + buttonsContainer.appendChild(state.utils.html.createButton('Select All', function () { + setCheckboxes(true); + })); + buttonsContainer.appendChild(state.utils.html.createButton('Select All Except Seeds', function () { + setCheckboxes(true, function (el) { + return el.nextElementSibling.textContent.indexOf('seed') === -1; + }); + })); + buttonsContainer.appendChild(state.utils.html.createButton('Unselect All', function () { + setCheckboxes(false); + })); + state.utils.html.setStyle(buttonsContainer, { + 'marginTop': '20px', + 'marginBottom': '10px' + }); + buttonsContainer.appendChild(state.utils.html.create('hr')); + buttonsContainer.appendChild(state.utils.html.create('div', { innerHTML: 'Actions' }, { marginBottom: '10px' })); + buttonsContainer.appendChild(state.utils.html.createButton('Reset All', actions.resetAll)); + buttonsContainer.appendChild(state.utils.html.createButton('Export State', actions.exportState)); + buttonsContainer.appendChild(state.utils.html.createButton('Import State', actions.importState)); + buttonsContainer.appendChild(state.utils.html.create('input', { + id: 'state-import-file', type: 'file', accept: 'application/json' + })); + page.appendChild(buttonsContainer); + } + + let actions = { + resetAll: function () { + let confirmed = confirm('Reset all state values?'); + if (confirmed) { + store.clearAll(); + alert('All state values deleted!'); + } + }, + exportState: function () { + state.utils.saveFile('sd-webui-state', store.getAll()); + }, + importState: function (id) { + const fileInput = gradioApp().getElementById(id || 'state-import-file'); + if (! fileInput.files || ! 
fileInput.files[0]) { + alert('Please select a JSON file!'); + return; + } + const file = fileInput.files[0]; + const reader = new FileReader(); + reader.onload = function (event) { + store.load(JSON.parse(event.target.result)); + window.location.reload(); + }; + reader.readAsText(file); + } + }; + + return { init }; +}()); diff --git a/gf/stable-diffusion-webui-state/javascript/state.ext.adetailer.js b/gf/stable-diffusion-webui-state/javascript/state.ext.adetailer.js new file mode 100644 index 0000000000000000000000000000000000000000..3769062256f9c05d09f5b40a97b6c9041f04356b --- /dev/null +++ b/gf/stable-diffusion-webui-state/javascript/state.ext.adetailer.js @@ -0,0 +1,173 @@ +window.state = window.state || {}; +window.state.extensions = window.state.extensions || {}; +state = window.state; + +state.extensions['adetailer'] = (function () { + + let container = null; + let store = null; + let cnTabs = []; + + function bindTabEvents() { + const tabs = container.querySelectorAll('.tabs > div > button'); + tabs.forEach(tab => { // dirty hack here + tab.removeEventListener('click', onTabClick); + tab.addEventListener('click', onTabClick); + }); + return tabs; + } + + function handleTabs() { + let tabs = bindTabEvents(); + let value = store.get('tab'); + if (value) { + for (var i = 0; i < tabs.length; i++) { + if (tabs[i].textContent === value) { + state.utils.triggerEvent(tabs[i], 'click'); + break; + } + } + } + } + + function onTabClick() { + store.set('tab', this.textContent); + bindTabEvents(); + } + + function handleCheckbox(checkbox, id) { + let value = store.get(id); + if (value) { + state.utils.setValue(checkbox, value, 'change'); + } + checkbox.addEventListener('change', function () { + store.set(id, this.checked); + }); + } + + function handleCheckboxes(container, container_idx) { + let checkboxes = container.querySelectorAll('input[type="checkbox"]'); + checkboxes.forEach(function checkbox(checkbox, idx) { + handleCheckbox(checkbox, `ad-tab-${container_idx}-checkbox-${idx}`) + }); + } + + function handleSliders(container, container_idx) { + let sliders = container.querySelectorAll('input[type="range"]'); + sliders.forEach(function (slider, idx) { + let id = `ad-tab-${container_idx}-slider-${idx}`; + let value = store.get(id); + if (value) { + state.utils.setValue(slider, value, 'change'); + } + slider.addEventListener('change', function () { + store.set(id, this.value); + }); + }); + } + + function handleTextboxes(container, container_idx) { + let textboxes = container.querySelectorAll('textarea'); + textboxes.forEach(function (textbox, idx) { + let id = `ad-tab-${container_idx}-textbox-${idx}`; + let value = store.get(id); + if (value) { + state.utils.setValue(textbox, value, 'change'); + } + textbox.addEventListener('change', function () { + store.set(id, this.value); + }); + }); + } + + function handleSelects(container, container_idx) { + let selects = container.querySelectorAll('.gradio-dropdown') + selects.forEach(function (select, idx) { + state.utils.handleSelect(select, `ad-tab-${container_idx}-select-${idx}`, store); + }); + } + + function handleRadioButtons(container, container_idx) { + let fieldsets = container.querySelectorAll('fieldset'); + fieldsets.forEach(function (fieldset, idx) { + let radios = fieldset.querySelectorAll('input[type="radio"]'); + let id = `ad-tab-${container_idx}-fieldset-${idx}`; + let value = store.get(id); + if (value) { + radios.forEach(function (radio) { + state.utils.setValue(radio, value, 'change'); + }); + } + radios.forEach(function (radio) 
{ + radio.addEventListener('change', function () { + store.set(id, this.value); + }); + }); + }); + } + + function handleDropdown(dropdown, id) { + let value = store.get(id); + + if (value && value === 'true') { + state.utils.triggerEvent(dropdown, 'click'); + } + dropdown.addEventListener('click', function () { + let span = this.querySelector('.transition, .icon'); + store.set(id, span.style.transform !== 'rotate(90deg)'); + }); + } + + function handleDropdowns(container, container_idx) { + let dropdowns = container.querySelectorAll('.gradio-accordion .label-wrap'); + dropdowns.forEach(function (dropdown, idx) { + handleDropdown(dropdown, `ad-tab-${container_idx}-dropdown-${idx}`); + }); + } + + function load() { + setTimeout(function () { + handleDropdown(container.querySelector('#script_txt2img_adetailer_ad_main_accordion > .label-wrap'), 'ad-dropdown-main'); + handleCheckbox(container.querySelector('#script_txt2img_adetailer_ad_enable > label > input'), 'ad-checkbox-enable'); + cnTabs.forEach(({ container, container_idx }) => { + handleTabs(container, container_idx); + handleTextboxes(container, container_idx); + handleCheckboxes(container, container_idx); + handleSliders(container, container_idx); + handleRadioButtons(container, container_idx); + handleSelects(container, container_idx); + handleDropdowns(container, container_idx); + }); + }, 500); + } + + function init() { + + container = gradioApp().getElementById('script_txt2img_adetailer_ad_main_accordion'); + store = new state.Store('ext-adetailerr'); + + if (! container) { + return; + } + + let tabs = container.querySelectorAll('.tabitem'); + + if (tabs.length) { + cnTabs = []; + tabs.forEach((tabContainer, i) => { + cnTabs.push({ + container: tabContainer, + container_idx: i + }); + }); + } else { + cnTabs = [{ + container: container + }]; + } + + load(); + } + + return { init }; +}()); diff --git a/gf/stable-diffusion-webui-state/javascript/state.ext.control-net.js b/gf/stable-diffusion-webui-state/javascript/state.ext.control-net.js new file mode 100644 index 0000000000000000000000000000000000000000..205b29b78aa533343f3e07b842f95cee49f7995d --- /dev/null +++ b/gf/stable-diffusion-webui-state/javascript/state.ext.control-net.js @@ -0,0 +1,163 @@ +window.state = window.state || {}; +window.state.extensions = window.state.extensions || {}; +state = window.state; + +state.extensions['control-net'] = (function () { + + let container = null; + let store = null; + let cnTabs = []; + + function handleToggle() { + let value = store.get('toggled'); + let toggleBtn = container.querySelector('div.cursor-pointer, .label-wrap'); + + if (value && value === 'true') { + state.utils.triggerEvent(toggleBtn, 'click'); + load(); + } + toggleBtn.addEventListener('click', function () { + let span = this.querySelector('.transition, .icon'); + store.set('toggled', span.style.transform !== 'rotate(90deg)'); + load(); + }); + } + + function bindTabEvents() { + const tabs = container.querySelectorAll('.tabs > div > button'); + tabs.forEach(tab => { // dirty hack here + tab.removeEventListener('click', onTabClick); + tab.addEventListener('click', onTabClick); + }); + return tabs; + } + + function handleTabs() { + let tabs = bindTabEvents(); + let value = store.get('tab'); + if (value) { + for (var i = 0; i < tabs.length; i++) { + if (tabs[i].textContent === value) { + state.utils.triggerEvent(tabs[i], 'click'); + break; + } + } + } + } + + function onTabClick() { + store.set('tab', this.textContent); + bindTabEvents(); + } + + function handleCheckboxes() 
{ + cnTabs.forEach(({ container, store }) => { + let checkboxes = container.querySelectorAll('input[type="checkbox"]'); + checkboxes.forEach(function (checkbox) { + let label = checkbox.nextElementSibling; + let id = state.utils.txtToId(label.textContent); + let value = store.get(id); + if (value) { + state.utils.setValue(checkbox, value, 'change'); + } + checkbox.addEventListener('change', function () { + store.set(id, this.checked); + }); + }); + }); + } + + function handleSelects() { + cnTabs.forEach(({ container, store }) => { + container.querySelectorAll('.gradio-dropdown').forEach(select => { + let id = state.utils.txtToId(select.querySelector('label').firstChild.textContent); + let value = store.get(id); + state.utils.handleSelect(select, id, store); + if (id === 'preprocessor' && value && value.toLowerCase() !== 'none') { + state.utils.onNextUiUpdates(handleSliders); // update new sliders if needed + } + }); + }); + } + + function handleSliders() { + cnTabs.forEach(({ container, store }) => { + let sliders = container.querySelectorAll('input[type="range"]'); + sliders.forEach(function (slider) { + let label = slider.previousElementSibling.querySelector('label span'); + let id = state.utils.txtToId(label.textContent); + let value = store.get(id); + if (value) { + state.utils.setValue(slider, value, 'change'); + } + slider.addEventListener('change', function () { + store.set(id, this.value); + }); + }); + }); + } + + function handleRadioButtons() { + cnTabs.forEach(({ container, store }) => { + let fieldsets = container.querySelectorAll('fieldset'); + fieldsets.forEach(function (fieldset) { + let label = fieldset.firstChild.nextElementSibling; + let radios = fieldset.querySelectorAll('input[type="radio"]'); + let id = state.utils.txtToId(label.textContent); + let value = store.get(id); + if (value) { + radios.forEach(function (radio) { + state.utils.setValue(radio, value, 'change'); + }); + } + radios.forEach(function (radio) { + radio.addEventListener('change', function () { + store.set(id, this.value); + }); + }); + }); + }); + } + + function load() { + setTimeout(function () { + handleTabs(); + handleCheckboxes(); + handleSelects(); + handleSliders(); + handleRadioButtons(); + }, 500); + } + + function init() { + + container = gradioApp().getElementById('controlnet'); + store = new state.Store('ext-control-net'); + + if (! 
container) { + return; + } + + let tabs = container.querySelectorAll('.tabitem'); + + if (tabs.length) { + cnTabs = []; + tabs.forEach((tabContainer, i) => { + cnTabs.push({ + container: tabContainer, + store: new state.Store('ext-control-net-' + i) + }); + }); + } else { + cnTabs = [{ + container: container, + store: new state.Store('ext-control-net-0') + }]; + } + + handleToggle(); + load(); + } + + return { init }; +}()); diff --git a/gf/stable-diffusion-webui-state/javascript/state.ext.dynamic-prompting.js b/gf/stable-diffusion-webui-state/javascript/state.ext.dynamic-prompting.js new file mode 100644 index 0000000000000000000000000000000000000000..78928b8dbd4a50d6ca2eed0f335d930ad36fe0fd --- /dev/null +++ b/gf/stable-diffusion-webui-state/javascript/state.ext.dynamic-prompting.js @@ -0,0 +1,118 @@ +window.state = window.state || {}; +window.state.extensions = window.state.extensions || {}; +state = window.state; + +state.extensions['dynamic prompting'] = (function () { + + let container = null; + let store = null; + + function handleCheckboxes() { + let checkboxes = container.querySelectorAll('input[type="checkbox"]'); + checkboxes.forEach(function (checkbox, idx) { + let id = `dp-checkbox-${idx}`; + let value = store.get(id); + if (value) { + state.utils.setValue(checkbox, value, 'change'); + } + checkbox.addEventListener('change', function () { + store.set(id, this.checked); + }); + }); + } + + function handleSliders() { + let sliders = container.querySelectorAll('input[type="range"]'); + sliders.forEach(function (slider, idx) { + let id = `dp-slider-${idx}`; + let value = store.get(id); + if (value) { + state.utils.setValue(slider, value, 'change'); + } + slider.addEventListener('change', function () { + store.set(id, this.value); + }); + }); + } + + function handleTextboxes() { + let textboxes = container.querySelectorAll('textarea'); + textboxes.forEach(function (textbox, idx) { + let id = `dp-textbox-${idx}`; + let value = store.get(id); + if (value) { + state.utils.setValue(textbox, value, 'change'); + } + textbox.addEventListener('change', function () { + store.set(id, this.value); + }); + }); + } + + function handleSelects() { + let selects = container.querySelectorAll('.gradio-dropdown') + selects.forEach(function (select, idx) { + state.utils.handleSelect(select, `dp-select-${idx}`, store); + }); + } + + function handleRadioButtons() { + let fieldsets = container.querySelectorAll('fieldset'); + fieldsets.forEach(function (fieldset, idx) { + let radios = fieldset.querySelectorAll('input[type="radio"]'); + let id = `dp-fieldset-${idx}` + let value = store.get(id); + if (value) { + radios.forEach(function (radio) { + state.utils.setValue(radio, value, 'change'); + }); + } + radios.forEach(function (radio) { + radio.addEventListener('change', function () { + store.set(id, this.value); + }); + }); + }); + } + + function handleDropdowns() { + let dropdowns = container.querySelectorAll('.gradio-accordion .label-wrap'); + dropdowns.forEach(function (dropdown, idx) { + let id = `dp-dropdown-${idx}`; + let value = store.get(id); + + if (value && value === 'true') { + state.utils.triggerEvent(dropdown, 'click'); + } + dropdown.addEventListener('click', function () { + let span = this.querySelector('.transition, .icon'); + store.set(id, span.style.transform !== 'rotate(90deg)'); + }); + }); + } + + function load() { + setTimeout(function () { + handleCheckboxes(); + handleSliders(); + handleTextboxes(); + handleSelects(); + handleRadioButtons(); + handleDropdowns(); + }, 500); + } + + 
function init() { + + container = gradioApp().getElementById('sddp-dynamic-prompting'); + store = new state.Store('ext-dynamic-prompting'); + + if (! container) { + return; + } + + load(); + } + + return { init }; +}()); diff --git a/gf/stable-diffusion-webui-state/javascript/state.ext.multidiffusion.js b/gf/stable-diffusion-webui-state/javascript/state.ext.multidiffusion.js new file mode 100644 index 0000000000000000000000000000000000000000..4fe374dd64f1f5e5d5e9b10b09a432ea86c8e6d4 --- /dev/null +++ b/gf/stable-diffusion-webui-state/javascript/state.ext.multidiffusion.js @@ -0,0 +1,130 @@ +window.state = window.state || {}; +window.state.extensions = window.state.extensions || {}; +state = window.state; + +state.extensions['multidiffusion'] = (function () { + + let containers = []; + let store = null; + + function handleCheckboxes(container, name) { + let checkboxes = container.querySelectorAll('input[type="checkbox"]'); + checkboxes.forEach(function (checkbox, idx) { + let id = `md-${name}-checkbox-${idx}`; + let value = store.get(id); + if (value) { + state.utils.setValue(checkbox, value, 'change'); + } + checkbox.addEventListener('change', function () { + store.set(id, this.checked); + }); + }); + } + + function handleSliders(container, name) { + let sliders = container.querySelectorAll('input[type="range"]'); + sliders.forEach(function (slider, idx) { + let id = `md-${name}-slider-${idx}`; + let value = store.get(id); + if (value) { + state.utils.setValue(slider, value, 'change'); + } + slider.addEventListener('change', function () { + store.set(id, this.value); + }); + }); + } + + function handleTextboxes(container, name) { + let textboxes = container.querySelectorAll('textarea'); + textboxes.forEach(function (textbox, idx) { + let id = `md-${name}-textbox-${idx}`; + let value = store.get(id); + if (value) { + state.utils.setValue(textbox, value, 'change'); + } + textbox.addEventListener('change', function () { + store.set(id, this.value); + }); + }); + } + + function handleSelects(container, name) { + let selects = container.querySelectorAll('.gradio-dropdown') + selects.forEach(function (select, idx) { + state.utils.handleSelect(select, `md-${name}-select-${idx}`, store); + }); + } + + function handleRadioButtons(container, name) { + let fieldsets = container.querySelectorAll('fieldset'); + fieldsets.forEach(function (fieldset, idx) { + let radios = fieldset.querySelectorAll('input[type="radio"]'); + let id = `md-${name}-fieldset-${idx}`; + let value = store.get(id); + if (value) { + radios.forEach(function (radio) { + state.utils.setValue(radio, value, 'change'); + }); + } + radios.forEach(function (radio) { + radio.addEventListener('change', function () { + store.set(id, this.value); + }); + }); + }); + } + + function handleDropdowns(container, name) { + let dropdowns = container.querySelectorAll('.gradio-accordion .label-wrap'); + dropdowns.forEach(function (dropdown, idx) { + let id = `md-${name}-dropdown-${idx}`; + let value = store.get(id); + + if (value && value === 'true') { + state.utils.triggerEvent(dropdown, 'click'); + } + dropdown.addEventListener('click', function () { + let span = this.querySelector('.transition, .icon'); + store.set(id, span.style.transform !== 'rotate(90deg)'); + }); + }); + } + + function load() { + setTimeout(function () { + containers.forEach(({container, name}) => { + handleCheckboxes(container, name); + handleSliders(container, name); + handleTextboxes(container, name); + handleSelects(container, name); + handleRadioButtons(container, name); + 
handleDropdowns(container, name); + }); + }, 500); + } + + function init() { + + let spanTags = gradioApp().getElementsByTagName("span"); + for (var i = 0; i < spanTags.length; i++) { + if (spanTags[i].textContent == 'Tiled Diffusion') { + containers.push({container: spanTags[i].parentElement.parentElement,name: 'diffusion'}); + } + if (spanTags[i].textContent == 'Tiled VAE') { + containers.push({container: spanTags[i].parentElement.parentElement,name: 'vae'}); + break; + } + }; + + store = new state.Store('ext-multidiffusion'); + + if (! containers.length) { + return; + } + + load(); + } + + return { init }; +}()); diff --git a/gf/stable-diffusion-webui-state/javascript/state.loggings.js b/gf/stable-diffusion-webui-state/javascript/state.loggings.js new file mode 100644 index 0000000000000000000000000000000000000000..8f7919255a63ced09b3987003299449e5bc53906 --- /dev/null +++ b/gf/stable-diffusion-webui-state/javascript/state.loggings.js @@ -0,0 +1,19 @@ +window.state = window.state || {}; +state = window.state; + +state.logging = { + + name: 'state', + + log: function (message) { + console.log(`[${this.name}]: `, message); + }, + + error: function (message) { + console.error(`[${this.name}]: `, message); + }, + + warn: function (message) { + console.warn(`[${this.name}]: `, message); + } +}; diff --git a/gf/stable-diffusion-webui-state/javascript/state.store.js b/gf/stable-diffusion-webui-state/javascript/state.store.js new file mode 100644 index 0000000000000000000000000000000000000000..0d92cc2d6b677f6e12424b6a906c8d0e5ea5c995 --- /dev/null +++ b/gf/stable-diffusion-webui-state/javascript/state.store.js @@ -0,0 +1,56 @@ +window.state = window.state || {}; +state = window.state; + +state.Store = function Store (prefix) { + this.prefix = state.constants.LS_PREFIX + (prefix ? 
prefix + '-' : ''); +}; + +state.Store.prototype.set = function (key, value) { + if (key.startsWith(this.prefix)) { + localStorage.setItem(key, value); + } else { + localStorage.setItem(this.prefix + key, value); + } +}; + +state.Store.prototype.get = function (key) { + return localStorage.getItem(this.prefix + key); +}; + +state.Store.prototype.remove = function (key) { + localStorage.removeItem(this.prefix + key); +}; + +state.Store.prototype.clear = function () { + localStorage.clear(); +}; + +state.Store.prototype.clearAll = function () { + let keys = Object.keys(localStorage); + for (let i = 0; i < keys.length; i++) { + if (keys[i].startsWith(state.constants.LS_PREFIX)) { + localStorage.removeItem(keys[i]); + } + } +}; + +state.Store.prototype.getAll = function () { + let result = {}; + let keys = Object.keys(localStorage); + for (let i = 0; i < keys.length; i++) { + if (keys[i].startsWith(state.constants.LS_PREFIX)) { + result[keys[i]] = localStorage[keys[i]]; + } + } + return result; +}; + +state.Store.prototype.load = function (json) { + this.clearAll(); + let keys = Object.keys(json); + for (let i = 0; i < keys.length; i++) { + if (keys[i].startsWith(state.constants.LS_PREFIX)) { + this.set(keys[i], json[keys[i]]); + } + } +}; diff --git a/gf/stable-diffusion-webui-state/javascript/state.utils.js b/gf/stable-diffusion-webui-state/javascript/state.utils.js new file mode 100644 index 0000000000000000000000000000000000000000..972f4d37e06972db411517b091b984abed7a8582 --- /dev/null +++ b/gf/stable-diffusion-webui-state/javascript/state.utils.js @@ -0,0 +1,244 @@ +window.state = window.state || {}; +state = window.state; +let selectingQueue = -1; + +state.utils = { + triggerEvent: function triggerEvent(element, event) { + if (! element) { + state.logging.warn('Element not found'); + return; + } + element.dispatchEvent(new Event(event.trim())); + return element; + }, + triggerMouseEvent: function triggerMouseEvent(element, event) { + if (! element) { + state.logging.warn('Element not found'); + return; + } + event = event || 'click'; + element.dispatchEvent(new MouseEvent(event, { + view: window, + bubbles: true, + cancelable: true, + })); + return element; + }, + clickToggleMenu: function openToggleMenu(element) { + if (! 
element) { + state.logging.warn('Toggle button not found'); + return; + } + let mouseEvent = new MouseEvent('click', { + view: window, + bubbles: true, + cancelable: true + }); + element.dispatchEvent(mouseEvent); + let icon = element.querySelector('.icon'); + if (icon) { + icon.dispatchEvent(mouseEvent); + } + let checkbox = element.querySelector('input[type="checkbox"]'); + if (checkbox) { + checkbox.dispatchEvent(mouseEvent); + } + return element; + }, + setValue: function setValue(element, value, event) { + switch (element.type) { + case 'checkbox': + element.checked = value === 'true'; + this.triggerEvent(element, event); + break; + case 'radio': + if (element.value === value) { + element.checked = true; + this.triggerEvent(element, event); + } else { + element.checked = false; + } + break; + default: + element.value = value; + this.triggerEvent(element, event); + } + }, + onContentChange: function onContentChange(targetNode, func) { + const observer = new MutationObserver((mutationsList, observer) => { + for (const mutation of mutationsList) { + if (mutation.type === 'childList') { + func(targetNode); + } + } + }); + observer.observe(targetNode, { + childList: true, + characterData: true, + subtree: true + }); + }, + handleSelect: function handleSelect(select, id, store) { + try { + let value = store.get(id); + + if (value) { + selectingQueue += 1; + setTimeout(() => { + let input = select.querySelector('input'); + state.utils.triggerMouseEvent(input, 'focus'); + + setTimeout(() => { + let items = Array.from(select.querySelectorAll('ul li')); + items.forEach(li => { + if (li.lastChild.wholeText.trim() === value) { + state.utils.triggerMouseEvent(li, 'mousedown'); + return false; + } + }); + state.utils.triggerMouseEvent(input, 'blur'); + selectingQueue -= 1; + }, 100); + }, selectingQueue * 200) + } + + setTimeout(() => { + state.utils.onContentChange(select, function (el) { + let selected = el.querySelector('span.single-select'); + if (selected) { + store.set(id, selected.textContent); + } else { + // new gradio version... + let input = select.querySelector('input'); + if (input) { + store.set(id, input.value); + } + } + }); + }, 150); + } catch (error) { + console.error('[state]: Error:', error); + } + }, + handleMultipleSelect: function handleMultipleSelect(select, id, store) { + try { + let value = store.get(id); + + if (value) { + + value = value.split(',').reverse(); + + if (value.length) { + + let input = select.querySelector('input'); + + let selectOption = function () { + + if (! 
value.length) { + state.utils.triggerMouseEvent(input, 'blur'); + return; + } + + let option = value.pop(); + state.utils.triggerMouseEvent(input, 'focus'); + + setTimeout(() => { + let items = Array.from(select.querySelectorAll('ul li')); + items.forEach(li => { + if (li.lastChild.wholeText.trim() === option) { + state.utils.triggerMouseEvent(li, 'mousedown'); + return false; + } + }); + setTimeout(selectOption, 100); + }, 100); + } + selectOption(); + } + } + state.utils.onContentChange(select, function (el) { + const selected = Array.from(el.querySelectorAll('.token > span')).map(item => item.textContent); + store.set(id, selected); + }); + } catch (error) { + console.error('[state]: Error:', error); + } + }, + txtToId: function txtToId(txt) { + return txt.split(' ').join('-').toLowerCase(); + }, + callXTimes: function callXTimes(func, times) { + let called = 0; + return function() { + if (called < times) { + called++; + return func.apply(this); + } + } + }, + saveFile: function saveJSON(fileName ,data) { + const json = JSON.stringify(data, null, 4); + const blob = new Blob([json], {type: 'application/json'}); + const url = URL.createObjectURL(blob); + const link = document.createElement('a'); + link.href = url; + link.download = fileName + '.json'; + document.body.appendChild(link); + link.click(); + link.parentNode.removeChild(link); + }, + debounce: function debounce(func, delay) { + let lastCallTime = 0; + return function() { + const currentCallTime = new Date().getTime(); + if (currentCallTime - lastCallTime > delay) { + lastCallTime = currentCallTime; + func.apply(this, arguments); + } + } + }, + onNextUiUpdates: function (func) { + // brute force this to to ensure that the method is called after next few updates + onUiUpdate(this.callXTimes(function () { setTimeout(func, 5); }, 150)); + } +}; + +state.utils.html = { + setStyle: function setStyle(elements, style) { + if (elements instanceof NodeList) { + elements = Array.from(elements); + } else if (elements instanceof Node){ + elements = [elements]; + } else { + return; + } + elements.forEach(element => { + for (let key in style) { + if (style.hasOwnProperty(key)) { + element.style[key] = style[key]; + } + } + }); + }, + create: function create(type, props, style) { + const element = document.createElement(type); + if (props) { + for (let key in props) { + if (props.hasOwnProperty(key)) { + element[key] = props[key]; + } + } + } + if (style) { + this.setStyle(element, style); + } + return element; + }, + createButton: function createButton(text, onclick) { + const btn = document.createElement('button'); + btn.innerHTML = text; + btn.onclick = onclick || function () {}; + btn.className = 'gr-button gr-button-lg gr-button-primary'; + return btn; + } +}; diff --git a/gf/stable-diffusion-webui-state/scripts/__pycache__/state_api.cpython-310.pyc b/gf/stable-diffusion-webui-state/scripts/__pycache__/state_api.cpython-310.pyc new file mode 100644 index 0000000000000000000000000000000000000000..c57afc0bb27edbc68beb45b0ac1d3e57bf40b70e Binary files /dev/null and b/gf/stable-diffusion-webui-state/scripts/__pycache__/state_api.cpython-310.pyc differ diff --git a/gf/stable-diffusion-webui-state/scripts/__pycache__/state_settings.cpython-310.pyc b/gf/stable-diffusion-webui-state/scripts/__pycache__/state_settings.cpython-310.pyc new file mode 100644 index 0000000000000000000000000000000000000000..93160ab87565a329c3b7e1ecc2f232885a526e06 Binary files /dev/null and 
b/gf/stable-diffusion-webui-state/scripts/__pycache__/state_settings.cpython-310.pyc differ diff --git a/gf/stable-diffusion-webui-state/scripts/state_api.py b/gf/stable-diffusion-webui-state/scripts/state_api.py new file mode 100644 index 0000000000000000000000000000000000000000..f07246bf175678afefb1b81ffd1376e32ed45316 --- /dev/null +++ b/gf/stable-diffusion-webui-state/scripts/state_api.py @@ -0,0 +1,31 @@ +from fastapi import FastAPI, Body, HTTPException, Request, Response +from fastapi.responses import FileResponse + +import gradio as gr +import modules.shared as shared +import modules.script_callbacks as script_callbacks + + +class StateApi(): + + BASE_PATH = '/state' + + def get_path(self, path): + return f"{self.BASE_PATH}{path}" + + def add_api_route(self, path: str, endpoint, **kwargs): + return self.app.add_api_route(self.get_path(path), endpoint, **kwargs) + + def start(self, _: gr.Blocks, app: FastAPI): + self.app = app + self.add_api_route('/config.json', self.get_config, methods=['GET']) + + def get_config(self): + return FileResponse(shared.cmd_opts.ui_settings_file) + + +try: + api = StateApi() + script_callbacks.on_app_started(api.start) +except: + pass \ No newline at end of file diff --git a/gf/stable-diffusion-webui-state/scripts/state_settings.py b/gf/stable-diffusion-webui-state/scripts/state_settings.py new file mode 100644 index 0000000000000000000000000000000000000000..77becee588fac1f81f9caeb69014258d57ff8181 --- /dev/null +++ b/gf/stable-diffusion-webui-state/scripts/state_settings.py @@ -0,0 +1,105 @@ +import gradio as gr +import modules.shared as shared +from modules import scripts + + +def on_ui_settings(): + + section = ("state", "State") + + shared.opts.add_option("state", shared.OptionInfo([], "Saved main elements", gr.CheckboxGroup, lambda: { + "choices": [ + "tabs" + ] + }, section=section)) + + shared.opts.add_option("state_txt2img", shared.OptionInfo([], "Saved elements from txt2img", gr.CheckboxGroup, lambda: { + "choices": [ + "prompt", + "negative_prompt", + "styles", + "sampling", + "scheduler", + "sampling_steps", + "width", + "height", + "batch_count", + "batch_size", + "cfg_scale", + "seed", + "hires_fix", + "hires_upscaler", + "hires_steps", + "hires_scale", + "hires_resize_x", + "hires_resize_y", + "hires_denoising_strength", + "refiner", + "refiner_checkpoint", + "refiner_switch", + 'upscaler_scale_by_resize', + 'upscaler_scale_by_max_side_length', + 'upscaler_scale_to_w', + 'upscaler_scale_to_h', + 'upscaler_scale_to_crop', + 'upscaler_1', + 'upscaler_2', + 'upscaler_2_visibility', + "script" + ] + }, section=section)) + + shared.opts.add_option("state_img2img", shared.OptionInfo([], "Saved elements from img2img", gr.CheckboxGroup, lambda: { + "choices": [ + "prompt", + "negative_prompt", + "styles", + "refiner", + "refiner_checkpoint", + "refiner_switch", + 'upscaler_scale_by_resize', + 'upscaler_scale_by_max_side_length', + 'upscaler_scale_to_w', + 'upscaler_scale_to_h', + 'upscaler_scale_to_crop', + 'upscaler_1', + 'upscaler_2', + 'upscaler_2_visibility', + "sampling", + "scheduler", + "resize_mode", + "sampling_steps", + "width", + "height", + "batch_count", + "batch_size", + "cfg_scale", + "denoising_strength", + "seed", + "script" + ] + }, section=section)) + + shared.opts.add_option("state_extensions", shared.OptionInfo([], "Saved elements from extensions", gr.CheckboxGroup, lambda: { + "choices": [ + "control-net", + "adetailer", + "multidiffusion", + "dynamic prompting" + ] + }, section=section)) + + 
shared.opts.add_option("state_ui", shared.OptionInfo([ + "Reset Button", + "Import Button", + "Export Button" + ], "State UI", gr.CheckboxGroup, lambda: { + "choices": [ + "Reset Button", + "Import Button", + "Export Button" + ], + }, section=section)) + + +scripts.script_callbacks.on_ui_settings(on_ui_settings) diff --git a/gf/stable-diffusion-webui-state/style.css b/gf/stable-diffusion-webui-state/style.css new file mode 100644 index 0000000000000000000000000000000000000000..9eaccc58ea4fb3ea83ecf5ba838c8ded40c6864f --- /dev/null +++ b/gf/stable-diffusion-webui-state/style.css @@ -0,0 +1,20 @@ +#settings_state_buttons button, +#settings_state_buttons input[type="file"] { + color: white; + background: rgb(249, 115, 22); + border-radius: 8px; + padding: 5px 15px; + font-weight: bold; + font-size: 16px; + margin-left: 5px; + margin-right: 5px; +} + +#settings_state_buttons input[type="file"] { + padding: 3px 5px; +} + +#settings_state hr { + margin-top: 32px; + margin-bottom: 32px; +} \ No newline at end of file