<!--Copyright 2023 The HuggingFace Team. All rights reserved.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
-->

<p align="center">
    <br>
    <img src="https://raw.githubusercontent.com/huggingface/diffusers/77aadfee6a891ab9fcfb780f87c693f7a5beeb8e/docs/source/imgs/diffusers_library.jpg" width="400"/>
    <br>
</p>

# 🧨 Diffusers

🤗 Diffusers provides pretrained vision and audio diffusion models, and serves as a modular toolbox for inference and training.

More precisely, 🤗 Diffusers offers:

- State-of-the-art diffusion pipelines that run inference in just a few lines of code (see [**Using Diffusers**](./using-diffusers/conditional_image_generation)). For an overview of all supported pipelines and their corresponding papers, see [**Pipelines**](#pipelines).
- A variety of noise schedulers that can be used interchangeably to trade off speed against quality at inference time. For details, see [**Schedulers**](./api/schedulers/overview).
- Multiple types of models, such as UNet, that can be used as building blocks of an end-to-end diffusion system. For details, see [**Models**](./api/models).
- Training examples that show how to train models for the most popular diffusion tasks. For details, see [**Training**](./training/overview).

## 🧨 Diffusers Pipelines

The following table summarizes all officially supported pipelines, their corresponding papers, and, where available, a Colab notebook for trying them out directly.

| Pipeline | Paper | Tasks | Colab |
|---|---|:---:|:---:|
| [alt_diffusion](./api/pipelines/alt_diffusion) | [**AltDiffusion**](https://arxiv.org/abs/2211.06679) | Image-to-Image Text-Guided Generation |
| [audio_diffusion](./api/pipelines/audio_diffusion) | [**Audio Diffusion**](https://github.com/teticio/audio-diffusion.git) | Unconditional Audio Generation | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/teticio/audio-diffusion/blob/master/notebooks/audio_diffusion_pipeline.ipynb)
| [cycle_diffusion](./api/pipelines/cycle_diffusion) | [**Cycle Diffusion**](https://arxiv.org/abs/2210.05559) | Image-to-Image Text-Guided Generation |
| [dance_diffusion](./api/pipelines/dance_diffusion) | [**Dance Diffusion**](https://github.com/williamberman/diffusers.git) | Unconditional Audio Generation |
| [ddpm](./api/pipelines/ddpm) | [**Denoising Diffusion Probabilistic Models**](https://arxiv.org/abs/2006.11239) | Unconditional Image Generation |
| [ddim](./api/pipelines/ddim) | [**Denoising Diffusion Implicit Models**](https://arxiv.org/abs/2010.02502) | Unconditional Image Generation |
| [latent_diffusion](./api/pipelines/latent_diffusion) | [**High-Resolution Image Synthesis with Latent Diffusion Models**](https://arxiv.org/abs/2112.10752)| Text-to-Image Generation | 
| [latent_diffusion](./api/pipelines/latent_diffusion) | [**High-Resolution Image Synthesis with Latent Diffusion Models**](https://arxiv.org/abs/2112.10752)| Super Resolution Image-to-Image | 
| [latent_diffusion_uncond](./api/pipelines/latent_diffusion_uncond) | [**High-Resolution Image Synthesis with Latent Diffusion Models**](https://arxiv.org/abs/2112.10752) | Unconditional Image Generation | 
| [paint_by_example](./api/pipelines/paint_by_example) | [**Paint by Example: Exemplar-based Image Editing with Diffusion Models**](https://arxiv.org/abs/2211.13227) | Image-Guided Image Inpainting | 
| [pndm](./api/pipelines/pndm) | [**Pseudo Numerical Methods for Diffusion Models on Manifolds**](https://arxiv.org/abs/2202.09778) | Unconditional Image Generation | 
| [score_sde_ve](./api/pipelines/score_sde_ve) | [**Score-Based Generative Modeling through Stochastic Differential Equations**](https://openreview.net/forum?id=PxTIG12RRHS) | Unconditional Image Generation | 
| [score_sde_vp](./api/pipelines/score_sde_vp) | [**Score-Based Generative Modeling through Stochastic Differential Equations**](https://openreview.net/forum?id=PxTIG12RRHS) | Unconditional Image Generation | 
| [stable_diffusion](./api/pipelines/stable_diffusion/text2img) | [**Stable Diffusion**](https://stability.ai/blog/stable-diffusion-public-release) | Text-to-Image Generation | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/diffusers/training_example.ipynb)
| [stable_diffusion](./api/pipelines/stable_diffusion/img2img) | [**Stable Diffusion**](https://stability.ai/blog/stable-diffusion-public-release) | Image-to-Image Text-Guided Generation | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/diffusers/image_2_image_using_diffusers.ipynb)
| [stable_diffusion](./api/pipelines/stable_diffusion/inpaint) | [**Stable Diffusion**](https://stability.ai/blog/stable-diffusion-public-release) | Text-Guided Image Inpainting | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/diffusers/in_painting_with_stable_diffusion_using_diffusers.ipynb)
| [stable_diffusion_2](./api/pipelines/stable_diffusion_2) | [**Stable Diffusion 2**](https://stability.ai/blog/stable-diffusion-v2-release) | Text-to-Image Generation | 
| [stable_diffusion_2](./api/pipelines/stable_diffusion_2) | [**Stable Diffusion 2**](https://stability.ai/blog/stable-diffusion-v2-release) | Text-Guided Image Inpainting | 
| [stable_diffusion_2](./api/pipelines/stable_diffusion_2) | [**Stable Diffusion 2**](https://stability.ai/blog/stable-diffusion-v2-release) | Text-Guided Super Resolution Image-to-Image |
| [stable_diffusion_safe](./api/pipelines/stable_diffusion_safe) | [**Safe Stable Diffusion**](https://arxiv.org/abs/2211.05105) | Text-Guided Generation | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/ml-research/safe-latent-diffusion/blob/main/examples/Safe%20Latent%20Diffusion.ipynb)
| [stochastic_karras_ve](./api/pipelines/stochastic_karras_ve) | [**Elucidating the Design Space of Diffusion-Based Generative Models**](https://arxiv.org/abs/2206.00364) | Unconditional Image Generation | 
| [unclip](./api/pipelines/unclip) | [Hierarchical Text-Conditional Image Generation with CLIP Latents](https://arxiv.org/abs/2204.06125) | Text-to-Image Generation |
| [versatile_diffusion](./api/pipelines/versatile_diffusion) | [Versatile Diffusion: Text, Images and Variations All in One Diffusion Model](https://arxiv.org/abs/2211.08332) | Text-to-Image Generation | 
| [versatile_diffusion](./api/pipelines/versatile_diffusion) | [Versatile Diffusion: Text, Images and Variations All in One Diffusion Model](https://arxiv.org/abs/2211.08332) | Image Variations Generation | 
| [versatile_diffusion](./api/pipelines/versatile_diffusion) | [Versatile Diffusion: Text, Images and Variations All in One Diffusion Model](https://arxiv.org/abs/2211.08332) | Dual Image and Text Guided Generation | 
| [vq_diffusion](./api/pipelines/vq_diffusion) | [Vector Quantized Diffusion Model for Text-to-Image Synthesis](https://arxiv.org/abs/2111.14822) | Text-to-Image Generation | 

**Note**: Pipelines are simple examples of how to use the diffusion systems as described in the corresponding papers.