Update README.md
README.md
@@ -6,4 +6,50 @@ tags:
|
|
6 |
- stable-diffusion
|
7 |
- text-to-image
|
8 |
- diffusers
|
9 |
-
---
6 |
- stable-diffusion
|
7 |
- text-to-image
|
8 |
- diffusers
|
9 |
+
---
|
10 |
+
|
11 |
+
### Introduction:
|
12 |
+
Meet General Bagan, a text-to-image generator fine-tuned on a dataset of over 200 images of Bagan. It translates text prompts into images ranging from lifelike nature scenes to abstract compositions.
|
13 |
+
|
14 |
+
### Problem Statement:
|
15 |
+
When we prompted the base Stable Diffusion model to generate an image of Bagan, it produced an image of a pagoda from Thailand instead. We therefore fine-tuned Stable Diffusion on a collection of Bagan photos so that the model renders Bagan more accurately.
|
16 |
+
|
17 |
+
### How to create a prompt:
|
18 |
+
When writing a prompt for Bagan, consider six keyword categories: Subject, Medium, Style, Art-sharing website, Resolution, and Additional details.
|
19 |
+
|
20 |
+
Subject -> The subject is what you want to see in the picture. A common mistake is not describing the subject in enough detail.
|
21 |
+
|
22 |
+
Medium -> The medium is the material or technique used to make the artwork. Illustration, oil painting, 3D rendering, and photography are a few examples. The medium has a strong effect because a single keyword can change the style dramatically.
|
23 |
+
|
24 |
+
Style -> The style is the artistic style of the image. Pop art, impressionist, and surrealist are a few examples.
|
25 |
+
|
26 |
+
Art-sharing website -> Specialist art sites such as DeviantArt and ArtStation host large numbers of images across many genres. Naming one of them in the prompt steers the image toward the styles found there.
|
27 |
+
|
28 |
+
Resolution -> Resolution keywords describe how sharp and detailed the image should be.
|
29 |
+
|
30 |
+
Additional details -> Additional details are modifiers that adjust the overall look of the image, for example elements that give it a more dystopian, sci-fi feel.
|
31 |
+
|
32 |
+
An example prompt for General Bagan: bagan, a creepy and eery Halloween setting, with Jack o lanterns on the street and shadow figures lurking about, dynamic lighting, photorealistic fantasy concept art, stunning visuals, creative, cinematic, ultra detailed, trending on art station, spooky vibe.
|
33 |
+
This prompt produces a Halloween-themed image of Bagan.
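
The six keyword categories above can be treated as slots that are filled in and then joined into one prompt string. The following is a minimal sketch of that idea, not part of the model itself. The names `PromptParts` and `build_prompt` are hypothetical, and the way phrases from the Halloween prompt are assigned to categories is one possible reading rather than an official mapping.

```python
from dataclasses import dataclass


@dataclass
class PromptParts:
    """The six prompt categories from this README (field names are illustrative)."""
    subject: str
    medium: str = ""
    style: str = ""
    art_site: str = ""
    resolution: str = ""
    details: str = ""


def build_prompt(parts: PromptParts) -> str:
    """Join the non-empty categories into one comma-separated prompt string."""
    ordered = [
        parts.subject,
        parts.medium,
        parts.style,
        parts.art_site,
        parts.resolution,
        parts.details,
    ]
    # Skip categories that were left empty and trim stray whitespace.
    return ", ".join(p.strip() for p in ordered if p.strip())


# Assumed split of the Halloween example into the six categories.
halloween = PromptParts(
    subject="bagan, a creepy and eery Halloween setting, with Jack o lanterns on the street",
    medium="photorealistic fantasy concept art",
    style="cinematic",
    art_site="trending on art station",
    resolution="ultra detailed",
    details="dynamic lighting, spooky vibe",
)
prompt = build_prompt(halloween)
print(prompt)
```

The resulting string can be passed as the prompt to whichever text-to-image pipeline is used to load the model. Keeping the categories separate makes it easy to swap a single one, such as the medium, and compare the results.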
|
34 |
+
|
35 |
+
### Data:
|
36 |
+
We fine-tuned the Stable Diffusion v1.5 model on 223 pictures of Bagan.
|
37 |
+
|
38 |
+
### Contributors:
|
39 |
+
Main Contributor: [Ye Bhone Lin](https://github.com/Ye-Bhone-Lin), Supervisor: Sa Phyo Thu Htet, Contributors: Thant Htoo San, Min Phone Thit
|
40 |
+
|
41 |
+
### Limitation:
|
42 |
+
The model cannot generate photos of humans.
|
43 |
+
|
44 |
+
### Other Work:
|
45 |
+
Our other image-generation work covers architectural landmarks of Myanmar such as Ananda, Shwezigon, Bupaya, Thatbyinnyu, and Mraukoo. Each of these structures reflects the cultural and historical heritage of the region, and we render them with General Bagan.
|
46 |
+
|
47 |
+
### References:
|
48 |
+
Wikipedia (2022). Stable Diffusion. Retrieved from https://en.wikipedia.org/wiki/Stable_Diffusion
|
49 |
+
|
50 |
+
Rombach, R., Blattmann, A., Lorenz, D., Esser, P., & Ommer, B. (2022). High-Resolution Image Synthesis with Latent Diffusion Models. Retrieved from https://arxiv.org/abs/2112.10752
|
51 |
+
|
52 |
+
Brown, N. (2022). What is Stable Diffusion and How to Use it. Retrieved from https://www.fotor.com/blog/what-is-stable-diffusion
|
53 |
+
|
54 |
+
Mishra, O. (June 9). Stable Diffusion Explained. Medium. Retrieved from https://medium.com/@onkarmishra/stable-diffusion-explained-1f101284484d
|
55 |
+
|