kyujinpy committed
Commit d043139 · 1 Parent(s): a338263

Upload README.md

Files changed (1): README.md (+53, -0)

README.md:
---
license: mit
base_model: Bingsu/my-korean-stable-diffusion-v1-5
training_prompt: A rabbit is eating a watermelon on the table
tags:
- tune-a-video
- text-to-video
- diffusers
- korean
inference: false
---

# Tune-A-VideKO - Korean Stable Diffusion v1-5

## Model Description
- Base model: [Bingsu/my-korean-stable-diffusion-v1-5](https://huggingface.co/Bingsu/my-korean-stable-diffusion-v1-5)
- Training prompt: A rabbit is eating a watermelon on the table

![sample-train](samples/rabbit.gif)

## Samples

![sample-500](samples/video4.gif)
Test prompt: 고양이가 해변에서 수박을 먹고 있습니다 (A cat is eating a watermelon at the beach)

![sample-500](samples/video5.gif)
Test prompt: 강아지가 오렌지를 먹고 있습니다 (A puppy is eating an orange)

## Usage

Clone the GitHub repo:

```bash
git clone https://github.com/showlab/Tune-A-Video.git
```
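The original card does not cover environment setup. A minimal sketch, assuming the cloned repo ships a standard `requirements.txt` and that you want xformers for the memory-efficient attention call used below:

```bash
# Assumption: the Tune-A-Video repo root contains a requirements.txt.
cd Tune-A-Video
pip install -r requirements.txt
# Optional: only needed for pipe.enable_xformers_memory_efficient_attention()
pip install xformers
```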

Run the inference code:

```python
from tuneavideo.pipelines.pipeline_tuneavideo import TuneAVideoPipeline
from tuneavideo.models.unet import UNet3DConditionModel
from tuneavideo.util import save_videos_grid
import torch

pretrained_model_path = "Bingsu/my-korean-stable-diffusion-v1-5"
unet_model_path = "kyujinpy/Tune-A-VideoKO-v1-5"

# Load the fine-tuned 3D UNet and attach it to the Tune-A-Video pipeline
# built on the Korean Stable Diffusion v1-5 weights.
unet = UNet3DConditionModel.from_pretrained(unet_model_path, subfolder='unet', torch_dtype=torch.float16).to('cuda')
pipe = TuneAVideoPipeline.from_pretrained(pretrained_model_path, unet=unet, torch_dtype=torch.float16).to("cuda")
pipe.enable_xformers_memory_efficient_attention()

# "A puppy is eating a box in cartoon style"
prompt = "강아지가 만화 스타일로 상자를 먹고 있습니다"
video = pipe(prompt, video_length=8, height=512, width=512, num_inference_steps=50, guidance_scale=12.5).videos

save_videos_grid(video, f"./{prompt}.gif")
```
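For repeatable outputs, a small variation on the call above; this is a sketch, not from the original card, and it assumes `TuneAVideoPipeline.__call__` accepts a `generator` argument in the same way as the diffusers pipelines it is adapted from:

```python
# Hedged sketch: seed the sampler so repeated runs give the same clip, and
# save under an ASCII filename instead of embedding the Korean prompt.
generator = torch.Generator(device="cuda").manual_seed(42)
video = pipe(
    prompt,
    video_length=8,
    height=512,
    width=512,
    num_inference_steps=50,
    guidance_scale=12.5,
    generator=generator,
).videos
save_videos_grid(video, "./sample.gif")
```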

## Related Papers
- [Tune-A-Video](https://arxiv.org/abs/2212.11565): One-Shot Tuning of Image Diffusion Models for Text-to-Video Generation
- [Stable Diffusion](https://arxiv.org/abs/2112.10752): High-Resolution Image Synthesis with Latent Diffusion Models