WenhaoWang committed on
Commit
98c3828
1 Parent(s): c8b1297

Update README.md

Files changed (1)
  1. README.md +85 -0
README.md CHANGED
@@ -1,3 +1,88 @@
  ---
  license: cc-by-nc-4.0
+ datasets:
+ - WenhaoWang/VidProM
+ language:
+ - en
+ pipeline_tag: text-generation
+ tags:
+ - text-to-video generation
+ - VidProM
+ - Automatical text-to-video prompt
  ---
+
+
+ # The first model for automatic text-to-video prompt completion: given a few words as input, the model generates several complete text-to-video prompts.
+
+ # Details
+
+ The model is fine-tuned from [Meta-Llama-3-8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B) on the [VidProM](https://huggingface.co/datasets/WenhaoWang/VidProM) dataset using 8 A100 80G GPUs.
+
+ # Usage
+
+ ## Download the model
+ ```
+ from transformers import pipeline
+
+ # Downloads the weights on first use, then loads them into a text-generation pipeline.
+ pipe = pipeline("text-generation", model="WenhaoWang/Meta-Llama-3-8B-AutoT2VPrompt")
+ ```
+
+ ## Set the parameters
+ ```
+ input = "An underwater world"  # The input text that each generated prompt starts from.
+ max_length = 50  # The maximum total length in tokens, counting both the input and the generated text.
+ temperature = 1.2  # Controls the randomness of the generation. Higher values lead to more random outputs.
+ top_k = 8  # Limits sampling at each step to the k most likely tokens.
+ num_return_sequences = 10  # The number of different text-to-video prompts to generate from the same input.
+ ```
+
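To make `temperature` and `top_k` more concrete, the toy sketch below applies both to a hand-written list of logits and samples one token index. The logits and the function names `top_k_probs` and `sample_token` are hypothetical and chosen only for illustration; this is not the `transformers` implementation, just the general idea under those assumptions.

```python
import math
import random


def top_k_probs(logits, temperature, top_k):
    """Return {token_index: probability} after temperature scaling and top-k filtering."""
    # Keep only the indices of the top_k largest logits.
    keep = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:top_k]
    # Divide by the temperature: values above 1 flatten the distribution.
    scaled = {i: logits[i] / temperature for i in keep}
    # Softmax over the surviving tokens (subtract the max for numerical stability).
    peak = max(scaled.values())
    exps = {i: math.exp(v - peak) for i, v in scaled.items()}
    total = sum(exps.values())
    return {i: e / total for i, e in exps.items()}


def sample_token(logits, temperature, top_k, rng=random):
    """Draw one token index from the filtered, rescaled distribution."""
    probs = top_k_probs(logits, temperature, top_k)
    tokens = list(probs)
    return rng.choices(tokens, weights=[probs[t] for t in tokens], k=1)[0]


toy_logits = [2.0, 1.0, 0.5, -1.0, 3.0]  # made-up scores for a 5-token vocabulary
print(top_k_probs(toy_logits, temperature=1.2, top_k=2))
print(sample_token(toy_logits, temperature=1.2, top_k=2))
```

With `top_k=2`, only the two highest-scoring indices can ever be drawn, and raising `temperature` moves their probabilities closer together.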
+ ## Generation
+ ```
+ all_prompts = pipe(input, max_length=max_length, do_sample=True, temperature=temperature,
+                    top_k=top_k, num_return_sequences=num_return_sequences)
+
+ def process(text):
+     # Treat line breaks as sentence boundaries.
+     text = text.replace('\n', '.')
+     text = text.replace(' .', '.')
+     # Cut at the last period to drop a trailing, possibly unfinished sentence.
+     end = text.rfind('.')
+     if end != -1:  # rfind returns -1 when there is no period; keep the text whole then
+         text = text[:end]
+     return text + '.'
+
+ for result in all_prompts:
+     print(process(result['generated_text']))
+ ```
+
+ This prints 10 text-to-video prompts, from which you can pick the one you like best. Because the output is sampled, each run gives different prompts, for example:
+
+ ```
+ An underwater world, 25 ye boy, with aqua-green eyes, dk sandy blond hair, from the back, and on his back a fish, 23 ye old, weing glasses,ctoon chacte.
+ An underwater world, the video should capture the essence of tranquility and the beauty of nature.. a woman with short hair weing a green dress sitting at the desk.
+ An underwater world, the ocean is full of discded items, the water flows, and the light penetrating through the water.
+ An underwater world.. a woman with red eyes and red lips is looking forwd.
+ An underwater world.. an old man sitting in a chair, smoking a pipe, a little smoke coming out of the chair, a man is drinking a glass.
+ An underwater world. The ocean is filled with bioluminess as the water reflects a soft glow from a bioluminescent phosphorescent light source. The camera slowly moves away and zooms in..
+ An underwater world. the girl looks at the camera and smiles with happiness..
+ An underwater world, 1960s horror film..
+ An underwater world.. 4 men in 1940s style clothes walk ound a gothic castle. night, fe. A girl is running, and there e some flowers along the river.
+ An underwater world, -camera pan up . A girl is playing with her cat on a sunny day in the pk. A man is running and then falling down and dying.
+ ```
+
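The post-processing step can be tried without loading the model. The sketch below re-implements the same string operations on a hand-written raw string; the raw text and the name `trim_prompt` are made up for illustration. It shows how a line break right after a period becomes `..`, which may explain the doubled periods in some of the prompts above. The `if end != -1` guard covers text that contains no period at all.

```python
def trim_prompt(text):
    """Hypothetical stand-in for the README's post-processing step."""
    text = text.replace('\n', '.')   # line breaks become sentence boundaries
    text = text.replace(' .', '.')   # remove a space left before a period
    end = text.rfind('.')            # position of the last period, or -1 if none
    if end != -1:
        text = text[:end]            # drop the trailing, possibly unfinished sentence
    return text + '.'


raw = "An underwater world.\n a girl smiles at the camera. The cam"
print(trim_prompt(raw))
# An underwater world.. a girl smiles at the camera.
```

The unfinished fragment `The cam` is discarded, and the newline after the first sentence is what produces the doubled period.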
+ # License
+
+ The model is licensed under the [CC BY-NC 4.0 license](https://creativecommons.org/licenses/by-nc/4.0/deed.en).
+
+ # Citation
+ ```
+ @article{wang2024vidprom,
+   title={VidProM: A Million-scale Real Prompt-Gallery Dataset for Text-to-Video Diffusion Models},
+   author={Wang, Wenhao and Yang, Yi},
+   journal={arXiv preprint arXiv:2403.06098},
+   year={2024}
+ }
+ ```
+
+ # Acknowledgment
+
+ [Yaowei Zheng](https://github.com/hiyouga) helped with the fine-tuning process.
+
+ # Contact
+
+ If you have any questions, feel free to contact [Wenhao Wang](https://wangwenhao0716.github.io) (wangwenhao0716@gmail.com).