---
license: cc-by-nc-4.0
datasets:
- WenhaoWang/VidProM
language:
- en
pipeline_tag: text-generation
---

# The first model for automatic text-to-video prompt generation. It is fine-tuned from [Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1) on [VidProM](https://huggingface.co/datasets/WenhaoWang/VidProM).

# Usage

## Download the model
```
from transformers import pipeline
pipe = pipeline("text-generation", model="WenhaoWang/AutoT2VPrompt")
```

## Set the Parameters
```
input = "An underwater world"      # The input text from which to generate text-to-video prompts.
max_length = 50                    # The maximum length, in tokens, of the input plus the generated text.
temperature = 1.2                  # Controls the randomness of the generation. Higher values lead to more random outputs.
top_k = 8                          # Limits sampling at each step to the k most likely tokens.
num_return_sequences = 10          # The number of different text-to-video prompts to generate from the same input.
```
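
As a rough illustration of how `temperature` and `top_k` interact, the sketch below applies top-k filtering and temperature scaling to a toy list of logits and then samples one token index. It is a simplified stand-in, not the `transformers` implementation; the function name `top_k_temperature_probs` and the logits are made up for this example.

```python
import math
import random

def top_k_temperature_probs(logits, temperature=1.2, top_k=8):
    """Return sampling probabilities after top-k filtering and temperature scaling."""
    # Keep only the indices of the top_k highest logits.
    keep = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:top_k]
    # Divide by the temperature: values above 1 flatten the distribution.
    scaled = {i: logits[i] / temperature for i in keep}
    # Softmax over the kept tokens (subtract the max for numerical stability).
    peak = max(scaled.values())
    exps = {i: math.exp(v - peak) for i, v in scaled.items()}
    total = sum(exps.values())
    return {i: e / total for i, e in exps.items()}

# Hypothetical logits for a 5-token vocabulary.
toy_logits = [2.0, 1.0, 0.5, -1.0, -3.0]
probs = top_k_temperature_probs(toy_logits, temperature=1.2, top_k=3)

# Draw the next token index according to the filtered probabilities.
next_token = random.choices(list(probs), weights=list(probs.values()), k=1)[0]
print(probs, next_token)
```

With `top_k=3`, only the three highest-scoring tokens can ever be drawn; raising `temperature` moves probability mass from the top token toward the other two.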

## Generation
```
all_prompts = pipe(input, max_length = max_length, do_sample = True, temperature = temperature, top_k = top_k, num_return_sequences=num_return_sequences)

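def process(text):
    # Join lines into sentences, then cut off the trailing incomplete sentence.
    text = text.replace('\n', '.')
    text = text.replace('  .', '.')
    last_period = text.rfind('.')
    if last_period != -1:  # Keep the whole text if it contains no period.
        text = text[:last_period]
    return text + '.'

for prompt in all_prompts:
    print(process(prompt['generated_text']))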
```

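This produces 10 text-to-video prompts, from which you can pick the one you like most. Because the generation is sampled, your outputs will differ from run to run. For example: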

```
An underwater world, with water and a coral reef with sunlight shining, 3d animation, real world, 4k .
An underwater world. In a stunning close up shot, the intricacies of the alien landscape are captured in exquisite detail. The cinematic quality is exceptional, with every aspect captured in a  aspect ratio.
An underwater world where bioluminescent organisms thrive.
An underwater world. The girl is standing in the garden in autumn. The girl is holding a book in her hands. She has short brown hair. The girl is wearing a long dress.
An underwater world. The ocean is a vast, dark blue in deep blue. A creature with a single eye is in the distance, emitting blue and green light.
An underwater world with bioluminescent creatures. The water is filled with glowing plants and colorful fish, all swimming in a vibrant and diverse sea. The creatures are playful and mischievous..
An underwater world.. A young woman, SARA, is sitting at a table, looking distressed. She appears worried and anxious.
An underwater world, . the man is talking to other people.
An underwater world, a vibrant coral reef teeming with marine life and a symphony of colors that thrived beneath the waves..
An underwater world, 1968 film, 32k resolution, sharp, captured by Phantom High-Speed Camera, dynamic dramatic lighting, highly detailed cinematic characters, hyper-detailed, insan.
```
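
The post-processing step from the Generation section can be tried without loading the model. The sketch below is a standalone variant, named `clean_prompt` here purely for illustration, that replaces newlines with periods, trims the text to its last complete sentence, and leaves text without any period intact. The raw strings are hand-written stand-ins, not real model output.

```python
def clean_prompt(text):
    """Trim a generated prompt to its last complete sentence."""
    # Turn line breaks into sentence boundaries and tidy stray spacing.
    text = text.replace('\n', '.').replace('  .', '.')
    # Drop everything after the last period, if there is one.
    last_period = text.rfind('.')
    if last_period != -1:
        text = text[:last_period]
    return text + '.'

# Hand-written example: the second line is cut off mid-word, as happens
# when generation stops at max_length.
raw = "An underwater world, a coral reef\nSunlight shines through the wat"
print(clean_prompt(raw))                    # An underwater world, a coral reef.
print(clean_prompt("An underwater world"))  # An underwater world.
```

The truncated second sentence is dropped, which is why the listed prompts usually end on a full stop rather than mid-word.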

# License

The model is licensed under the [CC BY-NC 4.0 license](https://creativecommons.org/licenses/by-nc/4.0/deed.en).

# Citation
```
@article{wang2024vidprom,
  title={VidProM: A Million-scale Real Prompt-Gallery Dataset for Text-to-Video Diffusion Models},
  author={Wang, Wenhao and Yang, Yi},
  journal={arXiv preprint arXiv:2403.06098},
  year={2024}
}
```

# Acknowledgment

The fine-tuning was carried out with help from [Yaowei Zheng](https://github.com/hiyouga).

# Contact

If you have any questions, feel free to contact Wenhao Wang (wangwenhao0716@gmail.com).