---
license: cc-by-nc-4.0
datasets:
- WenhaoWang/VidProM
language:
- en
pipeline_tag: text-generation
---

# The first model for automatic text-to-video prompt generation

The model is fine-tuned from [Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1) on the [VidProM](https://huggingface.co/datasets/WenhaoWang/VidProM) dataset.

# Usage

## Download the model

```
from transformers import pipeline

# Downloads the weights from the Hugging Face Hub on first use and caches them locally.
pipe = pipeline("text-generation", model="WenhaoWang/AutoT2VPrompt")
```

## Set the Parameters

```
input = "An underwater world"  # The input text to expand into a text-to-video prompt.
max_length = 50  # The maximum length of the output in tokens, including the input text.
temperature = 1.2  # Controls the randomness of the generation. Higher values lead to more random outputs.
top_k = 8  # Restricts sampling at each step to the k most likely tokens.
num_return_sequences = 10  # The number of different text-to-video prompts to generate from the same input.
```
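
To make the `temperature` and `top_k` settings more concrete, here is a minimal pure-Python sketch of how the two interact when one token is sampled. It is a simplified illustration under stated assumptions, not the code that `transformers` runs: the function name `sample_top_k` and the five toy scores are hypothetical.

```python
import math
import random

def sample_top_k(logits, top_k, temperature, rng=random):
    """Sample one token index using top-k filtering and temperature scaling."""
    # Keep only the indices of the top_k highest-scoring tokens.
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:top_k]
    # Divide by the temperature: values above 1 flatten the distribution.
    scaled = [logits[i] / temperature for i in top]
    # Softmax weights over the surviving tokens (max subtracted for stability).
    peak = max(scaled)
    weights = [math.exp(s - peak) for s in scaled]
    return rng.choices(top, weights=weights, k=1)[0]

# Hypothetical scores for a five-token vocabulary.
toy_logits = [2.0, 1.0, 0.5, -1.0, -3.0]
rng = random.Random(0)
picks = [sample_top_k(toy_logits, top_k=2, temperature=1.2, rng=rng) for _ in range(100)]
print(sorted(set(picks)))  # only indices 0 and 1 can ever be drawn
```

With `top_k = 8` the model chooses among at most eight candidate tokens per step, and `temperature = 1.2` makes the less likely candidates somewhat more probable than they would be at the default of 1.0.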

## Generation

```
all_prompts = pipe(input, max_length=max_length, do_sample=True, temperature=temperature,
                   top_k=top_k, num_return_sequences=num_return_sequences)

def process(text):
    text = text.replace('\n', '.')
    text = text.replace(' .', '.')
    # Drop the unfinished sentence after the last period, if there is one.
    end = text.rfind('.')
    if end != -1:
        text = text[:end]
    return text + '.'

for prompt in all_prompts:
    print(process(prompt['generated_text']))
```

This produces 10 text-to-video prompts, from which you can pick the one you like most. Because the outputs are sampled, your results will differ from the example below.

```
An underwater world, with water and a coral reef with sunlight shining, 3d animation, real world, 4k .
An underwater world. In a stunning close up shot, the intricacies of the alien landscape are captured in exquisite detail. The cinematic quality is exceptional, with every aspect captured in a aspect ratio.
An underwater world where bioluminescent organisms thrive.
An underwater world. The girl is standing in the garden in autumn. The girl is holding a book in her hands. She has short brown hair. The girl is wearing a long dress.
An underwater world. The ocean is a vast, dark blue in deep blue. A creature with a single eye is in the distance, emitting blue and green light.
An underwater world with bioluminescent creatures. The water is filled with glowing plants and colorful fish, all swimming in a vibrant and diverse sea. The creatures are playful and mischievous..
An underwater world.. A young woman, SARA, is sitting at a table, looking distressed. She appears worried and anxious.
An underwater world, . the man is talking to other people.
An underwater world, a vibrant coral reef teeming with marine life and a symphony of colors that thrived beneath the waves..
An underwater world, 1968 film, 32k resolution, sharp, captured by Phantom High-Speed Camera, dynamic dramatic lighting, highly detailed cinematic characters, hyper-detailed, insan.
```
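
Some of the example prompts end in a doubled period or contain a stray `, .`. If that matters for your text-to-video model, a slightly stricter clean-up step could be applied to each `generated_text`. The sketch below is one possible variant; `clean_prompt` is a hypothetical helper, not part of the released code, and the sample string is made up for illustration.

```python
import re

def clean_prompt(text):
    """Hypothetical variant of `process` that also tidies punctuation."""
    text = text.replace("\n", ".")
    # Remove whitespace before a period, then collapse runs of periods into one.
    text = re.sub(r"\s+\.", ".", text)
    text = re.sub(r"\.{2,}", ".", text)
    # Cut at the last period; keep the whole text when there is none.
    end = text.rfind(".")
    if end != -1:
        text = text[:end]
    # Strip trailing spaces and commas before adding the final period.
    return text.rstrip(" ,") + "."

raw = "An underwater world, 4k .\nThe reef glows"
print(clean_prompt(raw))  # prints: An underwater world, 4k.
```

Like `process`, this discards any unfinished sentence after the last period, so a prompt that was cut off by `max_length` still ends cleanly.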

# License

The model is licensed under the [CC BY-NC 4.0 license](https://creativecommons.org/licenses/by-nc/4.0/deed.en).

# Citation

```
@article{wang2024vidprom,
  title={VidProM: A Million-scale Real Prompt-Gallery Dataset for Text-to-Video Diffusion Models},
  author={Wang, Wenhao and Yang, Yi},
  journal={arXiv preprint arXiv:2403.06098},
  year={2024}
}
```

# Acknowledgment

[Yaowei Zheng](https://github.com/hiyouga) helped with the fine-tuning process.

# Contact

If you have any questions, feel free to contact [Wenhao Wang](https://wangwenhao0716.github.io) (wangwenhao0716@gmail.com).