camenduru/potat1

TextToVideoSDPipeline

# Potat 1️⃣
First Open-Source 1024x576 Text-to-Video Model 🥳

### Info
Prototype Model <br />
Trained with https://lambdalabs.com ❤ 1x A100 (40GB) <br />
2197 clips, 68388 frames tagged with [Salesforce/blip2-opt-6.7b-coco](https://huggingface.co/Salesforce/blip2-opt-6.7b-coco) <br />
train_steps: 10000 <br />

### Dataset & Config
https://huggingface.co/camenduru/potat1_dataset/tree/main

### Repos
https://github.com/Breakthrough/PySceneDetect <br />
https://github.com/ExponentialML/Video-BLIP2-Preprocessor <br />
https://github.com/ExponentialML/Text-To-Video-Finetuning <br />
https://github.com/camenduru/Text-To-Video-Finetuning-colab <br />

### Base Model
https://huggingface.co/damo-vilab/modelscope-damo-text-to-video-synthesis <br />
https://www.modelscope.cn/models/damo/text-to-video-synthesis <br />

Thanks to ModelScope ❤ ExponentialML ❤ @DiffusersLib ❤ @LambdaAPI ❤ @cerspense ❤ @CiaraRowles1 ❤ @p1atdev_art ❤ <br />

Please try it 🐣 <br />
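
A minimal sketch of how trying it might look with 🧨 Diffusers, assuming the repo loads through `DiffusionPipeline.from_pretrained` under the id `camenduru/potat1` and that `torch`, `diffusers` and `accelerate` are installed. The helper names, the prompt, `num_frames=24` and `num_inference_steps=25` are illustrative choices, not settings taken from this card. Only the 1024x576 size comes from the card.

```python
def build_generation_kwargs(prompt, width=1024, height=576,
                            num_frames=24, num_inference_steps=25):
    """Collect and sanity-check the call arguments for the pipeline.

    1024x576 is the resolution named on this card; num_frames and
    num_inference_steps are assumed defaults, tune them freely.
    """
    if not prompt.strip():
        raise ValueError("prompt must not be empty")
    # The pipeline works on latents downsampled 8x, so both sides must be
    # multiples of 8.
    if width % 8 or height % 8:
        raise ValueError("width and height must be multiples of 8")
    return {
        "prompt": prompt,
        "width": width,
        "height": height,
        "num_frames": num_frames,
        "num_inference_steps": num_inference_steps,
    }


def generate_video(prompt, output_path="potat1.mp4",
                   model_id="camenduru/potat1"):
    """Hypothetical helper: render one clip and write it to output_path."""
    # Heavy imports stay local so the helper above can be used without a GPU stack.
    import torch
    from diffusers import DiffusionPipeline
    from diffusers.utils import export_to_video

    pipe = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16)
    pipe.enable_model_cpu_offload()  # needs accelerate; trades speed for VRAM

    frames = pipe(**build_generation_kwargs(prompt)).frames
    # Assumption: newer diffusers releases return a batch of videos, older
    # ones a flat list of frames. Unwrap the first video if batched.
    first = frames[0]
    if isinstance(first, list) or getattr(first, "ndim", 3) == 4:
        frames = first
    return export_to_video(frames, output_path)
```

Calling `generate_video("a potato rolling down a hill")` should then write `potat1.mp4` to the working directory. If VRAM is tight, lowering `num_frames` is usually the first thing to try.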

<video src="https://user-images.githubusercontent.com/54370274/243275155-97282de4-e1df-49a0-851e-cb8b4b040441.mp4" data-canonical-src="https://user-images.githubusercontent.com/54370274/243275155-97282de4-e1df-49a0-851e-cb8b4b040441.mp4" controls="controls" muted="muted" class="d-block rounded-bottom-2 border-top width-fit" style="max-height:640px; min-height: 200px"></video>

Potat 2️⃣ is in the oven ♨ <br />