pengxiang
/

TrackDiffusion_SVD_Stage1

	---
	pipeline_tag: text-to-video
	license: other
	license_link: LICENSE
	---

	# TrackDiffusion Model Card

	TrackDiffusion is a diffusion model that takes tracklets as conditions and generates a video from them.
	![framework](https://github.com/pixeli99/TrackDiffusion/assets/46072190/56995825-0545-4adb-a8dd-53dfa736517b)

	## Model Details

	### Model Description

	TrackDiffusion is a novel video generation framework that enables fine-grained control over complex dynamics in video synthesis by conditioning the generation process on object trajectories.
	This approach allows precise manipulation of object trajectories and interactions, addressing the challenges of handling object appearance, disappearance, and scale changes while keeping objects consistent across frames.
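
To make the idea of a tracklet concrete, here is a minimal sketch of how one object's trajectory could be represented. This model card does not specify the input format, so the `Tracklet` class, the pixel box layout `(x_min, y_min, x_max, y_max)`, and the `normalize_box` helper are all hypothetical and are not part of the TrackDiffusion code. A `None` entry marks a frame where the object is absent, which covers appearance and disappearance, and the box area gives a rough view of scale changes.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Hypothetical box layout (an assumption): (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[float, float, float, float]


@dataclass
class Tracklet:
    """One object's trajectory; illustrative only, not TrackDiffusion's real format."""

    track_id: int
    category: str
    boxes: List[Optional[Box]]  # one entry per frame, None while the object is absent

    def visible_frames(self) -> List[int]:
        """Indices of the frames in which the object is present."""
        return [i for i, box in enumerate(self.boxes) if box is not None]

    def area(self, frame: int) -> float:
        """Box area in square pixels, or 0.0 if the object is absent in that frame."""
        box = self.boxes[frame]
        if box is None:
            return 0.0
        x0, y0, x1, y1 = box
        return max(0.0, x1 - x0) * max(0.0, y1 - y0)


def normalize_box(box: Box, width: int, height: int) -> Box:
    """Scale pixel coordinates by the frame size so they fall in [0, 1]."""
    x0, y0, x1, y1 = box
    return (x0 / width, y0 / height, x1 / width, y1 / height)


# A car that appears in frame 1, grows in frame 2, and leaves before frame 3.
car = Tracklet(
    track_id=0,
    category="car",
    boxes=[None, (10.0, 20.0, 50.0, 60.0), (20.0, 20.0, 100.0, 100.0), None],
)
```

Keeping one entry per frame means the list index is the frame index, so a conditioning signal built from such a structure would stay aligned with the generated frames.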

	## Uses

	### Direct Use

	We provide the weights for the entire UNet, so you can swap it into a diffusers pipeline, for example:

	```python
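	import torch
	from diffusers import StableVideoDiffusionPipeline, UNetSpatioTemporalConditionModel

	# Base Stable Video Diffusion checkpoint; it supplies every component except the UNet.
	pretrained_model_path = "stabilityai/stable-video-diffusion-img2vid"
	# Load the TrackDiffusion UNet weights from a local directory.
	unet = UNetSpatioTemporalConditionModel.from_pretrained("/path/to/unet", torch_dtype=torch.float16)
	# Build the pipeline with the TrackDiffusion UNet in place of the default one.
	pipe = StableVideoDiffusionPipeline.from_pretrained(
	    pretrained_model_path,
	    unet=unet,
	    torch_dtype=torch.float16,
	    variant="fp16",
	    low_cpu_mem_usage=True,
	)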
	```