It's like InstantID, but you get a video instead. Nothing crazy here, it's simply a shortcut between two demos.
Here's how it works with the Gradio API:
1. We call InstantX/InstantID with a conditioning pose from a cinematic camera shot (example provided in the demo)
2. Then we send the previously generated image to ali-vilab/i2vgen-xl
Note that generation can take quite a while, so take the opportunity to brew yourself some coffee 😌 If you want to skip the queue, you can of course reproduce this pipeline manually.
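If you'd rather script the two steps yourself, here is a minimal sketch of the chaining logic in Python. Only the two Space IDs come from the description above. The function names (`face_to_video`, `make_space_caller`), the argument order, and the idea that the video step also receives the prompt are assumptions made for illustration; the real endpoint names and parameters of each Space are not shown here and have to be looked up.

```python
from typing import Callable

# Space IDs taken from the description above.
INSTANTID_SPACE = "InstantX/InstantID"
I2VGEN_SPACE = "ali-vilab/i2vgen-xl"


def face_to_video(
    face_path: str,
    pose_path: str,
    prompt: str,
    generate_image: Callable[[str, str, str], str],
    animate_image: Callable[[str, str], str],
) -> str:
    """Chain the two steps: build an image first, then turn it into a video.

    Both callables are hypothetical wrappers; their signatures are an
    assumption of this sketch, not the Spaces' actual API.
    """
    # Step 1: identity-preserving image, conditioned on the pose image.
    image_path = generate_image(face_path, pose_path, prompt)
    # Step 2: feed the image produced in step 1 to the image-to-video model.
    return animate_image(image_path, prompt)


def make_space_caller(space_id: str, api_name: str) -> Callable[..., object]:
    """Return a function that forwards its arguments to one Space endpoint.

    `api_name` must be the endpoint's real name, which this sketch does not know.
    """
    def call(*args: object) -> object:
        # Imported lazily so the chaining logic above works without the package
        # (install with `pip install gradio_client`).
        from gradio_client import Client

        client = Client(space_id)
        return client.predict(*args, api_name=api_name)

    return call
```

To plug in the real Spaces, `Client(space_id).view_api()` prints the endpoints and parameters each one actually exposes. You would then wrap `make_space_caller(...)` in small adapters that match the two callables above. Depending on your `gradio_client` version, file arguments may need to be wrapped with `gradio_client.handle_file`, and `predict` may return a tuple or dict rather than a single path.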