Workflow : I2V & T2V - transfer body movements from reference video with IC-Union-Control lora

#97
by RuneXX - opened

Works best for slower, less action-packed body motion and for close-ups.
Despite the example video above, it might not be as good for fast TikTok routines or dance/fight scenes (but I will also do a Dev model workflow with more steps that might handle this part better).

In the workflow you can choose between LTX audio, custom audio input, or using the audio from the input video.
And there is one workflow with SD Pose and another with DW Pose.

The closer the body position and body size in the input image are to the first frame of the input video, the better the results.
I will make a workflow that uses Flux Klein to automagically re-pose the input image to match the first frame of the input video, but that will come later.
For now you can easily do this yourself with Qwen Image Edit or Flux Klein.

(Images used in the example video above are random picks from the PromptHero website. Credit to the image creators.)

Feel free to play around with it ;-)

https://huggingface.co/RuneXX/LTX-2.3-Workflows/tree/main/Control-reference

I've been using your workflows for a while now and love the hard work you put into these, thanks for that. I tried using this workflow and I must be missing something. When I run it, it generates the motion video just fine but doesn't go any further to generate the final video using that motion tracking video. I've tried so many things to get it to work. What am I missing here? Is there a YouTube tutorial video showing how to use it? I searched YouTube but could not find anything to help. I tried both the SD Pose and DW Pose workflows. I even downloaded them again last night to see if anything had changed, but I get the same results.

Thanks for any info on this.

Check out a crazy talking head video I did using one of your other workflows, it's called: "Hello, How are you doing?" @ https://www.youtube.com/watch?v=A1Lf9boQqCE

Oh the talking head youtube video looks really good ;-) nice quality .. that looks pretty amazing

"but doesn't go any further to generate the final video using that motion tracking video."

Do you get an error message in Comfy? Maybe the mask/pose video is the wrong size, or something like that? Or is it complaining about a wrong frame count for the pose video?
I'll fix it, but if you can point out where it fails for you, it's easier to narrow down where it went wrong ;-)

(LTX is very sensitive to the size and number of frames of the pose video)

Thanks, I had fun making that test video, with more to come!

No error message in Comfy. I used a 1024x1024 video for the motion reference, and a 1024x1024 image for the video generation. I didn't change anything else besides the prompt, and all I did there was describe the image itself. The pose video was 10 seconds long (10 x 24 fps = 240 frames). Comfy runs the pose video part first like it should and creates the motion video with a black background and dots for the motion, just like the Joker example above, but it stops running right after that part and never continues to the final generation parts... I found the nodes "set reference video" and "get reference video". Do I need to separate your workflow into two workflows, and set the reference video using a load video node? I'll do a screen recording tonight to show you exactly what happens when I run it, if that'll help.
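Since LTX is described above as very sensitive to the size and frame count of the pose video, a quick pre-check of the numbers can help rule that out. The sketch below is only an illustration: it assumes the constraints published for earlier LTX-Video releases (frame counts of the form 8n + 1, width and height divisible by 32) also apply here, which nobody in this thread has confirmed, and nothing here establishes that this is the cause of the stalled run. The function names are hypothetical and are not part of the workflow or of ComfyUI.

```python
# Hypothetical helper: checks a pose video against constraints that are
# ASSUMED from earlier LTX-Video guidance, not confirmed for this workflow.

FRAME_STEP = 8      # assumed: frame count must be 8n + 1
SIZE_MULTIPLE = 32  # assumed: width and height must be divisible by 32


def nearest_valid_frame_count(frames: int, step: int = FRAME_STEP) -> int:
    """Return the closest frame count of the form step * n + 1."""
    if frames < 1:
        raise ValueError("frames must be at least 1")
    n = (frames - 1 + step // 2) // step  # integer rounding to nearest n
    return n * step + 1


def check_pose_video(width: int, height: int, frames: int) -> list[str]:
    """Return a list of human-readable problems (empty if none found)."""
    problems = []
    if width % SIZE_MULTIPLE or height % SIZE_MULTIPLE:
        problems.append(
            f"size {width}x{height} is not divisible by {SIZE_MULTIPLE}"
        )
    if (frames - 1) % FRAME_STEP:
        suggestion = nearest_valid_frame_count(frames)
        problems.append(
            f"{frames} frames is not of the form {FRAME_STEP}n + 1; "
            f"nearest valid count is {suggestion}"
        )
    return problems


if __name__ == "__main__":
    # The setup from the post above: 1024x1024, 10 s at 24 fps.
    for problem in check_pose_video(1024, 1024, 10 * 24):
        print(problem)
```

Under these assumed rules, 1024x1024 passes the size check, while 240 frames would be flagged and 241 suggested instead. Whether the workflow already corrects the frame count internally is not stated in the thread.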

@RuneXX is it possible to add Canny edge, or something else that can extract only the person, to the LTX-2.3_-_IV2V_TV2V_transfer_body_movements_IC-Union-Control-lora_DWPose workflow, please (possibly switchable)?

Everything is working well with the current workflow, but when I use a character LoRA to keep a consistent likeness with the reference image throughout the video, it works but I get a lot of noise and distortion in the background. It's like black blobs or patches that appear randomly in the background.

I've tried different LoRA and union control strengths, but I can't seem to find a good combination that removes the background noise.

Or would you have any advice on how to use a character/person-trained LoRA with the workflow, please? Or should I use a different workflow?

It wouldn't be a big change to make a version that has depth and canny instead ;-) Will add.

I was also thinking of making a version where you can mask, so you can put the new subject into the scene of the video (replace character), as well as a version where the person from the video is placed in a new scene (aka replace background, keep character). A bit like Wan Animate. Will give that a try also ;-) now that Sam-3 is added to native Comfy. Hopefully it works OK, even though it would be cheating a bit by passing the first frame to Klein or similar.

That would be awesome thank you!
