What's new in v4?

#29

by eatmemark - opened 6 days ago

Hi @TenStrip. I saw you updated your 10S-Nodes and released a new workflow. I checked the Git repo and other sources (admittedly not Discord) and couldn't find a list of changes or features anywhere. The workflow itself has a LoRA stacker of some sort, but I'm not certain of its purpose beyond the obvious. Mind if I ask what the highlights are, or whether there are any new features?

Always appreciative of your work.

NeuroAlexon

5 days ago

edited 5 days ago

Hi, @TenStrip. I noticed that the character voice has gotten worse in the v4 workflow. In v3, my female characters had distinct and interesting voices across different seeds; in v4, the voices are very robotic and uniform. I tried to track down the cause: I changed the models (1.2 -> 1.0), the LoRA (from v3), the encoders (from v3), audio_weight (1.0 -> 0.6-0.9), and the prompts, but the voice was still "wrong/robotic/same". Comparing v3 and v4, I see that some nodes have been removed (LTX Latent Anchor Aware, LTX Likeness Anchor), some have been replaced (STG Guider Advanced -> STG Guider; LTX2 Lora Loader Advanced -> LTX LoRA Stack (AV)), and some have different values (for example, sigmas). Can you tell me which nodes, settings, or prompts affect the character voice? Should I treat v4 as a lightweight version rather than a new version and keep using v3? Or am I missing something?

Thank you for your work!

NeuroAlexon

5 days ago

edited 5 days ago

@TenStrip, I (and perhaps many others) would be very grateful if you described, for each version of your workflow, its main features and how it differs from the other versions. For example, I subjectively believe that some previous versions produce better results for my specific purpose. Perhaps I'm wrong, and the newer versions just require more fine-tuning. Grok suggested that your fourth version is simpler and faster to generate, with better video but possibly worse audio. It based this conclusion on the word "Basic" in the title. I'd really like to understand which workflow to use in which situations. If you have the time and inclination, perhaps you could write up such a description.

P.S.
I've learned to get the voice a little closer with positive and negative prompts, but the voices still don't sound like the ones the older versions of the workflow generated without prompts.

TenStrip

Owner 4 days ago

All of that comes from the data change in v1.2, which includes some underlying Echo data. Even though it was at minimal audio weight, it touched the connectors, and their audio in that model is pretty bad, terrible actually. It's not made to sample on those samplers. The tradeoff was getting the enhanced, reinforced motion along with improvements to conditioning and prompt adherence. If it sounds better on the older workflows, that's because of the sigma changes. You can easily copy the 13 string and mirror it to v4 in the STG guider settings; that's the only real difference.
