LTX-2.3 Chinese Drama Character LoRA

A character LoRA for LTX-Video 2.3 (22B), trained on a 78-episode corpus of Chinese historical drama footage. It specialises the base model for live-action, photoreal cinematic generation in Han dynasty / wuxia settings, with stable identity for the most frequent on-screen characters.

Model details

| Field | Value |
| --- | --- |
| Base model | Lightricks/LTX-2.3-22B |
| Adapter type | Plain LoRA (PEFT-style) |
| Rank | 64 |
| Alpha | 64 |
| Target modules | to_k, to_q, to_v, to_out.0 |
| Training steps | 3000 |
| Optimizer | AdamW |
| Learning rate | 1e-4, linear schedule |
| Mixed precision | bf16 |
| Gradient checkpointing | enabled |
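
For orientation, the rank/alpha pair and the learning-rate schedule listed above can be turned into numbers. The sketch below assumes the conventional LoRA scaling factor alpha / rank and a linear decay from the peak rate to zero with no warmup. The card states neither the warmup nor the final rate, so both are assumptions, and the function names are illustrative.

```python
# Values taken from the model details table.
RANK = 64
ALPHA = 64
PEAK_LR = 1e-4
TOTAL_STEPS = 3000


def lora_scale(alpha: float, rank: int) -> float:
    """Conventional LoRA scaling applied to the low-rank update (alpha / rank)."""
    return alpha / rank


def linear_lr(step: int, peak_lr: float = PEAK_LR, total_steps: int = TOTAL_STEPS) -> float:
    """Linear decay from peak_lr at step 0 to 0.0 at total_steps.

    Assumption: no warmup and a final rate of zero; the card only says
    "linear schedule".
    """
    step = min(max(step, 0), total_steps)  # clamp to the training range
    return peak_lr * (1.0 - step / total_steps)


if __name__ == "__main__":
    print("LoRA scale:", lora_scale(ALPHA, RANK))
    for s in (0, 1500, 3000):
        print(f"step {s}: lr = {linear_lr(s):.2e}")
```

Under this convention, rank equal to alpha gives a scale of 1.0, so the adapter weight chosen at inference time is the only multiplier applied to the learned update.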

Training data

  • Source: 78 episodes of a Chinese historical drama (Han dynasty setting), processed at extraction time into ~750 clean shots.
  • Caption format: enriched inline-weave prompts produced by Qwen3-Omni VL, with character identity triggers (char_0_person, char_1_person, ...) substituted at the start of each prompt, followed by camera/scene prose and a style anchor (live-action photorealistic, cinematic Chinese drama).
  • Captions are bilingual (English + Simplified Chinese). The deployed LoRA was trained on the Chinese variants.

Usage

The character triggers char_0_person, char_1_person, etc. correspond to the most frequent identity clusters discovered by ArcFace + DBSCAN over the corpus. Triggers must appear at the start of the prompt, separated by commas and terminated with a period.
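
How cluster labels might map to trigger names can be sketched as follows. This is a hypothetical reconstruction, not the project's pipeline. It assumes DBSCAN-style integer labels (with -1 for noise) have already been computed from face embeddings, and that trigger ids are assigned in descending order of cluster frequency. That ordering is consistent with char_0_person being the most common trigger, but the card does not confirm it.

```python
from collections import Counter


def assign_triggers(cluster_labels: list[int]) -> dict[int, str]:
    """Map cluster labels to char_{id}_person triggers, most frequent cluster first.

    Assumption: labels follow the DBSCAN convention where -1 marks noise,
    and noise detections receive no trigger.
    """
    counts = Counter(label for label in cluster_labels if label != -1)
    # Sort by descending frequency, then by label so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return {label: f"char_{rank}_person" for rank, (label, _) in enumerate(ranked)}


if __name__ == "__main__":
    # Toy labels: cluster 7 appears most often, then 2, then 5; -1 is noise.
    labels = [7, 7, 2, -1, 7, 5, 2, -1]
    print(assign_triggers(labels))
```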

Prompt format

char_0_person, char_1_person. Framed in a static eye level medium shot,
on a 35mm normal lens, with natural light. Set in a torch-lit Han dynasty
courtyard at dusk, the subjects face each other in tense silence.
Live-action photorealistic, cinematic Chinese drama.
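
The layout above (triggers, then camera/scene prose, then the style anchor) is easy to assemble and check programmatically. The helpers below are a minimal sketch with hypothetical names. They assume the trigger prefix is followed by whitespace and that style-only prompts carry no trigger prefix at all.

```python
import re

STYLE_ANCHOR = "Live-action photorealistic, cinematic Chinese drama."
# Comma-separated triggers at the very start, closed by a period and whitespace.
TRIGGER_PREFIX = re.compile(r"^(char_\d+_person(?:, char_\d+_person)*)\.\s")


def build_prompt(character_ids: list[int], body: str, style: str = STYLE_ANCHOR) -> str:
    """Assemble triggers + camera/scene prose + style anchor.

    With no character ids the prompt is style-only and has no trigger prefix.
    """
    parts = []
    if character_ids:
        parts.append(", ".join(f"char_{i}_person" for i in character_ids) + ".")
    parts.append(body.strip())
    parts.append(style)
    return " ".join(parts)


def leading_triggers(prompt: str) -> list[str]:
    """Return the triggers found at the start of the prompt, or [] if none."""
    match = TRIGGER_PREFIX.match(prompt)
    return match.group(1).split(", ") if match else []


if __name__ == "__main__":
    prompt = build_prompt(
        [0, 1],
        "Framed in a static eye level medium shot, on a 35mm normal lens, with natural light.",
    )
    print(prompt)
    print(leading_triggers(prompt))
```

Running leading_triggers on a prompt before generation is a cheap way to catch triggers that were placed mid-sentence, where they fall outside the position used in the training captions.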

Loading with diffusers

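import torch
from diffusers import LTXPipeline
from diffusers.utils import export_to_video

# Load the base model in bf16, matching the training precision
pipe = LTXPipeline.from_pretrained(
    "Lightricks/LTX-2.3-22B",
    torch_dtype=torch.bfloat16,
)
pipe.to("cuda")

# Load the LoRA and set its strength (0.9 is the recommended default)
pipe.load_lora_weights(
    "SyFeee/ltx2.3-chinese-drama-charlora",
    weight_name="lora_weights_step_03000.safetensors",
    adapter_name="cn_drama_char",
)
pipe.set_adapters(["cn_drama_char"], adapter_weights=[0.9])

video = pipe(
    # Trigger first, then camera/scene prose, then the style anchor
    prompt=(
        "char_0_person. Framed in a close-up on a 50mm normal lens, with shallow focus. "
        "Set in a candlelit Han dynasty study, the subject sits writing on bamboo scrolls. "
        "Live-action photorealistic, cinematic Chinese drama."
    ),
    # A negative prompt names the concepts to avoid directly, without a "no" prefix
    negative_prompt="CGI, animation, illustration, painterly style, anime",
    width=1280,
    height=544,
    num_frames=89,
    guidance_scale=4.0,
    num_inference_steps=20,
).frames[0]

# Write the frames to disk; the output path and fps here are illustrative
export_to_video(video, "output.mp4", fps=24)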
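
The resolution and frame count used in the call above (1280 x 544, 89 frames) fit the constraints documented for earlier LTX-Video pipelines: width and height divisible by 32, and a frame count of the form 8k + 1. Whether LTX-2.3 keeps exactly these constraints is an assumption here. The helper below, a hypothetical name, rounds requested values down to the nearest compliant ones.

```python
def snap_generation_size(width: int, height: int, num_frames: int) -> tuple[int, int, int]:
    """Round down to dimensions divisible by 32 and a frame count of the form 8k + 1.

    Assumption: these are the constraints described for earlier LTX-Video
    pipelines; check the base model's documentation before relying on them.
    """
    snapped_width = max(32, (width // 32) * 32)
    snapped_height = max(32, (height // 32) * 32)
    snapped_frames = max(1, ((num_frames - 1) // 8) * 8 + 1)
    return snapped_width, snapped_height, snapped_frames


if __name__ == "__main__":
    # The values from the example are already compliant and pass through unchanged.
    print(snap_generation_size(1280, 544, 89))
    # Arbitrary values are rounded down.
    print(snap_generation_size(1300, 550, 90))
```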

Recommended LoRA strength

  • 0.8: subtle stylistic touch, identity present but soft
  • 0.9: default, validated against training distribution
  • 1.0+: risks overfitting on facial features at the expense of camera/scene freedom
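
To compare these strengths on your own prompts, one option is to render the same prompt once per setting. The sketch below assumes a pipeline object that exposes set_adapters(names, adapter_weights=...) as in the loading example, and a caller-supplied generate function that runs the pipeline with a fixed prompt and seed. Both sweep_strengths and generate are hypothetical names.

```python
from typing import Any, Callable


def sweep_strengths(
    pipe: Any,
    generate: Callable[[Any], Any],
    strengths: tuple[float, ...] = (0.8, 0.9, 1.0),
    adapter_name: str = "cn_drama_char",
) -> dict[float, Any]:
    """Run the same generation once per LoRA strength and collect the results.

    `generate` should keep the prompt and random seed fixed so that the
    adapter weight is the only variable between runs.
    """
    results = {}
    for strength in strengths:
        # Re-weight the already loaded adapter before each run.
        pipe.set_adapters([adapter_name], adapter_weights=[strength])
        results[strength] = generate(pipe)
    return results
```

Keeping the seed fixed inside generate matters: otherwise differences between the outputs reflect sampling noise as much as the adapter weight.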

What this LoRA does well

  • Live-action photoreal output in the Han dynasty visual register (palace interiors, courtyards, traditional costume).
  • Stable character identity across cuts when the same trigger is used.
  • Strong response to camera/lens prose conventions baked into the training captions (dolly in, 35mm normal lens, handheld).

What it does NOT do

  • It cannot generate characters it never saw in training (no zero-shot identity transfer).
  • It is not an audio model. For audio-conditioned generation see the matching AV LoRA in the same project.
  • It does not include the IC-LoRA control adapters (pose / depth / canny). Those are shipped as separate repositories.

Triggers in the trained corpus

The trigger naming follows the fallback format char_{id}_person. Cluster frequencies in the training corpus:

| Trigger combination | Clips |
| --- | --- |
| char_0_person | 442 |
| char_0_person, char_1_person | 180 |
| char_0_person, char_1_person, char_2_person | 53 |
| 4–10 character combinations | 49 |
| No characters detected (style-only) | 20 |

So the model has the strongest signal for single-character and two-character compositions.
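
The counts above sum to 744 clips, in line with the ~750 shots quoted under Training data, and the one- and two-character rows together cover 622 of them (roughly 84%). A short script, with illustrative names, reproduces the per-row shares:

```python
# Clip counts copied from the trigger frequency table.
CLIP_COUNTS = {
    "char_0_person": 442,
    "char_0_person, char_1_person": 180,
    "char_0_person, char_1_person, char_2_person": 53,
    "4-10 character combinations": 49,
    "style-only (no characters detected)": 20,
}


def clip_shares(counts: dict[str, int]) -> dict[str, float]:
    """Return each row's fraction of the total clip count."""
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items()}


if __name__ == "__main__":
    for name, share in clip_shares(CLIP_COUNTS).items():
        print(f"{name}: {share:.1%}")
```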

Related models

License

Apache 2.0. See LICENSE for terms.

Attribution: SyFe.
