How to use fal/ltx2.3-audio-reactive-lora with Diffusers:
pip install -U diffusers transformers accelerate
import torch
from diffusers import DiffusionPipeline
from diffusers.utils import load_image, export_to_video

# switch to "mps" for apple devices
pipe = DiffusionPipeline.from_pretrained("Lightricks/LTX-2.3", dtype=torch.bfloat16, device_map="cuda")
pipe.load_lora_weights("fal/ltx2.3-audio-reactive-lora")

prompt = "A man with short gray hair plays a red electric guitar."
input_image = load_image("https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/guitar-man.png")

# .frames[0] holds the frames of the first generated video
output = pipe(image=input_image, prompt=prompt).frames[0]
export_to_video(output, "output.mp4")
LTX2.3 Audio Reactive LoRA
LoRA adapter for LTX-2.3 audio-reactive video generation.
LTX2.3 Audio Reactive LoRA is a LoRA adapter for LTX-2.3 designed to make video generation react more visibly to music and sound. It focuses on beat-locked visual motion: cubic forms, particles, light pulses, camera pushes, graphic texture, and material deformation moving in sync with kicks, bass, snares, hi-hats, and synth changes.
The LoRA is intended for audio-to-video and image-plus-audio-to-video workflows, especially with the fal.ai endpoint fal-ai/ltx-2.3-quality/audio-to-video/lora.
LoRA file:
https://huggingface.co/fal/ltx2.3-audio-reactive-lora/resolve/main/ltx2.3_audio_reactive_lora.safetensors
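For local workflows, the file can be fetched with the `huggingface_hub` client instead of by hand. The sketch below is one possible way to do that and is not part of the original card: it splits the URL above into repository, revision, and filename, then passes them to `hf_hub_download`. The helper names `split_resolve_url` and `download_lora` are illustrative.

```python
from urllib.parse import urlparse

LORA_URL = (
    "https://huggingface.co/fal/ltx2.3-audio-reactive-lora"
    "/resolve/main/ltx2.3_audio_reactive_lora.safetensors"
)


def split_resolve_url(url: str) -> tuple[str, str, str]:
    """Split a Hugging Face 'resolve' URL into (repo_id, revision, filename)."""
    parts = urlparse(url).path.strip("/").split("/")
    # Expected layout: <owner>/<repo>/resolve/<revision>/<path/to/file>
    if len(parts) < 5 or parts[2] != "resolve":
        raise ValueError(f"not a resolve URL: {url}")
    repo_id = "/".join(parts[:2])
    revision = parts[3]
    filename = "/".join(parts[4:])
    return repo_id, revision, filename


def download_lora(url: str = LORA_URL) -> str:
    """Download the LoRA into the local Hugging Face cache and return its path."""
    # Imported lazily so the URL helper works without huggingface_hub installed.
    from huggingface_hub import hf_hub_download

    repo_id, revision, filename = split_resolve_url(url)
    return hf_hub_download(repo_id=repo_id, filename=filename, revision=revision)
```

Calling `download_lora()` needs network access; the returned path can then be given to any local workflow that loads LTX-2.3 LoRA files.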
Try it on fal.ai:
https://fal.ai/models/fal-ai/ltx-2.3-quality/audio-to-video/lora
Direct fal.ai Example
Direct runnable fal.ai example:
https://fal.ai/models/fal-ai/ltx-2.3-quality/audio-to-video/lora?share=5884bbce-702a-4218-9683-a82a471a0b9b
Model Details
- Base model: Lightricks/LTX-2.3
- Base model relation: adapter / LoRA
- Model type: LTX-2.3 LoRA adapter
- Primary use: audio-reactive video generation
- Best workflow: image first frame + audio + prompt
- Recommended endpoint: fal-ai/ltx-2.3-quality/audio-to-video/lora
- Recommended LoRA scale: 1.0 to 1.5
- Current working scale: 1.2 to 1.5
- Recommended FPS: 24
- Recommended segment length: 5s to 15s
- Recommended resolution: 1024x1024 for square visualizer clips
- Recommended negative prompt: empty string unless the specific workflow needs constraints
- Recommended first frame: structured visual material with clear shapes, depth, light sources, cubes, geometry, particles, layered graphic elements, or audio-visualizer forms
- License: follows the LTX-2 Community License Agreement inherited from the LTX-2.3 base model
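The recommended ranges above can be checked before a job is submitted. The following is a minimal sketch and not part of the original card: the function and constant names are hypothetical, and the frame count is plain arithmetic (duration times FPS) that ignores any frame-count rounding the underlying pipeline may apply.

```python
RECOMMENDED_SCALE = (1.0, 1.5)     # LoRA scale range from the list above
RECOMMENDED_SECONDS = (5.0, 15.0)  # segment length range from the list above
RECOMMENDED_FPS = 24


def check_settings(scale: float, seconds: float, fps: int = RECOMMENDED_FPS) -> list[str]:
    """Return human-readable warnings for values outside the recommended ranges."""
    warnings = []
    if not RECOMMENDED_SCALE[0] <= scale <= RECOMMENDED_SCALE[1]:
        warnings.append(f"LoRA scale {scale} is outside {RECOMMENDED_SCALE}")
    if not RECOMMENDED_SECONDS[0] <= seconds <= RECOMMENDED_SECONDS[1]:
        warnings.append(f"segment length {seconds}s is outside {RECOMMENDED_SECONDS}")
    if fps != RECOMMENDED_FPS:
        warnings.append(f"FPS {fps} differs from the recommended {RECOMMENDED_FPS}")
    return warnings


def approx_frame_count(seconds: float, fps: int = RECOMMENDED_FPS) -> int:
    """Approximate number of frames in a segment: duration multiplied by FPS."""
    return round(seconds * fps)
```

At 24 FPS, a 5 s segment is about 120 frames and a 15 s segment is about 360 frames.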
Prompt Language
Use language like this near the start of the prompt:
sound-driven video, audio-reactive motion, continuous visual flow
For stronger motion, repeat the audio-reactive instruction directly:
The video must be driven by the audio. The cubes must visibly move to the sound. The cubes must hit the beat: BAM BAM BAM.
Prompt Template
sound-driven video, audio-reactive motion, continuous visual flow.
This must be an aggressively audio-reactive cubic video. The cubes must visibly move to the sound. The cubes must visibly move to the sound.
The cubes must hit the beat: BAM BAM BAM. On every kick, large cubes slam, squash, jump, or punch forward. On every bass pulse, the whole 3D structure expands and compresses like a pressure engine.
On snares, cube layers snap sideways, cut, and reassemble. On hi-hats, tiny cube fragments, sparks, fine grain, color edges, and signal lines flicker fast.
On synth changes, surfaces ripple, panels unfold, glass blocks breathe, light seams stretch, and the camera pushes through depth.
Keep a premium dark 3D first-frame style: black glass, graphite, chrome, deep cobalt, electric cyan, acid green, controlled red, warm amber, tactile grain, color separation, subtle bloom.
No text, no logo, no border, no blank padding.
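The template follows a regular pattern: a fixed opening, a repeated audio-reactive instruction, one sentence per audio event, an optional style line, and a list of exclusions. A minimal sketch of assembling such a prompt programmatically is shown here. The function `build_prompt` and its parameters are hypothetical, and the wording is taken from the template above.

```python
OPENING = "sound-driven video, audio-reactive motion, continuous visual flow."
EXCLUSIONS = "No text, no logo, no border, no blank padding."


def build_prompt(subject: str, reactions: dict[str, str], style: str = "", repeat: int = 2) -> str:
    """Assemble an audio-reactive prompt from a subject and per-event reactions.

    `reactions` maps an audio event phrase (for example "On every kick") to the
    visual response it should trigger.
    """
    instruction = f"The {subject} must visibly move to the sound."
    parts = [OPENING]
    # The card suggests repeating the instruction for stronger motion.
    parts.extend([instruction] * repeat)
    parts.append(f"The {subject} must hit the beat: BAM BAM BAM.")
    parts.extend(f"{event}, {response}." for event, response in reactions.items())
    if style:
        parts.append(style)
    parts.append(EXCLUSIONS)
    return " ".join(parts)


prompt = build_prompt(
    "cubes",
    {
        "On every kick": "large cubes slam, squash, jump, or punch forward",
        "On snares": "cube layers snap sideways, cut, and reassemble",
    },
)
```

Swapping the subject or the event mapping keeps the structure of the template while adapting it to a different first frame.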
Example fal Input
{
  "prompt": "sound-driven video, audio-reactive motion, continuous visual flow. This must be an aggressively audio-reactive cubic video. The cubes must visibly move to the sound. The cubes must hit the beat: BAM BAM BAM. On every kick, large cubes slam, squash, jump, or punch forward. On every bass pulse, the whole 3D structure expands and compresses like a pressure engine. On snares, cube layers snap sideways, cut, and reassemble. On hi-hats, tiny cube fragments, sparks, fine grain, color edges, and signal lines flicker fast.",
  "audio_url": "https://...",
  "image_url": "https://...",
  "match_audio_length": true,
  "resolution": {
    "width": 1024,
    "height": 1024
  },
  "frames_per_second": 24,
  "num_inference_steps": 15,
  "guidance_scale": 1,
  "generate_audio": true,
  "image_strength": 0.62,
  "negative_prompt": "",
  "enable_prompt_expansion": false,
  "video_quality": "high",
  "video_write_mode": "balanced",
  "loras": [
    {
      "path": "https://huggingface.co/fal/ltx2.3-audio-reactive-lora/resolve/main/ltx2.3_audio_reactive_lora.safetensors",
      "scale": 1.2,
      "transformer": "both"
    }
  ]
}
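A request like the one above can also be sent from Python. The sketch below assumes the `fal-client` package is installed and that a `FAL_KEY` environment variable holds a valid API key. `build_input` and `generate` are hypothetical helpers, only a subset of the fields above is set, and the result is returned unchanged because its shape is not described in this card.

```python
ENDPOINT = "fal-ai/ltx-2.3-quality/audio-to-video/lora"
LORA_PATH = (
    "https://huggingface.co/fal/ltx2.3-audio-reactive-lora"
    "/resolve/main/ltx2.3_audio_reactive_lora.safetensors"
)


def build_input(prompt: str, audio_url: str, image_url: str,
                lora_scale: float = 1.2, image_strength: float = 0.62) -> dict:
    """Build an input dict mirroring the main fields of the example above."""
    return {
        "prompt": prompt,
        "audio_url": audio_url,
        "image_url": image_url,
        "match_audio_length": True,
        "resolution": {"width": 1024, "height": 1024},
        "frames_per_second": 24,
        "image_strength": image_strength,
        "negative_prompt": "",
        "loras": [{"path": LORA_PATH, "scale": lora_scale, "transformer": "both"}],
    }


def generate(arguments: dict):
    """Submit the request and block until the endpoint returns a result."""
    # Imported lazily so build_input can be used without fal-client installed.
    import fal_client

    return fal_client.subscribe(ENDPOINT, arguments=arguments)
```

The audio and image URLs must be reachable by the endpoint; they are left as arguments here because the example above elides them.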
Notes
This LoRA is most useful when the input image already contains clear structures that can move with the music: cubes, layered architecture, particles, light seams, waveform-like forms, glass blocks, or abstract visualizer shapes. It can be used without an image, but image-first-frame generation gives stronger art direction and more consistent results.
It can also be tested in ComfyUI or other local LTX-2.3 workflows as a standard LoRA, as long as the workflow supports LTX-2.3 LoRA loading and audio-conditioned generation.
Limitations
- The LoRA improves audio-reactive motion but does not guarantee perfect beat detection in every clip.
- Stronger motion usually comes from lower image strength, stronger prompt language, and clearly structured first frames.
- Text inside first frames can drift during video generation; keep important text simple, high-contrast, and explicitly described as fixed if it must remain readable.
- Generated videos should be reviewed before publication, especially for text stability, logo fidelity, and sync.
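Because motion strength depends on both image strength and LoRA scale, it can help to compare a small grid of settings on the same audio and first frame. The sketch below only builds the input variants and does not submit them. The helper `sweep` is hypothetical, and the grid values are illustrative assumptions: the scales sit inside the recommended 1.0 to 1.5 range, and 0.62 is the image strength from the example input.

```python
import copy
from itertools import product


def sweep(base: dict, image_strengths: list[float], lora_scales: list[float]) -> list[dict]:
    """Return one copy of `base` per (image_strength, LoRA scale) combination."""
    variants = []
    for strength, scale in product(image_strengths, lora_scales):
        variant = copy.deepcopy(base)  # leave the base input untouched
        variant["image_strength"] = strength
        for lora in variant["loras"]:
            lora["scale"] = scale
        variants.append(variant)
    return variants


base = {
    "prompt": "sound-driven video, audio-reactive motion, continuous visual flow.",
    "image_strength": 0.62,
    "loras": [{"path": "ltx2.3_audio_reactive_lora.safetensors", "scale": 1.2}],
}
variants = sweep(base, [0.5, 0.62], [1.0, 1.2, 1.5])
```

Each variant can then be submitted separately and the resulting clips reviewed side by side for sync and text stability.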
Credits
Created by Lovis Odin for fal.ai.

