StreamLip Audio Reconstruction Checkpoints

This repository contains the project-specific checkpoints and pinned runtime weights for the StreamLip audio reconstruction demo.

recon/streamlip_recon_timbrefix_step_002000.pt: default audio reconstruction checkpoint.
recon/streamlip_recon_residual_base_step_005000.pt: residual base checkpoint used by the recon configuration.
v5/streamlip_v5_olmo_step_002000_infer.pt: inference-only StreamLip V5 checkpoint. It retains only the step and model entries; the optimizer state has been removed.
streamlip-v5-lm/: StreamLip V5 language model directory.
norm/latent_norm_stats.npz: latent normalization statistics.
auto-avsr/vsr_trlrs2lrs3vox2avsp_base.pth: pinned Auto-AVSR weights required by the default local pipeline.
speaker/resnet50-11ad3fa6.pth: pinned speaker/timbre frontend weights.
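
To illustrate what "inference-only" means for the V5 checkpoint, here is a minimal sketch of reducing a training checkpoint to the two entries named above. The key names step and model come from the description; the helper name strip_for_inference, the "optimizer" key, and the toy dictionary are assumptions for illustration, and this is not necessarily how the published file was produced.

```python
from typing import Any, Dict, Iterable

# Entries retained in the inference-only file, per the description above.
INFERENCE_KEYS = ("step", "model")


def strip_for_inference(
    ckpt: Dict[str, Any], keep: Iterable[str] = INFERENCE_KEYS
) -> Dict[str, Any]:
    """Return a new dict holding only the entries needed for inference."""
    keep = tuple(keep)
    missing = [k for k in keep if k not in ckpt]
    if missing:
        raise KeyError(f"checkpoint is missing expected keys: {missing}")
    return {k: ckpt[k] for k in keep}


# Toy stand-in for a training checkpoint. With PyTorch, the dict would
# typically come from torch.load(path, map_location="cpu") and the result
# would be written back with torch.save(slim, out_path).
# The "optimizer" key is an assumed name for the dropped state.
full = {"step": 2000, "model": {"w": [0.1, 0.2]}, "optimizer": {"state": {}}}
slim = strip_for_inference(full)
print(sorted(slim))
```

Dropping the optimizer state usually shrinks the file considerably, at the cost of not being able to resume training from it.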

The public pretrained Mimi and SmolLM2 dependencies are intentionally not duplicated here. Download them separately:

# Optional: route downloads through a mirror; omit to use the default Hugging Face endpoint.
export HF_ENDPOINT=https://hf-mirror.com
hf download kyutai/mimi --local-dir ckpt/mimi
hf download HuggingFaceTB/SmolLM2-360M --local-dir ckpt/smollm2-360m
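
Once everything is downloaded, a quick layout check can catch a missing file before inference fails partway through. The sketch below lists the repository entries described earlier and reports which are absent under a given root. The helper name missing_paths and the use of ckpt as the root directory are assumptions; the project README defines the layout the pipeline actually expects.

```python
from pathlib import Path
from typing import Iterable, List

# Relative paths as listed in this repository's contents.
REPO_FILES = (
    "recon/streamlip_recon_timbrefix_step_002000.pt",
    "recon/streamlip_recon_residual_base_step_005000.pt",
    "v5/streamlip_v5_olmo_step_002000_infer.pt",
    "streamlip-v5-lm",
    "norm/latent_norm_stats.npz",
    "auto-avsr/vsr_trlrs2lrs3vox2avsp_base.pth",
    "speaker/resnet50-11ad3fa6.pth",
)


def missing_paths(root: Path, expected: Iterable[str] = REPO_FILES) -> List[str]:
    """Return the expected relative paths that do not exist under root."""
    root = Path(root)
    return [rel for rel in expected if not (root / rel).exists()]


if __name__ == "__main__":
    # "ckpt" as the download root is an assumption; adjust to your layout.
    for rel in missing_paths(Path("ckpt")):
        print("missing:", rel)
```

The same function can be pointed at the separately downloaded dependencies, for example missing_paths(Path("ckpt"), ["mimi", "smollm2-360m"]).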

See the project README for full environment setup and raw-video inference commands.
