StreamLip Audio Reconstruction Checkpoints

This repository contains the project-specific checkpoints and pinned runtime weights for the StreamLip audio reconstruction demo.

Contents

  • recon/streamlip_recon_timbrefix_step_002000.pt: default audio reconstruction checkpoint.
  • recon/streamlip_recon_residual_base_step_005000.pt: residual base checkpoint used by the recon configuration.
  • v5/streamlip_v5_olmo_step_002000_infer.pt: inference-only StreamLip V5 checkpoint. It retains the step and model entries; the optimizer state has been removed.
  • streamlip-v5-lm/: StreamLip V5 language model directory.
  • norm/latent_norm_stats.npz: latent normalization statistics.
  • auto-avsr/vsr_trlrs2lrs3vox2avsp_base.pth: pinned Auto-AVSR weights required by the default local pipeline.
  • speaker/resnet50-11ad3fa6.pth: pinned speaker/timbre frontend weights.
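
The inference-only V5 checkpoint is described above as keeping step and model while removing optimizer state. The sketch below shows one way such stripping could be done. It assumes the checkpoint is a plain dictionary; the function name and the "optimizer" key in the toy example are hypothetical and not taken from the project code.

```python
from typing import Any, Dict, Iterable

# Keys the Contents list says the inference-only checkpoint retains.
INFERENCE_KEYS = ("step", "model")


def strip_for_inference(
    checkpoint: Dict[str, Any],
    keep: Iterable[str] = INFERENCE_KEYS,
) -> Dict[str, Any]:
    """Return a new dict holding only the entries needed for inference."""
    keep = tuple(keep)
    missing = [key for key in keep if key not in checkpoint]
    if missing:
        raise KeyError(f"checkpoint is missing expected keys: {missing}")
    return {key: checkpoint[key] for key in keep}


# Toy stand-in for a training checkpoint; the "optimizer" key is an assumption.
full = {"step": 2000, "model": {"w": [0.1, 0.2]}, "optimizer": {"lr": 1e-4}}
slim = strip_for_inference(full)
print(sorted(slim))  # ['model', 'step']
```

With PyTorch installed, the same function could be wrapped as torch.save(strip_for_inference(torch.load(src, map_location="cpu")), dst), where src and dst are placeholder paths.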

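The public pretrained Mimi and SmolLM2 dependencies are intentionally not duplicated here. Download them separately. The HF_ENDPOINT line routes downloads through the hf-mirror.com mirror; omit it to download directly from huggingface.co: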

export HF_ENDPOINT=https://hf-mirror.com
hf download kyutai/mimi --local-dir ckpt/mimi
hf download HuggingFaceTB/SmolLM2-360M --local-dir ckpt/smollm2-360m
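
Once the files listed under Contents and the two external downloads are in place, a quick existence check can catch a missing or misplaced file before the pipeline starts. The following is a minimal sketch, not part of the project. The relative paths are copied from this README, while the find_missing helper and the choice of root directories are assumptions.

```python
from pathlib import Path
from typing import List

# Relative paths copied from the Contents list of this repository.
REPO_FILES = [
    "recon/streamlip_recon_timbrefix_step_002000.pt",
    "recon/streamlip_recon_residual_base_step_005000.pt",
    "v5/streamlip_v5_olmo_step_002000_infer.pt",
    "streamlip-v5-lm",
    "norm/latent_norm_stats.npz",
    "auto-avsr/vsr_trlrs2lrs3vox2avsp_base.pth",
    "speaker/resnet50-11ad3fa6.pth",
]

# Local directories targeted by the download commands above.
EXTERNAL_DIRS = ["ckpt/mimi", "ckpt/smollm2-360m"]


def find_missing(repo_root: Path, work_dir: Path) -> List[str]:
    """List expected files and directories that do not exist on disk."""
    expected = [repo_root / rel for rel in REPO_FILES]
    expected += [work_dir / rel for rel in EXTERNAL_DIRS]
    return [str(path) for path in expected if not path.exists()]


if __name__ == "__main__":
    # Assumption: repository files and external downloads both sit
    # under the current directory; adjust the two roots as needed.
    missing = find_missing(Path("."), Path("."))
    for path in missing:
        print(f"missing: {path}")
    print("all paths present" if not missing else f"{len(missing)} path(s) missing")
```

The check only confirms that each path exists; it does not validate file contents or checksums.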

See the project README for full environment setup and raw-video inference commands.
