Real-World AVSE Challenge (ISCSLP 2026) — Baseline (track2)

AV-ConvTasNet baseline checkpoint for the Real-World AVSE Challenge, saved with PyTorchModelHubMixin (config.json + model.safetensors). The trained video encoder is bundled in the weights, so no separate lip-reading backbone is needed.

Usage

from look2hear.models import AV_ConvTasNet
model = AV_ConvTasNet.from_pretrained("JusperLee/Real-World-AVSE-Baseline-Track2").eval()

Or directly in the challenge evaluation:

python eval_real.py --ckpt JusperLee/Real-World-AVSE-Baseline-Track2 \
  --track track2 --split dev --metrics all --mode both --save_dir enhanced_out

This repo is gated: log in (huggingface-cli login or HF_TOKEN) and accept the access conditions before downloading.
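As a quick way to check that access works before running the full evaluation, the sketch below reads HF_TOKEN from the environment and tries to fetch config.json with huggingface_hub. The helper names resolve_hf_token and fetch_config are hypothetical and not part of the baseline code; the repository ID is the one from the usage example above.

```python
import os
from typing import Mapping, Optional

# Repository ID taken from the usage example in this card.
REPO_ID = "JusperLee/Real-World-AVSE-Baseline-Track2"


def resolve_hf_token(env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Return HF_TOKEN from the environment, or None if it is unset or blank.

    Hypothetical helper: passing None to huggingface_hub makes it fall back
    to the token stored by `huggingface-cli login`.
    """
    env = os.environ if env is None else env
    token = env.get("HF_TOKEN", "").strip()
    return token or None


def fetch_config(repo_id: str = REPO_ID) -> str:
    """Download config.json and return its local path.

    Needs network access and an account that has been granted access;
    otherwise huggingface_hub raises an error.
    """
    from huggingface_hub import hf_hub_download  # imported lazily

    return hf_hub_download(
        repo_id=repo_id,
        filename="config.json",
        token=resolve_hf_token(),
    )
```

Calling fetch_config() once from a Python shell is enough to confirm that the login and the access grant are in place. The same token value can be passed to from_pretrained through its token keyword argument, which PyTorchModelHubMixin accepts.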

See the baseline repository for the model definition and full pipeline.

Model size: 25M params (Safetensors, tensor type F32)
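For a rough idea of the download and memory footprint, the parameter count and tensor type above can be turned into a size estimate. This is a back-of-the-envelope sketch: 25M is a rounded figure, the function name checkpoint_size_bytes is hypothetical, and the real model.safetensors file also carries a small header.

```python
def checkpoint_size_bytes(num_params: int, bytes_per_param: int = 4) -> int:
    """Estimate raw weight size; F32 tensors use 4 bytes per parameter."""
    return num_params * bytes_per_param


# 25M parameters is the rounded count reported for this checkpoint.
approx = checkpoint_size_bytes(25_000_000)
print(f"{approx / 1e6:.0f} MB")  # prints "100 MB"
```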