Real-World AVSE Challenge (ISCSLP 2026) - Baseline (track2)
AV-ConvTasNet baseline checkpoint for the Real-World AVSE Challenge, saved with
PyTorchModelHubMixin (config.json + model.safetensors). The trained video
encoder is bundled in the weights, so no separate lip-reading backbone is needed.
Usage
from look2hear.models import AV_ConvTasNet
model = AV_ConvTasNet.from_pretrained("JusperLee/Real-World-AVSE-Baseline-Track2").eval()
Or directly in the challenge evaluation:
python eval_real.py --ckpt JusperLee/Real-World-AVSE-Baseline-Track2 \
--track track2 --split dev --metrics all --mode both --save_dir enhanced_out
This repo is private: log in (huggingface-cli login or HF_TOKEN) and be granted access before downloading.
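For scripted or CI use, the login step above can also be handled from Python. The following is a minimal sketch, not part of the baseline code: the helper names resolve_hf_token and download_checkpoint are hypothetical, and the file filter assumes the checkpoint consists of the two files named in this card (config.json and model.safetensors). It relies on snapshot_download from the huggingface_hub package, which falls back to the cached huggingface-cli login credentials when no token is passed.

```python
import os

REPO_ID = "JusperLee/Real-World-AVSE-Baseline-Track2"


def resolve_hf_token(env=None):
    """Return the token stored in HF_TOKEN, or None if it is unset or blank.

    Hypothetical helper: None means "use the credentials cached by
    `huggingface-cli login`", which is how huggingface_hub treats token=None.
    """
    env = os.environ if env is None else env
    token = env.get("HF_TOKEN", "").strip()
    return token or None


def download_checkpoint(repo_id=REPO_ID, token=None):
    """Download the checkpoint files and return the local snapshot directory.

    Hypothetical helper. The import is deferred so that resolve_hf_token
    can be used even when huggingface_hub is not installed.
    """
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id=repo_id,
        token=token,
        # Assumption: only these two files are needed, as listed in this card.
        allow_patterns=["config.json", "model.safetensors"],
    )


if __name__ == "__main__":
    # Requires network access and granted access to the private repo.
    local_dir = download_checkpoint(token=resolve_hf_token())
    print(f"Checkpoint files downloaded to {local_dir}")
```

If access has not been granted yet, the download is expected to fail with an authorization error from the Hub rather than return a path, so it can be worth running this once before starting a long evaluation job.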
See the baseline repository for the model definition and full pipeline.