metadata
metrics:
- cer
Introduction
This repository provides the baseline model files for CNVSRC2023 (Chinese Continuous Visual Speech Recognition Challenge 2023).
Usage
Download the model files listed in the Performance table and load them with the CNVSRC2023 baseline code.
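As a minimal sketch of keeping track of the downloaded files, the snippet below maps each task to its checkpoint file name and reports which files are still missing from a local directory. The file names are copied from the Performance table; the short task keys, the `models` directory, and both helper functions are hypothetical and not part of the baseline code.

```python
from pathlib import Path

# File names copied from the Performance table of this README.
# The dictionary keys are hypothetical labels chosen for this sketch.
CHECKPOINTS = {
    "pretrain-4s": "model_avg_14_23_cncvs_4s.pth",
    "pretrain-full": "model_avg_last10_cncvs_4s_30s.pth",
    "single": "model_avg_last5_cncvs_cnvsrc-single.pth",
    "multi": "model_avg_last5_cncvs_cnvsrc-multi.pth",
}


def checkpoint_path(task: str, model_dir: str = "models") -> Path:
    """Return the expected local path of the checkpoint for `task`."""
    if task not in CHECKPOINTS:
        raise KeyError(f"unknown task {task!r}; choose from {sorted(CHECKPOINTS)}")
    return Path(model_dir) / CHECKPOINTS[task]


def missing_checkpoints(model_dir: str = "models") -> list[str]:
    """List the file names from the table that are not present in `model_dir`."""
    return [
        name
        for name in CHECKPOINTS.values()
        if not (Path(model_dir) / name).is_file()
    ]


if __name__ == "__main__":
    print("Single-speaker checkpoint:", checkpoint_path("single"))
    print("Still to download:", missing_checkpoints())
```

The resulting path can then be passed to whatever checkpoint option the baseline code exposes; that option is not specified here.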
Performance
The following table reports each model's character error rate (CER) on its own task. The pre-training checkpoints have no CER entry (marked /).
| Training Data | Task | CER | File Name |
|---|---|---|---|
| CN-CVS (<4s) | Pre-training | / | model_avg_14_23_cncvs_4s.pth |
| CN-CVS (full) | Pre-training | / | model_avg_last10_cncvs_4s_30s.pth |
| CN-CVS + CNVSRC-Single.Dev | Single-speaker VSR (T1) | 46.01% | model_avg_last5_cncvs_cnvsrc-single.pth |
| CN-CVS + CNVSRC-Multi.Dev | Multi-speaker VSR (T2) | 58.42% | model_avg_last5_cncvs_cnvsrc-multi.pth |
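
For readers unfamiliar with the metric, CER is conventionally the character-level edit distance between hypothesis and reference, divided by the number of reference characters. The sketch below implements that standard definition in plain Python. It is not the challenge's official scoring script: the function names are hypothetical, and pooling errors over the whole corpus (rather than averaging per utterance) and skipping any text normalization are assumptions.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two strings, counted in characters."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference character
                curr[j - 1] + 1,     # insertion of a hypothesis character
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def cer(references: list[str], hypotheses: list[str]) -> float:
    """Corpus-level CER: total edit errors divided by total reference characters."""
    if len(references) != len(hypotheses):
        raise ValueError("references and hypotheses must have the same length")
    total_chars = sum(len(r) for r in references)
    if total_chars == 0:
        raise ValueError("references contain no characters")
    total_errors = sum(edit_distance(r, h) for r, h in zip(references, hypotheses))
    return total_errors / total_chars


if __name__ == "__main__":
    refs = ["今天天气很好"]
    hyps = ["今天天汽好"]  # one substitution and one deletion
    print(f"CER = {cer(refs, hyps):.2%}")
```

In this toy example the hypothesis differs from the six-character reference by one substitution and one deletion, so the CER is 2/6.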