---
tags:
- espnet
- audio
- speech-recognition
language: zh
datasets:
- commonvoice
license: cc-by-4.0
---

## ESPnet2 ASR model

### `espnet/shihlun-asr-commonvoice-zh-TW`

This model was trained by Shih-Lun Wu using the commonvoice recipe in [espnet](https://github.com/espnet/espnet/).

### Demo: How to use in ESPnet2

```bash
# Install ESPnet from a local checkout of the repository.
cd espnet
pip install -e .

# Run the commonvoice recipe with training skipped; the pretrained
# model is downloaded and used for decoding and scoring.
cd egs2/commonvoice/asr1
./asr.sh \
    --stage 1 \
    --stop_stage 13 \
    --nj 32 \
    --inference_nj 32 \
    --skip_train true \
    --train_set "train_zh_TW" \
    --valid_set "dev_zh_TW" \
    --test_sets "dev_zh_TW test_zh_TW" \
    --lang "zh_TW" \
    --local_data_opts "--lang zh-TW" \
    --speed_perturb_factors "0.9 1.0 1.1" \
    --lm_train_text "data/train_zh_TW/text" \
    --token_type bpe \
    --nbpe 2542 \
    --bpemode "unigram" \
    --bpe_train_text "data/train_zh_TW/text" \
    --use_lm false \
    --inference_asr_model "valid.acc.best.pth" \
    --download_model "espnet/shihlun-asr-commonvoice-zh-TW"
```
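
To transcribe a single file without running the whole recipe, the model can probably also be loaded from Python. The sketch below is a minimal example under several assumptions: `espnet`, `espnet_model_zoo`, and `soundfile` are installed, the input is mono audio at the sampling rate the model was trained on (likely 16 kHz, not confirmed by this card), and the names `transcribe` and `wav_path` are illustrative rather than part of ESPnet.

```python
MODEL_TAG = "espnet/shihlun-asr-commonvoice-zh-TW"


def transcribe(wav_path, model_tag=MODEL_TAG, device="cpu"):
    """Return the 1-best transcription of a single audio file (illustrative helper)."""
    # Imported lazily so this module can be loaded without ESPnet installed.
    import soundfile
    from espnet2.bin.asr_inference import Speech2Text

    # from_pretrained resolves the tag through espnet_model_zoo and
    # downloads the model files on first use.
    speech2text = Speech2Text.from_pretrained(model_tag, device=device)

    # Assumption: mono audio at the model's sampling rate (likely 16 kHz).
    speech, rate = soundfile.read(wav_path)

    # Each n-best entry is (text, tokens, token_ids, hypothesis); keep the best text.
    nbests = speech2text(speech)
    text, *_ = nbests[0]
    return text
```

A call such as `transcribe("sample.wav")` (a hypothetical file name) would then return the recognized text as a string.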

<!-- Generated by scripts/utils/show_asr_result.sh -->
## RESULTS

### Environments

- date: `Thu Sep 1 21:49:10 UTC 2022`
- python version: `3.9.12 (main, Jun 1 2022, 11:38:51) [GCC 7.5.0]`
- espnet version: `espnet 202207`
- pytorch version: `pytorch 1.12.1+cu102`
- Git hash: `13db69d3befc3c82a5ff5a11e28bf79d5030603f`
- Commit date: `Mon Aug 29 13:44:35 2022 +0000`

### asr_train_asr_conformer5_raw_zh_TW_bpe2542_sp_lr1.0

#### CER

|dataset|Snt|Wrd|Corr|Sub|Del|Ins|Err|S.Err|
|---|---|---|---|---|---|---|---|---|
|inference_asr_model_valid.acc.best/dev_zh_TW|2627|22200|97.7|2.1|0.2|0.0|2.4|9.5|
|inference_asr_model_valid.acc.best/test_zh_TW|2627|21991|98.0|1.6|0.4|0.1|2.1|7.7|

#### TER

|dataset|Snt|Wrd|Corr|Sub|Del|Ins|Err|S.Err|
|---|---|---|---|---|---|---|---|---|
|inference_asr_model_valid.acc.best/dev_zh_TW|2627|24827|98.6|1.2|0.2|0.0|1.5|4.0|
|inference_asr_model_valid.acc.best/test_zh_TW|2627|24618|98.8|0.9|0.4|0.1|1.3|3.4|
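
As a reading aid for the tables above, the sketch below shows how columns of this kind are usually derived, assuming the common sclite-style convention: `Corr`, `Sub`, `Del`, and `Ins` are percentages of the reference units (`Wrd`), `Err` is `Sub + Del + Ins` (so rows can differ by 0.1 from the visible sum because of rounding), and `S.Err` is the share of sentences with at least one error. For CER the units are characters; for TER they are presumably BPE tokens. The functions `align_counts` and `score` are illustrative and are not the scoring code used by the recipe.

```python
def align_counts(ref, hyp):
    """Return (corr, sub, dele, ins) from a minimum-edit-distance alignment."""
    n, m = len(ref), len(hyp)
    # table[i][j] = (edits, corr, sub, dele, ins) for ref[:i] versus hyp[:j]
    table = [[None] * (m + 1) for _ in range(n + 1)]
    table[0][0] = (0, 0, 0, 0, 0)
    for i in range(1, n + 1):
        table[i][0] = (i, 0, 0, i, 0)  # empty hypothesis: all deletions
    for j in range(1, m + 1):
        table[0][j] = (j, 0, 0, 0, j)  # empty reference: all insertions
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            e, c, s, d, k = table[i - 1][j - 1]
            if ref[i - 1] == hyp[j - 1]:
                diag = (e, c + 1, s, d, k)      # match
            else:
                diag = (e + 1, c, s + 1, d, k)  # substitution
            e, c, s, d, k = table[i - 1][j]
            up = (e + 1, c, s, d + 1, k)        # deletion
            e, c, s, d, k = table[i][j - 1]
            left = (e + 1, c, s, d, k + 1)      # insertion
            # Keep the path with the fewest edits (ties prefer the diagonal).
            table[i][j] = min(diag, up, left, key=lambda t: t[0])
    _, corr, sub, dele, ins = table[n][m]
    return corr, sub, dele, ins


def score(pairs):
    """Aggregate sclite-style percentages over (reference, hypothesis) pairs."""
    corr = sub = dele = ins = n_ref = sent_err = 0
    for ref, hyp in pairs:
        c, s, d, k = align_counts(ref, hyp)
        corr, sub, dele, ins = corr + c, sub + s, dele + d, ins + k
        n_ref += len(ref)
        if s or d or k:
            sent_err += 1

    def pct(x):
        return 100.0 * x / n_ref

    return {
        "Snt": len(pairs),
        "Wrd": n_ref,
        "Corr": pct(corr),
        "Sub": pct(sub),
        "Del": pct(dele),
        "Ins": pct(ins),
        "Err": pct(sub + dele + ins),
        "S.Err": 100.0 * sent_err / len(pairs),
    }
```

Passing plain strings scores at the character level (CER-style); passing lists of tokens scores at the token level (TER-style).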