ESPnet2 ASR model

espnet/shihlun_asr_whisper_medium_finetuned_chime4

This model was trained by Shih-Lun Wu (slseanwu) using the chime4 recipe in espnet.

Demo: How to use in ESPnet2

cd espnet
pip install -e .
cd egs2/chime4/asr1

train_set=tr05_multi_noisy_si284 # tr05_multi_noisy (original training data) or tr05_multi_noisy_si284 (add si284 data)
valid_set=dt05_multi_isolated_1ch_track
test_sets="dt05_real_isolated_1ch_track dt05_simu_isolated_1ch_track et05_real_isolated_1ch_track et05_simu_isolated_1ch_track"

asr_tag=whisper_medium_finetune_lr1e-5_adamw_wd1e-2_3epochs
asr_config=conf/tuning/train_asr_whisper_full.yaml
inference_config=conf/decode_asr_whisper_noctc_greedy.yaml

./asr.sh \
    --skip_data_prep false \
    --skip_train true \
    --skip_eval false \
    --lang en \
    --ngpu 1 \
    --nj 4 \
    --stage 1 \
    --stop_stage 13 \
    --gpu_inference true \
    --inference_nj 1 \
    --token_type whisper_multilingual \
    --feats_normalize '' \
    --max_wav_duration 30 \
    --feats_type raw \
    --use_lm false \
    --cleaner whisper_en \
    --asr_tag "${asr_tag}" \
    --asr_config "${asr_config}" \
    --inference_config "${inference_config}" \
    --inference_asr_model valid.acc.ave.pth \
    --train_set "${train_set}" \
    --valid_set "${valid_set}" \
    --test_sets "${test_sets}" "$@"
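The demo above decodes with the greedy config, while the RESULTS section also reports a `beam20` run. The sketch below shows one way to switch between the two. The helper `decode_opts` is hypothetical (it is not part of ESPnet), and the beam-search file name `conf/decode_asr_whisper_noctc_beam20.yaml` is an assumption inferred from the decode directory names in RESULTS, so check that the file exists in the recipe before relying on it.

```shell
# Hypothetical helper (not part of ESPnet): map a decoding mode to asr.sh options.
decode_opts() {
    case "$1" in
        greedy|beam20)
            # File name pattern inferred from the decode directory names in RESULTS (assumption).
            echo "--inference_config conf/decode_asr_whisper_noctc_$1.yaml"
            ;;
        *)
            echo "unknown decoding mode: $1" >&2
            return 1
            ;;
    esac
}

# Print the options for beam search; they would replace the
# --inference_config line in the asr.sh call above.
decode_opts beam20
```

To avoid repeating data preparation on a second decoding run, `--stage` can be raised to the decoding stage; which stage number that is depends on the `asr.sh` version in use.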

RESULTS

Environments

  • date: Tue Jan 10 04:15:30 CST 2023
  • python version: 3.9.13 (main, Aug 25 2022, 23:26:10) [GCC 11.2.0]
  • espnet version: espnet 202211
  • pytorch version: pytorch 1.12.1
  • Git hash: d89be931dcc8f61437ac49cbe39a773f2054c50c
    • Commit date: Mon Jan 9 11:06:45 2023 -0600

asr_whisper_medium_finetune_lr1e-5_adamw_wd1e-2_3epochs

WER

| dataset | Snt | Wrd | Corr | Sub | Del | Ins | Err | S.Err |
|---|---|---|---|---|---|---|---|---|
| decode_asr_whisper_noctc_beam20_asr_model_valid.acc.ave/dt05_real_isolated_1ch_track | 1640 | 24791 | 97.8 | 1.7 | 0.5 | 0.3 | 2.5 | 24.5 |
| decode_asr_whisper_noctc_beam20_asr_model_valid.acc.ave/dt05_simu_isolated_1ch_track | 1640 | 24792 | 96.1 | 3.0 | 0.9 | 0.5 | 4.4 | 35.6 |
| decode_asr_whisper_noctc_beam20_asr_model_valid.acc.ave/et05_real_isolated_1ch_track | 1320 | 19341 | 96.4 | 2.9 | 0.7 | 0.5 | 4.1 | 33.0 |
| decode_asr_whisper_noctc_beam20_asr_model_valid.acc.ave/et05_simu_isolated_1ch_track | 1320 | 19344 | 93.4 | 5.0 | 1.7 | 0.8 | 7.4 | 41.8 |
| decode_asr_whisper_noctc_greedy_asr_model_valid.acc.ave/dt05_real_isolated_1ch_track | 1640 | 24791 | 97.7 | 1.8 | 0.5 | 0.4 | 2.8 | 25.5 |
| decode_asr_whisper_noctc_greedy_asr_model_valid.acc.ave/dt05_simu_isolated_1ch_track | 1640 | 24792 | 96.0 | 3.3 | 0.8 | 0.7 | 4.8 | 36.0 |
| decode_asr_whisper_noctc_greedy_asr_model_valid.acc.ave/et05_real_isolated_1ch_track | 1320 | 19341 | 96.1 | 3.3 | 0.6 | 0.7 | 4.6 | 34.9 |
| decode_asr_whisper_noctc_greedy_asr_model_valid.acc.ave/et05_simu_isolated_1ch_track | 1320 | 19344 | 92.9 | 5.8 | 1.3 | 1.2 | 8.3 | 43.2 |
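The WER figures above can be compared across the two decoding configs. As a small sketch, the snippet below computes how much lower the `beam20` Err value is than the greedy one, in relative terms, for each test set. The function name `rel_reduction` is made up for this illustration, and the input pairs are the Err values copied from the table.

```shell
# rel_reduction BASE CUR: percentage by which CUR is lower than BASE.
rel_reduction() {
    awk -v base="$1" -v cur="$2" 'BEGIN { printf "%.1f\n", (base - cur) / base * 100 }'
}

# WER (Err column) per test set, copied from the table: greedy first, beam20 second.
while read -r name greedy beam; do
    echo "${name}: $(rel_reduction "$greedy" "$beam")% relative WER reduction with beam20"
done <<'EOF'
dt05_real 2.8 2.5
dt05_simu 4.8 4.4
et05_real 4.6 4.1
et05_simu 8.3 7.4
EOF
```

Because the table values are rounded to one decimal, the computed percentages are only approximate.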

CER

| dataset | Snt | Wrd | Corr | Sub | Del | Ins | Err | S.Err |
|---|---|---|---|---|---|---|---|---|
| decode_asr_whisper_noctc_beam20_asr_model_valid.acc.ave/dt05_real_isolated_1ch_track | 1640 | 141889 | 99.1 | 0.3 | 0.5 | 0.3 | 1.2 | 24.5 |
| decode_asr_whisper_noctc_beam20_asr_model_valid.acc.ave/dt05_simu_isolated_1ch_track | 1640 | 141900 | 98.2 | 0.8 | 1.0 | 0.5 | 2.3 | 35.6 |
| decode_asr_whisper_noctc_beam20_asr_model_valid.acc.ave/et05_real_isolated_1ch_track | 1320 | 110558 | 98.5 | 0.7 | 0.8 | 0.5 | 1.9 | 33.0 |
| decode_asr_whisper_noctc_beam20_asr_model_valid.acc.ave/et05_simu_isolated_1ch_track | 1320 | 110572 | 96.5 | 1.6 | 1.9 | 0.8 | 4.3 | 41.8 |
| decode_asr_whisper_noctc_greedy_asr_model_valid.acc.ave/dt05_real_isolated_1ch_track | 1640 | 141889 | 99.1 | 0.4 | 0.5 | 0.5 | 1.3 | 25.5 |
| decode_asr_whisper_noctc_greedy_asr_model_valid.acc.ave/dt05_simu_isolated_1ch_track | 1640 | 141900 | 98.2 | 0.9 | 0.9 | 0.6 | 2.4 | 36.0 |
| decode_asr_whisper_noctc_greedy_asr_model_valid.acc.ave/et05_real_isolated_1ch_track | 1320 | 110558 | 98.4 | 0.9 | 0.7 | 0.6 | 2.2 | 34.9 |
| decode_asr_whisper_noctc_greedy_asr_model_valid.acc.ave/et05_simu_isolated_1ch_track | 1320 | 110572 | 96.3 | 2.0 | 1.7 | 1.2 | 4.9 | 43.2 |
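In both tables the Err column is the sum of Sub, Del and Ins, up to rounding. A minimal sketch of that sanity check follows, useful when copying rows by hand. The function name `check_err` and the short row labels are made up for this illustration; the numbers are taken from the tables above.

```shell
# check_err: read rows of "label Sub Del Ins Err" and flag rows where
# Sub + Del + Ins differs from Err by more than rounding allows.
check_err() {
    awk '{
        sum = $2 + $3 + $4
        diff = sum - $5
        if (diff < 0) diff = -diff
        # Four values, each rounded to one decimal, give up to 0.2 of slack.
        status = (diff <= 0.21) ? "ok" : "MISMATCH"
        printf "%s sum=%.1f err=%.1f %s\n", $1, sum, $5, status
    }'
}

check_err <<'EOF'
wer_beam20_et05_simu 5.0 1.7 0.8 7.4
wer_greedy_dt05_real 1.8 0.5 0.4 2.8
cer_beam20_dt05_real 0.3 0.5 0.3 1.2
EOF
```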