Efficient Conformer v2 for non-streaming ASR

Specification: https://github.com/wenet-e2e/wenet/pull/1636

Results

  • Feature info:
    • using fbank features, CMVN, speed perturb, and dither
  • Training info:
    • train_u2++_efficonformer_v2.yaml
    • 8 GPUs, batch size 16, acc_grad 1, 120 epochs
    • lr 0.001, warmup_steps 35000
  • Model info:
    • Model Params: 50,341,278
    • Downsample rate: 1/2 (conv2d2) * 1/4 (efficonformer block) = 1/8 overall
    • encoder_dim 256, output_size 256, head 8, linear_units 2048
    • num_blocks 12, cnn_module_kernel 15, group_size 3
  • Decoding info:
    • ctc_weight 0.5, reverse_weight 0.3, average_num 20 (score fusion sketched below)
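
For context on the attention rescoring results below: u2++ rescores the CTC n-best hypotheses with both the forward (left-to-right) and the reverse (right-to-left) attention decoder. A sketch of the fusion (see WeNet's attention_rescoring implementation for the exact logic):

score = (1 - reverse_weight) * score_att_fwd + reverse_weight * score_att_bwd + ctc_weight * score_ctc

With the values above, that is 0.7 * forward + 0.3 * backward attention score, plus 0.5 * CTC score.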

test clean

WER (%); the full/18/16 columns are decoding chunk sizes (full = decoding_chunk_size -1):

decoding mode            full    18      16
attention decoder        3.49    3.71    3.72
ctc greedy search        3.49    3.74    3.77
ctc prefix beam search   3.47    3.72    3.74
attention rescoring      3.12    3.38    3.36

test other

decoding mode            full    18      16
attention decoder        8.15    9.05    9.03
ctc greedy search        8.73    9.82    9.83
ctc prefix beam search   8.70    9.81    9.79
attention rescoring      8.05    9.08    9.10

Start to Use

Install WeNet following the training installation guide: https://wenet.org.cn/wenet/install.html#install-for-training
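
For reference, a minimal sketch of the usual setup from the linked guide (verify the recommended Python/PyTorch versions in the guide itself):

git clone https://github.com/wenet-e2e/wenet.git
cd wenet
conda create -n wenet python=3.8   # pick the Python version the guide recommends
conda activate wenet
pip install -r requirements.txt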

Decode

cd examples/librispeech/s0

# copy the decoding and scoring scripts shipped with the released model
cp exp/wenet_efficient_conformer_librispeech_v2/decode.sh ./
cp exp/wenet_efficient_conformer_librispeech_v2/wer.sh ./

# full-context decoding (decoding_chunk_size=-1); the 20 matches average_num above
dir=exp/wenet_efficient_conformer_librispeech_v2
decoding_chunk_size=-1
. ./decode.sh ${dir} 20 ${decoding_chunk_size}

# score WER on both test sets
. ./wer.sh test_clean wenet_efficient_conformer_librispeech_v2 ${decoding_chunk_size}
. ./wer.sh test_other wenet_efficient_conformer_librispeech_v2 ${decoding_chunk_size}
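
To reproduce the chunked columns in the tables above, rerun with the matching chunk size (assuming, as in other WeNet recipes, that the 18/16 columns are decoding_chunk_size values):

for decoding_chunk_size in 18 16; do
  . ./decode.sh ${dir} 20 ${decoding_chunk_size}
  . ./wer.sh test_clean wenet_efficient_conformer_librispeech_v2 ${decoding_chunk_size}
  . ./wer.sh test_other wenet_efficient_conformer_librispeech_v2 ${decoding_chunk_size}
done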