
Welcome

If you find this model helpful, please like it and star the projects on GitHub: https://github.com/LianjiaTech/BELLE and https://github.com/shuaijiang/Whisper-Finetune

Belle-whisper-large-v2-zh

Belle-whisper-large-v2-zh is whisper-large-v2 fine-tuned to improve Chinese speech recognition. It shows a 30-70% relative improvement on Chinese ASR benchmarks, including AISHELL-1, AISHELL-2, WenetSpeech, and HKUST.
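The relative improvement can be checked against the CER figures reported in the results table of this card. The following is a minimal sketch, assuming "relative improvement" means relative CER reduction, i.e. (baseline - fine-tuned) / baseline; the helper name `relative_reduction` is illustrative and not part of any library.

```python
# CER (%) values copied from the results table in this model card.
BASELINE = {  # whisper-large-v2
    "aishell_1_test": 8.818,
    "aishell_2_test": 6.183,
    "wenetspeech_net": 12.343,
    "wenetspeech_meeting": 26.413,
    "HKUST_dev": 31.917,
}
FINETUNED = {  # Belle-whisper-large-v2-zh
    "aishell_1_test": 2.549,
    "aishell_2_test": 3.746,
    "wenetspeech_net": 8.503,
    "wenetspeech_meeting": 14.598,
    "HKUST_dev": 16.289,
}


def relative_reduction(baseline: float, finetuned: float) -> float:
    """Relative CER reduction in percent (assumed definition)."""
    return (baseline - finetuned) / baseline * 100.0


reductions = {
    name: relative_reduction(BASELINE[name], FINETUNED[name])
    for name in BASELINE
}

for name, value in reductions.items():
    print(f"{name}: {value:.1f}% relative CER reduction")
```

Under this definition the reductions fall between roughly 31% (wenetspeech_net) and 71% (aishell_1_test), which is consistent with the 30-70% range quoted above.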

Usage


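from transformers import pipeline

# Load the fine-tuned checkpoint into an automatic speech recognition pipeline.
transcriber = pipeline(
  "automatic-speech-recognition",
  model="BELLE-2/Belle-whisper-large-v2-zh"
)

# Force Chinese transcription instead of language auto-detection or translation.
transcriber.model.config.forced_decoder_ids = (
  transcriber.tokenizer.get_decoder_prompt_ids(
    language="zh",
    task="transcribe"
  )
)

# Transcribe a local audio file; the recognized text is under the "text" key.
transcription = transcriber("my_audio.wav")
print(transcription["text"])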

Fine-tuning

| Model | (Re)Sample Rate | Train Datasets | Fine-tuning (full or peft) |
|---|---|---|---|
| Belle-whisper-large-v2-zh | 16 kHz | AISHELL-1, AISHELL-2, WenetSpeech, HKUST | full fine-tuning |

If you want to fine-tune the model on your own datasets, please refer to the GitHub repo.

CER(%) ↓

| Model | Language Tag | aishell_1_test(↓) | aishell_2_test(↓) | wenetspeech_net(↓) | wenetspeech_meeting(↓) | HKUST_dev(↓) |
|---|---|---|---|---|---|---|
| whisper-large-v2 | Chinese | 8.818 | 6.183 | 12.343 | 26.413 | 31.917 |
| Belle-whisper-large-v2-zh | Chinese | 2.549 | 3.746 | 8.503 | 14.598 | 16.289 |
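For readers unfamiliar with the metric, CER is commonly defined as the character-level edit distance between the reference and the hypothesis, divided by the number of reference characters. The sketch below illustrates that standard definition in plain Python. It is not the scoring script used to produce the numbers above; the function names and the choice to strip spaces before scoring are assumptions, and any text normalization applied in the actual evaluation is not described in this card.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (insert, delete, substitute)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        curr = [i]
        for j, h in enumerate(hyp, 1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference character
                curr[j - 1] + 1,     # insertion of a hypothesis character
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate in percent; spaces are ignored (an assumption)."""
    ref = list(reference.replace(" ", ""))
    hyp = list(hypothesis.replace(" ", ""))
    if not ref:
        raise ValueError("reference must contain at least one character")
    return 100.0 * edit_distance(ref, hyp) / len(ref)


# One substituted character out of six reference characters.
score = cer("今天天气很好", "今天天汽很好")
print(f"CER: {score:.2f}%")
```

Because the denominator is the reference length, CER can exceed 100% when the hypothesis contains many insertions.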

Citation

Please cite our paper and GitHub repository when using our code, data, or model.

@misc{BELLE,
  author = {BELLEGroup},
  title = {BELLE: Be Everyone's Large Language model Engine},
  year = {2023},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/LianjiaTech/BELLE}},
}