Welcome

If you find this model helpful, please like this model and star us on https://github.com/LianjiaTech/BELLE and https://github.com/shuaijiang/Whisper-Finetune

Belle-whisper-large-v3-zh-punct

Belle-whisper-large-v3-zh-punct is fine-tuned from whisper-large-v3-zh to improve Chinese punctuation output while keeping recognition accuracy comparable. On Chinese ASR benchmarks, including AISHELL-1, AISHELL-2, WenetSpeech, and HKUST, it performs similarly to Belle-whisper-large-v3-zh.

The punctuation marks were generated by the model punc_ct-transformer_cn-en-common-vocab471067-large and added to the training datasets.

Usage


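from transformers import pipeline

# Load the model into a speech-recognition pipeline.
transcriber = pipeline(
  "automatic-speech-recognition",
  model="BELLE-2/Belle-whisper-large-v3-zh-punct"
)

# Force the decoder to transcribe (not translate) in Chinese.
transcriber.model.config.forced_decoder_ids = (
  transcriber.tokenizer.get_decoder_prompt_ids(
    language="zh",
    task="transcribe"
  )
)

# The pipeline returns a dict; the transcript is under the "text" key.
transcription = transcriber("my_audio.wav")
print(transcription["text"])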

Fine-tuning

| Model | (Re)Sample Rate | Train Datasets | Fine-tuning (full or peft) |
|---|---|---|---|
| Belle-whisper-large-v3-zh-punct | 16 kHz | AISHELL-1, AISHELL-2, WenetSpeech, HKUST | LoRA fine-tuning |

To incorporate punctuation marks without compromising performance, LoRA fine-tuning was employed. If you want to fine-tune the model on your own datasets, please refer to the GitHub repo.
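As a rough illustration of why LoRA is a light-touch way to add a capability, the sketch below counts the trainable parameters LoRA adds to a single square projection matrix and compares that with the full matrix. The hidden width of 1280 is the Whisper large architecture's; the rank of 8 is a hypothetical value chosen for the example, not a setting taken from this model's training run, and bias terms are ignored.

```python
def full_params(d_out: int, d_in: int) -> int:
    """Number of weights in a dense d_out x d_in projection (bias ignored)."""
    return d_out * d_in


def lora_params(d_out: int, d_in: int, rank: int) -> int:
    """Weights in the two low-rank factors: B (d_out x rank) and A (rank x d_in)."""
    return rank * (d_out + d_in)


D_MODEL = 1280  # hidden width of the Whisper large architecture
RANK = 8        # hypothetical LoRA rank, assumed only for this example

full = full_params(D_MODEL, D_MODEL)
lora = lora_params(D_MODEL, D_MODEL, RANK)
ratio = lora / full

print(f"full matrix:  {full} weights")
print(f"LoRA factors: {lora} weights ({ratio:.2%} of the full matrix)")
```

The base weights stay frozen and only the small factors are trained, which is one plausible reason the recognition accuracy stays close to that of the starting model.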

CER(%) ↓

| Model | Language Tag | aishell_1_test(↓) | aishell_2_test(↓) | wenetspeech_net(↓) | wenetspeech_meeting(↓) | HKUST_dev(↓) |
|---|---|---|---|---|---|---|
| whisper-large-v3 | Chinese | 8.085 | 5.475 | 11.72 | 20.15 | 28.597 |
| Belle-whisper-large-v3-zh | Chinese | 2.781 | 3.786 | 8.865 | 11.246 | 16.440 |
| Belle-whisper-large-v3-zh-punct | Chinese | 2.945 | 3.808 | 8.998 | 10.973 | 17.196 |

It is worth mentioning that, compared to Belle-whisper-large-v3-zh, Belle-whisper-large-v3-zh-punct even shows a slight improvement in complex acoustic scenes (such as wenetspeech_meeting). Punctuation marks are removed from the output of Belle-whisper-large-v3-zh-punct before the CER is computed.
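The punctuation-stripping step can be sketched as follows. This is a minimal illustration, not the evaluation script used for the table above: the helper names are hypothetical, and the normalization (dropping every Unicode punctuation character and all whitespace) is an assumption about what "removed" means here.

```python
import unicodedata


def strip_punctuation(text: str) -> str:
    """Drop Unicode punctuation (categories starting with 'P') and whitespace."""
    return "".join(
        ch for ch in text
        if not unicodedata.category(ch).startswith("P") and not ch.isspace()
    )


def edit_distance(ref: str, hyp: str) -> int:
    """Character-level Levenshtein distance using a rolling row."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(ref: str, hyp: str) -> float:
    """Character error rate in percent, computed after stripping punctuation."""
    ref_clean = strip_punctuation(ref)
    hyp_clean = strip_punctuation(hyp)
    if not ref_clean:
        raise ValueError("reference is empty after normalization")
    return 100.0 * edit_distance(ref_clean, hyp_clean) / len(ref_clean)


reference = "今天天气很好我们去公园吧"
punctuated = "今天天气很好，我们去公园吧。"  # correct words, punctuation added
with_errors = "今天天汽很好，我们去公园。"   # one substitution, one deletion

print(cer(reference, punctuated))
print(cer(reference, with_errors))
```

With this normalization, a hypothesis that differs from the reference only by punctuation scores the same as an unpunctuated one, so the punctuated and unpunctuated models can be compared on equal terms.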

Citation

Please cite our paper and GitHub repository when using our code, data, or model.

@misc{BELLE,
  author = {BELLEGroup},
  title = {BELLE: Be Everyone's Large Language model Engine},
  year = {2023},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/LianjiaTech/BELLE}},
}