---
license: apache-2.0
metrics:
- cer
---
## Welcome
If you find this model helpful, please *like* it and star us on https://github.com/LianjiaTech/BELLE!

# Belle-distilwhisper-large-v2-zh
Fine-tuned from [distilwhisper-large-v2](https://huggingface.co/distil-whisper/distil-large-v2) to improve Chinese speech recognition.

Like distilwhisper-large-v2, Belle-distilwhisper-large-v2-zh is 5.8 times faster than whisper-large-v2 with 51% fewer parameters.

**Note that** distilwhisper-large-v2 cannot transcribe Chinese (it outputs only English) on the Chinese ASR benchmarks (AISHELL-1, AISHELL-2, WenetSpeech, HKUST).

## Usage
```python
from transformers import pipeline

# Load the fine-tuned model into an ASR pipeline
transcriber = pipeline(
    "automatic-speech-recognition",
    model="BELLE-2/Belle-distilwhisper-large-v2-zh"
)

# Force Chinese transcription (rather than language auto-detection
# or translation)
transcriber.model.config.forced_decoder_ids = (
    transcriber.tokenizer.get_decoder_prompt_ids(
        language="zh",
        task="transcribe"
    )
)

transcription = transcriber("my_audio.wav")
```
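
Besides a file path, the `transformers` ASR pipeline also accepts in-memory audio as a dict with a float32 `raw` array and a `sampling_rate`. A minimal sketch of preparing such input from a 16-bit mono WAV, using only the standard library and NumPy — here `my_audio.wav` is a generated placeholder, and the final transcription call is commented out because it requires downloading the model weights:

```python
import wave
import numpy as np

def load_wav_as_pipeline_input(path):
    """Read a 16-bit mono WAV file into the dict format accepted
    by the transformers ASR pipeline."""
    with wave.open(path, "rb") as f:
        assert f.getnchannels() == 1, "expected mono audio"
        assert f.getsampwidth() == 2, "expected 16-bit PCM"
        rate = f.getframerate()
        pcm = np.frombuffer(f.readframes(f.getnframes()), dtype=np.int16)
    # Scale int16 PCM to float32 in [-1, 1], as the pipeline expects
    audio = pcm.astype(np.float32) / 32768.0
    return {"raw": audio, "sampling_rate": rate}

# Placeholder input: one second of silence at 16 kHz
with wave.open("my_audio.wav", "wb") as f:
    f.setnchannels(1)
    f.setsampwidth(2)
    f.setframerate(16000)
    f.writeframes(np.zeros(16000, dtype=np.int16).tobytes())

inputs = load_wav_as_pipeline_input("my_audio.wav")
print(inputs["sampling_rate"], inputs["raw"].shape)  # 16000 (16000,)
# transcription = transcriber(inputs)  # needs the pipeline from above
```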

## Fine-tuning
| Model | (Re)Sample Rate | Train Datasets | Fine-tuning (full or PEFT) |
|:----------------:|:-------:|:----------------------------------------------------------:|:-----------:|
| Belle-distilwhisper-large-v2-zh | 16 kHz | [AISHELL-1](https://openslr.magicdatatech.com/resources/33/), [AISHELL-2](https://www.aishelltech.com/aishell_2), [WenetSpeech](https://wenet.org.cn/WenetSpeech/), [HKUST](https://catalog.ldc.upenn.edu/LDC2005S15) | [full fine-tuning](https://github.com/shuaijiang/Whisper-Finetune) |

If you want to fine-tune the model on your own datasets, please refer to the [GitHub repo](https://github.com/shuaijiang/Whisper-Finetune).

## CER (%)
| Model | Parameters (M) | Language Tag | aishell_1_test | aishell_2_test | wenetspeech_net | wenetspeech_meeting | HKUST_dev |
|:----------------:|:-------:|:-------:|:-----------:|:-----------:|:--------:|:-----------:|:-------:|
| whisper-large-v2 | 1550 | Chinese | 8.818 | 6.183 | 12.343 | 26.413 | 31.917 |
| distilwhisper-large-v2 | 756 | Chinese | - | - | - | - | - |
| Belle-distilwhisper-large-v2-zh | 756 | Chinese | 5.958 | 6.477 | 12.786 | 17.039 | 20.771 |

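The CER numbers above are character-level edit distances normalized by reference length. A minimal pure-Python sketch of the metric for clarity — in practice a library such as `jiwer` would be used, and the example strings below are illustrative, not from the benchmark sets:

```python
def cer(reference: str, hypothesis: str) -> float:
    """Character error rate: Levenshtein edit distance between the
    reference and hypothesis, normalized by the (non-empty) reference
    length."""
    ref, hyp = list(reference), list(hypothesis)
    # Standard dynamic-programming edit distance, one row at a time
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        curr = [i]
        for j, h in enumerate(hyp, 1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1] / len(ref)

print(cer("今天天气很好", "今天天汽很好"))  # one substitution over six characters
```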
## Citation

Please cite our paper and GitHub repository when using our code, data, or model.

```
@misc{BELLE,
  author = {BELLEGroup},
  title = {BELLE: Be Everyone's Large Language model Engine},
  year = {2023},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/LianjiaTech/BELLE}},
}
```