voidful committed on
Commit
c733de8
1 Parent(s): e4361b5

Update README.md

Files changed (1)
  1. README.md +5 -5
README.md CHANGED
@@ -9,7 +9,7 @@ tags:
  - xlsr-fine-tuning-week
  license: apache-2.0
  model-index:
- - name: XLSR Wav2Vec2 Chinese (Hong Kong) by Voidful
+ - name: XLSR Wav2Vec2 Cantonese (Hong Kong) by Voidful
  results:
  - task:
  name: Speech Recognition
@@ -25,7 +25,7 @@ model-index:
  ---

  # Wav2Vec2-Large-XLSR-53-hk
- Fine-tuned [facebook/wav2vec2-large-xlsr-53](https://huggingface.co/facebook/wav2vec2-large-xlsr-53) on Cantonese using the [Common Voice](https://huggingface.co/datasets/common_voice).
+ Fine-tuned [facebook/wav2vec2-large-xlsr-53](https://huggingface.co/facebook/wav2vec2-large-xlsr-53) on Cantonese using the [Common Voice](https://huggingface.co/datasets/common_voice).
  When using this model, make sure that your speech input is sampled at 16kHz.

  ## Usage
@@ -46,7 +46,7 @@ model_name = "voidful/wav2vec2-large-xlsr-53-hk"
  device = "cuda"
  processor_name = "voidful/wav2vec2-large-xlsr-53-hk"

- chars_to_ignore_regex = r"[¥•"#$%&'()*+,-/:;<=>@[\]^_`{|}~⦅⦆「」、 、〃〈〉《》「」『』【】〔〕〖〗〘〙〚〛〜〝〞〟〰〾〿–—‘’‛“”„‟…‧﹏﹑﹔·'℃°•·.﹑︰〈〉─《﹖﹣﹂﹁﹔!?。。"#$%&'()*+,﹐-/:;<=>@[\]^_`{|}~⦅⦆「」、、〃》「」『』【】〔〕〖〗〘〙〚〛〜〝〞〟〰〾〿–—‘’‛“”„‟…‧﹏..!\"#$%&()*+,\-.\:;<=>?@\[\]\\\/^_`{|}~]"
+ chars_to_ignore_regex = r"[¥•"#$%&'()*+,-/:;<=>@[\]^_`{|}~⦅⦆「」、 、〃〈〉《》「」『』【】〔〕〖〗〘〙〚〛〜〝〞〟〰〾〿–—‘’‛“”„‟…‧﹏﹑﹔·'℃°•·.﹑︰〈〉─《﹖﹣﹂﹁﹔!?。。"#$%&'()*+,﹐-/:;<=>@[\]^_`{|}~⦅⦆「」、、〃》「」『』【】〔〕〖〗〘〙〚〛〜〝〞〟〰〾〿–—‘’‛“”„‟…‧﹏..!\\"#$%&()*+,\\-.\\:;<=>?@\\[\\]\\\\\\/^_`{|}~]"

  model = Wav2Vec2ForCTC.from_pretrained(model_name).to(device)
  processor = Wav2Vec2Processor.from_pretrained(processor_name)
@@ -78,7 +78,7 @@ predict(load_file_to_data('voice file path'))
  ```

  ## Evaluation
- The model can be evaluated as follows on the Chinese (Hong Kong) test data of Common Voice.
+ The model can be evaluated as follows on the Cantonese (Hong Kong) test data of Common Voice.
  CER calculation refer to https://huggingface.co/ctl/wav2vec2-large-xlsr-cantonese

  ```python
@@ -101,7 +101,7 @@ model_name = "voidful/wav2vec2-large-xlsr-53-hk"
  device = "cuda"
  processor_name = "voidful/wav2vec2-large-xlsr-53-hk"

- chars_to_ignore_regex = r"[¥•"#$%&'()*+,-/:;<=>@[\]^_`{|}~⦅⦆「」、 、〃〈〉《》「」『』【】〔〕〖〗〘〙〚〛〜〝〞〟〰〾〿–—‘’‛“”„‟…‧﹏﹑﹔·'℃°•·.﹑︰〈〉─《﹖﹣﹂﹁﹔!?。。"#$%&'()*+,﹐-/:;<=>@[\]^_`{|}~⦅⦆「」、、〃》「」『』【】〔〕〖〗〘〙〚〛〜〝〞〟〰〾〿–—‘’‛“”„‟…‧﹏..!\"#$%&()*+,\-.\:;<=>?@\[\]\\\/^_`{|}~]"
+ chars_to_ignore_regex = r"[¥•"#$%&'()*+,-/:;<=>@[\]^_`{|}~⦅⦆「」、 、〃〈〉《》「」『』【】〔〕〖〗〘〙〚〛〜〝〞〟〰〾〿–—‘’‛“”„‟…‧﹏﹑﹔·'℃°•·.﹑︰〈〉─《﹖﹣﹂﹁﹔!?。。"#$%&'()*+,﹐-/:;<=>@[\]^_`{|}~⦅⦆「」、、〃》「」『』【】〔〕〖〗〘〙〚〛〜〝〞〟〰〾〿–—‘’‛“”„‟…‧﹏..!\\"#$%&()*+,\\-.\\:;<=>?@\\[\\]\\\\\\/^_`{|}~]"

  model = Wav2Vec2ForCTC.from_pretrained(model_name).to(device)
  processor = Wav2Vec2Processor.from_pretrained(processor_name)
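
Two of the five changed lines adjust backslash escaping inside `chars_to_ignore_regex`. As rendered on this page, the pattern also contains an unescaped `"` inside a double-quoted raw string, which Python would read as the end of the literal. The sketch below is one possible way to avoid hand-escaping a long character class: it builds the class with `re.escape` and applies it with `re.sub`. It is an illustration only, not the README's code. `PUNCTUATION` is a deliberately short, hypothetical subset of the characters in the diff, and `strip_ignored` is a hypothetical helper name.

```python
import re

# Hypothetical, shortened stand-in for the punctuation listed in the diff.
# The real list in the README is much longer.
PUNCTUATION = "!\"#$%&'()*+,-./:;<=>?@[\\]^_`{|}~。,、?!:;「」『』《》()…—"

# re.escape backslash-escapes every regex metacharacter, so the resulting
# character class is valid without any manual escaping.
CHARS_TO_IGNORE = re.compile("[" + re.escape(PUNCTUATION) + "]")


def strip_ignored(text: str) -> str:
    """Remove ignored punctuation from a transcript and trim the ends."""
    return CHARS_TO_IGNORE.sub("", text).strip()


if __name__ == "__main__":
    print(strip_ignored("「今日」天氣好。"))
    print(strip_ignored("hello, world!"))
```

Spaces are not part of the class, so word boundaries in mixed-language transcripts survive the cleanup; only leading and trailing whitespace is trimmed.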