Update README.md
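This commit extends `chars_to_ignore_regex` in the Usage and Evaluation snippets of the README. Patterns of this kind are typically applied with `re.sub` to remove punctuation from transcripts before scoring. Assuming that is the intent here, the sketch below shows the idea on a small scale. The shortened character class `CHARS_TO_IGNORE` and the helper `strip_ignored_chars` are hypothetical names chosen for this illustration; they are not taken from the README, whose pattern lists many more symbols.

```python
import re

# Hypothetical, shortened character class for illustration only;
# the pattern in the README covers far more symbols.
CHARS_TO_IGNORE = r"[，。！？、「」]"


def strip_ignored_chars(sentence: str) -> str:
    """Return the sentence with every character in CHARS_TO_IGNORE removed."""
    return re.sub(CHARS_TO_IGNORE, "", sentence)


if __name__ == "__main__":
    print(strip_ignored_chars("你好，世界！"))
```

When a class like this is extended, symbols such as `]`, `\`, `^`, and `-` need escaping or careful placement, since they carry special meaning inside a character class in Python's `re` syntax.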
README.md
CHANGED
@@ -9,7 +9,7 @@ tags:
 - xlsr-fine-tuning-week
 license: apache-2.0
 model-index:
-- name: XLSR Wav2Vec2
+- name: XLSR Wav2Vec2 Cantonese (Hong Kong) by Voidful
 results:
 - task:
 name: Speech Recognition
@@ -25,7 +25,7 @@ model-index:
 ---
 
 # Wav2Vec2-Large-XLSR-53-hk
-Fine-tuned [facebook/wav2vec2-large-xlsr-53](https://huggingface.co/facebook/wav2vec2-large-xlsr-53) on Cantonese using the [Common Voice](https://huggingface.co/datasets/common_voice).
+Fine-tuned [facebook/wav2vec2-large-xlsr-53](https://huggingface.co/facebook/wav2vec2-large-xlsr-53) on Cantonese using the [Common Voice](https://huggingface.co/datasets/common_voice).
 When using this model, make sure that your speech input is sampled at 16kHz.
 
 ## Usage
@@ -46,7 +46,7 @@ model_name = "voidful/wav2vec2-large-xlsr-53-hk"
 device = "cuda"
 processor_name = "voidful/wav2vec2-large-xlsr-53-hk"
 
-chars_to_ignore_regex = r"[¥•"#$%&'()*+,-/:;<=>@[\]^_`{|}~⦅⦆「」、 、〃〈〉《》「」『』【】〔〕〖〗〘〙〚〛〜〝〞〟〰〾〿–—‘’‛“”„‟…‧﹏﹑﹔·'
+chars_to_ignore_regex = r"[¥•"#$%&'()*+,-/:;<=>@[\]^_`{|}~⦅⦆「」、 、〃〈〉《》「」『』【】〔〕〖〗〘〙〚〛〜〝〞〟〰〾〿–—‘’‛“”„‟…‧﹏﹑﹔·'℃°•·.﹑︰〈〉─《﹖﹣﹂﹁﹔!?。。"#$%&'()*+,﹐-/:;<=>@[\]^_`{|}~⦅⦆「」、、〃》「」『』【】〔〕〖〗〘〙〚〛〜〝〞〟〰〾〿–—‘’‛“”„‟…‧﹏..!\\"#$%&()*+,\\-.\\:;<=>?@\\[\\]\\\\\\/^_`{|}~]"
 
 model = Wav2Vec2ForCTC.from_pretrained(model_name).to(device)
 processor = Wav2Vec2Processor.from_pretrained(processor_name)
@@ -78,7 +78,7 @@ predict(load_file_to_data('voice file path'))
 ```
 
 ## Evaluation
-The model can be evaluated as follows on the
+The model can be evaluated as follows on the Cantonese (Hong Kong) test data of Common Voice.
 CER calculation refer to https://huggingface.co/ctl/wav2vec2-large-xlsr-cantonese
 
 ```python
@@ -101,7 +101,7 @@ model_name = "voidful/wav2vec2-large-xlsr-53-hk"
 device = "cuda"
 processor_name = "voidful/wav2vec2-large-xlsr-53-hk"
 
-chars_to_ignore_regex = r"[¥•"#$%&'()*+,-/:;<=>@[\]^_`{|}~⦅⦆「」、 、〃〈〉《》「」『』【】〔〕〖〗〘〙〚〛〜〝〞〟〰〾〿–—‘’‛“”„‟…‧﹏﹑﹔·'
+chars_to_ignore_regex = r"[¥•"#$%&'()*+,-/:;<=>@[\]^_`{|}~⦅⦆「」、 、〃〈〉《》「」『』【】〔〕〖〗〘〙〚〛〜〝〞〟〰〾〿–—‘’‛“”„‟…‧﹏﹑﹔·'℃°•·.﹑︰〈〉─《﹖﹣﹂﹁﹔!?。。"#$%&'()*+,﹐-/:;<=>@[\]^_`{|}~⦅⦆「」、、〃》「」『』【】〔〕〖〗〘〙〚〛〜〝〞〟〰〾〿–—‘’‛“”„‟…‧﹏..!\\"#$%&()*+,\\-.\\:;<=>?@\\[\\]\\\\\\/^_`{|}~]"
 
 model = Wav2Vec2ForCTC.from_pretrained(model_name).to(device)
 processor = Wav2Vec2Processor.from_pretrained(processor_name)