huseinzol05 committed on
Commit
2d35b49
1 Parent(s): 276857a
Files changed (2)
  1. README.md +19 -43
  2. tf_model.h5 +1 -1
README.md CHANGED
@@ -11,61 +11,37 @@ probably proofread and complete it, then remove this comment. -->

  # wav2vec2-xls-r-300m-mixed

- Finetuned https://huggingface.co/facebook/wav2vec2-xls-r-300m on https://github.com/huseinzol05/malaya-speech/tree/master/data/mixed-stt

- This model was finetuned on 3 languages,

- 1. Malay
- 2. Singlish
- 3. Mandarin

- **This model trained on a single RTX 3090 Ti 24GB VRAM, provided by https://mesolitica.com/**.

- ## Evaluation set

- Evaluation set from https://github.com/huseinzol05/malaya-speech/tree/master/pretrained-model/prepare-stt with sizes,

- ```
- len(malay), len(singlish), len(mandarin)
- -> (765, 3579, 614)
- ```

- It achieves the following results on the evaluation set based on [evaluate-wav2vec2-xls-r-300m-mixed.ipynb](evaluate-wav2vec2-xls-r-300m-mixed.ipynb):

- Mixed evaluation,

- ```
- CER: 0.0481054244857041
- WER: 0.1322198446007387
- CER with LM: 0.041196586938584696
- WER with LM: 0.09880169127621556
- ```

- Malay evaluation,

- ```
- CER: 0.051636391937588406
- WER: 0.19561999547293663
- CER with LM: 0.03917689630621449
- WER with LM: 0.12710746406824835
- ```

- Singlish evaluation,

- ```
- CER: 0.0494915200071987
- WER: 0.12763802881676573
- CER with LM: 0.04271234986432335
- WER with LM: 0.09677160640413336
- ```

- Mandarin evaluation,

- ```
- CER: 0.035626554824269824
- WER: 0.07993515937860181
- CER with LM: 0.03487760945087219
- WER with LM: 0.07536807168546154
- ```
-
- Language model from https://huggingface.co/huseinzol05/language-model-bahasa-manglish-combined
 

  # wav2vec2-xls-r-300m-mixed

+ This model was trained from scratch on an unknown dataset.
+ It achieves the following results on the evaluation set:

+ ## Model description

+ More information needed

+ ## Intended uses & limitations

+ More information needed

+ ## Training and evaluation data

+ More information needed

+ ## Training procedure

+ ### Training hyperparameters

+ The following hyperparameters were used during training:
+ - optimizer: None
+ - training_precision: float32

+ ### Training results



+ ### Framework versions

+ - Transformers 4.18.0
+ - TensorFlow 2.6.0
+ - Datasets 2.1.0
+ - Tokenizers 0.12.1
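
The README lines removed in this commit report CER and WER per language. For readers unfamiliar with those metrics, here is a minimal sketch of how corpus-level character and word error rates are commonly computed from edit distance. It is an illustration only: the function names and sample sentences are hypothetical, and the referenced evaluation notebook may aggregate or normalize differently.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (insert/delete/substitute cost 1)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(references, hypotheses):
    """Word error rate: total word-level edits divided by total reference words."""
    errors = sum(edit_distance(r.split(), h.split())
                 for r, h in zip(references, hypotheses))
    total = sum(len(r.split()) for r in references)
    return errors / total


def cer(references, hypotheses):
    """Character error rate: total character-level edits divided by total
    reference characters (spaces are counted as characters here)."""
    errors = sum(edit_distance(r, h) for r, h in zip(references, hypotheses))
    total = sum(len(r) for r in references)
    return errors / total


# Hypothetical transcripts, not taken from the evaluation set.
refs = ["saya suka makan nasi"]
hyps = ["saya suka nasi"]
print("WER:", wer(refs, hyps))
print("CER:", cer(refs, hyps))
```

Summing edits over the whole corpus before dividing weights long utterances more heavily than averaging per-utterance rates would. Which convention produced the removed figures is not stated in the diff.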
 
 
 
 
tf_model.h5 CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:ab874f96f3fd9d022b30636174df952c354c9f14eecf60a3b9af36d252673905
  size 1262429728

  version https://git-lfs.github.com/spec/v1
+ oid sha256:a59f060b717345a69d7add89b01c3b6afc450fab5f6d19db679b0c9df6739172
  size 1262429728
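
The `tf_model.h5` change above touches only the Git LFS pointer: the `oid` differs while `size` stays the same, so the weights were replaced by a different file of identical byte length. As a rough sketch, a downloaded copy could be checked against the new pointer as follows. The helper names and the local path are hypothetical; only the pointer text comes from the diff.

```python
import hashlib
import os


def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer file."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer, chunk_size=1 << 20):
    """Return True if the file at `path` has the size and SHA-256 digest in the pointer."""
    if pointer["algo"] != "sha256":
        raise ValueError("unsupported hash algorithm: " + pointer["algo"])
    if os.path.getsize(path) != pointer["size"]:
        return False  # cheap size check before hashing a large file
    sha = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in chunks so a ~1.2 GB file is never held in memory at once.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            sha.update(chunk)
    return sha.hexdigest() == pointer["digest"]


# Pointer contents as they appear on the "+" side of the diff.
NEW_POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:a59f060b717345a69d7add89b01c3b6afc450fab5f6d19db679b0c9df6739172
size 1262429728
"""

pointer = parse_lfs_pointer(NEW_POINTER)
print(pointer["algo"], pointer["size"])

# Hypothetical local path; uncomment after downloading the file.
# print(matches_pointer("tf_model.h5", pointer))
```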