alvanli committed on
Commit 8034c9c
1 Parent(s): b14b2d8
Files changed (4)
  1. README.md +5 -4
  2. config.json +1 -1
  3. model.safetensors +1 -1
  4. optimizer.pt +1 -1
README.md CHANGED
@@ -23,7 +23,7 @@ model-index:
  metrics:
  - name: Normalized CER
  type: cer
- value: 8.94
+ value: 7.93
  ---
  <!-- This model card has been generated automatically according to the information the Trainer had access to. You
  should probably proofread and complete it, then remove this comment. -->
@@ -41,17 +41,18 @@ For training,
  |--|--|
  |Common Voice 16.0 zh-HK Train|138|
  |Common Voice 16.0 yue Train|85|
+ |Common Voice 17.0 yue Train|178|
  |Cantonese-ASR|72|
  |CantoMap|23|
  |[Pseudo-Labelled YouTube Data](https://huggingface.co/datasets/alvanlii/cantonese-youtube-pseudo-transcription)|438|
- |Total|756|


  For evaluation, Common Voice 16.0 yue Test set is used.

  ## Results
- - CER (lower is better): 0.1073
- - down from 0.1581 in the previous version dated Jan 28, 2023
+ - CER (lower is better): 0.0972
+ - down from 0.1073, 0.1581 in the previous versions
+ - CER (punctuations removed): 0.0793
  - GPU Inference with Fast Attention (example below): 0.055s/sample
  - Note all GPU evaluations are done on RTX 3090 GPU
  - GPU Inference: 0.308s/sample
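
The Results lines in this diff report two CER figures, one on the raw text (0.0972) and one with punctuation removed (0.0793); the latter appears to correspond to the `value: 7.93` metadata entry when expressed as a percentage. As a rough sketch of why the two can differ, the snippet below computes character error rate as Levenshtein distance divided by reference length. The names `edit_distance`, `strip_punctuation`, and `cer` are illustrative, and treating every Unicode category-P character as punctuation is an assumption, not necessarily the normalization used for the reported numbers.

```python
import unicodedata


def edit_distance(ref: str, hyp: str) -> int:
    """Character-level Levenshtein distance (insert, delete, substitute)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion from ref
                            curr[j - 1] + 1,      # insertion into ref
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def strip_punctuation(text: str) -> str:
    """Drop characters whose Unicode category starts with 'P' (assumed rule)."""
    return "".join(ch for ch in text
                   if not unicodedata.category(ch).startswith("P"))


def cer(ref: str, hyp: str, remove_punct: bool = False) -> float:
    """Character error rate: edit distance divided by reference length."""
    if remove_punct:
        ref, hyp = strip_punctuation(ref), strip_punctuation(hyp)
    if not ref:
        raise ValueError("reference is empty after normalization")
    return edit_distance(ref, hyp) / len(ref)


# Made-up example pair: the hypothesis has the right characters
# but omits the punctuation present in the reference.
reference = "你好,世界。"
hypothesis = "你好世界"
print(cer(reference, hypothesis))
print(cer(reference, hypothesis, remove_punct=True))
```

In this made-up pair every error is a missing punctuation mark, so the punctuation-free score is lower. That is the same direction as the gap between the two figures in the diff, though the real evaluation may normalize text differently.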
config.json CHANGED
@@ -45,7 +45,7 @@
  "scale_embedding": false,
  "suppress_tokens": [],
  "torch_dtype": "float32",
- "transformers_version": "4.38.0.dev0",
+ "transformers_version": "4.39.3",
  "use_cache": false,
  "use_weighted_layer_sum": false,
  "vocab_size": 51865
model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:4db701d0355f26da25cc888d1f18e026f4c437e125b3a53a713f5c4372764abc
+ oid sha256:fde842c011b43b2a7b988ad9b9f1a082731bc29c3d58b0cdffa3e71252ef7256
  size 966995080
optimizer.pt CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:8cae99f24f5d58ff74074719ecf5fe28cb34eee7314559ec80a678ca930dbe9b
+ oid sha256:52d3d75f188b0e83816ddfe2c081aa5811fef9f12f90bcf95efbb6f22d4cb03a
  size 1925064044
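
The `model.safetensors` and `optimizer.pt` entries above are Git LFS pointer files: each records only a spec version, a SHA-256 object id, and a byte size, so the diff shows a new `oid` while `size` stays the same. A minimal sketch of how a downloaded file could be checked against such a pointer follows; `parse_lfs_pointer` and `matches_pointer` are hypothetical helper names, and the local path passed to `matches_pointer` is whatever file you have on disk.

```python
import hashlib
from pathlib import Path

# New pointer for optimizer.pt, copied from the diff above.
POINTER_TEXT = """\
version https://git-lfs.github.com/spec/v1
oid sha256:52d3d75f188b0e83816ddfe2c081aa5811fef9f12f90bcf95efbb6f22d4cb03a
size 1925064044
"""


def parse_lfs_pointer(text: str) -> dict:
    """Split 'key value' lines of a pointer file into a small dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer: dict) -> bool:
    """Return True if the file's size and hash agree with the pointer."""
    file_path = Path(path)
    if file_path.stat().st_size != pointer["size"]:
        return False  # cheap check first, avoids hashing a wrong-sized file
    hasher = hashlib.new(pointer["algo"])
    with file_path.open("rb") as handle:
        # Read in 1 MiB chunks so multi-gigabyte files are not loaded at once.
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


pointer = parse_lfs_pointer(POINTER_TEXT)
print(pointer["algo"], pointer["size"])
```

Calling `matches_pointer("optimizer.pt", pointer)` on a local copy would then confirm whether it is the object this commit points to. Because the size is unchanged across the commit, only the hash distinguishes the old file from the new one.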