Updating to remove spaces before punctuation

#1 opened by igitman
Files changed (1)
  1. README.md +8 -8
README.md CHANGED
@@ -35,7 +35,7 @@ model-index:
  metrics:
  - name: Test WER
  type: wer
- value: 5.77
+ value: 5.64
  - task:
  type: Automatic Speech Recognition
  name: automatic-speech-recognition
@@ -49,7 +49,7 @@ model-index:
  metrics:
  - name: Test WER
  type: wer
- value: 11.47
+ value: 12.34
  - task:
  type: Automatic Speech Recognition
  name: speech-recognition
@@ -63,7 +63,7 @@ model-index:
  metrics:
  - name: Test WER
  type: wer
- value: 15.60
+ value: 16.21
  - task:
  type: Automatic Speech Recognition
  name: speech-recognition
@@ -77,7 +77,7 @@ model-index:
  metrics:
  - name: Test WER P&C
  type: wer
- value: 8.17
+ value: 8.07
  - task:
  type: Automatic Speech Recognition
  name: automatic-speech-recognition
@@ -91,7 +91,7 @@ model-index:
  metrics:
  - name: Test WER P&C
  type: wer
- value: 22.48
+ value: 23.06
  - task:
  type: Automatic Speech Recognition
  name: speech-recognition
@@ -105,7 +105,7 @@ model-index:
  metrics:
  - name: Test WER P&C
  type: wer
- value: 19.55
+ value: 20.04
  ---
  # NVIDIA FastConformer-Hybrid Large (it)
 
@@ -206,14 +206,14 @@ a) On data without Punctuation and Capitalization
 
  | Version | Tokenizer | Vocabulary Size | MCV 12.0 Dev | MCV 12.0 Test | MLS Dev | MLS Test | VoxPopuli Dev | VoxPopuli Test |
  |---------|-----------------------|-----------------|--------------|---------------|---------|----------|---------------|----------------|
- | 1.18.0 | SentencePiece Unigram | 1024 | 5.14% | 5.68% | 13.83% | 11.71% | 12.80% | 15.72% |
+ | 1.20.0 | SentencePiece BPE | 512 | 5.14% | 5.64% | 13.68% | 12.34% | 13.02% | 16.21% |
 
 
  b) On data with Punctuation and Capitalization
 
  | Version | Tokenizer | Vocabulary Size | MCV 12.0 Dev | MCV 12.0 Test | MLS Dev | MLS Test | VoxPopuli Dev | VoxPopuli Test |
  |---------|-----------------------|-----------------|--------------|---------------|---------|----------|---------------|----------------|
- | 1.18.0 | SentencePiece Unigram | 1024 | 7.75% | 8.17% | 26.37% | 22.48% | 16.78% | 19.55% |
+ | 1.20.0 | SentencePiece BPE | 512 | 7.70% | 8.07% | 26.94% | 23.06% | 16.93% | 20.04% |
 
 
  ## Limitations