AlhitawiMohammed22 committed on
Commit
e676e40
·
1 Parent(s): 61b1e91

update readme

Files changed (1)
  1. README.md +153 -5
README.md CHANGED
@@ -1,13 +1,161 @@
1
  ---
2
- title: Hu Evaluation Metrics
3
- emoji: 🏃
4
- colorFrom: gray
5
  colorTo: red
6
  sdk: gradio
7
- sdk_version: 3.24.1
8
  app_file: app.py
9
  pinned: false
10
  license: apache-2.0
11
  ---
12
 
13
- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
 
1
  ---
2
+ title: CER
3
+ emoji: 🤗🏃🤗🏃🤗🏃🤗🏃🤗
4
+ colorFrom: blue
5
  colorTo: red
6
  sdk: gradio
7
+ sdk_version: 3.19.1
8
  app_file: app.py
9
  pinned: false
10
+ tags:
11
+ - evaluate
12
+ - metric
13
  license: apache-2.0
14
  ---
15
+ ---
16
+
17
+ description: >-
18
+ Character error rate (CER) is a common metric of the performance of an automatic speech recognition system.
19
+
20
+ CER is similar to Word Error Rate (WER), but operates on characters instead of words. Please refer to the WER documentation for further information.
21
+
22
+ Character error rate can be computed as:
23
+
24
+ CER = (S + D + I) / N = (S + D + I) / (S + D + C)
25
+
26
+ where
27
+
28
+ S is the number of substitutions,
29
+ D is the number of deletions,
30
+ I is the number of insertions,
31
+ C is the number of correct characters,
32
+ N is the number of characters in the reference (N=S+D+C).
33
+
34
+ CER's output is not always a number between 0 and 1, in particular when there is a high number of insertions. This value is often associated with the percentage of characters that were incorrectly predicted. The lower the value, the better the
35
+ performance of the ASR system, with a CER of 0 being a perfect score.
36
+ ---
37
+
38
+ # Metric Card for CER
39
+
40
+ ## Metric description
41
+
42
+ Character error rate (CER) is a common metric of the performance of an automatic speech recognition (ASR) system. CER is similar to Word Error Rate (WER), but operates on characters instead of words.
43
+
44
+ Character error rate can be computed as:
45
+
46
+ `CER = (S + D + I) / N = (S + D + I) / (S + D + C)`
47
+
48
+ where
49
+
50
+ `S` is the number of substitutions,
51
+
52
+ `D` is the number of deletions,
53
+
54
+ `I` is the number of insertions,
55
+
56
+ `C` is the number of correct characters,
57
+
58
+ `N` is the number of characters in the reference (`N=S+D+C`).
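
The formula above can be checked by hand with a small pure-Python sketch. This is only an illustration of the definition, not the code this Space runs: the helper names `edit_counts` and `character_error_rate` are hypothetical, and the counts come from a standard Levenshtein alignment over characters.

```python
def edit_counts(reference: str, prediction: str) -> dict:
    """Count S, D, I and C from one optimal character-level alignment."""
    n, m = len(reference), len(prediction)
    # dist[i][j] = edit distance between reference[:i] and prediction[:j]
    dist = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        dist[i][0] = i  # delete every reference character
    for j in range(m + 1):
        dist[0][j] = j  # insert every predicted character
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if reference[i - 1] == prediction[j - 1] else 1
            dist[i][j] = min(
                dist[i - 1][j - 1] + cost,  # match or substitution
                dist[i - 1][j] + 1,         # deletion
                dist[i][j - 1] + 1,         # insertion
            )

    # Walk back through the table to label each step of one optimal alignment.
    counts = {"S": 0, "D": 0, "I": 0, "C": 0}
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            same = reference[i - 1] == prediction[j - 1]
            if dist[i][j] == dist[i - 1][j - 1] + (0 if same else 1):
                counts["C" if same else "S"] += 1
                i, j = i - 1, j - 1
                continue
        if i > 0 and dist[i][j] == dist[i - 1][j] + 1:
            counts["D"] += 1
            i -= 1
        else:
            counts["I"] += 1
            j -= 1
    return counts


def character_error_rate(reference: str, prediction: str) -> float:
    """CER = (S + D + I) / N, with N = S + D + C (the reference length)."""
    c = edit_counts(reference, prediction)
    n_ref = c["S"] + c["D"] + c["C"]
    if n_ref == 0:
        raise ValueError("CER is undefined for an empty reference")
    return (c["S"] + c["D"] + c["I"]) / n_ref


print(edit_counts("Helló", "Helló Világ"))           # {'S': 0, 'D': 0, 'I': 6, 'C': 5}
print(character_error_rate("Helló", "Helló Világ"))  # 1.2
```

Because `I` appears only in the numerator, a prediction much longer than its reference pushes the value above 1.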
59
+
60
+
61
+ ## How to use
62
+
63
+ The metric takes two inputs: references (a list of reference transcriptions, one per speech input) and predictions (a list of predicted transcriptions to score).
64
+
65
+ ```python
66
+ from evaluate import load
67
+ cer = load("cer")
68
+ cer_score = cer.compute(predictions=predictions, references=references)
69
+ ```
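
When several prediction/reference pairs are passed, the edit errors appear to be pooled over all pairs before dividing by the total number of reference characters, rather than averaged per pair. The card does not state this; it is an inference from the partial-match example in this card, whose score equals 25/26. The sketch below reproduces that value under this assumption. The helpers `levenshtein` and `pooled_cer` are hypothetical names, not part of `evaluate`.

```python
def levenshtein(reference: str, prediction: str) -> int:
    """Character-level edit distance, keeping two rows of the table."""
    previous = list(range(len(prediction) + 1))
    for i, ref_char in enumerate(reference, start=1):
        current = [i]
        for j, pred_char in enumerate(prediction, start=1):
            cost = 0 if ref_char == pred_char else 1
            current.append(min(
                previous[j - 1] + cost,  # match or substitution
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
            ))
        previous = current
    return previous[-1]


def pooled_cer(predictions: list, references: list) -> float:
    """Sum the errors over all pairs, then divide by all reference characters."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    errors = sum(levenshtein(ref, pred) for pred, ref in zip(predictions, references))
    total_chars = sum(len(ref) for ref in references)
    return errors / total_chars


predictions = ["ez a jóslat", "van egy másik minta is"]
references = ["ez a hivatkozás", "van még egy"]
print(pooled_cer(predictions, references))  # 25 errors over 26 reference characters
```

Under pooling, longer references carry more weight than they would in a per-pair average.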
70
+ ## Output values
71
+
72
+ This metric outputs a float representing the character error rate.
73
+
74
+ ```
75
+ print(cer_score)
76
+ 0.34146341463414637
77
+ ```
78
+
79
+ The **lower** the CER value, the **better** the performance of the ASR system, with a CER of 0 being a perfect score.
80
+
81
+ However, CER's output is not always a number between 0 and 1, in particular when there is a high number of insertions (see [Examples](#examples) below).
82
+
83
+ ### Values from popular papers
84
+
85
+ ## Examples
86
+
87
+ Perfect match between prediction and reference:
88
+
89
+ ```python
90
+ # Install the dependencies first (from a shell): pip install evaluate jiwer
91
+
92
+ from evaluate import load
93
+ cer = load("cer")
94
+ predictions = ["hello világ", "jó éjszakát hold"]
95
+ references = ["hello világ", "jó éjszakát hold"]
96
+ cer_score = cer.compute(predictions=predictions, references=references)
97
+ print(cer_score)
98
+ 0.0
99
+ ```
100
+ Partial match between prediction and reference:
101
+
102
+ ```python
103
+ from evaluate import load
104
+ cer = load("cer")
105
+ predictions = ["ez a jóslat", "van egy másik minta is"]
106
+ references = ["ez a hivatkozás", "van még egy"]
107
+ # `cer` is already loaded above; both prediction/reference pairs are scored together
108
+ cer_score = cer.compute(predictions=predictions, references=references)
109
+ print(cer_score)
110
+ 0.9615384615384616
111
+ ```
112
+
113
+ No match between prediction and reference:
114
+
115
+ ```python
116
+ from evaluate import load
117
+ cer = load("cer")
118
+ predictions = ["üdvözlet"]
119
+ references = ["jó!"]
120
+ cer_score = cer.compute(predictions=predictions, references=references)
121
+ print(cer_score)
122
+ 2.6666666666666665
123
+ ```
124
+
125
+ CER above 1 due to insertion errors:
126
+
127
+ ```python
128
+ from evaluate import load
129
+ cer = load("cer")
130
+ predictions = ["Helló Világ"]
131
+ references = ["Helló"]
132
+ cer_score = cer.compute(predictions=predictions, references=references)
133
+ print(cer_score)
134
+ 1.2
135
+ ```
136
+
137
+ ## Limitations and bias
138
+
139
+
140
+
141
+ In some cases, a normalized CER is reported instead of the raw CER: the number of errors is divided by the sum of the number of edit operations (`I` + `S` + `D`) and `C` (the number of correct characters), which keeps the result within the range of 0–100%.
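
As a rough illustration of the difference, the two variants can be written directly in terms of the counts. The function names `raw_cer` and `normalized_cer` are hypothetical, and the counts are assumed to come from an alignment computed elsewhere.

```python
def raw_cer(s: int, d: int, i: int, c: int) -> float:
    """Raw CER: errors divided by the reference length N = S + D + C."""
    return (s + d + i) / (s + d + c)


def normalized_cer(s: int, d: int, i: int, c: int) -> float:
    """Normalized CER: errors divided by (S + D + I + C), so it never exceeds 1."""
    return (s + d + i) / (s + d + i + c)


# Counts for reference "Helló" and prediction "Helló Világ":
# 5 correct characters and 6 insertions.
print(raw_cer(0, 0, 6, 5))                   # 1.2
print(round(normalized_cer(0, 0, 6, 5), 4))  # 0.5455
```

The two values agree whenever there are no insertions, since the denominators are then identical.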
142
+
143
+
144
+ ## Citation
145
+
146
+
147
+ ```bibtex
148
+ @inproceedings{morris2004,
149
+ author = {Morris, Andrew and Maier, Viktoria and Green, Phil},
150
+ year = {2004},
151
+ month = {01},
152
+ pages = {},
153
+ title = {From WER and RIL to MER and WIL: improved evaluation measures for connected speech recognition}
154
+ }
155
+ ```
156
+
157
+ ## References
158
+
159
+ - [Hugging Face Tasks -- Automatic Speech Recognition](https://huggingface.co/tasks/automatic-speech-recognition)
160
+ - https://github.com/huggingface/evaluate
161