Add description to card metadata

#1
by julien-c HF staff - opened
Files changed (1)
  1. README.md +49 -4
README.md CHANGED
@@ -1,6 +1,6 @@
  ---
  title: WER
- emoji: 🤗
  colorFrom: blue
  colorTo: red
  sdk: gradio
@@ -8,10 +8,55 @@ sdk_version: 3.0.2
  app_file: app.py
  pinned: false
  tags:
- - evaluate
- - metric
- ---
  
  # Metric Card for WER
  
  ## Metric description
 
  ---
  title: WER
+ emoji: 🤗
  colorFrom: blue
  colorTo: red
  sdk: gradio
 
  app_file: app.py
  pinned: false
  tags:
+ - evaluate
+ - metric
+ description: >-
+ Word error rate (WER) is a common metric of the performance of an automatic
+ speech recognition (ASR) system.
+
+
+ The general difficulty of measuring performance lies in the fact that the
+ recognized word sequence can have a different length from the reference word
+ sequence (supposedly the correct one). The WER is derived from the Levenshtein
+ distance, working at the word level instead of the phoneme level. The WER is a
+ valuable tool for comparing different systems as well as for evaluating
+ improvements within one system. This kind of measurement, however, provides no
+ details on the nature of recognition errors, so further work is required
+ to identify the main source(s) of error and to focus any research effort.
+
+
+ This problem is solved by first aligning the recognized word sequence with the
+ reference (spoken) word sequence using dynamic string alignment. This issue
+ has been examined through a theory called the power law, which states the
+ correlation between perplexity and word error rate.
+
+
+ Word error rate can then be computed as:
+
+
+ WER = (S + D + I) / N = (S + D + I) / (S + D + C)
+
 
+ where
+
+
+ S is the number of substitutions,
+
+ D is the number of deletions,
+
+ I is the number of insertions,
+
+ C is the number of correct words,
+
+ N is the number of words in the reference (N = S + D + C).
+
+
+ This value indicates the average number of errors per reference word. The
+ lower the value, the better the performance of the ASR system, with a WER of
+ 0 being a perfect score.
+ ---
  # Metric Card for WER
  
  ## Metric description
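
The formula in the added description, WER = (S + D + I) / N, can be illustrated with a small word-level Levenshtein alignment. This is only a sketch under stated assumptions: it tokenizes on whitespace, compares words case-sensitively, and the names `wer_counts` and `word_error_rate` are hypothetical helpers, not the implementation behind this metric.

```python
from typing import Tuple


def wer_counts(reference: str, hypothesis: str) -> Tuple[int, int, int, int]:
    """Return (S, D, I, N) for one optimal word-level alignment."""
    ref = reference.split()   # assumption: whitespace tokenization
    hyp = hypothesis.split()

    # dp[i][j] = (total_errors, S, D, I) for aligning ref[:i] with hyp[:j]
    dp = [[(0, 0, 0, 0)] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(1, len(ref) + 1):
        dp[i][0] = (i, 0, i, 0)      # only deletions
    for j in range(1, len(hyp) + 1):
        dp[0][j] = (j, 0, 0, j)      # only insertions

    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            if ref[i - 1] == hyp[j - 1]:
                dp[i][j] = dp[i - 1][j - 1]   # correct word, no new error
                continue
            e, s, d, n_ins = dp[i - 1][j - 1]
            substitution = (e + 1, s + 1, d, n_ins)
            e, s, d, n_ins = dp[i - 1][j]
            deletion = (e + 1, s, d + 1, n_ins)
            e, s, d, n_ins = dp[i][j - 1]
            insertion = (e + 1, s, d, n_ins + 1)
            # Tuples compare on total_errors first, so min() picks a cheapest path.
            dp[i][j] = min(substitution, deletion, insertion)

    _, s, d, n_ins = dp[len(ref)][len(hyp)]
    return s, d, n_ins, len(ref)


def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (S + D + I) / N, where N is the number of reference words."""
    s, d, n_ins, n = wer_counts(reference, hypothesis)
    if n == 0:
        raise ValueError("reference must contain at least one word")
    return (s + d + n_ins) / n


# "sat" -> "sit" is a substitution and the second "the" is deleted:
# S=1, D=1, I=0, N=6
print(wer_counts("the cat sat on the mat", "the cat sit on mat"))
```

Because insertions count as errors but do not increase N, the value can exceed 1 when the hypothesis is much longer than the reference.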