nithinraok committed commit aa04693 (parent: 214d4b4): Update README.md

README.md CHANGED
@@ -273,24 +273,23 @@ These are greedy WER numbers without external LM. More details on evaluation can
 
 ## Model Fairness Evaluation
 
-As 
-dataset and results are reported as follows:
+As outlined in the paper "Towards Measuring Fairness in AI: the Casual Conversations Dataset", we assessed the parakeet-tdt-1.1b model for fairness. The model was evaluated on the CasualConversations-v1 dataset, and the results are reported as follows:
 
 ### Gender Bias:
 
 | Gender | Male | Female | N/A | Other |
 | :--- | :--- | :--- | :--- | :--- |
 | Num utterances | 19325 | 24532 | 926 | 33 |
-| % WER |
+| % WER | 17.18 | 14.61 | 19.06 | 37.57 |
 
 ### Age Bias:
 
 | Age Group | $(18-30)$ | $(31-45)$ | $(46-85)$ | $(1-100)$ |
 | :--- | :--- | :--- | :--- | :--- |
 | Num utterances | 15956 | 14585 | 13349 | 43890 |
-| % WER |
-
+| % WER | 15.83 | 15.89 | 15.46 | 15.74 |
 
+(Error rates for fairness evaluation are determined by normalizing both the reference and predicted text, similar to the methods used in the evaluations found at https://github.com/huggingface/open_asr_leaderboard.)
 
 ## NVIDIA Riva: Deployment
 
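The normalization step mentioned in the added note can be sketched as follows. This is a minimal stand-in for illustration only: the open_asr_leaderboard evaluations use Whisper-style English text normalization, whereas this sketch just lowercases, strips punctuation, and collapses whitespace before computing a standard edit-distance WER. The `normalize` and `wer` helper names are illustrative, not part of any library.

```python
import re
import string

def normalize(text: str) -> str:
    # Simplified normalizer (lowercase, strip punctuation, collapse whitespace).
    # The leaderboard uses a fuller Whisper-style normalizer; this is a sketch.
    text = text.lower()
    text = text.translate(str.maketrans("", "", string.punctuation))
    return re.sub(r"\s+", " ", text).strip()

def wer(reference: str, hypothesis: str) -> float:
    # Word error rate: Levenshtein distance over normalized word sequences,
    # divided by the number of reference words.
    ref = normalize(reference).split()
    hyp = normalize(hypothesis).split()
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i  # deletions
    for j in range(len(hyp) + 1):
        d[0][j] = j  # insertions
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,      # deletion
                          d[i][j - 1] + 1,      # insertion
                          d[i - 1][j - 1] + cost)  # substitution
    return d[len(ref)][len(hyp)] / len(ref)

print(wer("Hello, World!", "hello word"))  # → 0.5 (one substitution in two words)
```

Because both sides are normalized before scoring, superficial differences in casing and punctuation do not inflate the reported error rates.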