ardaatahan committed
Commit 1543414 · 0 Parent(s):

initial commit

.gitignore ADDED
@@ -0,0 +1,58 @@
+ # OS generated files
+ .DS_Store
+ Thumbs.db
+
+ # Environment files
+ *.env
+ .env
+
+ # Python virtual environment
+ venv/
+ env/
+ *.pyc
+ __pycache__/
+
+ # Hugging Face related
+ .huggingface
+
+ # Project specific
+ argmaxinc/
+ table_data.json
+
+ # Jupyter Notebook
+ .ipynb_checkpoints
+
+ # PyCharm
+ .idea/
+
+ # VS Code
+ .vscode/
+
+ # Gradio temporary files
+ gradio_cached_examples/
+
+ # Logs
+ *.log
+
+ # Dependency directories
+ node_modules/
+
+ # Distribution / packaging
+ dist/
+ build/
+ *.egg-info/
+
+ # Temporary files
+ *.tmp
+ *.bak
+ *.swp
+
+ # Dataset files (if you don't want to track them)
+ *.jsonl
+
+ # Model files (if you don't want to track them)
+ *.pth
+ *.h5
+ *.ckpt
+
+ .gradio/
.pre-commit-config.yaml ADDED
@@ -0,0 +1,18 @@
+ repos:
+   - repo: https://github.com/pycqa/isort
+     rev: 5.13.2
+     hooks:
+       - id: isort
+         args: ["--profile", "black"]
+
+   - repo: https://github.com/psf/black
+     rev: 23.3.0
+     hooks:
+       - id: black
+         name: black
+         language: python
+
+   - repo: https://github.com/pre-commit/pre-commit-hooks
+     rev: v4.5.0
+     hooks:
+       - id: end-of-file-fixer
Makefile ADDED
@@ -0,0 +1,12 @@
+ .PHONY: format use-huggingface-data use-local-data
+
+ format:
+ 	@pre-commit run --all-files
+
+ use-huggingface-data:
+ 	@python multilingual_generate.py download
+ 	@python performance_generate.py download
+ 	@python quality_generate.py
+
+ use-local-data:
+ 	@python performance_generate.py
README.md ADDED
@@ -0,0 +1,85 @@
+ ---
+ title: WhisperKit Benchmarks
+ emoji: 🏆
+ colorFrom: green
+ colorTo: indigo
+ sdk: gradio
+ app_file: main.py
+ license: mit
+ ---
+
+ ## Prerequisites
+
+ Ensure you have the following software installed:
+
+ - Python 3.10 or higher
+ - pip (Python package installer)
+
+ ## Installation
+
+ 1. **Clone the repository**:
+
+    ```sh
+    git clone https://github.com/argmaxinc/model-performance-dashboard.git
+    cd model-performance-dashboard
+    ```
+
+ 2. **Create a virtual environment**:
+
+    ```sh
+    python -m venv venv
+    source venv/bin/activate
+    ```
+
+ 3. **Install required packages**:
+    ```sh
+    pip install -r requirements.txt
+    ```
+
+ ## Usage
+
+ 1. **Run the application**:
+
+    ```sh
+    gradio main.py
+    ```
+
+ 2. **Access the application**:
+    After running `main.py`, a local server starts and prints an interface URL in the terminal. Open that URL in your web browser to interact with the Argmax Benchmark dashboard.
+
+ ## Data Generation
+
+ The data generation process involves three main scripts: `performance_generate.py`, `multilingual_generate.py`, and `quality_generate.py`. Each script is responsible for updating a specific aspect of the benchmark data.
+
+ 1. **Performance Data Update (performance_generate.py)**:
+
+    - Downloads benchmark data from [WhisperKit Evals Dataset](https://huggingface.co/datasets/argmaxinc/whisperkit-evals-dataset).
+    - Processes the data to extract performance metrics for various models, devices, and operating systems.
+    - Calculates metrics such as speed and tokens per second for long- and short-form data.
+    - Saves the results in `performance_data.json` and `support_data.csv`.
+
+ 2. **Multilingual Data Update (multilingual_generate.py)**:
+
+    - Downloads multilingual evaluation data from [WhisperKit Multilingual Evals Dataset](https://huggingface.co/datasets/argmaxinc/whisperkit-evals-multilingual).
+    - Processes the data to generate confusion matrices for language detection.
+    - Calculates metrics for both forced and unforced language detection scenarios.
+    - Saves the results in `multilingual_confusion_matrices.json` and `multilingual_results.csv`.
+
+ 3. **Quality Data Update (quality_generate.py)**:
+    - Downloads quality evaluation data from [WhisperKit Evals](https://huggingface.co/datasets/argmaxinc/whisperkit-evals).
+    - Processes the data to calculate Word Error Rate (WER) and Quality of Inference (QoI) metrics for each dataset.
+    - Saves the results in `quality_data.json`.
+
+ ## Data Update
+
+ To update the dashboard with the latest data from our Hugging Face datasets, run:
+
+ ```sh
+ make use-huggingface-data
+ ```
+
+ Alternatively, you can use our on-device testing code [TODO:INSERT_LINK_TO_OS_TEST_CODE] on your device to update the dashboard with your own data. After generating the Xcode data, place the resulting `.json` files in the `whisperkit-evals/xcresults/benchmark_data` directory, then run:
+
+ ```sh
+ make use-local-data
+ ```
constants.py ADDED
@@ -0,0 +1,254 @@
+ from textwrap import dedent
+
+ from iso639 import Lang
+
+ BANNER_TEXT = """
+ <div style="text-align: center;">
+ <h1><a href='https://github.com/argmaxinc/WhisperKit'>WhisperKit Benchmarks</a></h1>
+ </div>
+ """
+
+
+ INTRO_LABEL = """We present comprehensive benchmarks for WhisperKit, our on-device ASR solution, compared against a reference implementation. These benchmarks aim to help developers and enterprises make informed decisions when choosing optimized or compressed variants of machine learning models for production use. Show more."""
+
+
+ INTRO_TEXT = """
+ <h3 style="display: flex;
+ justify-content: center;
+ align-items: center;
+ "></h3>
+ \n📈 Key Metrics:
+ Word Error Rate (WER) (⬇️): The percentage of words incorrectly transcribed. Lower is better.
+ Quality of Inference (QoI) (⬆️): Percentage of examples where WhisperKit performs no worse than the reference model. Higher is better.
+ Tokens per Second (⬆️): The number of output tokens generated per second. Higher is better.
+ Speed (⬆️): Input audio seconds transcribed per second. Higher is better.
+
+ 🎯 WhisperKit is evaluated across different datasets, with a focus on per-example no-regressions (QoI) and overall accuracy (WER).
+ \n💻 Our benchmarks include:
+ Reference: <a href='https://platform.openai.com/docs/guides/speech-to-text'>WhisperOpenAIAPI</a> (OpenAI's Whisper API)
+ On-device: <a href='https://github.com/argmaxinc/WhisperKit'>WhisperKit</a> (various versions and optimizations)
+
+ ℹ️ Reference Implementation:
+ <a href='https://platform.openai.com/docs/guides/speech-to-text'>WhisperOpenAIAPI</a> sets the reference standard. We assume it uses the equivalent of openai/whisper-large-v2 in float16 precision, along with additional undisclosed optimizations from OpenAI. As of 02/29/24, it costs $0.36 per hour of audio and has a 25MB file size limit per request.
+ \n🔍 We use two primary datasets:
+ <a href='https://huggingface.co/datasets/argmaxinc/librispeech'>LibriSpeech</a>: ~5 hours of short English audio clips
+ <a href='https://huggingface.co/datasets/argmaxinc/earnings22'>Earnings22</a>: ~120 hours of English audio from earnings calls
+
+ 🌐 Multilingual Benchmarks:
+ These benchmarks aim to demonstrate WhisperKit's capabilities across diverse languages, helping developers assess its suitability for multilingual applications.
+ \nDataset:
+ <a href='https://huggingface.co/datasets/argmaxinc/whisperkit-evals-multilingual'>Common Voice 17.0</a>: Short-form audio files (<30s/clip) for a maximum of 400 samples per language from Common Voice 17.0. Test set covers a wide range of languages to test model's versatility.
+
+ \nMetrics:
+ Average WER: Provides an overall measure of model performance across all languages.
+ Language-specific WER: Allows for detailed analysis of model performance for each supported language.
+ Language Detection Accuracy: Measured using a confusion matrix, showing the model's ability to identify the correct language.
+ Results are shown for both forced (correct language given as input) and unforced (model detects language) scenarios.
+
+ 🔄 Results are periodically updated using our automated evaluation pipeline on Apple Silicon Macs.
+ \n🛠️ Developers can use <a href='https://github.com/argmaxinc/WhisperKit'>WhisperKit</a> to reproduce these results or run evaluations on their own custom datasets.
+
+ 🔗 Links:
+ - <a href='https://github.com/argmaxinc/WhisperKit'>WhisperKit</a>
+ - <a href='https://github.com/argmaxinc/whisperkittools'>whisperkittools</a>
+ - <a href='https://huggingface.co/datasets/argmaxinc/librispeech'>LibriSpeech</a>
+ - <a href='https://huggingface.co/datasets/argmaxinc/earnings22'>Earnings22</a>
+ - <a href='https://huggingface.co/datasets/argmaxinc/whisperkit-evals-multilingual'>Common Voice 17.0</a>
+ - <a href='https://platform.openai.com/docs/guides/speech-to-text'>WhisperOpenAIAPI</a>
+ """
+
+
+ METHODOLOGY_TEXT = dedent(
+     """
+     # Methodology
+
+     ## Overview
+     WhisperKit Benchmarks is the one-stop shop for on-device performance and quality testing of WhisperKit models across supported devices, OS versions and audio datasets.
+
+     ## Metrics
+
+     - **Speed factor** (⬆️): Computed as the ratio of input audio length to end-to-end WhisperKit latency for transcribing that audio. A speed factor of N means N seconds of input audio was transcribed in 1 second.
+     - **Tok/s (Tokens per second)** (⬆️): Total number of text decoder forward passes divided by the end-to-end processing time.
+       - This metric varies with input data given that the pace of speech changes the text decoder % of overall latency. This metric should not be confused with the reciprocal of the text decoder latency which is constant across input files.
+     - **WER (Word Error Rate)** (⬇️): The ratio of words incorrectly transcribed when comparing the model's output to reference transcriptions, with lower values indicating better accuracy.
+     - **QoI (Quality of Inference)** (⬆️): The ratio of examples where WhisperKit performs no worse than the reference model.
+       - This metric does not capture improvements to the reference. It only measures potential regressions.
+     - **Parity %**: The percentage difference between a model's Average WER on a given device and its Average WER on the Apple M2 Ultra, where a negative value indicates worse performance compared to the M2 Ultra.
+     - **Multilingual results**: Separated into "language hinted" and "language predicted" categories to evaluate performance with and without prior knowledge of the input language.
+
+     ## Data
+
+     - **Short-form**: 5 hours of English audiobook clips with 30s/clip comprising the [librispeech test set](https://huggingface.co/datasets/argmaxinc/librispeech). Proxy for average streaming performance.
+     - **Long-form**: 12 hours of earnings call recordings with ~1hr/clip in English with various accents. Built by randomly selecting 10% of the [earnings22 test set](https://huggingface.co/datasets/argmaxinc/earnings22-12hours). Proxy for average from-file performance.
+       - Full datasets are used for English Quality tests and random 10-minute subsets are used for Performance tests.
+     - **Multilingual**: Max 400 samples per language with <30s/clip from [Common Voice 17.0 Test Set](https://huggingface.co/datasets/argmaxinc/common_voice_17_0-argmax_subset-400). Common Voice covers 77 of the 99 languages supported by Whisper.
+
+     ## Performance Measurement
+
+     1. On-device testing is conducted with [WhisperKit Regression Test Automations](https://github.com/argmaxinc/WhisperKit/blob/main/BENCHMARKS.md) on iPhones, iPads, and Macs, across different iOS and macOS versions.
+     2. Performance is recorded on the 10-minute datasets described above for short- and long-form audio.
+     3. Quality metrics are recorded on full datasets on Apple M2 Ultra Mac Studios to allow for fast processing of many configurations and to provide a consistent, high-performance baseline for all evaluations displayed in the English Quality tab.
+     4. Quality is also sanity-checked on 10-minute datasets in order to catch potential correctness regressions across different device and OS combinations despite running the same version of WhisperKit.
+     5. Results are aggregated and presented in the dashboard, allowing for easy comparison and analysis.
+
+     ## Dashboard Features
+
+     - Performance: Interactive filtering by model, device, OS, and performance metrics
+     - Timeline: Visualizations of performance trends
+     - English Quality: English transcription quality on short- and long-form audio
+     - Multilingual Quality: Multilingual (77) transcription quality on short-form audio with and without language prediction
+     - Device Support: Matrix of supported device, OS and model version combinations. Unsupported combinations are marked with :warning:.
+     - This methodology ensures a comprehensive and fair evaluation of speech recognition models supported by WhisperKit across a wide range of scenarios and use cases.
+     """
+ )
+
+ PERFORMANCE_TEXT = dedent(
+     """
+     ## Metrics
+     - **Speed factor** (⬆️): Computed as the ratio of input audio length to end-to-end WhisperKit latency for transcribing that audio. A speed factor of N means N seconds of input audio was transcribed in 1 second.
+     - **Tok/s (Tokens per second)** (⬆️): Total number of text decoder forward passes divided by the end-to-end processing time.
+     - **Parity %**: The percentage difference between a model's Average WER on a given device and its Average WER on the Apple M2 Ultra, where a negative value indicates worse performance compared to the M2 Ultra.
+
+     ## Data
+
+     - **Short-form**: 5 hours of English audiobook clips with 30s/clip comprising the [librispeech test set](https://huggingface.co/datasets/argmaxinc/librispeech).
+     - **Long-form**: 12 hours of earnings call recordings with ~1hr/clip in English with various accents. Built by randomly selecting 10% of the [earnings22 test set](https://huggingface.co/datasets/argmaxinc/earnings22-12hours).
+     """
+ )
+
+ QUALITY_TEXT = dedent(
+     """
+     ## Metrics
+     - **WER (Word Error Rate)** (⬇️): The ratio of words incorrectly transcribed when comparing the model's output to reference transcriptions, with lower values indicating better accuracy.
+     - **QoI (Quality of Inference)** (⬆️): The ratio of examples where WhisperKit performs no worse than the reference model.
+       - This metric does not capture improvements to the reference. It only measures potential regressions.
+     """
+ )
+
+ COL_NAMES = {
+     "model.model_version": "Model",
+     "device.product_name": "Device",
+     "device.os": "OS",
+     "average_wer": "Average WER",
+     "qoi": "QoI",
+     "speed": "Speed",
+     "tokens_per_second": "Tok / s",
+     "model": "Model",
+     "device": "Device",
+     "os": "OS",
+     "parity": "Parity %",
+ }
+
+
+ CITATION_BUTTON_LABEL = "Copy the following snippet to cite these results"
+
+
+ CITATION_BUTTON_TEXT = r"""@misc{whisperkit-argmax,
+   title = {WhisperKit},
+   author = {Argmax, Inc.},
+   year = {2024},
+   URL = {https://github.com/argmaxinc/WhisperKit}
+ }"""
+
+
+ HEADER = """<div align="center">
+ <div style="position: relative;">
+ <img
+ src=""
+ style="display:block;width:7%;height:auto;"
+ />
+ </div>
+ </div>"""
+
+
+ EARNINGS22_URL = (
+     "https://huggingface.co/datasets/argmaxinc/earnings22-debug/resolve/main/{0}"
+ )
+ LIBRISPEECH_URL = (
+     "https://huggingface.co/datasets/argmaxinc/librispeech-debug/resolve/main/{0}"
+ )
+
+ AUDIO_URL = (
+     "https://huggingface.co/datasets/argmaxinc/whisperkit-test-data/resolve/main/"
+ )
+
+ WHISPER_OPEN_AI_LINK = "https://huggingface.co/datasets/argmaxinc/whisperkit-evals/tree/main/WhisperKit/{}/{}"
+
+ BASE_WHISPERKIT_BENCHMARK_URL = "https://huggingface.co/datasets/argmaxinc/whisperkit-evals-dataset/blob/main/benchmark_data"
+
+ AVAILABLE_LANGUAGES = [
+     "af",
+     "am",
+     "ar",
+     "as",
+     "az",
+     "ba",
+     "be",
+     "bg",
+     "bn",
+     "br",
+     "ca",
+     "cs",
+     "cy",
+     "da",
+     "de",
+     "el",
+     "en",
+     "es",
+     "et",
+     "eu",
+     "fa",
+     "fi",
+     "fr",
+     "gl",
+     "ha",
+     "he",
+     "hi",
+     "hu",
+     "hy",
+     "id",
+     "it",
+     "ja",
+     "ka",
+     "kk",
+     "ko",
+     "lo",
+     "lt",
+     "lv",
+     "mk",
+     "ml",
+     "mn",
+     "mr",
+     "mt",
+     "ne",
+     "nl",
+     "nn",
+     "oc",
+     "pa",
+     "pl",
+     "ps",
+     "pt",
+     "ro",
+     "ru",
+     "sk",
+     "sl",
+     "sq",
+     "sr",
+     "sv",
+     "sw",
+     "ta",
+     "te",
+     "th",
+     "tk",
+     "tr",
+     "tt",
+     "uk",
+     "ur",
+     "uz",
+     "vi",
+     "yi",
+     "yo",
+     "yue",
+     "zh",
+ ]
+ LANGUAGE_MAP = {lang: Lang(lang).name for lang in AVAILABLE_LANGUAGES}
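The metric definitions in `METHODOLOGY_TEXT` can be made concrete with a short sketch. This is an illustration under stated assumptions, not the code the dashboard runs: every function name below is hypothetical, WER is computed as a plain word-level edit distance without any text normalization, and the sign convention for Parity % (negative when the device WER is higher than the M2 Ultra WER) is inferred from the wording of the description.

```python
from typing import Sequence


def speed_factor(audio_seconds: float, latency_seconds: float) -> float:
    """Seconds of input audio transcribed per second of end-to-end latency."""
    return audio_seconds / latency_seconds


def tokens_per_second(decoder_passes: int, total_seconds: float) -> float:
    """Text decoder forward passes divided by end-to-end processing time."""
    return decoder_passes / total_seconds


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")
    # prev[j] is the edit distance between the previous ref prefix and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1] / len(ref)


def quality_of_inference(
    candidate_wers: Sequence[float], reference_wers: Sequence[float]
) -> float:
    """Fraction of examples where the candidate is no worse than the reference."""
    pairs = list(zip(candidate_wers, reference_wers))
    return sum(cand <= ref for cand, ref in pairs) / len(pairs)


def parity_percent(device_wer: float, baseline_wer: float) -> float:
    """Assumed convention: negative when the device WER exceeds the baseline."""
    return (baseline_wer - device_wer) / baseline_wer * 100
```

Because QoI only counts examples that are no worse than the reference, a model that improves on the reference everywhere and one that merely matches it both score 1.0, which is the limitation the methodology text calls out.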
dashboard_data/config.json ADDED
@@ -0,0 +1,136 @@
+ {
+   "name": "whisperkit-coreml",
+   "version": "0.2",
+   "device_support": [
+     {
+       "identifiers": ["iPhone11", "iPhone12", "Watch7", "Watch8"],
+       "models": {
+         "default": "openai_whisper-tiny",
+         "supported": [
+           "openai_whisper-tiny",
+           "openai_whisper-tiny.en",
+           "openai_whisper-base",
+           "openai_whisper-base.en"
+         ]
+       }
+     },
+     {
+       "identifiers": ["iPhone13", "iPad13,18", "iPad13,1"],
+       "models": {
+         "default": "openai_whisper-base",
+         "supported": [
+           "openai_whisper-tiny",
+           "openai_whisper-tiny.en",
+           "openai_whisper-base",
+           "openai_whisper-base.en",
+           "openai_whisper-small",
+           "openai_whisper-small.en"
+         ]
+       }
+     },
+     {
+       "identifiers": [
+         "iPhone14",
+         "iPhone15",
+         "iPhone16",
+         "iPhone17",
+         "iPad14,1",
+         "iPad14,2"
+       ],
+       "models": {
+         "default": "openai_whisper-base",
+         "supported": [
+           "openai_whisper-tiny",
+           "openai_whisper-tiny.en",
+           "openai_whisper-base",
+           "openai_whisper-base.en",
+           "openai_whisper-small",
+           "openai_whisper-small.en",
+           "openai_whisper-large-v2_949MB",
+           "openai_whisper-large-v2_turbo_955MB",
+           "openai_whisper-large-v3_947MB",
+           "openai_whisper-large-v3_turbo_954MB",
+           "distil-whisper_distil-large-v3_594MB",
+           "distil-whisper_distil-large-v3_turbo_600MB",
+           "openai_whisper-large-v3-v20240930_626MB",
+           "openai_whisper-large-v3-v20240930_turbo_632MB"
+         ]
+       }
+     },
+     {
+       "identifiers": [
+         "Mac13",
+         "iMac21",
+         "MacBookAir10,1",
+         "MacBookPro17",
+         "MacBookPro18",
+         "Macmini9",
+         "iPad13,16",
+         "iPad13,4",
+         "iPad13,8"
+       ],
+       "models": {
+         "default": "openai_whisper-large-v3-v20240930",
+         "supported": [
+           "openai_whisper-tiny",
+           "openai_whisper-tiny.en",
+           "openai_whisper-base",
+           "openai_whisper-base.en",
+           "openai_whisper-small",
+           "openai_whisper-small.en",
+           "openai_whisper-large-v2",
+           "openai_whisper-large-v2_949MB",
+           "openai_whisper-large-v3",
+           "openai_whisper-large-v3_947MB",
+           "distil-whisper_distil-large-v3",
+           "distil-whisper_distil-large-v3_594MB",
+           "openai_whisper-large-v3-v20240930",
+           "openai_whisper-large-v3-v20240930_626MB"
+         ]
+       }
+     },
+     {
+       "identifiers": [
+         "Mac14",
+         "Mac15",
+         "Mac16",
+         "iPad14,3",
+         "iPad14,4",
+         "iPad14,5",
+         "iPad14,6",
+         "iPad14,8",
+         "iPad14,9",
+         "iPad14,10",
+         "iPad14,11",
+         "iPad16"
+       ],
+       "models": {
+         "default": "openai_whisper-large-v3-v20240930",
+         "supported": [
+           "openai_whisper-tiny",
+           "openai_whisper-tiny.en",
+           "openai_whisper-base",
+           "openai_whisper-base.en",
+           "openai_whisper-small",
+           "openai_whisper-small.en",
+           "openai_whisper-large-v2",
+           "openai_whisper-large-v2_949MB",
+           "openai_whisper-large-v2_turbo",
+           "openai_whisper-large-v2_turbo_955MB",
+           "openai_whisper-large-v3",
+           "openai_whisper-large-v3_947MB",
+           "openai_whisper-large-v3_turbo",
+           "openai_whisper-large-v3_turbo_954MB",
+           "distil-whisper_distil-large-v3",
+           "distil-whisper_distil-large-v3_594MB",
+           "distil-whisper_distil-large-v3_turbo",
+           "distil-whisper_distil-large-v3_turbo_600MB",
+           "openai_whisper-large-v3-v20240930",
+           "openai_whisper-large-v3-v20240930_turbo",
+           "openai_whisper-large-v3-v20240930_626MB",
+           "openai_whisper-large-v3-v20240930_turbo_632MB"
+         ]
+       }
+     }
+   ]
+ }
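One plausible way to consume `config.json` is to resolve a hardware identifier such as `iPad13,16` to the model group that covers it. The sketch below is an assumption about how the `identifiers` entries are meant to be read, not confirmed by this commit: it treats each entry as a prefix of the full identifier and breaks ties by picking the longest matching prefix, so that `iPad13,16` is not captured by the shorter `iPad13,1`. The function name `models_for_device` and the abridged `CONFIG_EXCERPT` are hypothetical.

```python
import json
from typing import Optional

# Abridged excerpt of config.json; the "supported" lists are shortened here.
CONFIG_EXCERPT = json.loads(
    """
    {
      "device_support": [
        {
          "identifiers": ["iPhone13", "iPad13,18", "iPad13,1"],
          "models": {
            "default": "openai_whisper-base",
            "supported": ["openai_whisper-tiny", "openai_whisper-base"]
          }
        },
        {
          "identifiers": ["Mac13", "iPad13,16"],
          "models": {
            "default": "openai_whisper-large-v3-v20240930",
            "supported": ["openai_whisper-tiny", "openai_whisper-large-v3"]
          }
        }
      ]
    }
    """
)


def models_for_device(config: dict, device_id: str) -> Optional[dict]:
    """Return the "models" entry of the group with the longest matching prefix."""
    best_length, best_models = -1, None
    for group in config["device_support"]:
        for prefix in group["identifiers"]:
            # Assumption: identifiers are prefixes of the full hardware identifier.
            if device_id.startswith(prefix) and len(prefix) > best_length:
                best_length, best_models = len(prefix), group["models"]
    return best_models
```

With first-match semantics instead of longest-prefix, `iPad13,16` would fall into the `iPad13,1` group, which is why the tie-break is worth stating explicitly whichever rule the real consumer uses.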
dashboard_data/device_map.json ADDED
@@ -0,0 +1,14 @@
+ {
+   "Mac14,12": "Apple M2 Pro",
+   "Mac14,14": "Apple M2 Ultra",
+   "Mac15,3": "Apple M3",
+   "Mac15,9": "Apple M3 Max",
+   "iPad14,8": "iPad Air 11-inch (M2)",
+   "iPad16,1": "iPad mini (A17 Pro)",
+   "iPad16,3": "iPad Pro 11-inch (M4)",
+   "iPhone12,1": "iPhone 11",
+   "iPhone14,2": "iPhone 13 Pro",
+   "iPhone14,5": "iPhone 13",
+   "iPhone14,7": "iPhone 14",
+   "iPhone17,1": "iPhone 16 Pro"
+ }
dashboard_data/diff_checker_data.json ADDED
File without changes
dashboard_data/multilingual_confusion_matrices.json ADDED
The diff for this file is too large to render. See raw diff
 
dashboard_data/multilingual_results.csv ADDED
@@ -0,0 +1,17 @@
+ Model,Forced Tokens,Average WER,WER_sl,WER_sk,WER_ur,WER_sw,WER_uz,WER_pl,WER_vi,WER_sq,WER_sv,WER_he,WER_mt,WER_hy,WER_am,WER_nn,WER_be,WER_da,WER_mr,WER_kk,WER_mn,WER_ja,WER_el,WER_lv,WER_oc,WER_it,WER_ca,WER_cs,WER_te,WER_ru,WER_tk,WER_ro,WER_yo,WER_yue,WER_yi,WER_pt,WER_ps,WER_zh,WER_uk,WER_sr,WER_pa,WER_ml,WER_mk,WER_ba,WER_ha,WER_ar,WER_gl,WER_hu,WER_nl,WER_bg,WER_bn,WER_ne,WER_af,WER_hi,WER_ka,WER_de,WER_as,WER_az,WER_br,WER_ko,WER_fi,WER_id,WER_fr,WER_es,WER_et,WER_en,WER_fa,WER_lt,WER_cy,WER_eu,WER_lo,WER_tt,WER_ta,WER_th,WER_tr
+ openai_whisper-large-v3-v20240930,False,51.57,36.9,29.71,46.48,64.04,110.02,14.74,14.89,69.25,18.09,29.11,86.41,74.32,145.83,50.03,79.08,19.43,67.19,43.57,116.51,26.33,21.75,32.92,73.51,14.39,20.36,14.41,140.14,15.81,112.64,15.2,95.06,51.16,103.7,16.37,111.73,27.24,24.08,62.2,104.28,121.81,48.07,102.63,104.87,40.62,18.12,16.39,11.46,21.95,98.71,86.28,37.8,43.31,137.87,14.01,103.2,38.1,100.68,20.79,16.62,12.28,16.31,5.94,32.21,12.54,60.73,35.6,57.45,42.35,103.14,98.21,44.83,31.0,31.24
+ openai_whisper-large-v3-v20240930,True,46.09,27.13,24.61,25.59,61.29,98.84,12.12,16.92,65.69,12.97,26.85,84.04,73.95,128.9,39.97,61.51,17.63,48.26,41.87,97.08,21.97,17.73,30.78,71.01,12.83,18.25,12.85,75.43,13.28,104.35,11.41,89.71,64.28,100.0,14.93,95.78,25.34,19.14,54.07,120.4,112.94,34.52,100.0,96.64,31.45,15.0,15.3,8.91,20.42,79.7,63.89,36.54,26.14,132.26,12.26,105.14,33.33,95.96,20.75,15.42,11.11,15.51,6.1,31.51,12.13,55.96,32.84,54.92,40.65,114.11,98.39,41.54,23.3,24.29
+ openai_whisper-tiny,False,105.22,121.79,133.13,113.57,119.78,118.34,103.44,99.27,119.73,82.19,122.66,112.51,132.53,120.31,103.18,115.1,99.88,101.13,125.86,114.12,82.61,130.16,112.75,100.2,89.71,82.85,125.93,113.15,109.31,117.09,118.71,109.16,88.8,120.37,81.94,115.21,79.77,115.73,114.84,103.05,105.04,117.77,116.19,109.58,159.43,78.61,129.6,76.43,122.17,100.32,104.44,118.43,102.01,140.29,79.95,100.86,146.83,110.82,110.13,93.9,124.4,67.17,68.41,113.76,33.14,122.06,133.54,112.59,132.9,106.52,123.61,100.96,110.41,125.91
+ openai_whisper-tiny,True,86.1,81.42,92.88,70.33,112.75,122.16,56.82,50.52,99.17,64.45,72.15,103.81,133.47,140.93,102.1,98.9,79.55,102.76,179.77,128.57,53.32,66.33,89.27,93.19,60.12,59.02,81.79,133.22,58.43,124.62,66.43,111.99,90.36,102.78,65.71,105.43,65.2,69.07,80.42,104.57,133.29,83.84,110.69,97.86,97.63,54.0,85.5,54.06,83.5,106.27,103.34,93.39,102.17,140.86,49.53,112.36,90.67,102.29,62.34,72.56,54.08,59.57,34.99,101.75,33.4,130.66,101.05,93.62,97.05,113.15,107.44,80.94,42.42,66.07
+ openai_whisper-small,False,96.89,116.31,109.59,106.84,110.05,117.73,69.24,74.63,110.83,52.47,97.58,109.76,138.13,118.51,93.33,113.81,66.04,101.28,127.1,115.44,89.93,120.81,101.47,88.06,67.04,51.78,117.66,120.07,92.0,116.14,86.09,106.41,48.07,105.56,37.94,105.77,117.15,101.7,108.27,101.54,102.57,115.41,118.11,104.62,151.92,65.17,66.19,60.49,118.61,100.46,102.89,99.37,100.76,141.61,49.17,100.37,121.83,107.77,137.18,68.85,104.92,54.88,47.03,96.83,18.2,119.04,121.42,111.48,116.31,118.1,120.3,100.63,115.33,108.8
+ openai_whisper-small,True,69.14,49.09,51.74,40.93,96.1,115.21,23.74,25.43,89.94,23.97,43.29,96.2,120.55,130.06,164.5,78.06,37.18,89.86,82.95,262.79,30.25,31.49,62.47,114.14,25.02,30.35,37.7,311.76,26.09,174.55,26.99,161.95,48.61,100.0,35.7,94.66,42.22,40.02,60.5,160.3,115.92,50.81,118.57,97.46,47.01,30.45,44.66,19.94,49.16,129.67,107.33,71.02,45.58,,23.87,131.95,62.3,,34.7,30.07,23.81,27.11,11.94,72.39,17.35,97.5,75.61,67.42,77.08,102.07,103.03,42.18,21.52,33.32
+ openai_whisper-large-v3,False,54.77,41.01,32.74,44.39,66.07,110.74,17.82,14.19,64.45,13.59,36.31,96.11,70.79,134.23,52.71,85.41,16.63,60.48,58.79,122.59,33.66,28.76,27.08,78.1,13.55,17.32,20.67,123.88,16.13,107.71,10.99,110.75,53.95,105.56,14.51,103.6,42.68,32.28,64.23,101.2,102.62,68.19,100.46,99.97,36.03,23.57,13.44,12.17,30.2,98.44,101.1,40.79,75.87,149.84,14.75,100.54,35.32,106.08,20.94,15.9,11.86,15.51,6.36,30.06,12.7,62.68,32.37,51.06,45.38,104.29,100.73,81.72,38.07,26.73
+ openai_whisper-large-v3,True,34.23,18.87,18.44,21.24,58.02,90.52,10.13,12.32,53.97,9.81,23.79,78.78,54.56,,29.37,45.53,13.89,42.37,48.61,87.75,20.38,12.35,21.06,65.39,11.11,14.69,12.04,61.25,13.0,99.39,5.39,97.25,14.27,101.85,13.75,88.95,25.41,15.59,41.4,57.1,107.34,20.59,99.25,91.39,23.08,13.06,12.44,7.03,17.37,,52.77,36.38,20.33,,9.89,,21.43,86.38,20.37,10.32,9.47,13.67,4.93,28.43,12.21,45.43,27.63,35.05,40.65,102.76,90.45,28.97,6.11,17.88
+ openai_whisper-large-v3-v20240930_626MB,False,52.29,39.68,29.99,49.08,66.59,107.43,15.31,15.95,71.18,17.19,32.01,88.37,79.06,135.02,51.08,80.09,20.74,71.26,47.37,105.47,25.78,22.21,34.77,74.12,15.26,20.99,15.98,139.45,16.29,106.18,16.59,95.23,51.42,101.85,16.46,107.76,29.67,27.49,64.5,103.61,115.9,47.55,100.79,103.61,38.22,19.62,17.52,11.63,24.46,98.93,85.04,39.69,47.4,133.75,14.69,104.02,38.49,101.45,22.74,16.62,12.28,17.04,6.02,34.3,13.39,62.2,38.5,60.13,45.51,103.6,98.12,48.42,35.09,31.2
+ openai_whisper-large-v3-v20240930_626MB,True,47.64,30.62,25.67,26.93,62.36,97.82,13.11,17.36,67.47,12.72,29.16,84.89,77.31,111.23,39.77,63.57,18.94,50.51,45.76,97.71,22.33,18.71,31.72,72.64,13.16,19.39,14.49,84.78,14.37,102.51,12.24,93.42,66.1,100.0,14.85,95.84,27.18,21.48,56.17,134.5,123.72,36.74,98.12,95.73,32.09,15.48,17.05,9.05,22.25,81.89,63.49,40.0,27.3,128.28,13.21,102.9,34.52,96.28,22.01,15.17,12.66,15.68,5.97,33.98,13.03,56.96,35.97,57.0,43.62,121.47,99.17,42.74,23.09,24.11
+ openai_whisper-large-v3-v20240930_547MB,False,61.3,56.47,43.16,61.91,88.9,109.85,24.17,22.93,88.74,26.09,45.97,96.7,107.38,134.76,57.25,85.1,27.18,70.85,71.7,109.68,30.21,32.19,50.95,80.91,20.97,30.91,27.66,137.02,20.13,112.37,27.76,99.08,65.2,116.67,21.24,103.29,38.97,40.09,73.1,103.36,116.51,67.78,109.58,103.61,53.89,27.16,30.42,17.39,39.13,102.58,86.68,51.97,58.32,132.81,19.31,103.2,59.92,103.96,26.14,23.24,18.05,23.09,8.1,47.03,16.56,84.54,55.51,75.58,63.11,111.89,103.54,61.24,58.17,42.66
+ openai_whisper-large-v3-v20240930_547MB,True,54.61,40.12,35.54,35.4,78.38,102.05,19.6,25.97,81.39,19.25,38.72,89.86,109.32,146.41,46.71,69.67,25.3,60.25,66.07,101.55,25.79,26.23,45.55,75.4,18.77,27.16,23.73,106.23,18.63,108.87,18.26,97.33,74.97,101.85,18.91,95.65,34.74,30.15,63.01,,120.76,47.96,104.64,100.75,41.63,20.54,27.11,14.18,33.31,96.72,74.91,48.5,38.49,129.01,17.62,101.41,47.22,99.32,26.25,20.31,17.25,22.0,7.84,45.12,16.12,77.07,49.49,71.63,57.09,114.95,101.79,53.56,39.58,33.54
+ openai_whisper-large-v2,False,94.09,119.27,112.44,106.95,110.77,122.16,75.3,61.28,112.19,43.62,91.08,112.01,137.54,118.3,90.82,118.11,25.33,100.79,152.88,115.67,79.34,113.99,69.07,91.12,50.45,40.5,112.0,113.84,99.36,123.07,96.59,110.1,52.67,100.0,47.38,106.7,125.66,95.49,,101.13,102.34,118.31,124.76,105.15,143.76,63.7,44.82,48.04,119.44,100.18,102.64,101.1,100.38,154.52,65.08,100.59,85.71,105.79,97.02,48.57,92.39,31.95,46.74,99.3,13.74,116.06,137.91,74.42,107.11,111.27,131.83,100.18,119.35,113.86
+ openai_whisper-large-v2,True,47.14,25.76,25.84,25.24,67.14,100.99,12.51,17.69,65.57,12.16,24.01,83.34,62.4,176.79,47.12,49.05,16.72,48.12,58.01,136.6,22.6,15.04,28.69,72.69,14.34,16.2,17.14,165.74,15.11,115.56,7.86,95.93,53.06,105.56,15.23,99.75,36.59,20.95,43.09,105.46,114.5,25.32,107.07,115.23,26.39,16.27,16.72,8.93,21.52,103.9,62.0,47.24,25.92,150.19,11.7,107.19,29.37,106.33,24.84,13.13,12.2,16.21,6.93,35.96,12.7,53.38,38.94,32.85,49.09,103.76,105.51,28.08,8.76,19.55
+ openai_whisper-base,False,104.18,125.5,143.16,112.16,113.79,122.43,99.57,99.07,123.02,75.03,98.01,114.12,138.03,114.66,99.56,122.48,81.21,101.78,137.07,115.04,91.47,131.68,117.3,96.63,78.16,69.85,128.78,132.18,103.59,125.21,114.33,106.87,72.69,125.93,59.81,114.59,74.81,113.57,119.31,103.18,105.38,123.9,123.21,109.11,160.77,70.55,110.22,80.49,122.28,100.7,104.89,108.98,101.32,143.37,61.29,100.7,134.52,111.44,136.38,102.17,126.88,58.74,58.79,115.46,25.71,122.69,149.93,110.1,126.44,116.79,132.15,101.37,112.61,122.57
+ openai_whisper-base,True,79.92,72.07,76.57,59.1,106.54,171.77,43.44,40.15,100.62,45.53,61.22,102.6,208.4,165.98,83.98,92.45,61.96,103.31,99.05,201.0,42.73,55.22,81.5,82.95,46.45,48.59,67.24,117.99,44.21,151.1,54.19,105.42,70.49,111.11,48.98,98.51,53.88,58.31,76.9,100.38,119.75,74.21,134.5,116.11,72.17,47.63,71.07,37.01,73.54,100.79,101.9,87.4,102.24,117.93,38.09,109.8,84.13,106.76,48.87,56.32,43.04,45.09,24.55,91.31,25.11,104.21,91.07,87.41,98.64,106.13,108.77,60.25,32.91,51.87
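Several cells in `multilingual_results.csv` are empty (for example `WER_am` in the `openai_whisper-large-v3,True` row), so a reader of this file has to treat missing values explicitly. The following is a minimal sketch of one way to parse it with the standard library. The `load_results` helper is hypothetical, `SAMPLE_CSV` keeps only three of the per-language columns with values copied from the rows above, and reading `Forced Tokens` as "the correct language was supplied as input" is an assumption based on the dashboard text.

```python
import csv
import io

# Three-column excerpt of multilingual_results.csv; values copied from the file.
SAMPLE_CSV = """Model,Forced Tokens,Average WER,WER_sl,WER_hy,WER_am
openai_whisper-large-v3,False,54.77,41.01,70.79,134.23
openai_whisper-large-v3,True,34.23,18.87,54.56,
"""


def load_results(text: str) -> list:
    """Parse the results CSV, mapping empty WER cells to None."""
    rows = []
    for record in csv.DictReader(io.StringIO(text)):
        wer_by_language = {
            column.removeprefix("WER_"): (float(value) if value else None)
            for column, value in record.items()
            if column.startswith("WER_")
        }
        rows.append(
            {
                "model": record["Model"],
                "forced": record["Forced Tokens"] == "True",
                "average_wer": float(record["Average WER"]),
                "wer_by_language": wer_by_language,
            }
        )
    return rows


rows = load_results(SAMPLE_CSV)
```

Keeping missing cells as `None` rather than `0.0` avoids silently treating an unevaluated language as a perfect score when results are aggregated or plotted.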
dashboard_data/performance_data.json ADDED
The diff for this file is too large to render. See raw diff
 
dashboard_data/quality_data.json ADDED
@@ -0,0 +1,23 @@
+ {"model": "openai/whisper-large-v3/947MB", "timestamp": "2024-10-18_16:59:10_GMT-0700", "average_wer": 9.74, "dataset_wer": {"librispeech": 2.41, "earnings22-12hours": 17.08}, "qoi": 0.94}
+ {"model": "openai/whisper-large-v2/turbo/955MB", "timestamp": "2024-10-18_16:52:35_GMT-0700", "average_wer": 7.27, "dataset_wer": {"librispeech": 2.4, "earnings22-12hours": 12.14}, "qoi": 0.94}
+ {"model": "openai/whisper-tiny.en", "timestamp": "2024-10-19_15:40:06_GMT-0700", "average_wer": 12.23, "dataset_wer": {"librispeech": 5.61, "earnings22-12hours": 18.86}, "qoi": 0.63}
+ {"model": "distil-whisper/distil-large-v3/594MB", "timestamp": "2024-10-20_13:02:33_GMT-0700", "average_wer": 8.96, "dataset_wer": {"librispeech": 2.87, "earnings22-12hours": 15.06}, "qoi": 0.86}
+ {"model": "openai/whisper-large-v2/949MB", "timestamp": "2024-10-18_19:51:30_GMT-0400", "average_wer": 7.88, "dataset_wer": {"librispeech": 2.38, "earnings22-12hours": 13.39}, "qoi": 0.94}
+ {"model": "openai/whisper-large-v3/turbo/954MB", "timestamp": "2024-10-20_13:49:26_GMT-0700", "average_wer": 22.75, "dataset_wer": {"librispeech": 2.51, "earnings22-12hours": 43.0}, "qoi": 0.93}
+ {"model": "distil-whisper/distil-large-v3", "timestamp": "2024-10-20_20:32:22_GMT-0700", "average_wer": 7.2, "dataset_wer": {"librispeech": 2.38, "earnings22-12hours": 12.02}, "qoi": 0.9}
+ {"model": "openai/whisper-large-v3-v20240930", "timestamp": "2024-10-18_18:35:46_GMT-0700", "average_wer": 6.74, "dataset_wer": {"librispeech": 1.93, "earnings22-12hours": 11.55}, "qoi": 0.94}
+ {"model": "openai/whisper-tiny", "timestamp": "2024-10-20_20:19:04_GMT-0700", "average_wer": 14.21, "dataset_wer": {"librispeech": 7.46, "earnings22-12hours": 20.97}, "qoi": 0.52}
+ {"model": "openai/whisper-large-v3-v20240930/turbo/632MB", "timestamp": "2024-10-18_20:10:30_GMT-0700", "average_wer": 6.86, "dataset_wer": {"librispeech": 1.95, "earnings22-12hours": 11.77}, "qoi": 0.93}
+ {"model": "openai/whisper-large-v2/turbo", "timestamp": "2024-10-18_14:58:38_GMT-0700", "average_wer": 7.25, "dataset_wer": {"librispeech": 2.4, "earnings22-12hours": 12.1}, "qoi": 0.96}
+ {"model": "openai/whisper-small", "timestamp": "2024-10-18_12:40:03_GMT-0700", "average_wer": 8.11, "dataset_wer": {"librispeech": 3.21, "earnings22-12hours": 13.0}, "qoi": 0.83}
+ {"model": "openai/whisper-large-v3-v20240930/turbo", "timestamp": "2024-10-18_19:37:26_GMT-0700", "average_wer": 6.72, "dataset_wer": {"librispeech": 1.92, "earnings22-12hours": 11.52}, "qoi": 0.94}
+ {"model": "openai/whisper-large-v3", "timestamp": "2024-10-18_18:01:14_GMT-0400", "average_wer": 6.85, "dataset_wer": {"librispeech": 2.02, "earnings22-12hours": 11.69}, "qoi": 0.95}
+ {"model": "openai/whisper-large-v3-v20240930/626MB", "timestamp": "2024-10-18_19:21:06_GMT-0700", "average_wer": 7.15, "dataset_wer": {"librispeech": 1.96, "earnings22-12hours": 12.35}, "qoi": 0.93}
+ {"model": "openai/whisper-base.en", "timestamp": "2024-10-20_12:31:44_GMT-0700", "average_wer": 9.59, "dataset_wer": {"librispeech": 3.98, "earnings22-12hours": 15.2}, "qoi": 0.75}
+ {"model": "openai/whisper-large-v3-v20240930/547MB", "timestamp": "2024-10-18_21:59:11_GMT-0400", "average_wer": 16.82, "dataset_wer": {"librispeech": 2.16, "earnings22-12hours": 31.49}, "qoi": 0.92}
+ {"model": "distil-whisper/distil-large-v3/turbo/600MB", "timestamp": "2024-10-18_17:50:17_GMT-0700", "average_wer": 8.33, "dataset_wer": {"librispeech": 2.8, "earnings22-12hours": 13.87}, "qoi": 0.86}
+ {"model": "openai/whisper-large-v2", "timestamp": "2024-10-18_17:07:15_GMT-0400", "average_wer": 7.32, "dataset_wer": {"librispeech": 2.36, "earnings22-12hours": 12.28}, "qoi": 0.97}
+ {"model": "openai/whisper-small.en", "timestamp": "2024-10-18_15:39:48_GMT-0400", "average_wer": 7.85, "dataset_wer": {"librispeech": 2.88, "earnings22-12hours": 12.82}, "qoi": 0.86}
+ {"model": "distil-whisper/distil-large-v3/turbo", "timestamp": "2024-10-20_12:45:20_GMT-0700", "average_wer": 7.2, "dataset_wer": {"librispeech": 2.35, "earnings22-12hours": 12.05}, "qoi": 0.9}
+ {"model": "openai/whisper-base", "timestamp": "2024-10-18_20:25:50_GMT-0700", "average_wer": 10.67, "dataset_wer": {"librispeech": 4.94, "earnings22-12hours": 16.4}, "qoi": 0.67}
+ {"model": "openai/whisper-large-v3/turbo", "timestamp": "2024-10-20_16:58:25_GMT-0400", "average_wer": 6.86, "dataset_wer": {"librispeech": 1.97, "earnings22-12hours": 11.74}, "qoi": 0.95}
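Each line of quality_data.json is one JSON object with a nested `dataset_wer` mapping. As a minimal sketch (standard library only, not the dashboard's own loader), the records can be parsed line by line and flattened into dotted keys such as `dataset_wer.librispeech`, which is the column naming main.py later filters on. The helper names `parse_json_lines` and `flatten` are hypothetical; the three sample records are copied from the file above.

```python
import json

# Three records copied from dashboard_data/quality_data.json above.
SAMPLE = """\
{"model": "openai/whisper-large-v3-v20240930/turbo", "timestamp": "2024-10-18_19:37:26_GMT-0700", "average_wer": 6.72, "dataset_wer": {"librispeech": 1.92, "earnings22-12hours": 11.52}, "qoi": 0.94}
{"model": "openai/whisper-tiny", "timestamp": "2024-10-20_20:19:04_GMT-0700", "average_wer": 14.21, "dataset_wer": {"librispeech": 7.46, "earnings22-12hours": 20.97}, "qoi": 0.52}
{"model": "openai/whisper-large-v2", "timestamp": "2024-10-18_17:07:15_GMT-0400", "average_wer": 7.32, "dataset_wer": {"librispeech": 2.36, "earnings22-12hours": 12.28}, "qoi": 0.97}
"""


def parse_json_lines(text):
    """Parse one JSON object per non-empty line (hypothetical helper)."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def flatten(record, sep="."):
    """Flatten one level of nesting into dotted keys (hypothetical helper)."""
    flat = {}
    for key, value in record.items():
        if isinstance(value, dict):
            for sub_key, sub_value in value.items():
                flat[f"{key}{sep}{sub_key}"] = sub_value
        else:
            flat[key] = value
    return flat


rows = [flatten(record) for record in parse_json_lines(SAMPLE)]

# Lower WER is better, so the best record has the minimum average_wer.
best = min(rows, key=lambda row: row["average_wer"])
print(best["model"], best["average_wer"])
```

Flattening by hand keeps the example dependency-free; main.py itself relies on `pd.json_normalize` for the same step.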
dashboard_data/support_data.csv ADDED
@@ -0,0 +1,23 @@
+ ,Model,Apple M2 Pro,Apple M2 Ultra,Apple M3,Apple M3 Max,iPad Air 11-inch (M2),iPad mini (A17 Pro),iPad Pro 11-inch (M4),iPhone 11,iPhone 13 Pro,iPhone 13,iPhone 14,iPhone 16 Pro
+ distil-whisper_distil-large-v3,distil-whisper_distil-large-v3,✅ macOS 15.0.1,✅ macOS 15.0.1,✅ macOS 15.0.1,✅ macOS 15.1,✅ iPadOS 17.6.1,✅ iPadOS 18.0.1,✅ iPadOS 18.1,Not Supported,Not Supported,Not Supported,Not Supported,Not Supported
+ distil-whisper_distil-large-v3_594MB,distil-whisper_distil-large-v3_594MB,✅ macOS 15.0.1,✅ macOS 15.0.1,✅ macOS 15.0.1,✅ macOS 15.1,✅ iPadOS 17.6.1,✅ iPadOS 18.0.1,✅ iPadOS 18.1,Not Supported,✅ iOS 18.0,✅ iOS 17.3,✅ iOS 17.3,✅ iOS 18.0.1<p>✅ iOS 18.0</p>
+ distil-whisper_distil-large-v3_turbo,distil-whisper_distil-large-v3_turbo,✅ macOS 15.0.1,✅ macOS 15.0.1,✅ macOS 15.0.1,✅ macOS 15.1,⚠️ <a style='color: #3B82F6; text-decoration: underline; text-decoration-style: dotted;' href=https://huggingface.co/datasets/argmaxinc/whisperkit-evals-dataset/blob/main/benchmark_data/2024-10-25T012729_6962d0d/iPad14%2C8_summary_2024-10-25T032747.json>iPadOS 17.6.1</a>,⚠️ <a style='color: #3B82F6; text-decoration: underline; text-decoration-style: dotted;' href=https://huggingface.co/datasets/argmaxinc/whisperkit-evals-dataset/blob/main/benchmark_data/2024-10-25T012729_6962d0d/iPad16%2C1_summary_2024-10-25T054749.json>iPadOS 18.0.1</a>,⚠️ <a style='color: #3B82F6; text-decoration: underline; text-decoration-style: dotted;' href=https://huggingface.co/datasets/argmaxinc/whisperkit-evals-dataset/blob/main/benchmark_data/2024-10-25T012729_6962d0d/iPad16%2C3_summary_2024-10-25T032747.json>iPadOS 18.1</a>,Not Supported,Not Supported,Not Supported,Not Supported,Not Supported
+ distil-whisper_distil-large-v3_turbo_600MB,distil-whisper_distil-large-v3_turbo_600MB,⚠️ <a style='color: #3B82F6; text-decoration: underline; text-decoration-style: dotted;' href=https://huggingface.co/datasets/argmaxinc/whisperkit-evals-dataset/blob/main/benchmark_data/2024-10-25T012729_6962d0d/Mac14%2C12_summary_2024-10-25T031359.json>macOS 15.0.1</a>,✅ macOS 15.0.1,✅ macOS 15.0.1,✅ macOS 15.1,✅ iPadOS 17.6.1,✅ iPadOS 18.0.1,✅ iPadOS 18.1,Not Supported,✅ iOS 18.0,✅ iOS 17.3,✅ iOS 17.3,✅ iOS 18.0.1<p>✅ iOS 18.0</p>
+ openai_whisper-base,openai_whisper-base,⚠️ <a style='color: #3B82F6; text-decoration: underline; text-decoration-style: dotted;' href=https://huggingface.co/datasets/argmaxinc/whisperkit-evals-dataset/blob/main/benchmark_data/2024-10-25T012729_6962d0d/Mac14%2C12_summary_2024-10-25T031359.json>macOS 15.0.1</a>,✅ macOS 15.0.1,✅ macOS 15.0.1,✅ macOS 15.1,✅ iPadOS 17.6.1,✅ iPadOS 18.0.1,✅ iPadOS 18.1,✅ iOS 17.6.1,✅ iOS 18.0,✅ iOS 17.3,✅ iOS 17.3,✅ iOS 18.0.1<p>✅ iOS 18.0</p>
+ openai_whisper-base.en,openai_whisper-base.en,✅ macOS 15.0.1,✅ macOS 15.0.1,✅ macOS 15.0.1,✅ macOS 15.1,✅ iPadOS 17.6.1,✅ iPadOS 18.0.1,✅ iPadOS 18.1,✅ iOS 17.6.1,✅ iOS 18.0,✅ iOS 17.3,✅ iOS 17.3,✅ iOS 18.0.1<p>✅ iOS 18.0</p>
+ openai_whisper-large-v2,openai_whisper-large-v2,⚠️ <a style='color: #3B82F6; text-decoration: underline; text-decoration-style: dotted;' href=https://huggingface.co/datasets/argmaxinc/whisperkit-evals-dataset/blob/main/benchmark_data/2024-10-25T012729_6962d0d/Mac14%2C12_summary_2024-10-25T031359.json>macOS 15.0.1</a>,✅ macOS 15.0.1,✅ macOS 15.0.1,✅ macOS 15.1,Not Supported,Not Supported,Not Supported,Not Supported,Not Supported,Not Supported,Not Supported,Not Supported
+ openai_whisper-large-v2_949MB,openai_whisper-large-v2_949MB,⚠️ <a style='color: #3B82F6; text-decoration: underline; text-decoration-style: dotted;' href=https://huggingface.co/datasets/argmaxinc/whisperkit-evals-dataset/blob/main/benchmark_data/2024-10-25T012729_6962d0d/Mac14%2C12_summary_2024-10-25T031359.json>macOS 15.0.1</a>,✅ macOS 15.0.1,✅ macOS 15.0.1,✅ macOS 15.1,✅ iPadOS 17.6.1,✅ iPadOS 18.0.1,✅ iPadOS 18.1,Not Supported,✅ iOS 18.0,⚠️ <a style='color: #3B82F6; text-decoration: underline; text-decoration-style: dotted;' href=https://huggingface.co/datasets/argmaxinc/whisperkit-evals-dataset/blob/main/benchmark_data/2024-10-25T012729_6962d0d/iPhone14%2C5_summary_2024-10-25T032747.json>iOS 17.3</a>,✅ iOS 17.3,✅ iOS 18.0.1<p>✅ iOS 18.0</p>
+ openai_whisper-large-v2_turbo,openai_whisper-large-v2_turbo,✅ macOS 15.0.1,✅ macOS 15.0.1,✅ macOS 15.0.1,✅ macOS 15.1,⚠️ <a style='color: #3B82F6; text-decoration: underline; text-decoration-style: dotted;' href=https://huggingface.co/datasets/argmaxinc/whisperkit-evals-dataset/blob/main/benchmark_data/2024-10-25T012729_6962d0d/iPad14%2C8_summary_2024-10-25T032747.json>iPadOS 17.6.1</a>,⚠️ <a style='color: #3B82F6; text-decoration: underline; text-decoration-style: dotted;' href=https://huggingface.co/datasets/argmaxinc/whisperkit-evals-dataset/blob/main/benchmark_data/2024-10-25T012729_6962d0d/iPad16%2C1_summary_2024-10-25T054749.json>iPadOS 18.0.1</a>,Not Supported,Not Supported,Not Supported,Not Supported,Not Supported,Not Supported
+ openai_whisper-large-v2_turbo_955MB,openai_whisper-large-v2_turbo_955MB,✅ macOS 15.0.1,✅ macOS 15.0.1,✅ macOS 15.0.1,✅ macOS 15.1,✅ iPadOS 17.6.1,✅ iPadOS 18.0.1,✅ iPadOS 18.1,Not Supported,✅ iOS 18.0,✅ iOS 17.3,✅ iOS 17.3,✅ iOS 18.0.1<p>✅ iOS 18.0</p>
+ openai_whisper-large-v3,openai_whisper-large-v3,✅ macOS 15.0.1,✅ macOS 15.0.1,✅ macOS 15.0.1,✅ macOS 15.1,Not Supported,Not Supported,Not Supported,Not Supported,Not Supported,Not Supported,Not Supported,Not Supported
+ openai_whisper-large-v3-v20240930,openai_whisper-large-v3-v20240930,✅ macOS 15.0.1,✅ macOS 15.0.1,✅ macOS 15.0.1,✅ macOS 15.1,⚠️ <a style='color: #3B82F6; text-decoration: underline; text-decoration-style: dotted;' href=https://huggingface.co/datasets/argmaxinc/whisperkit-evals-dataset/blob/main/benchmark_data/2024-10-25T012729_6962d0d/iPad14%2C8_summary_2024-10-25T032747.json>iPadOS 17.6.1</a>,⚠️ <a style='color: #3B82F6; text-decoration: underline; text-decoration-style: dotted;' href=https://huggingface.co/datasets/argmaxinc/whisperkit-evals-dataset/blob/main/benchmark_data/2024-10-25T012729_6962d0d/iPad16%2C1_summary_2024-10-25T054749.json>iPadOS 18.0.1</a>,⚠️ <a style='color: #3B82F6; text-decoration: underline; text-decoration-style: dotted;' href=https://huggingface.co/datasets/argmaxinc/whisperkit-evals-dataset/blob/main/benchmark_data/2024-10-25T012729_6962d0d/iPad16%2C3_summary_2024-10-25T032747.json>iPadOS 18.1</a>,Not Supported,Not Supported,Not Supported,Not Supported,Not Supported
+ openai_whisper-large-v3-v20240930_626MB,openai_whisper-large-v3-v20240930_626MB,✅ macOS 15.0.1,✅ macOS 15.0.1,✅ macOS 15.0.1,✅ macOS 15.1,✅ iPadOS 17.6.1,✅ iPadOS 18.0.1,✅ iPadOS 18.1,Not Supported,✅ iOS 18.0,✅ iOS 17.3,✅ iOS 17.3,✅ iOS 18.0.1<p>✅ iOS 18.0</p>
+ openai_whisper-large-v3-v20240930_turbo,openai_whisper-large-v3-v20240930_turbo,✅ macOS 15.0.1,✅ macOS 15.0.1,✅ macOS 15.0.1,✅ macOS 15.1,Not Supported,⚠️ <a style='color: #3B82F6; text-decoration: underline; text-decoration-style: dotted;' href=https://huggingface.co/datasets/argmaxinc/whisperkit-evals-dataset/blob/main/benchmark_data/2024-10-25T012729_6962d0d/iPad16%2C1_summary_2024-10-25T054749.json>iPadOS 18.0.1</a>,⚠️ <a style='color: #3B82F6; text-decoration: underline; text-decoration-style: dotted;' href=https://huggingface.co/datasets/argmaxinc/whisperkit-evals-dataset/blob/main/benchmark_data/2024-10-25T012729_6962d0d/iPad16%2C3_summary_2024-10-25T032747.json>iPadOS 18.1</a>,Not Supported,Not Supported,Not Supported,Not Supported,Not Supported
+ openai_whisper-large-v3-v20240930_turbo_632MB,openai_whisper-large-v3-v20240930_turbo_632MB,✅ macOS 15.0.1,✅ macOS 15.0.1,✅ macOS 15.0.1,✅ macOS 15.1,✅ iPadOS 17.6.1,✅ iPadOS 18.0.1,✅ iPadOS 18.1,Not Supported,✅ iOS 18.0,⚠️ <a style='color: #3B82F6; text-decoration: underline; text-decoration-style: dotted;' href=https://huggingface.co/datasets/argmaxinc/whisperkit-evals-dataset/blob/main/benchmark_data/2024-10-25T012729_6962d0d/iPhone14%2C5_summary_2024-10-25T032747.json>iOS 17.3</a>,⚠️ <a style='color: #3B82F6; text-decoration: underline; text-decoration-style: dotted;' href=https://huggingface.co/datasets/argmaxinc/whisperkit-evals-dataset/blob/main/benchmark_data/2024-10-25T012729_6962d0d/iPhone14%2C7_summary_2024-10-25T032747.json>iOS 17.3</a>,✅ iOS 18.0.1<p>✅ iOS 18.0</p>
+ openai_whisper-large-v3_947MB,openai_whisper-large-v3_947MB,✅ macOS 15.0.1,✅ macOS 15.0.1,✅ macOS 15.0.1,✅ macOS 15.1,✅ iPadOS 17.6.1,✅ iPadOS 18.0.1,✅ iPadOS 18.1,Not Supported,✅ iOS 18.0,⚠️ <a style='color: #3B82F6; text-decoration: underline; text-decoration-style: dotted;' href=https://huggingface.co/datasets/argmaxinc/whisperkit-evals-dataset/blob/main/benchmark_data/2024-10-25T012729_6962d0d/iPhone14%2C5_summary_2024-10-25T032747.json>iOS 17.3</a>,✅ iOS 17.3,✅ iOS 18.0.1<p>✅ iOS 18.0</p>
+ openai_whisper-large-v3_turbo,openai_whisper-large-v3_turbo,✅ macOS 15.0.1,✅ macOS 15.0.1,✅ macOS 15.0.1,✅ macOS 15.1,Not Supported,Not Supported,Not Supported,Not Supported,Not Supported,Not Supported,Not Supported,Not Supported
+ openai_whisper-large-v3_turbo_954MB,openai_whisper-large-v3_turbo_954MB,✅ macOS 15.0.1,✅ macOS 15.0.1,✅ macOS 15.0.1,✅ macOS 15.1,✅ iPadOS 17.6.1,✅ iPadOS 18.0.1,✅ iPadOS 18.1,Not Supported,✅ iOS 18.0,✅ iOS 17.3,✅ iOS 17.3,✅ iOS 18.0.1<p>✅ iOS 18.0</p>
+ openai_whisper-small,openai_whisper-small,✅ macOS 15.0.1,✅ macOS 15.0.1,✅ macOS 15.0.1,✅ macOS 15.1,✅ iPadOS 17.6.1,✅ iPadOS 18.0.1,✅ iPadOS 18.1,Not Supported,✅ iOS 18.0,✅ iOS 17.3,✅ iOS 17.3,✅ iOS 18.0.1<p>✅ iOS 18.0</p>
+ openai_whisper-small.en,openai_whisper-small.en,⚠️ <a style='color: #3B82F6; text-decoration: underline; text-decoration-style: dotted;' href=https://huggingface.co/datasets/argmaxinc/whisperkit-evals-dataset/blob/main/benchmark_data/2024-10-25T012729_6962d0d/Mac14%2C12_summary_2024-10-25T031359.json>macOS 15.0.1</a>,✅ macOS 15.0.1,✅ macOS 15.0.1,✅ macOS 15.1,✅ iPadOS 17.6.1,✅ iPadOS 18.0.1,✅ iPadOS 18.1,Not Supported,✅ iOS 18.0,✅ iOS 17.3,✅ iOS 17.3,✅ iOS 18.0.1<p>✅ iOS 18.0</p>
+ openai_whisper-tiny,openai_whisper-tiny,✅ macOS 15.0.1,✅ macOS 15.0.1,✅ macOS 15.0.1,✅ macOS 15.1,✅ iPadOS 17.6.1,✅ iPadOS 18.0.1,✅ iPadOS 18.1,Not Supported,✅ iOS 18.0,✅ iOS 17.3,✅ iOS 17.3,✅ iOS 18.0.1<p>✅ iOS 18.0</p>
+ openai_whisper-tiny.en,openai_whisper-tiny.en,✅ macOS 15.0.1,✅ macOS 15.0.1,✅ macOS 15.0.1,✅ macOS 15.1,✅ iPadOS 17.6.1,✅ iPadOS 18.0.1,✅ iPadOS 18.1,✅ iOS 17.6.1,✅ iOS 18.0,✅ iOS 17.3,✅ iOS 17.3,✅ iOS 18.0.1<p>✅ iOS 18.0</p>
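Cells in support_data.csv mix a status marker with an OS version, and some wrap the version in an HTML link or a trailing `<p>` element. The sketch below shows one way such a cell could be reduced to a status label and a list of versions. It is an assumption-laden illustration, not code from this commit: the helper name `cell_status`, the labels `ok`, `warning` and `unsupported`, and the shortened `example.invalid` link are all hypothetical.

```python
import re

TAG_RE = re.compile(r"<[^>]+>")  # any HTML tag, such as <a ...>, </a>, <p>
VERSION_RE = re.compile(r"(?:macOS|iPadOS|iOS) \d+(?:\.\d+)*")


def cell_status(cell):
    """Return (status, versions) for one support-table cell (hypothetical helper)."""
    # Replace tags with spaces, then collapse runs of whitespace.
    text = " ".join(TAG_RE.sub(" ", cell).split())
    if text == "Not Supported":
        return ("unsupported", [])
    # "\u26a0" is the warning sign; the check mark cells carry no such character.
    status = "warning" if "\u26a0" in text else "ok"
    return (status, VERSION_RE.findall(text))


cells = [
    "✅ macOS 15.0.1",
    "Not Supported",
    "✅ iOS 18.0.1<p>✅ iOS 18.0</p>",
    # Link shortened to a placeholder URL for illustration.
    "⚠️ <a style='color: #3B82F6;' href=https://example.invalid/summary.json>iPadOS 17.6.1</a>",
]
for cell in cells:
    print(cell_status(cell))
```

Stripping tags before matching keeps the version pattern simple, at the cost of discarding the link target that the warning cells point to.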
main.py ADDED
@@ -0,0 +1,1302 @@
+ """
+ Main module for the WhisperKit Evaluation Dashboard.
+ This module sets up and runs the Gradio interface for the WhisperKit Evaluation Dashboard,
+ allowing users to explore and compare speech recognition model performance across different
+ devices, operating systems, and datasets.
+ """
+
+ import json
+ import os
+ import re
+ from math import ceil, floor
+
+ import gradio as gr
+ import pandas as pd
+ from argmax_gradio_components import RangeSlider
+ from dotenv import load_dotenv
+ from huggingface_hub import login
+
+ # Import custom constants and utility functions
+ from constants import (
+     BANNER_TEXT,
+     CITATION_BUTTON_LABEL,
+     CITATION_BUTTON_TEXT,
+     COL_NAMES,
+     HEADER,
+     LANGUAGE_MAP,
+     METHODOLOGY_TEXT,
+     PERFORMANCE_TEXT,
+     QUALITY_TEXT,
+ )
+ from utils import (
+     add_datasets_to_performance_columns,
+     add_datasets_to_quality_columns,
+     calculate_parity,
+     create_confusion_matrix_plot,
+     create_initial_performance_column_dict,
+     create_initial_quality_column_dict,
+     css,
+     fields,
+     get_os_name_and_version,
+     make_dataset_wer_clickable_link,
+     make_model_name_clickable_link,
+     make_multilingual_model_clickable_link,
+     plot_metric,
+     read_json_line_by_line,
+ )
+
+ # Load environment variables
+ load_dotenv()
+
+ # Get the Hugging Face token from the environment variable
+ HF_TOKEN = os.getenv("HF_TOKEN")
+
+ # Use the token for login
+ login(token=HF_TOKEN, add_to_git_credential=True)
+
+ # Define repository and directory information
+ repo_id = "argmaxinc/whisperkit-evals-dataset"
+ directory = "xcresults/benchmark_results"
+ local_dir = ""
+
+ # Load benchmark data from JSON files
+ PERFORMANCE_DATA = read_json_line_by_line("dashboard_data/performance_data.json")
+ QUALITY_DATA = read_json_line_by_line("dashboard_data/quality_data.json")
+
+ # Convert JSON data to pandas DataFrames
+ quality_df = pd.json_normalize(QUALITY_DATA)
+ benchmark_df = pd.json_normalize(PERFORMANCE_DATA)
+
+ # Process timestamp data
+ benchmark_df["timestamp"] = pd.to_datetime(benchmark_df["timestamp"]).dt.tz_localize(
+     None
+ )
+
+ # First create a temporary column for model length
+ sorted_quality_df = (
+     quality_df.assign(model_len=quality_df["model"].str.len())
+     .sort_values(
+         by=["model_len", "model", "timestamp"],
+         ascending=[True, True, False],
+     )
+     .drop(columns=["model_len"])
+     .drop_duplicates(subset=["model"], keep="first")
+     .reset_index(drop=True)
+ )
+
+ sorted_performance_df = (
+     benchmark_df.assign(model_len=benchmark_df["model"].str.len())
+     .sort_values(
+         by=["model_len", "model", "device", "os", "timestamp"],
+         ascending=[True, True, True, True, False],
+     )
+     .drop(columns=["model_len"])
+     .drop_duplicates(subset=["model", "device", "os"], keep="first")
+     .reset_index(drop=True)
+ )
+
+ # Identify dataset-specific columns
+ dataset_wer_columns = [
+     col for col in sorted_quality_df.columns if col.startswith("dataset_wer.")
+ ]
+ dataset_speed_columns = [
+     col for col in sorted_performance_df.columns if col.startswith("dataset_speed.")
+ ]
+ dataset_toks_columns = [
+     col
+     for col in sorted_performance_df.columns
+     if col.startswith("dataset_tokens_per_second.")
+ ]
+
+ # Extract dataset names
+ QUALITY_DATASETS = [col.split(".")[-1] for col in dataset_wer_columns]
+ PERFORMANCE_DATASETS = [col.split(".")[-1] for col in dataset_speed_columns]
+
+ # Prepare DataFrames for display
+ model_df = sorted_quality_df[
+     ["model", "average_wer", "qoi", "timestamp"] + dataset_wer_columns
+ ]
+ performance_df = sorted_performance_df[
+     [
+         "model",
+         "device",
+         "os",
+         "average_wer",
+         "qoi",
+         "speed",
+         "tokens_per_second",
+         "timestamp",
+     ]
+     + dataset_speed_columns
+     + dataset_toks_columns
+ ].copy()
+
+ # Rename columns for clarity
+ performance_df = performance_df.rename(
+     lambda x: COL_NAMES[x] if x in COL_NAMES else x, axis="columns"
+ )
+ model_df = model_df.rename(
+     lambda x: COL_NAMES[x] if x in COL_NAMES else x, axis="columns"
+ )
+
+ # Process dataset-specific columns
+ for col in dataset_wer_columns:
+     dataset_name = col.split(".")[-1]
+     model_df = model_df.rename(columns={col: dataset_name})
+     model_df[dataset_name] = model_df.apply(
+         lambda x: make_dataset_wer_clickable_link(x, dataset_name), axis=1
+     )
+
+ for col in dataset_speed_columns:
+     dataset_name = col.split(".")[-1]
+     performance_df = performance_df.rename(
+         columns={
+             col: f"{'Short-Form' if dataset_name == 'librispeech-10mins' else 'Long-Form'} Speed"
+         }
+     )
+
+ for col in dataset_toks_columns:
+     dataset_name = col.split(".")[-1]
+     performance_df = performance_df.rename(
+         columns={
+             col: f"{'Short-Form' if dataset_name == 'librispeech-10mins' else 'Long-Form'} Tok/s"
+         }
+     )
+
+ # Calculate parity with M2 Ultra
+ m2_ultra_wer = (
+     performance_df[performance_df["Device"] == "Apple M2 Ultra"]
+     .groupby("Model")["Average WER"]
+     .first()
+ )
+ performance_df["Parity %"] = performance_df.apply(
+     lambda row: calculate_parity(m2_ultra_wer, row), axis=1
+ )
+
+ # Process model names for display
+ model_df["model_raw"] = model_df["Model"].copy()
+ performance_df["model_raw"] = performance_df["Model"].copy()
+ model_df["Model"] = model_df["Model"].apply(lambda x: make_model_name_clickable_link(x))
+ performance_df["Model"] = performance_df["Model"].apply(
+     lambda x: make_model_name_clickable_link(x)
+ )
+
+ # Extract unique devices and OS versions
+ PERFORMANCE_DEVICES = performance_df["Device"].unique().tolist()
+ PERFORMANCE_OS = performance_df["OS"].apply(get_os_name_and_version).unique().tolist()
+ PERFORMANCE_OS.sort()
+
+ # Create initial column dictionaries and update with dataset information
+ initial_performance_column_dict = create_initial_performance_column_dict()
+ initial_quality_column_dict = create_initial_quality_column_dict()
+
+ performance_column_info = add_datasets_to_performance_columns(
+     initial_performance_column_dict, PERFORMANCE_DATASETS
+ )
+ quality_column_info = add_datasets_to_quality_columns(
+     initial_quality_column_dict, QUALITY_DATASETS
+ )
+
+ # Unpack the returned dictionaries
+ updated_performance_column_dict = performance_column_info["column_dict"]
+ updated_quality_column_dict = quality_column_info["column_dict"]
+
+ PerformanceAutoEvalColumn = performance_column_info["AutoEvalColumn"]
+ QualityAutoEvalColumn = quality_column_info["AutoEvalColumn"]
+
+ # Define column sets for different views
+ PERFORMANCE_COLS = performance_column_info["COLS"]
+ QUALITY_COLS = quality_column_info["COLS"]
+ PERFORMANCE_TYPES = performance_column_info["TYPES"]
+ QUALITY_TYPES = quality_column_info["TYPES"]
+ PERFORMANCE_ALWAYS_HERE_COLS = performance_column_info["ALWAYS_HERE_COLS"]
+ QUALITY_ALWAYS_HERE_COLS = quality_column_info["ALWAYS_HERE_COLS"]
+ PERFORMANCE_TOGGLE_COLS = performance_column_info["TOGGLE_COLS"]
+ QUALITY_TOGGLE_COLS = quality_column_info["TOGGLE_COLS"]
+ PERFORMANCE_SELECTED_COLS = performance_column_info["SELECTED_COLS"]
+ QUALITY_SELECTED_COLS = quality_column_info["SELECTED_COLS"]
+
+
+ def performance_filter(
+     df,
+     columns,
+     model_query,
+     exclude_models,
+     devices,
+     os,
+     short_speed_slider,
+     long_speed_slider,
+     short_toks_slider,
+     long_toks_slider,
+ ):
+     """
+     Filters the performance DataFrame based on specified criteria.
+     :param df: The DataFrame to be filtered.
+     :param columns: The columns to be included in the filtered DataFrame.
+     :param model_query: The query string to filter the 'Model' column.
+     :param exclude_models: Models to exclude from the results.
+     :param devices: The devices to filter the 'Device' column.
+     :param os: The list of operating systems to filter the 'OS' column.
+     :param short_speed_slider: The range of values to filter the 'Short-Form Speed' column.
+     :param long_speed_slider: The range of values to filter the 'Long-Form Speed' column.
+     :param short_toks_slider: The range of values to filter the 'Short-Form Tok/s' column.
+     :param long_toks_slider: The range of values to filter the 'Long-Form Tok/s' column.
+     :return: The filtered DataFrame.
+     """
+     # Select columns based on input and always-present columns
+     filtered_df = df[
+         PERFORMANCE_ALWAYS_HERE_COLS
+         + [c for c in PERFORMANCE_COLS if c in df.columns and c in columns]
+     ]
+
+     # Filter models based on query
+     if model_query:
+         filtered_df = filtered_df[
+             filtered_df["Model"].str.contains(
+                 "|".join(q.strip() for q in model_query.split(";")), case=False
+             )
+         ]
+
+     # Exclude specified models
+     if exclude_models:
+         exclude_list = [m.strip() for m in exclude_models.split(";")]
+         filtered_df = filtered_df[
+             ~filtered_df["Model"].str.contains("|".join(exclude_list), case=False)
+         ]
+
+     # Filter by devices
+     filtered_df = (
+         filtered_df[
+             (
+                 filtered_df["Device"].str.contains(
+                     "|".join(re.escape(q.strip()) for q in devices), case=False
+                 )
+             )
+         ]
+         if devices
+         else pd.DataFrame(columns=filtered_df.columns)
+     )
+
+     # Filter by operating systems (escaped so that "." in a version matches literally)
+     filtered_df = (
+         filtered_df[
+             (
+                 filtered_df["OS"].str.contains(
+                     "|".join(re.escape(q.strip()) for q in os), case=False
+                 )
+             )
+         ]
+         if os
+         else pd.DataFrame(columns=filtered_df.columns)
+     )
+
+     # Apply short-form and long-form speed and tokens per second filters
+     min_short_speed, max_short_speed = short_speed_slider
+     min_long_speed, max_long_speed = long_speed_slider
+     min_short_toks, max_short_toks = short_toks_slider
+     min_long_toks, max_long_toks = long_toks_slider
+
+     filtered_df = filtered_df[
+         (filtered_df["Short-Form Speed"] >= min_short_speed)
+         & (filtered_df["Short-Form Speed"] <= max_short_speed)
+         & (filtered_df["Long-Form Speed"] >= min_long_speed)
+         & (filtered_df["Long-Form Speed"] <= max_long_speed)
+         & (filtered_df["Short-Form Tok/s"] >= min_short_toks)
+         & (filtered_df["Short-Form Tok/s"] <= max_short_toks)
+         & (filtered_df["Long-Form Tok/s"] >= min_long_toks)
+         & (filtered_df["Long-Form Tok/s"] <= max_long_toks)
+     ]
+
+     return filtered_df
+
+
+ def quality_filter(df, columns, model_query, wer_slider, qoi_slider, exclude_models):
+     """
+     Filters the quality DataFrame based on specified criteria.
+     :param df: The DataFrame to be filtered.
+     :param columns: The columns to be included in the filtered DataFrame.
+     :param model_query: The query string to filter the 'Model' column.
+     :param wer_slider: The range of values to filter the 'Average WER' column.
+     :param qoi_slider: The range of values to filter the 'QoI' column.
+     :param exclude_models: Models to exclude from the results.
+     :return: The filtered DataFrame.
+     """
+     # Select columns based on input and always-present columns
+     filtered_df = df[
+         QUALITY_ALWAYS_HERE_COLS
+         + [c for c in QUALITY_COLS if c in df.columns and c in columns]
+     ]
+
+     # Filter models based on query
+     if model_query:
+         filtered_df = filtered_df[
+             filtered_df["Model"].str.contains(
+                 "|".join(q.strip() for q in model_query.split(";")), case=False
+             )
+         ]
+
+     # Exclude specified models
+     if exclude_models:
+         exclude_list = [m.strip() for m in exclude_models.split(";")]
+         filtered_df = filtered_df[
+             ~filtered_df["Model"].str.contains("|".join(exclude_list), case=False)
+         ]
+
+     # Apply WER and QoI filters
+     min_wer_slider, max_wer_slider = wer_slider
+     min_qoi_slider, max_qoi_slider = qoi_slider
+     if "Average WER" in filtered_df.columns:
+         filtered_df = filtered_df[
+             (filtered_df["Average WER"] >= min_wer_slider)
+             & (filtered_df["Average WER"] <= max_wer_slider)
+         ]
+     if "QoI" in filtered_df.columns:
+         filtered_df = filtered_df[
+             (filtered_df["QoI"] >= min_qoi_slider)
+             & (filtered_df["QoI"] <= max_qoi_slider)
+         ]
+
+     return filtered_df
+
+
+ diff_tab = gr.TabItem("Difference Checker", elem_id="diff_checker", id=2)
+ text_diff_elems = []
+
+ tabs = gr.Tabs(elem_id="tab-elems")
+
+ multilingual_df = pd.read_csv("dashboard_data/multilingual_results.csv")
+ multilingual_models_df = multilingual_df[["Model"]].drop_duplicates()
+ multilingual_models_buttons = []
+ for model in multilingual_models_df["Model"]:
+     elem_id = (
+         f"{model}".replace(" ", "_").replace('"', "").replace("'", "").replace(",", "")
+     )
+     multilingual_models_buttons.append(
+         gr.Button(value=model, elem_id=elem_id, visible=False)
+     )
+ multilingual_models_df["Model"] = multilingual_models_df["Model"].apply(
+     lambda x: make_multilingual_model_clickable_link(x)
+ )
+
+ with open("dashboard_data/multilingual_confusion_matrices.json", "r") as file:
+     confusion_matrix_map = dict(json.load(file))
+
+
+ def update_multilingual_results(selected_model):
+     """
+     Updates the multilingual results display based on the selected model.
+
+     This function processes the multilingual data for the chosen model,
+     calculates average WER for different scenarios (language hinted vs. predicted),
+     and prepares language-specific WER data for display.
+
+     :param selected_model: The name of the selected model
+     :return: A list containing updated components for the Gradio interface
+     """
+     if selected_model is None:
+         return "# Select a model from the dropdown to view results."
+
+     # Filter data for the selected model
+     model_data = multilingual_df[multilingual_df["Model"] == selected_model]
+
+     if model_data.empty:
+         return f"# No data available for model: {selected_model}"
+
+     # Separate data for forced and not forced scenarios
+     forced_data = model_data[model_data["Forced Tokens"] == True]
+     not_forced_data = model_data[model_data["Forced Tokens"] == False]
+
+     result_text = f"# Model: {selected_model}\n\n"
+
+     # Prepare average WER data
+     average_wer_data = []
+     if not forced_data.empty:
+         average_wer_data.append(
+             {
+                 "Scenario": "Language Hinted",
+                 "Average WER": forced_data.iloc[0]["Average WER"],
+             }
+         )
+     if not not_forced_data.empty:
+         average_wer_data.append(
+             {
+                 "Scenario": "Language Predicted",
+                 "Average WER": not_forced_data.iloc[0]["Average WER"],
+             }
+         )
+     average_wer_df = pd.DataFrame(average_wer_data)
+     average_wer_df["Average WER"] = average_wer_df["Average WER"].apply(
+         lambda x: round(x, 2)
+     )
+
+     # Prepare language-specific WER data
+     lang_columns = [col for col in model_data.columns if col.startswith("WER_")]
+     lang_wer_data = []
+     for column in lang_columns:
+         lang = column.split("_")[1]
+         forced_wer = forced_data[column].iloc[0] if not forced_data.empty else None
+         not_forced_wer = (
+             not_forced_data[column].iloc[0] if not not_forced_data.empty else None
+         )
+         if forced_wer is not None or not_forced_wer is not None:
+             lang_wer_data.append(
+                 {
+                     "Language": LANGUAGE_MAP[lang],
+                     "Language Hinted WER": round(forced_wer, 2)
+                     if forced_wer is not None
+                     else "N/A",
+                     "Language Predicted WER": round(not_forced_wer, 2)
+                     if not_forced_wer is not None
+                     else "N/A",
+                 }
+             )
+     lang_wer_df = pd.DataFrame(lang_wer_data)
+     lang_wer_df = lang_wer_df.fillna("No Data")
+
+     # Create confusion matrix plot for unforced scenario
+     unforced_plot = None
+     if selected_model in confusion_matrix_map:
+         if "not_forced" in confusion_matrix_map[selected_model]:
+             unforced_plot = create_confusion_matrix_plot(
+                 confusion_matrix_map[selected_model]["not_forced"]["matrix"],
+                 confusion_matrix_map[selected_model]["not_forced"]["labels"],
+                 False,
+             )
+
+     # Return updated components for Gradio interface
+     return [
+         gr.update(value=result_text),
472
+ gr.update(visible=True, value=average_wer_df),
473
+ gr.update(visible=True, value=lang_wer_df),
474
+ gr.update(visible=unforced_plot is not None, value=unforced_plot),
475
+ ]
476
+
477
+
478
+ font = [
479
+ "Zwizz Regular", # Local font
480
+ "IBM Plex Mono", # Monospace font
481
+ "ui-sans-serif",
482
+ "system-ui",
483
+ "sans-serif",
484
+ ]
485
+
486
+ # Define the Gradio interface
487
+ with gr.Blocks(css=css, theme=gr.themes.Base(font=font)) as demo:
488
+ # Add header and banner to the interface
489
+ gr.HTML(HEADER)
490
+ gr.HTML(BANNER_TEXT, elem_classes="markdown-text")
491
+
492
+ # Create tabs for different sections of the dashboard
493
+ with tabs.render():
494
+ # Performance Tab
495
+ with gr.TabItem("Performance", elem_id="benchmark", id=0):
496
+ with gr.Row():
497
+ with gr.Column(scale=1):
498
+ with gr.Row():
499
+ with gr.Column(scale=6, elem_classes="filter_models_column"):
500
+ filter_performance_models = gr.Textbox(
501
+ placeholder="🔍 Filter Model (separate multiple queries with ';')",
502
+ label="Filter Models",
503
+ )
504
+ with gr.Column(scale=4, elem_classes="exclude_models_column"):
505
+ exclude_performance_models = gr.Textbox(
506
+ placeholder="🔍 Exclude Model (separate multiple queries with ';')",
507
+ label="Exclude Models",
508
+ )
509
+ with gr.Row():
510
+ with gr.Accordion("See All Columns", open=False):
511
+ with gr.Row():
512
+ with gr.Column(scale=9, elem_id="performance_columns"):
513
+ performance_shown_columns = gr.CheckboxGroup(
514
+ choices=PERFORMANCE_TOGGLE_COLS,
515
+ value=PERFORMANCE_SELECTED_COLS,
516
+ label="Toggle Columns",
517
+ elem_id="column-select",
518
+ interactive=True,
519
+ )
520
+ with gr.Column(
521
+ scale=1,
522
+ min_width=200,
523
+ elem_id="performance_select_columns",
524
+ ):
525
+ with gr.Row():
526
+ select_all_button = gr.Button(
527
+ "Select All",
528
+ elem_id="select-all-button",
529
+ interactive=True,
530
+ )
531
+ deselect_all_button = gr.Button(
532
+ "Deselect All",
533
+ elem_id="deselect-all-button",
534
+ interactive=True,
535
+ )
536
+
537
+ def select_all_columns():
538
+ return PERFORMANCE_TOGGLE_COLS
539
+
540
+ def deselect_all_columns():
541
+ return []
542
+
543
+ select_all_button.click(
544
+ select_all_columns,
545
+ inputs=[],
546
+ outputs=performance_shown_columns,
547
+ )
548
+ deselect_all_button.click(
549
+ deselect_all_columns,
550
+ inputs=[],
551
+ outputs=performance_shown_columns,
552
+ )
553
+
554
+ with gr.Row():
555
+ with gr.Accordion("Filter Devices", open=False):
556
+ with gr.Row():
557
+ with gr.Column(
558
+ scale=9, elem_id="filter_devices_column"
559
+ ):
560
+ performance_shown_devices = gr.CheckboxGroup(
561
+ choices=PERFORMANCE_DEVICES,
562
+ value=PERFORMANCE_DEVICES,
563
+ label="Filter Devices",
564
+ interactive=True,
565
+ )
566
+ with gr.Column(
567
+ scale=1,
568
+ min_width=200,
569
+ elem_id="filter_select_devices",
570
+ ):
571
+ with gr.Row():
572
+ select_all_devices_button = gr.Button(
573
+ "Select All",
574
+ elem_id="select-all-devices-button",
575
+ interactive=True,
576
+ )
577
+ deselect_all_devices_button = gr.Button(
578
+ "Deselect All",
579
+ elem_id="deselect-all-devices-button",
580
+ interactive=True,
581
+ )
582
+
583
+ def select_all_devices():
584
+ return PERFORMANCE_DEVICES
585
+
586
+ def deselect_all_devices():
587
+ return []
588
+
589
+ select_all_devices_button.click(
590
+ select_all_devices,
591
+ inputs=[],
592
+ outputs=performance_shown_devices,
593
+ )
594
+ deselect_all_devices_button.click(
595
+ deselect_all_devices,
596
+ inputs=[],
597
+ outputs=performance_shown_devices,
598
+ )
599
+ with gr.Row():
600
+ performance_shown_os = gr.CheckboxGroup(
601
+ choices=PERFORMANCE_OS,
602
+ value=PERFORMANCE_OS,
603
+ label="Filter OS",
604
+ interactive=True,
605
+ )
606
+ with gr.Column(scale=1):
607
+ with gr.Accordion("See Performance Filters"):
608
+ with gr.Row():
609
+ with gr.Row():
610
+ min_short_speed, max_short_speed = floor(
611
+ min(performance_df["Short-Form Speed"])
612
+ ), ceil(max(performance_df["Short-Form Speed"]))
613
+ short_speed_slider = RangeSlider(
614
+ value=[min_short_speed, max_short_speed],
615
+ minimum=min_short_speed,
616
+ maximum=max_short_speed,
617
+ step=0.001,
618
+ label="Short-Form Speed",
619
+ )
620
+ with gr.Row():
621
+ min_long_speed, max_long_speed = floor(
622
+ min(performance_df["Long-Form Speed"])
623
+ ), ceil(max(performance_df["Long-Form Speed"]))
624
+ long_speed_slider = RangeSlider(
625
+ value=[min_long_speed, max_long_speed],
626
+ minimum=min_long_speed,
627
+ maximum=max_long_speed,
628
+ step=0.001,
629
+ label="Long-Form Speed",
630
+ )
631
+ with gr.Row():
632
+ with gr.Row():
633
+ min_short_toks, max_short_toks = floor(
634
+ min(performance_df["Short-Form Tok/s"])
635
+ ), ceil(max(performance_df["Short-Form Tok/s"]))
636
+ short_toks_slider = RangeSlider(
637
+ value=[min_short_toks, max_short_toks],
638
+ minimum=min_short_toks,
639
+ maximum=max_short_toks,
640
+ step=0.001,
641
+ label="Short-Form Tok/s",
642
+ )
643
+ with gr.Row():
644
+ min_long_toks, max_long_toks = floor(
645
+ min(performance_df["Long-Form Tok/s"])
646
+ ), ceil(max(performance_df["Long-Form Tok/s"]))
647
+ long_toks_slider = RangeSlider(
648
+ value=[min_long_toks, max_long_toks],
649
+ minimum=min_long_toks,
650
+ maximum=max_long_toks,
651
+ step=0.001,
652
+ label="Long-Form Tok/s",
653
+ )
654
+ with gr.Row():
655
+ gr.Markdown(PERFORMANCE_TEXT, elem_classes="markdown-text")
656
+ with gr.Row():
657
+ leaderboard_df = gr.components.Dataframe(
658
+ value=performance_df[
659
+ PERFORMANCE_ALWAYS_HERE_COLS + performance_shown_columns.value
660
+ ],
661
+ headers=[
662
+ PERFORMANCE_ALWAYS_HERE_COLS + performance_shown_columns.value
663
+ ],
664
+ datatype=[
665
+ c.type
666
+ for c in fields(PerformanceAutoEvalColumn)
667
+ if c.name in PERFORMANCE_COLS
668
+ ],
669
+ elem_id="leaderboard-table",
670
+ elem_classes="large-table",
671
+ interactive=False,
672
+ )
673
+
674
+ # Copy of the leaderboard dataframe to apply filters to
675
+ hidden_leaderboard_df = gr.components.Dataframe(
676
+ value=performance_df,
677
+ headers=PERFORMANCE_COLS,
678
+ datatype=[
679
+ c.type
680
+ for c in fields(PerformanceAutoEvalColumn)
681
+ if c.name in PERFORMANCE_COLS
682
+ ],
683
+ visible=False,
684
+ )
685
+
686
+ # Inputs for the dataframe filter function
687
+ performance_filter_inputs = [
688
+ hidden_leaderboard_df,
689
+ performance_shown_columns,
690
+ filter_performance_models,
691
+ exclude_performance_models,
692
+ performance_shown_devices,
693
+ performance_shown_os,
694
+ short_speed_slider,
695
+ long_speed_slider,
696
+ short_toks_slider,
697
+ long_toks_slider,
698
+ ]
699
+
700
+ filter_output = leaderboard_df
701
+ filter_performance_models.change(
702
+ performance_filter, performance_filter_inputs, filter_output
703
+ )
704
+ exclude_performance_models.change(
705
+ performance_filter, performance_filter_inputs, filter_output
706
+ )
707
+ performance_shown_columns.change(
708
+ performance_filter, performance_filter_inputs, filter_output
709
+ )
710
+ performance_shown_devices.change(
711
+ performance_filter, performance_filter_inputs, filter_output
712
+ )
713
+ performance_shown_os.change(
714
+ performance_filter, performance_filter_inputs, filter_output
715
+ )
716
+ short_speed_slider.change(
717
+ performance_filter, performance_filter_inputs, filter_output
718
+ )
719
+ long_speed_slider.change(
720
+ performance_filter, performance_filter_inputs, filter_output
721
+ )
722
+ short_toks_slider.change(
723
+ performance_filter, performance_filter_inputs, filter_output
724
+ )
725
+ long_toks_slider.change(
726
+ performance_filter, performance_filter_inputs, filter_output
727
+ )
728
+
729
+ # English Quality Tab
730
+ with gr.TabItem("English Quality", elem_id="timeline", id=1):
731
+ with gr.Row():
732
+ with gr.Column(scale=1):
733
+ with gr.Row():
734
+ with gr.Column(scale=6, elem_classes="filter_models_column"):
735
+ filter_quality_models = gr.Textbox(
736
+ placeholder="🔍 Filter Model (separate multiple queries with ';')",
737
+ label="Filter Models",
738
+ )
739
+ with gr.Column(scale=4, elem_classes="exclude_models_column"):
740
+ exclude_quality_models = gr.Textbox(
741
+ placeholder="🔍 Exclude Model (separate multiple models with ';')",
742
+ label="Exclude Models",
743
+ )
744
+ with gr.Row():
745
+ with gr.Accordion("See All Columns", open=False):
746
+ quality_shown_columns = gr.CheckboxGroup(
747
+ choices=QUALITY_TOGGLE_COLS,
748
+ value=QUALITY_SELECTED_COLS,
749
+ label="Toggle Columns",
750
+ elem_id="column-select",
751
+ interactive=True,
752
+ )
753
+ with gr.Column(scale=1):
754
+ with gr.Accordion("See Quality Filters"):
755
+ with gr.Row():
756
+ with gr.Row():
757
+ quality_min_avg_wer, quality_max_avg_wer = (
758
+ floor(min(model_df["Average WER"])),
759
+ ceil(max(model_df["Average WER"])) + 1,
760
+ )
761
+ wer_slider = RangeSlider(
762
+ value=[quality_min_avg_wer, quality_max_avg_wer],
763
+ minimum=quality_min_avg_wer,
764
+ maximum=quality_max_avg_wer,
765
+ label="Average WER",
766
+ )
767
+ with gr.Row():
768
+ quality_min_qoi, quality_max_qoi = floor(
769
+ min(model_df["QoI"])
770
+ ), ceil(max(model_df["QoI"] + 1))
771
+ qoi_slider = RangeSlider(
772
+ value=[quality_min_qoi, quality_max_qoi],
773
+ minimum=quality_min_qoi,
774
+ maximum=quality_max_qoi,
775
+ label="QoI",
776
+ )
777
+ with gr.Row():
778
+ gr.Markdown(QUALITY_TEXT)
779
+ with gr.Row():
780
+ quality_leaderboard_df = gr.components.Dataframe(
781
+ value=model_df[
782
+ QUALITY_ALWAYS_HERE_COLS + quality_shown_columns.value
783
+ ],
784
+ headers=[QUALITY_ALWAYS_HERE_COLS + quality_shown_columns.value],
785
+ datatype=[
786
+ c.type
787
+ for c in fields(QualityAutoEvalColumn)
788
+ if c.name in QUALITY_COLS
789
+ ],
790
+ elem_id="leaderboard-table",
791
+ elem_classes="large-table",
792
+ interactive=False,
793
+ )
794
+
795
+ # Copy of the leaderboard dataframe to apply filters to
796
+ hidden_quality_leaderboard_df = gr.components.Dataframe(
797
+ value=model_df,
798
+ headers=QUALITY_COLS,
799
+ datatype=[
800
+ c.type
801
+ for c in fields(QualityAutoEvalColumn)
802
+ if c.name in QUALITY_COLS
803
+ ],
804
+ visible=False,
805
+ )
806
+
807
+ # Inputs for the dataframe filter function
808
+ filter_inputs = [
809
+ hidden_quality_leaderboard_df,
810
+ quality_shown_columns,
811
+ filter_quality_models,
812
+ wer_slider,
813
+ qoi_slider,
814
+ exclude_quality_models,
815
+ ]
816
+ filter_output = quality_leaderboard_df
817
+ filter_quality_models.change(
818
+ quality_filter, filter_inputs, filter_output
819
+ )
820
+ exclude_quality_models.change(
821
+ quality_filter, filter_inputs, filter_output
822
+ )
823
+ quality_shown_columns.change(
824
+ quality_filter, filter_inputs, filter_output
825
+ )
826
+ wer_slider.change(quality_filter, filter_inputs, filter_output)
827
+ qoi_slider.change(quality_filter, filter_inputs, filter_output)
828
+
829
+ # Timeline Tab
830
+ with gr.TabItem("Timeline", elem_id="timeline", id=4):
831
+ # Create subtabs for different metrics
832
+ with gr.Tabs():
833
+ with gr.TabItem("QoI", id=0):
834
+ with gr.Row():
835
+ with gr.Column(scale=6):
836
+ filter_qoi = gr.Textbox(
837
+ placeholder="🔍 Filter Model-Device-OS (separate multiple queries with ';')",
838
+ label="Filter",
839
+ )
840
+ with gr.Column(scale=4):
841
+ exclude_qoi = gr.Textbox(
842
+ placeholder="🔍 Exclude Model-Device-OS (separate multiple queries with ';')",
843
+ label="Exclude",
844
+ )
845
+ with gr.Row():
846
+ with gr.Column():
847
+ qoi_plot = gr.Plot(container=True)
848
+ demo.load(
849
+ lambda x, y, z: plot_metric(
850
+ x,
851
+ "qoi",
852
+ "QoI",
853
+ "QoI Over Time for Model-Device-OS Combinations",
854
+ y,
855
+ z,
856
+ ),
857
+ [
858
+ gr.Dataframe(benchmark_df, visible=False),
859
+ filter_qoi,
860
+ exclude_qoi,
861
+ ],
862
+ qoi_plot,
863
+ )
864
+ filter_qoi.change(
865
+ lambda x, y, z: plot_metric(
866
+ x,
867
+ "qoi",
868
+ "QoI",
869
+ "QoI Over Time for Model-Device-OS Combinations",
870
+ y,
871
+ z,
872
+ ),
873
+ [
874
+ gr.Dataframe(benchmark_df, visible=False),
875
+ filter_qoi,
876
+ exclude_qoi,
877
+ ],
878
+ qoi_plot,
879
+ )
880
+ exclude_qoi.change(
881
+ lambda x, y, z: plot_metric(
882
+ x,
883
+ "qoi",
884
+ "QoI",
885
+ "QoI Over Time for Model-Device-OS Combinations",
886
+ y,
887
+ z,
888
+ ),
889
+ [
890
+ gr.Dataframe(benchmark_df, visible=False),
891
+ filter_qoi,
892
+ exclude_qoi,
893
+ ],
894
+ qoi_plot,
895
+ )
896
+
897
+ with gr.TabItem("Average WER", id=1):
898
+ with gr.Row():
899
+ with gr.Column(scale=6):
900
+ filter_average_wer = gr.Textbox(
901
+ placeholder="🔍 Filter Model-Device-OS (separate multiple queries with ';')",
902
+ label="Filter",
903
+ )
904
+ with gr.Column(scale=4):
905
+ exclude_average_wer = gr.Textbox(
906
+ placeholder="🔍 Exclude Model-Device-OS (separate multiple queries with ';')",
907
+ label="Exclude",
908
+ )
909
+ with gr.Row():
910
+ with gr.Column():
911
+ average_wer_plot = gr.Plot(container=True)
912
+ demo.load(
913
+ lambda x, y, z: plot_metric(
914
+ x,
915
+ "average_wer",
916
+ "Average WER",
917
+ "Average WER Over Time for Model-Device-OS Combinations",
918
+ y,
919
+ z,
920
+ ),
921
+ [
922
+ gr.Dataframe(benchmark_df, visible=False),
923
+ filter_average_wer,
924
+ exclude_average_wer,
925
+ ],
926
+ average_wer_plot,
927
+ )
928
+ filter_average_wer.change(
929
+ lambda x, y, z: plot_metric(
930
+ x,
931
+ "average_wer",
932
+ "Average WER",
933
+ "Average WER Over Time for Model-Device-OS Combinations",
934
+ y,
935
+ z,
936
+ ),
937
+ [
938
+ gr.Dataframe(benchmark_df, visible=False),
939
+ filter_average_wer,
940
+ exclude_average_wer,
941
+ ],
942
+ average_wer_plot,
943
+ )
944
+ exclude_average_wer.change(
945
+ lambda x, y, z: plot_metric(
946
+ x,
947
+ "average_wer",
948
+ "Average WER",
949
+ "Average WER Over Time for Model-Device-OS Combinations",
950
+ y,
951
+ z,
952
+ ),
953
+ [
954
+ gr.Dataframe(benchmark_df, visible=False),
955
+ filter_average_wer,
956
+ exclude_average_wer,
957
+ ],
958
+ average_wer_plot,
959
+ )
960
+
961
+ with gr.TabItem("Speed", id=2):
962
+ with gr.Row():
963
+ with gr.Column(scale=6):
964
+ filter_speed = gr.Textbox(
965
+ placeholder="🔍 Filter Model-Device-OS (separate multiple queries with ';')",
966
+ label="Filter",
967
+ )
968
+ with gr.Column(scale=4):
969
+ exclude_speed = gr.Textbox(
970
+ placeholder="🔍 Exclude Model-Device-OS (separate multiple queries with ';')",
971
+ label="Exclude",
972
+ )
973
+ with gr.Row():
974
+ with gr.Column():
975
+ speed_plot = gr.Plot(container=True)
976
+ demo.load(
977
+ lambda x, y, z: plot_metric(
978
+ x,
979
+ "speed",
980
+ "Speed",
981
+ "Speed Over Time for Model-Device-OS Combinations",
982
+ y,
983
+ z,
984
+ ),
985
+ [
986
+ gr.Dataframe(benchmark_df, visible=False),
987
+ filter_speed,
988
+ exclude_speed,
989
+ ],
990
+ speed_plot,
991
+ )
992
+ filter_speed.change(
993
+ lambda x, y, z: plot_metric(
994
+ x,
995
+ "speed",
996
+ "Speed",
997
+ "Speed Over Time for Model-Device-OS Combinations",
998
+ y,
999
+ z,
1000
+ ),
1001
+ [
1002
+ gr.Dataframe(benchmark_df, visible=False),
1003
+ filter_speed,
1004
+ exclude_speed,
1005
+ ],
1006
+ speed_plot,
1007
+ )
1008
+ exclude_speed.change(
1009
+ lambda x, y, z: plot_metric(
1010
+ x,
1011
+ "speed",
1012
+ "Speed",
1013
+ "Speed Over Time for Model-Device-OS Combinations",
1014
+ y,
1015
+ z,
1016
+ ),
1017
+ [
1018
+ gr.Dataframe(benchmark_df, visible=False),
1019
+ filter_speed,
1020
+ exclude_speed,
1021
+ ],
1022
+ speed_plot,
1023
+ )
1024
+
1025
+ with gr.TabItem("Tok/s", id=3):
1026
+ with gr.Row():
1027
+ with gr.Column(scale=6):
1028
+ filter_toks = gr.Textbox(
1029
+ placeholder="🔍 Filter Model-Device-OS (separate multiple queries with ';')",
1030
+ label="Filter",
1031
+ )
1032
+ with gr.Column(scale=4):
1033
+ exclude_toks = gr.Textbox(
1034
+ placeholder="🔍 Exclude Model-Device-OS (separate multiple queries with ';')",
1035
+ label="Exclude",
1036
+ )
1037
+ with gr.Row():
1038
+ with gr.Column():
1039
+ toks_plot = gr.Plot(container=True)
1040
+ demo.load(
1041
+ lambda x, y, z: plot_metric(
1042
+ x,
1043
+ "tokens_per_second",
1044
+ "Tok/s",
1045
+ "Tok/s Over Time for Model-Device-OS Combinations",
1046
+ y,
1047
+ z,
1048
+ ),
1049
+ [
1050
+ gr.Dataframe(benchmark_df, visible=False),
1051
+ filter_toks,
1052
+ exclude_toks,
1053
+ ],
1054
+ toks_plot,
1055
+ )
1056
+ filter_toks.change(
1057
+ lambda x, y, z: plot_metric(
1058
+ x,
1059
+ "tokens_per_second",
1060
+ "Tok/s",
1061
+ "Tok/s Over Time for Model-Device-OS Combinations",
1062
+ y,
1063
+ z,
1064
+ ),
1065
+ [
1066
+ gr.Dataframe(benchmark_df, visible=False),
1067
+ filter_toks,
1068
+ exclude_toks,
1069
+ ],
1070
+ toks_plot,
1071
+ )
1072
+ exclude_toks.change(
1073
+ lambda x, y, z: plot_metric(
1074
+ x,
1075
+ "tokens_per_second",
1076
+ "Tok/s",
1077
+ "Tok/s Over Time for Model-Device-OS Combinations",
1078
+ y,
1079
+ z,
1080
+ ),
1081
+ [
1082
+ gr.Dataframe(benchmark_df, visible=False),
1083
+ filter_toks,
1084
+ exclude_toks,
1085
+ ],
1086
+ toks_plot,
1087
+ )
1088
+
1089
+ # Multilingual Quality Tab
1090
+ with gr.TabItem("Multilingual Quality", elem_id="multilingual", id=5):
1091
+ if multilingual_df is not None:
1092
+ with gr.Row():
1093
+ with gr.Column(scale=1):
1094
+ # Display table of multilingual models
1095
+ model_table = gr.Dataframe(
1096
+ value=multilingual_models_df,
1097
+ headers=["Model"],
1098
+ datatype=["html"],
1099
+ elem_classes="left-side-table",
1100
+ )
1101
+ # Placeholders for confusion matrix plots
1102
+ with gr.Row():
1103
+ unforced_confusion_matrix = gr.Plot(visible=False)
1104
+ with gr.Row():
1105
+ forced_confusion_matrix = gr.Plot(visible=False)
1106
+
1107
+ with gr.Column(scale=1):
1108
+ # Display area for selected model results
1109
+ results_markdown = gr.Markdown(
1110
+ "# Select a model from the table on the left to view results.",
1111
+ elem_id="multilingual-results",
1112
+ )
1113
+ # Tables for displaying average WER and language-specific WER
1114
+ average_wer_table = gr.Dataframe(
1115
+ value=None, elem_id="average-wer-table", visible=False
1116
+ )
1117
+ language_wer_table = gr.Dataframe(
1118
+ value=None, elem_id="general-wer-table", visible=False
1119
+ )
1120
+
1121
+ # Set up click event to update results when a model is selected
1122
+ for button in multilingual_models_buttons:
1123
+ button.render()
1124
+ button.click(
1125
+ fn=lambda x: update_multilingual_results(x),
1126
+ inputs=[button],
1127
+ outputs=[
1128
+ results_markdown,
1129
+ average_wer_table,
1130
+ language_wer_table,
1131
+ unforced_confusion_matrix,
1132
+ ],
1133
+ )
1134
+ else:
1135
+ # Display message if no multilingual data is available
1136
+ gr.Markdown("No multilingual benchmark results available.")
1137
+
1138
+ # Device Support Tab
1139
+ with gr.TabItem("Device Support", elem_id="device_support", id=6):
1140
+ # Load device support data from CSV
1141
+ support_data = pd.read_csv("dashboard_data/support_data.csv")
1142
+ support_data.set_index(support_data.columns[0], inplace=True)
1143
+ support_data["Model"] = support_data["Model"].apply(
1144
+ lambda x: x.replace("_", "/")
1145
+ )
1146
+ support_data["Model"] = support_data["Model"].apply(
1147
+ lambda x: make_model_name_clickable_link(x)
1148
+ )
1149
+ support_data = (
1150
+ support_data.assign(model_len=support_data["Model"].str.len())
1151
+ .sort_values(
1152
+ by=["model_len"],
1153
+ ascending=[True],
1154
+ )
1155
+ .drop(columns=["model_len"])
1156
+ )
1157
+
1158
+ with gr.Row():
1159
+ with gr.Column(scale=1):
1160
+ with gr.Row():
1161
+ with gr.Column(scale=6, elem_id="filter_models_column"):
1162
+ filter_support_models = gr.Textbox(
1163
+ placeholder="🔍 Filter Model (separate multiple queries with ';')",
1164
+ label="Filter Models",
1165
+ )
1166
+ with gr.Column(scale=4, elem_classes="exclude_models_column"):
1167
+ exclude_support_models = gr.Textbox(
1168
+ placeholder="🔍 Exclude Model (separate multiple models with ';')",
1169
+ label="Exclude Models",
1170
+ )
1171
+ with gr.Row():
1172
+ with gr.Accordion("See All Columns", open=False):
1173
+ with gr.Row():
1174
+ with gr.Column(scale=9):
1175
+ support_shown_columns = gr.CheckboxGroup(
1176
+ choices=support_data.columns.tolist()[
1177
+ 1:
1178
+ ], # Exclude 'Model' column
1179
+ value=support_data.columns.tolist()[1:],
1180
+ label="Toggle Columns",
1181
+ elem_id="support-column-select",
1182
+ interactive=True,
1183
+ )
1184
+ with gr.Column(scale=1, min_width=200):
1185
+ with gr.Row():
1186
+ select_all_support_button = gr.Button(
1187
+ "Select All",
1188
+ elem_id="select-all-support-button",
1189
+ interactive=True,
1190
+ )
1191
+ deselect_all_support_button = gr.Button(
1192
+ "Deselect All",
1193
+ elem_id="deselect-all-support-button",
1194
+ interactive=True,
1195
+ )
1196
+ with gr.Column():
1197
+ gr.Markdown(
1198
+ """
1199
+ ### Legend
1200
+ - ✅ Supported: The model is supported and tested on this device.
1201
+ - ⚠️ Failed: Either the model tests failed on this device or the Speed Factor for the test is less than 1.
1202
+ - ? Not Tested: The model is supported on this device but no test information is available.
1203
+ - Not Supported: The model is not supported on this device as per the [WhisperKit configuration](https://huggingface.co/argmaxinc/whisperkit-coreml/blob/main/config.json).
1204
+ """
1205
+ )
1206
+
1207
+ # Display device support data in a table
1208
+ device_support_table = gr.Dataframe(
1209
+ value=support_data,
1210
+ headers=support_data.columns.tolist(),
1211
+ datatype=["html" for _ in support_data.columns],
1212
+ elem_id="device-support-table",
1213
+ elem_classes="large-table",
1214
+ interactive=False,
1215
+ )
1216
+
1217
+ # Hidden dataframe to store the original data
1218
+ hidden_support_df = gr.Dataframe(value=support_data, visible=False)
1219
+
1220
+ def filter_support_data(df, columns, model_query, exclude_models):
1221
+ filtered_df = df.copy()
1222
+
1223
+ # Filter models based on query
1224
+ if model_query:
1225
+ filtered_df = filtered_df[
1226
+ filtered_df["Model"].str.contains(
1227
+ "|".join(q.strip() for q in model_query.split(";")),
1228
+ case=False,
1229
+ regex=True,
1230
+ )
1231
+ ]
1232
+
1233
+ # Exclude specified models
1234
+ if exclude_models:
1235
+ exclude_list = [
1236
+ re.escape(m.strip()) for m in exclude_models.split(";")
1237
+ ]
1238
+ filtered_df = filtered_df[
1239
+ ~filtered_df["Model"].str.contains(
1240
+ "|".join(exclude_list), case=False, regex=True
1241
+ )
1242
+ ]
1243
+
1244
+ # Select columns
1245
+ selected_columns = ["Model"] + [
1246
+ col for col in columns if col in df.columns
1247
+ ]
1248
+ filtered_df = filtered_df[selected_columns]
1249
+
1250
+ return filtered_df
1251
+
1252
+ def select_all_support_columns():
1253
+ return support_data.columns.tolist()[1:] # Exclude 'Model' column
1254
+
1255
+ def deselect_all_support_columns():
1256
+ return []
1257
+
1258
+ # Connect the filter function to the input components
1259
+ filter_inputs = [
1260
+ hidden_support_df,
1261
+ support_shown_columns,
1262
+ filter_support_models,
1263
+ exclude_support_models,
1264
+ ]
1265
+ filter_support_models.change(
1266
+ filter_support_data, filter_inputs, device_support_table
1267
+ )
1268
+ exclude_support_models.change(
1269
+ filter_support_data, filter_inputs, device_support_table
1270
+ )
1271
+ support_shown_columns.change(
1272
+ filter_support_data, filter_inputs, device_support_table
1273
+ )
1274
+
1275
+ # Connect select all and deselect all buttons
1276
+ select_all_support_button.click(
1277
+ select_all_support_columns,
1278
+ inputs=[],
1279
+ outputs=support_shown_columns,
1280
+ )
1281
+ deselect_all_support_button.click(
1282
+ deselect_all_support_columns,
1283
+ inputs=[],
1284
+ outputs=support_shown_columns,
1285
+ )
1286
+
1287
+ # Methodology Tab
1288
+ with gr.TabItem("Methodology", elem_id="methodology", id=7):
1289
+ gr.Markdown(METHODOLOGY_TEXT, elem_id="methodology-text")
1290
+
1291
+ # Citation section
1292
+ with gr.Accordion("📙 Citation", open=False):
1293
+ citation_button = gr.Textbox(
1294
+ value=CITATION_BUTTON_TEXT,
1295
+ label=CITATION_BUTTON_LABEL,
1296
+ lines=7,
1297
+ elem_id="citation-button",
1298
+ show_copy_button=True,
1299
+ )
1300
+
1301
+ # Launch the Gradio interface
1302
+ demo.launch(debug=True, share=True, ssr_mode=False)
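The `filter_support_data` helper above handles its two text boxes differently: exclude terms pass through `re.escape`, while filter terms are joined into the regex as-is. The sketch below is not part of the commit. It reproduces the same semicolon-separated, case-insensitive matching on a plain list of strings, escaping both sides. The function name `match_models` and the sample model names are hypothetical.

```python
import re


def match_models(models, include_query="", exclude_query=""):
    """Filter model names with ';'-separated, case-insensitive substring queries."""

    def build_pattern(query):
        # Split on ';', drop empty pieces, and escape regex metacharacters
        terms = [re.escape(t.strip()) for t in query.split(";") if t.strip()]
        return re.compile("|".join(terms), re.IGNORECASE) if terms else None

    include = build_pattern(include_query)
    exclude = build_pattern(exclude_query)

    kept = []
    for name in models:
        # Keep a name only if it matches the include pattern (when one is given)
        if include is not None and not include.search(name):
            continue
        # Drop a name if it matches the exclude pattern (when one is given)
        if exclude is not None and exclude.search(name):
            continue
        kept.append(name)
    return kept


models = [
    "openai/whisper-tiny",
    "openai/whisper-large-v3",
    "distil-whisper/distil-large-v3",
]
print(match_models(models, include_query="large; tiny", exclude_query="distil"))
```

Dropping empty pieces means a trailing `;` does not add an empty alternative to the pattern, which would otherwise match every row.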
multilingual_generate.py ADDED
@@ -0,0 +1,132 @@
1
+ import json
2
+ import os
3
+ import shutil
4
+ import sys
5
+ from collections import defaultdict
6
+
7
+ import numpy as np
8
+ import pandas as pd
9
+ from sklearn.metrics import confusion_matrix
10
+
11
+ from utils import compute_average_wer, download_dataset
12
+
13
+
14
+ def main():
15
+ """
16
+ Main function to orchestrate the multilingual data generation process.
17
+
18
+ This function performs the following steps:
19
+ 1. Downloads multilingual evaluation data if requested.
20
+ 2. Processes multilingual evaluation files.
21
+ 3. Calculates and saves results, including Word Error Rate (WER) and
22
+ language detection confusion matrices.
23
+ """
24
+ source_repo = "argmaxinc/whisperkit-evals-multilingual"
25
+ source_subfolder = "WhisperKit"
26
+ source_directory = f"{source_repo}/{source_subfolder}"
27
+ if len(sys.argv) > 1 and sys.argv[1] == "download":
28
+ try:
29
+ shutil.rmtree(source_repo)
30
+ except FileNotFoundError:
31
+ print("Nothing to remove.")
32
+ download_dataset(source_repo, source_repo, source_subfolder)
33
+
34
+ results = defaultdict(
35
+ lambda: {
36
+ "average_wer": [],
37
+ "language_wer": defaultdict(list),
38
+ "language_detection": [],
39
+ }
40
+ )
41
+
42
+ confusion_matrices = {}
43
+
44
+ for subdir, _, files in os.walk(source_directory):
45
+ for filename in files:
46
+ if not filename.endswith(".json") or "summary" in filename:
47
+ continue
48
+
49
+ file_path = os.path.join(subdir, filename)
50
+ with open(file_path, "r") as f:
51
+ data = json.load(f)
52
+
53
+ subdir_components = subdir.split(os.path.sep)
54
+ is_forced = "forced" in subdir_components
55
+ model = subdir_components[-3] if not is_forced else subdir_components[-4]
56
+
57
+ key = f"{model}/{'forced' if is_forced else 'not_forced'}"
58
+
59
+ for item in data["results"]:
60
+ if "reference_language" not in item:
61
+ continue
62
+ reference_language = item["reference_language"]
63
+ wer = item["wer"]
64
+ detected_language = item["predicted_language"]
65
+
66
+ result = {
67
+ "reference": item["reference"],
68
+ "prediction": item["prediction"],
69
+ }
70
+
71
+ results[key]["average_wer"].append(result)
72
+ results[key]["language_wer"][reference_language].append(result)
73
+ results[key]["language_detection"].append(
74
+ (reference_language, detected_language)
75
+ )
76
+
77
+ calculate_and_save_results(results, confusion_matrices)
78
+
79
+
80
+ def calculate_and_save_results(results, confusion_matrices):
81
+ """
82
+ Calculates final multilingual metrics and saves them to CSV and JSON files.
83
+
84
+ :param results: Dictionary containing raw multilingual evaluation data.
85
+ :param confusion_matrices: Dictionary to store confusion matrices for language detection.
86
+
87
+ This function processes the raw multilingual data, calculates average metrics,
88
+ creates confusion matrices for language detection, and saves the results to:
89
+ 1. A CSV file with WER data for each model and language.
90
+ 2. A JSON file with confusion matrices for language detection.
91
+ """
92
+ wer_data = []
93
+ for key, data in results.items():
94
+ model, forced = key.rsplit("/", 1)
95
+ row = {
96
+ "Model": model,
97
+ "Forced Tokens": forced == "forced",
98
+ "Average WER": compute_average_wer(data["average_wer"]),
99
+ }
100
+ for lang, wers in data["language_wer"].items():
101
+ row[f"WER_{lang}"] = compute_average_wer(wers)
102
+ wer_data.append(row)
103
+
104
+ true_languages, detected_languages = zip(*data["language_detection"])
105
+ unique_languages = sorted(set(true_languages))
106
+ cm = confusion_matrix(
107
+ true_languages, detected_languages, labels=unique_languages
108
+ )
109
+
110
+ row_sums = cm.sum(axis=1)
111
+ cm_normalized = np.zeros_like(cm, dtype=float)
112
+ non_zero_rows = row_sums != 0
113
+ cm_normalized[non_zero_rows] = (
114
+ cm[non_zero_rows] / row_sums[non_zero_rows, np.newaxis]
115
+ )
116
+
117
+ if model not in confusion_matrices:
118
+ confusion_matrices[model] = {}
119
+ confusion_matrices[model][forced] = {
120
+ "matrix": cm_normalized.tolist(),
121
+ "labels": unique_languages,
122
+ }
123
+
124
+ df = pd.DataFrame(wer_data)
125
+ df.to_csv("dashboard_data/multilingual_results.csv", index=False)
126
+
127
+ with open("dashboard_data/multilingual_confusion_matrices.json", "w") as f:
128
+ json.dump(confusion_matrices, f, indent=2)
129
+
130
+
131
+ if __name__ == "__main__":
132
+ main()
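The row normalization in `calculate_and_save_results` can be hard to follow through the NumPy masking. As a minimal sketch, assuming only the standard library, the same idea looks like this: count (reference, detected) language pairs, then divide each row by its total, leaving rows with no samples at zero. The helper name `normalized_confusion_matrix` and the sample data are hypothetical and do not appear in the commit. Pairs whose labels are not in `labels` are skipped in this sketch.

```python
def normalized_confusion_matrix(pairs, labels):
    """Count (true, predicted) pairs and normalize each row to sum to 1."""
    index = {label: i for i, label in enumerate(labels)}
    counts = [[0] * len(labels) for _ in labels]
    for true_label, predicted_label in pairs:
        # Only count pairs where both labels are known
        if true_label in index and predicted_label in index:
            counts[index[true_label]][index[predicted_label]] += 1

    normalized = []
    for row in counts:
        total = sum(row)
        # A row with no samples stays all zeros instead of dividing by zero
        normalized.append([c / total if total else 0.0 for c in row])
    return normalized


pairs = [("en", "en"), ("en", "de"), ("de", "de"), ("de", "de")]
labels = ["de", "en", "fr"]
print(normalized_confusion_matrix(pairs, labels))
# → [[1.0, 0.0, 0.0], [0.5, 0.5, 0.0], [0.0, 0.0, 0.0]]
```

Each row then reads as "of the clips in this reference language, what fraction was detected as each language", which is the quantity the dashboard plots for the unforced scenario.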
performance_generate.py ADDED
@@ -0,0 +1,465 @@
+ import glob
+ import json
+ import os
+ import shutil
+ import sys
+ import urllib.parse
+ from collections import defaultdict
+ from datetime import datetime
+ from statistics import mean
+
+ import pandas as pd
+ import requests
+
+ from constants import BASE_WHISPERKIT_BENCHMARK_URL
+ from text_normalizer import text_normalizer
+ from utils import compute_average_wer, dir_to_json, download_dataset
+
+
+ def fetch_evaluation_data(url):
+     """
+     Fetches evaluation data from the given URL.
+     :param url: The URL to fetch the evaluation data from.
+     :returns: The evaluation data as a dictionary.
+     :raises SystemExit: If the request does not return HTTP 200.
+     """
+     response = requests.get(url)
+     if response.status_code == 200:
+         return json.loads(response.text)
+     else:
+         sys.exit(f"Failed to fetch WhisperKit evals: {response.text}")
+
+
+ def generate_device_map(base_dir):
+     """
+     Generates a mapping of device identifiers to their corresponding device models.
+
+     This function iterates through all summary files in the specified base directory and its subdirectories,
+     extracting device identifier and device model information. It stores this information in a dictionary,
+     where the keys are device identifiers and the values are device models.
+
+     :param base_dir: The base directory to search for summary files.
+     :returns: A dictionary mapping device identifiers to device models.
+     """
+     device_map = {}
+
+     # Find all summary files recursively
+     summary_files = glob.glob(f"{base_dir}/**/*summary*.json", recursive=True)
+
+     for file_path in summary_files:
+         try:
+             with open(file_path, "r") as f:
+                 data = json.load(f)
+
+             # Extract device information and create simple mapping
+             if "deviceModel" in data and "deviceIdentifier" in data:
+                 device_map[data["deviceIdentifier"]] = data["deviceModel"]
+
+         except json.JSONDecodeError:
+             print(f"Error reading {file_path}")
+         except Exception as e:
+             print(f"Error processing {file_path}: {e}")
+
+     # Save the device map under dashboard_data/
+     output_path = "dashboard_data/device_map.json"
+
+     with open(output_path, "w") as f:
+         json.dump(device_map, f, indent=4, sort_keys=True)
+
+     return device_map
+
+
+ def get_device_name(device):
+     """
+     Gets the device name from the device map if it exists.
+     :param device: String representing the device identifier.
+     :returns: The device name from the device map if it exists, otherwise the input value,
+         with spaces replaced by underscores in either case.
+     """
+     with open("dashboard_data/device_map.json", "r") as f:
+         device_map = json.load(f)
+     return device_map.get(device, device).replace(" ", "_")
+
+
+ def process_benchmark_file(file_path, dataset_dfs, results):
+     """
+     Processes a single benchmark file and updates the results dictionary.
+
+     :param file_path: Path to the benchmark JSON file.
+     :param dataset_dfs: Dictionary of DataFrames containing dataset information.
+     :param results: Dictionary to store the processed results.
+
+     This function reads a benchmark JSON file, extracts relevant information,
+     and updates the results dictionary with various metrics including WER,
+     speed, tokens per second, and quality of inference (QoI).
+     """
+     with open(file_path, "r") as file:
+         test_results = json.load(file)
+
+     if len(test_results) == 0:
+         return
+
+     first_test_result = test_results[0]
+     model = first_test_result["testInfo"]["model"]
+     device = first_test_result["testInfo"]["device"]
+     dataset_dir = first_test_result["testInfo"]["datasetDir"]
+     if "iPhone" in device or "iPad" in device:
+         version_numbers = first_test_result["staticAttributes"]["osVersion"].split(".")
+         if len(version_numbers) == 3 and version_numbers[-1] == "0":
+             version_numbers.pop()
+         os_info = f"""{'iOS' if 'iPhone' in device else 'iPadOS'}_{".".join(version_numbers)}"""
+     else:
+         os_info = f"macOS_{first_test_result['staticAttributes']['osVersion']}"
+     timestamp = first_test_result["testInfo"]["date"]
+     commit_hash_timestamp = file_path.split("/")[-2]
+     commit_timestamp, commit_hash = commit_hash_timestamp.split("_")
+
+     key = (model, device, os_info, commit_timestamp)
+     dataset_name = dataset_dir
+     for test_result in test_results:
+         test_info = test_result["testInfo"]
+         audio_file_name = test_info["audioFile"]
+
+         dataset_df = dataset_dfs[dataset_name]
+
+         wer_entry = {
+             "prediction": text_normalizer(test_info["prediction"]),
+             "reference": text_normalizer(test_info["reference"]),
+         }
+         results[key]["timestamp"] = timestamp
+         results[key]["average_wer"].append(wer_entry)
+         results[key]["dataset_wer"][dataset_name].append(wer_entry)
+
+         input_audio_seconds = test_info["timings"]["inputAudioSeconds"]
+         full_pipeline = test_info["timings"]["fullPipeline"]
+         total_decoding_loops = test_info["timings"]["totalDecodingLoops"]
+
+         results[key]["dataset_speed"][dataset_name][
+             "inputAudioSeconds"
+         ] += input_audio_seconds
+         results[key]["dataset_speed"][dataset_name]["fullPipeline"] += full_pipeline
+
+         results[key]["speed"]["inputAudioSeconds"] += input_audio_seconds
+         results[key]["speed"]["fullPipeline"] += full_pipeline
+
+         results[key]["commit_hash"] = commit_hash
+         results[key]["commit_timestamp"] = commit_timestamp
+
+         results[key]["dataset_tokens_per_second"][dataset_name][
+             "totalDecodingLoops"
+         ] += total_decoding_loops
+         results[key]["dataset_tokens_per_second"][dataset_name][
+             "fullPipeline"
+         ] += full_pipeline
+         results[key]["tokens_per_second"]["totalDecodingLoops"] += total_decoding_loops
+         results[key]["tokens_per_second"]["fullPipeline"] += full_pipeline
+
+         audio = audio_file_name.split(".")[0]
+         if dataset_name == "earnings22-10mins":
+             audio = audio.split("-")[0]
+
+         dataset_row = dataset_df.loc[dataset_df["file"].str.contains(audio)].iloc[0]
+         reference_wer = dataset_row["wer"]
+         prediction_wer = test_info["wer"]
+
+         results[key]["qoi"].append(1 if prediction_wer <= reference_wer else 0)
+
+     return key, dataset_name
+
+
+ def process_summary_file(file_path, results):
+     """
+     Processes a summary file and updates the results dictionary with device support information.
+
+     :param file_path: Path to the summary JSON file.
+     :param results: Dictionary to store the processed results.
+
+     This function reads a summary JSON file, extracts information about supported
+     and failed models for a specific device and OS combination, and updates the
+     results dictionary accordingly.
+     """
+     with open(file_path, "r") as file:
+         summary_data = json.load(file)
+
+     device = summary_data["deviceIdentifier"]
+     # Named os_info rather than os to avoid shadowing the os module.
+     os_info = f"{'iPadOS' if 'iPad' in device else summary_data['osType']} {summary_data['osVersion']}"
+     commit_timestamp = summary_data["commitTimestamp"]
+
+     key = (device, os_info)
+     if key in results:
+         existing_timestamp = results[key]["commitTimestamp"]
+
+         existing_dt = datetime.strptime(existing_timestamp, "%Y-%m-%dT%H%M%S")
+         new_dt = datetime.strptime(commit_timestamp, "%Y-%m-%dT%H%M%S")
+
+         # Keep only the most recent summary for each (device, OS) pair.
+         if new_dt <= existing_dt:
+             return
+     else:
+         results[key] = {}
+
+     supported_models = set(summary_data["modelsTested"])
+     failed_models = set()
+
+     # Number of datasets a model without failures was tested on (defaults to 2).
+     dataset_count = 2
+     for model, value in summary_data["testResults"].items():
+         if model not in summary_data["failureInfo"]:
+             dataset_count = len(value)
+             break
+
+     for failed_model in summary_data["failureInfo"]:
+         # A model listed in failureInfo still counts as supported if it has
+         # results for every dataset.
+         if (
+             failed_model in summary_data["testResults"]
+             and len(summary_data["testResults"][failed_model]) == dataset_count
+         ):
+             continue
+         supported_models.discard(failed_model)
+         failed_models.add(failed_model)
+
+     results[key]["supportedModels"] = supported_models
+     results[key]["commitTimestamp"] = commit_timestamp
+     results[key]["failedModels"] = (failed_models, file_path)
+     results["modelsTested"] |= supported_models
+     results["devices"].add(device)
+
+
+ def calculate_and_save_performance_results(
+     performance_results, performance_output_path
+ ):
+     """
+     Calculates final performance metrics and saves them to a JSON file.
+
+     :param performance_results: Dictionary containing raw performance data.
+     :param performance_output_path: Path to save the processed performance results.
+     :returns: List of (model, device, os_info) tuples whose speed factor is below 1.0.
+
+     This function processes the raw performance data, calculates average metrics,
+     and writes the final results as one JSON object per line, with each entry
+     representing a unique combination of model, device, OS, and commit timestamp.
+     Combinations that run slower than real time (speed < 1.0) are not written.
+     """
+     not_supported = []
+     with open(performance_output_path, "w") as performance_file:
+         for key, data in performance_results.items():
+             model, device, os_info, timestamp = key
+             speed = round(
+                 data["speed"]["inputAudioSeconds"] / data["speed"]["fullPipeline"], 2
+             )
+
+             if speed < 1.0:
+                 not_supported.append((model, device, os_info))
+                 continue
+
+             performance_entry = {
+                 "model": model.replace("_", "/"),
+                 "device": get_device_name(device).replace("_", " "),
+                 "os": os_info.replace("_", " "),
+                 "timestamp": data["timestamp"],
+                 "speed": speed,
+                 "tokens_per_second": round(
+                     data["tokens_per_second"]["totalDecodingLoops"]
+                     / data["tokens_per_second"]["fullPipeline"],
+                     2,
+                 ),
+                 "dataset_speed": {
+                     dataset: round(
+                         speed_info["inputAudioSeconds"] / speed_info["fullPipeline"], 2
+                     )
+                     for dataset, speed_info in data["dataset_speed"].items()
+                 },
+                 "dataset_tokens_per_second": {
+                     dataset: round(
+                         tps_info["totalDecodingLoops"] / tps_info["fullPipeline"], 2
+                     )
+                     for dataset, tps_info in data["dataset_tokens_per_second"].items()
+                 },
+                 "average_wer": compute_average_wer(data["average_wer"]),
+                 "dataset_wer": {
+                     dataset: compute_average_wer(wer)
+                     for dataset, wer in data["dataset_wer"].items()
+                 },
+                 "qoi": round(mean(data["qoi"]), 2),
+                 "commit_hash": data["commit_hash"],
+                 "commit_timestamp": data["commit_timestamp"],
+             }
+
+             json.dump(performance_entry, performance_file)
+             performance_file.write("\n")
+     return not_supported
+
+
+ def calculate_and_save_support_results(
+     support_results, not_supported, support_output_path
+ ):
+     """
+     Calculates device support results and saves them to a CSV file.
+
+     :param support_results: Dictionary containing device support information.
+     :param not_supported: List of (model, device, os_info) tuples to mark as "Not Supported".
+     :param support_output_path: Path to save the processed support results.
+
+     This function processes the device support data and creates a CSV file
+     showing which models are supported on different devices and OS versions,
+     using checkmarks, warning signs, question marks, or "Not Supported" to
+     indicate support status.
+     """
+     all_models = sorted(support_results["modelsTested"])
+     all_devices = sorted(set(support_results["devices"]))
+
+     df = pd.DataFrame(index=all_models, columns=["Model"] + all_devices)
+
+     for model in all_models:
+         row = {"Model": model}
+         for device in all_devices:
+             row[device] = ""
+
+         for key, data in support_results.items():
+             if key in ["modelsTested", "devices"]:
+                 continue
+             device, os_info = key
+             supported_models = data["supportedModels"]
+             failed_models, file_path = data["failedModels"]
+             directories = file_path.split("/")
+             commit_file, summary_file = directories[-2], directories[-1]
+             url = f"{BASE_WHISPERKIT_BENCHMARK_URL}/{commit_file}/{urllib.parse.quote(summary_file)}"
+
+             if model in supported_models:
+                 current_value = row[device]
+                 new_value = (
+                     f"✅ {os_info}"
+                     if current_value == ""
+                     else f"{current_value}<p>✅ {os_info}</p>"
+                 )
+             elif model in failed_models:
+                 current_value = row[device]
+                 new_value = (
+                     f"""⚠️ <a style='color: #3B82F6; text-decoration: underline; text-decoration-style: dotted;' href={url}>{os_info}</a>"""
+                     if current_value == ""
+                     else f"""{current_value}<p>⚠️ <a style='color: #3B82F6; text-decoration: underline; text-decoration-style: dotted;' href={url}>{os_info}</a></p>"""
+                 )
+             else:
+                 current_value = row[device]
+                 new_value = (
+                     f"? {os_info}"
+                     if current_value == ""
+                     else f"{current_value}<p>? {os_info}</p>"
+                 )
+             row[device] = new_value
+
+         df.loc[model] = row
+
+     remove_unsupported_cells(df, not_supported)
+
+     cols = df.columns.tolist()
+     cols = ["Model"] + [
+         get_device_name(col).replace("_", " ") for col in cols if col != "Model"
+     ]
+     df.columns = cols
+
+     df.to_csv(support_output_path, index=True)
+
+
+ def remove_unsupported_cells(df, not_supported):
+     """
+     Updates the DataFrame to mark unsupported model-device combinations.
+
+     This function reads a configuration file to determine which models are supported
+     on which devices. It then iterates over the DataFrame and sets the value to "Not Supported"
+     for any model-device combination that is not supported according to the configuration,
+     and for every combination listed in not_supported.
+
+     :param df: A Pandas DataFrame where the index represents models and columns represent devices.
+     :param not_supported: List of (model, device, os_info) tuples to mark as "Not Supported".
+     """
+     with open("dashboard_data/config.json", "r") as file:
+         config_data = json.load(file)
+
+     device_support = config_data["device_support"]
+     for info in device_support:
+         identifiers = set(info["identifiers"])
+         supported = set(info["models"]["supported"])
+
+         for model in df.index:
+             for device in df.columns:
+                 if (
+                     any(identifier in device for identifier in identifiers)
+                     and model not in supported
+                 ):
+                     df.at[model, device] = "Not Supported"
+
+     for model, device, _os_info in not_supported:
+         df.at[model, device] = "Not Supported"
+
+
+ def main():
+     """
+     Main function to orchestrate the performance data generation process.
+
+     This function performs the following steps:
+     1. Downloads benchmark data if requested.
+     2. Fetches evaluation data for various datasets.
+     3. Processes benchmark files and summary files.
+     4. Calculates and saves performance and support results.
+     """
+     source_xcresult_repo = "argmaxinc/whisperkit-evals-dataset"
+     source_xcresult_subfolder = "benchmark_data/"
+     source_xcresult_directory = f"{source_xcresult_repo}/{source_xcresult_subfolder}"
+     if len(sys.argv) > 1 and sys.argv[1] == "download":
+         try:
+             shutil.rmtree(source_xcresult_repo)
+         except FileNotFoundError:
+             print("Nothing to remove.")
+         download_dataset(
+             source_xcresult_repo, source_xcresult_repo, source_xcresult_subfolder
+         )
+
+     datasets = {
+         "Earnings-22": "https://huggingface.co/datasets/argmaxinc/whisperkit-evals/resolve/main/WhisperOpenAIAPI/openai_whisper-large-v2/earnings22/2024-03-04_13%3A39%3A42_GMT-0800.json",
+         "LibriSpeech": "https://huggingface.co/datasets/argmaxinc/whisperkit-evals/resolve/main/WhisperOpenAIAPI/openai_whisper-large-v2/librispeech/2024-02-28_18%3A45%3A02_GMT-0800.json?download=true",
+         "earnings22-10mins": "https://huggingface.co/datasets/argmaxinc/whisperkit-evals/resolve/main/WhisperOpenAIAPI/openai_whisper-large-v2/earnings22/2024-03-04_13%3A39%3A42_GMT-0800.json",
+         "librispeech-10mins": "https://huggingface.co/datasets/argmaxinc/whisperkit-evals/resolve/main/WhisperOpenAIAPI/openai_whisper-large-v2/librispeech/2024-02-28_18%3A45%3A02_GMT-0800.json?download=true",
+         "earnings22-12hours": "https://huggingface.co/datasets/argmaxinc/whisperkit-evals/resolve/main/WhisperOpenAIAPI/openai_whisper-large-v2/earnings22/2024-03-04_13%3A39%3A42_GMT-0800.json",
+         "librispeech": "https://huggingface.co/datasets/argmaxinc/whisperkit-evals/resolve/main/WhisperOpenAIAPI/openai_whisper-large-v2/librispeech/2024-02-28_18%3A45%3A02_GMT-0800.json?download=true",
+     }
+
+     dataset_dfs = {}
+     for dataset_name, url in datasets.items():
+         evals = fetch_evaluation_data(url)
+         dataset_dfs[dataset_name] = pd.json_normalize(evals["results"])
+
+     performance_results = defaultdict(
+         lambda: {
+             "average_wer": [],
+             "dataset_wer": defaultdict(list),
+             "qoi": [],
+             "speed": {"inputAudioSeconds": 0, "fullPipeline": 0},
+             "tokens_per_second": {"totalDecodingLoops": 0, "fullPipeline": 0},
+             "dataset_speed": defaultdict(
+                 lambda: {"inputAudioSeconds": 0, "fullPipeline": 0}
+             ),
+             "dataset_tokens_per_second": defaultdict(
+                 lambda: {"totalDecodingLoops": 0, "fullPipeline": 0}
+             ),
+             "timestamp": None,
+             "commit_hash": None,
+             "commit_timestamp": None,
+         }
+     )
+
+     support_results = {"modelsTested": set(), "devices": set()}
+
+     generate_device_map(source_xcresult_directory)
+
+     for subdir, _, files in os.walk(source_xcresult_directory):
+         for filename in files:
+             file_path = os.path.join(subdir, filename)
+             if not filename.endswith(".json"):
+                 continue
+             elif "summary" in filename:
+                 process_summary_file(file_path, support_results)
+             else:
+                 process_benchmark_file(file_path, dataset_dfs, performance_results)
+
+     not_supported = calculate_and_save_performance_results(
+         performance_results, "dashboard_data/performance_data.json"
+     )
+     calculate_and_save_support_results(
+         support_results, not_supported, "dashboard_data/support_data.csv"
+     )
+
+
+ if __name__ == "__main__":
+     main()
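The aggregation in performance_generate.py can be summarized with a small stand-alone sketch. It is not the code from the commit: the helper `summarize`, the sample timings, and the `reference_wers` lookup are hypothetical, while the field names (`inputAudioSeconds`, `fullPipeline`, `totalDecodingLoops`, `wer`) follow the benchmark JSON keys used above. Speed is total audio seconds over total pipeline seconds, tokens per second is total decoding loops over total pipeline seconds, and QoI is the share of files whose WER is no worse than the reference WER.

```python
from statistics import mean


def summarize(samples, reference_wers):
    """Aggregate per-file timings into speed factor, tokens/s and QoI."""
    audio_seconds = sum(s["inputAudioSeconds"] for s in samples)
    pipeline_seconds = sum(s["fullPipeline"] for s in samples)
    decoding_loops = sum(s["totalDecodingLoops"] for s in samples)
    # A file counts toward QoI when its WER does not exceed the reference WER.
    qoi_flags = [
        1 if s["wer"] <= reference_wers[s["audioFile"]] else 0 for s in samples
    ]
    return {
        "speed": round(audio_seconds / pipeline_seconds, 2),
        "tokens_per_second": round(decoding_loops / pipeline_seconds, 2),
        "qoi": round(mean(qoi_flags), 2),
    }


# Hypothetical per-file results for one (model, device, OS, commit) key.
samples = [
    {
        "audioFile": "a.mp3",
        "inputAudioSeconds": 60.0,
        "fullPipeline": 4.0,
        "totalDecodingLoops": 200,
        "wer": 0.10,
    },
    {
        "audioFile": "b.mp3",
        "inputAudioSeconds": 30.0,
        "fullPipeline": 2.0,
        "totalDecodingLoops": 100,
        "wer": 0.30,
    },
]
reference_wers = {"a.mp3": 0.12, "b.mp3": 0.25}
summary = summarize(samples, reference_wers)
print(summary)
```

Because the ratios are taken over summed totals rather than averaged per file, long recordings weigh more than short ones. A speed below 1.0 means slower than real time, which is the condition the script uses to mark a combination as not supported.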
quality_generate.py ADDED
@@ -0,0 +1,186 @@
+ import json
+ import os
+ import shutil
+ import sys
+ from collections import defaultdict
+ from statistics import mean
+
+ import pandas as pd
+ import requests
+
+ from text_normalizer import text_normalizer
+ from utils import compute_average_wer, download_dataset
+
+
+ def fetch_evaluation_data(url):
+     """
+     Fetches evaluation data from the given URL.
+     :param url: The URL to fetch the evaluation data from.
+     :returns: The evaluation data as a dictionary.
+     :raises SystemExit: If the request does not return HTTP 200.
+     """
+     response = requests.get(url)
+     if response.status_code == 200:
+         return json.loads(response.text)
+     else:
+         sys.exit(f"Failed to fetch WhisperKit evals: {response.text}")
+
+
+ def get_device_name(device):
+     """
+     Gets the device name from the device map if it exists.
+     :param device: String representing the device identifier.
+     :returns: The device name from the device map if it exists, otherwise the input value,
+         with spaces replaced by underscores in either case.
+     """
+     with open("dashboard_data/device_map.json", "r") as f:
+         device_map = json.load(f)
+     return device_map.get(device, device).replace(" ", "_")
+
+
+ def process_quality_file(file_path, dataset_dfs, quality_results):
+     """
+     Processes a single quality file and updates the quality_results dictionary.
+
+     :param file_path: Path to the quality JSON file.
+     :param dataset_dfs: Dictionary of DataFrames containing dataset information.
+     :param quality_results: Dictionary to store the processed quality results.
+
+     This function reads a quality JSON file, extracts relevant information,
+     and updates the quality_results dictionary with various metrics including WER
+     and Quality of Inference (QoI) for different datasets.
+     """
+     with open(file_path, "r") as file:
+         test_results = json.load(file)
+
+     if len(test_results) == 0:
+         return
+
+     metadata = test_results["metadata"]
+     test_results = test_results["results"]
+     model = file_path.split("/")[-3].replace("_", "/")
+     device = metadata["inference_context"]["device_spec"]["product_name"]
+     device = get_device_name(device)
+     timestamp = file_path.split("/")[-1].split(".")[0]
+     key = model
+     dataset_name = metadata["dataset_name"]
+
+     for test_result in test_results:
+         audio_file_name = test_result["file"]
+
+         dataset_key = "Earnings-22" if "earnings22" in dataset_name else "LibriSpeech"
+         dataset_df = dataset_dfs[dataset_key]
+
+         wer_entry = {
+             "prediction": text_normalizer(test_result["prediction"]),
+             "reference": text_normalizer(test_result["reference"]),
+         }
+         quality_results[key]["timestamp"] = timestamp
+         quality_results[key]["dataset_wer"][dataset_name].append(wer_entry)
+
+         audio = audio_file_name.split(".")[0]
+         dataset_row = dataset_df.loc[dataset_df["file"].str.contains(audio)].iloc[0]
+         reference_wer = dataset_row["wer"]
+         prediction_wer = test_result["wer"]
+
+         quality_results[key]["qoi"].append(1 if prediction_wer <= reference_wer else 0)
+
+
+ def calculate_and_save_quality_results(quality_results, quality_output_path):
+     """
+     Calculates final quality metrics and saves them to a JSON file.
+
+     :param quality_results: Dictionary containing raw quality data.
+     :param quality_output_path: Path to save the processed quality results.
+
+     This function processes the raw quality data, calculates average metrics,
+     and writes the final results to a JSON file, with each entry representing
+     a unique model's quality metrics across different datasets, including
+     Word Error Rate (WER) and Quality of Inference (QoI).
+     """
+     with open(quality_output_path, "w") as quality_file:
+         for key, data in quality_results.items():
+             model = key
+
+             dataset_wers = {
+                 dataset: compute_average_wer(wer)
+                 for dataset, wer in data["dataset_wer"].items()
+             }
+             average_wer = (
+                 sum(dataset_wers.values()) / len(dataset_wers)
+                 if len(dataset_wers) != 0
+                 else 0
+             )
+
+             quality_entry = {
+                 "model": model.replace("_", "/"),
+                 "timestamp": data["timestamp"],
+                 "average_wer": round(average_wer, 2),
+                 "dataset_wer": dataset_wers,
+                 "qoi": round(mean(data["qoi"]), 2),
+             }
+
+             json.dump(quality_entry, quality_file)
+             quality_file.write("\n")
+
+
+ def main():
+     """
+     Main function to orchestrate the quality data generation process.
+
+     This function performs the following steps:
+     1. Downloads quality data if requested.
+     2. Fetches evaluation data for various datasets.
+     3. Processes quality files for specific datasets.
+     4. Calculates and saves quality results, including WER and QoI metrics.
+     """
+     if len(sys.argv) > 1 and sys.argv[1] == "download":
+         try:
+             shutil.rmtree("english")
+         except FileNotFoundError:
+             print("Nothing to remove.")
+         download_dataset("argmaxinc/whisperkit-evals", "english", "WhisperKit")
+
+     datasets = {
+         "Earnings-22": "https://huggingface.co/datasets/argmaxinc/whisperkit-evals/resolve/main/WhisperOpenAIAPI/openai_whisper-large-v2/earnings22/2024-03-04_13%3A39%3A42_GMT-0800.json",
+         "LibriSpeech": "https://huggingface.co/datasets/argmaxinc/whisperkit-evals/resolve/main/WhisperOpenAIAPI/openai_whisper-large-v2/librispeech/2024-02-28_18%3A45%3A02_GMT-0800.json?download=true",
+         "earnings22-10mins": "https://huggingface.co/datasets/argmaxinc/whisperkit-evals/resolve/main/WhisperOpenAIAPI/openai_whisper-large-v2/earnings22/2024-03-04_13%3A39%3A42_GMT-0800.json",
+         "librispeech-10mins": "https://huggingface.co/datasets/argmaxinc/whisperkit-evals/resolve/main/WhisperOpenAIAPI/openai_whisper-large-v2/librispeech/2024-02-28_18%3A45%3A02_GMT-0800.json?download=true",
+         "earnings22-12hours": "https://huggingface.co/datasets/argmaxinc/whisperkit-evals/resolve/main/WhisperOpenAIAPI/openai_whisper-large-v2/earnings22/2024-03-04_13%3A39%3A42_GMT-0800.json",
+         "librispeech": "https://huggingface.co/datasets/argmaxinc/whisperkit-evals/resolve/main/WhisperOpenAIAPI/openai_whisper-large-v2/librispeech/2024-02-28_18%3A45%3A02_GMT-0800.json?download=true",
+     }
+
+     dataset_dfs = {}
+     for dataset_name, url in datasets.items():
+         evals = fetch_evaluation_data(url)
+         dataset_dfs[dataset_name] = pd.json_normalize(evals["results"])
+
+     source_quality_directory = "argmaxinc/english/WhisperKit/"
+
+     quality_results = defaultdict(
+         lambda: {
+             "average_wer": [],
+             "dataset_wer": defaultdict(list),
+             "qoi": [],
+             "timestamp": None,
+         }
+     )
+
+     for subdir, _, files in os.walk(source_quality_directory):
+         dataset = subdir.split("/")[-1]
+         if dataset not in ["earnings22-12hours", "librispeech"]:
+             continue
+
+         for filename in files:
+             if not filename.endswith(".json"):
+                 continue
+
+             file_path = os.path.join(subdir, filename)
+             process_quality_file(file_path, dataset_dfs, quality_results)
+
+     calculate_and_save_quality_results(
+         quality_results, "dashboard_data/quality_data.json"
+     )
+
+
+ if __name__ == "__main__":
+     main()
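quality_generate.py averages WER differently from performance_generate.py: it first computes one WER per dataset and then takes the unweighted mean of those values. A minimal sketch of that step follows; the helper name `macro_average_wer` and the sample WER values are hypothetical, and the per-dataset numbers stand in for the output of `compute_average_wer`, whose implementation is not shown here.

```python
def macro_average_wer(dataset_wers):
    """Unweighted mean of per-dataset WERs; 0 when no dataset was processed."""
    if not dataset_wers:
        return 0
    return round(sum(dataset_wers.values()) / len(dataset_wers), 2)


# Hypothetical per-dataset WERs for one model.
dataset_wers = {"librispeech": 2.5, "earnings22-12hours": 12.5}
average = macro_average_wer(dataset_wers)
print(average)
```

With this scheme each dataset contributes equally regardless of how many files it contains, whereas the performance script pools all entries into a single `compute_average_wer` call.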
requirements.txt ADDED
@@ -0,0 +1,122 @@
+ aiofiles
+ aiohttp
+ aiosignal
+ altair
+ annotated-types
+ anyio
+ argmax_gradio_components
+ async-timeout
+ attrs
+ backports.tarfile
+ build
+ certifi
+ cffi
+ cfgv
+ charset-normalizer
+ click
+ contourpy
+ cycler
+ datasets
+ dill
+ distlib
+ dnspython
+ docutils
+ email_validator
+ exceptiongroup
+ fastapi
+ fastapi-cli
+ ffmpy
+ filelock
+ fonttools
+ frozenlist
+ fsspec
+ gradio==5.0.1
+ h11
+ httpcore
+ httptools
+ httpx
+ huggingface-hub
+ identify
+ idna
+ importlib_metadata
+ importlib_resources
+ jaraco.classes
+ jaraco.context
+ jaraco.functools
+ Jinja2
+ jsonschema
+ jsonschema-specifications
+ keyring
+ kiwisolver
+ markdown-it-py
+ MarkupSafe
+ matplotlib
+ mdurl
+ more-itertools
+ multidict
+ multiprocess
+ nh3
+ nodeenv
+ numpy
+ orjson
+ packaging
+ pandas
+ pillow
+ pkginfo
+ platformdirs
+ plotly
+ pre-commit
+ pyarrow
+ pyarrow-hotfix
+ pycparser
+ pydantic
+ pydantic_core
+ pydub
+ Pygments
+ pyparsing
+ pyproject_hooks
+ python-dateutil
+ python-dotenv
+ python-multipart
+ pytz
+ PyYAML
+ readme_renderer
+ referencing
+ requests
+ requests-toolbelt
+ rfc3986
+ rich
+ rpds-py
+ ruff
+ scipy
+ scikit-learn
+ semantic-version
+ shellingham
+ six
+ sniffio
+ soundfile
+ starlette
+ tenacity
+ text_normalizer
+ tomli
+ tomlkit
+ toolz
+ tqdm
+ twine
+ typer
+ typing_extensions
+ tzdata
+ ujson
+ urllib3
+ uvicorn
+ uvloop
+ virtualenv
+ watchfiles
+ websockets
+ xxhash
+ yarl
+ zipp
+ iso639-lang
+ evaluate
+ jiwer
+ regex
static/Zwizz-Medium.woff ADDED
Binary file (28.7 kB)
static/Zwizz-Regular.woff ADDED
Binary file (28.4 kB)
static/Zwizz-SemiBold.woff ADDED
Binary file (28.7 kB)
text_normalizer.py ADDED
@@ -0,0 +1,2374 @@
1
+ # Copyright 2022 The OpenAI team and The HuggingFace Team. All rights reserved.
2
+ # Most of this code is copied from the original Whisper repository
3
+ #
4
+ # Licensed under the Apache License, Version 2.0 (the "License");
5
+ # you may not use this file except in compliance with the License.
6
+ # You may obtain a copy of the License at
7
+ #
8
+ # http://www.apache.org/licenses/LICENSE-2.0
9
+ #
10
+ # Unless required by applicable law or agreed to in writing, software
11
+ # distributed under the License is distributed on an "AS IS" BASIS,
12
+ # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
13
+ # See the License for the specific language governing permissions and
14
+ # limitations under the License.
15
+
16
+ import re
17
+ import unicodedata
18
+ from fractions import Fraction
19
+ from typing import Iterator, List, Match, Optional, Union
20
+
21
+ import regex
22
+
23
+ abbr = {
24
+ "accessorise": "accessorize",
25
+ "accessorised": "accessorized",
26
+ "accessorises": "accessorizes",
27
+ "accessorising": "accessorizing",
28
+ "acclimatisation": "acclimatization",
29
+ "acclimatise": "acclimatize",
30
+ "acclimatised": "acclimatized",
31
+ "acclimatises": "acclimatizes",
32
+ "acclimatising": "acclimatizing",
33
+ "accoutrements": "accouterments",
34
+ "aeon": "eon",
35
+ "aeons": "eons",
36
+ "aerogramme": "aerogram",
37
+ "aerogrammes": "aerograms",
38
+ "aeroplane": "airplane",
39
+ "aeroplanes": "airplanes",
40
+ "aesthete": "esthete",
41
+ "aesthetes": "esthetes",
42
+ "aesthetic": "esthetic",
43
+ "aesthetically": "esthetically",
44
+ "aesthetics": "esthetics",
45
+ "aetiology": "etiology",
46
+ "ageing": "aging",
47
+ "aggrandisement": "aggrandizement",
48
+ "agonise": "agonize",
49
+ "agonised": "agonized",
50
+ "agonises": "agonizes",
51
+ "agonising": "agonizing",
52
+ "agonisingly": "agonizingly",
53
+ "almanack": "almanac",
54
+ "almanacks": "almanacs",
55
+ "aluminium": "aluminum",
56
+ "amortisable": "amortizable",
57
+ "amortisation": "amortization",
58
+ "amortisations": "amortizations",
59
+ "amortise": "amortize",
60
+ "amortised": "amortized",
61
+ "amortises": "amortizes",
62
+ "amortising": "amortizing",
63
+ "amphitheatre": "amphitheater",
64
+ "amphitheatres": "amphitheaters",
65
+ "anaemia": "anemia",
66
+ "anaemic": "anemic",
67
+ "anaesthesia": "anesthesia",
68
+ "anaesthetic": "anesthetic",
69
+ "anaesthetics": "anesthetics",
70
+ "anaesthetise": "anesthetize",
71
+ "anaesthetised": "anesthetized",
72
+ "anaesthetises": "anesthetizes",
73
+ "anaesthetising": "anesthetizing",
74
+ "anaesthetist": "anesthetist",
75
+ "anaesthetists": "anesthetists",
76
+ "anaesthetize": "anesthetize",
77
+ "anaesthetized": "anesthetized",
78
+ "anaesthetizes": "anesthetizes",
79
+ "anaesthetizing": "anesthetizing",
80
+ "analogue": "analog",
81
+ "analogues": "analogs",
82
+ "analyse": "analyze",
83
+ "analysed": "analyzed",
84
+ "analyses": "analyzes",
85
+ "analysing": "analyzing",
86
+ "anglicise": "anglicize",
87
+ "anglicised": "anglicized",
88
+ "anglicises": "anglicizes",
89
+ "anglicising": "anglicizing",
90
+ "annualised": "annualized",
91
+ "antagonise": "antagonize",
92
+ "antagonised": "antagonized",
93
+ "antagonises": "antagonizes",
94
+ "antagonising": "antagonizing",
95
+ "apologise": "apologize",
96
+ "apologised": "apologized",
97
+ "apologises": "apologizes",
98
+ "apologising": "apologizing",
99
+ "appal": "appall",
100
+ "appals": "appalls",
101
+ "appetiser": "appetizer",
102
+ "appetisers": "appetizers",
103
+ "appetising": "appetizing",
104
+ "appetisingly": "appetizingly",
105
+ "arbour": "arbor",
106
+ "arbours": "arbors",
107
+ "archaeologically": "archeologically",
108
+ "archaeologist": "archeologist",
109
+ "archaeologists": "archeologists",
110
+ "archaeology": "archeology",
111
+ "archeological": "archaeological",
112
+ "ardour": "ardor",
113
+ "armour": "armor",
114
+ "armoured": "armored",
115
+ "armourer": "armorer",
116
+ "armourers": "armorers",
117
+ "armouries": "armories",
118
+ "armoury": "armory",
119
+ "artefact": "artifact",
120
+ "artefacts": "artifacts",
121
+ "authorise": "authorize",
122
+ "authorised": "authorized",
123
+ "authorises": "authorizes",
124
+ "authorising": "authorizing",
125
+ "axe": "ax",
126
+ "backpedalled": "backpedaled",
127
+ "backpedalling": "backpedaling",
128
+ "bannister": "banister",
129
+ "bannisters": "banisters",
130
+ "baptise": "baptize",
131
+ "baptised": "baptized",
132
+ "baptises": "baptizes",
133
+ "baptising": "baptizing",
134
+ "bastardise": "bastardize",
135
+ "bastardised": "bastardized",
136
+ "bastardises": "bastardizes",
137
+ "bastardising": "bastardizing",
138
+ "battleax": "battleaxe",
139
+ "baulk": "balk",
140
+ "baulked": "balked",
141
+ "baulking": "balking",
142
+ "baulks": "balks",
143
+ "bedevilled": "bedeviled",
144
+ "bedevilling": "bedeviling",
145
+ "behaviour": "behavior",
146
+ "behavioural": "behavioral",
147
+ "behaviourism": "behaviorism",
148
+ "behaviourist": "behaviorist",
149
+ "behaviourists": "behaviorists",
150
+ "behaviours": "behaviors",
151
+ "behove": "behoove",
152
+ "behoved": "behooved",
153
+ "behoves": "behooves",
154
+ "bejewelled": "bejeweled",
155
+ "belabour": "belabor",
156
+ "belaboured": "belabored",
157
+ "belabouring": "belaboring",
158
+ "belabours": "belabors",
159
+ "bevelled": "beveled",
160
+ "bevvies": "bevies",
161
+ "bevvy": "bevy",
162
+ "biassed": "biased",
163
+ "biassing": "biasing",
164
+ "bingeing": "binging",
165
+ "bougainvillaea": "bougainvillea",
166
+ "bougainvillaeas": "bougainvilleas",
167
+ "bowdlerise": "bowdlerize",
168
+ "bowdlerised": "bowdlerized",
169
+ "bowdlerises": "bowdlerizes",
170
+ "bowdlerising": "bowdlerizing",
171
+ "breathalyse": "breathalyze",
172
+ "breathalysed": "breathalyzed",
173
+ "breathalyser": "breathalyzer",
174
+ "breathalysers": "breathalyzers",
175
+ "breathalyses": "breathalyzes",
176
+ "breathalysing": "breathalyzing",
177
+ "brutalise": "brutalize",
178
+ "brutalised": "brutalized",
179
+ "brutalises": "brutalizes",
180
+ "brutalising": "brutalizing",
181
+ "busses": "buses",
182
+ "bussing": "busing",
183
+ "caesarean": "cesarean",
184
+ "caesareans": "cesareans",
185
+ "calibre": "caliber",
186
+ "calibres": "calibers",
187
+ "calliper": "caliper",
188
+ "callipers": "calipers",
189
+ "callisthenics": "calisthenics",
190
+ "canalise": "canalize",
191
+ "canalised": "canalized",
192
+ "canalises": "canalizes",
193
+ "canalising": "canalizing",
194
+ "cancelation": "cancellation",
195
+ "cancelations": "cancellations",
196
+ "cancelled": "canceled",
197
+ "cancelling": "canceling",
198
+ "candour": "candor",
199
+ "cannibalise": "cannibalize",
200
+ "cannibalised": "cannibalized",
201
+ "cannibalises": "cannibalizes",
202
+ "cannibalising": "cannibalizing",
203
+ "canonise": "canonize",
204
+ "canonised": "canonized",
205
+ "canonises": "canonizes",
206
+ "canonising": "canonizing",
207
+ "capitalise": "capitalize",
208
+ "capitalised": "capitalized",
209
+ "capitalises": "capitalizes",
210
+ "capitalising": "capitalizing",
211
+ "caramelise": "caramelize",
212
+ "caramelised": "caramelized",
213
+ "caramelises": "caramelizes",
214
+ "caramelising": "caramelizing",
215
+ "carbonise": "carbonize",
216
+ "carbonised": "carbonized",
217
+ "carbonises": "carbonizes",
218
+ "carbonising": "carbonizing",
219
+ "carolled": "caroled",
220
+ "carolling": "caroling",
221
+ "catalogue": "catalog",
222
+ "catalogued": "cataloged",
223
+ "catalogues": "catalogs",
224
+ "cataloguing": "cataloging",
225
+ "catalyse": "catalyze",
226
+ "catalysed": "catalyzed",
227
+ "catalyses": "catalyzes",
228
+ "catalysing": "catalyzing",
229
+ "categorise": "categorize",
230
+ "categorised": "categorized",
231
+ "categorises": "categorizes",
232
+ "categorising": "categorizing",
233
+ "cauterise": "cauterize",
234
+ "cauterised": "cauterized",
235
+ "cauterises": "cauterizes",
236
+ "cauterising": "cauterizing",
237
+ "cavilled": "caviled",
238
+ "cavilling": "caviling",
239
+ "centigramme": "centigram",
240
+ "centigrammes": "centigrams",
241
+ "centilitre": "centiliter",
242
+ "centilitres": "centiliters",
243
+ "centimetre": "centimeter",
244
+ "centimetres": "centimeters",
245
+ "centralise": "centralize",
246
+ "centralised": "centralized",
247
+ "centralises": "centralizes",
248
+ "centralising": "centralizing",
249
+ "centre": "center",
250
+ "centred": "centered",
251
+ "centrefold": "centerfold",
252
+ "centrefolds": "centerfolds",
253
+ "centrepiece": "centerpiece",
254
+ "centrepieces": "centerpieces",
255
+ "centres": "centers",
256
+ "channelled": "channeled",
257
+ "channelling": "channeling",
258
+ "characterise": "characterize",
259
+ "characterised": "characterized",
260
+ "characterises": "characterizes",
261
+ "characterising": "characterizing",
262
+ "cheque": "check",
263
+ "chequebook": "checkbook",
264
+ "chequebooks": "checkbooks",
265
+ "chequered": "checkered",
266
+ "cheques": "checks",
267
+ "chilli": "chili",
268
+ "chimaera": "chimera",
269
+ "chimaeras": "chimeras",
270
+ "chiselled": "chiseled",
271
+ "chiselling": "chiseling",
272
+ "circularise": "circularize",
273
+ "circularised": "circularized",
274
+ "circularises": "circularizes",
275
+ "circularising": "circularizing",
276
+ "civilise": "civilize",
277
+ "civilised": "civilized",
278
+ "civilises": "civilizes",
279
+ "civilising": "civilizing",
280
+ "clamour": "clamor",
281
+ "clamoured": "clamored",
282
+ "clamouring": "clamoring",
283
+ "clamours": "clamors",
284
+ "clangour": "clangor",
285
+ "clarinettist": "clarinetist",
286
+ "clarinettists": "clarinetists",
287
+ "collectivise": "collectivize",
288
+ "collectivised": "collectivized",
289
+ "collectivises": "collectivizes",
290
+ "collectivising": "collectivizing",
291
+ "colonisation": "colonization",
292
+ "colonise": "colonize",
293
+ "colonised": "colonized",
294
+ "coloniser": "colonizer",
295
+ "colonisers": "colonizers",
296
+ "colonises": "colonizes",
297
+ "colonising": "colonizing",
298
+ "colour": "color",
299
+ "colourant": "colorant",
300
+ "colourants": "colorants",
301
+ "coloured": "colored",
302
+ "coloureds": "coloreds",
303
+ "colourful": "colorful",
304
+ "colourfully": "colorfully",
305
+ "colouring": "coloring",
306
+ "colourize": "colorize",
307
+ "colourized": "colorized",
308
+ "colourizes": "colorizes",
309
+ "colourizing": "colorizing",
310
+ "colourless": "colorless",
311
+ "colours": "colors",
312
+ "commercialise": "commercialize",
313
+ "commercialised": "commercialized",
314
+ "commercialises": "commercializes",
315
+ "commercialising": "commercializing",
316
+ "compartmentalise": "compartmentalize",
317
+ "compartmentalised": "compartmentalized",
318
+ "compartmentalises": "compartmentalizes",
319
+ "compartmentalising": "compartmentalizing",
320
+ "computerise": "computerize",
321
+ "computerised": "computerized",
322
+ "computerises": "computerizes",
323
+ "computerising": "computerizing",
324
+ "conceptualise": "conceptualize",
325
+ "conceptualised": "conceptualized",
326
+ "conceptualises": "conceptualizes",
327
+ "conceptualising": "conceptualizing",
328
+ "connexion": "connection",
329
+ "connexions": "connections",
330
+ "contextualise": "contextualize",
331
+ "contextualised": "contextualized",
332
+ "contextualises": "contextualizes",
333
+ "contextualising": "contextualizing",
334
+ "cosier": "cozier",
335
+ "cosies": "cozies",
336
+ "cosiest": "coziest",
337
+ "cosily": "cozily",
338
+ "cosiness": "coziness",
339
+ "cosy": "cozy",
340
+ "councillor": "councilor",
341
+ "councillors": "councilors",
342
+ "counselled": "counseled",
343
+ "counselling": "counseling",
344
+ "counsellor": "counselor",
345
+ "counsellors": "counselors",
346
+ "crenelated": "crenellated",
347
+ "criminalise": "criminalize",
348
+ "criminalised": "criminalized",
349
+ "criminalises": "criminalizes",
350
+ "criminalising": "criminalizing",
351
+ "criticise": "criticize",
352
+ "criticised": "criticized",
353
+ "criticises": "criticizes",
354
+ "criticising": "criticizing",
355
+ "crueller": "crueler",
356
+ "cruellest": "cruelest",
357
+ "crystallisation": "crystallization",
358
+ "crystallise": "crystallize",
359
+ "crystallised": "crystallized",
360
+ "crystallises": "crystallizes",
361
+ "crystallising": "crystallizing",
362
+ "cudgelled": "cudgeled",
363
+ "cudgelling": "cudgeling",
364
+ "customise": "customize",
365
+ "customised": "customized",
366
+ "customises": "customizes",
367
+ "customising": "customizing",
368
+ "cypher": "cipher",
369
+ "cyphers": "ciphers",
370
+ "decentralisation": "decentralization",
371
+ "decentralise": "decentralize",
372
+ "decentralised": "decentralized",
373
+ "decentralises": "decentralizes",
374
+ "decentralising": "decentralizing",
375
+ "decriminalisation": "decriminalization",
376
+ "decriminalise": "decriminalize",
377
+ "decriminalised": "decriminalized",
378
+ "decriminalises": "decriminalizes",
379
+ "decriminalising": "decriminalizing",
380
+ "defence": "defense",
381
+ "defenceless": "defenseless",
382
+ "defences": "defenses",
383
+ "dehumanisation": "dehumanization",
384
+ "dehumanise": "dehumanize",
385
+ "dehumanised": "dehumanized",
386
+ "dehumanises": "dehumanizes",
387
+ "dehumanising": "dehumanizing",
388
+ "demeanour": "demeanor",
389
+ "demilitarisation": "demilitarization",
390
+ "demilitarise": "demilitarize",
391
+ "demilitarised": "demilitarized",
392
+ "demilitarises": "demilitarizes",
393
+ "demilitarising": "demilitarizing",
394
+ "demobilisation": "demobilization",
395
+ "demobilise": "demobilize",
396
+ "demobilised": "demobilized",
397
+ "demobilises": "demobilizes",
398
+ "demobilising": "demobilizing",
399
+ "democratisation": "democratization",
400
+ "democratise": "democratize",
401
+ "democratised": "democratized",
402
+ "democratises": "democratizes",
403
+ "democratising": "democratizing",
404
+ "demonise": "demonize",
405
+ "demonised": "demonized",
406
+ "demonises": "demonizes",
407
+ "demonising": "demonizing",
408
+ "demoralisation": "demoralization",
409
+ "demoralise": "demoralize",
410
+ "demoralised": "demoralized",
411
+ "demoralises": "demoralizes",
412
+ "demoralising": "demoralizing",
413
+ "denationalisation": "denationalization",
414
+ "denationalise": "denationalize",
415
+ "denationalised": "denationalized",
416
+ "denationalises": "denationalizes",
417
+ "denationalising": "denationalizing",
418
+ "deodorise": "deodorize",
419
+ "deodorised": "deodorized",
420
+ "deodorises": "deodorizes",
421
+ "deodorising": "deodorizing",
422
+ "depersonalise": "depersonalize",
423
+ "depersonalised": "depersonalized",
424
+ "depersonalises": "depersonalizes",
425
+ "depersonalising": "depersonalizing",
426
+ "deputise": "deputize",
427
+ "deputised": "deputized",
428
+ "deputises": "deputizes",
429
+ "deputising": "deputizing",
430
+ "desensitisation": "desensitization",
431
+ "desensitise": "desensitize",
432
+ "desensitised": "desensitized",
433
+ "desensitises": "desensitizes",
434
+ "desensitising": "desensitizing",
435
+ "destabilisation": "destabilization",
436
+ "destabilise": "destabilize",
437
+ "destabilised": "destabilized",
438
+ "destabilises": "destabilizes",
439
+ "destabilising": "destabilizing",
440
+ "dialled": "dialed",
441
+ "dialling": "dialing",
442
+ "dialogue": "dialog",
443
+ "dialogues": "dialogs",
444
+ "diarrhoea": "diarrhea",
445
+ "digitise": "digitize",
446
+ "digitised": "digitized",
447
+ "digitises": "digitizes",
448
+ "digitising": "digitizing",
449
+ "disc": "disk",
450
+ "discolour": "discolor",
451
+ "discoloured": "discolored",
452
+ "discolouring": "discoloring",
453
+ "discolours": "discolors",
454
+ "discs": "disks",
455
+ "disembowelled": "disemboweled",
456
+ "disembowelling": "disemboweling",
457
+ "disfavour": "disfavor",
458
+ "dishevelled": "disheveled",
459
+ "dishonour": "dishonor",
460
+ "dishonourable": "dishonorable",
461
+ "dishonourably": "dishonorably",
462
+ "dishonoured": "dishonored",
463
+ "dishonouring": "dishonoring",
464
+ "dishonours": "dishonors",
465
+ "disorganisation": "disorganization",
466
+ "disorganised": "disorganized",
467
+ "distil": "distill",
468
+ "distils": "distills",
469
+ "dramatisation": "dramatization",
470
+ "dramatisations": "dramatizations",
471
+ "dramatise": "dramatize",
472
+ "dramatised": "dramatized",
473
+ "dramatises": "dramatizes",
474
+ "dramatising": "dramatizing",
475
+ "draught": "draft",
476
+ "draughtboard": "draftboard",
477
+ "draughtboards": "draftboards",
478
+ "draughtier": "draftier",
479
+ "draughtiest": "draftiest",
480
+ "draughts": "drafts",
481
+ "draughtsman": "draftsman",
482
+ "draughtsmanship": "draftsmanship",
483
+ "draughtsmen": "draftsmen",
484
+ "draughtswoman": "draftswoman",
485
+ "draughtswomen": "draftswomen",
486
+ "draughty": "drafty",
487
+ "drivelled": "driveled",
488
+ "drivelling": "driveling",
489
+ "duelled": "dueled",
490
+ "duelling": "dueling",
491
+ "economise": "economize",
492
+ "economised": "economized",
493
+ "economises": "economizes",
494
+ "economising": "economizing",
495
+ "editorialise": "editorialize",
496
+ "editorialised": "editorialized",
497
+ "editorialises": "editorializes",
498
+ "editorialising": "editorializing",
499
+ "edoema": "edema",
500
+ "empathise": "empathize",
501
+ "empathised": "empathized",
502
+ "empathises": "empathizes",
503
+ "empathising": "empathizing",
504
+ "emphasise": "emphasize",
505
+ "emphasised": "emphasized",
506
+ "emphasises": "emphasizes",
507
+ "emphasising": "emphasizing",
508
+ "enamelled": "enameled",
509
+ "enamelling": "enameling",
510
+ "enamoured": "enamored",
511
+ "encyclopaedia": "encyclopedia",
512
+ "encyclopaedias": "encyclopedias",
513
+ "encyclopaedic": "encyclopedic",
514
+ "endeavour": "endeavor",
515
+ "endeavoured": "endeavored",
516
+ "endeavouring": "endeavoring",
517
+ "endeavours": "endeavors",
518
+ "energise": "energize",
519
+ "energised": "energized",
520
+ "energises": "energizes",
521
+ "energising": "energizing",
522
+ "enrol": "enroll",
523
+ "enrols": "enrolls",
524
+ "enthral": "enthrall",
525
+ "enthrals": "enthralls",
526
+ "epaulette": "epaulet",
527
+ "epaulettes": "epaulets",
528
+ "epicentre": "epicenter",
529
+ "epicentres": "epicenters",
530
+ "epilogue": "epilog",
531
+ "epilogues": "epilogs",
532
+ "epitomise": "epitomize",
533
+ "epitomised": "epitomized",
534
+ "epitomises": "epitomizes",
535
+ "epitomising": "epitomizing",
536
+ "equalisation": "equalization",
537
+ "equalise": "equalize",
538
+ "equalised": "equalized",
539
+ "equaliser": "equalizer",
540
+ "equalisers": "equalizers",
541
+ "equalises": "equalizes",
542
+ "equalising": "equalizing",
543
+ "eulogise": "eulogize",
544
+ "eulogised": "eulogized",
545
+ "eulogises": "eulogizes",
546
+ "eulogising": "eulogizing",
547
+ "evangelise": "evangelize",
548
+ "evangelised": "evangelized",
549
+ "evangelises": "evangelizes",
550
+ "evangelising": "evangelizing",
551
+ "exorcise": "exorcize",
552
+ "exorcised": "exorcized",
553
+ "exorcises": "exorcizes",
554
+ "exorcising": "exorcizing",
555
+ "extemporisation": "extemporization",
556
+ "extemporise": "extemporize",
557
+ "extemporised": "extemporized",
558
+ "extemporises": "extemporizes",
559
+ "extemporising": "extemporizing",
560
+ "externalisation": "externalization",
561
+ "externalisations": "externalizations",
562
+ "externalise": "externalize",
563
+ "externalised": "externalized",
564
+ "externalises": "externalizes",
565
+ "externalising": "externalizing",
566
+ "factorise": "factorize",
567
+ "factorised": "factorized",
568
+ "factorises": "factorizes",
569
+ "factorising": "factorizing",
570
+ "faecal": "fecal",
571
+ "faeces": "feces",
572
+ "familiarisation": "familiarization",
573
+ "familiarise": "familiarize",
574
+ "familiarised": "familiarized",
575
+ "familiarises": "familiarizes",
576
+ "familiarising": "familiarizing",
577
+ "fantasise": "fantasize",
578
+ "fantasised": "fantasized",
579
+ "fantasises": "fantasizes",
580
+ "fantasising": "fantasizing",
581
+ "favour": "favor",
582
+ "favourable": "favorable",
583
+ "favourably": "favorably",
584
+ "favoured": "favored",
585
+ "favouring": "favoring",
586
+ "favourite": "favorite",
587
+ "favourites": "favorites",
588
+ "favouritism": "favoritism",
589
+ "favours": "favors",
590
+ "feminise": "feminize",
591
+ "feminised": "feminized",
592
+ "feminises": "feminizes",
593
+ "feminising": "feminizing",
594
+ "fertilisation": "fertilization",
595
+ "fertilise": "fertilize",
596
+ "fertilised": "fertilized",
597
+ "fertiliser": "fertilizer",
598
+ "fertilisers": "fertilizers",
599
+ "fertilises": "fertilizes",
600
+ "fertilising": "fertilizing",
601
+ "fervour": "fervor",
602
+ "fibre": "fiber",
603
+ "fibreglass": "fiberglass",
604
+ "fibres": "fibers",
605
+ "fictionalisation": "fictionalization",
606
+ "fictionalisations": "fictionalizations",
607
+ "fictionalise": "fictionalize",
608
+ "fictionalised": "fictionalized",
609
+ "fictionalises": "fictionalizes",
610
+ "fictionalising": "fictionalizing",
611
+ "fillet": "filet",
612
+ "filleted": "fileted",
613
+ "filleting": "fileting",
614
+ "fillets": "filets",
615
+ "finalisation": "finalization",
616
+ "finalise": "finalize",
617
+ "finalised": "finalized",
618
+ "finalises": "finalizes",
619
+ "finalising": "finalizing",
620
+ "flautist": "flutist",
621
+ "flautists": "flutists",
622
+ "flavour": "flavor",
623
+ "flavoured": "flavored",
624
+ "flavouring": "flavoring",
625
+ "flavourings": "flavorings",
626
+ "flavourless": "flavorless",
627
+ "flavours": "flavors",
628
+ "flavoursome": "flavorsome",
629
+ "flyer / flier": "flier / flyer",
630
+ "foetal": "fetal",
631
+ "foetid": "fetid",
632
+ "foetus": "fetus",
633
+ "foetuses": "fetuses",
634
+ "formalisation": "formalization",
635
+ "formalise": "formalize",
636
+ "formalised": "formalized",
637
+ "formalises": "formalizes",
638
+ "formalising": "formalizing",
639
+ "fossilisation": "fossilization",
640
+ "fossilise": "fossilize",
641
+ "fossilised": "fossilized",
642
+ "fossilises": "fossilizes",
643
+ "fossilising": "fossilizing",
644
+ "fraternisation": "fraternization",
645
+ "fraternise": "fraternize",
646
+ "fraternised": "fraternized",
647
+ "fraternises": "fraternizes",
648
+ "fraternising": "fraternizing",
649
+ "fulfil": "fulfill",
650
+ "fulfilment": "fulfillment",
651
+ "fulfils": "fulfills",
652
+ "funnelled": "funneled",
653
+ "funnelling": "funneling",
654
+ "gage": "gauge",
655
+ "gaged": "gauged",
656
+ "gages": "gauges",
657
+ "gaging": "gauging",
658
+ "galvanise": "galvanize",
659
+ "galvanised": "galvanized",
660
+ "galvanises": "galvanizes",
661
+ "galvanising": "galvanizing",
662
+ "gambolled": "gamboled",
663
+ "gambolling": "gamboling",
664
+ "gaol": "jail",
665
+ "gaolbird": "jailbird",
666
+ "gaolbirds": "jailbirds",
667
+ "gaolbreak": "jailbreak",
668
+ "gaolbreaks": "jailbreaks",
669
+ "gaoled": "jailed",
670
+ "gaoler": "jailer",
671
+ "gaolers": "jailers",
672
+ "gaoling": "jailing",
673
+ "gaols": "jails",
674
+ "gasses": "gases",
675
+ "generalisation": "generalization",
676
+ "generalisations": "generalizations",
677
+ "generalise": "generalize",
678
+ "generalised": "generalized",
679
+ "generalises": "generalizes",
680
+ "generalising": "generalizing",
681
+ "ghettoise": "ghettoize",
682
+ "ghettoised": "ghettoized",
683
+ "ghettoises": "ghettoizes",
684
+ "ghettoising": "ghettoizing",
685
+ "gipsies": "gypsies",
686
+ "glamor": "glamour",
687
+ "glamorise": "glamorize",
688
+ "glamorised": "glamorized",
689
+ "glamorises": "glamorizes",
690
+ "glamorising": "glamorizing",
691
+ "globalisation": "globalization",
692
+ "globalise": "globalize",
693
+ "globalised": "globalized",
694
+ "globalises": "globalizes",
695
+ "globalising": "globalizing",
696
+ "glueing": "gluing",
697
+ "goitre": "goiter",
698
+ "goitres": "goiters",
699
+ "gonorrhoea": "gonorrhea",
700
+ "gramme": "gram",
701
+ "grammes": "grams",
702
+ "gravelled": "graveled",
703
+ "grey": "gray",
704
+ "greyed": "grayed",
705
+ "greying": "graying",
706
+ "greyish": "grayish",
707
+ "greyness": "grayness",
708
+ "greys": "grays",
709
+ "grovelled": "groveled",
710
+ "grovelling": "groveling",
711
+ "groyne": "groin",
712
+ "groynes": "groins",
713
+ "gruelling": "grueling",
714
+ "gruellingly": "gruelingly",
715
+ "gryphon": "griffin",
716
+ "gryphons": "griffins",
717
+ "gynaecological": "gynecological",
718
+ "gynaecologist": "gynecologist",
719
+ "gynaecologists": "gynecologists",
720
+ "gynaecology": "gynecology",
721
+ "haematological": "hematological",
722
+ "haematologist": "hematologist",
723
+ "haematologists": "hematologists",
724
+ "haematology": "hematology",
725
+ "haemoglobin": "hemoglobin",
726
+ "haemophilia": "hemophilia",
727
+ "haemophiliac": "hemophiliac",
728
+ "haemophiliacs": "hemophiliacs",
729
+ "haemorrhage": "hemorrhage",
730
+ "haemorrhaged": "hemorrhaged",
731
+ "haemorrhages": "hemorrhages",
732
+ "haemorrhaging": "hemorrhaging",
733
+ "haemorrhoids": "hemorrhoids",
734
+ "harbour": "harbor",
735
+ "harboured": "harbored",
736
+ "harbouring": "harboring",
737
+ "harbours": "harbors",
738
+ "harmonisation": "harmonization",
739
+ "harmonise": "harmonize",
740
+ "harmonised": "harmonized",
741
+ "harmonises": "harmonizes",
742
+ "harmonising": "harmonizing",
743
+ "homoeopath": "homeopath",
744
+ "homoeopathic": "homeopathic",
745
+ "homoeopaths": "homeopaths",
746
+ "homoeopathy": "homeopathy",
747
+ "homogenise": "homogenize",
748
+ "homogenised": "homogenized",
749
+ "homogenises": "homogenizes",
750
+ "homogenising": "homogenizing",
751
+ "honour": "honor",
752
+ "honourable": "honorable",
753
+ "honourably": "honorably",
754
+ "honoured": "honored",
755
+ "honouring": "honoring",
756
+ "honours": "honors",
757
+ "hospitalisation": "hospitalization",
758
+ "hospitalise": "hospitalize",
759
+ "hospitalised": "hospitalized",
760
+ "hospitalises": "hospitalizes",
761
+ "hospitalising": "hospitalizing",
762
+ "humanise": "humanize",
763
+ "humanised": "humanized",
764
+ "humanises": "humanizes",
765
+ "humanising": "humanizing",
766
+ "humour": "humor",
767
+ "humoured": "humored",
768
+ "humouring": "humoring",
769
+ "humourless": "humorless",
770
+ "humours": "humors",
771
+ "hybridise": "hybridize",
772
+ "hybridised": "hybridized",
773
+ "hybridises": "hybridizes",
774
+ "hybridising": "hybridizing",
775
+ "hypnotise": "hypnotize",
776
+ "hypnotised": "hypnotized",
777
+ "hypnotises": "hypnotizes",
778
+ "hypnotising": "hypnotizing",
779
+ "hypothesise": "hypothesize",
780
+ "hypothesised": "hypothesized",
781
+ "hypothesises": "hypothesizes",
782
+ "hypothesising": "hypothesizing",
783
+ "idealisation": "idealization",
784
+ "idealise": "idealize",
785
+ "idealised": "idealized",
786
+ "idealises": "idealizes",
787
+ "idealising": "idealizing",
788
+ "idolise": "idolize",
789
+ "idolised": "idolized",
790
+ "idolises": "idolizes",
791
+ "idolising": "idolizing",
792
+ "immobilisation": "immobilization",
793
+ "immobilise": "immobilize",
794
+ "immobilised": "immobilized",
795
+ "immobiliser": "immobilizer",
796
+ "immobilisers": "immobilizers",
797
+ "immobilises": "immobilizes",
798
+ "immobilising": "immobilizing",
799
+ "immortalise": "immortalize",
800
+ "immortalised": "immortalized",
801
+ "immortalises": "immortalizes",
802
+ "immortalising": "immortalizing",
803
+ "immunisation": "immunization",
804
+ "immunise": "immunize",
805
+ "immunised": "immunized",
806
+ "immunises": "immunizes",
807
+ "immunising": "immunizing",
808
+ "impanelled": "impaneled",
809
+ "impanelling": "impaneling",
810
+ "imperilled": "imperiled",
811
+ "imperilling": "imperiling",
812
+ "individualise": "individualize",
813
+ "individualised": "individualized",
814
+ "individualises": "individualizes",
815
+ "individualising": "individualizing",
816
+ "industrialise": "industrialize",
817
+ "industrialised": "industrialized",
818
+ "industrialises": "industrializes",
819
+ "industrialising": "industrializing",
820
+ "inflexion": "inflection",
821
+ "inflexions": "inflections",
822
+ "initialise": "initialize",
823
+ "initialised": "initialized",
824
+ "initialises": "initializes",
825
+ "initialising": "initializing",
826
+ "initialled": "initialed",
827
+ "initialling": "initialing",
828
+ "instal": "install",
829
+ "instalment": "installment",
830
+ "instalments": "installments",
831
+ "instals": "installs",
832
+ "instil": "instill",
833
+ "instils": "instills",
834
+ "institutionalisation": "institutionalization",
835
+ "institutionalise": "institutionalize",
836
+ "institutionalised": "institutionalized",
837
+ "institutionalises": "institutionalizes",
838
+ "institutionalising": "institutionalizing",
839
+ "intellectualise": "intellectualize",
840
+ "intellectualised": "intellectualized",
841
+ "intellectualises": "intellectualizes",
842
+ "intellectualising": "intellectualizing",
843
+ "internalisation": "internalization",
844
+ "internalise": "internalize",
845
+ "internalised": "internalized",
846
+ "internalises": "internalizes",
847
+ "internalising": "internalizing",
848
+ "internationalisation": "internationalization",
849
+ "internationalise": "internationalize",
850
+ "internationalised": "internationalized",
851
+ "internationalises": "internationalizes",
852
+ "internationalising": "internationalizing",
853
+ "ionisation": "ionization",
854
+ "ionise": "ionize",
855
+ "ionised": "ionized",
856
+ "ioniser": "ionizer",
857
+ "ionisers": "ionizers",
858
+ "ionises": "ionizes",
859
+ "ionising": "ionizing",
860
+ "italicise": "italicize",
861
+ "italicised": "italicized",
862
+ "italicises": "italicizes",
863
+ "italicising": "italicizing",
864
+ "itemise": "itemize",
865
+ "itemised": "itemized",
866
+ "itemises": "itemizes",
867
+ "itemising": "itemizing",
868
+ "jeopardise": "jeopardize",
869
+ "jeopardised": "jeopardized",
870
+ "jeopardises": "jeopardizes",
871
+ "jeopardising": "jeopardizing",
872
+ "jewelled": "jeweled",
873
+ "jeweller": "jeweler",
874
+ "jewellers": "jewelers",
875
+ "jewellery": "jewelry",
876
+ "judgement": "judgment",
877
+ "kilogramme": "kilogram",
878
+ "kilogrammes": "kilograms",
879
+ "kilometre": "kilometer",
880
+ "kilometres": "kilometers",
881
+ "labelled": "labeled",
882
+ "labelling": "labeling",
883
+ "labour": "labor",
884
+ "laboured": "labored",
885
+ "labourer": "laborer",
886
+ "labourers": "laborers",
887
+ "labouring": "laboring",
888
+ "labours": "labors",
889
+ "lacklustre": "lackluster",
890
+ "legalisation": "legalization",
891
+ "legalise": "legalize",
892
+ "legalised": "legalized",
893
+ "legalises": "legalizes",
894
+ "legalising": "legalizing",
895
+ "legitimise": "legitimize",
896
+ "legitimised": "legitimized",
897
+ "legitimises": "legitimizes",
898
+ "legitimising": "legitimizing",
899
+ "leukaemia": "leukemia",
900
+ "levelled": "leveled",
901
+ "leveller": "leveler",
902
+ "levellers": "levelers",
903
+ "levelling": "leveling",
904
+ "libelled": "libeled",
905
+ "libelling": "libeling",
906
+ "libellous": "libelous",
907
+ "liberalisation": "liberalization",
908
+ "liberalise": "liberalize",
909
+ "liberalised": "liberalized",
910
+ "liberalises": "liberalizes",
911
+ "liberalising": "liberalizing",
912
+ "licence": "license",
913
+ "licenced": "licensed",
914
+ "licences": "licenses",
915
+ "licencing": "licensing",
916
+ "likeable": "likable",
917
+ "lionisation": "lionization",
918
+ "lionise": "lionize",
919
+ "lionised": "lionized",
920
+ "lionises": "lionizes",
921
+ "lionising": "lionizing",
922
+ "liquidise": "liquidize",
923
+ "liquidised": "liquidized",
924
+ "liquidiser": "liquidizer",
925
+ "liquidisers": "liquidizers",
926
+ "liquidises": "liquidizes",
927
+ "liquidising": "liquidizing",
928
+ "litre": "liter",
929
+ "litres": "liters",
930
+ "localise": "localize",
931
+ "localised": "localized",
932
+ "localises": "localizes",
933
+ "localising": "localizing",
934
+ "louvre": "louver",
935
+ "louvred": "louvered",
936
+ "louvres": "louvers",
937
+ "lustre": "luster",
938
+ "magnetise": "magnetize",
939
+ "magnetised": "magnetized",
940
+ "magnetises": "magnetizes",
941
+ "magnetising": "magnetizing",
942
+ "manoeuvrability": "maneuverability",
943
+ "manoeuvrable": "maneuverable",
944
+ "manoeuvre": "maneuver",
945
+ "manoeuvred": "maneuvered",
946
+ "manoeuvres": "maneuvers",
947
+ "manoeuvring": "maneuvering",
948
+ "manoeuvrings": "maneuverings",
949
+ "marginalisation": "marginalization",
950
+ "marginalise": "marginalize",
951
+ "marginalised": "marginalized",
952
+ "marginalises": "marginalizes",
953
+ "marginalising": "marginalizing",
954
+ "marshalled": "marshaled",
955
+ "marshalling": "marshaling",
956
+ "marvelled": "marveled",
957
+ "marvelling": "marveling",
958
+ "marvellous": "marvelous",
959
+ "marvellously": "marvelously",
960
+ "materialisation": "materialization",
961
+ "materialise": "materialize",
962
+ "materialised": "materialized",
963
+ "materialises": "materializes",
964
+ "materialising": "materializing",
965
+ "maximisation": "maximization",
966
+ "maximise": "maximize",
967
+ "maximised": "maximized",
968
+ "maximises": "maximizes",
969
+ "maximising": "maximizing",
970
+ "meagre": "meager",
971
+ "mechanisation": "mechanization",
972
+ "mechanise": "mechanize",
973
+ "mechanised": "mechanized",
974
+ "mechanises": "mechanizes",
975
+ "mechanising": "mechanizing",
976
+ "mediaeval": "medieval",
977
+ "memorialise": "memorialize",
978
+ "memorialised": "memorialized",
979
+ "memorialises": "memorializes",
980
+ "memorialising": "memorializing",
981
+ "memorise": "memorize",
982
+ "memorised": "memorized",
983
+ "memorises": "memorizes",
984
+ "memorising": "memorizing",
985
+ "mesmerise": "mesmerize",
986
+ "mesmerised": "mesmerized",
987
+ "mesmerises": "mesmerizes",
988
+ "mesmerising": "mesmerizing",
989
+ "metabolise": "metabolize",
990
+ "metabolised": "metabolized",
991
+ "metabolises": "metabolizes",
992
+ "metabolising": "metabolizing",
993
+ "metre": "meter",
994
+ "metres": "meters",
995
+ "mhm": "hmm",
996
+ "micrometre": "micrometer",
997
+ "micrometres": "micrometers",
998
+ "militarise": "militarize",
999
+ "militarised": "militarized",
1000
+ "militarises": "militarizes",
1001
+ "militarising": "militarizing",
1002
+ "milligramme": "milligram",
1003
+ "milligrammes": "milligrams",
1004
+ "millilitre": "milliliter",
1005
+ "millilitres": "milliliters",
1006
+ "millimetre": "millimeter",
1007
+ "millimetres": "millimeters",
1008
+ "miniaturisation": "miniaturization",
1009
+ "miniaturise": "miniaturize",
1010
+ "miniaturised": "miniaturized",
1011
+ "miniaturises": "miniaturizes",
1012
+ "miniaturising": "miniaturizing",
1013
+ "minibusses": "minibuses",
1014
+ "minimise": "minimize",
1015
+ "minimised": "minimized",
1016
+ "minimises": "minimizes",
1017
+ "minimising": "minimizing",
1018
+ "misbehaviour": "misbehavior",
1019
+ "misdemeanour": "misdemeanor",
1020
+ "misdemeanours": "misdemeanors",
1021
+ "misspelt": "misspelled",
1022
+ "mitre": "miter",
1023
+ "mitres": "miters",
1024
+ "mm": "hmm",
1025
+ "mmm": "hmm",
1026
+ "mobilisation": "mobilization",
1027
+ "mobilise": "mobilize",
1028
+ "mobilised": "mobilized",
1029
+ "mobilises": "mobilizes",
1030
+ "mobilising": "mobilizing",
1031
+ "modelled": "modeled",
1032
+ "modeller": "modeler",
1033
+ "modellers": "modelers",
1034
+ "modelling": "modeling",
1035
+ "modernise": "modernize",
1036
+ "modernised": "modernized",
1037
+ "modernises": "modernizes",
1038
+ "modernising": "modernizing",
1039
+ "moisturise": "moisturize",
1040
+ "moisturised": "moisturized",
1041
+ "moisturiser": "moisturizer",
1042
+ "moisturisers": "moisturizers",
1043
+ "moisturises": "moisturizes",
1044
+ "moisturising": "moisturizing",
1045
+ "monologue": "monolog",
1046
+ "monologues": "monologs",
1047
+ "monopolisation": "monopolization",
1048
+ "monopolise": "monopolize",
1049
+ "monopolised": "monopolized",
1050
+ "monopolises": "monopolizes",
1051
+ "monopolising": "monopolizing",
1052
+ "moralise": "moralize",
1053
+ "moralised": "moralized",
1054
+ "moralises": "moralizes",
1055
+ "moralising": "moralizing",
1056
+ "motorised": "motorized",
1057
+ "mould": "mold",
1058
+ "moulded": "molded",
1059
+ "moulder": "molder",
1060
+ "mouldered": "moldered",
1061
+ "mouldering": "moldering",
1062
+ "moulders": "molders",
1063
+ "mouldier": "moldier",
1064
+ "mouldiest": "moldiest",
1065
+ "moulding": "molding",
1066
+ "mouldings": "moldings",
1067
+ "moulds": "molds",
1068
+ "mouldy": "moldy",
1069
+ "moult": "molt",
1070
+ "moulted": "molted",
1071
+ "moulting": "molting",
1072
+ "moults": "molts",
1073
+ "moustache": "mustache",
1074
+ "moustached": "mustached",
1075
+ "moustaches": "mustaches",
1076
+ "moustachioed": "mustachioed",
1077
+ "multicoloured": "multicolored",
1078
+ "nationalisation": "nationalization",
1079
+ "nationalisations": "nationalizations",
1080
+ "nationalise": "nationalize",
1081
+ "nationalised": "nationalized",
1082
+ "nationalises": "nationalizes",
1083
+ "nationalising": "nationalizing",
1084
+ "naturalisation": "naturalization",
1085
+ "naturalise": "naturalize",
1086
+ "naturalised": "naturalized",
1087
+ "naturalises": "naturalizes",
1088
+ "naturalising": "naturalizing",
1089
+ "neighbour": "neighbor",
1090
+ "neighbourhood": "neighborhood",
1091
+ "neighbourhoods": "neighborhoods",
1092
+ "neighbouring": "neighboring",
1093
+ "neighbourliness": "neighborliness",
1094
+ "neighbourly": "neighborly",
1095
+ "neighbours": "neighbors",
1096
+ "neutralisation": "neutralization",
1097
+ "neutralise": "neutralize",
1098
+ "neutralised": "neutralized",
1099
+ "neutralises": "neutralizes",
1100
+ "neutralising": "neutralizing",
1101
+ "normalisation": "normalization",
1102
+ "normalise": "normalize",
1103
+ "normalised": "normalized",
1104
+ "normalises": "normalizes",
1105
+ "normalising": "normalizing",
1106
+ "odour": "odor",
1107
+ "odourless": "odorless",
1108
+ "odours": "odors",
1109
+ "oesophagus": "esophagus",
1110
+ "oesophaguses": "esophaguses",
1111
+ "oestrogen": "estrogen",
1112
+ "offence": "offense",
1113
+ "offences": "offenses",
1114
+ "omelette": "omelet",
1115
+ "omelettes": "omelets",
1116
+ "optimise": "optimize",
1117
+ "optimised": "optimized",
1118
+ "optimises": "optimizes",
1119
+ "optimising": "optimizing",
1120
+ "organisation": "organization",
1121
+ "organisational": "organizational",
1122
+ "organisations": "organizations",
1123
+ "organise": "organize",
1124
+ "organised": "organized",
1125
+ "organiser": "organizer",
1126
+ "organisers": "organizers",
1127
+ "organises": "organizes",
1128
+ "organising": "organizing",
1129
+ "orthopaedic": "orthopedic",
1130
+ "orthopaedics": "orthopedics",
1131
+ "ostracise": "ostracize",
1132
+ "ostracised": "ostracized",
1133
+ "ostracises": "ostracizes",
1134
+ "ostracising": "ostracizing",
1135
+ "outmanoeuvre": "outmaneuver",
1136
+ "outmanoeuvred": "outmaneuvered",
1137
+ "outmanoeuvres": "outmaneuvers",
1138
+ "outmanoeuvring": "outmaneuvering",
1139
+ "overemphasise": "overemphasize",
1140
+ "overemphasised": "overemphasized",
1141
+ "overemphasises": "overemphasizes",
1142
+ "overemphasising": "overemphasizing",
1143
+ "oxidisation": "oxidization",
1144
+ "oxidise": "oxidize",
1145
+ "oxidised": "oxidized",
1146
+ "oxidises": "oxidizes",
1147
+ "oxidising": "oxidizing",
1148
+ "paederast": "pederast",
1149
+ "paederasts": "pederasts",
1150
+ "paediatric": "pediatric",
1151
+ "paediatrician": "pediatrician",
1152
+ "paediatricians": "pediatricians",
1153
+ "paediatrics": "pediatrics",
1154
+ "paedophile": "pedophile",
1155
+ "paedophiles": "pedophiles",
1156
+ "paedophilia": "pedophilia",
1157
+ "palaeolithic": "paleolithic",
1158
+ "palaeontologist": "paleontologist",
1159
+ "palaeontologists": "paleontologists",
1160
+ "palaeontology": "paleontology",
1161
+ "panelled": "paneled",
1162
+ "panelling": "paneling",
1163
+ "panellist": "panelist",
1164
+ "panellists": "panelists",
1165
+ "paralyse": "paralyze",
1166
+ "paralysed": "paralyzed",
1167
+ "paralyses": "paralyzes",
1168
+ "paralysing": "paralyzing",
1169
+ "parcelled": "parceled",
1170
+ "parcelling": "parceling",
1171
+ "parlour": "parlor",
1172
+ "parlours": "parlors",
1173
+ "particularise": "particularize",
1174
+ "particularised": "particularized",
1175
+ "particularises": "particularizes",
1176
+ "particularising": "particularizing",
1177
+ "passivisation": "passivization",
1178
+ "passivise": "passivize",
1179
+ "passivised": "passivized",
1180
+ "passivises": "passivizes",
1181
+ "passivising": "passivizing",
1182
+ "pasteurisation": "pasteurization",
1183
+ "pasteurise": "pasteurize",
1184
+ "pasteurised": "pasteurized",
1185
+ "pasteurises": "pasteurizes",
1186
+ "pasteurising": "pasteurizing",
1187
+ "patronise": "patronize",
1188
+ "patronised": "patronized",
1189
+ "patronises": "patronizes",
1190
+ "patronising": "patronizing",
1191
+ "patronisingly": "patronizingly",
1192
+ "pedalled": "pedaled",
1193
+ "pedalling": "pedaling",
1194
+ "pedestrianisation": "pedestrianization",
1195
+ "pedestrianise": "pedestrianize",
1196
+ "pedestrianised": "pedestrianized",
1197
+ "pedestrianises": "pedestrianizes",
1198
+ "pedestrianising": "pedestrianizing",
1199
+ "penalise": "penalize",
1200
+ "penalised": "penalized",
1201
+ "penalises": "penalizes",
1202
+ "penalising": "penalizing",
1203
+ "pencilled": "penciled",
1204
+ "pencilling": "penciling",
1205
+ "personalise": "personalize",
1206
+ "personalised": "personalized",
1207
+ "personalises": "personalizes",
1208
+ "personalising": "personalizing",
1209
+ "pharmacopoeia": "pharmacopeia",
1210
+ "pharmacopoeias": "pharmacopeias",
1211
+ "philosophise": "philosophize",
1212
+ "philosophised": "philosophized",
1213
+ "philosophises": "philosophizes",
1214
+ "philosophising": "philosophizing",
1215
+ "philtre": "philter",
1216
+ "philtres": "philters",
1217
+ "phoney": "phony",
1218
+ "plagiarise": "plagiarize",
1219
+ "plagiarised": "plagiarized",
1220
+ "plagiarises": "plagiarizes",
1221
+ "plagiarising": "plagiarizing",
1222
+ "plough": "plow",
1223
+ "ploughed": "plowed",
1224
+ "ploughing": "plowing",
1225
+ "ploughman": "plowman",
1226
+ "ploughmen": "plowmen",
1227
+ "ploughs": "plows",
1228
+ "ploughshare": "plowshare",
1229
+ "ploughshares": "plowshares",
1230
+ "polarisation": "polarization",
1231
+ "polarise": "polarize",
1232
+ "polarised": "polarized",
1233
+ "polarises": "polarizes",
1234
+ "polarising": "polarizing",
1235
+ "politicisation": "politicization",
1236
+ "politicise": "politicize",
1237
+ "politicised": "politicized",
1238
+ "politicises": "politicizes",
1239
+ "politicising": "politicizing",
1240
+ "popularisation": "popularization",
1241
+ "popularise": "popularize",
1242
+ "popularised": "popularized",
1243
+ "popularises": "popularizes",
1244
+ "popularising": "popularizing",
1245
+ "pouffe": "pouf",
1246
+ "pouffes": "poufs",
1247
+ "practise": "practice",
1248
+ "practised": "practiced",
1249
+ "practises": "practices",
1250
+ "practising": "practicing",
1251
+ "praesidium": "presidium",
1252
+ "praesidiums": "presidiums",
1253
+ "pressurisation": "pressurization",
1254
+ "pressurise": "pressurize",
1255
+ "pressurised": "pressurized",
1256
+ "pressurises": "pressurizes",
1257
+ "pressurising": "pressurizing",
1258
+ "pretence": "pretense",
1259
+ "pretences": "pretenses",
1260
+ "primaeval": "primeval",
1261
+ "prioritisation": "prioritization",
1262
+ "prioritise": "prioritize",
1263
+ "prioritised": "prioritized",
1264
+ "prioritises": "prioritizes",
1265
+ "prioritising": "prioritizing",
1266
+ "privatisation": "privatization",
1267
+ "privatisations": "privatizations",
1268
+ "privatise": "privatize",
1269
+ "privatised": "privatized",
1270
+ "privatises": "privatizes",
1271
+ "privatising": "privatizing",
1272
+ "professionalisation": "professionalization",
1273
+ "professionalise": "professionalize",
1274
+ "professionalised": "professionalized",
1275
+ "professionalises": "professionalizes",
1276
+ "professionalising": "professionalizing",
1277
+ "programme": "program",
1278
+ "programmes": "programs",
1279
+ "prologue": "prolog",
1280
+ "prologues": "prologs",
1281
+ "propagandise": "propagandize",
1282
+ "propagandised": "propagandized",
1283
+ "propagandises": "propagandizes",
1284
+ "propagandising": "propagandizing",
1285
+ "proselytise": "proselytize",
1286
+ "proselytised": "proselytized",
1287
+ "proselytiser": "proselytizer",
1288
+ "proselytisers": "proselytizers",
1289
+ "proselytises": "proselytizes",
1290
+ "proselytising": "proselytizing",
1291
+ "psychoanalyse": "psychoanalyze",
1292
+ "psychoanalysed": "psychoanalyzed",
1293
+ "psychoanalyses": "psychoanalyzes",
1294
+ "psychoanalysing": "psychoanalyzing",
1295
+ "publicise": "publicize",
1296
+ "publicised": "publicized",
1297
+ "publicises": "publicizes",
1298
+ "publicising": "publicizing",
1299
+ "pulverisation": "pulverization",
1300
+ "pulverise": "pulverize",
1301
+ "pulverised": "pulverized",
1302
+ "pulverises": "pulverizes",
1303
+ "pulverising": "pulverizing",
1304
+ "pummelled": "pummeled",
1305
+ "pummelling": "pummeling",
1306
+ "pyjama": "pajama",
1307
+ "pyjamas": "pajamas",
1308
+ "pzazz": "pizzazz",
1309
+ "quarrelled": "quarreled",
1310
+ "quarrelling": "quarreling",
1311
+ "radicalise": "radicalize",
1312
+ "radicalised": "radicalized",
1313
+ "radicalises": "radicalizes",
1314
+ "radicalising": "radicalizing",
1315
+ "rancour": "rancor",
1316
+ "randomise": "randomize",
1317
+ "randomised": "randomized",
1318
+ "randomises": "randomizes",
1319
+ "randomising": "randomizing",
1320
+ "rationalisation": "rationalization",
1321
+ "rationalisations": "rationalizations",
1322
+ "rationalise": "rationalize",
1323
+ "rationalised": "rationalized",
1324
+ "rationalises": "rationalizes",
1325
+ "rationalising": "rationalizing",
1326
+ "ravelled": "raveled",
1327
+ "ravelling": "raveling",
1328
+ "realisable": "realizable",
1329
+ "realisation": "realization",
1330
+ "realisations": "realizations",
1331
+ "realise": "realize",
1332
+ "realised": "realized",
1333
+ "realises": "realizes",
1334
+ "realising": "realizing",
1335
+ "recognisable": "recognizable",
1336
+ "recognisably": "recognizably",
1337
+ "recognisance": "recognizance",
1338
+ "recognise": "recognize",
1339
+ "recognised": "recognized",
1340
+ "recognises": "recognizes",
1341
+ "recognising": "recognizing",
1342
+ "reconnoitre": "reconnoiter",
1343
+ "reconnoitred": "reconnoitered",
1344
+ "reconnoitres": "reconnoiters",
1345
+ "reconnoitring": "reconnoitering",
1346
+ "refuelled": "refueled",
1347
+ "refuelling": "refueling",
1348
+ "regularisation": "regularization",
1349
+ "regularise": "regularize",
1350
+ "regularised": "regularized",
1351
+ "regularises": "regularizes",
1352
+ "regularising": "regularizing",
1353
+ "remodelled": "remodeled",
1354
+ "remodelling": "remodeling",
1355
+ "remould": "remold",
1356
+ "remoulded": "remolded",
1357
+ "remoulding": "remolding",
1358
+ "remoulds": "remolds",
1359
+ "reorganisation": "reorganization",
1360
+ "reorganisations": "reorganizations",
1361
+ "reorganise": "reorganize",
1362
+ "reorganised": "reorganized",
1363
+ "reorganises": "reorganizes",
1364
+ "reorganising": "reorganizing",
1365
+ "revelled": "reveled",
1366
+ "reveller": "reveler",
1367
+ "revellers": "revelers",
1368
+ "revelling": "reveling",
1369
+ "revitalise": "revitalize",
1370
+ "revitalised": "revitalized",
1371
+ "revitalises": "revitalizes",
1372
+ "revitalising": "revitalizing",
1373
+ "revolutionise": "revolutionize",
1374
+ "revolutionised": "revolutionized",
1375
+ "revolutionises": "revolutionizes",
1376
+ "revolutionising": "revolutionizing",
1377
+ "rhapsodise": "rhapsodize",
1378
+ "rhapsodised": "rhapsodized",
1379
+ "rhapsodises": "rhapsodizes",
1380
+ "rhapsodising": "rhapsodizing",
1381
+ "rigour": "rigor",
1382
+ "rigours": "rigors",
1383
+ "ritualised": "ritualized",
1384
+ "rivalled": "rivaled",
1385
+ "rivalling": "rivaling",
1386
+ "romanticise": "romanticize",
1387
+ "romanticised": "romanticized",
1388
+ "romanticises": "romanticizes",
1389
+ "romanticising": "romanticizing",
1390
+ "rumour": "rumor",
1391
+ "rumoured": "rumored",
1392
+ "rumours": "rumors",
1393
+ "sabre": "saber",
1394
+ "sabres": "sabers",
1395
+ "saltpetre": "saltpeter",
1396
+ "sanitise": "sanitize",
1397
+ "sanitised": "sanitized",
1398
+ "sanitises": "sanitizes",
1399
+ "sanitising": "sanitizing",
1400
+ "satirise": "satirize",
1401
+ "satirised": "satirized",
1402
+ "satirises": "satirizes",
1403
+ "satirising": "satirizing",
1404
+ "saviour": "savior",
1405
+ "saviours": "saviors",
1406
+ "savour": "savor",
1407
+ "savoured": "savored",
1408
+ "savouries": "savories",
1409
+ "savouring": "savoring",
1410
+ "savours": "savors",
1411
+ "savoury": "savory",
1412
+ "scandalise": "scandalize",
1413
+ "scandalised": "scandalized",
1414
+ "scandalises": "scandalizes",
1415
+ "scandalising": "scandalizing",
1416
+ "sceptic": "skeptic",
1417
+ "sceptical": "skeptical",
1418
+ "sceptically": "skeptically",
1419
+ "scepticism": "skepticism",
1420
+ "sceptics": "skeptics",
1421
+ "sceptre": "scepter",
1422
+ "sceptres": "scepters",
1423
+ "scrutinise": "scrutinize",
1424
+ "scrutinised": "scrutinized",
1425
+ "scrutinises": "scrutinizes",
1426
+ "scrutinising": "scrutinizing",
1427
+ "secularisation": "secularization",
1428
+ "secularise": "secularize",
1429
+ "secularised": "secularized",
1430
+ "secularises": "secularizes",
1431
+ "secularising": "secularizing",
1432
+ "sensationalise": "sensationalize",
1433
+ "sensationalised": "sensationalized",
1434
+ "sensationalises": "sensationalizes",
1435
+ "sensationalising": "sensationalizing",
1436
+ "sensitise": "sensitize",
1437
+ "sensitised": "sensitized",
1438
+ "sensitises": "sensitizes",
1439
+ "sensitising": "sensitizing",
1440
+ "sentimentalise": "sentimentalize",
1441
+ "sentimentalised": "sentimentalized",
1442
+ "sentimentalises": "sentimentalizes",
1443
+ "sentimentalising": "sentimentalizing",
1444
+ "sepulchre": "sepulcher",
1445
+ "sepulchres": "sepulchers",
1446
+ "serialisation": "serialization",
1447
+ "serialisations": "serializations",
1448
+ "serialise": "serialize",
1449
+ "serialised": "serialized",
1450
+ "serialises": "serializes",
1451
+ "serialising": "serializing",
1452
+ "sermonise": "sermonize",
1453
+ "sermonised": "sermonized",
1454
+ "sermonises": "sermonizes",
1455
+ "sermonising": "sermonizing",
1456
+ "sheikh": "sheik",
1457
+ "shovelled": "shoveled",
1458
+ "shovelling": "shoveling",
1459
+ "shrivelled": "shriveled",
1460
+ "shrivelling": "shriveling",
1461
+ "signalise": "signalize",
1462
+ "signalised": "signalized",
1463
+ "signalises": "signalizes",
1464
+ "signalising": "signalizing",
1465
+ "signalled": "signaled",
1466
+ "signalling": "signaling",
1467
+ "smoulder": "smolder",
1468
+ "smouldered": "smoldered",
1469
+ "smouldering": "smoldering",
1470
+ "smoulders": "smolders",
1471
+ "snivelled": "sniveled",
1472
+ "snivelling": "sniveling",
1473
+ "snorkelled": "snorkeled",
1474
+ "snorkelling": "snorkeling",
1475
+ "snowplough": "snowplow",
1476
+ "snowploughs": "snowplows",
1477
+ "socialisation": "socialization",
1478
+ "socialise": "socialize",
1479
+ "socialised": "socialized",
1480
+ "socialises": "socializes",
1481
+ "socialising": "socializing",
1482
+ "sodomise": "sodomize",
1483
+ "sodomised": "sodomized",
1484
+ "sodomises": "sodomizes",
1485
+ "sodomising": "sodomizing",
1486
+ "solemnise": "solemnize",
1487
+ "solemnised": "solemnized",
1488
+ "solemnises": "solemnizes",
1489
+ "solemnising": "solemnizing",
1490
+ "sombre": "somber",
1491
+ "specialisation": "specialization",
1492
+ "specialisations": "specializations",
1493
+ "specialise": "specialize",
1494
+ "specialised": "specialized",
1495
+ "specialises": "specializes",
1496
+ "specialising": "specializing",
1497
+ "spectre": "specter",
1498
+ "spectres": "specters",
1499
+ "spiralled": "spiraled",
1500
+ "spiralling": "spiraling",
1501
+ "splendour": "splendor",
1502
+ "splendours": "splendors",
1503
+ "squirrelled": "squirreled",
1504
+ "squirrelling": "squirreling",
1505
+ "stabilisation": "stabilization",
1506
+ "stabilise": "stabilize",
1507
+ "stabilised": "stabilized",
1508
+ "stabiliser": "stabilizer",
1509
+ "stabilisers": "stabilizers",
1510
+ "stabilises": "stabilizes",
1511
+ "stabilising": "stabilizing",
1512
+ "standardisation": "standardization",
1513
+ "standardise": "standardize",
1514
+ "standardised": "standardized",
1515
+ "standardises": "standardizes",
1516
+ "standardising": "standardizing",
1517
+ "stencilled": "stenciled",
1518
+ "stencilling": "stenciling",
1519
+ "sterilisation": "sterilization",
1520
+ "sterilisations": "sterilizations",
1521
+ "sterilise": "sterilize",
1522
+ "sterilised": "sterilized",
1523
+ "steriliser": "sterilizer",
1524
+ "sterilisers": "sterilizers",
1525
+ "sterilises": "sterilizes",
1526
+ "sterilising": "sterilizing",
1527
+ "stigmatisation": "stigmatization",
1528
+ "stigmatise": "stigmatize",
1529
+ "stigmatised": "stigmatized",
1530
+ "stigmatises": "stigmatizes",
1531
+ "stigmatising": "stigmatizing",
1532
+ "storey": "story",
1533
+ "storeys": "stories",
1534
+ "subsidisation": "subsidization",
1535
+ "subsidise": "subsidize",
1536
+ "subsidised": "subsidized",
1537
+ "subsidiser": "subsidizer",
1538
+ "subsidisers": "subsidizers",
1539
+ "subsidises": "subsidizes",
1540
+ "subsidising": "subsidizing",
1541
+ "succour": "succor",
1542
+ "succoured": "succored",
1543
+ "succouring": "succoring",
1544
+ "succours": "succors",
1545
+ "sulphate": "sulfate",
1546
+ "sulphates": "sulfates",
1547
+ "sulphide": "sulfide",
1548
+ "sulphides": "sulfides",
1549
+ "sulphur": "sulfur",
1550
+ "sulphurous": "sulfurous",
1551
+ "summarise": "summarize",
1552
+ "summarised": "summarized",
1553
+ "summarises": "summarizes",
1554
+ "summarising": "summarizing",
1555
+ "swivelled": "swiveled",
1556
+ "swivelling": "swiveling",
1557
+ "symbolise": "symbolize",
1558
+ "symbolised": "symbolized",
1559
+ "symbolises": "symbolizes",
1560
+ "symbolising": "symbolizing",
1561
+ "sympathise": "sympathize",
1562
+ "sympathised": "sympathized",
1563
+ "sympathiser": "sympathizer",
1564
+ "sympathisers": "sympathizers",
1565
+ "sympathises": "sympathizes",
1566
+ "sympathising": "sympathizing",
1567
+ "synchronisation": "synchronization",
1568
+ "synchronise": "synchronize",
1569
+ "synchronised": "synchronized",
1570
+ "synchronises": "synchronizes",
1571
+ "synchronising": "synchronizing",
1572
+ "synthesise": "synthesize",
1573
+ "synthesised": "synthesized",
1574
+ "synthesiser": "synthesizer",
1575
+ "synthesisers": "synthesizers",
1576
+ "synthesises": "synthesizes",
1577
+ "synthesising": "synthesizing",
1578
+ "syphon": "siphon",
1579
+ "syphoned": "siphoned",
1580
+ "syphoning": "siphoning",
1581
+ "syphons": "siphons",
1582
+ "systematisation": "systematization",
1583
+ "systematise": "systematize",
1584
+ "systematised": "systematized",
1585
+ "systematises": "systematizes",
1586
+ "systematising": "systematizing",
1587
+ "tantalise": "tantalize",
1588
+ "tantalised": "tantalized",
1589
+ "tantalises": "tantalizes",
1590
+ "tantalising": "tantalizing",
1591
+ "tantalisingly": "tantalizingly",
1592
+ "tasselled": "tasseled",
1593
+ "technicolour": "technicolor",
1594
+ "temporise": "temporize",
1595
+ "temporised": "temporized",
1596
+ "temporises": "temporizes",
1597
+ "temporising": "temporizing",
1598
+ "tenderise": "tenderize",
1599
+ "tenderised": "tenderized",
1600
+ "tenderises": "tenderizes",
1601
+ "tenderising": "tenderizing",
1602
+ "terrorise": "terrorize",
1603
+ "terrorised": "terrorized",
1604
+ "terrorises": "terrorizes",
1605
+ "terrorising": "terrorizing",
1606
+ "theatre": "theater",
1607
+ "theatregoer": "theatergoer",
1608
+ "theatregoers": "theatergoers",
1609
+ "theatres": "theaters",
1610
+ "theorise": "theorize",
1611
+ "theorised": "theorized",
1612
+ "theorises": "theorizes",
1613
+ "theorising": "theorizing",
1614
+ "tonne": "ton",
1615
+ "tonnes": "tons",
1616
+ "towelled": "toweled",
1617
+ "towelling": "toweling",
1618
+ "toxaemia": "toxemia",
1619
+ "tranquillise": "tranquilize",
1620
+ "tranquillised": "tranquilized",
1621
+ "tranquilliser": "tranquilizer",
1622
+ "tranquillisers": "tranquilizers",
1623
+ "tranquillises": "tranquilizes",
1624
+ "tranquillising": "tranquilizing",
1625
+ "tranquillity": "tranquility",
1626
+ "tranquillize": "tranquilize",
1627
+ "tranquillized": "tranquilized",
1628
+ "tranquillizer": "tranquilizer",
1629
+ "tranquillizers": "tranquilizers",
1630
+ "tranquillizes": "tranquilizes",
1631
+ "tranquillizing": "tranquilizing",
1632
+ "tranquilly": "tranquility",
1633
+ "transistorised": "transistorized",
1634
+ "traumatise": "traumatize",
1635
+ "traumatised": "traumatized",
1636
+ "traumatises": "traumatizes",
1637
+ "traumatising": "traumatizing",
1638
+ "travelled": "traveled",
1639
+ "traveller": "traveler",
1640
+ "travellers": "travelers",
1641
+ "travelling": "traveling",
1642
+ "travelog": "travelogue",
1643
+ "travelogs": "travelogues",
1644
+ "trialled": "trialed",
1645
+ "trialling": "trialing",
1646
+ "tricolour": "tricolor",
1647
+ "tricolours": "tricolors",
1648
+ "trivialise": "trivialize",
1649
+ "trivialised": "trivialized",
1650
+ "trivialises": "trivializes",
1651
+ "trivialising": "trivializing",
1652
+ "tumour": "tumor",
1653
+ "tumours": "tumors",
1654
+ "tunnelled": "tunneled",
1655
+ "tunnelling": "tunneling",
1656
+ "tyrannise": "tyrannize",
1657
+ "tyrannised": "tyrannized",
1658
+ "tyrannises": "tyrannizes",
1659
+ "tyrannising": "tyrannizing",
1660
+ "tyre": "tire",
1661
+ "tyres": "tires",
1662
+ "unauthorised": "unauthorized",
1663
+ "uncivilised": "uncivilized",
1664
+ "underutilised": "underutilized",
1665
+ "unequalled": "unequaled",
1666
+ "unfavourable": "unfavorable",
1667
+ "unfavourably": "unfavorably",
1668
+ "unionisation": "unionization",
1669
+ "unionise": "unionize",
1670
+ "unionised": "unionized",
1671
+ "unionises": "unionizes",
1672
+ "unionising": "unionizing",
1673
+ "unorganised": "unorganized",
1674
+ "unravelled": "unraveled",
1675
+ "unravelling": "unraveling",
1676
+ "unrecognisable": "unrecognizable",
1677
+ "unrecognised": "unrecognized",
1678
+ "unrivalled": "unrivaled",
1679
+ "unsavoury": "unsavory",
1680
+ "untrammelled": "untrammeled",
1681
+ "urbanisation": "urbanization",
1682
+ "urbanise": "urbanize",
1683
+ "urbanised": "urbanized",
1684
+ "urbanises": "urbanizes",
1685
+ "urbanising": "urbanizing",
1686
+ "utilisable": "utilizable",
1687
+ "utilisation": "utilization",
1688
+ "utilise": "utilize",
1689
+ "utilised": "utilized",
1690
+ "utilises": "utilizes",
1691
+ "utilising": "utilizing",
1692
+ "valour": "valor",
1693
+ "vandalise": "vandalize",
1694
+ "vandalised": "vandalized",
1695
+ "vandalises": "vandalizes",
1696
+ "vandalising": "vandalizing",
1697
+ "vaporisation": "vaporization",
1698
+ "vaporise": "vaporize",
1699
+ "vaporised": "vaporized",
1700
+ "vaporises": "vaporizes",
1701
+ "vaporising": "vaporizing",
1702
+ "vapour": "vapor",
1703
+ "vapours": "vapors",
1704
+ "verbalise": "verbalize",
1705
+ "verbalised": "verbalized",
1706
+ "verbalises": "verbalizes",
1707
+ "verbalising": "verbalizing",
1708
+ "victimisation": "victimization",
1709
+ "victimise": "victimize",
1710
+ "victimised": "victimized",
1711
+ "victimises": "victimizes",
1712
+ "victimising": "victimizing",
1713
+ "videodisc": "videodisk",
1714
+ "videodiscs": "videodisks",
1715
+ "vigour": "vigor",
1716
+ "visualisation": "visualization",
1717
+ "visualisations": "visualizations",
1718
+ "visualise": "visualize",
1719
+ "visualised": "visualized",
1720
+ "visualises": "visualizes",
1721
+ "visualising": "visualizing",
1722
+ "vocalisation": "vocalization",
1723
+ "vocalisations": "vocalizations",
1724
+ "vocalise": "vocalize",
1725
+ "vocalised": "vocalized",
1726
+ "vocalises": "vocalizes",
1727
+ "vocalising": "vocalizing",
1728
+ "vulcanised": "vulcanized",
1729
+ "vulgarisation": "vulgarization",
1730
+ "vulgarise": "vulgarize",
1731
+ "vulgarised": "vulgarized",
1732
+ "vulgarises": "vulgarizes",
1733
+ "vulgarising": "vulgarizing",
1734
+ "waggon": "wagon",
1735
+ "waggons": "wagons",
1736
+ "watercolour": "watercolor",
1737
+ "watercolours": "watercolors",
1738
+ "weaselled": "weaseled",
1739
+ "weaselling": "weaseling",
1740
+ "westernisation": "westernization",
1741
+ "westernise": "westernize",
1742
+ "westernised": "westernized",
1743
+ "westernises": "westernizes",
1744
+ "westernising": "westernizing",
1745
+ "womanise": "womanize",
1746
+ "womanised": "womanized",
1747
+ "womaniser": "womanizer",
1748
+ "womanisers": "womanizers",
1749
+ "womanises": "womanizes",
1750
+ "womanising": "womanizing",
1751
+ "woollen": "woolen",
1752
+ "woollens": "woolens",
1753
+ "woollies": "woolies",
1754
+ "woolly": "wooly",
1755
+ "worshipped": "worshiped",
1756
+ "worshipper": "worshiper",
1757
+ "worshipping": "worshiping",
1758
+ "yodelled": "yodeled",
1759
+ "yodelling": "yodeling",
1760
+ "yoghourt": "yogurt",
1761
+ "yoghourts": "yogurts",
1762
+ "yoghurt": "yogurt",
1763
+ "yoghurts": "yogurts",
1764
+ }
1765
+
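A mapping like the one above is typically applied one word at a time. The following is a minimal sketch, assuming simple whitespace tokenization; `SPELLING_MAP` and `americanize` are hypothetical names used only for illustration, and the three entries are copied from the table.

```python
# Hypothetical excerpt of the British -> American table shown above.
SPELLING_MAP = {
    "labour": "labor",
    "organise": "organize",
    "theatre": "theater",
}


def americanize(text: str, mapping: dict) -> str:
    """Replace every whitespace-separated word that has an entry in `mapping`."""
    # Words without an entry are passed through unchanged.
    return " ".join(mapping.get(word, word) for word in text.split())


print(americanize("they organise labour at the theatre", SPELLING_MAP))
```

Because the lookup is an exact match on lowercase keys, a lookup of this kind only works on text that has already been lowercased and stripped of punctuation.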
+ # non-ASCII letters that are not separated by "NFKD" normalization
+ ADDITIONAL_DIACRITICS = {
+     "œ": "oe",
+     "Œ": "OE",
+     "ø": "o",
+     "Ø": "O",
+     "æ": "ae",
+     "Æ": "AE",
+     "ß": "ss",
+     "ẞ": "SS",
+     "đ": "d",
+     "Đ": "D",
+     "ð": "d",
+     "Ð": "D",
+     "þ": "th",
+     "Þ": "th",
+     "ł": "l",
+     "Ł": "L",
+ }
+
+
+ def remove_symbols_and_diacritics(s: str, keep=""):
+     """
+     Replace any other markers, symbols, and punctuation with a space, and drop any diacritics
+     (category 'Mn' and some manual mappings)
+     """
+
+     def replace_character(char):
+         if char in keep:
+             return char
+         elif char in ADDITIONAL_DIACRITICS:
+             return ADDITIONAL_DIACRITICS[char]
+
+         elif unicodedata.category(char) == "Mn":
+             return ""
+
+         elif unicodedata.category(char)[0] in "MSP":
+             return " "
+
+         return char
+
+     return "".join(replace_character(c) for c in unicodedata.normalize("NFKD", s))
+
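The diacritic handling above relies on NFKD decomposition, which splits an accented letter into a base letter plus combining marks of category `Mn`. The sketch below isolates that one step; `strip_combining_marks` is a hypothetical helper written for illustration, not part of the file. It also shows why a manual table is needed for letters such as `ø`, which have no decomposition.

```python
import unicodedata


def strip_combining_marks(s: str) -> str:
    """Decompose with NFKD, then drop nonspacing marks (category 'Mn')."""
    decomposed = unicodedata.normalize("NFKD", s)
    return "".join(c for c in decomposed if unicodedata.category(c) != "Mn")


# "é" decomposes into "e" + U+0301, so the accent can be dropped.
print(strip_combining_marks("café naïve"))

# "ø" has no decomposition, so NFKD leaves it untouched.
print(strip_combining_marks("søster"))
```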
+
+ def remove_symbols(s: str):
+     """
+     Replace any other markers, symbols, and punctuation with a space, keeping diacritics
+     """
+     return "".join(
+         " " if unicodedata.category(c)[0] in "MSP" else c
+         for c in unicodedata.normalize("NFKC", s)
+     )
+
+
+ class BasicTextNormalizer:
1821
+ def __init__(self, remove_diacritics: bool = False, split_letters: bool = False):
1822
+ self.clean = (
1823
+ remove_symbols_and_diacritics if remove_diacritics else remove_symbols
1824
+ )
1825
+ self.split_letters = split_letters
1826
+
1827
+ def __call__(self, s: str):
1828
+ s = s.lower()
1829
+ s = re.sub(r"[<\[][^>\]]*[>\]]", "", s) # remove words between brackets
1830
+ s = re.sub(r"\(([^)]+?)\)", "", s) # remove words between parentheses
1831
+ s = self.clean(s).lower()
1832
+
1833
+ if self.split_letters:
1834
+ s = " ".join(regex.findall(r"\X", s, regex.U))
1835
+
1836
+ s = re.sub(
1837
+ r"\s+", " ", s
1838
+ ) # replace any successive whitespace characters with a space
1839
+
1840
+ return s
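The annotation-stripping step in `__call__` can be tried in isolation. The sketch below reuses the same three regular expressions; the wrapper name `strip_annotations` is an assumption made for illustration. Note that the parenthesis pattern requires at least one character inside, so an empty `()` is left alone.

```python
import re


def strip_annotations(text: str) -> str:
    """Remove bracketed and parenthesized annotations, then collapse whitespace."""
    text = re.sub(r"[<\[][^>\]]*[>\]]", "", text)  # <...> and [...] spans
    text = re.sub(r"\(([^)]+?)\)", "", text)  # (...) spans with non-empty content
    return re.sub(r"\s+", " ", text)  # runs of whitespace become one space


print(strip_annotations("hello [laughter] world (inaudible) <unk> done"))
```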
1841
+
1842
+
1843
+ class EnglishNumberNormalizer:
1844
+ """
1845
+ Convert any spelled-out numbers into arabic numbers, while handling:
1846
+
1847
+ - remove any commas
1848
+ - keep the suffixes such as: `1960s`, `274th`, `32nd`, etc.
1849
+ - spell out currency symbols after the number. e.g. `$20 million` -> `20000000 dollars`
1850
+ - spell out `one` and `ones`
1851
+ - interpret successive single-digit numbers as nominal: `one oh one` -> `101`
1852
+ """
1853
+
1854
+ def __init__(self):
1855
+ super().__init__()
1856
+
1857
+ self.zeros = {"o", "oh", "zero"}
1858
+ # fmt: off
1859
+ self.ones = {
1860
+ name: i
1861
+ for i, name in enumerate(
1862
+ [
1863
+ "one", "two", "three", "four", "five", "six", "seven", "eight", "nine", "ten",
1864
+ "eleven", "twelve", "thirteen", "fourteen", "fifteen", "sixteen", "seventeen",
1865
+ "eighteen", "nineteen"],
1866
+ start=1,
1867
+ )
1868
+ }
1869
+ # fmt: on
1870
+ self.ones_plural = {
1871
+ "sixes" if name == "six" else name + "s": (value, "s")
1872
+ for name, value in self.ones.items()
1873
+ }
1874
+ self.ones_ordinal = {
1875
+ "zeroth": (0, "th"),
1876
+ "first": (1, "st"),
1877
+ "second": (2, "nd"),
1878
+ "third": (3, "rd"),
1879
+ "fifth": (5, "th"),
1880
+ "twelfth": (12, "th"),
1881
+ **{
1882
+ name + ("h" if name.endswith("t") else "th"): (value, "th")
1883
+ for name, value in self.ones.items()
1884
+ if value > 3 and value != 5 and value != 12
1885
+ },
1886
+ }
1887
+ self.ones_suffixed = {**self.ones_plural, **self.ones_ordinal}
1888
+
1889
+ self.tens = {
1890
+ "twenty": 20,
1891
+ "thirty": 30,
1892
+ "forty": 40,
1893
+ "fifty": 50,
1894
+ "sixty": 60,
1895
+ "seventy": 70,
1896
+ "eighty": 80,
1897
+ "ninety": 90,
1898
+ }
1899
+ self.tens_plural = {
1900
+ name.replace("y", "ies"): (value, "s") for name, value in self.tens.items()
1901
+ }
1902
+ self.tens_ordinal = {
1903
+ name.replace("y", "ieth"): (value, "th")
1904
+ for name, value in self.tens.items()
1905
+ }
1906
+ self.tens_suffixed = {**self.tens_plural, **self.tens_ordinal}
1907
+
1908
+ self.multipliers = {
1909
+ "hundred": 100,
1910
+ "thousand": 1_000,
1911
+ "million": 1_000_000,
1912
+ "billion": 1_000_000_000,
1913
+ "trillion": 1_000_000_000_000,
1914
+ "quadrillion": 1_000_000_000_000_000,
1915
+ "quintillion": 1_000_000_000_000_000_000,
1916
+ "sextillion": 1_000_000_000_000_000_000_000,
1917
+ "septillion": 1_000_000_000_000_000_000_000_000,
1918
+ "octillion": 1_000_000_000_000_000_000_000_000_000,
1919
+ "nonillion": 1_000_000_000_000_000_000_000_000_000_000,
1920
+ "decillion": 1_000_000_000_000_000_000_000_000_000_000_000,
1921
+ }
1922
+ self.multipliers_plural = {
1923
+ name + "s": (value, "s") for name, value in self.multipliers.items()
1924
+ }
1925
+ self.multipliers_ordinal = {
1926
+ name + "th": (value, "th") for name, value in self.multipliers.items()
1927
+ }
1928
+ self.multipliers_suffixed = {
1929
+ **self.multipliers_plural,
1930
+ **self.multipliers_ordinal,
1931
+ }
1932
+ self.decimals = {*self.ones, *self.tens, *self.zeros}
1933
+
1934
+ self.preceding_prefixers = {
1935
+ "minus": "-",
1936
+ "negative": "-",
1937
+ "plus": "+",
1938
+ "positive": "+",
1939
+ }
1940
+ self.following_prefixers = {
1941
+ "pound": "£",
1942
+ "pounds": "£",
1943
+ "euro": "€",
1944
+ "euros": "€",
1945
+ "dollar": "$",
1946
+ "dollars": "$",
1947
+ "cent": "¢",
1948
+ "cents": "¢",
1949
+ }
1950
+ self.prefixes = set(
1951
+ list(self.preceding_prefixers.values())
1952
+ + list(self.following_prefixers.values())
1953
+ )
1954
+ self.suffixers = {
1955
+ "per": {"cent": "%"},
1956
+ "percent": "%",
1957
+ }
1958
+ self.specials = {"and", "double", "triple", "point"}
1959
+
1960
+ self.words = {
1961
+ key
1962
+ for mapping in [
1963
+ self.zeros,
1964
+ self.ones,
1965
+ self.ones_suffixed,
1966
+ self.tens,
1967
+ self.tens_suffixed,
1968
+ self.multipliers,
1969
+ self.multipliers_suffixed,
1970
+ self.preceding_prefixers,
1971
+ self.following_prefixers,
1972
+ self.suffixers,
1973
+ self.specials,
1974
+ ]
1975
+ for key in mapping
1976
+ }
1977
+ self.literal_words = {"one", "ones"}
1978
+
1979
+ def process_words(self, words: List[str]) -> Iterator[str]:
1980
+ prefix: Optional[str] = None
1981
+ value: Optional[Union[str, int]] = None
1982
+ skip = False
1983
+
1984
+ def to_fraction(s: str):
1985
+ try:
1986
+ return Fraction(s)
1987
+ except ValueError:
1988
+ return None
1989
+
1990
+ def output(result: Union[str, int]):
1991
+ nonlocal prefix, value
1992
+ result = str(result)
1993
+ if prefix is not None:
1994
+ result = prefix + result
1995
+ value = None
1996
+ prefix = None
1997
+ return result
1998
+
1999
+ if len(words) == 0:
2000
+ return
2001
+
2002
+ for i, current in enumerate(words):
2003
+ prev = words[i - 1] if i != 0 else None
2004
+ next = words[i + 1] if i != len(words) - 1 else None
2005
+ if skip:
2006
+ skip = False
2007
+ continue
2008
+
2009
+ next_is_numeric = next is not None and re.match(r"^\d+(\.\d+)?$", next)
2010
+ has_prefix = current[0] in self.prefixes
2011
+ current_without_prefix = current[1:] if has_prefix else current
2012
+ if re.match(r"^\d+(\.\d+)?$", current_without_prefix):
2013
+ # arabic numbers (potentially with signs and fractions)
2014
+ f = to_fraction(current_without_prefix)
2015
+ if f is None:
2016
+ raise ValueError("Converting the fraction failed")
2017
+
2018
+ if value is not None:
2019
+ if isinstance(value, str) and value.endswith("."):
2020
+ # concatenate decimals / ip address components
2021
+ value = str(value) + str(current)
2022
+ continue
2023
+ else:
2024
+ yield output(value)
2025
+
2026
+ prefix = current[0] if has_prefix else prefix
2027
+ if f.denominator == 1:
2028
+ value = f.numerator # store integers as int
2029
+ else:
2030
+ value = current_without_prefix
2031
+ elif current not in self.words:
2032
+ # non-numeric words
2033
+ if value is not None:
2034
+ yield output(value)
2035
+ yield output(current)
2036
+ elif current in self.zeros:
2037
+ value = str(value or "") + "0"
2038
+ elif current in self.ones:
2039
+ ones = self.ones[current]
2040
+
2041
+ if value is None:
2042
+ value = ones
2043
+ elif isinstance(value, str) or prev in self.ones:
2044
+ if (
2045
+ prev in self.tens and ones < 10
2046
+ ): # replace the last zero with the digit
2047
+ value = value[:-1] + str(ones)
2048
+ else:
2049
+ value = str(value) + str(ones)
2050
+ elif ones < 10:
2051
+ if value % 10 == 0:
2052
+ value += ones
2053
+ else:
2054
+ value = str(value) + str(ones)
2055
+ else: # eleven to nineteen
2056
+ if value % 100 == 0:
2057
+ value += ones
2058
+ else:
2059
+ value = str(value) + str(ones)
2060
+ elif current in self.ones_suffixed:
2061
+ # ordinal or cardinal; yield the number right away
2062
+ ones, suffix = self.ones_suffixed[current]
2063
+ if value is None:
2064
+ yield output(str(ones) + suffix)
2065
+ elif isinstance(value, str) or prev in self.ones:
2066
+ if prev in self.tens and ones < 10:
2067
+ yield output(value[:-1] + str(ones) + suffix)
2068
+ else:
2069
+ yield output(str(value) + str(ones) + suffix)
2070
+ elif ones < 10:
2071
+ if value % 10 == 0:
2072
+ yield output(str(value + ones) + suffix)
2073
+ else:
2074
+ yield output(str(value) + str(ones) + suffix)
2075
+ else: # eleven to nineteen
2076
+ if value % 100 == 0:
2077
+ yield output(str(value + ones) + suffix)
2078
+ else:
2079
+ yield output(str(value) + str(ones) + suffix)
2080
+ value = None
2081
+ elif current in self.tens:
2082
+ tens = self.tens[current]
2083
+ if value is None:
2084
+ value = tens
2085
+ elif isinstance(value, str):
2086
+ value = str(value) + str(tens)
2087
+ else:
2088
+ if value % 100 == 0:
2089
+ value += tens
2090
+ else:
2091
+ value = str(value) + str(tens)
2092
+ elif current in self.tens_suffixed:
2093
+ # ordinal or cardinal; yield the number right away
2094
+ tens, suffix = self.tens_suffixed[current]
2095
+ if value is None:
2096
+ yield output(str(tens) + suffix)
2097
+ elif isinstance(value, str):
2098
+ yield output(str(value) + str(tens) + suffix)
2099
+ else:
2100
+ if value % 100 == 0:
2101
+ yield output(str(value + tens) + suffix)
2102
+ else:
2103
+ yield output(str(value) + str(tens) + suffix)
2104
+ elif current in self.multipliers:
2105
+ multiplier = self.multipliers[current]
2106
+ if value is None:
2107
+ value = multiplier
2108
+ elif isinstance(value, str) or value == 0:
2109
+ f = to_fraction(value)
2110
+ p = f * multiplier if f is not None else None
2111
+ if f is not None and p.denominator == 1:
2112
+ value = p.numerator
2113
+ else:
2114
+ yield output(value)
2115
+ value = multiplier
2116
+ else:
2117
+ before = value // 1000 * 1000
2118
+ residual = value % 1000
2119
+ value = before + residual * multiplier
2120
+ elif current in self.multipliers_suffixed:
2121
+ multiplier, suffix = self.multipliers_suffixed[current]
2122
+ if value is None:
2123
+ yield output(str(multiplier) + suffix)
2124
+ elif isinstance(value, str):
2125
+ f = to_fraction(value)
2126
+ p = f * multiplier if f is not None else None
2127
+ if f is not None and p.denominator == 1:
2128
+ yield output(str(p.numerator) + suffix)
2129
+ else:
2130
+ yield output(value)
2131
+ yield output(str(multiplier) + suffix)
2132
+ else: # int
2133
+ before = value // 1000 * 1000
2134
+ residual = value % 1000
2135
+ value = before + residual * multiplier
2136
+ yield output(str(value) + suffix)
2137
+ value = None
2138
+ elif current in self.preceding_prefixers:
2139
+ # apply prefix (positive, minus, etc.) if it precedes a number
2140
+ if value is not None:
2141
+ yield output(value)
2142
+
2143
+ if next in self.words or next_is_numeric:
2144
+ prefix = self.preceding_prefixers[current]
2145
+ else:
2146
+ yield output(current)
2147
+ elif current in self.following_prefixers:
2148
+ # apply prefix (dollars, cents, etc.) only after a number
2149
+ if value is not None:
2150
+ prefix = self.following_prefixers[current]
2151
+ yield output(value)
2152
+ else:
2153
+ yield output(current)
2154
+ elif current in self.suffixers:
2155
+ # apply suffix symbols (percent -> '%')
2156
+ if value is not None:
2157
+ suffix = self.suffixers[current]
2158
+ if isinstance(suffix, dict):
2159
+ if next in suffix:
2160
+ yield output(str(value) + suffix[next])
2161
+ skip = True
2162
+ else:
2163
+ yield output(value)
2164
+ yield output(current)
2165
+ else:
2166
+ yield output(str(value) + suffix)
2167
+ else:
2168
+ yield output(current)
2169
+ elif current in self.specials:
2170
+ if next not in self.words and not next_is_numeric:
2171
+ # apply special handling only if the next word can be numeric
2172
+ if value is not None:
2173
+ yield output(value)
2174
+ yield output(current)
2175
+ elif current == "and":
2176
+ # ignore "and" after hundreds, thousands, etc.
2177
+ if prev not in self.multipliers:
2178
+ if value is not None:
2179
+ yield output(value)
2180
+ yield output(current)
2181
+ elif current == "double" or current == "triple":
2182
+ if next in self.ones or next in self.zeros:
2183
+ repeats = 2 if current == "double" else 3
2184
+ ones = self.ones.get(next, 0)
2185
+ value = str(value or "") + str(ones) * repeats
2186
+ skip = True
2187
+ else:
2188
+ if value is not None:
2189
+ yield output(value)
2190
+ yield output(current)
2191
+ elif current == "point":
2192
+ if next in self.decimals or next_is_numeric:
2193
+ value = str(value or "") + "."
2194
+ else:
2195
+ # should all have been covered at this point
2196
+ raise ValueError(f"Unexpected token: {current}")
2197
+ else:
2198
+ # all should have been covered at this point
2199
+ raise ValueError(f"Unexpected token: {current}")
2200
+
2201
+ if value is not None:
2202
+ yield output(value)
2203
+
2204
+ def preprocess(self, s: str):
2205
+ # replace "<number> and a half" with "<number> point five"
2206
+ results = []
2207
+
2208
+ segments = re.split(r"\band\s+a\s+half\b", s)
2209
+ for i, segment in enumerate(segments):
2210
+ if len(segment.strip()) == 0:
2211
+ continue
2212
+ if i == len(segments) - 1:
2213
+ results.append(segment)
2214
+ else:
2215
+ results.append(segment)
2216
+ last_word = segment.rsplit(maxsplit=2)[-1]
2217
+ if last_word in self.decimals or last_word in self.multipliers:
2218
+ results.append("point five")
2219
+ else:
2220
+ results.append("and a half")
2221
+
2222
+ s = " ".join(results)
2223
+
2224
+ # put a space at number/letter boundary
2225
+ s = re.sub(r"([a-z])([0-9])", r"\1 \2", s)
2226
+ s = re.sub(r"([0-9])([a-z])", r"\1 \2", s)
2227
+
2228
+ # but remove spaces which could be a suffix
2229
+ s = re.sub(r"([0-9])\s+(st|nd|rd|th|s)\b", r"\1\2", s)
2230
+
2231
+ return s
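The last three substitutions in `preprocess` are easier to follow on a concrete string. This is a small sketch with the same patterns; the function name `split_number_letter_boundaries` is hypothetical. Digits and letters are first separated, then ordinal and plural suffixes are glued back onto the number.

```python
import re


def split_number_letter_boundaries(text: str) -> str:
    """Separate digits from letters, but keep suffixes such as 's' or 'nd' attached."""
    text = re.sub(r"([a-z])([0-9])", r"\1 \2", text)  # "b52" -> "b 52"
    text = re.sub(r"([0-9])([a-z])", r"\1 \2", text)  # "32nd" -> "32 nd"
    # Re-attach st/nd/rd/th/s when they directly follow a number.
    return re.sub(r"([0-9])\s+(st|nd|rd|th|s)\b", r"\1\2", text)


print(split_number_letter_boundaries("the 1960s and 32nd b52"))
```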
2232
+
2233
+ def postprocess(self, s: str):
2234
+ def combine_cents(m: Match):
2235
+ try:
2236
+ currency = m.group(1)
2237
+ integer = m.group(2)
2238
+ cents = int(m.group(3))
2239
+ return f"{currency}{integer}.{cents:02d}"
2240
+ except ValueError:
2241
+ return m.group(0)  # leave the matched text unchanged
2242
+
2243
+ def extract_cents(m: Match):
2244
+ try:
2245
+ return f"¢{int(m.group(1))}"
2246
+ except ValueError:
2247
+ return m.group(0)  # leave the matched text unchanged
2248
+
2249
+ # apply currency postprocessing; "$2 and ¢7" -> "$2.07"
2250
+ s = re.sub(r"([€£$])([0-9]+) (?:and )?¢([0-9]{1,2})\b", combine_cents, s)
2251
+ s = re.sub(r"[€£$]0\.([0-9]{1,2})\b", extract_cents, s)
2252
+
2253
+ # write "one(s)" instead of "1(s)", just for the readability
2254
+ s = re.sub(r"\b1(s?)\b", r"one\1", s)
2255
+
2256
+ return s
2257
+
2258
+ def __call__(self, s: str):
2259
+ s = self.preprocess(s)
2260
+ s = " ".join(word for word in self.process_words(s.split()) if word is not None)
2261
+ s = self.postprocess(s)
2262
+
2263
+ return s
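The currency step in `postprocess` can be sketched on its own. The helper name `combine_currency` is an assumption, and the decimal point in the second pattern is escaped here on the assumption that only a literal `.` is meant to match.

```python
import re


def combine_currency(text: str) -> str:
    """Merge '<currency><int> and ¢<cents>' into one amount; turn '<currency>0.xx' into cents."""

    def combine_cents(m: re.Match) -> str:
        return f"{m.group(1)}{m.group(2)}.{int(m.group(3)):02d}"

    def extract_cents(m: re.Match) -> str:
        return f"¢{int(m.group(1))}"

    text = re.sub(r"([€£$])([0-9]+) (?:and )?¢([0-9]{1,2})\b", combine_cents, text)
    text = re.sub(r"[€£$]0\.([0-9]{1,2})\b", extract_cents, text)
    return text


print(combine_currency("$2 and ¢7"))
print(combine_currency("$0.50"))
```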
2264
+
2265
+
2266
+ class EnglishSpellingNormalizer:
2267
+ """
2268
+ Applies British-American spelling mappings as listed in [1].
2269
+
2270
+ [1] https://www.tysto.com/uk-us-spelling-list.html
2271
+ """
2272
+
2273
+ def __init__(self, english_spelling_mapping):
2274
+ self.mapping = english_spelling_mapping
2275
+
2276
+ def __call__(self, s: str):
2277
+ return " ".join(self.mapping.get(word, word) for word in s.split())
2278
+
2279
+
2280
+ class EnglishTextNormalizer:
2281
+ def __init__(self, english_spelling_mapping=abbr):
2282
+ self.ignore_patterns = r"\b(hmm|mm|mhm|mmm|uh|um)\b"
2283
+ self.replacers = {
2284
+ # common contractions
2285
+ r"\bwon't\b": "will not",
2286
+ r"\bcan't\b": "can not",
2287
+ r"\blet's\b": "let us",
2288
+ r"\bain't\b": "aint",
2289
+ r"\by'all\b": "you all",
2290
+ r"\bwanna\b": "want to",
2291
+ r"\bgotta\b": "got to",
2292
+ r"\bgonna\b": "going to",
2293
+ r"\bi'ma\b": "i am going to",
2294
+ r"\bimma\b": "i am going to",
2295
+ r"\bwoulda\b": "would have",
2296
+ r"\bcoulda\b": "could have",
2297
+ r"\bshoulda\b": "should have",
2298
+ r"\bma'am\b": "madam",
2299
+ # contractions in titles/prefixes
2300
+ r"\bmr\b": "mister ",
2301
+ r"\bmrs\b": "missus ",
2302
+ r"\bst\b": "saint ",
2303
+ r"\bdr\b": "doctor ",
2304
+ r"\bprof\b": "professor ",
2305
+ r"\bcapt\b": "captain ",
2306
+ r"\bgov\b": "governor ",
2307
+ r"\bald\b": "alderman ",
2308
+ r"\bgen\b": "general ",
2309
+ r"\bsen\b": "senator ",
2310
+ r"\brep\b": "representative ",
2311
+ r"\bpres\b": "president ",
2312
+ r"\brev\b": "reverend ",
2313
+ r"\bhon\b": "honorable ",
2314
+ r"\basst\b": "assistant ",
2315
+ r"\bassoc\b": "associate ",
2316
+ r"\blt\b": "lieutenant ",
2317
+ r"\bcol\b": "colonel ",
2318
+ r"\bjr\b": "junior ",
2319
+ r"\bsr\b": "senior ",
2320
+ r"\besq\b": "esquire ",
2321
+ # perfect tenses, ideally it should be any past participles, but it's harder..
2322
+ r"'d been\b": " had been",
2323
+ r"'s been\b": " has been",
2324
+ r"'d gone\b": " had gone",
2325
+ r"'s gone\b": " has gone",
2326
+ r"'d done\b": " had done", # "'s done" is ambiguous
2327
+ r"'s got\b": " has got",
2328
+ # general contractions
2329
+ r"n't\b": " not",
2330
+ r"'re\b": " are",
2331
+ r"'s\b": " is",
2332
+ r"'d\b": " would",
2333
+ r"'ll\b": " will",
2334
+ r"'t\b": " not",
2335
+ r"'ve\b": " have",
2336
+ r"'m\b": " am",
2337
+ }
2338
+ self.standardize_numbers = EnglishNumberNormalizer()
2339
+ self.standardize_spellings = EnglishSpellingNormalizer(english_spelling_mapping)
2340
+
2341
+ def __call__(self, s: str):
2342
+ s = s.lower()
2343
+
2344
+ s = re.sub(r"[<\[][^>\]]*[>\]]", "", s) # remove words between brackets
2345
+ s = re.sub(r"\(([^)]+?)\)", "", s) # remove words between parentheses
2346
+ s = re.sub(self.ignore_patterns, "", s)
2347
+ s = re.sub(
2348
+ r"\s+'", "'", s
2349
+ ) # standardize when there's a space before an apostrophe
2350
+
2351
+ for pattern, replacement in self.replacers.items():
2352
+ s = re.sub(pattern, replacement, s)
2353
+
2354
+ s = re.sub(r"(\d),(\d)", r"\1\2", s) # remove commas between digits
2355
+ s = re.sub(r"\.([^0-9]|$)", r" \1", s) # remove periods not followed by numbers
2356
+ s = remove_symbols_and_diacritics(
2357
+ s, keep=".%$¢€£"
2358
+ ) # keep some symbols for numerics
2359
+
2360
+ s = self.standardize_numbers(s)
2361
+ s = self.standardize_spellings(s)
2362
+
2363
+ # now remove prefix/suffix symbols that are not preceded/followed by numbers
2364
+ s = re.sub(r"[.$¢€£]([^0-9])", r" \1", s)
2365
+ s = re.sub(r"([^0-9])%", r"\1 ", s)
2366
+
2367
+ s = re.sub(
2368
+ r"\s+", " ", s
2369
+ ) # replace any successive whitespace characters with a space
2370
+
2371
+ return s
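The `replacers` loop depends on dictionary insertion order: specific patterns such as `won't` have to run before the general `n't` rule, otherwise `won't` would become `wo not`. A reduced sketch with four of the patterns from the table (the names `REPLACERS` and `expand_contractions` are hypothetical):

```python
import re

# Order matters: dicts preserve insertion order, so specific rules run first.
REPLACERS = {
    r"\bwon't\b": "will not",
    r"\bcan't\b": "can not",
    r"n't\b": " not",
    r"'re\b": " are",
}


def expand_contractions(text: str) -> str:
    """Apply each pattern in insertion order, as the normalizer's loop does."""
    for pattern, replacement in REPLACERS.items():
        text = re.sub(pattern, replacement, text)
    return text


print(expand_contractions("i won't go, they don't care, we're late"))
```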
2372
+
2373
+
2374
+ text_normalizer = EnglishTextNormalizer()
utils.py ADDED
@@ -0,0 +1,991 @@
1
+ import colorsys
2
+ import json
3
+ import os
4
+ import random
5
+ from concurrent.futures import ThreadPoolExecutor
6
+ from dataclasses import dataclass, make_dataclass
7
+ from datetime import datetime
8
+ from io import BytesIO
9
+
10
+ import aiohttp
11
+ import evaluate
12
+ import numpy as np
13
+ import pandas as pd
14
+ import plotly.graph_objects as go
15
+ from huggingface_hub import hf_hub_download, list_repo_files
16
+ from pydub import AudioSegment
17
+
18
+ from constants import WHISPER_OPEN_AI_LINK
19
+
20
+ # Load the Word Error Rate (WER) metric from the evaluate library
21
+ wer_metric = evaluate.load("wer")
22
+
23
+
24
+ def compute_average_wer(results):
25
+ """
26
+ Compute the average Word Error Rate (WER) for a list of transcription results.
27
+
28
+ :param results: List of dictionaries, each containing 'reference' and 'prediction' keys
29
+ :return: Average WER as a percentage, rounded to 2 decimal places
30
+
31
+ This function calculates the WER for each reference-prediction pair and returns
32
+ the average. If no predictions are provided, it returns 100% WER.
33
+ """
34
+ references = [result["reference"] for result in results]
35
+ predictions = [result["prediction"] for result in results]
36
+ if len(predictions) == 0:
37
+ return 100.0  # no predictions: 100% WER, on the same percentage scale as below
38
+ return round(
39
+ wer_metric.compute(references=references, predictions=predictions) * 100.0,
40
+ 2,
41
+ )
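For readers unfamiliar with the metric, here is a pure-Python sketch of a word error rate: word-level edit distance divided by the number of reference words, pooled over all pairs. It is an illustration only, not the implementation behind `evaluate.load("wer")`; the names `word_edit_distance` and `pooled_wer_percent` are hypothetical, and the empty-input value of 100.0 follows the docstring above.

```python
def word_edit_distance(ref_words, hyp_words):
    """Levenshtein distance over words (substitutions, insertions, deletions)."""
    prev = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp_words, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def pooled_wer_percent(results):
    """Total word errors over total reference words, as a percentage."""
    errors = 0
    total = 0
    for item in results:
        ref = item["reference"].split()
        hyp = item["prediction"].split()
        errors += word_edit_distance(ref, hyp)
        total += len(ref)
    if total == 0:
        return 100.0  # assumption: treat missing data as 100% WER
    return round(errors * 100.0 / total, 2)


sample = [
    {"reference": "the cat sat", "prediction": "the cat sat"},
    {"reference": "hello world", "prediction": "hello word"},
]
print(pooled_wer_percent(sample))  # one substitution over five reference words
```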
42
+
43
+
44
+ def read_json_line_by_line(file_path):
45
+ """
46
+ Read a JSON file line by line, parsing each line as a separate JSON object.
47
+
48
+ :param file_path: Path to the JSON file
49
+ :return: List of parsed JSON objects
50
+
51
+ This function is useful for reading large JSON files that contain one JSON object
52
+ per line. It handles JSON parsing errors gracefully, skipping invalid lines.
53
+ """
54
+ data = []
55
+ with open(file_path, "r") as f:
56
+ for line in f:
57
+ try:
58
+ item = json.loads(line.strip())
59
+ data.append(item)
60
+ except json.JSONDecodeError:
61
+ print(f"Skipping invalid JSON in {file_path}: {line}")
62
+ return data
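The line-by-line parsing pattern can be exercised without a file on disk. The sketch below reads from any iterable of lines and counts the lines it had to skip; the name `parse_json_lines` and the returned skip counter are assumptions added for illustration.

```python
import io
import json


def parse_json_lines(stream):
    """Parse one JSON object per line; return the records and the number of bad lines."""
    records = []
    skipped = 0
    for line in stream:
        line = line.strip()
        if not line:
            continue  # blank lines are ignored rather than reported
        try:
            records.append(json.loads(line))
        except json.JSONDecodeError:
            skipped += 1
    return records, skipped


sample = io.StringIO('{"wer": 0.1}\n\nnot json\n{"wer": 0.2}\n')
records, skipped = parse_json_lines(sample)
print(records, skipped)
```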
63
+
64
+
65
+ def group_wer(group):
66
+ """
67
+ Calculate the Word Error Rate (WER) for a group of transcriptions.
68
+
69
+ :param group: DataFrame group containing 'normalized_reference' and 'normalized_prediction' columns
70
+ :return: Average WER for the group
71
+
72
+ This function is typically used with DataFrame groupby operations to calculate
73
+ WER for specific groups of transcriptions.
74
+ """
75
+ return compute_average_wer(
76
+ group[["normalized_reference", "normalized_prediction"]]
77
+ .rename(
78
+ columns={
79
+ "normalized_reference": "reference",
80
+ "normalized_prediction": "prediction",
81
+ }
82
+ )
83
+ .to_dict("records")
84
+ )
85
+
86
+
87
+ def load_multilingual_results(csv_file):
88
+ """
89
+ Load multilingual results from a CSV file into a pandas DataFrame.
90
+
91
+ :param csv_file: Path to the CSV file containing multilingual results
92
+ :return: DataFrame with the loaded results, or None if the file is not found
93
+
94
+ This function attempts to load a CSV file using pandas, handling potential
95
+ FileNotFoundError exceptions.
96
+ """
97
+ try:
98
+ df = pd.json_normalize(csv_file)
99
+ return df
100
+ except FileNotFoundError:
101
+ return None
102
+
103
+
104
+ def download_dataset(repo_id, local_dir, remote_dir, path_includes=""):
105
+ """
106
+ Download benchmark result files from a specified Hugging Face repository to a local directory.
107
+
108
+ :param repo_id: ID of the Hugging Face repository
109
+ :param local_dir: Local directory where downloaded files will be saved
110
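+ :param remote_dir: Remote directory within the repository to download from
+ :param path_includes: Substring a file path must contain to be downloaded (default "" matches every file)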
111
+
112
+ This function uses the Hugging Face Hub API to list and download files from a
113
+ specific directory in a repository. It forces the download to ensure up-to-date files.
114
+ """
115
+ files = list_repo_files(repo_id, repo_type="dataset")
116
+ directory_files = [
117
+ file for file in files if file.startswith(remote_dir) and path_includes in file
118
+ ]
119
+ with ThreadPoolExecutor() as executor:
120
+ executor.map(
121
+ lambda file: hf_hub_download(
122
+ repo_id=repo_id,
123
+ repo_type="dataset",
124
+ filename=file,
125
+ local_dir=local_dir,
126
+ force_download=True,
127
+ ),
128
+ directory_files,
129
+ )
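One detail of `ThreadPoolExecutor.map` is worth knowing here: it returns a lazy iterator, and an exception raised in a worker only surfaces when the corresponding result is retrieved. Because `download_dataset` never iterates the results, a failed download would go unnoticed. A minimal sketch of collecting the results so that failures propagate, where `fetch` is a hypothetical stand-in for `hf_hub_download`:

```python
from concurrent.futures import ThreadPoolExecutor


def fetch(name: str) -> str:
    """Hypothetical stand-in for a download call; fails for names ending in '.bad'."""
    if name.endswith(".bad"):
        raise ValueError(f"cannot fetch {name}")
    return f"local/{name}"


def fetch_all(names):
    """Run fetch concurrently; list() consumes the iterator and re-raises worker errors."""
    with ThreadPoolExecutor(max_workers=4) as executor:
        return list(executor.map(fetch, names))


print(fetch_all(["a.json", "b.json"]))
```

Results come back in input order, regardless of which worker finishes first.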
130
+
131
+
132
+ def process_file(file_path):
133
+ """
134
+ Process a file containing JSON objects delimited by new lines.
135
+
136
+ :param file_path: Path to the file to be processed
137
+ :return: List of dictionaries, each representing a parsed JSON object
138
+
139
+ This function reads the file line by line, parsing each line as a JSON object.
140
+ It handles potential JSON decoding errors, printing error messages for invalid lines.
141
+ """
142
+ data = []
143
+ with open(file_path, "r") as file:
144
+ for line in file:
145
+ line = line.strip()
146
+ if not line:
147
+ continue
148
+ try:
149
+ json_obj = json.loads(line)
150
+ data.append(json_obj)
151
+ except json.JSONDecodeError as e:
152
+ print(f"Error decoding JSON in line: {line}")
153
+ print(f"Error message: {str(e)}")
154
+ return data
155
+
156
+
157
+ def dir_to_json(root_dir, output_file):
158
+ """
159
+ Convert a directory of benchmark result files to a single JSON file.
160
+
161
+ :param root_dir: Root directory containing the benchmark result files
162
+ :param output_file: Output file where the JSON data will be saved
163
+
164
+ This function walks through the directory structure, processes each file,
165
+ and writes the combined data to a single JSON file. It extracts metadata
166
+ from the file path and includes it in the JSON output.
167
+ """
168
+ with open(output_file, "w") as outfile:
169
+ for subdir, _, files in os.walk(root_dir):
170
+ for file in files:
171
+ file_path = os.path.join(subdir, file)
172
+ # ignore .DS_Store and summary files
173
+ if file_path.endswith(".DS_Store") or "summary" in file_path:
174
+ continue
175
+ parts = file_path.split(os.sep)
176
+ print(parts)
177
+ model_version = parts[2]
178
+ device_name = parts[3].replace("_", " ")
179
+ os_type_version = parts[4]
180
+ dataset_name = parts[5]
181
+ timestamp_commit = parts[6].replace(".json", "")
182
+ timestamp, commit_hash, commit_timestamp = timestamp_commit.split("_")
183
+
184
+ data_list = process_file(file_path)
185
+ for data in data_list:
186
+ original_entry = {
187
+ "model": model_version.replace("_", "/"),
188
+ "device": device_name,
189
+ "os": os_type_version.replace("_", " "),
190
+ "wer": data["wer"],
191
+ "dataset_name": dataset_name,
192
+ "reference_transcription": data["reference_transcription"],
193
+ "prediction_transcription": data["prediction_transcription"],
194
+ "difference_transcription": data["difference_transcription"],
195
+ "audio_file_url": data["audio_file_url"],
196
+ "timestamp": timestamp.replace("-", ":").replace(":", "-", 2),
197
+ "commit_hash": commit_hash,
198
+ "commit_timestamp": commit_timestamp,
199
+ }
200
+
201
+ outfile.write(json.dumps(original_entry) + "\n")
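The file-name handling in `dir_to_json` packs three fields into one stem and then repairs the timestamp with a two-step replace: every `-` becomes `:`, then the first two `:` are turned back into `-` so the date part keeps its dashes. A sketch under the assumption that file names look like the example below (the exact format is not shown in this commit, and `split_run_name` is a hypothetical helper):

```python
def split_run_name(file_name: str):
    """Split '<timestamp>_<commit hash>_<commit timestamp>.json' into its parts."""
    stem = file_name.replace(".json", "")
    timestamp, commit_hash, commit_timestamp = stem.split("_")
    # "2024-10-01T12-30-45" -> "2024:10:01T12:30:45" -> "2024-10-01T12:30:45"
    display = timestamp.replace("-", ":").replace(":", "-", 2)
    return display, commit_hash, commit_timestamp


# Hypothetical file name used only for illustration.
print(split_run_name("2024-10-01T12-30-45_abc1234_1727786000.json"))
```

The unpacking raises `ValueError` if the stem does not contain exactly two underscores, so unexpected file names fail loudly.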
202
+
203
+
204
+ async def download_audio_to_ndarray(url):
205
+ """
206
+ Downloads an audio file from a URL and converts it to a NumPy array.
207
+
208
+ :param url: The URL of the audio file to download
209
+ :return: A tuple containing the sample rate and audio data as a NumPy array
210
+
211
+ This asynchronous function uses aiohttp to download the audio file,
212
+ converts it to an AudioSegment, and then to a NumPy array. It handles
213
+ both mono and stereo audio files.
214
+ """
215
+ async with aiohttp.ClientSession() as session:
216
+ async with session.get(url) as response:
217
+ if response.status == 200:
218
+ audio_bytes = BytesIO(await response.read())
219
+ audio = AudioSegment.from_file(audio_bytes, format="mp3")
220
+ audio_data = np.array(audio.get_array_of_samples())
221
+ if audio.channels == 2:
222
+ audio_data = audio_data.reshape((-1, 2))
223
+ return audio.frame_rate, audio_data
224
+ else:
225
+ return None, None
226
+
227
+
228
+ async def play_audio(url):
229
+ """
230
+ Wrapper function for Gradio to play audio from a URL.
231
+
232
+ :param url: The URL of the audio file to play
233
+ :return: A tuple of sample rate and audio data, or an error message
234
+
235
+ This function uses download_audio_to_ndarray to get the audio data
236
+ and returns it in a format suitable for Gradio's audio player.
237
+ """
238
+ sample_rate, audio_data = await download_audio_to_ndarray(url)
239
+ if audio_data is None:
240
+ return "Error downloading the file"
241
+ else:
242
+ return sample_rate, audio_data
243
+
244
+
245
+ def get_filter_cond(df, model, device, os, dataset, timestamp=None):
246
+ """
247
+ Creates a filter condition for a DataFrame based on specified parameters.
248
+
249
+ :param df: DataFrame containing the transcription data
250
+ :param model: String representing the model name
251
+ :param device: String representing the device name
252
+ :param os: String representing the OS name
253
+ :param dataset: String representing the dataset name
254
+ :param timestamp: Optional timestamp for filtering (default: None)
255
+ :return: A boolean mask for filtering the DataFrame
256
+
257
+ This function constructs a complex boolean condition for filtering
258
+ the DataFrame based on the provided parameters.
259
+ """
260
+ filter_cond = (
261
+ (df["model"] == model)
262
+ & (df["device"] == device)
263
+ & (df["os"] == os)
264
+ & (df["dataset_name"] == dataset)
265
+ )
266
+ return filter_cond & (df["timestamp"] == timestamp) if timestamp else filter_cond
267
+
268
+
269
+ def get_filtered_transcript(df, model, device, os, dataset, timestamp):
270
+ """
271
+ Retrieves filtered transcription data from a DataFrame.
272
+
273
+ :param df: DataFrame containing the transcription data
274
+ :param model: String representing the model name
275
+ :param device: String representing the device name
276
+ :param os: String representing the OS name
277
+ :param dataset: String representing the dataset name
278
+ :param timestamp: String representing the timestamp
279
+ :return: A filtered DataFrame with transcription data
280
+
281
+ This function applies a filter to the input DataFrame and returns
282
+ relevant columns for transcription analysis.
283
+ """
284
+ filter_cond = get_filter_cond(df, model, device, os, dataset, timestamp)
285
+ df = df[filter_cond][
286
+ [
287
+ "reference_transcription",
288
+ "prediction_transcription",
289
+ "difference_transcription",
290
+ "audio_file_url",
291
+ ]
292
+ ]
293
+ return df
294
+
295
+
296
+ def get_filtered_timestamps(df, model, device, os, dataset):
297
+ """
298
+ Retrieves unique timestamps for a specific model, device, OS, and dataset combination.
299
+
300
+ :param df: DataFrame containing the transcription data
301
+ :param model: String representing the model name
302
+ :param device: String representing the device name
303
+ :param os: String representing the OS name
304
+ :param dataset: String representing the dataset name
305
+ :return: A filtered DataFrame containing unique timestamps
306
+
307
+ This function is useful for getting a list of available timestamps
308
+ for a specific configuration, which can be used for further analysis or UI elements.
309
+ """
310
+ filter_cond = get_filter_cond(df, model, device, os, dataset)
311
+ df = df[filter_cond][["timestamp"]].drop_duplicates()
312
+ return df
313
+
314
+
315
+ def make_model_name_clickable_link(model):
316
+ """
317
+ Creates an HTML link to the Hugging Face model page.
318
+
319
+ :param model: String representing the model name
320
+ :return: An HTML string containing a clickable link to the model page
321
+
322
+ This function generates a formatted HTML link that can be used in
323
+ web interfaces to provide direct access to the model's page on Hugging Face.
324
+ """
325
+ return f"""<a style="color: #3B82F6; text-decoration: underline; text-decoration-style: dotted;" href="https://huggingface.co/argmaxinc/whisperkit-coreml/tree/main/{model.replace('/', '_')}" target="_blank">{model}</a>"""
326
+
327
+
328
+ def make_dataset_wer_clickable_link(row, dataset):
329
+ """
330
+ Creates a clickable link for the WER value of a dataset.
331
+
332
+ :param row: Row containing the dataset WER value
333
+ :param dataset: String representing the dataset name
334
+ :return: An HTML string containing a clickable link to the dataset's WER details
335
+
336
+ This function generates a formatted HTML link that can be used in
337
+ web interfaces to provide access to detailed WER information for a specific dataset.
338
+ """
339
+ dataset_column = f"{dataset}"
340
+ href = WHISPER_OPEN_AI_LINK.format(
341
+ row["Model"].replace("/", "_"),
342
+ dataset,
343
+ )
344
+ return f'<a style="color: #3B82F6; text-decoration: underline; text-decoration-style: dotted;" href="{href}">{row[dataset_column]}</a>'
345
+
346
+
347
+ def make_timestamp_clickable_link(model, dataset, timestamp):
348
+ """
349
+ Creates a clickable link for a timestamp.
350
+
351
+ :param model: String representing the model name
352
+ :param dataset: String representing the dataset name
353
+ :param timestamp: Timestamp to be displayed and used in the link
354
+ :return: An HTML string containing a clickable div for the timestamp
355
+
356
+ This function generates a formatted HTML div that can be used as a clickable
357
+ element in web interfaces, typically for displaying and interacting with specific timestamps.
358
+ """
359
+ elem_id = (
360
+ f"{dataset}-{model}-{timestamp}".replace(" ", "_")
361
+ .replace('"', "")
362
+ .replace("'", "")
363
+ .replace(",", "")
364
+ )
365
+ onclick = f"onclick=\"document.getElementById('{elem_id}').click();\""
366
+ return f'<div style="color: #3B82F6; text-decoration: underline; text-decoration-style: dotted;" {onclick} href="#">{timestamp}</div>'
367
+
368
+
369
+ def make_multilingual_model_clickable_link(model):
370
+ """
371
+ Creates a clickable link for a multilingual model name.
372
+
373
+ :param model: String representing the model name
374
+ :return: An HTML string containing a clickable div for the model name
375
+
376
+ This function generates a formatted HTML div that can be used as a clickable
377
+ element in web interfaces, typically for displaying and interacting with multilingual model names.
378
+ """
379
+ elem_id = (
380
+ f"{model}".replace(" ", "_").replace('"', "").replace("'", "").replace(",", "")
381
+ )
382
+ onclick = f"onclick=\"document.getElementById('{elem_id}').click();\""
383
+ return f'<div style="color: #3B82F6; text-decoration: underline; text-decoration-style: dotted;" {onclick} href="#">{model}</div>'
384
+
385
+
386
+ def plot_metric(
387
+ df, y_axis_col, y_axis_title, fig_title, filter_input=None, exclude_input=None
388
+ ):
389
+ """
390
+ Plots a metric for each model-device-OS group in a DataFrame.
391
+
392
+ :param df: DataFrame containing the benchmark data
393
+ :param y_axis_col: DataFrame column to use as the y-axis
394
+ :param y_axis_title: Display name for the y-axis
395
+ :param fig_title: Display title for the figure
396
+ :param filter_input: Optional string to filter the model-device-OS combinations
397
+ :param exclude_input: Optional string to exclude model-device-OS combinations
398
+ :return: A Plotly figure object
399
+ """
400
+ grouped = df.groupby(["model", "device", "os"])
401
+ sorted_groups = [group.sort_values("commit_timestamp") for _, group in grouped]
402
+
403
+ if filter_input:
404
+ filters = [f.strip().lower() for f in filter_input.split(";")]
405
+ sorted_groups = [
406
+ group
407
+ for group in sorted_groups
408
+ if any(
409
+ f
410
+ in f"{group['model'].iloc[0]}-{group['device'].iloc[0]}-{group['os'].iloc[0]}".lower()
411
+ for f in filters
412
+ )
413
+ ]
414
+
415
+ if exclude_input:
416
+ excludes = [e.strip().lower() for e in exclude_input.split(";")]
417
+ sorted_groups = [
418
+ group
419
+ for group in sorted_groups
420
+ if not any(
421
+ e
422
+ in f"{group['model'].iloc[0]}-{group['device'].iloc[0]}-{group['os'].iloc[0]}".lower()
423
+ for e in excludes
424
+ )
425
+ ]
426
+
427
+ base_colors = ["#4542f4", "#0e0c06", "#ccf0a7", "#ff7f4e", "#ffd15a"]
428
+ num_colors = len(sorted_groups)
429
+ random_colors = generate_random_colors(base_colors, num_colors)
430
+ fig = go.Figure()
431
+ for i, group in enumerate(sorted_groups):
432
+ model_device_os = (
433
+ f"{group['model'].iloc[0]}-{group['device'].iloc[0]}-{group['os'].iloc[0]}"
434
+ )
435
+ fig.add_trace(
436
+ go.Scatter(
437
+ x=group["commit_timestamp"].apply(
438
+ lambda x: datetime.strptime(x, "%Y-%m-%dT%H%M%S").strftime(
439
+ "%Y-%m-%d %H:%M:%S"
440
+ )
441
+ ),
442
+ y=group[y_axis_col],
443
+ mode="lines+markers",
444
+ name=model_device_os,
445
+ line=dict(color=random_colors[i % len(random_colors)]),
446
+ marker=dict(color=random_colors[i % len(random_colors)]),
447
+ hovertemplate=(
448
+ f"<b>{model_device_os}</b><br>"
449
+ "Timestamp: %{x}<br>"
450
+ f"{y_axis_title}: %{{y:.2f}}<br>"
451
+ "<extra></extra>"
452
+ ),
453
+ )
454
+ )
455
+ fig.update_layout(
456
+ title=fig_title,
457
+ xaxis_title="Commit Timestamp",
458
+ yaxis_title=y_axis_title,
459
+ legend_title="Model-Device-OS",
460
+ width=1100,
461
+ height=600,
462
+ plot_bgcolor="rgb(250,249,244)",
463
+ )
464
+ return fig
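The include/exclude matching in `plot_metric` can be sketched without pandas or Plotly. `split_terms` and `keep_group` are hypothetical helper names (not part of this commit), and the label is an invented example. Like the code above, the sketch does not drop empty terms, so a trailing `;` yields an empty string that is a substring of every label.

```python
def split_terms(raw):
    """Split a ';'-separated input into stripped, lowercase terms."""
    return [term.strip().lower() for term in raw.split(";")]


def keep_group(label, filter_input=None, exclude_input=None):
    """Hypothetical helper mirroring the include/exclude checks for one label."""
    label = label.lower()
    # Keep only labels that contain at least one filter term.
    if filter_input and not any(t in label for t in split_terms(filter_input)):
        return False
    # Drop labels that contain any exclude term.
    if exclude_input and any(t in label for t in split_terms(exclude_input)):
        return False
    return True


label = "openai_whisper-tiny-Apple M2-macOS 15.0"  # invented model-device-os label
print(keep_group(label, filter_input="TINY; base"))
print(keep_group(label, exclude_input="m2"))
# Caveat: "tiny;" splits into ["tiny", ""], and "" matches every label.
print(keep_group("openai_whisper-large-Apple M1-iOS 18.0", exclude_input="tiny;"))
```

If the empty-term behaviour is unwanted, one option is to skip empty strings when building the term list.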
465
+
466
+
467
+ def fields(raw_class):
468
+ """
469
+ Returns the fields of a dataclass.
470
+
471
+ :param raw_class: The dataclass to inspect
472
+ :return: List of fields in the dataclass
473
+
474
+ This utility function returns the values of all class attributes whose names neither
475
+ start nor end with a double underscore; for the column dataclasses built here, these are the ColumnContent defaults.
476
+ """
477
+ return [
478
+ v for k, v in raw_class.__dict__.items() if k[:2] != "__" and k[-2:] != "__"
479
+ ]
480
+
481
+
482
+ def get_os_name_and_version(os_string):
483
+ """
484
+ Extracts the OS name and major version from a string.
485
+
486
+ :param os_string: String representing the OS name and version
487
+ :return: Formatted string with OS name and major version
488
+
489
+ This function splits the input string into OS name and version,
490
+ then returns a formatted string with just the major version number.
491
+ """
492
+ os_name, os_version = os_string.split()
493
+ os_version = os_version.split(".")[0]
494
+ return f"{os_name} {os_version}"
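The function above unpacks exactly two whitespace-separated tokens, so an OS name containing spaces would raise `ValueError`. As a hedged sketch, the hypothetical variant below (`os_name_and_major_version`, not part of this commit) splits once from the right instead; the sample strings are invented.

```python
def os_name_and_major_version(os_string: str) -> str:
    """Hypothetical variant that tolerates OS names containing spaces."""
    # Split once from the right so only the last token is treated as the version.
    os_name, os_version = os_string.rsplit(maxsplit=1)
    # Keep just the major version component.
    return f"{os_name} {os_version.split('.')[0]}"


print(os_name_and_major_version("iOS 17.2.1"))
print(os_name_and_major_version("Mac OS X 10.15.7"))
```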
495
+
496
+
497
+ def create_initial_quality_column_dict():
498
+ """
499
+ Creates the initial column dictionary for the quality table.
500
+
501
+ :return: A list of [field_name, type, default] entries, as accepted by make_dataclass
502
+
503
+ This function defines the basic structure of the quality table,
504
+ including columns for model, average WER, and QoI (Quality of Inference).
505
+ """
506
+ return [
507
+ [
508
+ "model",
509
+ ColumnContent,
510
+ ColumnContent("Model", "html", True, never_hidden=True),
511
+ ],
512
+ ["average_wer", ColumnContent, ColumnContent("Average WER", "html", True)],
513
+ ["qoi", ColumnContent, ColumnContent("QoI", "html", True)],
514
+ ]
515
+
516
+
517
+ def calculate_parity(m2_ultra_wer, row):
518
+ """
519
+ Calculates the WER parity between M2 Ultra and the current model.
520
+
521
+ :param m2_ultra_wer: Series-like object of WER values for M2 Ultra, indexed by model name
522
+ :param row: Current row being processed
523
+ :return: WER difference between M2 Ultra and current model, or None if not applicable
524
+
525
+ This function computes the difference in average WER (M2 Ultra minus the current row),
526
+ rounded to two decimal places, providing a measure of relative quality.
527
+ """
528
+ if row["Model"] in m2_ultra_wer.index:
529
+ return round(m2_ultra_wer[row["Model"]] - row["Average WER"], 2)
530
+ return None
531
+
532
+
533
+ def create_initial_performance_column_dict():
534
+ """
535
+ Creates the initial column dictionary for the performance table.
536
+
537
+ :return: A list of [field_name, type, default] entries, as accepted by make_dataclass
538
+
539
+ This function defines the basic structure of the performance table,
540
+ including columns for model, device, OS, parity, average WER, QoI, speed, and tokens per second.
541
+ """
542
+ return [
543
+ [
544
+ "model",
545
+ ColumnContent,
546
+ ColumnContent("Model", "html", True, never_hidden=True),
547
+ ],
548
+ [
549
+ "device",
550
+ ColumnContent,
551
+ ColumnContent("Device", "html", True, never_hidden=True),
552
+ ],
553
+ ["os", ColumnContent, ColumnContent("OS", "html", True, never_hidden=True)],
554
+ ["parity", ColumnContent, ColumnContent("Parity %", "html", False)],
555
+ ["average_wer", ColumnContent, ColumnContent("Average WER", "html", False)],
556
+ ["qoi", ColumnContent, ColumnContent("QoI", "html", False)],
557
+ ["speed", ColumnContent, ColumnContent("Speed", "html", False)],
558
+ ["toks", ColumnContent, ColumnContent("Tok / s", "html", False)],
559
+ ]
560
+
561
+
562
+ def add_datasets_to_quality_columns(column_dict, datasets):
563
+ """
564
+ Adds dataset-specific columns to the quality table column dictionary.
565
+
566
+ :param column_dict: The initial column dictionary
567
+ :param datasets: List of dataset names to add
568
+ :return: A dictionary containing the updated column dictionary and related metadata
569
+
570
+ This function extends the quality table structure with columns for each dataset,
571
+ and creates a dataclass to represent the table structure. It also generates
572
+ metadata about the columns for use in the UI.
573
+ """
574
+ updated_column_dict = column_dict.copy()
575
+
576
+ for dataset in datasets:
577
+ field_name = dataset.replace("-", "")
578
+ updated_column_dict.append(
579
+ [field_name, ColumnContent, ColumnContent(dataset, "html", True)]
580
+ )
581
+
582
+ AutoEvalColumn = make_dataclass("AutoEvalColumn", updated_column_dict, frozen=True)
583
+
584
+ COLS = [c.name for c in fields(AutoEvalColumn) if not c.hidden]
585
+ TYPES = [c.type for c in fields(AutoEvalColumn) if not c.hidden]
586
+ ALWAYS_HERE_COLS = [c.name for c in fields(AutoEvalColumn) if c.never_hidden]
587
+ TOGGLE_COLS = [c.name for c in fields(AutoEvalColumn) if not c.never_hidden]
588
+ SELECTED_COLS = [
589
+ c.name
590
+ for c in fields(AutoEvalColumn)
591
+ if not c.never_hidden and c.displayed_by_default
592
+ ]
593
+
594
+ return {
595
+ "column_dict": updated_column_dict,
596
+ "AutoEvalColumn": AutoEvalColumn,
597
+ "COLS": COLS,
598
+ "TYPES": TYPES,
599
+ "ALWAYS_HERE_COLS": ALWAYS_HERE_COLS,
600
+ "TOGGLE_COLS": TOGGLE_COLS,
601
+ "SELECTED_COLS": SELECTED_COLS,
602
+ }
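A minimal, self-contained sketch of the pattern used above: `make_dataclass` accepts `[name, type, default]` entries, and the `fields` helper then reads the `ColumnContent` defaults back from the generated class. The dataset list is an invented example, and `ColumnContent` is trimmed to the attributes the sketch needs.

```python
from dataclasses import dataclass, make_dataclass


@dataclass(frozen=True)
class ColumnContent:
    name: str
    type: str
    displayed_by_default: bool
    hidden: bool = False
    never_hidden: bool = False


def fields(raw_class):
    # Values of class attributes whose names neither start nor end with "__".
    return [
        v for k, v in raw_class.__dict__.items() if k[:2] != "__" and k[-2:] != "__"
    ]


column_dict = [
    ["model", ColumnContent, ColumnContent("Model", "html", True, never_hidden=True)],
    ["average_wer", ColumnContent, ColumnContent("Average WER", "html", True)],
]
for dataset in ["librispeech-10mins"]:  # invented dataset list for illustration
    # Hyphens are removed because dataclass field names must be valid identifiers.
    field_name = dataset.replace("-", "")
    column_dict.append([field_name, ColumnContent, ColumnContent(dataset, "html", True)])

AutoEvalColumn = make_dataclass("AutoEvalColumn", column_dict, frozen=True)

cols = [c.name for c in fields(AutoEvalColumn) if not c.hidden]
always_here = [c.name for c in fields(AutoEvalColumn) if c.never_hidden]
toggle = [c.name for c in fields(AutoEvalColumn) if not c.never_hidden]
print(cols, always_here, toggle)
```

The defaults work as class-level values here because `ColumnContent` is frozen and therefore hashable; a non-frozen dataclass default would be rejected as mutable on recent Python versions.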
603
+
604
+
605
+ def add_datasets_to_performance_columns(column_dict, datasets):
606
+ """
607
+ Adds dataset-specific columns to the performance table column dictionary.
608
+
609
+ :param column_dict: The initial column dictionary
610
+ :param datasets: List of dataset names to add
611
+ :return: A dictionary containing the updated column dictionary and related metadata
612
+
613
+ This function extends the performance table structure with columns for each dataset,
614
+ adding both speed and tokens per second metrics. It also creates a dataclass to
615
+ represent the table structure and generates metadata about the columns for use in the UI.
616
+ """
617
+ updated_column_dict = column_dict.copy()
618
+
619
+ for dataset in datasets:
620
+ field_name = dataset.replace("-", "")
621
+ updated_column_dict.append(
622
+ [
623
+ f"{field_name}_speed",
624
+ ColumnContent,
625
+ ColumnContent(
626
+ f"{'Short-Form' if dataset == 'librispeech-10mins' else 'Long-Form'} Speed",
627
+ "html",
628
+ True,
629
+ ),
630
+ ]
631
+ )
632
+ updated_column_dict.append(
633
+ [
634
+ f"{field_name}_toks",
635
+ ColumnContent,
636
+ ColumnContent(
637
+ f"{'Short-Form' if dataset == 'librispeech-10mins' else 'Long-Form'} Tok/s",
638
+ "html",
639
+ True,
640
+ ),
641
+ ]
642
+ )
643
+
644
+ AutoEvalColumn = make_dataclass("AutoEvalColumn", updated_column_dict, frozen=True)
645
+
646
+ COLS = [c.name for c in fields(AutoEvalColumn) if not c.hidden]
647
+ TYPES = [c.type for c in fields(AutoEvalColumn) if not c.hidden]
648
+ ALWAYS_HERE_COLS = [c.name for c in fields(AutoEvalColumn) if c.never_hidden]
649
+ TOGGLE_COLS = [c.name for c in fields(AutoEvalColumn) if not c.never_hidden]
650
+ SELECTED_COLS = [
651
+ c.name
652
+ for c in fields(AutoEvalColumn)
653
+ if not c.never_hidden and c.displayed_by_default
654
+ ]
655
+
656
+ return {
657
+ "column_dict": updated_column_dict,
658
+ "AutoEvalColumn": AutoEvalColumn,
659
+ "COLS": COLS,
660
+ "TYPES": TYPES,
661
+ "ALWAYS_HERE_COLS": ALWAYS_HERE_COLS,
662
+ "TOGGLE_COLS": TOGGLE_COLS,
663
+ "SELECTED_COLS": SELECTED_COLS,
664
+ }
665
+
666
+
667
+ def create_confusion_matrix_plot(matrix, labels, is_forced):
668
+ """
669
+ Creates a confusion matrix plot for language detection.
670
+
671
+ :param matrix: 2D numpy array representing the confusion matrix
672
+ :param labels: List of language labels
673
+ :param is_forced: Boolean indicating whether language hint was used
674
+ :return: A Plotly figure object representing the confusion matrix
675
+
676
+ This function generates a heatmap visualization of the confusion matrix
677
+ for language detection, with customized layout and hover information.
678
+ """
679
+ fig = go.Figure(
680
+ data=go.Heatmap(
681
+ z=matrix,
682
+ x=labels,
683
+ y=labels,
684
+ colorscale=[
685
+ [0, "rgb(250,249,244)"],
686
+ [0.5, "rgb(69,66,244)"],
687
+ [1.0, "rgb(14,12,6)"],
688
+ ],
689
+ hoverongaps=False,
690
+ hovertemplate="True: %{y}<br>Predicted: %{x}<br>Value: %{z}<extra></extra>",
691
+ )
692
+ )
693
+ fig.update_layout(
694
+ title=f'Language Detection Confusion Matrix with {"Language Hint" if is_forced else "Language Prediction by Model"}',
695
+ xaxis_title="Predicted Language",
696
+ yaxis_title="True Language",
697
+ xaxis=dict(tickangle=-45),
698
+ width=600,
699
+ height=600,
700
+ margin=dict(l=50, r=50, t=50, b=50),
701
+ )
702
+ return fig
703
+
704
+
705
+ def hex_to_rgb(hex_color):
706
+ """
707
+ Converts a hexadecimal color code to RGB values.
708
+
709
+ :param hex_color: String representing a color in hexadecimal format
710
+ :return: Tuple of three integers representing RGB values
711
+
712
+ This function takes a hex color code and returns the corresponding
713
+ RGB values as a tuple of integers.
714
+ """
715
+ hex_color = hex_color.lstrip("#")
716
+ return tuple(int(hex_color[i : i + 2], 16) for i in (0, 2, 4))
717
+
718
+
719
+ def rgb_to_hex(rgb):
720
+ """
721
+ Converts RGB values to a hexadecimal color code.
722
+
723
+ :param rgb: Tuple of three integers representing RGB values
724
+ :return: String representing the color in hexadecimal format
725
+
726
+ This function takes RGB values as a tuple and returns the corresponding
727
+ hex color code as a string.
728
+ """
729
+ return "#{:02x}{:02x}{:02x}".format(*rgb)
730
+
731
+
732
+ def interpolate_colors(color1, color2, factor):
733
+ """
734
+ Interpolates between two colors in HSV space.
735
+
736
+ :param color1: First color in hexadecimal format
737
+ :param color2: Second color in hexadecimal format
738
+ :param factor: Float between 0 and 1, representing the interpolation factor
739
+ :return: Interpolated color in hexadecimal format
740
+
741
+ This function performs color interpolation in HSV color space, which can
742
+ produce more visually pleasing results than simple RGB interpolation.
743
+ """
744
+ rgb1 = hex_to_rgb(color1)
745
+ rgb2 = hex_to_rgb(color2)
746
+
747
+ hsv1 = colorsys.rgb_to_hsv(*[x / 255.0 for x in rgb1])
748
+ hsv2 = colorsys.rgb_to_hsv(*[x / 255.0 for x in rgb2])
749
+
750
+ h = (hsv1[0] + factor * (hsv2[0] - hsv1[0])) % 1.0
751
+ s = hsv1[1] + factor * (hsv2[1] - hsv1[1])
752
+ v = hsv1[2] + factor * (hsv2[2] - hsv1[2])
753
+
754
+ rgb = colorsys.hsv_to_rgb(h, s, v)
755
+ return rgb_to_hex(tuple(round(x * 255) for x in rgb))
756
+
757
+
758
+ def color_distance(color1, color2):
759
+ """
760
+ Calculates the Euclidean distance between two colors in RGB space.
761
+
762
+ :param color1: First color in hexadecimal format
763
+ :param color2: Second color in hexadecimal format
764
+ :return: Float representing the distance between the two colors
765
+
766
+ This function computes the Euclidean distance between two colors in RGB space,
767
+ which can be used as a measure of color similarity.
768
+ """
769
+ rgb1 = hex_to_rgb(color1)
770
+ rgb2 = hex_to_rgb(color2)
771
+ return sum((a - b) ** 2 for a, b in zip(rgb1, rgb2)) ** 0.5
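The colour helpers above can be exercised in isolation. The sketch below restates them in a self-contained form; `interpolate_hsv` is a hypothetical name for the interpolation step, and it converts back to the 0-255 range with `round` so that floating-point error in the HSV round trip cannot shift a channel by one.

```python
import colorsys


def hex_to_rgb(hex_color):
    hex_color = hex_color.lstrip("#")
    return tuple(int(hex_color[i : i + 2], 16) for i in (0, 2, 4))


def rgb_to_hex(rgb):
    return "#{:02x}{:02x}{:02x}".format(*rgb)


def interpolate_hsv(color1, color2, factor):
    """Linear interpolation of H, S and V between two hex colours (sketch)."""
    hsv1 = colorsys.rgb_to_hsv(*[x / 255.0 for x in hex_to_rgb(color1)])
    hsv2 = colorsys.rgb_to_hsv(*[x / 255.0 for x in hex_to_rgb(color2)])
    h = (hsv1[0] + factor * (hsv2[0] - hsv1[0])) % 1.0
    s = hsv1[1] + factor * (hsv2[1] - hsv1[1])
    v = hsv1[2] + factor * (hsv2[2] - hsv1[2])
    rgb = colorsys.hsv_to_rgb(h, s, v)
    # Assumption of this sketch: round to the nearest integer channel value.
    return rgb_to_hex(tuple(round(x * 255) for x in rgb))


def color_distance(color1, color2):
    rgb1, rgb2 = hex_to_rgb(color1), hex_to_rgb(color2)
    return sum((a - b) ** 2 for a, b in zip(rgb1, rgb2)) ** 0.5


print(hex_to_rgb("#4542f4"))  # (69, 66, 244)
print(rgb_to_hex((255, 127, 78)))  # #ff7f4e
print(color_distance("#000000", "#030400"))  # 5.0
print(interpolate_hsv("#4542f4", "#ffd15a", 0.0))  # factor 0 returns the first colour
```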
772
+
773
+
774
+ def generate_random_colors(base_colors, num_colors, min_distance=30):
775
+ """
776
+ Generates a list of random colors based on a set of base colors.
777
+
778
+ :param base_colors: List of base colors in hexadecimal format
779
+ :param num_colors: Number of colors to generate
780
+ :param min_distance: Minimum distance between generated colors (default: 30)
781
+ :return: List of generated colors in hexadecimal format
782
+
783
+ This function creates a list of random colors by interpolating between
784
+ the provided base colors. It attempts to maintain a minimum distance
785
+ between colors to ensure visual distinctiveness.
786
+ """
787
+ generated_colors = []
788
+ attempts = 0
789
+ max_attempts = 1000
790
+
791
+ while len(generated_colors) < num_colors and attempts < max_attempts:
792
+ color1, color2 = random.sample(base_colors, 2)
793
+ factor = random.random()
794
+ new_color = interpolate_colors(color1, color2, factor)
795
+
796
+ if all(color_distance(new_color, c) >= min_distance for c in generated_colors):
797
+ generated_colors.append(new_color)
798
+ attempts = 0
799
+ else:
800
+ attempts += 1
801
+
802
+ if attempts > 100:
803
+ if random.random() < 0.1:
804
+ generated_colors.append(new_color)
805
+ attempts = 0
806
+
807
+ return generated_colors
808
+
809
+
810
+ @dataclass
811
+ class Task:
812
+ """
813
+ Dataclass representing a benchmark task.
814
+
815
+ :param benchmark: String representing the benchmark name
816
+ :param metric: String representing the metric used for evaluation
817
+ :param col_name: String representing the column name in the results DataFrame
818
+ """
819
+
820
+ benchmark: str
821
+ metric: str
822
+ col_name: str
823
+
824
+
825
+ @dataclass(frozen=True)
826
+ class ColumnContent:
827
+ """
828
+ Dataclass representing a column in the results table.
829
+
830
+ :param name: String representing the column name
831
+ :param type: String representing the data type of the column
832
+ :param displayed_by_default: Boolean indicating if the column should be displayed by default
833
+ :param hidden: Boolean indicating if the column should be hidden (default: False)
834
+ :param never_hidden: Boolean indicating if the column should never be hidden (default: False)
835
+ :param dummy: Boolean indicating if this is a dummy column (default: False)
836
+ """
837
+
838
+ name: str
839
+ type: str
840
+ displayed_by_default: bool
841
+ hidden: bool = False
842
+ never_hidden: bool = False
843
+ dummy: bool = False
844
+
845
+
846
+ css = """
847
+ @import url('https://fonts.googleapis.com/css2?family=Noto+Color+Emoji&display=swap');
848
+ @import url('https://fonts.googleapis.com/css2?family=Sora:wght@300..400&display=swap');
849
+
850
+ @font-face {
851
+ font-family: 'Zwizz Regular';
852
+ font-style: normal;
853
+ font-weight: normal;
854
+ src: local('Zwizz Regular'), url('static/Zwizz-Regular.woff') format('woff');
855
+ }
856
+
857
+ @font-face {
858
+ font-family: 'Zwizz Medium';
859
+ font-style: normal;
860
+ font-weight: normal;
861
+ src: local('Zwizz Medium'), url('static/Zwizz-Medium.woff') format('woff');
862
+ }
863
+
864
+ @font-face {
865
+ font-family: 'Zwizz SemiBold';
866
+ font-style: normal;
867
+ font-weight: normal;
868
+ src: local('Zwizz SemiBold'), url('static/Zwizz-SemiBold.woff') format('woff');
869
+ }
870
+
871
+ /* Typography Scale */
872
+ h1, .h1 {
873
+ font-family: 'Sora', sans-serif;
874
+ font-weight: 300;
875
+ font-size: 2em;
876
+ letter-spacing: -0.05em;
877
+ }
878
+
879
+ h2, .h2 {
880
+ font-family: 'Sora', sans-serif;
881
+ font-weight: 400;
882
+ letter-spacing: -0.05em;
883
+ }
884
+
885
+ h3, h4, h5, .h3, .h4, .h5 {
886
+ font-family: 'Sora', sans-serif;
887
+ font-weight: 400;
888
+ letter-spacing: -0.05em;
889
+ }
890
+
891
+ h6, .h6, pre, code, .monospace {
892
+ font-family: 'IBM Plex Mono', monospace;
893
+ font-weight: 400;
894
+ letter-spacing: 0.01em;
895
+ }
896
+
897
+ /* Add strong tag styling */
898
+ strong, b {
899
+ font-family: 'Zwizz SemiBold', -apple-system, BlinkMacSystemFont, system-ui, sans-serif;
900
+ letter-spacing: -0.02em;
901
+ }
902
+
903
+ /* Global Zwizz styles */
904
+ :root {
905
+ --zwizz-spacing: -0.02em;
906
+ }
907
+
908
+ /* All Gradio elements should have Zwizz spacing */
909
+ .gradio-container * {
910
+ letter-spacing: var(--zwizz-spacing);
911
+ line-height: 1.7;
912
+ }
913
+
914
+ /* UI Elements */
915
+ .tab-buttons button, #models-to-add-text, .gradio-button {
916
+ font-family: 'Sora', sans-serif;
917
+ font-weight: 400;
918
+ letter-spacing: -0.05em;
919
+ }
920
+
921
+ /* Specific Table Styling */
922
+ table, .table, th, td {
923
+ font-family: 'IBM Plex Mono', 'Noto Color Emoji', sans-serif, monospace !important;
924
+ font-weight: 400;
925
+ letter-spacing: 0.01em;
926
+ }
927
+
928
+ /* Technical/Code Elements */
929
+ .code-block, .technical-text {
930
+ font-family: 'IBM Plex Mono', monospace;
931
+ font-weight: 400;
932
+ letter-spacing: 0.01em;
933
+ }
934
+
935
+ /* Additional Elements */
936
+ #methodology-text p, #methodology-text li, .markdown-text {
937
+ font-family: 'Zwizz Regular', -apple-system, BlinkMacSystemFont, system-ui, sans-serif;
938
+ font-size: 16px !important;
939
+ letter-spacing: var(--zwizz-spacing);
940
+ line-height: 1.7;
941
+ }
942
+
943
+ /* Font weight utilities */
944
+ .zwizz-medium {
945
+ font-family: 'Zwizz Medium', -apple-system, BlinkMacSystemFont, system-ui, sans-serif;
946
+ }
947
+
948
+ .zwizz-semibold {
949
+ font-family: 'Zwizz SemiBold', -apple-system, BlinkMacSystemFont, system-ui, sans-serif;
950
+ }
951
+
952
+ /* Maintaining Original Layout Rules */
953
+ .gradio-container {
954
+ max-width: 95% !important;
955
+ }
956
+
957
+ /* Table Layouts */
958
+ .large-table,
959
+ .large-table .table-wrap,
960
+ #multilingual-model-table .table-wrap,
961
+ #lookup-table .table-wrap {
962
+ height: 35em !important;
963
+ overflow-y: scroll !important;
964
+ }
965
+
966
+ /* SVG Container Rules */
967
+ .svg-container,
968
+ .main-svg {
969
+ width: 100% !important;
970
+ }
971
+
977
+ .left-side-table .table-wrap {
978
+ height: 15em !important;
979
+ overflow-y: scroll !important;
980
+ }
981
+
982
+ #average-wer-table .table-wrap {
983
+ height: 8em !important;
984
+ overflow-y: scroll !important;
985
+ }
986
+
987
+ #general-wer-table .table-wrap {
988
+ height: 35em !important;
989
+ overflow-y: scroll !important;
990
+ }
991
+ """