Fixes to documentation
TRAINING.md CHANGED (+4 -5)
@@ -24,9 +24,9 @@ When doing human evaluation the results for the fine-tuned Catalan language model we

 Our hypothesis is that the evaluation on Common Voice gives better results because the model is overfitted and has lost generalization capabilities.

-**
+**3. Model degrades according to evaluation with other datasets**

-
+Results of an evaluation with other datasets:

 | | base | sc-base | small | sc-small | medium | sc-medium |
 | ----------- | ----------- | ----------- | ----------- | ----------- | ----------- | ----------- |
@@ -35,7 +35,7 @@ Doing a more extensive evaluation shows:
 | Son_Goku_catalan_valencian_voice | 51.90 | 85.44 | 39.87 | 65.19 | 18.99 | 71.52 |
 | Universal_Declaration_of_Human_Rights | 47.12 | 36.45 | 39.14 | 75.59 | 44.37 | 27.79 |

-As you can see,
+As you can see, the fine-tuned models perform worse than the OpenAI models in most scenarios.

 Legend:
 * "sc-" Indicates Softcatalà fine-tuned model
@@ -56,12 +56,11 @@ In our experiments
 | ----------- | ----------- |
 | OpenAI | 27.32 |
 | Whisper.cpp 1.2.1 | 38.89 |
-| HuggingFace
+| HuggingFace 4.27.1 | 93.54 |
 | CTranslate2 3.10.3 | 43.68 |

 We strongly recommend using CTranslate2 as the inference client.

-
 **5. Fine-tuning degrades timestamp prediction**

 Whisper uses timestamp tokens to indicate the timestamps of the transcribed texts.
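For readers who want to follow the CTranslate2 recommendation above, here is a minimal sketch (not part of the original TRAINING.md) of running a Whisper model through CTranslate2 via the faster-whisper package, printing the per-segment timestamps that point 5 says fine-tuning tends to degrade. The model directory `whisper-medium-ca-ct2` and the audio file `sample.wav` are placeholders, not names from this repository.

```python
# Minimal sketch, assuming the faster-whisper package (a CTranslate2-based
# Whisper client). The model directory and audio file are placeholders.
#
# A Hugging Face Whisper checkpoint can first be converted to CTranslate2
# format, e.g.:
#   ct2-transformers-converter --model openai/whisper-medium \
#       --output_dir whisper-medium-ca-ct2 \
#       --copy_files tokenizer.json --quantization int8
from faster_whisper import WhisperModel

# Load the converted model (use device="cuda" if a GPU is available).
model = WhisperModel("whisper-medium-ca-ct2", device="cpu", compute_type="int8")

# Transcribe Catalan audio; segments is a generator of transcribed chunks.
segments, info = model.transcribe("sample.wav", language="ca", beam_size=5)
print(f"Detected language: {info.language} (p={info.language_probability:.2f})")

# Each segment carries start/end times derived from Whisper's timestamp
# tokens, which is where fine-tuned models tend to lose accuracy.
for segment in segments:
    print(f"[{segment.start:6.2f}s -> {segment.end:6.2f}s] {segment.text}")
```

This only illustrates the recommended CTranslate2 inference path; it is not the evaluation setup used to produce the tables above.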