jordimas committed on
Commit
802c1a9
1 Parent(s): 69769f1

Fixes to documentation

Files changed (1)
  1. TRAINING.md +4 -5
TRAINING.md CHANGED
@@ -24,9 +24,9 @@ When doing human evaliuation the results for finetuned Catalan language model we
 
  Our hypothesis is that the evaluation on Common Voice gives better results because the model is overfitted and has lost generalization capabilities.
 
- **2. Model degrades according evaluation with other datasets**
+ **3. Model degrades according evaluation with other datasets**
 
- Doing a more extensive evaluation shows:
+ Results doing an evaluation with other datasets:
 
  | | base | sc-base | small | sc-small |medium | sc-medium |
  | ----------- | ----------- | ----------- | ----------- |----------- | ----------- | ----------- |
@@ -35,7 +35,7 @@ Doing a more extensive evaluation shows:
  | Son_Goku_catalan_valencian_voice | 51.90 | 85.44 | 39.87 |65.19 | 18.99| 71.52
  | Universal_Declaration_of_Human_Rights | 47.12 | 36.45 | 39.14 |75.59 | 44.37 | 27.79
 
- As you can see,
+ As you can see, the fine-tunned models perform worse in most of the scenarios than OpenAI models.
 
  Legend:
  * "sc-" Indicates Softcatalà fine-tuned model
@@ -56,12 +56,11 @@ In our experiments
  | ----------- | ----------- |
  | OpenAI | 27.32 |
  | Whisper.cpp 1.2.1 | 38.89 |
- | HuggingFace | 93.54 |
+ | HuggingFace 4.27.1 | 93.54 |
  | CTranslate2 3.10.3 | 43.68 |
 
  We strongly recommend using CTranslate2 as inference client.
 
-
  **5. Fine-tunning degrades timestamp prediction**
 
  Whisper uses timestamp tokens to indicate the timestamps of the transcribed texts.