filipzawadka committed on
Commit
2505a15
1 Parent(s): e7244b3

readme update

  1. README.md +16 -0
README.md CHANGED
@@ -10,4 +10,20 @@ pinned: false
10
  license: apache-2.0
11
  ---
12
13
  Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
 
10
  license: apache-2.0
11
  ---
12
 
13
+ Possible model improvements
14
+
15
+ (a) model-centric approach -
16
+ the biggest improvement would almost certainly come from using a larger Whisper architecture
17
+ increase the batch size and train for longer; we could use a scheduler to raise it steadily
18
+ until the model stabilizes completely
19
+ multi-head training: we could train on all languages with a shared part of the architecture, which could improve generalization
20
+ and let us use much more data
21
+
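The batch-size idea in (a) can be sketched as a simple linear ramp. This is only an illustration, not the project's training code: it reads "it" in the note above as the batch size, and the function name `batch_size_at` and the values `start`, `end` and `ramp_steps` are hypothetical.

```python
def batch_size_at(step: int, start: int = 8, end: int = 64, ramp_steps: int = 1000) -> int:
    """Linearly ramp the batch size from `start` to `end` over `ramp_steps` steps.

    All names and default values are hypothetical, chosen only for illustration.
    """
    if ramp_steps <= 0 or step >= ramp_steps:
        # Ramp finished (or disabled): stay at the final batch size.
        return end
    if step <= 0:
        return start
    fraction = step / ramp_steps
    return start + int(fraction * (end - start))


# Example: batch size at the start, middle and end of the ramp.
schedule = [batch_size_at(s) for s in (0, 500, 1000)]
```

In a real training loop the returned value would be used to rebuild or re-configure the data loader every so often, rather than on every step.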
22
+ (b) data-centric approach -
23
+ we can use a dataset with better phonetic descriptions, such as the TIMIT dataset
24
+ we can use more data, and more diverse data; here most of the files
25
+ were recorded with a laptop microphone, which can affect
26
+ predictions on other sources
27
+ add noise and other transformations to the dataset
28
+
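The "add noise" item in (b) could look roughly like the sketch below, which mixes Gaussian noise into a waveform at a chosen signal-to-noise ratio. It uses only the Python standard library and assumes the audio is a plain list of float samples; the function name `add_noise` and the `snr_db` parameter are hypothetical and not part of this repository.

```python
import math
import random


def add_noise(samples, snr_db, rng=None):
    """Return a copy of `samples` with Gaussian noise added at roughly `snr_db` dB SNR.

    `samples` is assumed to be a sequence of floats (a mono waveform).
    """
    if not samples:
        return []
    if rng is None:
        rng = random.Random()
    signal_power = sum(s * s for s in samples) / len(samples)
    if signal_power == 0:
        # Silent input: there is no meaningful SNR, so return it unchanged.
        return list(samples)
    # SNR(dB) = 10 * log10(signal_power / noise_power)
    noise_power = signal_power / (10 ** (snr_db / 10))
    sigma = math.sqrt(noise_power)
    return [s + rng.gauss(0.0, sigma) for s in samples]


# Example: one second of a 440 Hz tone at 16 kHz, degraded to about 10 dB SNR.
tone = [math.sin(2 * math.pi * 440 * n / 16000) for n in range(16000)]
noisy = add_noise(tone, snr_db=10, rng=random.Random(0))
```

Applying this with a randomly drawn `snr_db` per training example is one common way to make a model less sensitive to the recording device, though the right noise type and range would need to be checked against the actual data.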
29
  Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference