pere committed on
Commit
86bc0da
1 Parent(s): 4dfe3f0

Update README.md

Files changed (1)
  1. README.md +11 -8
README.md CHANGED
@@ -49,7 +49,7 @@ The following people contributed to building this model: Rolv-Arild Braaten, Per
  ## Training procedure
  To reproduce these results, we strongly recommend that you follow the [instructions from HuggingFace](https://github.com/huggingface/transformers/tree/master/examples/research_projects/robust-speech-event#talks) to train a simple Swedish model.
 
- When you have verified that you are able to do this, create a new repo. You can then start by copying the files **run.sh** and **run_speech_recognition_ctc.py** from our repo. You should be able to reproduce our results by just running this script. With some tweaking, you will most likely be able to build an even better ASR.
+ When you have verified that you are able to do this, create a fresh repo. You can then start by copying the files **run.sh** and **run_speech_recognition_ctc.py** from our repo. You should be able to reproduce our results by just running this script. With some tweaking, you will most likely be able to build an even better ASR.
 
  ### 5-gram Language Model
  Adding a language model will improve the results of the model. 🤗 has provided another [very nice blog](https://huggingface.co/blog/wav2vec2-with-ngram) about how to add a 5-gram language model to improve the ASR model. You can build this from your own corpus, for instance by extracting some suitable text from the [Norwegian Colossal Corpus](https://huggingface.co/datasets/NbAiLab/NCC). You can also skip some of the steps in the guide, and copy the [5-gram model from this repo](https://huggingface.co/NbAiLab/XLSR-300M-bokmaal/tree/main/language_model).
@@ -100,10 +100,13 @@ The following parameters were used during training:
  --preprocessing_num_workers="16"
  ```
 
- This training will take 3-4 days on an average GPU. You might get a decent model and faster results by changing these parameters:
- ```
- --per_device_train_batch_size - Adjust this to the maximum of available memory. 16 or 24 might be good settings depending on your system
- --gradient_accumulation_steps - Can be adjusted even further up to increase batch size and speed up training without running into memory issues
- --learning_rate - Can be increased, maybe as high as 1e-4. Speeds up training but might add instability
- --epochs - Can be decreased significantly. This is a huge dataset and you might get a decent result already after a couple of epochs
- ```
+ With these settings, training might take 3-4 days on an average GPU. You should, however, get a decent model and faster results by changing these parameters:
+
+ | Parameter | Comment |
+ |:----------------------------|:--------|
+ | per_device_train_batch_size | Adjust this to the maximum of available memory. 16 or 24 might be good settings depending on your system. |
+ | gradient_accumulation_steps | Can be adjusted even further up to increase batch size and speed up training without running into memory issues. |
+ | learning_rate | Can be increased, maybe as high as 1e-4. Speeds up training but might add instability. |
+ | epochs | Can be decreased significantly. This is a huge dataset and you might get a decent result already after a couple of epochs. |
+
+
 
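A minimal sketch of the reproduction step described in the first hunk above. It assumes the two scripts sit at the top level of the model repo and uses the NbAiLab/XLSR-300M-bokmaal URL taken from the 5-gram link in the README; substitute your own clone target and create a fresh training repo as the text suggests.

```bash
# Sketch only: fetch run.sh and run_speech_recognition_ctc.py from the model
# repo and launch training. The repo URL and file layout are assumptions
# based on the README text, not verified against the actual repo.
git lfs install
git clone https://huggingface.co/NbAiLab/XLSR-300M-bokmaal
cd XLSR-300M-bokmaal
bash run.sh   # wraps run_speech_recognition_ctc.py with the flags listed in the README
```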
 
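For the 5-gram language model section shown in the diff, the linked blog post builds the ARPA file with KenLM. A rough sketch, assuming KenLM has been compiled locally and that `text.txt` is a hypothetical plain-text file extracted from the Norwegian Colossal Corpus:

```bash
# Sketch only: build a 5-gram ARPA language model from a text corpus with
# KenLM, as in the linked wav2vec2-with-ngram blog post. The kenlm path and
# text.txt are placeholders for your own setup.
kenlm/build/bin/lmplz -o 5 < text.txt > 5gram.arpa
```

The resulting ARPA file is then wrapped around the tokenizer with `Wav2Vec2ProcessorWithLM` as the blog describes, or you can simply copy the ready-made `language_model` folder linked in the README.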
 
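The parameter table in the second hunk maps onto command-line flags passed to `run_speech_recognition_ctc.py` (standard 🤗 `TrainingArguments`; the "epochs" row corresponds to the `--num_train_epochs` flag). An illustrative fragment of how the adjusted flags could look inside run.sh; the concrete values are examples, not the configuration used for the published model.

```bash
# Sketch only: replace the corresponding lines inside run.sh with values
# tuned to your hardware. Numbers below are illustrative examples.
	--per_device_train_batch_size="16" \
	--gradient_accumulation_steps="4" \
	--learning_rate="1e-4" \
	--num_train_epochs="3" \
```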
 