PereLluis13 commited on
Commit
63c7357
1 Parent(s): 56de3dd

Update README.md

Browse files
Files changed (1) hide show
  1. README.md +2 -2
README.md CHANGED
@@ -87,7 +87,7 @@ processor = Wav2Vec2Processor.from_pretrained("PereLluis13/wav2vec2-large-xlsr-5
87
  model = Wav2Vec2ForCTC.from_pretrained("PereLluis13/wav2vec2-large-xlsr-53-greek")
88
  model.to("cuda")
89
 
90
- chars_to_ignore_regex = '[\\\\\\\\,\\\\\\\\?\\\\\\\\.\\\\\\\\!\\\\\\\\-\\\\\\\\;\\\\\\\\:\\\\\\\\"\\\\\\\\“\\\\\\\\%\\\\\\\\‘\\\\\\\\”\\\\\\\\�]'
91
 
92
 
93
  resampler = torchaudio.transforms.Resample(48_000, 16_000)
@@ -137,6 +137,6 @@ The Common Voice `train`, `validation`, and CSS10 datasets were used for trainin
137
  return batch
138
  ```
139
 
140
- As suggested by Florian Zimmermeister.
141
 
142
  The script used for training can be found in [run_common_voice.py](examples/research_projects/wav2vec2/run_common_voice.py), still pending of PR. The only changes are to `speech_file_to_array_fn`. Batch size was kept at 32 (using `gradient_accumulation_steps`) using one of the [OVH](https://www.ovh.com/) machines, with a V100 GPU (thank you very much [OVH](https://www.ovh.com/)). The model trained for 40 epochs, the first 20 with the `train+validation` splits, and then `extra` split was added with the data from CSS10 at the 20th epoch.
87
  model = Wav2Vec2ForCTC.from_pretrained("PereLluis13/wav2vec2-large-xlsr-53-greek")
88
  model.to("cuda")
89
 
90
+ chars_to_ignore_regex = '[\\\\\\\\\\\\\\\\,\\\\\\\\\\\\\\\\?\\\\\\\\\\\\\\\\.\\\\\\\\\\\\\\\\!\\\\\\\\\\\\\\\\-\\\\\\\\\\\\\\\\;\\\\\\\\\\\\\\\\:\\\\\\\\\\\\\\\\"\\\\\\\\\\\\\\\\“\\\\\\\\\\\\\\\\%\\\\\\\\\\\\\\\\‘\\\\\\\\\\\\\\\\”\\\\\\\\\\\\\\\\�]'
91
 
92
 
93
  resampler = torchaudio.transforms.Resample(48_000, 16_000)
137
  return batch
138
  ```
139
 
140
+ As suggested by [Florian Zimmermeister](https://github.com/flozi00).
141
 
142
  The script used for training can be found in [run_common_voice.py](examples/research_projects/wav2vec2/run_common_voice.py), still pending of PR. The only changes are to `speech_file_to_array_fn`. Batch size was kept at 32 (using `gradient_accumulation_steps`) using one of the [OVH](https://www.ovh.com/) machines, with a V100 GPU (thank you very much [OVH](https://www.ovh.com/)). The model trained for 40 epochs, the first 20 with the `train+validation` splits, and then `extra` split was added with the data from CSS10 at the 20th epoch.