Training in progress, step 300
pytorch_model.bin CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:3659cbd57caabfa6834081314e1044f720d4f82db5a36a341158bdc9fc0cf4f2
 size 967102601
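Both binary files in this commit are stored as Git LFS pointers: small text stubs recording the blob's SHA-256 oid and byte size, per the Git LFS v1 spec linked in the pointer itself. A minimal sketch (my own illustration, not part of this commit) of checking a local file's contents against such a pointer:

```python
import hashlib

def parse_lfs_pointer(text):
    """Parse a Git LFS v1 pointer ("key value" per line) into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields

def matches_pointer(data, pointer_text):
    """Check raw bytes against the oid and size recorded in the pointer."""
    fields = parse_lfs_pointer(pointer_text)
    oid = fields["oid"].removeprefix("sha256:")
    return (hashlib.sha256(data).hexdigest() == oid
            and len(data) == int(fields["size"]))
```

This is why the diff of a checkpoint like `pytorch_model.bin` touches only the `oid` line when the weights change but the file size stays identical.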
runs/Dec20_18-49-20_129-146-32-172/events.out.tfevents.1671562165.129-146-32-172.139059.0 CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
+oid sha256:1538b38213e587df8d9b211f4b2848555784411095be34e9210a0d863fdeae37
+size 10046
whisper_small_ps_augmented.py CHANGED
@@ -101,7 +101,7 @@ def augment_dataset(batch):
 
 
 print('Augment train set:')
-fleurs['train'] = fleurs['train'].map(augment_dataset, num_proc=
+fleurs['train'] = fleurs['train'].map(augment_dataset, num_proc=10)
 
 """We can apply the data preparation function to all of our training examples using dataset's `.map` method. The argument `num_proc` specifies how many CPU cores to use. Setting `num_proc` > 1 will enable multiprocessing. If the `.map` method hangs with multiprocessing, set `num_proc=1` and process the dataset sequentially."""
 
@@ -137,7 +137,7 @@ def prepare_dataset(batch):
 
 print('Extract features and normalize data:')
 fleurs = fleurs.map(
-    prepare_dataset, remove_columns=fleurs.column_names['train'], num_proc=
+    prepare_dataset, remove_columns=fleurs.column_names['train'], num_proc=10).with_format('torch')
 
 """Finally, we filter any training data with audio samples longer than 30s. These samples would otherwise be truncated by the Whisper feature-extractor which could affect the stability of training. We define a function that returns `True` for samples that are less than 30s, and `False` for those that are longer:"""
 
@@ -272,7 +272,7 @@ training_args = Seq2SeqTrainingArguments(
     greater_is_better=False,
     push_to_hub=True,
     #optim='adamw_bnb_8bit', # 'adamw_bnb_8bit',
-    overwrite_output_dir="
+    overwrite_output_dir="False"
 )
 
 
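The docstring in the second hunk describes a duration filter whose definition falls outside the hunks shown. A sketch of what such a filter typically looks like in Whisper fine-tuning scripts (the names and the `input_length` column are my assumptions, not taken from this diff):

```python
# Hypothetical reconstruction of the length filter described above;
# the script's actual helper may use different names.
MAX_INPUT_LENGTH = 30.0  # seconds; Whisper's feature extractor truncates longer audio

def is_audio_in_length_range(length):
    """Return True for samples shorter than 30 s, False for longer ones."""
    return length < MAX_INPUT_LENGTH

# It would then be applied with datasets' .filter, along the lines of:
# fleurs['train'] = fleurs['train'].filter(
#     is_audio_in_length_range, input_columns=['input_length'])
```

Dropping over-long samples up front, rather than letting the feature extractor silently truncate them, keeps the audio and its transcript aligned during training.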