0%| | 0/5000 [00:00> The following columns in the training set don't have a corresponding argument in `WhisperForConditionalGeneration.forward` and have been ignored: input_length. If input_length are not expected by `WhisperForConditionalGeneration.forward`, you can safely ignore this message. /home/local/QCRI/dizham/miniconda3/envs/whisper/lib/python3.9/site-packages/torch/nn/parallel/_functions.py:68: UserWarning: Was asked to gather along dimension 0, but all input tensors were scalars; will instead unsqueeze and return a vector. warnings.warn('Was asked to gather along dimension 0, but all ' 0%| | 25/5000 [13:57<39:16:32, 28.42s/it] 1%| | 50/5000 [26:04<41:27:08, 30.15s/it] 2%|▏ | 75/5000 [38:05<41:31:52, 30.36s/it] 2%|▏ | 100/5000 [50:33<39:07:41, 28.75s/it] 2%|▏ | 124/5000 [1:02:09<37:55:13, 28.00s/it] 3%|▎ | 150/5000 [1:14:46<38:01:15, 28.22s/it] 3%|▎ | 174/5000 [1:26:08<38:57:16, 29.06s/it] 4%|▍ | 200/5000 [1:38:53<40:24:10, 30.30s/it] 4%|▍ | 225/5000 [1:51:23<39:09:10, 29.52s/it] 5%|▍ | 242/5000 [1:58:41<21:36:39, 16.35s/it] Reading metadata...: 229844it [00:24, 23238.20it/s] 5%|▌ | 250/5000 [2:04:05<40:04:27, 30.37s/it] 6%|▌ | 275/5000 [2:16:06<37:28:33, 28.55s/it] 6%|▌ | 300/5000 [2:28:14<37:50:23, 28.98s/it] 6%|▋ | 325/5000 [2:40:15<37:32:05, 28.90s/it] 7%|▋ | 350/5000 [2:52:14<37:01:18, 28.66s/it] 7%|▋ | 374/5000 [3:03:56<37:34:58, 29.25s/it] 8%|▊ | 400/5000 [3:16:23<36:38:33, 28.68s/it] 8%|▊ | 425/5000 [3:28:40<37:07:44, 29.22s/it] 9%|▉ | 450/5000 [3:40:41<35:47:50, 28.32s/it] 10%|▉ | 475/5000 [3:52:36<35:37:33, 28.34s/it] 10%|▉ | 485/5000 [3:56:13<18:50:40, 15.03s/it] Reading metadata...: 15520it [00:00, 39827.17it/s]] 10%|█ | 500/5000 [4:04:55<35:30:12, 28.40s/it] 10%|█ | 525/5000 [4:16:40<35:03:17, 28.20s/it] 11%|█ | 550/5000 [4:28:28<35:16:21, 28.54s/it] 11%|█▏ | 572/5000 [4:39:16<35:11:44, 28.61s/it] 12%|█▏ | 575/5000 [4:40:57<38:15:47, 31.13s/it] 12%|█▏ | 600/5000 [4:52:54<36:49:52, 30.13s/it] 12%|█▎ | 625/5000 [5:04:50<34:18:11, 28.23s/it] 
13%|█▎ | 650/5000 [5:16:38<33:43:03, 27.90s/it] 14%|█▎ | 675/5000 [5:28:26<33:50:01, 28.16s/it] 14%|█▍ | 699/5000 [5:39:42<33:47:13, 28.28s/it] 14%|█▍ | 725/5000 [5:51:27<25:30:12, 21.48s/it] 15%|█▍ | 727/5000 [5:51:48<18:50:29, 15.87s/it] Reading metadata...: 15520it [00:00, 37871.82it/s] 15%|█▍ | 749/5000 [6:04:03<34:54:34, 29.56s/it] 15%|█▌ | 774/5000 [6:16:23<34:46:09, 29.62s/it] 16%|█▌ | 799/5000 [6:28:22<33:09:42, 28.42s/it] 16%|█▋ | 825/5000 [6:40:55<33:15:27, 28.68s/it] 17%|█▋ | 850/5000 [6:53:05<34:29:28, 29.92s/it] 18%|█▊ | 875/5000 [7:05:18<34:01:22, 29.69s/it] 18%|█▊ | 899/5000 [7:16:58<34:20:00, 30.14s/it] 18%|█▊ | 924/5000 [7:29:08<31:54:16, 28.18s/it] 19%|█▉ | 949/5000 [7:40:42<31:08:09, 27.67s/it] 19%|█▉ | 970/5000 [7:49:41<17:13:56, 15.39s/it] Reading metadata...: 221914it [00:33, 6083.09it/s] 19%|█▉ | 974/5000 [7:53:34<42:06:19, 37.65s/it] 20%|█▉ | 999/5000 [8:05:24<31:33:52, 28.40s/it] 20%|██ | 1000/5000 [8:05:54<32:01:14, 28.82s/it][INFO|trainer.py:3138] 2023-05-10 17:47:34,809 >> ***** Running Evaluation ***** [INFO|trainer.py:3142] 2023-05-10 17:47:34,809 >> Num examples: Unknown [INFO|trainer.py:3143] 2023-05-10 17:47:34,809 >> Batch size = 64 [INFO|trainer_utils.py:693] 2023-05-10 17:47:50,596 >> The following columns in the evaluation set don't have a corresponding argument in `WhisperForConditionalGeneration.forward` and have been ignored: input_length. If input_length are not expected by `WhisperForConditionalGeneration.forward`, you can safely ignore this message. 
{'eval_loss': 0.24644243717193604, 'eval_wer': 9.800036380645496, 'eval_runtime': 3384.9122, 'eval_samples_per_second': 4.585, 'eval_steps_per_second': 0.072, 'epoch': 4.01} 20%|██ | 1000/5000 [9:02:19<32:01:14, 28.82s/it][INFO|trainer.py:2877] 2023-05-10 18:43:59,730 >> Saving model checkpoint to ./checkpoint-1000 [INFO|configuration_utils.py:458] 2023-05-10 18:43:59,735 >> Configuration saved in ./checkpoint-1000/config.json [INFO|configuration_utils.py:364] 2023-05-10 18:43:59,739 >> Configuration saved in ./checkpoint-1000/generation_config.json [INFO|modeling_utils.py:1855] 2023-05-10 18:44:03,168 >> Model weights saved in ./checkpoint-1000/pytorch_model.bin [INFO|feature_extraction_utils.py:369] 2023-05-10 18:44:03,173 >> Feature extractor saved in ./checkpoint-1000/preprocessor_config.json [INFO|feature_extraction_utils.py:369] 2023-05-10 18:44:11,165 >> Feature extractor saved in ./preprocessor_config.json Adding files tracked by Git LFS: ['wandb/run-20230509_115211-hq92t8sj/run-hq92t8sj.wandb', 'wandb/run-20230510_094132-lvsln7ks/run-lvsln7ks.wandb']. This may take a bit of time if the files are large. 05/10/2023 18:44:21 - WARNING - huggingface_hub.repository - Adding files tracked by Git LFS: ['wandb/run-20230509_115211-hq92t8sj/run-hq92t8sj.wandb', 'wandb/run-20230510_094132-lvsln7ks/run-lvsln7ks.wandb']. This may take a bit of time if the files are large. /home/local/QCRI/dizham/miniconda3/envs/whisper/lib/python3.9/site-packages/torch/nn/parallel/_functions.py:68: UserWarning: Was asked to gather along dimension 0, but all input tensors were scalars; will instead unsqueeze and return a vector. 
warnings.warn('Was asked to gather along dimension 0, but all ' 20%|██ | 1025/5000 [9:14:35<31:09:01, 28.21s/it] 21%|██ | 1049/5000 [9:26:14<32:13:26, 29.36s/it] 22%|██▏ | 1075/5000 [9:38:18<30:32:50, 28.02s/it] 22%|██▏ | 1100/5000 [9:50:01<30:41:56, 28.34s/it] 22%|██▎ | 1125/5000 [10:01:47<30:08:21, 28.00s/it] 23%|██▎ | 1150/5000 [10:13:42<30:15:19, 28.29s/it] 23%|██▎ | 1174/5000 [10:25:04<30:12:39, 28.43s/it] 24%|██▍ | 1200/5000 [10:37:12<29:47:54, 28.23s/it] 24%|██▍ | 1212/5000 [10:41:54<16:58:24, 16.13s/it] Reading metadata...: 15520it [00:00, 39430.04it/s]] 24%|██▍ | 1225/5000 [10:49:34<29:56:46, 28.56s/it] 25%|██▌ | 1250/5000 [11:01:43<29:58:55, 28.78s/it] 26%|██▌ | 1275/5000 [11:14:00<30:57:42, 29.92s/it] 26%|██▌ | 1300/5000 [11:26:24<29:52:50, 29.07s/it] 26%|██▋ | 1325/5000 [11:38:41<29:56:14, 29.33s/it] 27%|██▋ | 1350/5000 [11:50:51<30:10:19, 29.76s/it] 28%|██▊ | 1375/5000 [12:03:00<29:03:49, 28.86s/it] 28%|██▊ | 1400/5000 [12:15:04<29:12:43, 29.21s/it] 28%|██▊ | 1425/5000 [12:27:26<29:14:04, 29.44s/it] 29%|██▉ | 1450/5000 [12:39:38<28:05:33, 28.49s/it] 29%|██▉ | 1455/5000 [12:40:52<14:54:16, 15.14s/it] Reading metadata...: 221914it [00:11, 21050.24it/s] 30%|██▉ | 1475/5000 [12:51:59<28:05:07, 28.68s/it] 30%|███ | 1500/5000 [13:04:14<27:52:11, 28.67s/it] 30%|███ | 1524/5000 [13:15:55<28:31:59, 29.55s/it] 31%|███ | 1550/5000 [13:28:41<28:46:01, 30.02s/it] 32%|███▏ | 1575/5000 [13:40:49<27:37:42, 29.04s/it] 32%|███▏ | 1600/5000 [13:53:15<28:38:52, 30.33s/it] 32%|███▎ | 1625/5000 [14:05:32<26:57:46, 28.76s/it] 33%|███▎ | 1650/5000 [14:17:49<27:35:28, 29.65s/it] 34%|███▎ | 1675/5000 [14:29:59<27:04:36, 29.32s/it] 34%|███▍ | 1697/5000 [14:39:41<14:46:59, 16.11s/it] Reading metadata...: 15520it [00:00, 37994.72it/s]] 34%|███▍ | 1700/5000 [14:42:18<31:39:14, 34.53s/it] 34%|███▍ | 1725/5000 [14:54:15<26:30:49, 29.14s/it] 35%|███▌ | 1750/5000 [15:06:24<26:10:07, 28.99s/it] 36%|███▌ | 1775/5000 [15:18:20<25:35:23, 28.57s/it] 36%|███▌ | 1799/5000 [15:29:33<24:50:30, 
27.94s/it] 36%|███▋ | 1825/5000 [15:41:48<24:40:17, 27.97s/it] 37%|███▋ | 1850/5000 [15:53:41<24:37:31, 28.14s/it] 38%|███▊ | 1875/5000 [16:05:33<25:22:21, 29.23s/it] 38%|███▊ | 1900/5000 [16:17:16<23:40:33, 27.49s/it] 38%|███▊ | 1925/5000 [16:28:59<24:12:37, 28.34s/it] 39%|███▉ | 1940/5000 [16:34:58<12:58:19, 15.26s/it] Reading metadata...: 15520it [00:00, 41429.23it/s]] 39%|███▉ | 1950/5000 [16:41:26<26:21:14, 31.11s/it] 40%|███▉ | 1975/5000 [16:53:46<24:39:45, 29.35s/it] 40%|████ | 2000/5000 [17:06:00<24:20:42, 29.21s/it][INFO|trainer.py:3138] 2023-05-11 02:47:40,879 >> ***** Running Evaluation ***** [INFO|trainer.py:3142] 2023-05-11 02:47:40,879 >> Num examples: Unknown [INFO|trainer.py:3143] 2023-05-11 02:47:40,879 >> Batch size = 64 Reading metadata...: 0it [00:00, ?it/s] [INFO|trainer_utils.py:693] 2023-05-11 02:47:55,714 >> The following columns in the evaluation set don't have a corresponding argument in `WhisperForConditionalGeneration.forward` and have been ignored: input_length. If input_length are not expected by `WhisperForConditionalGeneration.forward`, you can safely ignore this message. 
40%|████ | 2000/5000 [17:58:10<24:20:42, 29.21s/it][INFO|trainer.py:2877] 2023-05-11 03:39:50,770 >> Saving model checkpoint to ./checkpoint-2000 [INFO|configuration_utils.py:458] 2023-05-11 03:39:50,774 >> Configuration saved in ./checkpoint-2000/config.json [INFO|configuration_utils.py:364] 2023-05-11 03:39:50,778 >> Configuration saved in ./checkpoint-2000/generation_config.json {'eval_loss': 0.2271677553653717, 'eval_wer': 8.622862637077075, 'eval_runtime': 3129.8829, 'eval_samples_per_second': 4.959, 'eval_steps_per_second': 0.078, 'epoch': 8.01} [INFO|modeling_utils.py:1855] 2023-05-11 03:39:54,336 >> Model weights saved in ./checkpoint-2000/pytorch_model.bin [INFO|feature_extraction_utils.py:369] 2023-05-11 03:39:54,340 >> Feature extractor saved in ./checkpoint-2000/preprocessor_config.json [INFO|feature_extraction_utils.py:369] 2023-05-11 03:40:05,259 >> Feature extractor saved in ./preprocessor_config.json /home/local/QCRI/dizham/miniconda3/envs/whisper/lib/python3.9/site-packages/torch/nn/parallel/_functions.py:68: UserWarning: Was asked to gather along dimension 0, but all input tensors were scalars; will instead unsqueeze and return a vector. 
warnings.warn('Was asked to gather along dimension 0, but all ' 40%|████ | 2025/5000 [18:11:31<24:27:18, 29.59s/it] 41%|████ | 2049/5000 [18:23:22<23:30:07, 28.67s/it] 42%|████▏ | 2075/5000 [18:36:10<23:37:51, 29.08s/it] 42%|████▏ | 2100/5000 [18:48:47<23:17:01, 28.90s/it] 42%|████▏ | 2124/5000 [19:01:00<23:57:25, 29.99s/it] 43%|████▎ | 2149/5000 [19:13:19<23:08:38, 29.22s/it] 43%|████▎ | 2174/5000 [19:26:02<23:49:21, 30.35s/it] 44%|████▎ | 2182/5000 [19:28:54<12:49:33, 16.39s/it] Reading metadata...: 15520it [00:00, 39949.82it/s]] 44%|████▍ | 2200/5000 [19:38:47<21:55:35, 28.19s/it] 44%|████▍ | 2225/5000 [19:50:44<21:35:08, 28.00s/it] 45%|████▌ | 2250/5000 [20:02:39<21:22:48, 27.99s/it] 46%|████▌ | 2275/5000 [20:14:32<21:15:41, 28.09s/it] 46%|████▌ | 2300/5000 [20:26:20<20:53:58, 27.87s/it] 46%|████▋ | 2325/5000 [20:38:20<21:18:18, 28.67s/it] 47%|████▋ | 2350/5000 [20:50:09<21:14:08, 28.85s/it] 48%|████▊ | 2375/5000 [21:01:55<20:58:49, 28.77s/it] 48%|████▊ | 2400/5000 [21:13:42<20:10:12, 27.93s/it] 48%|████▊ | 2425/5000 [21:24:13<10:39:32, 14.90s/it] Reading metadata...: 0it [00:00, ?it/s] Reading metadata...: 15520it [00:00, 42906.76it/s]] 49%|████▉ | 2450/5000 [21:37:09<19:34:05, 27.63s/it] 49%|████▉ | 2474/5000 [21:48:22<19:40:57, 28.05s/it] 50%|████▉ | 2499/5000 [21:59:52<19:17:58, 27.78s/it] 50%|█████ | 2524/5000 [22:11:27<18:58:41, 27.59s/it] 51%|█████ | 2549/5000 [22:23:04<19:00:07, 27.91s/it] 51%|█████▏ | 2574/5000 [22:34:30<18:14:39, 27.07s/it] 52%|█████▏ | 2599/5000 [22:46:03<18:30:20, 27.75s/it] 52%|█████▏ | 2624/5000 [22:57:41<18:31:53, 28.08s/it] 53%|█████▎ | 2649/5000 [23:09:11<18:02:57, 27.64s/it] 53%|█████▎ | 2667/5000 [23:16:33<10:09:46, 15.68s/it] Reading metadata...: 15520it [00:00, 40616.01it/s]] 53%|█████▎ | 2674/5000 [23:20:58<18:47:44, 29.09s/it] 54%|█████▍ | 2699/5000 [23:32:26<17:38:02, 27.59s/it] 54%|█████▍ | 2724/5000 [23:44:00<17:41:36, 27.99s/it] 55%|█████▍ | 2749/5000 [23:55:39<17:28:41, 27.95s/it] 55%|█████▌ | 2774/5000 
[24:07:16<17:15:09, 27.90s/it] 56%|█████▌ | 2799/5000 [24:18:54<17:00:38, 27.82s/it] 56%|█████▋ | 2824/5000 [24:30:32<17:04:13, 28.24s/it] 57%|█████▋ | 2849/5000 [24:42:05<16:38:59, 27.87s/it] 57%|█████▋ | 2874/5000 [24:53:39<16:19:52, 27.65s/it] 58%|█████▊ | 2899/5000 [25:05:08<16:15:56, 27.87s/it] 58%|█████▊ | 2910/5000 [25:09:07<8:36:19, 14.82s/it] Reading metadata...: 221914it [00:15, 19229.59it/s] 58%|█████▊ | 2924/5000 [25:17:11<16:12:37, 28.11s/it] 59%|█████▉ | 2949/5000 [25:28:44<15:55:42, 27.96s/it] 59%|█████▉ | 2974/5000 [25:40:24<15:40:54, 27.87s/it] 60%|█████▉ | 2999/5000 [25:52:03<15:26:20, 27.78s/it] 60%|██████ | 3000/5000 [25:52:31<15:29:53, 27.90s/it][INFO|trainer.py:3138] 2023-05-11 11:34:12,378 >> ***** Running Evaluation ***** [INFO|trainer.py:3142] 2023-05-11 11:34:12,378 >> Num examples: Unknown [INFO|trainer.py:3143] 2023-05-11 11:34:12,378 >> Batch size = 64 Reading metadata...: 15520it [00:00, 24837.17it/s] [INFO|trainer_utils.py:693] 2023-05-11 11:34:22,019 >> The following columns in the evaluation set don't have a corresponding argument in `WhisperForConditionalGeneration.forward` and have been ignored: input_length. If input_length are not expected by `WhisperForConditionalGeneration.forward`, you can safely ignore this message. 
60%|██████ | 3000/5000 [26:43:05<15:29:53, 27.90s/it][INFO|trainer.py:2877] 2023-05-11 12:24:45,564 >> Saving model checkpoint to ./checkpoint-3000 [INFO|configuration_utils.py:458] 2023-05-11 12:24:45,569 >> Configuration saved in ./checkpoint-3000/config.json [INFO|configuration_utils.py:364] 2023-05-11 12:24:45,573 >> Configuration saved in ./checkpoint-3000/generation_config.json {'eval_loss': 0.25769394636154175, 'eval_wer': 8.695623928070265, 'eval_runtime': 3033.1756, 'eval_samples_per_second': 5.117, 'eval_steps_per_second': 0.08, 'epoch': 12.02} [INFO|modeling_utils.py:1855] 2023-05-11 12:24:49,746 >> Model weights saved in ./checkpoint-3000/pytorch_model.bin [INFO|feature_extraction_utils.py:369] 2023-05-11 12:24:49,750 >> Feature extractor saved in ./checkpoint-3000/preprocessor_config.json [INFO|feature_extraction_utils.py:369] 2023-05-11 12:25:01,129 >> Feature extractor saved in ./preprocessor_config.json /home/local/QCRI/dizham/miniconda3/envs/whisper/lib/python3.9/site-packages/torch/nn/parallel/_functions.py:68: UserWarning: Was asked to gather along dimension 0, but all input tensors were scalars; will instead unsqueeze and return a vector. 
warnings.warn('Was asked to gather along dimension 0, but all ' 60%|██████ | 3025/5000 [26:55:52<15:29:03, 28.22s/it] 61%|██████ | 3050/5000 [27:07:30<15:03:59, 27.81s/it] 62%|██████▏ | 3075/5000 [27:19:03<14:46:05, 27.62s/it] 62%|██████▏ | 3100/5000 [27:31:33<15:58:08, 30.26s/it] 62%|██████▏ | 3124/5000 [27:44:17<18:49:57, 36.14s/it] 63%|██████▎ | 3150/5000 [27:58:08<12:45:17, 24.82s/it] 63%|██████▎ | 3152/5000 [27:58:29<8:59:11, 17.51s/it] Reading metadata...: 218215it [05:38, 808.68it/s] 64%|██████▎ | 3175/5000 [28:21:31<16:14:53, 32.05s/it] 64%|██████▍ | 3200/5000 [28:35:06<16:39:40, 33.32s/it] 64%|██████▍ | 3225/5000 [28:47:56<14:57:40, 30.34s/it] 65%|██████▍ | 3249/5000 [29:00:06<14:41:41, 30.21s/it] 66%|██████▌ | 3275/5000 [29:14:09<15:26:39, 32.23s/it] 66%|██████▌ | 3300/5000 [29:28:49<15:38:51, 33.14s/it] 66%|██████▋ | 3324/5000 [29:40:43<14:00:20, 30.08s/it] 67%|██████▋ | 3350/5000 [29:53:28<13:42:47, 29.92s/it] 67%|██████▋ | 3374/5000 [30:05:30<13:26:26, 29.76s/it] 68%|██████▊ | 3395/5000 [30:14:47<6:48:42, 15.28s/it] Reading metadata...: 221914it [00:24, 10269.69it/s] 68%|██████▊ | 3399/5000 [30:18:26<16:00:47, 36.01s/it] 68%|██████▊ | 3424/5000 [30:30:23<12:38:30, 28.88s/it] 69%|██████▉ | 3449/5000 [30:42:12<12:14:00, 28.40s/it] 69%|██████▉ | 3474/5000 [30:54:04<12:05:36, 28.53s/it] 70%|██████▉ | 3499/5000 [31:05:54<11:41:43, 28.05s/it] 70%|███████ | 3524/5000 [31:17:50<11:34:42, 28.24s/it] 71%|███████ | 3549/5000 [31:29:41<11:39:07, 28.91s/it] 71%|███████▏ | 3574/5000 [31:41:33<11:19:50, 28.60s/it] 72%|███████▏ | 3599/5000 [31:53:19<10:50:25, 27.86s/it] 72%|███████▏ | 3624/5000 [32:04:57<10:35:00, 27.69s/it] 73%|███████▎ | 3637/5000 [32:10:07<6:03:02, 15.98s/it] Reading metadata...: 15520it [00:00, 34489.56it/s] 73%|███████▎ | 3649/5000 [32:17:38<11:23:10, 30.34s/it] 74%|███████▎ | 3675/5000 [32:30:31<10:49:34, 29.41s/it] 74%|███████▍ | 3699/5000 [32:42:32<10:40:27, 29.54s/it] 74%|███████▍ | 3724/5000 [32:54:44<10:14:21, 28.89s/it] 75%|███████▍ | 
3749/5000 [33:07:16<10:43:26, 30.86s/it] 75%|███████▌ | 3774/5000 [33:19:41<10:35:11, 31.09s/it] 76%|███████▌ | 3799/5000 [33:32:08<9:58:45, 29.91s/it] 76%|███████▋ | 3824/5000 [33:44:29<10:14:31, 31.35s/it] 77%|███████▋ | 3850/5000 [33:56:57<8:56:32, 27.99s/it] 77%|███████▋ | 3874/5000 [34:08:31<9:11:29, 29.39s/it] 78%|███████▊ | 3880/5000 [34:10:13<4:42:32, 15.14s/it] Reading metadata...: 15520it [00:00, 32765.41it/s]] 78%|███████▊ | 3899/5000 [34:21:07<8:54:10, 29.11s/it] 78%|███████▊ | 3924/5000 [34:33:36<8:47:19, 29.41s/it] 79%|███████▉ | 3949/5000 [34:45:51<8:27:26, 28.97s/it] 79%|███████▉ | 3974/5000 [34:58:07<8:29:21, 29.79s/it] 80%|███████▉ | 3999/5000 [35:10:22<8:13:11, 29.56s/it] 80%|████████ | 4000/5000 [35:10:50<8:05:10, 29.11s/it][INFO|trainer.py:3138] 2023-05-11 20:52:31,246 >> ***** Running Evaluation ***** [INFO|trainer.py:3142] 2023-05-11 20:52:31,247 >> Num examples: Unknown [INFO|trainer.py:3143] 2023-05-11 20:52:31,247 >> Batch size = 64 Reading metadata...: 15520it [00:00, 34879.19it/s] [INFO|trainer_utils.py:693] 2023-05-11 20:52:40,903 >> The following columns in the evaluation set don't have a corresponding argument in `WhisperForConditionalGeneration.forward` and have been ignored: input_length. If input_length are not expected by `WhisperForConditionalGeneration.forward`, you can safely ignore this message. 
80%|████████ | 4000/5000 [36:03:32<8:05:10, 29.11s/it][INFO|trainer.py:2877] 2023-05-11 21:45:12,895 >> Saving model checkpoint to ./checkpoint-4000 [INFO|configuration_utils.py:458] 2023-05-11 21:45:12,903 >> Configuration saved in ./checkpoint-4000/config.json [INFO|configuration_utils.py:364] 2023-05-11 21:45:12,909 >> Configuration saved in ./checkpoint-4000/generation_config.json {'eval_loss': 0.22095343470573425, 'eval_wer': 8.212281066472636, 'eval_runtime': 3161.6276, 'eval_samples_per_second': 4.909, 'eval_steps_per_second': 0.077, 'epoch': 16.02} [INFO|modeling_utils.py:1855] 2023-05-11 21:45:15,833 >> Model weights saved in ./checkpoint-4000/pytorch_model.bin [INFO|feature_extraction_utils.py:369] 2023-05-11 21:45:15,837 >> Feature extractor saved in ./checkpoint-4000/preprocessor_config.json [INFO|feature_extraction_utils.py:369] 2023-05-11 21:45:26,357 >> Feature extractor saved in ./preprocessor_config.json /home/local/QCRI/dizham/miniconda3/envs/whisper/lib/python3.9/site-packages/torch/nn/parallel/_functions.py:68: UserWarning: Was asked to gather along dimension 0, but all input tensors were scalars; will instead unsqueeze and return a vector. 
warnings.warn('Was asked to gather along dimension 0, but all ' 80%|████████ | 4024/5000 [36:16:27<8:09:20, 30.08s/it] 81%|████████ | 4050/5000 [36:29:22<7:57:29, 30.16s/it] 82%|████████▏ | 4075/5000 [36:41:46<7:53:07, 30.69s/it] 82%|████████▏ | 4100/5000 [36:54:01<7:20:09, 29.34s/it] 82%|████████▏ | 4122/5000 [37:03:42<3:56:29, 16.16s/it] Reading metadata...: 15520it [00:00, 36696.34it/s]] 82%|████████▎ | 4125/5000 [37:06:21<8:27:37, 34.81s/it] 83%|████████▎ | 4149/5000 [37:17:41<6:41:18, 28.29s/it] 84%|████████▎ | 4175/5000 [37:30:03<6:34:17, 28.68s/it] 84%|████████▍ | 4200/5000 [37:41:48<6:19:51, 28.49s/it] 84%|████████▍ | 4224/5000 [37:53:06<6:02:22, 28.02s/it] 85%|████████▌ | 4250/5000 [38:05:24<5:55:21, 28.43s/it] 86%|████████▌ | 4275/5000 [38:17:12<5:41:43, 28.28s/it] 86%|████████▌ | 4300/5000 [38:29:22<5:47:26, 29.78s/it] 86%|████████▋ | 4325/5000 [38:41:09<5:15:39, 28.06s/it] 87%|████████▋ | 4350/5000 [38:52:57<5:10:15, 28.64s/it] 87%|████████▋ | 4365/5000 [38:58:54<2:39:16, 15.05s/it] Reading metadata...: 15520it [00:00, 35578.75it/s]] 88%|████████▊ | 4375/5000 [39:05:06<5:03:20, 29.12s/it] 88%|████████▊ | 4399/5000 [39:16:23<4:38:03, 27.76s/it] 88%|████████▊ | 4424/5000 [39:28:16<4:31:50, 28.32s/it] 89%|████████▉ | 4449/5000 [39:39:59<4:16:06, 27.89s/it] 89%|████████▉ | 4474/5000 [39:51:45<4:06:18, 28.10s/it] 90%|█████████ | 4500/5000 [40:04:06<3:56:12, 28.35s/it] 90%|█████████ | 4524/5000 [40:15:28<3:46:29, 28.55s/it] 91%|█████████ | 4550/5000 [40:27:40<3:31:12, 28.16s/it] 91%|█████████▏| 4574/5000 [40:39:14<3:24:41, 28.83s/it] 92%|█████████▏| 4599/5000 [40:51:25<3:16:51, 29.45s/it] 92%|█████████▏| 4607/5000 [40:54:09<1:43:41, 15.83s/it] Reading metadata...: 15520it [00:00, 38951.85it/s]] 92%|█████████▏| 4624/5000 [41:03:44<3:00:13, 28.76s/it] 93%|█████████▎| 4649/5000 [41:15:51<2:48:39, 28.83s/it] 93%|█████████▎| 4674/5000 [41:27:56<2:36:09, 28.74s/it] 94%|█████████▍| 4699/5000 [41:39:47<2:21:59, 28.30s/it] 94%|█████████▍| 4724/5000 [41:51:50<2:14:44, 
29.29s/it] 95%|█████████▍| 4749/5000 [42:03:48<1:57:29, 28.08s/it] 95%|█████████▌| 4774/5000 [42:15:42<1:51:26, 29.58s/it] 96%|█████████▌| 4799/5000 [42:27:32<1:33:50, 28.01s/it] 96%|█████████▋| 4824/5000 [42:39:17<1:22:49, 28.24s/it] 97%|█████████▋| 4849/5000 [42:50:17<42:16, 16.80s/it] 97%|█████████▋| 4850/5000 [42:50:27<37:12, 14.89s/it] Reading metadata...: 15520it [00:00, 34946.75it/s]] 97%|█████████▋| 4874/5000 [43:03:54<1:03:08, 30.07s/it] 98%|█████████▊| 4900/5000 [43:16:36<50:05, 30.05s/it] 98%|█████████▊| 4924/5000 [43:28:22<37:46, 29.82s/it] 99%|█████████▉| 4950/5000 [43:41:06<24:26, 29.33s/it] 100%|█████████▉| 4975/5000 [43:53:24<12:30, 30.03s/it] 100%|█████████▉| 4999/5000 [44:05:11<00:29, 29.50s/it] 100%|██████████| 5000/5000 [44:05:38<00:00, 28.81s/it][INFO|trainer.py:3138] 2023-05-12 05:47:19,194 >> ***** Running Evaluation ***** [INFO|trainer.py:3142] 2023-05-12 05:47:19,195 >> Num examples: Unknown [INFO|trainer.py:3143] 2023-05-12 05:47:19,195 >> Batch size = 64 Reading metadata...: 1it [00:01, 1.78s/it] [INFO|trainer_utils.py:693] 2023-05-12 05:47:34,127 >> The following columns in the evaluation set don't have a corresponding argument in `WhisperForConditionalGeneration.forward` and have been ignored: input_length. If input_length are not expected by `WhisperForConditionalGeneration.forward`, you can safely ignore this message. 
[INFO|configuration_utils.py:458] 2023-05-12 06:44:01,605 >> Configuration saved in ./checkpoint-5000/config.json [INFO|modeling_utils.py:1855] 2023-05-12 06:44:04,688 >> Model weights saved in ./checkpoint-5000/pytorch_model.bin [INFO|feature_extraction_utils.py:369] 2023-05-12 06:44:15,555 >> Feature extractor saved in ./preprocessor_config.json [INFO|trainer.py:2177] 2023-05-12 06:45:02,783 >> Loading best model from ./checkpoint-4000 (score: 8.212281066472636). 
100%|██████████| 5000/5000 [45:03:23<00:00, 32.44s/it] [INFO|trainer.py:2877] 2023-05-12 06:45:04,427 >> Saving model checkpoint to ./ [INFO|configuration_utils.py:458] 2023-05-12 06:45:04,430 >> Configuration saved in ./config.json [INFO|configuration_utils.py:364] 2023-05-12 06:45:04,434 >> Configuration saved in ./generation_config.json {'train_runtime': 162214.2802, 'train_samples_per_second': 3.945, 'train_steps_per_second': 0.031, 'train_loss': 0.10428944413661957, 'epoch': 20.03} [INFO|modeling_utils.py:1855] 2023-05-12 06:45:07,575 >> Model weights saved in ./pytorch_model.bin [INFO|feature_extraction_utils.py:369] 2023-05-12 06:45:07,579 >> Feature extractor saved in ./preprocessor_config.json [INFO|trainer.py:2877] 2023-05-12 06:45:07,583 >> Saving model checkpoint to ./ [INFO|configuration_utils.py:458] 2023-05-12 06:45:07,586 >> Configuration saved in ./config.json [INFO|configuration_utils.py:364] 2023-05-12 06:45:07,589 >> Configuration saved in ./generation_config.json [INFO|modeling_utils.py:1855] 2023-05-12 06:45:10,742 >> Model weights saved in ./pytorch_model.bin [INFO|feature_extraction_utils.py:369] 2023-05-12 06:45:10,746 >> Feature extractor saved in ./preprocessor_config.json Several commits (2) will be pushed upstream. The progress bars may be unreliable. 05/12/2023 06:45:57 - WARNING - huggingface_hub.repository - Several commits (2) will be pushed upstream. 05/12/2023 06:45:57 - WARNING - huggingface_hub.repository - The progress bars may be unreliable. 
Upload file pytorch_model.bin: 931MB [01:20, 15.4MB/s] To https://huggingface.co/danielizham/whisper-small-es cbe0ded..ac68b76 main -> main 05/12/2023 06:47:27 - WARNING - huggingface_hub.repository - To https://huggingface.co/danielizham/whisper-small-es cbe0ded..ac68b76 main -> main Upload file pytorch_model.bin: 100%|██████████| 922M/922M [01:23<00:00, 11.6MB/s] To https://huggingface.co/danielizham/whisper-small-es ac68b76..b6dcee7 main -> main 05/12/2023 06:47:38 - WARNING - huggingface_hub.repository - To https://huggingface.co/danielizham/whisper-small-es ac68b76..b6dcee7 main -> main [INFO|trainer.py:3138] 2023-05-12 06:47:41,494 >> ***** Running Evaluation ***** [INFO|trainer.py:3142] 2023-05-12 06:47:41,494 >> Num examples: Unknown [INFO|trainer.py:3143] 2023-05-12 06:47:41,494 >> Batch size = 64 ***** train metrics ***** epoch = 20.03 train_loss = 0.1043 train_runtime = 1 day, 21:03:34.28 train_samples_per_second = 3.945 train_steps_per_second = 0.031 05/12/2023 06:47:41 - INFO - __main__ - *** Evaluate *** Reading metadata...: 15520it [00:02, 5694.40it/s] [INFO|trainer_utils.py:693] 2023-05-12 06:47:59,383 >> The following columns in the evaluation set don't have a corresponding argument in `WhisperForConditionalGeneration.forward` and have been ignored: input_length. If input_length are not expected by `WhisperForConditionalGeneration.forward`, you can safely ignore this message. /home/local/QCRI/dizham/miniconda3/envs/whisper/lib/python3.9/site-packages/torch/nn/parallel/_functions.py:68: UserWarning: Was asked to gather along dimension 0, but all input tensors were scalars; will instead unsqueeze and return a vector. 
warnings.warn('Was asked to gather along dimension 0, but all ' [INFO|trainer.py:2877] 2023-05-12 07:40:07,772 >> Saving model checkpoint to ./ [INFO|configuration_utils.py:458] 2023-05-12 07:40:07,776 >> Configuration saved in ./config.json [INFO|configuration_utils.py:364] 2023-05-12 07:40:07,779 >> Configuration saved in ./generation_config.json ***** eval metrics ***** epoch = 20.03 eval_loss = 0.221 eval_runtime = 0:52:26.25 eval_samples_per_second = 4.933 eval_steps_per_second = 0.077 eval_wer = 8.2123 [INFO|modeling_utils.py:1855] 2023-05-12 07:40:11,003 >> Model weights saved in ./pytorch_model.bin [INFO|feature_extraction_utils.py:369] 2023-05-12 07:40:11,008 >> Feature extractor saved in ./preprocessor_config.json Upload file wandb/run-20230510_094132-lvsln7ks/run-lvsln7ks.wandb: 37%|███▋ | 5.41M/14.6M [00:02<00:02, 3.28MB/s] 05/12/2023 07:40:31 - WARNING - huggingface_hub.repository - To https://huggingface.co/danielizham/whisper-small-es Upload file wandb/run-20230510_094132-lvsln7ks/run-lvsln7ks.wandb: 37%|███▋ | 5.41M/14.6M [00:02<00:02, 3.28MB/s]To https://huggingface.co/danielizham/whisper-small-es b6dcee7..72efb0f main -> main