0%| | 0/5000 [00:00> The following columns in the training set don't have a corresponding argument in `WhisperForConditionalGeneration.forward` and have been ignored: input_length. If input_length are not expected by `WhisperForConditionalGeneration.forward`, you can safely ignore this message.
/home/local/QCRI/dizham/miniconda3/envs/whisper/lib/python3.9/site-packages/torch/nn/parallel/_functions.py:68: UserWarning: Was asked to gather along dimension 0, but all input tensors were scalars; will instead unsqueeze and return a vector.
  warnings.warn('Was asked to gather along dimension 0, but all '
0%| | 25/5000 [15:04<41:02:00, 29.69s/it]
1%| | 49/5000 [27:10<40:56:29, 29.77s/it]
2%|▏ | 75/5000 [40:25<41:58:23, 30.68s/it]
2%|▏ | 99/5000 [52:35<40:36:11, 29.82s/it]
2%|▏ | 124/5000 [1:05:34<39:47:20, 29.38s/it]
3%|▎ | 149/5000 [1:18:10<40:31:07, 30.07s/it]
3%|▎ | 174/5000 [1:30:34<40:45:25, 30.40s/it]
4%|▍ | 189/5000 [1:38:11<40:29:36, 30.30s/it]
05/09/2023 13:31:00 - WARNING - datasets.download.streaming_download_manager - Got disconnected from remote data host. Retrying in 5sec [1/20]
05/09/2023 13:31:05 - WARNING - datasets.download.streaming_download_manager - Got disconnected from remote data host. Retrying in 5sec [2/20]
05/09/2023 13:31:11 - WARNING - datasets.download.streaming_download_manager - Got disconnected from remote data host. Retrying in 5sec [3/20]
05/09/2023 13:31:16 - WARNING - datasets.download.streaming_download_manager - Got disconnected from remote data host. Retrying in 5sec [4/20]
05/09/2023 13:31:21 - WARNING - datasets.download.streaming_download_manager - Got disconnected from remote data host. Retrying in 5sec [5/20]
05/09/2023 13:31:26 - WARNING - datasets.download.streaming_download_manager - Got disconnected from remote data host. Retrying in 5sec [6/20]
05/09/2023 13:31:31 - WARNING - datasets.download.streaming_download_manager - Got disconnected from remote data host. Retrying in 5sec [7/20]
05/09/2023 13:31:37 - WARNING - datasets.download.streaming_download_manager - Got disconnected from remote data host. Retrying in 5sec [8/20]
05/09/2023 13:31:42 - WARNING - datasets.download.streaming_download_manager - Got disconnected from remote data host. Retrying in 5sec [9/20]
05/09/2023 13:31:47 - WARNING - datasets.download.streaming_download_manager - Got disconnected from remote data host. Retrying in 5sec [10/20]
05/09/2023 13:31:52 - WARNING - datasets.download.streaming_download_manager - Got disconnected from remote data host. Retrying in 5sec [11/20]
05/09/2023 13:31:57 - WARNING - datasets.download.streaming_download_manager - Got disconnected from remote data host. Retrying in 5sec [12/20]
05/09/2023 13:32:03 - WARNING - datasets.download.streaming_download_manager - Got disconnected from remote data host. Retrying in 5sec [13/20]
05/09/2023 13:32:08 - WARNING - datasets.download.streaming_download_manager - Got disconnected from remote data host. Retrying in 5sec [14/20]
05/09/2023 13:32:13 - WARNING - datasets.download.streaming_download_manager - Got disconnected from remote data host. Retrying in 5sec [15/20]
05/09/2023 13:32:18 - WARNING - datasets.download.streaming_download_manager - Got disconnected from remote data host. Retrying in 5sec [16/20]
05/09/2023 13:32:23 - WARNING - datasets.download.streaming_download_manager - Got disconnected from remote data host. Retrying in 5sec [17/20]
05/09/2023 13:32:29 - WARNING - datasets.download.streaming_download_manager - Got disconnected from remote data host. Retrying in 5sec [18/20]
05/09/2023 13:32:34 - WARNING - datasets.download.streaming_download_manager - Got disconnected from remote data host. Retrying in 5sec [19/20]
4%|▍ | 189/5000 [1:38:11<40:29:36, 30.30s/it]
Traceback (most recent call last):
  File "/home/local/QCRI/dizham/miniconda3/envs/whisper/lib/python3.9/site-packages/datasets/download/streaming_download_manager.py", line 372, in read_with_retries
    out = read(*args, **kwargs)
  File "/home/local/QCRI/dizham/miniconda3/envs/whisper/lib/python3.9/site-packages/fsspec/implementations/http.py", line 600, in read
    return super().read(length)
  File "/home/local/QCRI/dizham/miniconda3/envs/whisper/lib/python3.9/site-packages/fsspec/spec.py", line 1703, in read
    out = self.cache._fetch(self.loc, self.loc + length)
  File "/home/local/QCRI/dizham/miniconda3/envs/whisper/lib/python3.9/site-packages/fsspec/caching.py", line 397, in _fetch
    new = self.fetcher(self.end, bend)
  File "/home/local/QCRI/dizham/miniconda3/envs/whisper/lib/python3.9/site-packages/fsspec/asyn.py", line 115, in wrapper
    return sync(self.loop, func, *args, **kwargs)
  File "/home/local/QCRI/dizham/miniconda3/envs/whisper/lib/python3.9/site-packages/fsspec/asyn.py", line 100, in sync
    raise return_result
  File "/home/local/QCRI/dizham/miniconda3/envs/whisper/lib/python3.9/site-packages/fsspec/asyn.py", line 55, in _runner
    result[0] = await coro
  File "/home/local/QCRI/dizham/miniconda3/envs/whisper/lib/python3.9/site-packages/fsspec/implementations/http.py", line 655, in async_fetch_range
    r.raise_for_status()
  File "/home/local/QCRI/dizham/miniconda3/envs/whisper/lib/python3.9/site-packages/aiohttp/client_reqrep.py", line 1005, in raise_for_status
    raise ClientResponseError(
aiohttp.client_exceptions.ClientResponseError: 502, message='Bad Gateway', url=URL('https://huggingface.co/datasets/mozilla-foundation/common_voice_11_0/resolve/main/audio/es/train/es_train_3.tar')

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/local/QCRI/dizham/kanari/whisper/whisper-small-es/run_speech_recognition_seq2seq_streaming.py", line 629, in <module>
    main()
  File "/home/local/QCRI/dizham/kanari/whisper/whisper-small-es/run_speech_recognition_seq2seq_streaming.py", line 578, in main
    train_result = trainer.train(resume_from_checkpoint=checkpoint)
  File "/home/local/QCRI/dizham/miniconda3/envs/whisper/lib/python3.9/site-packages/transformers/trainer.py", line 1664, in train
    return inner_training_loop(
  File "/home/local/QCRI/dizham/miniconda3/envs/whisper/lib/python3.9/site-packages/transformers/trainer.py", line 1901, in _inner_training_loop
    for step, inputs in enumerate(epoch_iterator):
  File "/home/local/QCRI/dizham/miniconda3/envs/whisper/lib/python3.9/site-packages/torch/utils/data/dataloader.py", line 634, in __next__
    data = self._next_data()
  File "/home/local/QCRI/dizham/miniconda3/envs/whisper/lib/python3.9/site-packages/torch/utils/data/dataloader.py", line 678, in _next_data
    data = self._dataset_fetcher.fetch(index)  # may raise StopIteration
  File "/home/local/QCRI/dizham/miniconda3/envs/whisper/lib/python3.9/site-packages/torch/utils/data/_utils/fetch.py", line 32, in fetch
    data.append(next(self.dataset_iter))
  File "/home/local/QCRI/dizham/miniconda3/envs/whisper/lib/python3.9/site-packages/datasets/iterable_dataset.py", line 981, in __iter__
    for key, example in ex_iterable:
  File "/home/local/QCRI/dizham/miniconda3/envs/whisper/lib/python3.9/site-packages/datasets/iterable_dataset.py", line 592, in __iter__
    for key, example in iterator:
  File "/home/local/QCRI/dizham/miniconda3/envs/whisper/lib/python3.9/site-packages/datasets/iterable_dataset.py", line 647, in __iter__
    for x in self.ex_iterable:
  File "/home/local/QCRI/dizham/miniconda3/envs/whisper/lib/python3.9/site-packages/datasets/iterable_dataset.py", line 500, in __iter__
    for key, example in iterator:
  File "/home/local/QCRI/dizham/miniconda3/envs/whisper/lib/python3.9/site-packages/datasets/iterable_dataset.py", line 772, in __iter__
    for key, example in self.ex_iterable:
  File "/home/local/QCRI/dizham/miniconda3/envs/whisper/lib/python3.9/site-packages/datasets/iterable_dataset.py", line 229, in __iter__
    if not iterators[i].hasnext():
  File "/home/local/QCRI/dizham/miniconda3/envs/whisper/lib/python3.9/site-packages/datasets/iterable_dataset.py", line 76, in hasnext
    self._thenext = next(self.it)
  File "/home/local/QCRI/dizham/miniconda3/envs/whisper/lib/python3.9/site-packages/datasets/iterable_dataset.py", line 142, in __iter__
    yield from self.generate_examples_fn(**kwargs_with_shuffled_shards)
  File "/home/local/QCRI/dizham/.cache/huggingface/modules/datasets_modules/datasets/mozilla-foundation--common_voice_11_0/2c65b95d99ca879b1b1074ea197b65e0497848fd697fdb0582e0f6b75b6f4da0/common_voice_11_0.py", line 195, in _generate_examples
    result["audio"] = {"path": path, "bytes": file.read()}
  File "/home/local/QCRI/dizham/miniconda3/envs/whisper/lib/python3.9/tarfile.py", line 684, in read
    b = self.fileobj.read(length)
  File "/home/local/QCRI/dizham/miniconda3/envs/whisper/lib/python3.9/tarfile.py", line 521, in read
    buf = self._read(size)
  File "/home/local/QCRI/dizham/miniconda3/envs/whisper/lib/python3.9/tarfile.py", line 529, in _read
    return self.__read(size)
  File "/home/local/QCRI/dizham/miniconda3/envs/whisper/lib/python3.9/tarfile.py", line 559, in __read
    buf = self.fileobj.read(self.bufsize)
  File "/home/local/QCRI/dizham/miniconda3/envs/whisper/lib/python3.9/site-packages/datasets/download/streaming_download_manager.py", line 381, in read_with_retries
    raise ConnectionError("Server Disconnected") from disconnect_err
ConnectionError: Server Disconnected
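
For readers following the log: the streaming reader appears to retry at a fixed 5-second interval, with a counter out of 20, and then raise `ConnectionError("Server Disconnected")` chained to the underlying HTTP error (here a 502 Bad Gateway). The sketch below illustrates that general fixed-interval retry pattern using only the standard library. It is not the `datasets` implementation; the name `call_with_retries` and its parameters are hypothetical, and the defaults are simply the values visible in the log.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def call_with_retries(
    fn: Callable[[], T],
    max_retries: int = 20,          # assumption: mirrors the "[n/20]" counter in the log
    retry_interval: float = 5.0,    # assumption: mirrors "Retrying in 5sec" in the log
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn(), retrying at a fixed interval on OSError (hypothetical helper).

    ConnectionError and TimeoutError are both subclasses of OSError,
    so network-style failures raised by fn() are retried.
    """
    last_err: Optional[BaseException] = None
    for attempt in range(1, max_retries + 1):
        try:
            return fn()
        except OSError as err:
            last_err = err
            print(
                "Got disconnected from remote data host. "
                f"Retrying in {retry_interval:g}sec [{attempt}/{max_retries}]"
            )
            sleep(retry_interval)
    # All attempts failed: surface one error chained to the last underlying cause,
    # which is the shape of the two-part traceback shown in the log.
    raise ConnectionError("Server Disconnected") from last_err


if __name__ == "__main__":
    attempts = {"n": 0}

    def flaky_read() -> bytes:
        # Fails twice, then succeeds, to exercise the retry path.
        attempts["n"] += 1
        if attempts["n"] < 3:
            raise ConnectionError("simulated disconnect")
        return b"audio-bytes"

    data = call_with_retries(flaky_read, max_retries=5, retry_interval=0.0)
    print(len(data), "bytes read after", attempts["n"], "attempts")
```

With a fixed interval, 20 retries at 5 seconds cover well under two minutes of downtime, which matches the timestamps above (13:31:00 to 13:32:34). An outage longer than that ends the run, so resuming from the last saved checkpoint is the usual recovery path.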