diff --git "a/wandb/run-20220323_165914-1vl16ira/files/output.log" "b/wandb/run-20220323_165914-1vl16ira/files/output.log" --- "a/wandb/run-20220323_165914-1vl16ira/files/output.log" +++ "b/wandb/run-20220323_165914-1vl16ira/files/output.log" @@ -15353,3 +15353,301 @@ remote: tput: No value for $TERM and no -T specified remote: tput: No value for $TERM and no -T specified remote: tput: No value for $TERM and no -T specified To https://huggingface.co/sanchit-gandhi/wav2vec2-2-bart-large-cnn +{'dataset': {'name': 'librispeech_asr', 'type': 'librispeech_asr', 'args': 'clean'}}--v100.1749532.0: 100%|██████| 352k/352k [01:52<00:00, 2.92kB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +{'dataset': {'name': 'librispeech_asr', 'type': 'librispeech_asr', 'args': 'clean'}}--v100.1749532.0: 100%|██████| 352k/352k [01:52<00:00, 2.92kB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +{'dataset': {'name': 'librispeech_asr', 'type': 'librispeech_asr', 'args': 'clean'}}--v100.1749532.0: 100%|██████| 352k/352k [01:52<00:00, 2.92kB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. 
+{'dataset': {'name': 'librispeech_asr', 'type': 'librispeech_asr', 'args': 'clean'}}--v100.1749532.0: 100%|██████| 352k/352k [01:52<00:00, 2.92kB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +{'dataset': {'name': 'librispeech_asr', 'type': 'librispeech_asr', 'args': 'clean'}}--v100.1749532.0: 100%|██████| 352k/352k [01:52<00:00, 2.92kB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +{'dataset': {'name': 'librispeech_asr', 'type': 'librispeech_asr', 'args': 'clean'}}--v100.1749532.0: 100%|██████| 352k/352k [01:52<00:00, 2.92kB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +remote: tput: No value for $TERM and no -T specified _asr', 'args': 'clean'}}--v100.1749532.0: 100%|██████| 352k/352k [01:52<00:00, 2.92kB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +remote: tput: No value for $TERM and no -T specified _asr', 'args': 'clean'}}--v100.1749532.0: 100%|██████| 352k/352k [01:52<00:00, 2.92kB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. 
Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +03/24/2022 00:27:23 - WARNING - huggingface_hub.repository - remote: tput: No value for $TERM and no -T specified +remote: tput: No value for $TERM and no -T specified +remote: tput: No value for $TERM and no -T specified +remote: tput: No value for $TERM and no -T specified +To https://huggingface.co/sanchit-gandhi/wav2vec2-2-bart-large-cnn +Upload file wandb/run-20220323_165914-1vl16ira/run-1vl16ira.wandb: 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +Upload file wandb/run-20220323_165914-1vl16ira/run-1vl16ira.wandb: 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +***** train metrics ***** + epoch = 5.0 + train_loss = 1.6267 + train_runtime = 7:24:24.66 + train_samples = 28538 + train_samples_per_second = 5.351 + train_steps_per_second = 0.084 +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. 
If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. 
+[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. 
Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. 
+[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. 
Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. 
+[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. 
Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. 
+[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. 
Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. 
+[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. 
Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. 
+[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. 
Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. 
+[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. 
Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. 
+[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. 
Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. 
+[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. 
Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. 
+[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. 
Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. 
+[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. 
Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. 
+[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. 
Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. 
+[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. 
Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. 
+[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|██████████████████████████��██████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. 
Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. 
+[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. 
Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|███���█████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. 
+[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. 
Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. 
+[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. 
Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. 
+[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. 
Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. 
+[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. 
Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2369] 2022-03-24 00:27:26,021 >> Batch size = 8 100%|█████████████████████████████████████████| 219M/219M [00:11<00:00, 20.8MB/s]03-24 00:23:38,677 >> 5,866 >> `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...coderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. 
+03/24/2022 00:36:44 - INFO - datasets.metric - Removing /home/sanchit_huggingface_co/.cache/huggingface/metrics/wer/default/default_experiment-1-0.arrow
+***** eval metrics *****
+  epoch                   =        5.0
+  eval_loss               =     0.3226
+  eval_runtime            = 0:09:18.18
+  eval_samples            =       2642
+  eval_samples_per_second =      4.733
+  eval_steps_per_second   =      0.593
+  eval_wer                =     0.0924