05/07/2023 10:33:39 - WARNING - __main__ - Process rank: 0, device: cuda:0, n_gpu: 2, distributed training: True, 16-bits training: True
05/07/2023 10:33:39 - INFO - __main__ - Training/evaluation parameters Seq2SeqTrainingArguments( _n_gpu=2, adafactor=False, adam_beta1=0.9, adam_beta2=0.999, adam_epsilon=1e-08, auto_find_batch_size=False, bf16=False, bf16_full_eval=False, data_seed=None, dataloader_drop_last=False, dataloader_num_workers=0, dataloader_pin_memory=True, ddp_backend=None, ddp_bucket_cap_mb=None, ddp_find_unused_parameters=None, ddp_timeout=1800, debug=[], deepspeed=None, disable_tqdm=False, do_eval=True, do_predict=False, do_train=True, eval_accumulation_steps=None, eval_delay=0, eval_steps=1000, evaluation_strategy=steps, fp16=True, fp16_backend=auto, fp16_full_eval=False, fp16_opt_level=O1, fsdp=[], fsdp_config={'fsdp_min_num_params': 0, 'xla': False, 'xla_fsdp_grad_ckpt': False}, fsdp_min_num_params=0, fsdp_transformer_layer_cls_to_wrap=None, full_determinism=False, generation_config=None, generation_max_length=225, generation_num_beams=None, gradient_accumulation_steps=2, gradient_checkpointing=True, greater_is_better=False, group_by_length=False, half_precision_backend=auto, hub_model_id=None, hub_private_repo=False, hub_strategy=every_save, hub_token=<HUB_TOKEN>, ignore_data_skip=False, include_inputs_for_metrics=False, jit_mode_eval=False, label_names=None, label_smoothing_factor=0.0, learning_rate=1e-05, length_column_name=input_length, load_best_model_at_end=True, local_rank=0, log_level=passive, log_level_replica=warning, log_on_each_node=True, logging_dir=./runs/May07_10-33-38_crimv3mgpu025, logging_first_step=False, logging_nan_inf_filter=True, logging_steps=25, logging_strategy=steps, lr_scheduler_type=linear, max_grad_norm=1.0, max_steps=5000, metric_for_best_model=wer, mp_parameters=, no_cuda=False, num_train_epochs=3.0, optim=adamw_hf, optim_args=None, output_dir=./, overwrite_output_dir=True, past_index=-1, per_device_eval_batch_size=32, 
per_device_train_batch_size=32, predict_with_generate=True, prediction_loss_only=False, push_to_hub=True, push_to_hub_model_id=None, push_to_hub_organization=None, push_to_hub_token=<PUSH_TO_HUB_TOKEN>, ray_scope=last, remove_unused_columns=True, report_to=['wandb'], resume_from_checkpoint=None, run_name=./, save_on_each_node=False, save_safetensors=False, save_steps=1000, save_strategy=steps, save_total_limit=None, seed=42, sharded_ddp=[], skip_memory_metrics=True, sortish_sampler=False, tf32=None, torch_compile=False, torch_compile_backend=None, torch_compile_mode=None, torchdynamo=None, tpu_metrics_debug=False, tpu_num_cores=None, use_ipex=False, use_legacy_prediction_loop=False, use_mps_device=False, warmup_ratio=0.0, warmup_steps=500, weight_decay=0.0, xpu_backend=None, )
[INFO|configuration_utils.py:669] 2023-05-07 10:33:51,873 >> loading configuration file config.json from cache at /home/local/QCRI/dizham/.cache/huggingface/hub/models--openai--whisper-small/snapshots/f6744499d1eba717bcf4d6be735e3d386ffb60ad/config.json
[INFO|configuration_utils.py:725] 2023-05-07 10:33:51,887 >> Model config WhisperConfig { "_name_or_path": "openai/whisper-small", "activation_dropout": 0.0, "activation_function": "gelu", "apply_spec_augment": false, "architectures": [ "WhisperForConditionalGeneration" ], "attention_dropout": 0.0, "begin_suppress_tokens": [ 220, 
50257 ], "bos_token_id": 50257, "classifier_proj_size": 256, "d_model": 768, "decoder_attention_heads": 12, "decoder_ffn_dim": 3072, "decoder_layerdrop": 0.0, "decoder_layers": 12, "decoder_start_token_id": 50258, "dropout": 0.0, "encoder_attention_heads": 12, "encoder_ffn_dim": 3072, "encoder_layerdrop": 0.0, "encoder_layers": 12, "eos_token_id": 50257, "forced_decoder_ids": [ [ 1, 50259 ], [ 2, 50359 ], [ 3, 50363 ] ], "init_std": 0.02, "is_encoder_decoder": true, "mask_feature_length": 10, "mask_feature_min_masks": 0, "mask_feature_prob": 0.0, "mask_time_length": 10, "mask_time_min_masks": 2, "mask_time_prob": 0.05, "max_length": 448, "max_source_positions": 1500, "max_target_positions": 448, "model_type": "whisper", "num_hidden_layers": 12, "num_mel_bins": 80, "pad_token_id": 50257, "scale_embedding": false, "suppress_tokens": [ 1, 2, 7, 8, 9, 10, 14, 25, 26, 27, 28, 29, 31, 58, 59, 60, 61, 62, 63, 90, 91, 92, 93, 359, 503, 522, 542, 873, 893, 902, 918, 922, 931, 1350, 1853, 1982, 2460, 2627, 3246, 3253, 3268, 3536, 3846, 3961, 4183, 4667, 6585, 6647, 7273, 9061, 9383, 10428, 10929, 11938, 12033, 12331, 12562, 13793, 14157, 14635, 15265, 15618, 16553, 16604, 18362, 18956, 20075, 21675, 22520, 26130, 26161, 26435, 28279, 29464, 31650, 32302, 32470, 36865, 42863, 47425, 49870, 50254, 50258, 50360, 50361, 50362 ], "torch_dtype": "float32", "transformers_version": "4.29.0.dev0", "use_cache": true, "use_weighted_layer_sum": false, "vocab_size": 51865 } [INFO|feature_extraction_utils.py:469] 2023-05-07 10:33:52,076 >> loading configuration file preprocessor_config.json from cache at /home/local/QCRI/dizham/.cache/huggingface/hub/models--openai--whisper-small/snapshots/f6744499d1eba717bcf4d6be735e3d386ffb60ad/preprocessor_config.json [INFO|feature_extraction_utils.py:511] 2023-05-07 10:33:52,082 >> Feature extractor WhisperFeatureExtractor { "chunk_length": 30, "feature_extractor_type": "WhisperFeatureExtractor", "feature_size": 80, "hop_length": 160, "n_fft": 400, 
"n_samples": 480000, "nb_max_frames": 3000, "padding_side": "right", "padding_value": 0.0, "processor_class": "WhisperProcessor", "return_attention_mask": false, "sampling_rate": 16000 } [INFO|tokenization_utils_base.py:1810] 2023-05-07 10:33:52,291 >> loading file vocab.json from cache at /home/local/QCRI/dizham/.cache/huggingface/hub/models--openai--whisper-small/snapshots/f6744499d1eba717bcf4d6be735e3d386ffb60ad/vocab.json [INFO|tokenization_utils_base.py:1810] 2023-05-07 10:33:52,291 >> loading file tokenizer.json from cache at /home/local/QCRI/dizham/.cache/huggingface/hub/models--openai--whisper-small/snapshots/f6744499d1eba717bcf4d6be735e3d386ffb60ad/tokenizer.json [INFO|tokenization_utils_base.py:1810] 2023-05-07 10:33:52,291 >> loading file merges.txt from cache at /home/local/QCRI/dizham/.cache/huggingface/hub/models--openai--whisper-small/snapshots/f6744499d1eba717bcf4d6be735e3d386ffb60ad/merges.txt [INFO|tokenization_utils_base.py:1810] 2023-05-07 10:33:52,291 >> loading file normalizer.json from cache at /home/local/QCRI/dizham/.cache/huggingface/hub/models--openai--whisper-small/snapshots/f6744499d1eba717bcf4d6be735e3d386ffb60ad/normalizer.json [INFO|tokenization_utils_base.py:1810] 2023-05-07 10:33:52,291 >> loading file added_tokens.json from cache at /home/local/QCRI/dizham/.cache/huggingface/hub/models--openai--whisper-small/snapshots/f6744499d1eba717bcf4d6be735e3d386ffb60ad/added_tokens.json [INFO|tokenization_utils_base.py:1810] 2023-05-07 10:33:52,291 >> loading file special_tokens_map.json from cache at /home/local/QCRI/dizham/.cache/huggingface/hub/models--openai--whisper-small/snapshots/f6744499d1eba717bcf4d6be735e3d386ffb60ad/special_tokens_map.json [INFO|tokenization_utils_base.py:1810] 2023-05-07 10:33:52,291 >> loading file tokenizer_config.json from cache at /home/local/QCRI/dizham/.cache/huggingface/hub/models--openai--whisper-small/snapshots/f6744499d1eba717bcf4d6be735e3d386ffb60ad/tokenizer_config.json [INFO|modeling_utils.py:2542] 
2023-05-07 10:33:52,385 >> loading weights file pytorch_model.bin from cache at /home/local/QCRI/dizham/.cache/huggingface/hub/models--openai--whisper-small/snapshots/f6744499d1eba717bcf4d6be735e3d386ffb60ad/pytorch_model.bin [INFO|configuration_utils.py:577] 2023-05-07 10:33:52,963 >> Generate config GenerationConfig { "_from_model_config": true, "begin_suppress_tokens": [ 220, 50257 ], "bos_token_id": 50257, "decoder_start_token_id": 50258, "eos_token_id": 50257, "max_length": 448, "pad_token_id": 50257, "transformers_version": "4.29.0.dev0", "use_cache": false } [INFO|modeling_utils.py:3211] 2023-05-07 10:33:55,474 >> All model checkpoint weights were used when initializing WhisperForConditionalGeneration. [INFO|modeling_utils.py:3219] 2023-05-07 10:33:55,474 >> All the weights of WhisperForConditionalGeneration were initialized from the model checkpoint at openai/whisper-small. If your task is similar to the task the model of the checkpoint was trained on, you can already use WhisperForConditionalGeneration for predictions without further training. 
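Note on the feature-extractor values logged above: the sizes are mutually consistent, which a few lines of arithmetic can confirm. This is a minimal cross-check in plain Python, not part of the training script. The variable names simply mirror the logged keys, and the stride-2 link to `max_source_positions` is an assumption about Whisper's encoder rather than something this log states.

```python
# Values copied from the WhisperFeatureExtractor and WhisperConfig dumps in this log.
chunk_length = 30            # seconds of audio per input window
sampling_rate = 16000        # Hz
hop_length = 160             # samples between successive STFT frames
max_source_positions = 1500  # from the model config

n_samples = chunk_length * sampling_rate  # samples per 30 s window
nb_max_frames = n_samples // hop_length   # log-mel frames per window

# Assumption: the encoder halves the time axis once (one stride-2 convolution),
# which would explain max_source_positions being half of nb_max_frames.
encoder_positions = nb_max_frames // 2

print(n_samples, nb_max_frames, encoder_positions)
```

The first two results match the logged `n_samples` and `nb_max_frames`; the third matches `max_source_positions` only under the stated stride assumption.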
[INFO|configuration_utils.py:539] 2023-05-07 10:33:55,680 >> loading configuration file generation_config.json from cache at /home/local/QCRI/dizham/.cache/huggingface/hub/models--openai--whisper-small/snapshots/f6744499d1eba717bcf4d6be735e3d386ffb60ad/generation_config.json [INFO|configuration_utils.py:577] 2023-05-07 10:33:55,681 >> Generate config GenerationConfig { "begin_suppress_tokens": [ 220, 50257 ], "bos_token_id": 50257, "decoder_start_token_id": 50258, "eos_token_id": 50257, "forced_decoder_ids": [ [ 1, null ], [ 2, 50359 ] ], "is_multilingual": true, "lang_to_id": { "<|af|>": 50327, "<|am|>": 50334, "<|ar|>": 50272, "<|as|>": 50350, "<|az|>": 50304, "<|ba|>": 50355, "<|be|>": 50330, "<|bg|>": 50292, "<|bn|>": 50302, "<|bo|>": 50347, "<|br|>": 50309, "<|bs|>": 50315, "<|ca|>": 50270, "<|cs|>": 50283, "<|cy|>": 50297, "<|da|>": 50285, "<|de|>": 50261, "<|el|>": 50281, "<|en|>": 50259, "<|es|>": 50262, "<|et|>": 50307, "<|eu|>": 50310, "<|fa|>": 50300, "<|fi|>": 50277, "<|fo|>": 50338, "<|fr|>": 50265, "<|gl|>": 50319, "<|gu|>": 50333, "<|haw|>": 50352, "<|ha|>": 50354, "<|he|>": 50279, "<|hi|>": 50276, "<|hr|>": 50291, "<|ht|>": 50339, "<|hu|>": 50286, "<|hy|>": 50312, "<|id|>": 50275, "<|is|>": 50311, "<|it|>": 50274, "<|ja|>": 50266, "<|jw|>": 50356, "<|ka|>": 50329, "<|kk|>": 50316, "<|km|>": 50323, "<|kn|>": 50306, "<|ko|>": 50264, "<|la|>": 50294, "<|lb|>": 50345, "<|ln|>": 50353, "<|lo|>": 50336, "<|lt|>": 50293, "<|lv|>": 50301, "<|mg|>": 50349, "<|mi|>": 50295, "<|mk|>": 50308, "<|ml|>": 50296, "<|mn|>": 50314, "<|mr|>": 50320, "<|ms|>": 50282, "<|mt|>": 50343, "<|my|>": 50346, "<|ne|>": 50313, "<|nl|>": 50271, "<|nn|>": 50342, "<|no|>": 50288, "<|oc|>": 50328, "<|pa|>": 50321, "<|pl|>": 50269, "<|ps|>": 50340, "<|pt|>": 50267, "<|ro|>": 50284, "<|ru|>": 50263, "<|sa|>": 50344, "<|sd|>": 50332, "<|si|>": 50322, "<|sk|>": 50298, "<|sl|>": 50305, "<|sn|>": 50324, "<|so|>": 50326, "<|sq|>": 50317, "<|sr|>": 50303, "<|su|>": 50357, "<|sv|>": 50273, 
"<|sw|>": 50318, "<|ta|>": 50287, "<|te|>": 50299, "<|tg|>": 50331, "<|th|>": 50289, "<|tk|>": 50341, "<|tl|>": 50348, "<|tr|>": 50268, "<|tt|>": 50351, "<|uk|>": 50280, "<|ur|>": 50290, "<|uz|>": 50337, "<|vi|>": 50278, "<|yi|>": 50335, "<|yo|>": 50325, "<|zh|>": 50260 }, "max_initial_timestamp_index": 1, "max_length": 448, "no_timestamps_token_id": 50363, "pad_token_id": 50257, "return_timestamps": false, "suppress_tokens": [ 1, 2, 7, 8, 9, 10, 14, 25, 26, 27, 28, 29, 31, 58, 59, 60, 61, 62, 63, 90, 91, 92, 93, 359, 503, 522, 542, 873, 893, 902, 918, 922, 931, 1350, 1853, 1982, 2460, 2627, 3246, 3253, 3268, 3536, 3846, 3961, 4183, 4667, 6585, 6647, 7273, 9061, 9383, 10428, 10929, 11938, 12033, 12331, 12562, 13793, 14157, 14635, 15265, 15618, 16553, 16604, 18362, 18956, 20075, 21675, 22520, 26130, 26161, 26435, 28279, 29464, 31650, 32302, 32470, 36865, 42863, 47425, 49870, 50254, 50258, 50358, 50359, 50360, 50361, 50362 ], "task_to_id": { "transcribe": 50359, "translate": 50358 }, "transformers_version": "4.29.0.dev0" } [INFO|feature_extraction_utils.py:369] 2023-05-07 10:33:56,907 >> Feature extractor saved in ./preprocessor_config.json [INFO|tokenization_utils_base.py:2181] 2023-05-07 10:33:56,915 >> tokenizer config file saved in ./tokenizer_config.json [INFO|tokenization_utils_base.py:2188] 2023-05-07 10:33:56,922 >> Special tokens file saved in ./special_tokens_map.json [INFO|configuration_utils.py:458] 2023-05-07 10:33:57,075 >> Configuration saved in ./config.json [INFO|image_processing_utils.py:307] 2023-05-07 10:33:57,075 >> loading configuration file ./preprocessor_config.json [INFO|feature_extraction_utils.py:467] 2023-05-07 10:33:57,084 >> loading configuration file ./preprocessor_config.json [INFO|feature_extraction_utils.py:511] 2023-05-07 10:33:57,085 >> Feature extractor WhisperFeatureExtractor { "chunk_length": 30, "feature_extractor_type": "WhisperFeatureExtractor", "feature_size": 80, "hop_length": 160, "n_fft": 400, "n_samples": 480000, 
"nb_max_frames": 3000, "padding_side": "right", "padding_value": 0.0, "processor_class": "WhisperProcessor", "return_attention_mask": false, "sampling_rate": 16000 } [INFO|tokenization_utils_base.py:1808] 2023-05-07 10:33:57,086 >> loading file vocab.json [INFO|tokenization_utils_base.py:1808] 2023-05-07 10:33:57,086 >> loading file tokenizer.json [INFO|tokenization_utils_base.py:1808] 2023-05-07 10:33:57,086 >> loading file merges.txt [INFO|tokenization_utils_base.py:1808] 2023-05-07 10:33:57,086 >> loading file normalizer.json [INFO|tokenization_utils_base.py:1808] 2023-05-07 10:33:57,086 >> loading file added_tokens.json [INFO|tokenization_utils_base.py:1808] 2023-05-07 10:33:57,086 >> loading file special_tokens_map.json [INFO|tokenization_utils_base.py:1808] 2023-05-07 10:33:57,086 >> loading file tokenizer_config.json [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,149 >> Adding <|startoftranscript|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,149 >> Adding <|en|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,149 >> Adding <|zh|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,149 >> Adding <|de|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,149 >> Adding <|es|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,149 >> Adding <|ru|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,149 >> Adding <|ko|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,149 >> Adding <|fr|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,149 >> Adding <|ja|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,149 >> Adding <|pt|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,149 >> Adding <|tr|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,149 >> Adding <|pl|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 
10:33:57,149 >> Adding <|ca|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,149 >> Adding <|nl|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,149 >> Adding <|ar|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,149 >> Adding <|sv|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,149 >> Adding <|it|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,149 >> Adding <|id|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,149 >> Adding <|hi|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,149 >> Adding <|fi|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,149 >> Adding <|vi|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,150 >> Adding <|he|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,150 >> Adding <|uk|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,150 >> Adding <|el|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,150 >> Adding <|ms|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,150 >> Adding <|cs|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,150 >> Adding <|ro|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,150 >> Adding <|da|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,150 >> Adding <|hu|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,150 >> Adding <|ta|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,150 >> Adding <|no|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,150 >> Adding <|th|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,150 >> Adding <|ur|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,150 >> Adding <|hr|> to the vocabulary 
[INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,150 >> Adding <|bg|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,150 >> Adding <|lt|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,150 >> Adding <|la|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,150 >> Adding <|mi|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,150 >> Adding <|ml|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,150 >> Adding <|cy|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,150 >> Adding <|sk|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,150 >> Adding <|te|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,150 >> Adding <|fa|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,150 >> Adding <|lv|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,150 >> Adding <|bn|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,150 >> Adding <|sr|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,150 >> Adding <|az|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,150 >> Adding <|sl|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,150 >> Adding <|kn|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,150 >> Adding <|et|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,150 >> Adding <|mk|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,150 >> Adding <|br|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,150 >> Adding <|eu|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,150 >> Adding <|is|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,150 >> Adding <|hy|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,150 >> Adding 
<|ne|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,150 >> Adding <|mn|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,150 >> Adding <|bs|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,150 >> Adding <|kk|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,150 >> Adding <|sq|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,150 >> Adding <|sw|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,150 >> Adding <|gl|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,150 >> Adding <|mr|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,151 >> Adding <|pa|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,151 >> Adding <|si|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,151 >> Adding <|km|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,151 >> Adding <|sn|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,151 >> Adding <|yo|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,151 >> Adding <|so|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,151 >> Adding <|af|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,151 >> Adding <|oc|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,151 >> Adding <|ka|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,151 >> Adding <|be|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,151 >> Adding <|tg|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,151 >> Adding <|sd|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,151 >> Adding <|gu|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,151 >> Adding <|am|> to the vocabulary [INFO|tokenization_utils.py:426] 
2023-05-07 10:33:57,151 >> Adding <|yi|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,151 >> Adding <|lo|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,151 >> Adding <|uz|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,151 >> Adding <|fo|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,151 >> Adding <|ht|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,151 >> Adding <|ps|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,151 >> Adding <|tk|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,151 >> Adding <|nn|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,151 >> Adding <|mt|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,151 >> Adding <|sa|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,151 >> Adding <|lb|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,151 >> Adding <|my|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,151 >> Adding <|bo|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,151 >> Adding <|tl|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,151 >> Adding <|mg|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,151 >> Adding <|as|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,151 >> Adding <|tt|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,151 >> Adding <|haw|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,151 >> Adding <|ln|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,151 >> Adding <|ha|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,151 >> Adding <|ba|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,151 >> Adding <|jw|> to the vocabulary 
[INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,151 >> Adding <|su|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,151 >> Adding <|translate|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,151 >> Adding <|transcribe|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,151 >> Adding <|startoflm|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,151 >> Adding <|startofprev|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,152 >> Adding <|nocaptions|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,152 >> Adding <|notimestamps|> to the vocabulary /home/local/QCRI/dizham/kanari/whisper/whisper-small-ar/./ is already a clone of https://huggingface.co/danielizham/whisper-small-ar. Make sure you pull the latest changes with `repo.git_pull()`. 05/07/2023 10:34:00 - WARNING - huggingface_hub.repository - /home/local/QCRI/dizham/kanari/whisper/whisper-small-ar/./ is already a clone of https://huggingface.co/danielizham/whisper-small-ar. Make sure you pull the latest changes with `repo.git_pull()`. [INFO|trainer.py:565] 2023-05-07 10:34:02,856 >> max_steps is given, it will override any value given in num_train_epochs [INFO|trainer.py:622] 2023-05-07 10:34:02,856 >> Using cuda_amp half precision backend /home/local/QCRI/dizham/miniconda3/envs/whisper/lib/python3.9/site-packages/transformers/optimization.py:407: FutureWarning: This implementation of AdamW is deprecated and will be removed in a future version. 
Use the PyTorch implementation torch.optim.AdamW instead, or set `no_deprecation_warning=True` to disable this warning
  warnings.warn(
[INFO|trainer.py:1771] 2023-05-07 10:34:02,869 >> ***** Running training *****
[INFO|trainer.py:1772] 2023-05-07 10:34:02,869 >> Num examples = 640,000
[INFO|trainer.py:1773] 2023-05-07 10:34:02,869 >> Num Epochs = 9,223,372,036,854,775,807
[INFO|trainer.py:1774] 2023-05-07 10:34:02,870 >> Instantaneous batch size per device = 32
[INFO|trainer.py:1775] 2023-05-07 10:34:02,870 >> Total train batch size (w. parallel, distributed & accumulation) = 128
[INFO|trainer.py:1776] 2023-05-07 10:34:02,870 >> Gradient Accumulation steps = 2
[INFO|trainer.py:1777] 2023-05-07 10:34:02,870 >> Total optimization steps = 5,000
[INFO|trainer.py:1778] 2023-05-07 10:34:02,871 >> Number of trainable parameters = 241,734,912
[INFO|integrations.py:720] 2023-05-07 10:34:02,872 >> Automatic Weights & Biases logging enabled, to disable set os.environ["WANDB_DISABLED"] = "true"
wandb: Currently logged in as: danielizham. Use `wandb login --relogin` to force relogin
wandb: Tracking run with wandb version 0.15.2
wandb: Run data is saved locally in /home/local/QCRI/dizham/kanari/whisper/whisper-small-ar/wandb/run-20230507_103405-9zf5xxpu
wandb: Run `wandb offline` to turn off syncing.
wandb: Syncing run fast-feather-2
wandb: ⭐️ View project at https://wandb.ai/danielizham/huggingface
wandb: 🚀 View run at https://wandb.ai/danielizham/huggingface/runs/9zf5xxpu
0%| | 0/5000 [00:00<?, ?it/s] >> The following columns in the training set don't have a corresponding argument in `WhisperForConditionalGeneration.forward` and have been ignored: input_length. If input_length are not expected by `WhisperForConditionalGeneration.forward`, you can safely ignore this message.
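Note on the "Running training" totals above: they follow from the logged arguments. This is a small cross-check in plain Python; the variable names are illustrative and not taken from the training script. Reading `Num Epochs` as a placeholder for "dataset length unknown" is an inference from the value itself, not something the log states.

```python
# Arguments as logged in Seq2SeqTrainingArguments.
per_device_train_batch_size = 32
n_gpu = 2
gradient_accumulation_steps = 2
max_steps = 5000

# Examples consumed per optimizer step across devices and accumulation.
total_train_batch_size = (
    per_device_train_batch_size * n_gpu * gradient_accumulation_steps
)

# With a fixed step budget, the example count is batch size times steps.
num_examples = total_train_batch_size * max_steps

# The logged "Num Epochs" equals the largest signed 64-bit integer.
logged_num_epochs = 9_223_372_036_854_775_807
is_int64_max = logged_num_epochs == 2**63 - 1

print(total_train_batch_size, num_examples, is_int64_max)
```

Because `max_steps=5000` is set, it takes precedence over `num_train_epochs=3.0`, as the trainer message earlier in the log notes.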
/home/local/QCRI/dizham/miniconda3/envs/whisper/lib/python3.9/site-packages/torch/nn/parallel/_functions.py:68: UserWarning: Was asked to gather along dimension 0, but all input tensors were scalars; will instead unsqueeze and return a vector.
  warnings.warn('Was asked to gather along dimension 0, but all '
0%| | 1/5000 [01:47<148:38:46, 107.05s/it]
0%| | 25/5000 [12:57<38:59:22, 28.21s/it]
1%| | 50/5000 [24:31<38:04:45, 27.69s/it]
2%|▏ | 75/5000 [36:10<38:12:36, 27.93s/it]
2%|▏ | 100/5000 [47:46<37:40:02, 27.67s/it]
2%|▎ | 125/5000 [59:17<37:59:12, 28.05s/it]
3%|▎ | 150/5000 [1:10:46<37:32:21, 27.86s/it]
3%|▎ | 159/5000 [1:14:56<37:18:52, 27.75s/it]
3%|▎ | 160/5000 [1:15:10<31:43:49, 23.60s/it]
3%|▎ | 161/5000 [1:15:21<26:35:31, 19.78s/it]
3%|▎ | 162/5000 [1:15:32<23:01:15, 17.13s/it]
3%|▎ | 163/5000 [1:15:43<20:26:07, 15.21s/it]
{'loss': 0.9879, 'learning_rate': 4.6000000000000004e-07, 'epoch': 0.01}
{'loss': 0.8962, 'learning_rate': 9.600000000000001e-07, 'epoch': 0.01}
{'loss': 0.6006, 'learning_rate': 1.46e-06, 'epoch': 0.01}
{'loss': 0.4218, 'learning_rate': 1.9600000000000003e-06, 'epoch': 0.02}
{'loss': 0.4419, 'learning_rate': 2.46e-06, 'epoch': 0.03}
{'loss': 
0.4007, 'learning_rate': 2.96e-06, 'epoch': 0.03} Reading metadata...: 0it [00:00, ?it/s] Reading metadata...: 1it [00:00, 2.54it/s] Reading metadata...: 15060it [00:00, 40107.89it/s] Reading metadata...: 23919it [00:01, 14960.32it/s] Reading metadata...: 28043it [00:01, 18033.45it/s] Reading metadata...: 0it [00:00, ?it/s] Reading metadata...: 1it [00:01, 1.15s/it] Reading metadata...: 10438it [00:01, 8535.33it/s] 3%|▎ | 164/5000 [1:17:12<50:13:58, 37.39s/it] 3%|▎ | 165/5000 [1:17:40<46:30:54, 34.63s/it] 3%|▎ | 166/5000 [1:18:08<43:58:06, 32.74s/it] 3%|▎ | 167/5000 [1:18:37<42:17:46, 31.51s/it] 3%|▎ | 168/5000 [1:19:04<40:32:14, 30.20s/it] 3%|▎ | 169/5000 [1:19:33<39:51:03, 29.70s/it] 3%|▎ | 170/5000 [1:20:02<39:30:40, 29.45s/it] 3%|▎ | 171/5000 [1:20:31<39:30:47, 29.46s/it] 3%|▎ | 172/5000 [1:20:59<39:02:18, 29.11s/it] 3%|▎ | 173/5000 [1:21:27<38:19:55, 28.59s/it] 3%|▎ | 174/5000 [1:21:55<38:12:08, 28.50s/it] 4%|▎ | 175/5000 [1:22:25<38:42:56, 28.89s/it] 4%|▎ | 175/5000 [1:22:25<38:42:56, 28.89s/it] 4%|▎ | 176/5000 [1:22:52<37:55:34, 28.30s/it] 4%|▎ | 177/5000 [1:23:20<37:58:12, 28.34s/it] 4%|▎ | 178/5000 [1:23:48<37:52:23, 28.28s/it] 4%|▎ | 179/5000 [1:24:16<37:46:36, 28.21s/it] 4%|▎ | 180/5000 [1:24:44<37:33:35, 28.05s/it] 4%|▎ | 181/5000 [1:25:11<37:14:38, 27.82s/it] 4%|▎ | 182/5000 [1:25:44<39:19:43, 29.39s/it] 4%|▎ | 183/5000 [1:26:13<38:47:39, 28.99s/it] 4%|▎ | 184/5000 [1:26:40<38:00:33, 28.41s/it] 4%|▎ | 185/5000 [1:27:08<38:09:17, 28.53s/it] 4%|▎ | 186/5000 [1:27:37<38:02:08, 28.44s/it] 4%|▎ | 187/5000 [1:28:04<37:29:01, 28.04s/it] 4%|▍ | 188/5000 [1:28:33<37:47:02, 28.27s/it] 4%|▍ | 189/5000 [1:29:00<37:39:52, 28.18s/it] 4%|▍ | 190/5000 [1:29:28<37:13:28, 27.86s/it] 4%|▍ | 191/5000 [1:29:56<37:29:50, 28.07s/it] 4%|▍ | 192/5000 [1:30:26<38:20:14, 28.71s/it] 4%|▍ | 193/5000 [1:30:54<37:43:33, 28.25s/it] 4%|▍ | 194/5000 [1:31:23<38:01:21, 28.48s/it] 4%|▍ | 195/5000 [1:31:51<38:09:37, 28.59s/it] 4%|▍ | 196/5000 [1:32:19<37:38:58, 28.21s/it] 4%|▍ | 197/5000 
[1:32:48<37:55:29, 28.43s/it] 4%|▍ | 198/5000 [1:33:15<37:25:58, 28.06s/it] 4%|▍ | 199/5000 [1:33:41<36:50:43, 27.63s/it] 4%|▍ | 200/5000 [1:34:15<39:11:58, 29.40s/it] 4%|▍ | 200/5000 [1:34:15<39:11:58, 29.40s/it] 4%|▍ | 201/5000 [1:34:44<38:50:49, 29.14s/it] 4%|▍ | 202/5000 [1:35:11<38:06:09, 28.59s/it] 4%|▍ | 203/5000 [1:35:39<37:59:39, 28.51s/it] 4%|▍ | 204/5000 [1:36:07<37:39:47, 28.27s/it] 4%|▍ | 205/5000 [1:36:34<37:18:41, 28.01s/it] 4%|▍ | 206/5000 [1:37:03<37:22:26, 28.07s/it] 4%|▍ | 207/5000 [1:37:31<37:39:51, 28.29s/it] 4%|▍ | 208/5000 [1:37:58<36:56:51, 27.76s/it] 4%|▍ | 209/5000 [1:38:26<37:06:54, 27.89s/it] 4%|▍ | 210/5000 [1:38:54<36:58:07, 27.78s/it] 4%|▍ | 211/5000 [1:39:21<36:51:24, 27.71s/it] 4%|▍ | 212/5000 [1:39:48<36:25:39, 27.39s/it] 4%|▍ | 213/5000 [1:40:14<36:06:47, 27.16s/it] 4%|▍ | 214/5000 [1:40:43<36:31:19, 27.47s/it] 4%|▍ | 215/5000 [1:41:10<36:30:44, 27.47s/it] 4%|▍ | 216/5000 [1:41:39<37:02:43, 27.88s/it] 4%|▍ | 217/5000 [1:42:08<37:27:08, 28.19s/it] 4%|▍ | 218/5000 [1:42:35<37:12:10, 28.01s/it] 4%|▍ | 219/5000 [1:43:04<37:27:18, 28.20s/it] 4%|▍ | 220/5000 [1:43:32<37:32:49, 28.28s/it] 4%|▍ | 221/5000 [1:44:00<37:14:15, 28.05s/it] 4%|▍ | 222/5000 [1:44:29<37:34:23, 28.31s/it] 4%|▍ | 223/5000 [1:44:56<37:10:20, 28.01s/it] 4%|▍ | 224/5000 [1:45:23<36:46:48, 27.72s/it] 4%|▍ | 225/5000 [1:45:53<37:34:07, 28.32s/it] 4%|▍ | 225/5000 [1:45:53<37:34:07, 28.32s/it] 5%|▍ | 226/5000 [1:46:20<37:01:08, 27.92s/it] 5%|▍ | 227/5000 [1:46:47<36:45:13, 27.72s/it] 5%|▍ | 228/5000 [1:47:16<37:12:24, 28.07s/it] 5%|▍ | 229/5000 [1:47:45<37:36:57, 28.38s/it] 5%|▍ | 230/5000 [1:48:12<37:03:13, 27.97s/it] 5%|▍ | 231/5000 [1:48:40<37:09:03, 28.04s/it] 5%|▍ | 232/5000 [1:49:09<37:18:54, 28.17s/it] 5%|▍ | 233/5000 [1:49:36<37:00:35, 27.95s/it] 5%|▍ | 234/5000 [1:50:05<37:20:39, 28.21s/it] 5%|▍ | 235/5000 [1:50:33<37:08:04, 28.06s/it] 5%|▍ | 236/5000 [1:51:00<36:46:12, 27.79s/it] 5%|▍ | 237/5000 [1:51:28<37:02:06, 27.99s/it] 5%|▍ | 238/5000 [1:51:57<37:14:26, 
28.15s/it] 5%|▍ | 239/5000 [1:52:26<37:23:32, 28.27s/it] 5%|▍ | 240/5000 [1:52:53<36:58:16, 27.96s/it] 5%|▍ | 241/5000 [1:53:22<37:17:08, 28.21s/it] 5%|▍ | 242/5000 [1:53:50<37:23:16, 28.29s/it] 5%|▍ | 243/5000 [1:54:18<37:04:50, 28.06s/it] 5%|▍ | 244/5000 [1:54:46<37:20:31, 28.27s/it] 5%|▍ | 245/5000 [1:55:15<37:34:31, 28.45s/it] 5%|▍ | 246/5000 [1:55:43<37:08:49, 28.13s/it] 5%|▍ | 247/5000 [1:56:11<37:22:29, 28.31s/it] 5%|▍ | 248/5000 [1:56:40<37:33:11, 28.45s/it] 5%|▍ | 249/5000 [1:57:07<37:05:48, 28.11s/it] 5%|▌ | 250/5000 [1:57:37<37:49:09, 28.66s/it] 5%|▌ | 250/5000 [1:57:37<37:49:09, 28.66s/it] 5%|▌ | 251/5000 [1:58:06<37:46:16, 28.63s/it] 5%|▌ | 252/5000 [1:58:33<37:13:53, 28.23s/it] 5%|▌ | 253/5000 [1:59:01<37:12:43, 28.22s/it] 5%|▌ | 254/5000 [1:59:30<37:21:10, 28.33s/it] 5%|▌ | 255/5000 [1:59:57<36:56:09, 28.02s/it] 5%|▌ | 256/5000 [2:00:26<37:17:02, 28.29s/it] 5%|▌ | 257/5000 [2:00:56<37:42:29, 28.62s/it] 5%|▌ | 258/5000 [2:01:22<36:40:47, 27.85s/it] 5%|▌ | 259/5000 [2:01:50<36:50:29, 27.97s/it] 5%|▌ | 260/5000 [2:02:19<37:11:52, 28.25s/it] 5%|▌ | 261/5000 [2:02:46<36:45:24, 27.92s/it] 5%|▌ | 262/5000 [2:03:15<37:08:28, 28.22s/it] 5%|▌ | 263/5000 [2:03:43<36:55:19, 28.06s/it] 5%|▌ | 264/5000 [2:04:11<36:57:37, 28.09s/it] 5%|▌ | 265/5000 [2:04:38<36:37:24, 27.84s/it] 5%|▌ | 266/5000 [2:05:09<38:02:09, 28.92s/it] 5%|▌ | 267/5000 [2:05:38<37:58:42, 28.89s/it] 5%|▌ | 268/5000 [2:06:06<37:22:13, 28.43s/it] 5%|▌ | 269/5000 [2:06:33<36:50:00, 28.03s/it] 5%|▌ | 270/5000 [2:07:02<37:30:20, 28.55s/it] 5%|▌ | 271/5000 [2:07:29<36:42:57, 27.95s/it] 5%|▌ | 272/5000 [2:07:56<36:14:44, 27.60s/it] 5%|▌ | 273/5000 [2:08:25<37:00:39, 28.19s/it] 5%|▌ | 274/5000 [2:08:53<36:42:42, 27.97s/it] 6%|▌ | 275/5000 [2:09:20<36:22:53, 27.72s/it] 6%|▌ | 275/5000 [2:09:20<36:22:53, 27.72s/it] 6%|▌ | 276/5000 [2:09:50<37:23:24, 28.49s/it] 6%|▌ | 277/5000 [2:10:18<36:54:46, 28.14s/it] 6%|▌ | 278/5000 [2:10:46<37:06:12, 28.29s/it] 6%|▌ | 279/5000 [2:11:15<37:16:37, 28.43s/it] 6%|▌ | 
280/5000 [2:11:43<37:06:44, 28.31s/it] 6%|▌ | 281/5000 [2:12:10<36:43:23, 28.02s/it] 6%|▌ | 282/5000 [2:12:39<36:56:56, 28.19s/it] 6%|▌ | 283/5000 [2:13:07<37:01:41, 28.26s/it] 6%|▌ | 284/5000 [2:13:33<36:08:17, 27.59s/it] 6%|▌ | 285/5000 [2:14:01<36:14:21, 27.67s/it] 6%|▌ | 286/5000 [2:14:30<36:32:41, 27.91s/it] 6%|▌ | 287/5000 [2:14:57<36:20:12, 27.76s/it] 6%|▌ | 288/5000 [2:15:28<37:40:15, 28.78s/it] 6%|▌ | 289/5000 [2:15:57<37:40:20, 28.79s/it] 6%|▌ | 290/5000 [2:16:24<37:05:20, 28.35s/it] 6%|▌ | 291/5000 [2:16:52<36:35:38, 27.98s/it] 6%|▌ | 292/5000 [2:17:21<37:08:29, 28.40s/it] 6%|▌ | 293/5000 [2:17:48<36:35:56, 27.99s/it] 6%|▌ | 294/5000 [2:18:15<36:15:07, 27.73s/it] 6%|▌ | 295/5000 [2:18:45<37:01:27, 28.33s/it] 6%|▌ | 296/5000 [2:19:12<36:38:26, 28.04s/it] 6%|▌ | 297/5000 [2:19:39<36:10:13, 27.69s/it] 6%|▌ | 298/5000 [2:20:09<37:13:08, 28.50s/it] 6%|▌ | 299/5000 [2:20:36<36:26:30, 27.91s/it] 6%|▌ | 300/5000 [2:21:03<36:04:12, 27.63s/it] 6%|▌ | 300/5000 [2:21:03<36:04:12, 27.63s/it] 6%|▌ | 301/5000 [2:21:32<36:29:21, 27.96s/it] 6%|▌ | 302/5000 [2:22:01<36:54:28, 28.28s/it] 6%|▌ | 303/5000 [2:22:28<36:31:45, 28.00s/it] 6%|▌ | 304/5000 [2:22:57<36:51:38, 28.26s/it] 6%|▌ | 305/5000 [2:23:24<36:23:09, 27.90s/it] 6%|▌ | 306/5000 [2:23:51<36:07:02, 27.70s/it] 6%|▌ | 307/5000 [2:24:20<36:39:50, 28.12s/it] 6%|▌ | 308/5000 [2:24:49<36:59:57, 28.39s/it] 6%|▌ | 309/5000 [2:25:17<36:34:23, 28.07s/it] 6%|▌ | 310/5000 [2:25:44<36:21:42, 27.91s/it] 6%|▌ | 311/5000 [2:26:12<36:30:16, 28.03s/it] 6%|▌ | 312/5000 [2:26:40<36:08:38, 27.76s/it] 6%|▋ | 313/5000 [2:27:10<37:11:11, 28.56s/it] 6%|▋ | 314/5000 [2:27:37<36:34:51, 28.10s/it] 6%|▋ | 315/5000 [2:28:04<36:11:05, 27.80s/it] 6%|▋ | 316/5000 [2:28:33<36:35:48, 28.13s/it] 6%|▋ | 317/5000 [2:29:00<36:12:26, 27.83s/it] 6%|▋ | 318/5000 [2:29:27<35:56:52, 27.64s/it] 6%|▋ | 319/5000 [2:29:56<36:27:25, 28.04s/it] 6%|▋ | 320/5000 [2:30:24<36:15:39, 27.89s/it] 6%|▋ | 321/5000 [2:30:51<35:57:36, 27.67s/it] 6%|▋ | 322/5000 
[2:31:20<36:38:12, 28.19s/it] 6%|▋ | 323/5000 [2:31:43<34:18:17, 26.41s/it] 6%|▋ | 324/5000 [2:31:54<28:14:37, 21.74s/it] 6%|▋ | 325/5000 [2:32:04<23:59:08, 18.47s/it] 7%|▋ | 326/5000 [2:32:15<21:01:36, 16.20s/it] 7%|▋ | 327/5000 [2:32:23<17:43:56, 13.66s/it]
{'loss': 0.3592, 'learning_rate': 3.46e-06, 'epoch': 1.0}
{'loss': 0.3448, 'learning_rate': 3.96e-06, 'epoch': 1.01}
{'loss': 0.3673, 'learning_rate': 4.4600000000000005e-06, 'epoch': 1.01}
{'loss': 0.273, 'learning_rate': 4.960000000000001e-06, 'epoch': 1.02}
{'loss': 0.3088, 'learning_rate': 5.460000000000001e-06, 'epoch': 1.02}
{'loss': 0.302, 'learning_rate': 5.9600000000000005e-06, 'epoch': 1.03}
{'loss': 0.2583, 'learning_rate': 6.460000000000001e-06, 'epoch': 1.03}
Reading metadata...: 0it [00:00, ?it/s] Reading metadata...: 1it [00:01, 1.09s/it] Reading metadata...: 15098it [00:01, 17551.49it/s] Reading metadata...: 23979it [00:02, 8389.85it/s] Reading metadata...: 28043it [00:02, 9574.58it/s]
Reading metadata...: 0it [00:00, ?it/s] Reading metadata...: 1it [00:03, 3.94s/it] Reading metadata...: 10438it [00:04, 2601.71it/s]
7%|▋ | 328/5000 [2:34:24<59:27:28, 45.82s/it] 7%|▋ | 329/5000 [2:34:54<53:18:07, 41.08s/it] 7%|▋ | 330/5000 [2:35:21<47:58:08, 36.98s/it] 7%|▋ | 331/5000 [2:35:51<45:08:09, 34.80s/it] 7%|▋ | 332/5000 [2:36:21<43:09:15, 33.28s/it] 7%|▋ | 333/5000 [2:36:51<41:49:05, 32.26s/it] 7%|▋ | 334/5000 [2:37:18<39:53:29, 30.78s/it] 7%|▋ | 335/5000 [2:37:49<39:56:07, 30.82s/it] 7%|▋ | 336/5000 [2:38:19<39:47:55, 30.72s/it] 7%|▋ | 337/5000 [2:38:49<39:19:14, 30.36s/it] 7%|▋ | 338/5000 [2:39:16<38:05:12, 29.41s/it] 7%|▋ | 339/5000 [2:39:46<38:06:08, 29.43s/it] 7%|▋ | 340/5000 [2:40:23<41:17:39, 31.90s/it] 7%|▋ | 341/5000 [2:40:53<40:16:40, 31.12s/it] 7%|▋ | 342/5000 [2:41:20<38:45:32, 29.96s/it] 7%|▋ | 343/5000 [2:41:49<38:35:57, 29.84s/it] 7%|▋ | 344/5000 [2:42:19<38:26:12, 29.72s/it] 7%|▋ | 345/5000 [2:42:46<37:37:18, 29.10s/it] 7%|▋ | 346/5000
[2:43:25<41:28:39, 32.08s/it] 7%|▋ | 347/5000 [2:43:53<39:32:15, 30.59s/it] 7%|▋ | 348/5000 [2:44:20<38:24:11, 29.72s/it] 7%|▋ | 349/5000 [2:44:50<38:33:23, 29.84s/it] 7%|▋ | 350/5000 [2:45:20<38:18:16, 29.66s/it] 7%|▋ | 350/5000 [2:45:20<38:18:16, 29.66s/it] 7%|▋ | 351/5000 [2:45:47<37:25:43, 28.98s/it] 7%|▋ | 352/5000 [2:46:24<40:39:26, 31.49s/it] 7%|▋ | 353/5000 [2:46:52<39:00:37, 30.22s/it] 7%|▋ | 354/5000 [2:47:19<37:50:13, 29.32s/it] 7%|▋ | 355/5000 [2:47:50<38:39:55, 29.97s/it] 7%|▋ | 356/5000 [2:48:18<37:44:03, 29.25s/it] 7%|▋ | 357/5000 [2:48:45<36:53:44, 28.61s/it] 7%|▋ | 358/5000 [2:49:16<37:57:20, 29.44s/it] 7%|▋ | 359/5000 [2:49:44<37:07:42, 28.80s/it] 7%|▋ | 360/5000 [2:50:11<36:34:52, 28.38s/it] 7%|▋ | 361/5000 [2:50:41<37:16:38, 28.93s/it] 7%|▋ | 362/5000 [2:51:08<36:25:25, 28.27s/it] 7%|▋ | 363/5000 [2:51:35<36:03:10, 27.99s/it] 7%|▋ | 364/5000 [2:52:08<37:57:49, 29.48s/it] 7%|▋ | 365/5000 [2:52:36<37:09:38, 28.86s/it] 7%|▋ | 366/5000 [2:53:03<36:26:23, 28.31s/it] 7%|▋ | 367/5000 [2:53:33<37:20:10, 29.01s/it] 7%|▋ | 368/5000 [2:54:00<36:21:56, 28.26s/it] 7%|▋ | 369/5000 [2:54:29<36:32:11, 28.40s/it] 7%|▋ | 370/5000 [2:54:58<36:49:06, 28.63s/it] 7%|▋ | 371/5000 [2:55:25<36:14:47, 28.19s/it] 7%|▋ | 372/5000 [2:55:56<37:19:36, 29.04s/it] 7%|▋ | 373/5000 [2:56:23<36:34:38, 28.46s/it] 7%|▋ | 374/5000 [2:56:51<36:29:03, 28.39s/it] 8%|▊ | 375/5000 [2:57:26<38:44:36, 30.16s/it] 8%|▊ | 375/5000 [2:57:26<38:44:36, 30.16s/it] 8%|▊ | 376/5000 [2:57:53<37:34:56, 29.26s/it] 8%|▊ | 377/5000 [2:58:23<37:52:08, 29.49s/it] 8%|▊ | 378/5000 [2:58:53<37:59:17, 29.59s/it] 8%|▊ | 379/5000 [2:59:20<37:09:54, 28.95s/it] 8%|▊ | 380/5000 [2:59:52<38:14:13, 29.80s/it] 8%|▊ | 381/5000 [3:00:19<37:18:02, 29.07s/it] 8%|▊ | 382/5000 [3:00:46<36:33:53, 28.50s/it] 8%|▊ | 383/5000 [3:01:19<38:02:05, 29.66s/it] 8%|▊ | 384/5000 [3:01:46<37:05:06, 28.92s/it] 8%|▊ | 385/5000 [3:02:13<36:29:52, 28.47s/it] 8%|▊ | 386/5000 [3:02:44<37:20:51, 29.14s/it] 8%|▊ | 387/5000 [3:03:11<36:28:43, 
28.47s/it] 8%|▊ | 388/5000 [3:03:37<35:26:14, 27.66s/it] 8%|▊ | 389/5000 [3:04:08<36:38:24, 28.61s/it] 8%|▊ | 390/5000 [3:04:35<36:09:31, 28.24s/it] 8%|▊ | 391/5000 [3:05:02<35:46:46, 27.95s/it] 8%|▊ | 392/5000 [3:05:32<36:24:56, 28.45s/it] 8%|▊ | 393/5000 [3:05:59<35:54:14, 28.06s/it] 8%|▊ | 394/5000 [3:06:29<36:30:25, 28.53s/it] 8%|▊ | 395/5000 [3:07:02<38:23:21, 30.01s/it] 8%|▊ | 396/5000 [3:07:29<37:17:38, 29.16s/it] 8%|▊ | 397/5000 [3:07:59<37:24:48, 29.26s/it] 8%|▊ | 398/5000 [3:08:26<36:38:14, 28.66s/it] 8%|▊ | 399/5000 [3:08:58<37:48:56, 29.59s/it] 8%|▊ | 400/5000 [3:09:27<37:45:36, 29.55s/it] 8%|▊ | 400/5000 [3:09:27<37:45:36, 29.55s/it] 8%|▊ | 401/5000 [3:09:54<36:51:22, 28.85s/it] 8%|▊ | 402/5000 [3:10:24<37:14:45, 29.16s/it] 8%|▊ | 403/5000 [3:10:55<37:40:44, 29.51s/it] 8%|▊ | 404/5000 [3:11:22<36:45:44, 28.80s/it] 8%|▊ | 405/5000 [3:11:51<37:01:17, 29.00s/it] 8%|▊ | 406/5000 [3:12:26<39:06:48, 30.65s/it] 8%|▊ | 407/5000 [3:12:53<37:47:58, 29.63s/it] 8%|▊ | 408/5000 [3:13:22<37:42:58, 29.57s/it] 8%|▊ | 409/5000 [3:13:52<37:40:01, 29.54s/it] 8%|▊ | 410/5000 [3:14:19<36:46:57, 28.85s/it] 8%|▊ | 411/5000 [3:14:52<38:10:48, 29.95s/it] 8%|▊ | 412/5000 [3:15:20<37:36:03, 29.50s/it] 8%|▊ | 413/5000 [3:15:50<37:35:51, 29.51s/it] 8%|▊ | 414/5000 [3:16:16<36:32:40, 28.69s/it] 8%|▊ | 415/5000 [3:16:46<36:43:34, 28.84s/it] 8%|▊ | 416/5000 [3:17:15<37:01:20, 29.08s/it] 8%|▊ | 417/5000 [3:17:42<36:02:01, 28.30s/it] 8%|▊ | 418/5000 [3:18:11<36:23:04, 28.59s/it] 8%|▊ | 419/5000 [3:18:40<36:35:37, 28.76s/it] 8%|▊ | 420/5000 [3:19:07<35:55:13, 28.23s/it] 8%|▊ | 421/5000 [3:19:37<36:31:35, 28.72s/it] 8%|▊ | 422/5000 [3:20:06<36:44:24, 28.89s/it] 8%|▊ | 423/5000 [3:20:34<36:09:13, 28.44s/it] 8%|▊ | 424/5000 [3:21:03<36:39:46, 28.84s/it] 8%|▊ | 425/5000 [3:21:33<36:52:00, 29.01s/it] 8%|▊ | 425/5000 [3:21:33<36:52:00, 29.01s/it] 9%|▊ | 426/5000 [3:22:00<36:05:51, 28.41s/it] 9%|▊ | 427/5000 [3:22:30<36:43:01, 28.90s/it] 9%|▊ | 428/5000 [3:22:59<36:49:08, 28.99s/it] 9%|▊ | 
429/5000 [3:23:26<36:07:18, 28.45s/it] 9%|▊ | 430/5000 [3:23:58<37:29:11, 29.53s/it] 9%|▊ | 431/5000 [3:24:26<36:36:32, 28.84s/it] 9%|▊ | 432/5000 [3:24:53<35:57:34, 28.34s/it] 9%|▊ | 433/5000 [3:25:25<37:21:31, 29.45s/it] 9%|▊ | 434/5000 [3:25:52<36:34:29, 28.84s/it] 9%|▊ | 435/5000 [3:26:19<35:56:08, 28.34s/it] 9%|▊ | 436/5000 [3:26:50<36:42:42, 28.96s/it] 9%|▊ | 437/5000 [3:27:20<37:05:57, 29.27s/it] 9%|▉ | 438/5000 [3:27:47<36:12:53, 28.58s/it] 9%|▉ | 439/5000 [3:28:17<36:41:46, 28.96s/it] 9%|▉ | 440/5000 [3:28:46<36:55:46, 29.15s/it] 9%|▉ | 441/5000 [3:29:13<36:09:48, 28.56s/it] 9%|▉ | 442/5000 [3:29:43<36:28:07, 28.80s/it] 9%|▉ | 443/5000 [3:30:12<36:43:21, 29.01s/it] 9%|▉ | 444/5000 [3:30:42<36:55:59, 29.18s/it] 9%|▉ | 445/5000 [3:31:09<36:02:45, 28.49s/it] 9%|▉ | 446/5000 [3:31:40<37:05:25, 29.32s/it] 9%|▉ | 447/5000 [3:32:07<36:13:45, 28.65s/it] 9%|▉ | 448/5000 [3:32:34<35:38:07, 28.18s/it] 9%|▉ | 449/5000 [3:33:05<36:30:32, 28.88s/it] 9%|▉ | 450/5000 [3:33:32<35:54:20, 28.41s/it] 9%|▉ | 450/5000 [3:33:32<35:54:20, 28.41s/it] 9%|▉ | 451/5000 [3:33:59<35:22:36, 28.00s/it] 9%|▉ | 452/5000 [3:34:31<36:52:54, 29.19s/it] 9%|▉ | 453/5000 [3:34:58<36:06:32, 28.59s/it] 9%|▉ | 454/5000 [3:35:25<35:30:53, 28.12s/it] 9%|▉ | 455/5000 [3:35:57<37:00:03, 29.31s/it] 9%|▉ | 456/5000 [3:36:24<36:06:07, 28.60s/it] 9%|▉ | 457/5000 [3:36:51<35:17:38, 27.97s/it] 9%|▉ | 458/5000 [3:37:17<34:37:21, 27.44s/it] 9%|▉ | 459/5000 [3:37:49<36:31:38, 28.96s/it] 9%|▉ | 460/5000 [3:38:17<35:49:57, 28.41s/it] 9%|▉ | 461/5000 [3:38:44<35:16:34, 27.98s/it] 9%|▉ | 462/5000 [3:39:14<36:11:48, 28.71s/it] 9%|▉ | 463/5000 [3:39:41<35:42:23, 28.33s/it] 9%|▉ | 464/5000 [3:40:08<35:02:57, 27.82s/it] 9%|▉ | 465/5000 [3:40:40<36:28:00, 28.95s/it] 9%|▉ | 466/5000 [3:41:07<35:48:57, 28.44s/it] 9%|▉ | 467/5000 [3:41:34<35:21:06, 28.08s/it] 9%|▉ | 468/5000 [3:42:06<36:39:29, 29.12s/it] 9%|▉ | 469/5000 [3:42:33<36:00:39, 28.61s/it] 9%|▉ | 470/5000 [3:43:00<35:25:08, 28.15s/it] 9%|▉ | 471/5000 
[3:43:32<36:57:59, 29.38s/it] 9%|▉ | 472/5000 [3:43:59<36:04:29, 28.68s/it] 9%|▉ | 473/5000 [3:44:27<35:39:16, 28.35s/it] 9%|▉ | 474/5000 [3:44:57<36:23:20, 28.94s/it] 10%|▉ | 475/5000 [3:45:25<35:46:58, 28.47s/it] 10%|▉ | 476/5000 [3:45:52<35:21:38, 28.14s/it] 10%|▉ | 477/5000 [3:46:28<38:13:45, 30.43s/it] 10%|▉ | 478/5000 [3:46:55<37:06:11, 29.54s/it] 10%|▉ | 479/5000 [3:47:21<35:39:21, 28.39s/it] 10%|▉ | 480/5000 [3:47:49<35:32:51, 28.31s/it] 10%|▉ | 481/5000 [3:48:16<34:57:24, 27.85s/it] 10%|▉ | 482/5000 [3:48:43<34:46:50, 27.71s/it] 10%|▉ | 483/5000 [3:49:16<36:38:31, 29.20s/it] 10%|▉ | 484/5000 [3:49:43<35:54:50, 28.63s/it] 10%|▉ | 485/5000 [3:50:10<35:19:14, 28.16s/it] 10%|▉ | 486/5000 [3:50:43<36:49:28, 29.37s/it] 10%|▉ | 487/5000 [3:50:57<31:01:29, 24.75s/it] 10%|▉ | 488/5000 [3:51:07<25:45:30, 20.55s/it] 10%|▉ | 489/5000 [3:51:18<22:05:59, 17.64s/it] 10%|▉ | 490/5000 [3:51:29<19:35:34, 15.64s/it]
{'loss': 0.2545, 'learning_rate': 6.96e-06, 'epoch': 2.0}
{'loss': 0.2599, 'learning_rate': 7.4600000000000006e-06, 'epoch': 2.01}
{'loss': 0.2464, 'learning_rate': 7.960000000000002e-06, 'epoch': 2.01}
{'loss': 0.1981, 'learning_rate': 8.46e-06, 'epoch': 2.02}
{'loss': 0.2316, 'learning_rate': 8.96e-06, 'epoch': 2.02}
{'loss': 0.2077, 'learning_rate': 9.460000000000001e-06, 'epoch': 2.03}
Reading metadata...: 0it [00:00, ?it/s] Reading metadata...: 1it [00:02, 2.22s/it] Reading metadata...: 15016it [00:02, 9088.72it/s] Reading metadata...: 23848it [00:02, 13351.04it/s] Reading metadata...: 28043it [00:02, 10627.78it/s]
Reading metadata...: 0it [00:00, ?it/s] Reading metadata...: 1it [00:01, 1.89s/it] Reading metadata...: 10438it [00:01, 5308.23it/s]
10%|▉ | 491/5000 [3:53:06<49:59:47, 39.92s/it] 10%|▉ | 492/5000 [3:53:34<45:28:45, 36.32s/it] 10%|▉ | 493/5000 [3:54:01<42:16:20, 33.77s/it] 10%|▉ | 494/5000 [3:54:30<40:29:47, 32.35s/it] 10%|▉ | 495/5000 [3:54:58<38:41:25, 30.92s/it] 10%|▉ | 496/5000 [3:55:26<37:40:40,
30.12s/it] 10%|▉ | 497/5000 [3:55:57<37:50:39, 30.26s/it] 10%|▉ | 498/5000 [3:56:25<36:55:39, 29.53s/it] 10%|▉ | 499/5000 [3:56:53<36:39:25, 29.32s/it] 10%|█ | 500/5000 [3:57:20<35:45:29, 28.61s/it] 10%|█ | 500/5000 [3:57:20<35:45:29, 28.61s/it] 10%|█ | 501/5000 [3:57:50<36:10:29, 28.95s/it] 10%|█ | 502/5000 [3:58:17<35:22:32, 28.31s/it] 10%|█ | 503/5000 [3:58:44<34:59:39, 28.01s/it] 10%|█ | 504/5000 [3:59:13<35:18:43, 28.27s/it] 10%|█ | 505/5000 [3:59:41<35:07:06, 28.13s/it] 10%|█ | 506/5000 [4:00:09<35:10:42, 28.18s/it] 10%|█ | 507/5000 [4:00:37<35:09:58, 28.18s/it] 10%|█ | 508/5000 [4:01:05<34:52:29, 27.95s/it] 10%|█ | 509/5000 [4:01:33<34:57:53, 28.03s/it] 10%|█ | 510/5000 [4:02:01<34:58:48, 28.05s/it] 10%|█ | 511/5000 [4:02:28<34:36:24, 27.75s/it] 10%|█ | 512/5000 [4:02:57<34:56:20, 28.03s/it] 10%|█ | 513/5000 [4:03:25<34:47:39, 27.92s/it] 10%|█ | 514/5000 [4:03:52<34:30:48, 27.70s/it] 10%|█ | 515/5000 [4:04:19<34:25:26, 27.63s/it] 10%|█ | 516/5000 [4:04:47<34:17:59, 27.54s/it] 10%|█ | 517/5000 [4:05:14<34:12:29, 27.47s/it] 10%|█ | 518/5000 [4:05:41<34:10:29, 27.45s/it] 10%|█ | 519/5000 [4:06:09<34:24:54, 27.65s/it] 10%|█ | 520/5000 [4:06:36<34:07:13, 27.42s/it] 10%|█ | 521/5000 [4:07:08<35:46:29, 28.75s/it] 10%|█ | 522/5000 [4:07:36<35:18:23, 28.38s/it] 10%|█ | 523/5000 [4:08:03<34:47:17, 27.97s/it] 10%|█ | 524/5000 [4:08:32<35:23:31, 28.47s/it] 10%|█ | 525/5000 [4:09:00<34:54:55, 28.09s/it] 10%|█ | 525/5000 [4:09:00<34:54:55, 28.09s/it] 11%|█ | 526/5000 [4:09:25<33:59:55, 27.36s/it] 11%|█ | 527/5000 [4:09:51<33:27:38, 26.93s/it] 11%|█ | 528/5000 [4:10:19<33:45:53, 27.18s/it] 11%|█ | 529/5000 [4:10:46<33:44:51, 27.17s/it] 11%|█ | 530/5000 [4:11:15<34:14:15, 27.57s/it] 11%|█ | 531/5000 [4:11:42<34:10:56, 27.54s/it] 11%|█ | 532/5000 [4:12:09<34:01:11, 27.41s/it] 11%|█ | 533/5000 [4:12:38<34:34:56, 27.87s/it] 11%|█ | 534/5000 [4:13:06<34:35:00, 27.88s/it] 11%|█ | 535/5000 [4:13:33<34:18:03, 27.66s/it] 11%|█ | 536/5000 [4:14:01<34:30:46, 27.83s/it] 11%|█ | 
537/5000 [4:14:29<34:28:24, 27.81s/it] 11%|█ | 538/5000 [4:14:57<34:24:07, 27.76s/it] 11%|█ | 539/5000 [4:15:24<34:11:36, 27.59s/it] 11%|█ | 540/5000 [4:15:51<34:07:40, 27.55s/it] 11%|█ | 541/5000 [4:16:20<34:33:30, 27.90s/it] 11%|█ | 542/5000 [4:16:47<34:20:33, 27.73s/it] 11%|█ | 543/5000 [4:17:15<34:24:08, 27.79s/it] 11%|█ | 544/5000 [4:17:43<34:25:59, 27.82s/it] 11%|█ | 545/5000 [4:18:10<34:09:36, 27.60s/it] 11%|█ | 546/5000 [4:18:38<34:19:04, 27.74s/it] 11%|█ | 547/5000 [4:19:06<34:21:15, 27.77s/it] 11%|█ | 548/5000 [4:19:34<34:13:57, 27.68s/it] 11%|█ | 549/5000 [4:20:03<34:45:28, 28.11s/it] 11%|█ | 550/5000 [4:20:30<34:27:46, 27.88s/it] 11%|█ | 550/5000 [4:20:30<34:27:46, 27.88s/it] 11%|█ | 551/5000 [4:20:57<34:11:41, 27.67s/it] 11%|█ | 552/5000 [4:21:26<34:33:00, 27.96s/it] 11%|█ | 553/5000 [4:21:53<34:17:24, 27.76s/it] 11%|█ | 554/5000 [4:22:21<34:08:34, 27.65s/it] 11%|█ | 555/5000 [4:22:49<34:26:50, 27.90s/it] 11%|█ | 556/5000 [4:23:17<34:28:07, 27.92s/it] 11%|█ | 557/5000 [4:23:44<34:10:44, 27.69s/it] 11%|█ | 558/5000 [4:24:12<34:00:19, 27.56s/it] 11%|█ | 559/5000 [4:24:40<34:23:16, 27.88s/it] 11%|█ | 560/5000 [4:25:08<34:11:49, 27.73s/it] 11%|█ | 561/5000 [4:25:36<34:17:53, 27.82s/it] 11%|█ | 562/5000 [4:26:04<34:35:00, 28.05s/it] 11%|█▏ | 563/5000 [4:26:31<33:58:17, 27.56s/it] 11%|█▏ | 564/5000 [4:26:58<34:00:51, 27.60s/it] 11%|█▏ | 565/5000 [4:27:26<34:07:16, 27.70s/it] 11%|█▏ | 566/5000 [4:27:54<34:20:15, 27.88s/it] 11%|█▏ | 567/5000 [4:28:22<34:03:53, 27.66s/it] 11%|█▏ | 568/5000 [4:28:50<34:09:43, 27.75s/it] 11%|█▏ | 569/5000 [4:29:17<34:10:34, 27.77s/it] 11%|█▏ | 570/5000 [4:29:45<33:56:18, 27.58s/it] 11%|█▏ | 571/5000 [4:30:13<34:10:09, 27.77s/it] 11%|█▏ | 572/5000 [4:30:41<34:11:42, 27.80s/it] 11%|█▏ | 573/5000 [4:31:08<33:58:26, 27.63s/it] 11%|█▏ | 574/5000 [4:31:36<34:02:43, 27.69s/it] 12%|█▏ | 575/5000 [4:32:04<34:05:04, 27.73s/it] 12%|█▏ | 575/5000 [4:32:04<34:05:04, 27.73s/it] 12%|█▏ | 576/5000 [4:32:31<33:53:20, 27.58s/it] 12%|█▏ | 577/5000 
[4:32:58<33:43:16, 27.45s/it] 12%|█▏ | 578/5000 [4:33:25<33:25:33, 27.21s/it] 12%|█▏ | 579/5000 [4:33:52<33:22:58, 27.18s/it] 12%|█▏ | 580/5000 [4:34:19<33:30:08, 27.29s/it] 12%|█▏ | 581/5000 [4:34:47<33:40:18, 27.43s/it] 12%|█▏ | 582/5000 [4:35:14<33:32:07, 27.33s/it] 12%|█▏ | 583/5000 [4:35:42<33:55:35, 27.65s/it] 12%|█▏ | 584/5000 [4:36:10<33:48:17, 27.56s/it] 12%|█▏ | 585/5000 [4:36:37<33:41:31, 27.47s/it] 12%|█▏ | 586/5000 [4:37:05<33:56:36, 27.68s/it] 12%|█▏ | 587/5000 [4:37:33<33:58:42, 27.72s/it] 12%|█▏ | 588/5000 [4:38:00<33:51:26, 27.63s/it] 12%|█▏ | 589/5000 [4:38:29<34:04:58, 27.82s/it] 12%|█▏ | 590/5000 [4:38:56<34:03:34, 27.80s/it] 12%|█▏ | 591/5000 [4:39:24<33:57:32, 27.73s/it] 12%|█▏ | 592/5000 [4:39:51<33:42:24, 27.53s/it] 12%|█▏ | 593/5000 [4:40:19<33:58:36, 27.76s/it] 12%|█▏ | 594/5000 [4:40:47<34:02:31, 27.81s/it] 12%|█▏ | 595/5000 [4:41:15<33:51:44, 27.67s/it] 12%|█▏ | 596/5000 [4:41:42<33:39:13, 27.51s/it] 12%|█▏ | 597/5000 [4:42:10<33:58:56, 27.78s/it] 12%|█▏ | 598/5000 [4:42:37<33:42:46, 27.57s/it] 12%|█▏ | 599/5000 [4:43:04<33:30:34, 27.41s/it] 12%|█▏ | 600/5000 [4:43:33<34:06:44, 27.91s/it] 12%|█▏ | 600/5000 [4:43:33<34:06:44, 27.91s/it] 12%|█▏ | 601/5000 [4:44:01<33:53:07, 27.73s/it] 12%|█▏ | 602/5000 [4:44:28<33:41:18, 27.58s/it] 12%|█▏ | 603/5000 [4:44:59<35:04:30, 28.72s/it] 12%|█▏ | 604/5000 [4:45:27<34:38:09, 28.36s/it] 12%|█▏ | 605/5000 [4:45:55<34:34:59, 28.33s/it] 12%|█▏ | 606/5000 [4:46:23<34:20:01, 28.13s/it] 12%|█▏ | 607/5000 [4:46:51<34:20:45, 28.15s/it] 12%|█▏ | 608/5000 [4:47:18<34:01:48, 27.89s/it] 12%|█▏ | 609/5000 [4:47:46<34:00:57, 27.89s/it] 12%|█▏ | 610/5000 [4:48:14<34:06:23, 27.97s/it] 12%|█▏ | 611/5000 [4:48:41<33:49:16, 27.74s/it] 12%|█▏ | 612/5000 [4:49:10<34:06:46, 27.99s/it] 12%|█▏ | 613/5000 [4:49:38<34:00:04, 27.90s/it] 12%|█▏ | 614/5000 [4:50:05<33:43:00, 27.67s/it] 12%|█▏ | 615/5000 [4:50:32<33:37:13, 27.60s/it] 12%|█▏ | 616/5000 [4:51:00<33:39:36, 27.64s/it] 12%|█▏ | 617/5000 [4:51:27<33:30:54, 27.53s/it] 
12%|█▏ | 618/5000 [4:51:55<33:23:37, 27.43s/it] 12%|█▏ | 619/5000 [4:52:23<33:51:14, 27.82s/it] 12%|█▏ | 620/5000 [4:52:50<33:32:09, 27.56s/it] 12%|█▏ | 621/5000 [4:53:17<33:20:34, 27.41s/it] 12%|█▏ | 622/5000 [4:53:45<33:35:22, 27.62s/it] 12%|█▏ | 623/5000 [4:54:13<33:30:25, 27.56s/it] 12%|█▏ | 624/5000 [4:54:40<33:11:34, 27.31s/it] 12%|█▎ | 625/5000 [4:55:08<33:43:52, 27.76s/it] 12%|█▎ | 625/5000 [4:55:08<33:43:52, 27.76s/it] 13%|█▎ | 626/5000 [4:55:36<33:33:46, 27.62s/it] 13%|█▎ | 627/5000 [4:56:03<33:23:47, 27.49s/it] 13%|█▎ | 628/5000 [4:56:32<33:50:09, 27.86s/it] 13%|█▎ | 629/5000 [4:57:00<33:54:49, 27.93s/it] 13%|█▎ | 630/5000 [4:57:27<33:38:47, 27.72s/it] 13%|█▎ | 631/5000 [4:57:55<33:41:28, 27.76s/it] 13%|█▎ | 632/5000 [4:58:22<33:38:57, 27.73s/it] 13%|█▎ | 633/5000 [4:58:49<33:19:08, 27.47s/it] 13%|█▎ | 634/5000 [4:59:17<33:30:40, 27.63s/it] 13%|█▎ | 635/5000 [4:59:44<33:21:27, 27.51s/it] 13%|█▎ | 636/5000 [5:00:12<33:16:18, 27.45s/it] 13%|█▎ | 637/5000 [5:00:40<33:31:57, 27.67s/it] 13%|█▎ | 638/5000 [5:01:08<33:39:37, 27.78s/it] 13%|█▎ | 639/5000 [5:01:35<33:26:38, 27.61s/it] 13%|█▎ | 640/5000 [5:02:04<34:00:57, 28.09s/it] 13%|█▎ | 641/5000 [5:02:32<33:44:56, 27.87s/it] 13%|█▎ | 642/5000 [5:02:59<33:29:59, 27.67s/it] 13%|█▎ | 643/5000 [5:03:28<33:47:23, 27.92s/it] 13%|█▎ | 644/5000 [5:03:54<33:15:04, 27.48s/it] 13%|█▎ | 645/5000 [5:04:21<33:01:41, 27.30s/it] 13%|█▎ | 646/5000 [5:04:48<32:48:05, 27.12s/it] 13%|█▎ | 647/5000 [5:05:16<33:09:03, 27.42s/it] 13%|█▎ | 648/5000 [5:05:43<32:59:56, 27.30s/it] 13%|█▎ | 649/5000 [5:06:12<33:44:52, 27.92s/it] 13%|█▎ | 650/5000 [5:06:34<31:39:01, 26.19s/it] 13%|█▎ | 650/5000 [5:06:34<31:39:01, 26.19s/it] 13%|█▎ | 651/5000 [5:06:45<26:03:56, 21.58s/it] 13%|█▎ | 652/5000 [5:06:56<22:15:49, 18.43s/it] 13%|█▎ | 653/5000 [5:07:07<19:32:01, 16.18s/it] 13%|█▎ | 654/5000 [5:07:15<16:27:10, 13.63s/it]{'loss': 0.1836, 'learning_rate': 9.960000000000001e-06, 'epoch': 3.0} {'loss': 0.2001, 'learning_rate': 9.94888888888889e-06, 
'epoch': 3.01} {'loss': 0.1875, 'learning_rate': 9.893333333333334e-06, 'epoch': 3.01} {'loss': 0.15, 'learning_rate': 9.837777777777778e-06, 'epoch': 3.02} {'loss': 0.1534, 'learning_rate': 9.782222222222222e-06, 'epoch': 3.02} {'loss': 0.1565, 'learning_rate': 9.726666666666668e-06, 'epoch': 3.03} {'loss': 0.1266, 'learning_rate': 9.671111111111112e-06, 'epoch': 3.03} Reading metadata...: 0it [00:00, ?it/s] Reading metadata...: 1it [00:00, 2.40it/s] Reading metadata...: 13887it [00:00, 35502.94it/s] Reading metadata...: 22056it [00:00, 29691.65it/s] Reading metadata...: 28043it [00:00, 31651.18it/s] Reading metadata...: 0it [00:00, ?it/s] Reading metadata...: 1it [00:00, 3.34it/s] Reading metadata...: 10438it [00:00, 28332.17it/s] 13%|█▎ | 655/5000 [5:08:52<46:55:10, 38.87s/it] 13%|█▎ | 656/5000 [5:09:21<43:01:04, 35.65s/it] 13%|█▎ | 657/5000 [5:09:48<39:53:00, 33.06s/it] 13%|█▎ | 658/5000 [5:10:16<38:12:56, 31.69s/it] 13%|█▎ | 659/5000 [5:10:45<37:11:39, 30.85s/it] 13%|█▎ | 660/5000 [5:11:13<36:08:48, 29.98s/it] 13%|█▎ | 661/5000 [5:11:40<35:07:56, 29.15s/it] 13%|█▎ | 662/5000 [5:12:08<34:35:15, 28.70s/it] 13%|█▎ | 663/5000 [5:12:36<34:21:00, 28.51s/it] 13%|█▎ | 664/5000 [5:13:04<34:13:30, 28.42s/it] 13%|█▎ | 665/5000 [5:13:31<33:46:38, 28.05s/it] 13%|█▎ | 666/5000 [5:14:00<34:02:21, 28.27s/it] 13%|█▎ | 667/5000 [5:14:28<33:49:20, 28.10s/it] 13%|█▎ | 668/5000 [5:14:56<34:02:05, 28.28s/it] 13%|█▎ | 669/5000 [5:15:24<33:41:22, 28.00s/it] 13%|█▎ | 670/5000 [5:15:52<33:34:18, 27.91s/it] 13%|█▎ | 671/5000 [5:16:20<33:35:14, 27.93s/it] 13%|█▎ | 672/5000 [5:16:47<33:18:26, 27.70s/it] 13%|█▎ | 673/5000 [5:17:17<34:06:49, 28.38s/it] 13%|█▎ | 674/5000 [5:17:44<33:41:46, 28.04s/it] 14%|█▎ | 675/5000 [5:18:11<33:22:09, 27.78s/it] 14%|█▎ | 675/5000 [5:18:11<33:22:09, 27.78s/it] 14%|█▎ | 676/5000 [5:18:39<33:28:46, 27.87s/it] 14%|█▎ | 677/5000 [5:19:06<33:03:48, 27.53s/it] 14%|█▎ | 678/5000 [5:19:32<32:31:54, 27.10s/it] 14%|█▎ | 679/5000 [5:20:00<32:46:53, 27.31s/it] 14%|█▎ | 
680/5000 [5:20:26<32:28:07, 27.06s/it] 14%|█▎ | 681/5000 [5:20:53<32:23:40, 27.00s/it] 14%|█▎ | 682/5000 [5:21:21<32:45:28, 27.31s/it] 14%|█▎ | 683/5000 [5:21:48<32:40:09, 27.24s/it] 14%|█▎ | 684/5000 [5:22:16<32:41:46, 27.27s/it] 14%|█▎ | 685/5000 [5:22:44<33:11:01, 27.69s/it] 14%|█▎ | 686/5000 [5:23:12<33:07:35, 27.64s/it] 14%|█▎ | 687/5000 [5:23:39<32:59:00, 27.53s/it] 14%|█▍ | 688/5000 [5:24:08<33:19:27, 27.82s/it] 14%|█▍ | 689/5000 [5:24:34<32:52:08, 27.45s/it] 14%|█▍ | 690/5000 [5:25:02<32:50:17, 27.43s/it] 14%|█▍ | 691/5000 [5:25:29<32:47:05, 27.39s/it] 14%|█▍ | 692/5000 [5:25:55<32:30:49, 27.17s/it] 14%|█▍ | 693/5000 [5:26:22<32:24:10, 27.08s/it] 14%|█▍ | 694/5000 [5:26:50<32:34:11, 27.23s/it] 14%|█▍ | 695/5000 [5:27:17<32:33:06, 27.22s/it] 14%|█▍ | 696/5000 [5:27:45<32:48:17, 27.44s/it] 14%|█▍ | 697/5000 [5:28:13<33:02:45, 27.65s/it] 14%|█▍ | 698/5000 [5:28:40<32:49:27, 27.47s/it] 14%|█▍ | 699/5000 [5:29:08<32:58:14, 27.60s/it] 14%|█▍ | 700/5000 [5:29:36<32:52:54, 27.53s/it] 14%|█▍ | 700/5000 [5:29:36<32:52:54, 27.53s/it] 14%|█▍ | 701/5000 [5:30:03<32:59:06, 27.62s/it] 14%|█▍ | 702/5000 [5:30:31<33:01:56, 27.67s/it] 14%|█▍ | 703/5000 [5:30:58<32:42:15, 27.40s/it] 14%|█▍ | 704/5000 [5:31:27<33:14:49, 27.86s/it] 14%|█▍ | 705/5000 [5:31:55<33:10:53, 27.81s/it] 14%|█▍ | 706/5000 [5:32:22<33:01:40, 27.69s/it] 14%|█▍ | 707/5000 [5:32:51<33:27:50, 28.06s/it] 14%|█▍ | 708/5000 [5:33:18<33:11:38, 27.84s/it] 14%|█▍ | 709/5000 [5:33:45<32:58:58, 27.67s/it] 14%|█▍ | 710/5000 [5:34:14<33:20:25, 27.98s/it] 14%|█▍ | 711/5000 [5:34:41<33:03:46, 27.75s/it] 14%|█▍ | 712/5000 [5:35:09<32:58:54, 27.69s/it] 14%|█▍ | 713/5000 [5:35:38<33:29:44, 28.13s/it] 14%|█▍ | 714/5000 [5:36:05<33:12:27, 27.89s/it] 14%|█▍ | 715/5000 [5:36:33<33:01:30, 27.75s/it] 14%|█▍ | 716/5000 [5:37:05<34:30:24, 29.00s/it] 14%|█▍ | 717/5000 [5:37:32<33:45:23, 28.37s/it] 14%|█▍ | 718/5000 [5:37:59<33:22:11, 28.06s/it] 14%|█▍ | 719/5000 [5:38:29<34:12:47, 28.77s/it] 14%|█▍ | 720/5000 [5:38:57<33:39:21, 
28.31s/it] 14%|█▍ | 721/5000 [5:39:25<33:32:35, 28.22s/it] 14%|█▍ | 722/5000 [5:39:53<33:33:52, 28.25s/it] 14%|█▍ | 723/5000 [5:40:20<33:12:43, 27.96s/it] 14%|█▍ | 724/5000 [5:40:48<33:17:00, 28.02s/it] 14%|█▍ | 725/5000 [5:41:16<32:59:03, 27.78s/it] 14%|█▍ | 725/5000 [5:41:16<32:59:03, 27.78s/it] 15%|█▍ | 726/5000 [5:41:43<32:57:34, 27.76s/it] 15%|█▍ | 727/5000 [5:42:11<32:46:32, 27.61s/it] 15%|█▍ | 728/5000 [5:42:38<32:38:55, 27.51s/it] 15%|█▍ | 729/5000 [5:43:06<32:51:03, 27.69s/it] 15%|█▍ | 730/5000 [5:43:34<33:03:15, 27.87s/it] 15%|█▍ | 731/5000 [5:44:02<32:55:06, 27.76s/it] 15%|█▍ | 732/5000 [5:44:30<33:04:30, 27.90s/it] 15%|█▍ | 733/5000 [5:44:58<33:12:34, 28.02s/it] 15%|█▍ | 734/5000 [5:45:26<32:58:22, 27.83s/it] 15%|█▍ | 735/5000 [5:45:54<33:00:14, 27.86s/it] 15%|█▍ | 736/5000 [5:46:21<32:58:52, 27.85s/it] 15%|█▍ | 737/5000 [5:46:49<32:46:49, 27.68s/it] 15%|█▍ | 738/5000 [5:47:16<32:45:39, 27.67s/it] 15%|█▍ | 739/5000 [5:47:45<32:58:14, 27.86s/it] 15%|█▍ | 740/5000 [5:48:12<32:53:57, 27.80s/it] 15%|█▍ | 741/5000 [5:48:40<32:41:30, 27.63s/it] 15%|█▍ | 742/5000 [5:49:13<34:39:06, 29.30s/it] 15%|█▍ | 743/5000 [5:49:42<34:44:37, 29.38s/it] 15%|█▍ | 744/5000 [5:50:09<33:50:56, 28.63s/it] 15%|█▍ | 745/5000 [5:50:38<33:44:17, 28.54s/it] 15%|█▍ | 746/5000 [5:51:06<33:31:41, 28.37s/it] 15%|█▍ | 747/5000 [5:51:32<32:53:41, 27.84s/it] 15%|█▍ | 748/5000 [5:52:00<33:01:45, 27.96s/it] 15%|█▍ | 749/5000 [5:52:28<32:45:54, 27.75s/it] 15%|█▌ | 750/5000 [5:52:55<32:38:30, 27.65s/it] 15%|█▌ | 750/5000 [5:52:55<32:38:30, 27.65s/it] 15%|█▌ | 751/5000 [5:53:23<32:35:57, 27.62s/it] 15%|█▌ | 752/5000 [5:53:51<32:57:59, 27.94s/it] 15%|█▌ | 753/5000 [5:54:19<32:43:21, 27.74s/it] 15%|█▌ | 754/5000 [5:54:47<32:54:30, 27.90s/it] 15%|█▌ | 755/5000 [5:55:15<33:06:21, 28.08s/it] 15%|█▌ | 756/5000 [5:55:43<32:51:02, 27.87s/it] 15%|█▌ | 757/5000 [5:56:11<33:07:55, 28.11s/it] 15%|█▌ | 758/5000 [5:56:39<32:52:04, 27.89s/it] 15%|█▌ | 759/5000 [5:57:06<32:39:43, 27.73s/it] 15%|█▌ | 760/5000 
[5:57:35<33:04:22, 28.08s/it] 15%|█▌ | 761/5000 [5:58:02<32:47:53, 27.85s/it] 15%|█▌ | 762/5000 [5:58:30<32:35:13, 27.68s/it] 15%|█▌ | 763/5000 [5:58:57<32:38:01, 27.73s/it] 15%|█▌ | 764/5000 [5:59:26<32:49:59, 27.90s/it] 15%|█▌ | 765/5000 [5:59:53<32:36:02, 27.71s/it] 15%|█▌ | 766/5000 [6:00:21<32:48:05, 27.89s/it] 15%|█▌ | 767/5000 [6:00:49<32:45:08, 27.85s/it] 15%|█▌ | 768/5000 [6:01:17<32:36:36, 27.74s/it] 15%|█▌ | 769/5000 [6:01:45<32:40:26, 27.80s/it] 15%|█▌ | 770/5000 [6:02:13<32:48:40, 27.92s/it] 15%|█▌ | 771/5000 [6:02:41<32:46:30, 27.90s/it] 15%|█▌ | 772/5000 [6:03:08<32:28:53, 27.66s/it] 15%|█▌ | 773/5000 [6:03:36<32:52:36, 28.00s/it] 15%|█▌ | 774/5000 [6:04:04<32:33:05, 27.73s/it] 16%|█▌ | 775/5000 [6:04:31<32:25:51, 27.63s/it] 16%|█▌ | 775/5000 [6:04:31<32:25:51, 27.63s/it] 16%|█▌ | 776/5000 [6:04:58<32:20:50, 27.57s/it] 16%|█▌ | 777/5000 [6:05:25<32:09:51, 27.42s/it] 16%|█▌ | 778/5000 [6:05:53<32:08:01, 27.40s/it] 16%|█▌ | 779/5000 [6:06:26<34:05:05, 29.07s/it] 16%|█▌ | 780/5000 [6:06:53<33:26:32, 28.53s/it] 16%|█▌ | 781/5000 [6:07:21<33:04:20, 28.22s/it] 16%|█▌ | 782/5000 [6:07:49<33:15:12, 28.38s/it] 16%|█▌ | 783/5000 [6:08:16<32:46:58, 27.99s/it] 16%|█▌ | 784/5000 [6:08:44<32:28:10, 27.73s/it] 16%|█▌ | 785/5000 [6:09:11<32:19:03, 27.60s/it] 16%|█▌ | 786/5000 [6:09:40<32:53:14, 28.10s/it] 16%|█▌ | 787/5000 [6:10:07<32:31:51, 27.80s/it] 16%|█▌ | 788/5000 [6:10:34<32:07:03, 27.45s/it] 16%|█▌ | 789/5000 [6:11:02<32:30:34, 27.79s/it] 16%|█▌ | 790/5000 [6:11:30<32:23:16, 27.70s/it] 16%|█▌ | 791/5000 [6:11:57<32:15:07, 27.59s/it] 16%|█▌ | 792/5000 [6:12:26<32:45:02, 28.02s/it] 16%|█▌ | 793/5000 [6:12:54<32:33:00, 27.85s/it] 16%|█▌ | 794/5000 [6:13:20<32:04:35, 27.45s/it] 16%|█▌ | 795/5000 [6:13:48<32:20:21, 27.69s/it] 16%|█▌ | 796/5000 [6:14:16<32:08:30, 27.52s/it] 16%|█▌ | 797/5000 [6:14:42<31:54:08, 27.33s/it] 16%|█▌ | 798/5000 [6:15:11<32:25:26, 27.78s/it] 16%|█▌ | 799/5000 [6:15:38<32:07:25, 27.53s/it] 16%|█▌ | 800/5000 [6:16:05<31:52:14, 27.32s/it] 
16%|█▌ | 800/5000 [6:16:05<31:52:14, 27.32s/it] 16%|█▌ | 801/5000 [6:16:35<32:43:39, 28.06s/it] 16%|█▌ | 802/5000 [6:17:01<32:07:25, 27.55s/it] 16%|█▌ | 803/5000 [6:17:29<32:04:13, 27.51s/it] 16%|█▌ | 804/5000 [6:17:58<32:39:16, 28.02s/it] 16%|█▌ | 805/5000 [6:18:25<32:30:30, 27.90s/it] 16%|█▌ | 806/5000 [6:18:53<32:15:14, 27.69s/it] 16%|█▌ | 807/5000 [6:19:20<32:07:52, 27.59s/it] 16%|█▌ | 808/5000 [6:19:47<31:59:21, 27.47s/it] 16%|█▌ | 809/5000 [6:20:14<31:49:52, 27.34s/it] 16%|█▌ | 810/5000 [6:20:42<32:03:25, 27.54s/it] 16%|█▌ | 811/5000 [6:21:10<31:57:46, 27.47s/it] 16%|█▌ | 812/5000 [6:21:37<31:55:38, 27.44s/it] 16%|█▋ | 813/5000 [6:22:05<32:09:12, 27.65s/it] 16%|█▋ | 814/5000 [6:22:19<27:24:16, 23.57s/it] 16%|█▋ | 815/5000 [6:22:30<22:57:38, 19.75s/it] 16%|█▋ | 816/5000 [6:22:41<19:50:24, 17.07s/it] 16%|█▋ | 817/5000 [6:22:52<17:40:15, 15.21s/it]{'loss': 0.1249, 'learning_rate': 9.615555555555558e-06, 'epoch': 4.0} {'loss': 0.1291, 'learning_rate': 9.56e-06, 'epoch': 4.01} {'loss': 0.1197, 'learning_rate': 9.504444444444446e-06, 'epoch': 4.01} {'loss': 0.0921, 'learning_rate': 9.44888888888889e-06, 'epoch': 4.02} {'loss': 0.1023, 'learning_rate': 9.393333333333334e-06, 'epoch': 4.02} {'loss': 0.0962, 'learning_rate': 9.33777777777778e-06, 'epoch': 4.03} Reading metadata...: 0it [00:00, ?it/s] Reading metadata...: 1it [00:00, 1.99it/s] Reading metadata...: 14292it [00:00, 31622.25it/s] Reading metadata...: 22699it [00:01, 14675.60it/s] Reading metadata...: 28043it [00:01, 17791.47it/s] Reading metadata...: 0it [00:00, ?it/s] Reading metadata...: 1it [00:00, 3.06it/s] Reading metadata...: 10438it [00:00, 26121.88it/s] 16%|█▋ | 818/5000 [6:24:21<43:33:27, 37.50s/it] 16%|█▋ | 819/5000 [6:24:49<40:08:37, 34.57s/it] 16%|█▋ | 820/5000 [6:25:18<38:07:25, 32.83s/it] 16%|█▋ | 821/5000 [6:25:46<36:30:50, 31.46s/it] 16%|█▋ | 822/5000 [6:26:13<35:08:17, 30.28s/it] 16%|█▋ | 823/5000 [6:26:42<34:24:28, 29.65s/it] 16%|█▋ | 824/5000 [6:27:10<33:50:30, 29.17s/it] 16%|█▋ | 
825/5000 [6:27:37<33:16:40, 28.69s/it] 16%|█▋ | 825/5000 [6:27:37<33:16:40, 28.69s/it] 17%|█▋ | 826/5000 [6:28:05<32:50:08, 28.32s/it] 17%|█▋ | 827/5000 [6:28:32<32:31:20, 28.06s/it] 17%|█▋ | 828/5000 [6:29:00<32:32:50, 28.08s/it] 17%|█▋ | 829/5000 [6:29:29<32:47:49, 28.31s/it] 17%|█▋ | 830/5000 [6:29:56<32:25:57, 28.00s/it] 17%|█▋ | 831/5000 [6:30:24<32:08:03, 27.75s/it] 17%|█▋ | 832/5000 [6:30:52<32:21:38, 27.95s/it] 17%|█▋ | 833/5000 [6:31:20<32:30:00, 28.08s/it] 17%|█▋ | 834/5000 [6:31:48<32:28:06, 28.06s/it] 17%|█▋ | 835/5000 [6:32:15<32:08:23, 27.78s/it] 17%|█▋ | 836/5000 [6:32:44<32:21:13, 27.97s/it] 17%|█▋ | 837/5000 [6:33:12<32:27:14, 28.06s/it] 17%|█▋ | 838/5000 [6:33:39<32:08:25, 27.80s/it] 17%|█▋ | 839/5000 [6:34:07<31:57:33, 27.65s/it] 17%|█▋ | 840/5000 [6:34:34<31:59:22, 27.68s/it] 17%|█▋ | 841/5000 [6:35:02<31:51:45, 27.58s/it] 17%|█▋ | 842/5000 [6:35:29<31:43:14, 27.46s/it] 17%|█▋ | 843/5000 [6:35:57<31:59:44, 27.71s/it] 17%|█▋ | 844/5000 [6:36:24<31:45:32, 27.51s/it] 17%|█▋ | 845/5000 [6:36:53<32:03:23, 27.77s/it] 17%|█▋ | 846/5000 [6:37:21<32:12:27, 27.91s/it] 17%|█▋ | 847/5000 [6:37:48<31:57:26, 27.70s/it] 17%|█▋ | 848/5000 [6:38:17<32:13:27, 27.94s/it] 17%|█▋ | 849/5000 [6:38:45<32:12:00, 27.93s/it] 17%|█▋ | 850/5000 [6:39:12<31:58:48, 27.74s/it] 17%|█▋ | 850/5000 [6:39:12<31:58:48, 27.74s/it] 17%|█▋ | 851/5000 [6:39:41<32:35:02, 28.27s/it] 17%|█▋ | 852/5000 [6:40:08<32:03:50, 27.83s/it] 17%|█▋ | 853/5000 [6:40:34<31:21:02, 27.22s/it] 17%|█▋ | 854/5000 [6:41:05<32:35:05, 28.29s/it] 17%|█▋ | 855/5000 [6:41:32<32:20:09, 28.08s/it] 17%|█▋ | 856/5000 [6:42:00<32:05:24, 27.88s/it] 17%|█▋ | 857/5000 [6:42:27<32:00:54, 27.82s/it] 17%|█▋ | 858/5000 [6:42:55<31:53:36, 27.72s/it] 17%|█▋ | 859/5000 [6:43:21<31:21:33, 27.26s/it] 17%|█▋ | 860/5000 [6:43:48<31:23:22, 27.30s/it] 17%|█▋ | 861/5000 [6:44:16<31:33:14, 27.44s/it] 17%|█▋ | 862/5000 [6:44:44<31:32:29, 27.44s/it] 17%|█▋ | 863/5000 [6:45:11<31:36:02, 27.50s/it] 17%|█▋ | 864/5000 [6:45:39<31:47:27, 
27.67s/it] 17%|█▋ | 865/5000 [6:46:08<32:02:32, 27.90s/it] 17%|█▋ | 866/5000 [6:46:35<31:48:54, 27.71s/it] 17%|█▋ | 867/5000 [6:47:02<31:40:50, 27.59s/it] 17%|█▋ | 868/5000 [6:47:30<31:44:43, 27.66s/it] 17%|█▋ | 869/5000 [6:47:58<31:40:10, 27.60s/it] 17%|█▋ | 870/5000 [6:48:26<31:54:41, 27.82s/it] 17%|█▋ | 871/5000 [6:48:53<31:45:34, 27.69s/it] 17%|█▋ | 872/5000 [6:49:21<31:40:09, 27.62s/it] 17%|█▋ | 873/5000 [6:49:49<31:45:19, 27.70s/it] 17%|█▋ | 874/5000 [6:50:17<31:49:11, 27.76s/it] 18%|█▊ | 875/5000 [6:50:44<31:43:20, 27.68s/it] 18%|█▊ | 875/5000 [6:50:44<31:43:20, 27.68s/it] 18%|█▊ | 876/5000 [6:51:13<32:09:20, 28.07s/it] 18%|█▊ | 877/5000 [6:51:40<31:49:58, 27.79s/it] 18%|█▊ | 878/5000 [6:52:06<31:12:07, 27.25s/it] 18%|█▊ | 879/5000 [6:52:34<31:26:38, 27.47s/it] 18%|█▊ | 880/5000 [6:53:02<31:30:05, 27.53s/it] 18%|█▊ | 881/5000 [6:53:29<31:17:53, 27.35s/it] 18%|█▊ | 882/5000 [6:53:56<31:20:37, 27.40s/it] 18%|█▊ | 883/5000 [6:54:24<31:28:55, 27.53s/it] 18%|█▊ | 884/5000 [6:54:51<31:18:44, 27.39s/it] 18%|█▊ | 885/5000 [6:55:23<32:58:05, 28.84s/it] 18%|█▊ | 886/5000 [6:55:51<32:40:29, 28.59s/it] 18%|█▊ | 887/5000 [6:56:19<32:10:38, 28.16s/it] 18%|█▊ | 888/5000 [6:56:49<33:03:36, 28.94s/it] 18%|█▊ | 889/5000 [6:57:19<33:11:19, 29.06s/it] 18%|█▊ | 890/5000 [6:57:45<32:08:10, 28.15s/it] 18%|█▊ | 891/5000 [6:58:13<32:16:37, 28.28s/it] 18%|█▊ | 892/5000 [6:58:41<32:05:31, 28.12s/it] 18%|█▊ | 893/5000 [6:59:09<32:09:33, 28.19s/it] 18%|█▊ | 894/5000 [6:59:37<31:48:05, 27.88s/it] 18%|█▊ | 895/5000 [7:00:05<32:07:41, 28.18s/it] 18%|█▊ | 896/5000 [7:00:34<32:11:07, 28.23s/it] 18%|█▊ | 897/5000 [7:01:01<31:56:09, 28.02s/it] 18%|█▊ | 898/5000 [7:01:30<32:00:27, 28.09s/it] 18%|█▊ | 899/5000 [7:01:58<32:05:52, 28.18s/it] 18%|█▊ | 900/5000 [7:02:26<31:53:22, 28.00s/it] 18%|█▊ | 900/5000 [7:02:26<31:53:22, 28.00s/it] 18%|█▊ | 901/5000 [7:02:52<31:29:30, 27.66s/it] 18%|█▊ | 902/5000 [7:03:21<31:37:07, 27.78s/it] 18%|█▊ | 903/5000 [7:03:48<31:31:29, 27.70s/it] 18%|█▊ | 904/5000 
[7:04:16<31:41:15, 27.85s/it] 18%|█▊ | 905/5000 [7:04:44<31:44:15, 27.90s/it] 18%|█▊ | 906/5000 [7:05:12<31:35:45, 27.78s/it] 18%|█▊ | 907/5000 [7:05:40<31:39:21, 27.84s/it] 18%|█▊ | 908/5000 [7:06:08<31:47:10, 27.96s/it] 18%|█▊ | 909/5000 [7:06:35<31:33:15, 27.77s/it] 18%|█▊ | 910/5000 [7:07:03<31:30:04, 27.73s/it] 18%|█▊ | 911/5000 [7:07:31<31:42:06, 27.91s/it] 18%|█▊ | 912/5000 [7:07:58<31:25:28, 27.67s/it] 18%|█▊ | 913/5000 [7:08:26<31:26:02, 27.69s/it] 18%|█▊ | 914/5000 [7:08:54<31:35:53, 27.84s/it] 18%|█▊ | 915/5000 [7:09:22<31:29:30, 27.75s/it] 18%|█▊ | 916/5000 [7:09:50<31:38:49, 27.90s/it] 18%|█▊ | 917/5000 [7:10:19<31:52:19, 28.10s/it] 18%|█▊ | 918/5000 [7:10:46<31:43:08, 27.97s/it] 18%|█▊ | 919/5000 [7:11:14<31:30:55, 27.80s/it] 18%|█▊ | 920/5000 [7:11:42<31:36:39, 27.89s/it] 18%|█▊ | 921/5000 [7:12:09<31:25:04, 27.73s/it] 18%|█▊ | 922/5000 [7:12:36<31:15:20, 27.59s/it] 18%|█▊ | 923/5000 [7:13:04<31:07:13, 27.48s/it] 18%|█▊ | 924/5000 [7:13:33<31:44:03, 28.03s/it] 18%|█▊ | 925/5000 [7:14:00<31:27:54, 27.80s/it] 18%|█▊ | 925/5000 [7:14:00<31:27:54, 27.80s/it] 19%|█▊ | 926/5000 [7:14:26<30:55:29, 27.33s/it] 19%|█▊ | 927/5000 [7:14:54<30:58:35, 27.38s/it] 19%|█▊ | 928/5000 [7:15:21<30:50:12, 27.26s/it] 19%|█▊ | 929/5000 [7:15:48<30:41:11, 27.14s/it] 19%|█▊ | 930/5000 [7:16:17<31:25:56, 27.80s/it] 19%|█▊ | 931/5000 [7:16:44<31:11:50, 27.60s/it] 19%|█▊ | 932/5000 [7:17:12<31:18:34, 27.71s/it] 19%|█▊ | 933/5000 [7:17:40<31:26:23, 27.83s/it] 19%|█▊ | 934/5000 [7:18:08<31:23:55, 27.80s/it] 19%|█▊ | 935/5000 [7:18:35<31:02:04, 27.48s/it] 19%|█▊ | 936/5000 [7:19:03<31:07:14, 27.57s/it] 19%|█▊ | 937/5000 [7:19:30<31:10:13, 27.62s/it] 19%|█▉ | 938/5000 [7:19:58<31:00:48, 27.49s/it] 19%|█▉ | 939/5000 [7:20:26<31:23:30, 27.83s/it] 19%|█▉ | 940/5000 [7:20:54<31:18:10, 27.76s/it] 19%|█▉ | 941/5000 [7:21:21<31:06:46, 27.59s/it] 19%|█▉ | 942/5000 [7:21:48<30:58:47, 27.48s/it] 19%|█▉ | 943/5000 [7:22:16<31:00:08, 27.51s/it] 19%|█▉ | 944/5000 [7:22:43<30:50:49, 27.38s/it] 
19%|█▉ | 945/5000 [7:23:10<30:45:47, 27.31s/it] 19%|█▉ | 946/5000 [7:23:39<31:26:24, 27.92s/it] 19%|█▉ | 947/5000 [7:24:06<30:59:01, 27.52s/it] 19%|█▉ | 948/5000 [7:24:31<30:12:57, 26.85s/it] 19%|█▉ | 949/5000 [7:24:59<30:28:04, 27.08s/it] 19%|█▉ | 950/5000 [7:25:24<29:51:57, 26.55s/it] 19%|█▉ | 950/5000 [7:25:24<29:51:57, 26.55s/it] 19%|█▉ | 951/5000 [7:25:51<29:49:11, 26.51s/it] 19%|█▉ | 952/5000 [7:26:20<30:39:01, 27.26s/it] 19%|█▉ | 953/5000 [7:26:47<30:39:02, 27.27s/it] 19%|█▉ | 954/5000 [7:27:14<30:39:54, 27.28s/it] 19%|█▉ | 955/5000 [7:27:43<31:09:52, 27.74s/it] 19%|█▉ | 956/5000 [7:28:10<31:05:19, 27.68s/it] 19%|█▉ | 957/5000 [7:28:38<30:52:26, 27.49s/it] 19%|█▉ | 958/5000 [7:29:05<30:49:08, 27.45s/it] 19%|█▉ | 959/5000 [7:29:33<30:53:05, 27.51s/it] 19%|█▉ | 960/5000 [7:30:00<30:47:34, 27.44s/it] 19%|█▉ | 961/5000 [7:30:28<31:11:15, 27.80s/it] 19%|█▉ | 962/5000 [7:30:56<31:08:37, 27.77s/it] 19%|█▉ | 963/5000 [7:31:24<31:04:05, 27.71s/it] 19%|█▉ | 964/5000 [7:31:51<30:49:39, 27.50s/it] 19%|█▉ | 965/5000 [7:32:18<30:54:27, 27.58s/it] 19%|█▉ | 966/5000 [7:32:46<30:45:21, 27.45s/it] 19%|█▉ | 967/5000 [7:33:14<31:04:40, 27.74s/it] 19%|█▉ | 968/5000 [7:33:40<30:26:48, 27.18s/it] 19%|█▉ | 969/5000 [7:34:07<30:28:54, 27.22s/it] 19%|█▉ | 970/5000 [7:34:36<30:57:28, 27.65s/it] 19%|█▉ | 971/5000 [7:35:03<30:50:45, 27.56s/it] 19%|█▉ | 972/5000 [7:35:31<30:50:11, 27.56s/it] 19%|█▉ | 973/5000 [7:35:59<31:02:30, 27.75s/it] 19%|█▉ | 974/5000 [7:36:27<31:11:27, 27.89s/it] 20%|█▉ | 975/5000 [7:36:55<31:02:52, 27.77s/it] 20%|█▉ | 975/5000 [7:36:55<31:02:52, 27.77s/it] 20%|█▉ | 976/5000 [7:37:23<31:13:46, 27.94s/it] 20%|█▉ | 977/5000 [7:37:45<29:17:09, 26.21s/it] 20%|█▉ | 978/5000 [7:37:56<24:09:31, 21.62s/it] 20%|█▉ | 979/5000 [7:38:07<20:32:29, 18.39s/it] 20%|█▉ | 980/5000 [7:38:18<17:59:55, 16.12s/it] 20%|█▉ | 981/5000 [7:38:26<15:15:07, 13.66s/it]{'loss': 0.0803, 'learning_rate': 9.282222222222222e-06, 'epoch': 5.0} {'loss': 0.0877, 'learning_rate': 9.226666666666668e-06, 
'epoch': 5.01} {'loss': 0.0812, 'learning_rate': 9.171111111111112e-06, 'epoch': 5.01} {'loss': 0.0639, 'learning_rate': 9.115555555555556e-06, 'epoch': 5.02} {'loss': 0.0615, 'learning_rate': 9.060000000000001e-06, 'epoch': 5.02} {'loss': 0.07, 'learning_rate': 9.004444444444445e-06, 'epoch': 5.03} {'loss': 0.053, 'learning_rate': 8.94888888888889e-06, 'epoch': 5.03} Reading metadata...: 0it [00:00, ?it/s] Reading metadata...: 1it [00:00, 2.05it/s] Reading metadata...: 14689it [00:00, 33229.49it/s] Reading metadata...: 23329it [00:00, 27403.88it/s] Reading metadata...: 28043it [00:01, 28001.93it/s] Reading metadata...: 0it [00:00, ?it/s] Reading metadata...: 1it [00:00, 2.80it/s] Reading metadata...: 10438it [00:00, 24569.38it/s] 20%|█▉ | 982/5000 [7:40:08<44:59:17, 40.31s/it] 20%|█▉ | 983/5000 [7:40:39<41:48:40, 37.47s/it] 20%|█▉ | 984/5000 [7:41:06<38:21:27, 34.38s/it] 20%|█▉ | 985/5000 [7:41:35<36:19:09, 32.57s/it] 20%|█▉ | 986/5000 [7:42:03<34:54:32, 31.31s/it] 20%|█▉ | 987/5000 [7:42:31<33:49:17, 30.34s/it] 20%|█▉ | 988/5000 [7:42:58<32:48:35, 29.44s/it] 20%|█▉ | 989/5000 [7:43:27<32:34:34, 29.24s/it] 20%|█▉ | 990/5000 [7:43:55<32:09:24, 28.87s/it] 20%|█▉ | 991/5000 [7:44:22<31:31:59, 28.32s/it] 20%|█▉ | 992/5000 [7:44:50<31:14:19, 28.06s/it] 20%|█▉ | 993/5000 [7:45:18<31:13:43, 28.06s/it] 20%|█▉ | 994/5000 [7:45:46<31:25:23, 28.24s/it] 20%|█▉ | 995/5000 [7:46:15<31:30:11, 28.32s/it] 20%|█▉ | 996/5000 [7:46:42<31:11:18, 28.04s/it] 20%|█▉ | 997/5000 [7:47:14<32:27:39, 29.19s/it] 20%|█▉ | 998/5000 [7:47:43<32:18:30, 29.06s/it] 20%|█▉ | 999/5000 [7:48:10<31:41:54, 28.52s/it] 20%|██ | 1000/5000 [7:48:38<31:29:41, 28.35s/it] 20%|██ | 1000/5000 [7:48:38<31:29:41, 28.35s/it][INFO|trainer.py:3138] 2023-05-07 18:22:49,172 >> ***** Running Evaluation ***** [INFO|trainer.py:3142] 2023-05-07 18:22:49,172 >> Num examples: Unknown [INFO|trainer.py:3143] 2023-05-07 18:22:49,172 >> Batch size = 64 {'loss': 0.0519, 'learning_rate': 8.893333333333333e-06, 'epoch': 6.0} Reading 
metadata...: 0it [00:00, ?it/s]
Reading metadata...: 1it [00:02, 2.58s/it]
Reading metadata...: 10440it [00:02, 3954.23it/s]
[INFO|trainer_utils.py:693] 2023-05-07 18:23:04,305 >> The following columns in the evaluation set don't have a corresponding argument in `WhisperForConditionalGeneration.forward` and have been ignored: input_length. If input_length are not expected by `WhisperForConditionalGeneration.forward`, you can safely ignore this message.
20%|██ | 1000/5000 [8:26:06<31:29:41, 28.35s/it]
[INFO|trainer.py:2877] 2023-05-07 19:00:17,386 >> Saving model checkpoint to ./checkpoint-1000
[INFO|configuration_utils.py:458] 2023-05-07 19:00:17,393 >> Configuration saved in ./checkpoint-1000/config.json
[INFO|configuration_utils.py:364] 2023-05-07 19:00:17,398 >> Configuration saved in ./checkpoint-1000/generation_config.json
[INFO|modeling_utils.py:1855] 2023-05-07 19:00:20,753 >> Model weights saved in ./checkpoint-1000/pytorch_model.bin
[INFO|feature_extraction_utils.py:369] 2023-05-07 19:00:20,758 >> Feature extractor saved in ./checkpoint-1000/preprocessor_config.json
[INFO|feature_extraction_utils.py:369] 2023-05-07 19:00:30,115 >> Feature extractor saved in ./preprocessor_config.json
Adding files tracked by Git LFS: ['wandb/run-20230506_113337-ysywp688/run-ysywp688.wandb', 'wandb/run-20230507_103405-9zf5xxpu/run-9zf5xxpu.wandb']. This may take a bit of time if the files are large.
{'eval_loss': 0.43405279517173767, 'eval_wer': 54.25600000000001, 'eval_runtime': 2248.2056, 'eval_samples_per_second': 4.644, 'eval_steps_per_second': 0.073, 'epoch': 6.0}
05/07/2023 19:00:40 - WARNING - huggingface_hub.repository - Adding files tracked by Git LFS: ['wandb/run-20230506_113337-ysywp688/run-ysywp688.wandb', 'wandb/run-20230507_103405-9zf5xxpu/run-9zf5xxpu.wandb']. This may take a bit of time if the files are large.