wandb: Currently logged in as: sanchit-gandhi (use `wandb login --relogin` to force relogin)
wandb: wandb version 0.12.17 is available!  To upgrade, please run:
wandb:  $ pip install wandb --upgrade
wandb: Tracking run with wandb version 0.12.15
wandb: Run data is saved locally in /home/sanchitgandhi/flax-wav2vec2-2-bart-large-ls-960h-feature-encoder/wandb/run-20220530_122038-277l6opb
wandb: Run `wandb offline` to turn off syncing.
wandb: Syncing run flax-wav2vec2-2-bart-large-ls-960h-feature-encoder
wandb: ⭐️ View project at https://wandb.ai/sanchit-gandhi/librispeech_960h
wandb: 🚀 View run at https://wandb.ai/sanchit-gandhi/librispeech_960h/runs/277l6opb
05/30/2022 12:20:39 - INFO - __main__ - Training/evaluation parameters FlaxSeq2SeqTrainingArguments(
_n_gpu=-1,
adafactor=False,
adam_beta1=0.9,
adam_beta2=0.999,
adam_epsilon=1e-08,
bf16=False,
bf16_full_eval=False,
data_seed=None,
dataloader_drop_last=False,
dataloader_num_workers=0,
dataloader_pin_memory=True,
ddp_bucket_cap_mb=None,
ddp_find_unused_parameters=None,
debug=,
deepspeed=None,
disable_tqdm=None,
do_eval=True,
do_predict=True,
do_train=True,
eval_accumulation_steps=None,
eval_delay=0,
eval_steps=10000,
evaluation_strategy=no,
final_generation_max_length=200,
final_generation_num_beams=5,
fp16=False,
fp16_backend=auto,
fp16_full_eval=False,
fp16_opt_level=O1,
generation_length_penalty=1.2,
generation_max_length=40,
generation_num_beams=1,
gradient_accumulation_steps=1,
gradient_checkpointing=True,
greater_is_better=None,
group_by_length=False,
half_precision_backend=auto,
hub_model_id=None,
hub_strategy=every_save,
hub_token=,
ignore_data_skip=False,
label_names=None,
label_smoothing_factor=0.0,
learning_rate=0.0001,
length_column_name=input_length,
load_best_model_at_end=False,
local_rank=-1,
log_level=passive,
log_level_replica=passive,
log_on_each_node=True,
logging_dir=None,
logging_first_step=False,
logging_nan_inf_filter=True,
logging_steps=25,
logging_strategy=steps,
lr_scheduler_type=linear,
matmul_precision=default,
max_grad_norm=1.0,
max_steps=50000,
metric_for_best_model=None,
mp_parameters=,
no_cuda=False,
num_train_epochs=3.0,
optim=adamw_hf,
output_dir=./,
overwrite_output_dir=True,
past_index=-1,
per_device_eval_batch_size=4,
per_device_train_batch_size=8,
precision=full,
predict_with_generate=True,
prediction_loss_only=False,
push_to_hub=True,
push_to_hub_model_id=None,
push_to_hub_organization=None,
push_to_hub_token=,
remove_unused_columns=True,
report_to=None,
resume_from_checkpoint=None,
run_name=None,
save_on_each_node=False,
save_steps=10000,
save_strategy=steps,
save_total_limit=1,
seed=42,
sharded_ddp=,
skip_memory_metrics=True,
sortish_sampler=False,
tf32=None,
tpu_metrics_debug=False,
tpu_num_cores=None,
use_legacy_prediction_loop=False,
warmup_ratio=0.0,
warmup_steps=500,
weight_decay=0.0,
xpu_backend=None,
)
05/30/2022 12:20:39 - INFO - __main__ - JAX devices: 8, matmul precision: default
Downloading data files: 0% 0/7 [00:00
[...]
    main()
  File "run_flax_speech_recognition_seq2seq.py", line 1400, in main
    state, train_metric = p_train_step(state, batch)
ValueError: RESOURCE_EXHAUSTED: Attempting to reserve 6.05G at the bottom of memory. That was not possible. There are 7.00G free, 0B reserved, and 5.97G reservable.: while running replica 0 and partition 0 of a replicated computation (other replicas may have failed as well).
wandb: Waiting for W&B process to finish... (failed 1). Press Control-C to abort syncing.
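Note: the traceback ends at `p_train_step(state, batch)`, i.e. the RESOURCE_EXHAUSTED is raised the first time the pmapped train step executes and XLA allocates the per-device working set on each of the 8 cores (roughly 7 GB free per core here). The sketch below is a minimal, hypothetical illustration of that pattern — the `shard` helper, the dummy loss, and all shapes are assumptions, not code from `run_flax_speech_recognition_seq2seq.py` — just to show why `per_device_train_batch_size` is the knob that governs the allocation that fails.

```python
# Minimal sketch (assumed names/shapes, not the script's actual code) of how a
# global batch is split across devices before a pmapped step like the one in
# the traceback. The per-device slice is what must fit in each core's memory.
import jax
import jax.numpy as jnp

n_devices = jax.local_device_count()   # 8 TPU cores in the log above
per_device_batch_size = 8              # per_device_train_batch_size from the args

def shard(batch):
    # Reshape (global_batch, ...) -> (n_devices, per_device_batch, ...) for pmap.
    return jax.tree_util.tree_map(
        lambda x: x.reshape((n_devices, -1) + x.shape[1:]), batch
    )

def train_step(state, batch):
    # Stand-in for the real seq2seq forward/backward pass.
    loss = jnp.mean(batch["input_values"]) + 0.0 * state
    return state, {"loss": loss}

p_train_step = jax.pmap(train_step, axis_name="batch")

# Dummy data: 8 * 8 utterances of 1 s of 16 kHz audio, plus a replicated state.
global_batch = {"input_values": jnp.ones((n_devices * per_device_batch_size, 16000))}
state = jnp.zeros((n_devices,))

# The OOM in the traceback fires on this call, when XLA compiles the step and
# allocates activations for one per-device batch; halving per_device_train_batch_size
# (and doubling gradient_accumulation_steps to keep the effective batch) is the
# usual first mitigation.
state, metrics = p_train_step(state, shard(global_batch))
```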
wandb: - 0.045 MB of 0.045 MB uploaded (0.000 MB deduped)
wandb:
wandb: Run history:
wandb: train/decoder_grad_norm ▂█▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb: train/decoder_param_norm ██▇▇▆▅▅▄▄▃▃▂▂▂▁▁▁▁▁▁▁▁
wandb: train/encoder_grad_norm ▂█▂▂▁▁▁▁▁▁▁▂▂▁▁▁▁▁▁▂▁▁
wandb: train/encoder_param_norm ▁▁▁▁▁▁▁▂▂▂▂▂▃▃▄▄▅▅▆▆▇█
wandb: train/grad_norm ▂█▂▂▁▁▁▁▁▁▁▁▂▁▁▁▁▁▁▁▁▁
wandb: train/learning_rate ▁▁▂▂▂▃▃▄▄▄▅▅▅▆▆▇▇▇████
wandb: train/loss █▆▄▄▃▄▄▃▃▃▃▃▃▂▃▂▃▃▂▁▃▁
wandb: train/param_norm ██▇▆▆▅▄▃▃▂▁▁▁▁▁▂▂▂▃▃▄▅
wandb:
wandb: Run summary:
wandb: train/decoder_grad_norm 4.9307
wandb: train/decoder_param_norm 1056.73962
wandb: train/encoder_grad_norm 3.26562
wandb: train/encoder_param_norm 2309.60815
wandb: train/grad_norm 5.91406
wandb: train/learning_rate 0.0001
wandb: train/loss 3.93325
wandb: train/param_norm 2539.87988
wandb:
wandb: Synced flax-wav2vec2-2-bart-large-ls-960h-feature-encoder: https://wandb.ai/sanchit-gandhi/librispeech_960h/runs/277l6opb
wandb: Synced 5 W&B file(s), 0 media file(s), 0 artifact file(s) and 0 other file(s)
wandb: Find logs at: ./wandb/run-20220530_122038-277l6opb/logs