wandb: Currently logged in as: sanchit-gandhi (use `wandb login --relogin` to force relogin)
wandb: wandb version 0.12.16 is available! To upgrade, please run:
wandb: $ pip install wandb --upgrade
wandb: Tracking run with wandb version 0.12.15
wandb: Run data is saved locally in /home/sanchitgandhi/train-flax-wav2vec2-ctc-cv9-baseline/wandb/run-20220516_161739-qrxgdh9s
wandb: Run `wandb offline` to turn off syncing.
wandb: Syncing run flax-wav2vec2-ctc-baseline
wandb: ⭐️ View project at https://wandb.ai/sanchit-gandhi/commonvoice_9_0
wandb: 🚀 View run at https://wandb.ai/sanchit-gandhi/commonvoice_9_0/runs/qrxgdh9s
05/16/2022 16:17:41 - INFO - __main__ - Training/evaluation parameters FlaxTrainingArguments(
_n_gpu=0,
adafactor=False,
adam_beta1=0.9,
adam_beta2=0.999,
adam_epsilon=1e-08,
bf16=False,
bf16_full_eval=False,
data_seed=None,
dataloader_drop_last=False,
dataloader_num_workers=0,
dataloader_pin_memory=True,
ddp_bucket_cap_mb=None,
ddp_find_unused_parameters=None,
debug=[],
deepspeed=None,
disable_tqdm=False,
do_eval=True,
do_predict=True,
do_train=True,
eval_accumulation_steps=None,
eval_delay=0,
eval_steps=10000,
evaluation_strategy=IntervalStrategy.NO,
fp16=False,
fp16_backend=auto,
fp16_full_eval=False,
fp16_opt_level=O1,
gradient_accumulation_steps=1,
gradient_checkpointing=True,
greater_is_better=None,
group_by_length=False,
half_precision_backend=auto,
hub_model_id=None,
hub_strategy=HubStrategy.EVERY_SAVE,
hub_token=<HUB_TOKEN>,
ignore_data_skip=False,
label_names=None,
label_smoothing_factor=0.0,
learning_rate=0.0003,
length_column_name=input_length,
load_best_model_at_end=False,
local_rank=-1,
log_level=-1,
log_level_replica=-1,
log_on_each_node=True,
logging_dir=./flax-wav2vec2-ctc-cv9-baseline/runs/May16_16-17-35_t1v-n-7e6d8bf0-w-0,
logging_first_step=False,
logging_nan_inf_filter=True,
logging_steps=25,
logging_strategy=IntervalStrategy.STEPS,
lr_scheduler_type=SchedulerType.LINEAR,
matmul_precision=default,
max_grad_norm=1.0,
max_steps=50000,
metric_for_best_model=None,
mp_parameters=,
multisteps=False,
no_cuda=False,
num_train_epochs=3.0,
optim=OptimizerNames.ADAMW_HF,
output_dir=./flax-wav2vec2-ctc-cv9-baseline,
overwrite_output_dir=True,
past_index=-1,
per_device_eval_batch_size=8,
per_device_train_batch_size=8,
precision=full,
prediction_loss_only=False,
push_to_hub=True,
push_to_hub_model_id=None,
push_to_hub_organization=None,
push_to_hub_token=<PUSH_TO_HUB_TOKEN>,
remove_unused_columns=True,
report_to=['wandb'],
resume_from_checkpoint=None,
run_name=./flax-wav2vec2-ctc-cv9-baseline,
save_on_each_node=False,
save_steps=10000,
save_strategy=IntervalStrategy.STEPS,
save_total_limit=None,
seed=42,
sharded_ddp=[],
skip_memory_metrics=True,
tf32=None,
tpu_metrics_debug=False,
tpu_num_cores=None,
use_legacy_prediction_loop=False,
warmup_ratio=0.0,
warmup_steps=500,
weight_decay=0.0,
xpu_backend=None,
)
05/16/2022 16:17:41 - INFO - __main__ - JAX devices: 8, matmul precision: default
05/16/2022 16:17:43 - WARNING - datasets.builder - Reusing dataset common_voice (/home/sanchitgandhi/cache/huggingface/datasets/mozilla-foundation___common_voice/en/9.0.0/26f54721b57ee2f31a333b315ed9151fbd8e693a3983c295fef63c67a12b9bf7)
05/16/2022 16:17:46 - WARNING - datasets.builder - Reusing dataset common_voice (/home/sanchitgandhi/cache/huggingface/datasets/mozilla-foundation___common_voice/en/9.0.0/26f54721b57ee2f31a333b315ed9151fbd8e693a3983c295fef63c67a12b9bf7)
05/16/2022 16:17:48 - WARNING - datasets.builder - Reusing dataset common_voice (/home/sanchitgandhi/cache/huggingface/datasets/mozilla-foundation___common_voice/en/9.0.0/26f54721b57ee2f31a333b315ed9151fbd8e693a3983c295fef63c67a12b9bf7)
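The three `Reusing dataset common_voice` warnings come from `datasets` finding the train/eval/test splits (matching `do_train`, `do_eval`, `do_predict` above) already in the local cache. A minimal sketch of how these splits might be loaded — the dataset name and `cache_dir` are taken from the log paths, while the exact split names are an assumption:

```python
from datasets import load_dataset

CACHE_DIR = "/home/sanchitgandhi/cache/huggingface/datasets"  # from the log above

# Common Voice 9.0 is gated on the Hub, so an authenticated token is required.
# Split names here are hypothetical; the training script may combine splits.
raw_datasets = {
    split: load_dataset(
        "mozilla-foundation/common_voice_9_0",
        "en",
        split=split,
        use_auth_token=True,
        cache_dir=CACHE_DIR,
    )
    for split in ("train", "validation", "test")
}
```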
loading configuration file https://huggingface.co/speech-seq2seq/flax-wav2vec2-large-lv60-scan/resolve/main/config.json from cache at /home/sanchitgandhi/.cache/huggingface/transformers/af26a73be492846deff70471176e0f6c3134134a11dc5908c11fbc12ed7c7c8e.d533ca185cf60c851bce32022efe596ce1bfaca5e73858c4e3b0cb8d6986cafd
/home/sanchitgandhi/transformers/src/transformers/configuration_utils.py:358: UserWarning: Passing `gradient_checkpointing` to a config initialization is deprecated and will be removed in v5 Transformers. Using `model.gradient_checkpointing_enable()` instead, or if you are using the `Trainer` API, pass `gradient_checkpointing=True` in your `TrainingArguments`.
  warnings.warn(
Model config Wav2Vec2Config {
  "activation_dropout": 0.1,
  "adapter_kernel_size": 3,
  "adapter_stride": 2,
  "add_adapter": false,
  "apply_spec_augment": true,
  "architectures": ["Wav2Vec2Model"],
  "attention_dropout": 0.1,
  "bos_token_id": 1,
  "classifier_proj_size": 256,
  "codevector_dim": 768,
  "contrastive_logits_temperature": 0.1,
  "conv_bias": true,
  "conv_dim": [512, 512, 512, 512, 512, 512, 512],
  "conv_kernel": [10, 3, 3, 3, 3, 2, 2],
  "conv_stride": [5, 2, 2, 2, 2, 2, 2],
  "ctc_loss_reduction": "sum",
  "ctc_zero_infinity": false,
  "diversity_loss_weight": 0.1,
  "do_stable_layer_norm": true,
  "eos_token_id": 2,
  "feat_extract_activation": "gelu",
  "feat_extract_dropout": 0.0,
  "feat_extract_norm": "layer",
  "feat_proj_dropout": 0.0,
  "feat_quantizer_dropout": 0.0,
  "final_dropout": 0.0,
  "fuse_matmuls": false,
  "gradient_checkpointing": true,
  "hidden_act": "gelu",
  "hidden_dropout": 0.1,
  "hidden_dropout_prob": 0.1,
  "hidden_size": 1024,
  "initializer_range": 0.02,
  "intermediate_size": 4096,
  "layer_norm_eps": 1e-05,
  "layerdrop": 0.0,
  "mask_feature_length": 10,
  "mask_feature_min_masks": 0,
  "mask_feature_prob": 0.0,
  "mask_time_length": 10,
  "mask_time_min_masks": 2,
  "mask_time_prob": 0.1,
  "model_type": "wav2vec2",
  "num_adapter_layers": 3,
  "num_attention_heads": 16,
  "num_codevector_groups": 2,
  "num_codevectors_per_group": 320,
  "num_conv_pos_embedding_groups": 16,
  "num_conv_pos_embeddings": 128,
  "num_feat_extract_layers": 7,
  "num_hidden_layers": 24,
  "num_negatives": 100,
  "output_hidden_size": 1024,
  "pad_token_id": 0,
  "proj_codevector_dim": 768,
  "tdnn_dilation": [1, 2, 3, 1, 1],
  "tdnn_dim": [512, 512, 512, 512, 1500],
  "tdnn_kernel": [5, 3, 3, 1, 1],
  "transformers_version": "4.18.0.dev0",
  "use_scan": true,
  "use_weighted_layer_sum": false,
  "vocab_size": 32,
  "xvector_output_dim": 512
}
loading feature extractor configuration file https://huggingface.co/speech-seq2seq/flax-wav2vec2-large-lv60-scan/resolve/main/preprocessor_config.json from cache at /home/sanchitgandhi/.cache/huggingface/transformers/b496e500d1063975aa580ec835deb5094401775d2869a2e5d79556b66a21dc87.bef560b27c62cea1af8278853fdffeaf0141c9c44f4298df07ba06cdf6f8f963
Feature extractor Wav2Vec2FeatureExtractor {
  "do_normalize": true,
  "feature_extractor_type": "Wav2Vec2FeatureExtractor",
  "feature_size": 1,
  "padding_side": "right",
  "padding_value": 0.0,
  "processor_class": "Wav2Vec2Processor",
  "return_attention_mask": true,
  "sampling_rate": 16000
}
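The `UserWarning` above fires because `gradient_checkpointing` is passed at config initialization, a pattern deprecated for removal in Transformers v5. A minimal sketch of loading the same config and feature extractor with the stock `transformers` API (repo ID from the log; note that `use_scan` and `fuse_matmuls` are fields added by the custom Flax fork, not stock `Wav2Vec2Config` attributes):

```python
from transformers import Wav2Vec2Config, Wav2Vec2FeatureExtractor

# Passing `gradient_checkpointing=True` into the config reproduces the
# deprecation warning seen above; the flag is slated for removal in v5.
config = Wav2Vec2Config.from_pretrained(
    "speech-seq2seq/flax-wav2vec2-large-lv60-scan",
    gradient_checkpointing=True,
)

# Mono (feature_size=1) 16 kHz audio with attention masks, matching the
# Wav2Vec2FeatureExtractor dump printed above.
feature_extractor = Wav2Vec2FeatureExtractor.from_pretrained(
    "speech-seq2seq/flax-wav2vec2-large-lv60-scan"
)
```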
loading file https://huggingface.co/patrickvonplaten/wav2vec2_ctc_cv9_tokenizer/resolve/main/vocab.json from cache at /home/sanchitgandhi/.cache/huggingface/transformers/a87a81d5e020048c331ad61c1f343904c1fff28b3ef918ce75aa0038d162c698.65c0f0efa332224f1dfbaeb422c47d17ba6a46c5c04ab9daa7da4acee61ccf81
loading file https://huggingface.co/patrickvonplaten/wav2vec2_ctc_cv9_tokenizer/resolve/main/tokenizer_config.json from cache at /home/sanchitgandhi/.cache/huggingface/transformers/ec2c496723611c2682a969c25d4cddec6efed7470218fc741989fe1298b15972.e954104c33fad04298b0c8a5c24afc74ea5b3a84fcf93e46b0ad82daa32bbafc
loading file https://huggingface.co/patrickvonplaten/wav2vec2_ctc_cv9_tokenizer/resolve/main/added_tokens.json from cache at None
loading file https://huggingface.co/patrickvonplaten/wav2vec2_ctc_cv9_tokenizer/resolve/main/special_tokens_map.json from cache at /home/sanchitgandhi/.cache/huggingface/transformers/c45c68d593f1bfa292a7ecd94f759a0126ca0c36029c2ca2a2c2fe3c5cb3d93e.9d6cd81ef646692fb1c169a880161ea1cb95f49694f220aced9b704b457e51dd
loading weights file https://huggingface.co/speech-seq2seq/flax-wav2vec2-large-lv60-scan/resolve/main/flax_model.msgpack from cache at /home/sanchitgandhi/.cache/huggingface/transformers/8738d0bc737753f3081921d57d6a989166e00a293f8cbc50ed7ac1a45c8bc8ae.30c766abe352bc4a15a81a2c3bff36eccb623f4dcb64b4a1ead66bbce22b6ca9
tcmalloc: large alloc 1261764608 bytes == 0x9dd74000 @ 0x7f6b45402680 0x7f6b45423824 0x5f8a01 0x648cf1 0x5c4676 0x4f290e 0x64f718 0x5048b3 0x56b1da 0x56939a 0x50aaa0 0x56c28c 0x56939a 0x5f6a13 0x56b0ae 0x56939a 0x68d047 0x67e351 0x67e3cf 0x67e471 0x67e817 0x6b6fe2 0x6b736d 0x7f6b452150b3 0x5fa5ce
/home/sanchitgandhi/hf/lib/python3.8/site-packages/jax/_src/tree_util.py:188: FutureWarning: jax.tree_util.tree_multimap() is deprecated. Please use jax.tree_util.tree_map() instead as a drop-in replacement.
  warnings.warn('jax.tree_util.tree_multimap() is deprecated. Please use jax.tree_util.tree_map() '
All model checkpoint weights were used when initializing FlaxWav2Vec2ForCTC.
Some weights of FlaxWav2Vec2ForCTC were not initialized from the model checkpoint at speech-seq2seq/flax-wav2vec2-large-lv60-scan and are newly initialized: {('lm_head', 'kernel'), ('lm_head', 'bias')}
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
05/16/2022 16:18:15 - WARNING - datasets.arrow_dataset - Loading cached processed dataset at /home/sanchitgandhi/cache/huggingface/datasets/mozilla-foundation___common_voice/en/9.0.0/26f54721b57ee2f31a333b315ed9151fbd8e693a3983c295fef63c67a12b9bf7/cache-37d399322cc458c7.arrow
05/16/2022 16:18:15 - WARNING - datasets.arrow_dataset - Loading cached processed dataset at /home/sanchitgandhi/cache/huggingface/datasets/mozilla-foundation___common_voice/en/9.0.0/26f54721b57ee2f31a333b315ed9151fbd8e693a3983c295fef63c67a12b9bf7/cache-52d5a8be94f19888.arrow
preprocess dataset:   0% 0/16335 [00:00
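The `lm_head` warning above is expected: the checkpoint was saved as a plain `Wav2Vec2Model` (see `architectures` in the config), so loading it into a CTC model leaves the output head randomly initialised. A minimal sketch of how the tokenizer, model, and processor might be instantiated — repo IDs are from the log, but this uses the stock `transformers` API rather than the exact calls in the training script:

```python
from transformers import (
    FlaxWav2Vec2ForCTC,
    Wav2Vec2CTCTokenizer,
    Wav2Vec2FeatureExtractor,
    Wav2Vec2Processor,
)

# Character-level CTC tokenizer (matches `vocab_size: 32` in the config).
tokenizer = Wav2Vec2CTCTokenizer.from_pretrained(
    "patrickvonplaten/wav2vec2_ctc_cv9_tokenizer"
)
feature_extractor = Wav2Vec2FeatureExtractor.from_pretrained(
    "speech-seq2seq/flax-wav2vec2-large-lv60-scan"
)

# Loading a `Wav2Vec2Model` checkpoint into a CTC architecture reproduces the
# "newly initialized: {('lm_head', 'kernel'), ('lm_head', 'bias')}" message.
model = FlaxWav2Vec2ForCTC.from_pretrained(
    "speech-seq2seq/flax-wav2vec2-large-lv60-scan"
)

# Bundle audio and text preprocessing into a single processor.
processor = Wav2Vec2Processor(
    feature_extractor=feature_extractor, tokenizer=tokenizer
)
```

As an aside, the `FutureWarning` in the log is a JAX deprecation unrelated to the model: `jax.tree_util.tree_map` is the documented drop-in replacement for `jax.tree_util.tree_multimap`.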