roberta-base-mr / run.log
[16:04:24] - INFO - __main__ - Training/evaluation parameters TrainingArguments(
_n_gpu=-1,
adafactor=False,
adam_beta1=0.9,
adam_beta2=0.999,
adam_epsilon=1e-08,
dataloader_drop_last=False,
dataloader_num_workers=0,
dataloader_pin_memory=True,
ddp_find_unused_parameters=None,
debug=[],
deepspeed=None,
disable_tqdm=False,
do_eval=False,
do_predict=False,
do_train=False,
eval_accumulation_steps=None,
eval_steps=500,
evaluation_strategy=IntervalStrategy.NO,
fp16=False,
fp16_backend=auto,
fp16_full_eval=False,
fp16_opt_level=O1,
gradient_accumulation_steps=1,
greater_is_better=None,
group_by_length=False,
ignore_data_skip=False,
label_names=None,
label_smoothing_factor=0.0,
learning_rate=0.0003,
length_column_name=length,
load_best_model_at_end=False,
local_rank=-1,
log_level=-1,
log_level_replica=-1,
log_on_each_node=True,
logging_dir=./runs/Jul08_16-04-24_t1v-n-112df4a9-w-0,
logging_first_step=False,
logging_steps=500,
logging_strategy=IntervalStrategy.STEPS,
lr_scheduler_type=SchedulerType.LINEAR,
max_grad_norm=1.0,
max_steps=-1,
metric_for_best_model=None,
mp_parameters=,
no_cuda=False,
num_train_epochs=8.0,
output_dir=./,
overwrite_output_dir=True,
past_index=-1,
per_device_eval_batch_size=4,
per_device_train_batch_size=4,
prediction_loss_only=False,
push_to_hub=True,
push_to_hub_model_id=flax-community/roberta-base-mr,
push_to_hub_organization=None,
push_to_hub_token=&lt;redacted&gt;,
remove_unused_columns=True,
report_to=[],
resume_from_checkpoint=None,
run_name=./,
save_on_each_node=False,
save_steps=500,
save_strategy=IntervalStrategy.STEPS,
save_total_limit=None,
seed=42,
sharded_ddp=[],
skip_memory_metrics=True,
tpu_metrics_debug=False,
tpu_num_cores=None,
use_legacy_prediction_loop=False,
warmup_ratio=0.0,
warmup_steps=1000,
weight_decay=0.0,
)
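For reference, a minimal sketch of how the salient values in this dump would be passed to transformers.TrainingArguments (API as of the transformers version this log appears to use; only a subset of fields is reproduced, and the push token is deliberately omitted):

from transformers import TrainingArguments

# Salient values from the dump above (a subset; not every field is reproduced).
training_args = TrainingArguments(
    output_dir="./",
    overwrite_output_dir=True,
    num_train_epochs=8.0,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    learning_rate=3e-4,
    warmup_steps=1000,
    logging_steps=500,
    save_steps=500,
    eval_steps=500,
    push_to_hub=True,
    push_to_hub_model_id="flax-community/roberta-base-mr",
    seed=42,
)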
[16:04:24] - DEBUG - urllib3.connectionpool - Starting new HTTPS connection (1): s3.amazonaws.com:443
[16:04:24] - DEBUG - urllib3.connectionpool - https://s3.amazonaws.com:443 "HEAD /datasets.huggingface.co/datasets/datasets/oscar/oscar.py HTTP/1.1" 404 0
[16:04:24] - DEBUG - urllib3.connectionpool - Starting new HTTPS connection (1): raw.githubusercontent.com:443
[16:04:24] - DEBUG - urllib3.connectionpool - https://raw.githubusercontent.com:443 "HEAD /huggingface/datasets/master/datasets/oscar/oscar.py HTTP/1.1" 200 0
[16:04:24] - DEBUG - urllib3.connectionpool - Starting new HTTPS connection (1): raw.githubusercontent.com:443
[16:04:24] - DEBUG - urllib3.connectionpool - https://raw.githubusercontent.com:443 "HEAD /huggingface/datasets/master/datasets/oscar/dataset_infos.json HTTP/1.1" 200 0
[16:04:24] - WARNING - datasets.builder - Reusing dataset oscar (/home/nipunsadvilkar/.cache/huggingface/datasets/oscar/unshuffled_deduplicated_als/1.0.0/84838bd49d2295f62008383b05620571535451d84545037bb94d6f3501651df2)
[16:04:24] - DEBUG - urllib3.connectionpool - Starting new HTTPS connection (1): s3.amazonaws.com:443
[16:04:24] - DEBUG - urllib3.connectionpool - https://s3.amazonaws.com:443 "HEAD /datasets.huggingface.co/datasets/datasets/oscar/oscar.py HTTP/1.1" 404 0
[16:04:24] - DEBUG - urllib3.connectionpool - Starting new HTTPS connection (1): raw.githubusercontent.com:443
[16:04:24] - DEBUG - urllib3.connectionpool - https://raw.githubusercontent.com:443 "HEAD /huggingface/datasets/master/datasets/oscar/oscar.py HTTP/1.1" 200 0
[16:04:24] - DEBUG - urllib3.connectionpool - Starting new HTTPS connection (1): raw.githubusercontent.com:443
[16:04:24] - DEBUG - urllib3.connectionpool - https://raw.githubusercontent.com:443 "HEAD /huggingface/datasets/master/datasets/oscar/dataset_infos.json HTTP/1.1" 200 0
[16:04:24] - WARNING - datasets.builder - Reusing dataset oscar (/home/nipunsadvilkar/.cache/huggingface/datasets/oscar/unshuffled_deduplicated_als/1.0.0/84838bd49d2295f62008383b05620571535451d84545037bb94d6f3501651df2)
[16:04:24] - DEBUG - urllib3.connectionpool - Starting new HTTPS connection (1): s3.amazonaws.com:443
[16:04:25] - DEBUG - urllib3.connectionpool - https://s3.amazonaws.com:443 "HEAD /datasets.huggingface.co/datasets/datasets/oscar/oscar.py HTTP/1.1" 404 0
[16:04:25] - DEBUG - urllib3.connectionpool - Starting new HTTPS connection (1): raw.githubusercontent.com:443
[16:04:25] - DEBUG - urllib3.connectionpool - https://raw.githubusercontent.com:443 "HEAD /huggingface/datasets/master/datasets/oscar/oscar.py HTTP/1.1" 200 0
[16:04:25] - DEBUG - urllib3.connectionpool - Starting new HTTPS connection (1): raw.githubusercontent.com:443
[16:04:25] - DEBUG - urllib3.connectionpool - https://raw.githubusercontent.com:443 "HEAD /huggingface/datasets/master/datasets/oscar/dataset_infos.json HTTP/1.1" 200 0
[16:04:25] - WARNING - datasets.builder - Reusing dataset oscar (/home/nipunsadvilkar/.cache/huggingface/datasets/oscar/unshuffled_deduplicated_als/1.0.0/84838bd49d2295f62008383b05620571535451d84545037bb94d6f3501651df2)
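The repeated HEAD requests and "Reusing dataset oscar" lines correspond to resolving the OSCAR loading script remotely and then hitting the local cache; a minimal sketch of the equivalent datasets call (how the actual training script splits the data is an assumption):

from datasets import load_dataset

# Resolves oscar.py remotely, then reuses the local cache shown in the
# WARNING lines above instead of re-downloading.
dataset = load_dataset("oscar", "unshuffled_deduplicated_als")
print(dataset["train"].num_rows)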
100%|██████████| 5/5 [00:00<00:00, 8.76ba/s]
100%|██████████| 1/1 [00:00<00:00, 21.45ba/s]
100%|██████████| 5/5 [00:03<00:00, 1.43ba/s]
100%|██████████| 1/1 [00:00<00:00, 11.84ba/s]
[16:04:29] - WARNING - __main__ - Unable to display metrics through TensorBoard because the package is not installed: Please run pip install tensorboard to enable.
[16:04:29] - INFO - absl - Starting the local TPU driver.
[16:04:29] - INFO - absl - Unable to initialize backend 'tpu_driver': Not found: Unable to find driver in registry given worker: local://
[16:04:29] - INFO - absl - Unable to initialize backend 'gpu': Not found: Could not find registered platform with name: "cuda". Available platform names are: Interpreter TPU Host
/home/nipunsadvilkar/roberta_mr_env/lib/python3.8/site-packages/jax/lib/xla_bridge.py:382: UserWarning: jax.host_count has been renamed to jax.process_count. This alias will eventually be removed; please update your code.
warnings.warn(
/home/nipunsadvilkar/roberta_mr_env/lib/python3.8/site-packages/jax/lib/xla_bridge.py:369: UserWarning: jax.host_id has been renamed to jax.process_index. This alias will eventually be removed; please update your code.
warnings.warn(
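The two UserWarnings above flag deprecated JAX aliases; a minimal sketch of the suggested renames (the surrounding usage is illustrative):

import jax

# Renames requested by the warnings:
#   jax.host_count() -> jax.process_count()
#   jax.host_id()    -> jax.process_index()
num_processes = jax.process_count()  # total number of JAX processes
process_rank = jax.process_index()   # index of this process
print(f"process {process_rank} of {num_processes}")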
Training...: 100%|██████████| 142/142 [01:13<00:00, 1.93it/s]
Epoch ... (1/8): 12%|█▎ | 1/8 [01:15<08:45, 75.01s/it]
Training...: 100%|██████████| 142/142 [00:04<00:00, 32.72it/s]
Epoch ... (1/8): 25%|██▌ | 2/8 [01:20<03:23, 33.88s/it]
Training...: 100%|██████████| 142/142 [00:04<00:00, 32.95it/s]
Epoch ... (1/8): 38%|███▊ | 3/8 [01:25<01:43, 20.71s/it]
Training...: 51%|█████ | 72/142 [00:02<00:02, 34.41it/s]
Step... (500 | Loss: 8.018753051757812, Learning Rate: 0.0001500000071246177)
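The logged learning rate is consistent with the configuration above: with learning_rate=3e-4 and warmup_steps=1000 on a linear schedule, step 500 sits halfway through warmup, giving 1.5e-4 (the trailing digits are float32 rounding). A minimal sketch of the warmup arithmetic (the exact schedule function in the training script is an assumption):

def linear_warmup_lr(step, peak_lr=3e-4, warmup_steps=1000):
    # Linear warmup from 0 to peak_lr over warmup_steps.
    return peak_lr * min(step, warmup_steps) / warmup_steps

print(linear_warmup_lr(500))  # 0.00015, matching the logged value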
Evaluating ...: 100%|██████████| 10/10 [00:04<00:00, 2.12it/s]
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
To disable this warning, you can either:
- Avoid using `tokenizers` before the fork if possible
- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
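A minimal sketch of the environment-variable remedy named above, which must take effect before the tokenizer is first used in the parent process (the placement within the training script is an assumption):

import os

# Disable tokenizer-internal parallelism so forked workers do not deadlock.
os.environ["TOKENIZERS_PARALLELISM"] = "false"

from transformers import AutoTokenizer  # imported after the variable is set

tokenizer = AutoTokenizer.from_pretrained("roberta-base")  # checkpoint name is illustrative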
[16:06:12] - INFO - huggingface_hub.repository - git version 2.25.1
git-lfs/2.9.2 (GitHub; linux amd64; go 1.13.5)
[16:06:12] - DEBUG - huggingface_hub.repository - [Repository] is a valid git repo
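These huggingface_hub.repository lines correspond to the step-500 checkpoint push; a minimal sketch of that flow using the era's Repository API (since deprecated in favor of HfApi upload methods; the local path and commit message mirror this run):

from huggingface_hub import Repository

# Reuse the local checkout of the model repo; "./" matches output_dir above.
repo = Repository(local_dir="./", clone_from="flax-community/roberta-base-mr")

# After weights and logs are saved locally, commit and push them.
repo.push_to_hub(commit_message="Saving weights and logs of step 500")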