roberta-base-mr / run.log
Saving weights and logs of step 1000
faa64e8
[10:39:50] - INFO - __main__ - Training/evaluation parameters TrainingArguments(
_n_gpu=-1,
adafactor=False,
adam_beta1=0.9,
adam_beta2=0.999,
adam_epsilon=1e-08,
dataloader_drop_last=False,
dataloader_num_workers=0,
dataloader_pin_memory=True,
ddp_find_unused_parameters=None,
debug=[],
deepspeed=None,
disable_tqdm=False,
do_eval=False,
do_predict=False,
do_train=False,
eval_accumulation_steps=None,
eval_steps=500,
evaluation_strategy=IntervalStrategy.NO,
fp16=False,
fp16_backend=auto,
fp16_full_eval=False,
fp16_opt_level=O1,
gradient_accumulation_steps=1,
greater_is_better=None,
group_by_length=False,
ignore_data_skip=False,
label_names=None,
label_smoothing_factor=0.0,
learning_rate=0.0003,
length_column_name=length,
load_best_model_at_end=False,
local_rank=-1,
log_level=-1,
log_level_replica=-1,
log_on_each_node=True,
logging_dir=./runs/Jul09_10-39-50_t1v-n-112df4a9-w-0,
logging_first_step=False,
logging_steps=500,
logging_strategy=IntervalStrategy.STEPS,
lr_scheduler_type=SchedulerType.LINEAR,
max_grad_norm=1.0,
max_steps=-1,
metric_for_best_model=None,
mp_parameters=,
no_cuda=False,
num_train_epochs=8.0,
output_dir=./,
overwrite_output_dir=True,
past_index=-1,
per_device_eval_batch_size=4,
per_device_train_batch_size=4,
prediction_loss_only=False,
push_to_hub=True,
push_to_hub_model_id=flax-community/roberta-base-mr,
push_to_hub_organization=None,
push_to_hub_token=vdIAyRvCACJNslYtyLHufmNDnUIyknPzUgVDMFiXqJoulvMqjoubonLJzXOJQJczWfRMJumVaMFjGSFVnQAMdswvZkzNIthKrxBeARBXfqnIwjABkKpCbjGEgnkjpjKi,
remove_unused_columns=True,
report_to=['wandb'],
resume_from_checkpoint=None,
run_name=hf-flax-robert-base-mr,
save_on_each_node=False,
save_steps=500,
save_strategy=IntervalStrategy.STEPS,
save_total_limit=None,
seed=42,
sharded_ddp=[],
skip_memory_metrics=True,
tpu_metrics_debug=False,
tpu_num_cores=None,
use_legacy_prediction_loop=False,
warmup_ratio=0.0,
warmup_steps=1000,
weight_decay=0.0,
)
[10:39:50] - DEBUG - urllib3.connectionpool - Starting new HTTPS connection (1): s3.amazonaws.com:443
[10:39:50] - DEBUG - urllib3.connectionpool - https://s3.amazonaws.com:443 "HEAD /datasets.huggingface.co/datasets/datasets/oscar/oscar.py HTTP/1.1" 404 0
[10:39:50] - DEBUG - urllib3.connectionpool - Starting new HTTPS connection (1): raw.githubusercontent.com:443
[10:39:50] - DEBUG - urllib3.connectionpool - https://raw.githubusercontent.com:443 "HEAD /huggingface/datasets/master/datasets/oscar/oscar.py HTTP/1.1" 200 0
[10:39:50] - DEBUG - urllib3.connectionpool - Starting new HTTPS connection (1): raw.githubusercontent.com:443
[10:39:50] - DEBUG - urllib3.connectionpool - https://raw.githubusercontent.com:443 "HEAD /huggingface/datasets/master/datasets/oscar/dataset_infos.json HTTP/1.1" 200 0
[10:39:50] - WARNING - datasets.builder - Reusing dataset oscar (/home/nipunsadvilkar/.cache/huggingface/datasets/oscar/unshuffled_deduplicated_als/1.0.0/84838bd49d2295f62008383b05620571535451d84545037bb94d6f3501651df2)
[10:39:50] - DEBUG - urllib3.connectionpool - Starting new HTTPS connection (1): s3.amazonaws.com:443
[10:39:51] - DEBUG - urllib3.connectionpool - https://s3.amazonaws.com:443 "HEAD /datasets.huggingface.co/datasets/datasets/oscar/oscar.py HTTP/1.1" 404 0
[10:39:51] - DEBUG - urllib3.connectionpool - Starting new HTTPS connection (1): raw.githubusercontent.com:443
[10:39:51] - DEBUG - urllib3.connectionpool - https://raw.githubusercontent.com:443 "HEAD /huggingface/datasets/master/datasets/oscar/oscar.py HTTP/1.1" 200 0
[10:39:51] - DEBUG - urllib3.connectionpool - Starting new HTTPS connection (1): raw.githubusercontent.com:443
[10:39:51] - DEBUG - urllib3.connectionpool - https://raw.githubusercontent.com:443 "HEAD /huggingface/datasets/master/datasets/oscar/dataset_infos.json HTTP/1.1" 200 0
[10:39:51] - WARNING - datasets.builder - Reusing dataset oscar (/home/nipunsadvilkar/.cache/huggingface/datasets/oscar/unshuffled_deduplicated_als/1.0.0/84838bd49d2295f62008383b05620571535451d84545037bb94d6f3501651df2)
[10:39:51] - DEBUG - urllib3.connectionpool - Starting new HTTPS connection (1): s3.amazonaws.com:443
[10:39:51] - DEBUG - urllib3.connectionpool - https://s3.amazonaws.com:443 "HEAD /datasets.huggingface.co/datasets/datasets/oscar/oscar.py HTTP/1.1" 404 0
[10:39:51] - DEBUG - urllib3.connectionpool - Starting new HTTPS connection (1): raw.githubusercontent.com:443
[10:39:51] - DEBUG - urllib3.connectionpool - https://raw.githubusercontent.com:443 "HEAD /huggingface/datasets/master/datasets/oscar/oscar.py HTTP/1.1" 200 0
[10:39:51] - DEBUG - urllib3.connectionpool - Starting new HTTPS connection (1): raw.githubusercontent.com:443
[10:39:51] - DEBUG - urllib3.connectionpool - https://raw.githubusercontent.com:443 "HEAD /huggingface/datasets/master/datasets/oscar/dataset_infos.json HTTP/1.1" 200 0
[10:39:51] - WARNING - datasets.builder - Reusing dataset oscar (/home/nipunsadvilkar/.cache/huggingface/datasets/oscar/unshuffled_deduplicated_als/1.0.0/84838bd49d2295f62008383b05620571535451d84545037bb94d6f3501651df2)
[10:39:51] - WARNING - datasets.arrow_dataset - Loading cached processed dataset at /home/nipunsadvilkar/.cache/huggingface/datasets/oscar/unshuffled_deduplicated_als/1.0.0/84838bd49d2295f62008383b05620571535451d84545037bb94d6f3501651df2/cache-0f52086e7b10d7e8.arrow
[10:39:51] - WARNING - datasets.arrow_dataset - Loading cached processed dataset at /home/nipunsadvilkar/.cache/huggingface/datasets/oscar/unshuffled_deduplicated_als/1.0.0/84838bd49d2295f62008383b05620571535451d84545037bb94d6f3501651df2/cache-a39e5f5a5c6c69fc.arrow
[10:39:51] - WARNING - datasets.arrow_dataset - Loading cached processed dataset at /home/nipunsadvilkar/.cache/huggingface/datasets/oscar/unshuffled_deduplicated_als/1.0.0/84838bd49d2295f62008383b05620571535451d84545037bb94d6f3501651df2/cache-e4d3282a2dd50fa0.arrow
[10:39:51] - WARNING - datasets.arrow_dataset - Loading cached processed dataset at /home/nipunsadvilkar/.cache/huggingface/datasets/oscar/unshuffled_deduplicated_als/1.0.0/84838bd49d2295f62008383b05620571535451d84545037bb94d6f3501651df2/cache-b9a3aa9913be3b34.arrow
[10:39:51] - INFO - absl - Starting the local TPU driver.
[10:39:51] - INFO - absl - Unable to initialize backend 'tpu_driver': Not found: Unable to find driver in registry given worker: local://
[10:39:51] - INFO - absl - Unable to initialize backend 'gpu': Not found: Could not find registered platform with name: "cuda". Available platform names are: Interpreter Host TPU
[10:39:55] - WARNING - __main__ - Unable to display metrics through TensorBoard because some package are not installed: No module named 'tensorflow'
/home/nipunsadvilkar/roberta_mr_env/lib/python3.8/site-packages/jax/lib/xla_bridge.py:382: UserWarning: jax.host_count has been renamed to jax.process_count. This alias will eventually be removed; please update your code.
warnings.warn(
/home/nipunsadvilkar/roberta_mr_env/lib/python3.8/site-packages/jax/lib/xla_bridge.py:369: UserWarning: jax.host_id has been renamed to jax.process_index. This alias will eventually be removed; please update your code.
warnings.warn(
Epoch ... (1/8): 0%| | 0/8 [00:00<?, ?it/s]
Training...: 0%| | 0/142 [00:00<?, ?it/s]
Training...: 1%| | 1/142 [01:11<2:47:29, 71.28s/it]
Training...: 4%|▎ | 5/142 [01:11<24:17, 10.64s/it] 
Training...: 6%|▋ | 9/142 [01:11<10:42, 4.83s/it]
Training...: 9%|▉ | 13/142 [01:11<05:50, 2.72s/it]
Training...: 12%|█▏ | 17/142 [01:11<03:29, 1.68s/it]
Training...: 15%|█▍ | 21/142 [01:11<02:12, 1.09s/it]
Training...: 18%|█▊ | 25/142 [01:11<01:25, 1.36it/s]
Training...: 20%|██ | 29/142 [01:12<00:56, 1.99it/s]
Training...: 23%|██▎ | 33/142 [01:12<00:38, 2.84it/s]
Training...: 26%|██▌ | 37/142 [01:12<00:26, 4.00it/s]
Training...: 29%|██▉ | 41/142 [01:12<00:18, 5.53it/s]
Training...: 32%|███▏ | 45/142 [01:12<00:12, 7.50it/s]
Training...: 35%|███▍ | 49/142 [01:12<00:09, 9.95it/s]
Training...: 37%|███▋ | 53/142 [01:12<00:07, 12.04it/s]
Training...: 40%|████ | 57/142 [01:12<00:05, 15.21it/s]
Training...: 43%|████▎ | 61/142 [01:12<00:04, 18.68it/s]
Training...: 46%|████▌ | 65/142 [01:13<00:03, 22.25it/s]
Training...: 49%|████▊ | 69/142 [01:13<00:02, 25.60it/s]
Training...: 51%|█████▏ | 73/142 [01:13<00:02, 28.52it/s]
Training...: 54%|█████▍ | 77/142 [01:13<00:02, 31.04it/s]
Training...: 57%|█████▋ | 81/142 [01:13<00:01, 32.77it/s]
Training...: 60%|█████▉ | 85/142 [01:13<00:01, 34.41it/s]
Training...: 63%|██████▎ | 89/142 [01:13<00:01, 35.52it/s]
Training...: 65%|██████▌ | 93/142 [01:13<00:01, 36.36it/s]
Training...: 68%|██████▊ | 97/142 [01:13<00:01, 37.24it/s]
Training...: 71%|███████ | 101/142 [01:14<00:01, 31.78it/s]
Training...: 74%|███████▍ | 105/142 [01:14<00:01, 33.60it/s]
Training...: 77%|███████▋ | 109/142 [01:14<00:00, 35.07it/s]
Training...: 80%|███████▉ | 113/142 [01:14<00:00, 36.13it/s]
Training...: 82%|████████▏ | 117/142 [01:14<00:00, 36.96it/s]
Training...: 85%|████████▌ | 121/142 [01:14<00:00, 37.73it/s]
Training...: 88%|████████▊ | 125/142 [01:14<00:00, 37.99it/s]
Training...: 91%|█████████ | 129/142 [01:14<00:00, 38.18it/s]
Training...: 94%|█████████▎| 133/142 [01:14<00:00, 38.35it/s]
Training...: 96%|█████████▋| 137/142 [01:14<00:00, 38.48it/s]
Training...: 99%|█████████▉| 141/142 [01:15<00:00, 38.68it/s]
Training...: 100%|██████████| 142/142 [01:15<00:00, 1.89it/s]
Epoch ... (1/8): 12%|█▎ | 1/8 [01:16<08:56, 76.63s/it]
Training...: 0%| | 0/142 [00:00<?, ?it/s]
Training...: 3%|▎ | 4/142 [00:00<00:06, 22.73it/s]
Training...: 6%|▌ | 8/142 [00:00<00:04, 30.02it/s]
Training...: 8%|▊ | 12/142 [00:00<00:03, 33.57it/s]
Training...: 11%|█▏ | 16/142 [00:00<00:03, 35.43it/s]
Training...: 14%|█▍ | 20/142 [00:00<00:03, 36.16it/s]
Training...: 17%|█▋ | 24/142 [00:00<00:03, 37.10it/s]
Training...: 20%|█▉ | 28/142 [00:00<00:03, 37.86it/s]
Training...: 23%|██▎ | 32/142 [00:00<00:02, 38.32it/s]
Training...: 25%|██▌ | 36/142 [00:00<00:02, 38.49it/s]
Training...: 28%|██▊ | 40/142 [00:01<00:02, 38.77it/s]
Training...: 31%|███ | 44/142 [00:01<00:02, 38.95it/s]
Training...: 34%|███▍ | 48/142 [00:01<00:02, 39.06it/s]
Training...: 37%|███▋ | 52/142 [00:01<00:02, 32.56it/s]
Training...: 39%|███▉ | 56/142 [00:01<00:02, 34.34it/s]
Training...: 42%|████▏ | 60/142 [00:01<00:02, 35.68it/s]
Training...: 45%|████▌ | 64/142 [00:01<00:02, 36.38it/s]
Training...: 48%|████▊ | 68/142 [00:01<00:02, 36.97it/s]
Training...: 51%|█████ | 72/142 [00:01<00:01, 37.55it/s]
Training...: 54%|█████▎ | 76/142 [00:02<00:01, 37.95it/s]
Training...: 56%|█████▋ | 80/142 [00:02<00:01, 38.38it/s]
Training...: 59%|█████▉ | 84/142 [00:02<00:01, 38.74it/s]
Training...: 62%|██████▏ | 88/142 [00:02<00:01, 38.75it/s]
Training...: 65%|██████▍ | 92/142 [00:02<00:01, 38.84it/s]
Training...: 68%|██████▊ | 96/142 [00:02<00:01, 32.04it/s]
Training...: 70%|███████ | 100/142 [00:02<00:01, 33.77it/s]
Training...: 73%|███████▎ | 104/142 [00:02<00:01, 35.13it/s]
Training...: 76%|███████▌ | 108/142 [00:02<00:00, 36.06it/s]
Training...: 79%|███████▉ | 112/142 [00:03<00:00, 36.88it/s]
Training...: 82%|████████▏ | 116/142 [00:03<00:00, 37.52it/s]
Training...: 85%|████████▍ | 120/142 [00:03<00:00, 37.90it/s]
Training...: 87%|████████▋ | 124/142 [00:03<00:00, 37.82it/s]
Training...: 90%|█████████ | 128/142 [00:03<00:00, 37.72it/s]
Training...: 93%|█████████▎| 132/142 [00:03<00:00, 37.34it/s]
Training...: 96%|█████████▌| 136/142 [00:03<00:00, 37.80it/s]
Training...: 99%|█████████▊| 140/142 [00:03<00:00, 37.94it/s]
Training...: 100%|██████████| 142/142 [00:03<00:00, 36.62it/s]
Epoch ... (1/8): 25%|██▌ | 2/8 [01:21<03:27, 34.53s/it]
Training...: 0%| | 0/142 [00:00<?, ?it/s]
Training...: 1%|▏ | 2/142 [00:00<00:08, 15.66it/s]
Training...: 4%|▍ | 6/142 [00:00<00:04, 27.89it/s]
Training...: 7%|▋ | 10/142 [00:00<00:04, 32.70it/s]
Training...: 10%|▉ | 14/142 [00:00<00:03, 34.96it/s]
Training...: 13%|█▎ | 18/142 [00:00<00:03, 35.69it/s]
Training...: 15%|█▌ | 22/142 [00:00<00:03, 36.78it/s]
Training...: 18%|█▊ | 26/142 [00:00<00:03, 37.30it/s]
Training...: 21%|██ | 30/142 [00:00<00:02, 37.77it/s]
Training...: 24%|██▍ | 34/142 [00:00<00:02, 38.08it/s]
Training...: 27%|██▋ | 38/142 [00:01<00:02, 38.31it/s]
Training...: 30%|██▉ | 42/142 [00:01<00:02, 38.45it/s]
Training...: 32%|███▏ | 46/142 [00:01<00:02, 38.54it/s]
Training...: 35%|███▌ | 50/142 [00:01<00:02, 32.09it/s]
Training...: 38%|███▊ | 54/142 [00:01<00:02, 33.31it/s]
Training...: 41%|████ | 58/142 [00:01<00:02, 34.82it/s]
Training...: 44%|████▎ | 62/142 [00:01<00:02, 35.87it/s]
Training...: 46%|████▋ | 66/142 [00:01<00:02, 36.41it/s]
Training...: 49%|████▉ | 70/142 [00:01<00:01, 36.91it/s]
Training...: 52%|█████▏ | 74/142 [00:02<00:01, 37.54it/s]
Training...: 55%|█████▍ | 78/142 [00:02<00:01, 37.97it/s]
Training...: 58%|█████▊ | 82/142 [00:02<00:01, 38.26it/s]
Training...: 61%|██████ | 86/142 [00:02<00:01, 38.15it/s]
Training...: 63%|██████▎ | 90/142 [00:02<00:01, 38.21it/s]
Training...: 66%|██████▌ | 94/142 [00:02<00:01, 31.80it/s]
Training...: 69%|██████▉ | 98/142 [00:02<00:01, 33.50it/s]
Training...: 72%|███████▏ | 102/142 [00:02<00:01, 34.90it/s]
Training...: 75%|███████▍ | 106/142 [00:02<00:01, 35.99it/s]
Training...: 77%|███████▋ | 110/142 [00:03<00:00, 36.77it/s]
Training...: 80%|████████ | 114/142 [00:03<00:00, 37.46it/s]
Training...: 83%|████████▎ | 118/142 [00:03<00:00, 37.99it/s]
Training...: 86%|████████▌ | 122/142 [00:03<00:00, 38.01it/s]
Training...: 89%|████████▊ | 126/142 [00:03<00:00, 38.23it/s]
Training...: 92%|█████████▏| 130/142 [00:03<00:00, 38.39it/s]
Training...: 94%|█████████▍| 134/142 [00:03<00:00, 38.40it/s]
Training...: 97%|█████████▋| 138/142 [00:03<00:00, 32.32it/s]
Training...: 100%|██████████| 142/142 [00:03<00:00, 33.73it/s]
Training...: 100%|██████████| 142/142 [00:03<00:00, 35.76it/s]
Epoch ... (1/8): 38%|███▊ | 3/8 [01:26<01:45, 21.11s/it]
Training...: 0%| | 0/142 [00:00<?, ?it/s]
Training...: 3%|▎ | 4/142 [00:00<00:03, 36.08it/s]
Training...: 6%|▌ | 8/142 [00:00<00:03, 36.95it/s]
Training...: 8%|▊ | 12/142 [00:00<00:03, 37.42it/s]
Training...: 11%|█▏ | 16/142 [00:00<00:03, 38.09it/s]
Training...: 14%|█▍ | 20/142 [00:00<00:03, 38.37it/s]
Training...: 17%|█▋ | 24/142 [00:00<00:03, 38.66it/s]
Training...: 20%|█▉ | 28/142 [00:00<00:02, 38.79it/s]
Training...: 23%|██▎ | 32/142 [00:00<00:02, 38.70it/s]
Training...: 25%|██▌ | 36/142 [00:00<00:02, 38.85it/s]
Training...: 28%|██▊ | 40/142 [00:01<00:02, 39.07it/s]
Training...: 31%|███ | 44/142 [00:01<00:02, 38.60it/s]
Training...: 34%|███▍ | 48/142 [00:01<00:03, 31.07it/s]
Training...: 37%|███▋ | 52/142 [00:01<00:02, 33.04it/s]
Training...: 39%|███▉ | 56/142 [00:01<00:02, 34.37it/s]
Training...: 42%|████▏ | 60/142 [00:01<00:02, 35.64it/s]
Training...: 45%|████▌ | 64/142 [00:01<00:02, 36.55it/s]
Training...: 48%|████▊ | 68/142 [00:01<00:01, 37.17it/s]
Training...: 51%|█████ | 72/142 [00:01<00:01, 37.75it/s]
Epoch ... (1/8): 38%|███▊ | 3/8 [01:30<01:45, 21.11s/it]
Training...: 51%|█████ | 72/142 [00:02<00:01, 37.75it/s]
Step... (500 | Loss: 8.018753051757812, Learning Rate: 0.0001500000071246177)
Evaluating ...: 0%| | 0/10 [00:00<?, ?it/s]
Evaluating ...: 10%|█ | 1/10 [00:03<00:35, 3.99s/it]
Evaluating ...: 100%|██████████| 10/10 [00:04<00:00, 3.34it/s]
Evaluating ...: 100%|██████████| 10/10 [00:04<00:00, 2.44it/s]
[10:41:35] - INFO - huggingface_hub.repository - git version 2.25.1
git-lfs/2.9.2 (GitHub; linux amd64; go 1.13.5)
[10:41:35] - DEBUG - huggingface_hub.repository - [Repository] is a valid git repo
Training...: 51%|█████ | 72/142 [00:13<00:01, 37.75it/s]
[10:41:45] - INFO - huggingface_hub.repository - Uploading LFS objects: 100% (1/1), 499 MB | 0 B/s, done.
Training...: 53%|█████▎ | 75/142 [00:17<01:28, 1.32s/it]
Training...: 56%|█████▌ | 79/142 [00:18<00:57, 1.10it/s]
Training...: 58%|█████▊ | 83/142 [00:18<00:37, 1.57it/s]
Training...: 61%|██████▏ | 87/142 [00:18<00:24, 2.23it/s]
Training...: 64%|██████▍ | 91/142 [00:18<00:16, 3.14it/s]
Training...: 67%|██████▋ | 95/142 [00:18<00:10, 4.35it/s]
Training...: 70%|██████▉ | 99/142 [00:18<00:07, 5.94it/s]
Training...: 73%|███████▎ | 103/142 [00:18<00:05, 7.59it/s]
Training...: 75%|███████▌ | 107/142 [00:18<00:03, 10.00it/s]
Training...: 78%|███████▊ | 111/142 [00:19<00:02, 12.87it/s]
Training...: 81%|████████ | 115/142 [00:19<00:01, 16.11it/s]
Training...: 84%|████████▍ | 119/142 [00:19<00:01, 19.52it/s]
Training...: 87%|████████▋ | 123/142 [00:19<00:00, 22.83it/s]
Training...: 89%|████████▉ | 127/142 [00:19<00:00, 26.05it/s]
Training...: 92%|█████████▏| 131/142 [00:19<00:00, 28.85it/s]
Training...: 95%|█████████▌| 135/142 [00:19<00:00, 31.19it/s]
Training...: 98%|█████████▊| 139/142 [00:19<00:00, 33.12it/s]
Training...: 100%|██████████| 142/142 [00:19<00:00, 7.17it/s]
Step... (500 | Loss: 8.205772399902344, Acc: 0.0773010179400444): 50%|█████ | 4/8 [01:47<01:24, 21.03s/it]
Training...: 0%| | 0/142 [00:00<?, ?it/s]
Training...: 3%|▎ | 4/142 [00:00<00:03, 36.55it/s]
Training...: 6%|▌ | 8/142 [00:00<00:03, 37.90it/s]
Training...: 8%|▊ | 12/142 [00:00<00:04, 29.06it/s]
Training...: 11%|█▏ | 16/142 [00:00<00:03, 32.02it/s]
Training...: 14%|█▍ | 20/142 [00:00<00:03, 34.16it/s]
Training...: 17%|█▋ | 24/142 [00:00<00:03, 35.55it/s]
Training...: 20%|█▉ | 28/142 [00:00<00:03, 36.39it/s]
Training...: 23%|██▎ | 32/142 [00:00<00:02, 36.97it/s]
Training...: 25%|██▌ | 36/142 [00:01<00:02, 37.26it/s]
Training...: 28%|██▊ | 40/142 [00:01<00:02, 37.64it/s]
Training...: 31%|███ | 44/142 [00:01<00:02, 37.99it/s]
Training...: 34%|███▍ | 48/142 [00:01<00:02, 38.07it/s]
Training...: 37%|███▋ | 52/142 [00:01<00:02, 38.06it/s]
Training...: 39%|███▉ | 56/142 [00:01<00:02, 38.16it/s]
Training...: 42%|████▏ | 60/142 [00:01<00:02, 38.21it/s]
Training...: 45%|████▌ | 64/142 [00:01<00:02, 31.65it/s]
Training...: 48%|████▊ | 68/142 [00:01<00:02, 33.46it/s]
Training...: 51%|█████ | 72/142 [00:02<00:02, 34.76it/s]
Training...: 54%|█████▎ | 76/142 [00:02<00:01, 35.75it/s]
Training...: 56%|█████▋ | 80/142 [00:02<00:01, 36.59it/s]
Training...: 59%|█████▉ | 84/142 [00:02<00:01, 37.14it/s]
Training...: 62%|██████▏ | 88/142 [00:02<00:01, 37.29it/s]
Training...: 65%|██████▍ | 92/142 [00:02<00:01, 37.80it/s]
Training...: 68%|██████▊ | 96/142 [00:02<00:01, 38.07it/s]
Training...: 70%|███████ | 100/142 [00:02<00:01, 38.38it/s]
Training...: 73%|███████▎ | 104/142 [00:02<00:00, 38.51it/s]
Training...: 76%|███████▌ | 108/142 [00:02<00:00, 38.29it/s]
Training...: 79%|███████▉ | 112/142 [00:03<00:00, 31.72it/s]
Training...: 82%|████████▏ | 116/142 [00:03<00:00, 33.22it/s]
Training...: 85%|████████▍ | 120/142 [00:03<00:00, 34.52it/s]
Training...: 87%|████████▋ | 124/142 [00:03<00:00, 35.43it/s]
Training...: 90%|█████████ | 128/142 [00:03<00:00, 36.22it/s]
Training...: 93%|█████████▎| 132/142 [00:03<00:00, 36.56it/s]
Training...: 96%|█████████▌| 136/142 [00:03<00:00, 36.70it/s]
Training...: 99%|█████████▊| 140/142 [00:03<00:00, 36.69it/s]
Training...: 100%|██████████| 142/142 [00:03<00:00, 36.06it/s]
Step... (500 | Loss: 8.205772399902344, Acc: 0.0773010179400444): 62%|██████▎ | 5/8 [01:52<00:45, 15.10s/it]
Training...: 0%| | 0/142 [00:00<?, ?it/s]
Training...: 3%|▎ | 4/142 [00:00<00:03, 34.85it/s]
Training...: 6%|▌ | 8/142 [00:00<00:03, 36.26it/s]
Training...: 8%|▊ | 12/142 [00:00<00:03, 36.55it/s]
Training...: 11%|█▏ | 16/142 [00:00<00:03, 36.40it/s]
Training...: 14%|█▍ | 20/142 [00:00<00:03, 36.70it/s]
Training...: 17%|█▋ | 24/142 [00:00<00:03, 29.87it/s]
Training...: 20%|█▉ | 28/142 [00:00<00:03, 30.01it/s]
Training...: 23%|██▎ | 32/142 [00:00<00:03, 31.93it/s]
Training...: 25%|██▌ | 36/142 [00:01<00:03, 33.23it/s]
Training...: 28%|██▊ | 40/142 [00:01<00:02, 34.10it/s]
Training...: 31%|███ | 44/142 [00:01<00:02, 34.87it/s]
Training...: 34%|███▍ | 48/142 [00:01<00:02, 35.52it/s]
Training...: 37%|███▋ | 52/142 [00:01<00:02, 36.11it/s]
Training...: 39%|███▉ | 56/142 [00:01<00:02, 36.62it/s]
Training...: 42%|████▏ | 60/142 [00:01<00:02, 36.92it/s]
Training...: 45%|████▌ | 64/142 [00:01<00:02, 36.97it/s]
Training...: 48%|████▊ | 68/142 [00:01<00:02, 36.96it/s]
Training...: 51%|█████ | 72/142 [00:02<00:02, 30.68it/s]
Training...: 54%|█████▎ | 76/142 [00:02<00:02, 32.39it/s]
Training...: 56%|█████▋ | 80/142 [00:02<00:01, 33.79it/s]
Training...: 59%|█████▉ | 84/142 [00:02<00:01, 34.84it/s]
Training...: 62%|██████▏ | 88/142 [00:02<00:01, 35.62it/s]
Training...: 65%|██████▍ | 92/142 [00:02<00:01, 36.20it/s]
Training...: 68%|██████▊ | 96/142 [00:02<00:01, 36.63it/s]
Training...: 70%|███████ | 100/142 [00:02<00:01, 36.78it/s]
Training...: 73%|███████▎ | 104/142 [00:02<00:01, 37.14it/s]
Training...: 76%|███████▌ | 108/142 [00:03<00:00, 37.25it/s]
Training...: 79%|███████▉ | 112/142 [00:03<00:00, 37.32it/s]
Training...: 82%|████████▏ | 116/142 [00:03<00:00, 37.39it/s]
Training...: 85%|████████▍ | 120/142 [00:03<00:00, 37.24it/s]
Training...: 87%|████████▋ | 124/142 [00:03<00:00, 30.76it/s]
Training...: 90%|█████████ | 128/142 [00:03<00:00, 32.55it/s]
Training...: 93%|█████████▎| 132/142 [00:03<00:00, 33.90it/s]
Training...: 96%|█████████▌| 136/142 [00:03<00:00, 34.84it/s]
Training...: 99%|█████████▊| 140/142 [00:04<00:00, 35.61it/s]
Training...: 100%|██████████| 142/142 [00:04<00:00, 34.90it/s]
Step... (500 | Loss: 8.205772399902344, Acc: 0.0773010179400444): 75%|███████▌ | 6/8 [01:57<00:23, 11.71s/it]
Training...: 0%| | 0/142 [00:00<?, ?it/s]
Training...: 3%|▎ | 4/142 [00:00<00:03, 34.82it/s]
Training...: 6%|▌ | 8/142 [00:00<00:03, 36.31it/s]
Training...: 8%|▊ | 12/142 [00:00<00:03, 36.77it/s]
Training...: 11%|█▏ | 16/142 [00:00<00:03, 36.94it/s]
Training...: 14%|█▍ | 20/142 [00:00<00:03, 37.29it/s]
Training...: 17%|█▋ | 24/142 [00:00<00:03, 37.27it/s]
Training...: 20%|█▉ | 28/142 [00:00<00:03, 37.30it/s]
Training...: 23%|██▎ | 32/142 [00:00<00:03, 30.36it/s]
Training...: 25%|██▌ | 36/142 [00:01<00:03, 32.21it/s]
Training...: 28%|██▊ | 40/142 [00:01<00:03, 33.69it/s]
Training...: 31%|███ | 44/142 [00:01<00:02, 34.69it/s]
Training...: 34%|███▍ | 48/142 [00:01<00:02, 35.49it/s]
Training...: 37%|███▋ | 52/142 [00:01<00:02, 35.93it/s]
Training...: 39%|███▉ | 56/142 [00:01<00:02, 36.41it/s]
Training...: 42%|████▏ | 60/142 [00:01<00:02, 36.70it/s]
Training...: 45%|████▌ | 64/142 [00:01<00:02, 36.97it/s]
Training...: 48%|████▊ | 68/142 [00:01<00:01, 37.02it/s]
Training...: 51%|█████ | 72/142 [00:02<00:01, 36.95it/s]
Training...: 54%|█████▎ | 76/142 [00:02<00:01, 36.96it/s]
Training...: 56%|█████▋ | 80/142 [00:02<00:01, 37.19it/s]
Training...: 59%|█████▉ | 84/142 [00:02<00:01, 30.86it/s]
Training...: 62%|██████▏ | 88/142 [00:02<00:01, 32.52it/s]
Training...: 65%|██████▍ | 92/142 [00:02<00:01, 33.90it/s]
Training...: 68%|██████▊ | 96/142 [00:02<00:01, 34.89it/s]
Training...: 70%|███████ | 100/142 [00:02<00:01, 35.49it/s]
Training...: 73%|███████▎ | 104/142 [00:02<00:01, 36.04it/s]
Training...: 76%|███████▌ | 108/142 [00:03<00:00, 36.26it/s]
Training...: 79%|███████▉ | 112/142 [00:03<00:00, 36.46it/s]
Training...: 82%|████████▏ | 116/142 [00:03<00:00, 36.83it/s]
Training...: 85%|████████▍ | 120/142 [00:03<00:00, 36.95it/s]
Training...: 87%|████████▋ | 124/142 [00:03<00:00, 36.81it/s]
Training...: 90%|█████████ | 128/142 [00:03<00:00, 36.96it/s]
Training...: 93%|█████████▎| 132/142 [00:03<00:00, 30.71it/s]
Training...: 96%|█████████▌| 136/142 [00:03<00:00, 32.31it/s]
Training...: 99%|█████████▊| 140/142 [00:03<00:00, 33.64it/s]
Training...: 100%|██████████| 142/142 [00:04<00:00, 35.11it/s]
Step... (500 | Loss: 8.205772399902344, Acc: 0.0773010179400444): 88%|████████▊ | 7/8 [02:02<00:09, 9.51s/it]
Training...: 0%| | 0/142 [00:00<?, ?it/s]
Training...: 3%|▎ | 4/142 [00:00<00:03, 34.95it/s]
Step... (500 | Loss: 8.205772399902344, Acc: 0.0773010179400444): 88%|████████▊ | 7/8 [02:03<00:09, 9.51s/it]
Training...: 3%|▎ | 4/142 [00:00<00:03, 34.95it/s]
Step... (1000 | Loss: 7.807257652282715, Learning Rate: 0.0003000000142492354)
Evaluating ...: 0%| | 0/10 [00:00<?, ?it/s]
Evaluating ...: 80%|████████ | 8/10 [00:00<00:00, 78.51it/s]
Evaluating ...: 100%|██████████| 10/10 [00:00<00:00, 78.72it/s]
[10:42:04] - INFO - huggingface_hub.repository - git version 2.25.1
git-lfs/2.9.2 (GitHub; linux amd64; go 1.13.5)
[10:42:04] - DEBUG - huggingface_hub.repository - [Repository] is a valid git repo