dat committed
Commit 25565f9
1 parent: def9a45

update readme and pt model
Files changed (40)
  1. README.md +64 -0
  2. events.out.tfevents.1626429561.t1v-n-f5c06ea1-w-0.782479.3.v2 +2 -2
  3. events.out.tfevents.1626474327.t1v-n-f5c06ea1-w-0.794570.3.v2 +3 -0
  4. events.out.tfevents.1626474410.t1v-n-f5c06ea1-w-0.796231.3.v2 +3 -0
  5. events.out.tfevents.1626474829.t1v-n-f5c06ea1-w-0.798495.3.v2 +3 -0
  6. pytorch_model.bin +1 -1
  7. run.sh +3 -3
  8. run_mlm_flax_no_accum.py +3 -2
  9. wandb/debug-internal.log +1 -1
  10. wandb/debug.log +1 -1
  11. wandb/latest-run +1 -1
  12. wandb/run-20210716_095921-13hxxunp/files/output.log +17 -0
  13. wandb/run-20210716_095921-13hxxunp/files/wandb-summary.json +1 -1
  14. wandb/run-20210716_095921-13hxxunp/logs/debug-internal.log +64 -0
  15. wandb/run-20210716_095921-13hxxunp/logs/debug.log +2 -0
  16. wandb/run-20210716_095921-13hxxunp/run-13hxxunp.wandb +0 -0
  17. wandb/run-20210716_222528-3qk3dij4/files/config.yaml +308 -0
  18. wandb/run-20210716_222528-3qk3dij4/files/output.log +6 -0
  19. wandb/run-20210716_222528-3qk3dij4/files/requirements.txt +95 -0
  20. wandb/run-20210716_222528-3qk3dij4/files/wandb-metadata.json +45 -0
  21. wandb/run-20210716_222528-3qk3dij4/files/wandb-summary.json +1 -0
  22. wandb/run-20210716_222528-3qk3dij4/logs/debug-internal.log +54 -0
  23. wandb/run-20210716_222528-3qk3dij4/logs/debug.log +28 -0
  24. wandb/run-20210716_222528-3qk3dij4/run-3qk3dij4.wandb +0 -0
  25. wandb/run-20210716_222651-1lrzcta0/files/config.yaml +308 -0
  26. wandb/run-20210716_222651-1lrzcta0/files/output.log +8 -0
  27. wandb/run-20210716_222651-1lrzcta0/files/requirements.txt +95 -0
  28. wandb/run-20210716_222651-1lrzcta0/files/wandb-metadata.json +45 -0
  29. wandb/run-20210716_222651-1lrzcta0/files/wandb-summary.json +1 -0
  30. wandb/run-20210716_222651-1lrzcta0/logs/debug-internal.log +111 -0
  31. wandb/run-20210716_222651-1lrzcta0/logs/debug.log +28 -0
  32. wandb/run-20210716_222651-1lrzcta0/run-1lrzcta0.wandb +0 -0
  33. wandb/run-20210716_223350-8eukt20m/files/config.yaml +308 -0
  34. wandb/run-20210716_223350-8eukt20m/files/output.log +1646 -0
  35. wandb/run-20210716_223350-8eukt20m/files/requirements.txt +95 -0
  36. wandb/run-20210716_223350-8eukt20m/files/wandb-metadata.json +45 -0
  37. wandb/run-20210716_223350-8eukt20m/files/wandb-summary.json +1 -0
  38. wandb/run-20210716_223350-8eukt20m/logs/debug-internal.log +0 -0
  39. wandb/run-20210716_223350-8eukt20m/logs/debug.log +26 -0
  40. wandb/run-20210716_223350-8eukt20m/run-8eukt20m.wandb +0 -0
README.md ADDED
@@ -0,0 +1,64 @@
+---
+language: nl
+datasets:
+- mC4
+- Dutch_news
+---
+
+# Pino (BigBird) base model
+
+Dat Nguyen & Yeb Havinga
+
+BigBird is a sparse-attention-based transformer that extends Transformer-based models, such as BERT, to much longer sequences. Moreover, BigBird comes with a theoretical understanding of which capabilities of a complete transformer the sparse model can retain.
+
+It is pretrained on Dutch using a masked language modeling (MLM) objective. BigBird was introduced in this [paper](https://arxiv.org/abs/2007.14062) and first released in this [repository](https://github.com/google-research/bigbird).
+
+## Model description
+
+BigBird relies on **block sparse attention** instead of normal attention (i.e. BERT's attention) and can handle sequences of up to 4096 tokens at a much lower compute cost than BERT. It has achieved state-of-the-art results on various tasks involving very long sequences, such as long-document summarization and question answering with long contexts.
+
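+To make the compute saving concrete, here is a rough back-of-the-envelope comparison. The block-sparse cost model below is a simplification of the paper's attention pattern (a 3-block sliding window, `num_random_blocks` random blocks, 2 global blocks), not the exact kernel cost:
+
+```python
+seq_len, block_size, num_random_blocks = 4096, 64, 3
+
+# full attention: every query position attends to every key position
+full_scores = seq_len * seq_len
+
+# block sparse attention (rough model): each query block attends to a
+# sliding window of 3 blocks, plus random and global blocks
+num_blocks = seq_len // block_size
+attended_blocks = 3 + num_random_blocks + 2
+sparse_scores = num_blocks * attended_blocks * block_size * block_size
+
+print(full_scores / sparse_scores)  # => 8.0, i.e. ~8x fewer attention scores
+```
+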
+## How to use
+
+Here is how to use this model to get the features of a given text in PyTorch:
+
+```python
+from transformers import BigBirdModel
+
+# by default the model is in `block_sparse` mode with num_random_blocks=3, block_size=64
+model = BigBirdModel.from_pretrained("flax-community/pino-roberta-base")
+
+# you can change `attention_type` to full attention like this:
+model = BigBirdModel.from_pretrained("flax-community/pino-roberta-base", attention_type="original_full")
+
+# you can change `block_size` & `num_random_blocks` like this:
+model = BigBirdModel.from_pretrained("flax-community/pino-roberta-base", block_size=16, num_random_blocks=2)
+```
+
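+For an end-to-end feature-extraction call, a minimal sketch (this assumes the repository ships a compatible fast tokenizer that `AutoTokenizer` can load; the example sentence is arbitrary):
+
+```python
+import torch
+from transformers import AutoTokenizer, BigBirdModel
+
+tokenizer = AutoTokenizer.from_pretrained("flax-community/pino-roberta-base")
+model = BigBirdModel.from_pretrained("flax-community/pino-roberta-base")
+
+# encode a (potentially very long) Dutch text; the model accepts up to 4096 tokens
+inputs = tokenizer("Het is vandaag een mooie dag.", return_tensors="pt")
+with torch.no_grad():
+    outputs = model(**inputs)
+
+# one feature vector per token
+features = outputs.last_hidden_state  # shape: (1, sequence_length, hidden_size)
+```
+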
+## Training Data
+
+This model was pre-trained on publicly available data: the **mC4** dataset and **Dutch news** scraped from NRC and Nu.nl. It uses the fast universal Byte-level BPE (BBPE) tokenizer and the same style of vocabulary as RoBERTa (which is in turn borrowed from GPT-2), rather than a SentencePiece tokenizer.
+
+## Training Procedure
+
+The data was cleaned as follows (a sketch of these filters appears after the list):
+
+- Remove texts containing HTML code, JavaScript code, lorem ipsum, or policy boilerplate
+- Remove lines without an end mark (sentence-final punctuation)
+- Remove texts and words that are too short
+- Remove texts and words that are too long
+- Remove texts containing bad words
+
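+A minimal sketch of what such filters could look like; it is illustrative only, and the thresholds, end marks, and bad-word list below are assumptions rather than the exact values used:
+
+```python
+import re
+
+END_MARKS = (".", "!", "?", '"')  # assumed sentence-final punctuation
+HTML_JS_RE = re.compile(r"<[a-zA-Z/][^>]*>|</?script|function\s*\(", re.IGNORECASE)
+BAD_WORDS = {"..."}  # placeholder; the actual list is not published here
+
+def keep_line(line: str) -> bool:
+    """Return True if a line survives the cleaning rules above (assumed thresholds)."""
+    words = line.split()
+    if HTML_JS_RE.search(line) or "lorem ipsum" in line.lower():
+        return False  # HTML / JavaScript / lorem ipsum / boilerplate
+    if not line.rstrip().endswith(END_MARKS):
+        return False  # no end mark
+    if not (3 <= len(words) <= 1000):
+        return False  # text too short or too long
+    if any(len(w) > 25 for w in words):
+        return False  # implausibly long "words"
+    if any(w.lower().strip(".,!?") in BAD_WORDS for w in words):
+        return False  # bad words
+    return True
+```
+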
+## BibTeX entry and citation info
+
+```tex
+@misc{zaheer2021big,
+  title={Big Bird: Transformers for Longer Sequences},
+  author={Manzil Zaheer and Guru Guruganesh and Avinava Dubey and Joshua Ainslie and Chris Alberti and Santiago Ontanon and Philip Pham and Anirudh Ravula and Qifan Wang and Li Yang and Amr Ahmed},
+  year={2021},
+  eprint={2007.14062},
+  archivePrefix={arXiv},
+  primaryClass={cs.LG}
+}
+```
events.out.tfevents.1626429561.t1v-n-f5c06ea1-w-0.782479.3.v2 CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:968e47ce5036297240debd5f269c8afd988281dc089a30a1cbea0d3083893fc3
-size 15794596
+oid sha256:d746ea6c7a1002bb44f887ef5f4e46ed8344c8164b769381c0b9da5a54fbacdc
+size 15817156
events.out.tfevents.1626474327.t1v-n-f5c06ea1-w-0.794570.3.v2 ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:3c3120bcdd6eb007b39508ccc87d58e9f28da9c4173915e10832c085a9525130
+size 40
events.out.tfevents.1626474410.t1v-n-f5c06ea1-w-0.796231.3.v2 ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:3c3f249799957fa144a13564f964d2d639a475b7bc1e5567aba7616209a57bb8
+size 40
events.out.tfevents.1626474829.t1v-n-f5c06ea1-w-0.798495.3.v2 ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:4a8aebbd12cebb7e110121ed87c8f1b9dcdef4c391cbb90c97f8e37653c5d3d0
+size 1579452
pytorch_model.bin CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:2bb67c5dbe6876a3e485f3c060c6381b0c583763f51c53e14dfed57d103ca218
+oid sha256:b790aa29e6afdb9ee10d9eb4a2b45f5db49b0f6ebfac95e8c220af4c5c68954f
 size 512555623
run.sh CHANGED
@@ -15,14 +15,14 @@ python ./run_mlm_flax_no_accum.py \
     --adam_beta1="0.9" \
     --adam_beta2="0.98" \
    --logging_steps="50" \
-    --eval_steps="6000" \
-    --num_train_epochs="5" \
+    --eval_steps="10000" \
+    --num_train_epochs="4" \
     --preprocessing_num_workers="96" \
     --save_steps="15000" \
     --learning_rate="3e-5" \
     --per_device_train_batch_size="1" \
     --per_device_eval_batch_size="1" \
-    --save_total_limit="20" \
+    --save_total_limit="50" \
     --max_eval_samples="4000" \
     --resume_from_checkpoint="./" \
     #--gradient_accumulation_steps="4" \
run_mlm_flax_no_accum.py CHANGED
@@ -422,9 +422,9 @@ if __name__ == "__main__":
422
  tokenized_datasets = DatasetDict.load_from_disk("/data/tokenized_data")
423
  logger.info("Setting max validation examples to ")
424
  print(f"Number of validation examples {data_args.max_eval_samples}")
425
- tokenized_datasets["train"]= tokenized_datasets["train"].select(range(int(0.35*len(tokenized_datasets["train"]))))
426
  if data_args.max_eval_samples is not None:
427
- tokenized_datasets["validation"] = tokenized_datasets["validation"].select(range(data_args.max_eval_samples))
428
  else:
429
  if training_args.do_train:
430
  column_names = datasets["train"].column_names
@@ -703,6 +703,7 @@ if __name__ == "__main__":
703
  cur_step = epoch * (num_train_samples // train_batch_size) + step
704
  if cur_step == resume_step:
705
  logging.info('Initial compilation completed.')
 
706
  #if cur_step < resume_step:
707
  # continue
708
 
422
  tokenized_datasets = DatasetDict.load_from_disk("/data/tokenized_data")
423
  logger.info("Setting max validation examples to ")
424
  print(f"Number of validation examples {data_args.max_eval_samples}")
425
+ tokenized_datasets["train"]= tokenized_datasets["train"].select(range(int(0.35*len(tokenized_datasets["train"])),int(0.7*len(tokenized_datasets["train"]))))
426
  if data_args.max_eval_samples is not None:
427
+ tokenized_datasets["validation"] = tokenized_datasets["validation"].select(range(data_args.max_eval_samples,2 * data_args.max_eval_samples))
428
  else:
429
  if training_args.do_train:
430
  column_names = datasets["train"].column_names
703
  cur_step = epoch * (num_train_samples // train_batch_size) + step
704
  if cur_step == resume_step:
705
  logging.info('Initial compilation completed.')
706
+ resume_step = 0
707
  #if cur_step < resume_step:
708
  # continue
709
 
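The data-selection change above moves training from the first 35% of the pre-tokenized corpus to the next window (35%–70%) and shifts the evaluation slice accordingly, while `resume_step = 0` stops the resume logic from skipping further steps once the initial compilation check has passed. A minimal sketch of the windowed `select` pattern on a toy `datasets.Dataset` (the toy data here is made up for illustration):

```python
from datasets import Dataset

ds = Dataset.from_dict({"ids": list(range(100))})

# first shard: rows [0, 35), as in the old code
first = ds.select(range(int(0.35 * len(ds))))
# next shard: rows [35, 70), as in the new code
second = ds.select(range(int(0.35 * len(ds)), int(0.7 * len(ds))))

print(first["ids"][0], first["ids"][-1])    # 0 34
print(second["ids"][0], second["ids"][-1])  # 35 69
```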
wandb/debug-internal.log CHANGED
@@ -1 +1 @@
-run-20210716_095921-13hxxunp/logs/debug-internal.log
+run-20210716_223350-8eukt20m/logs/debug-internal.log
wandb/debug.log CHANGED
@@ -1 +1 @@
-run-20210716_095921-13hxxunp/logs/debug.log
+run-20210716_223350-8eukt20m/logs/debug.log
wandb/latest-run CHANGED
@@ -1 +1 @@
-run-20210716_095921-13hxxunp
+run-20210716_223350-8eukt20m
wandb/run-20210716_095921-13hxxunp/files/output.log CHANGED
@@ -12129,3 +12129,20 @@ Training...: 104999it [12:12:55, 2.71it/s]
 [22:19:21] - INFO - absl - Saved checkpoint at checkpoint_330000
 [22:19:22] - INFO - huggingface_hub.repository - git version 2.25.1
 git-lfs/2.9.2 (GitHub; linux amd64; go 1.13.5)
+[22:19:23] - DEBUG - huggingface_hub.repository - [Repository] is a valid git repo
+[22:20:34] - INFO - huggingface_hub.repository - Uploading LFS objects: 100% (3/3), 2.1 GB | 46 MB/s, done.
+
+
+
+Training...: 105049it [12:15:27, 2.77it/s]
+
+
+
+
+Training...: 105099it [12:15:48, 2.91it/s]
+
+
+
+
+Training...: 105149it [12:16:08, 2.81it/s]
+
wandb/run-20210716_095921-13hxxunp/files/wandb-summary.json CHANGED
@@ -1 +1 @@
-{"training_step": 330000, "learning_rate": 2.4526265406166203e-05, "train_loss": 2.006321430206299, "_runtime": 44391, "_timestamp": 1626473953, "_step": 2117, "eval_step": 330000, "eval_accuracy": 0.6395835280418396, "eval_loss": 1.8301599025726318}
+{"training_step": 330150, "learning_rate": 2.45236988121178e-05, "train_loss": 1.8688039779663086, "_runtime": 44533, "_timestamp": 1626474095, "_step": 2120, "eval_step": 330000, "eval_accuracy": 0.6395835280418396, "eval_loss": 1.8301599025726318}
wandb/run-20210716_095921-13hxxunp/logs/debug-internal.log CHANGED
@@ -26741,3 +26741,67 @@
 2021-07-16 22:19:21,321 DEBUG SenderThread:783720 [sender.py:send_request():193] send_request: stop_status
 2021-07-16 22:19:24,036 INFO Thread-8 :783720 [dir_watcher.py:_on_file_modified():229] file/dir modified: /home/dat/pino-roberta-base/wandb/run-20210716_095921-13hxxunp/files/output.log
 2021-07-16 22:19:26,037 INFO Thread-8 :783720 [dir_watcher.py:_on_file_modified():229] file/dir modified: /home/dat/pino-roberta-base/wandb/run-20210716_095921-13hxxunp/files/output.log
+2021-07-16 22:19:36,453 DEBUG HandlerThread:783720 [handler.py:handle_request():124] handle_request: stop_status
+2021-07-16 22:19:36,565 DEBUG SenderThread:783720 [sender.py:send_request():193] send_request: stop_status
+2021-07-16 22:19:45,717 DEBUG SenderThread:783720 [sender.py:send():179] send: stats
+2021-07-16 22:19:51,701 DEBUG HandlerThread:783720 [handler.py:handle_request():124] handle_request: stop_status
+2021-07-16 22:19:51,702 DEBUG SenderThread:783720 [sender.py:send_request():193] send_request: stop_status
+2021-07-16 22:20:06,833 DEBUG HandlerThread:783720 [handler.py:handle_request():124] handle_request: stop_status
+2021-07-16 22:20:06,833 DEBUG SenderThread:783720 [sender.py:send_request():193] send_request: stop_status
+2021-07-16 22:20:15,796 DEBUG SenderThread:783720 [sender.py:send():179] send: stats
+2021-07-16 22:20:21,963 DEBUG HandlerThread:783720 [handler.py:handle_request():124] handle_request: stop_status
+2021-07-16 22:20:21,964 DEBUG SenderThread:783720 [sender.py:send_request():193] send_request: stop_status
+2021-07-16 22:20:36,065 INFO Thread-8 :783720 [dir_watcher.py:_on_file_modified():229] file/dir modified: /home/dat/pino-roberta-base/wandb/run-20210716_095921-13hxxunp/files/output.log
+2021-07-16 22:20:37,107 DEBUG HandlerThread:783720 [handler.py:handle_request():124] handle_request: stop_status
+2021-07-16 22:20:37,107 DEBUG SenderThread:783720 [sender.py:send_request():193] send_request: stop_status
+2021-07-16 22:20:38,066 INFO Thread-8 :783720 [dir_watcher.py:_on_file_modified():229] file/dir modified: /home/dat/pino-roberta-base/wandb/run-20210716_095921-13hxxunp/files/output.log
+2021-07-16 22:20:40,067 INFO Thread-8 :783720 [dir_watcher.py:_on_file_modified():229] file/dir modified: /home/dat/pino-roberta-base/wandb/run-20210716_095921-13hxxunp/files/output.log
+2021-07-16 22:20:42,068 INFO Thread-8 :783720 [dir_watcher.py:_on_file_modified():229] file/dir modified: /home/dat/pino-roberta-base/wandb/run-20210716_095921-13hxxunp/files/output.log
+2021-07-16 22:20:44,068 INFO Thread-8 :783720 [dir_watcher.py:_on_file_modified():229] file/dir modified: /home/dat/pino-roberta-base/wandb/run-20210716_095921-13hxxunp/files/output.log
+2021-07-16 22:20:45,875 DEBUG SenderThread:783720 [sender.py:send():179] send: stats
+2021-07-16 22:20:52,269 DEBUG HandlerThread:783720 [handler.py:handle_request():124] handle_request: stop_status
+2021-07-16 22:20:52,269 DEBUG SenderThread:783720 [sender.py:send_request():193] send_request: stop_status
+2021-07-16 22:20:55,801 DEBUG SenderThread:783720 [sender.py:send():179] send: history
+2021-07-16 22:20:55,802 DEBUG SenderThread:783720 [sender.py:send():179] send: summary
+2021-07-16 22:20:55,802 INFO SenderThread:783720 [sender.py:_save_file():841] saving file wandb-summary.json with policy end
+2021-07-16 22:20:56,074 INFO Thread-8 :783720 [dir_watcher.py:_on_file_modified():229] file/dir modified: /home/dat/pino-roberta-base/wandb/run-20210716_095921-13hxxunp/files/wandb-summary.json
+2021-07-16 22:20:58,075 INFO Thread-8 :783720 [dir_watcher.py:_on_file_modified():229] file/dir modified: /home/dat/pino-roberta-base/wandb/run-20210716_095921-13hxxunp/files/output.log
+2021-07-16 22:21:00,076 INFO Thread-8 :783720 [dir_watcher.py:_on_file_modified():229] file/dir modified: /home/dat/pino-roberta-base/wandb/run-20210716_095921-13hxxunp/files/output.log
+2021-07-16 22:21:02,076 INFO Thread-8 :783720 [dir_watcher.py:_on_file_modified():229] file/dir modified: /home/dat/pino-roberta-base/wandb/run-20210716_095921-13hxxunp/files/output.log
+2021-07-16 22:21:04,077 INFO Thread-8 :783720 [dir_watcher.py:_on_file_modified():229] file/dir modified: /home/dat/pino-roberta-base/wandb/run-20210716_095921-13hxxunp/files/output.log
+2021-07-16 22:21:07,399 DEBUG HandlerThread:783720 [handler.py:handle_request():124] handle_request: stop_status
+2021-07-16 22:21:07,400 DEBUG SenderThread:783720 [sender.py:send_request():193] send_request: stop_status
+2021-07-16 22:21:15,889 DEBUG SenderThread:783720 [sender.py:send():179] send: history
+2021-07-16 22:21:15,889 DEBUG SenderThread:783720 [sender.py:send():179] send: summary
+2021-07-16 22:21:15,892 INFO SenderThread:783720 [sender.py:_save_file():841] saving file wandb-summary.json with policy end
+2021-07-16 22:21:15,956 DEBUG SenderThread:783720 [sender.py:send():179] send: stats
+2021-07-16 22:21:16,083 INFO Thread-8 :783720 [dir_watcher.py:_on_file_modified():229] file/dir modified: /home/dat/pino-roberta-base/wandb/run-20210716_095921-13hxxunp/files/wandb-summary.json
+2021-07-16 22:21:18,084 INFO Thread-8 :783720 [dir_watcher.py:_on_file_modified():229] file/dir modified: /home/dat/pino-roberta-base/wandb/run-20210716_095921-13hxxunp/files/output.log
+2021-07-16 22:21:20,084 INFO Thread-8 :783720 [dir_watcher.py:_on_file_modified():229] file/dir modified: /home/dat/pino-roberta-base/wandb/run-20210716_095921-13hxxunp/files/output.log
+2021-07-16 22:21:22,085 INFO Thread-8 :783720 [dir_watcher.py:_on_file_modified():229] file/dir modified: /home/dat/pino-roberta-base/wandb/run-20210716_095921-13hxxunp/files/output.log
+2021-07-16 22:21:22,531 DEBUG HandlerThread:783720 [handler.py:handle_request():124] handle_request: stop_status
+2021-07-16 22:21:22,531 DEBUG SenderThread:783720 [sender.py:send_request():193] send_request: stop_status
+2021-07-16 22:21:24,086 INFO Thread-8 :783720 [dir_watcher.py:_on_file_modified():229] file/dir modified: /home/dat/pino-roberta-base/wandb/run-20210716_095921-13hxxunp/files/output.log
+2021-07-16 22:21:35,930 DEBUG SenderThread:783720 [sender.py:send():179] send: history
+2021-07-16 22:21:35,931 DEBUG SenderThread:783720 [sender.py:send():179] send: summary
+2021-07-16 22:21:35,931 INFO SenderThread:783720 [sender.py:_save_file():841] saving file wandb-summary.json with policy end
+2021-07-16 22:21:36,091 INFO Thread-8 :783720 [dir_watcher.py:_on_file_modified():229] file/dir modified: /home/dat/pino-roberta-base/wandb/run-20210716_095921-13hxxunp/files/wandb-summary.json
+2021-07-16 22:21:37,691 DEBUG HandlerThread:783720 [handler.py:handle_request():124] handle_request: stop_status
+2021-07-16 22:21:37,691 DEBUG SenderThread:783720 [sender.py:send_request():193] send_request: stop_status
+2021-07-16 22:21:38,092 INFO Thread-8 :783720 [dir_watcher.py:_on_file_modified():229] file/dir modified: /home/dat/pino-roberta-base/wandb/run-20210716_095921-13hxxunp/files/output.log
+2021-07-16 22:21:39,177 WARNING MainThread:783720 [internal.py:wandb_internal():147] Internal process interrupt: 1
+2021-07-16 22:21:39,673 WARNING MainThread:783720 [internal.py:wandb_internal():147] Internal process interrupt: 2
+2021-07-16 22:21:39,674 ERROR MainThread:783720 [internal.py:wandb_internal():150] Internal process interrupted.
+2021-07-16 22:21:39,833 INFO SenderThread:783720 [sender.py:finish():945] shutting down sender
+2021-07-16 22:21:39,833 INFO WriterThread:783720 [datastore.py:close():288] close: /home/dat/pino-roberta-base/wandb/run-20210716_095921-13hxxunp/run-13hxxunp.wandb
+2021-07-16 22:21:39,833 INFO SenderThread:783720 [dir_watcher.py:finish():282] shutting down directory watcher
+2021-07-16 22:21:39,834 INFO HandlerThread:783720 [handler.py:finish():638] shutting down handler
+2021-07-16 22:21:40,093 INFO SenderThread:783720 [dir_watcher.py:finish():312] scan: /home/dat/pino-roberta-base/wandb/run-20210716_095921-13hxxunp/files
+2021-07-16 22:21:40,093 INFO SenderThread:783720 [dir_watcher.py:finish():318] scan save: /home/dat/pino-roberta-base/wandb/run-20210716_095921-13hxxunp/files/requirements.txt requirements.txt
+2021-07-16 22:21:40,093 INFO SenderThread:783720 [dir_watcher.py:finish():318] scan save: /home/dat/pino-roberta-base/wandb/run-20210716_095921-13hxxunp/files/output.log output.log
+2021-07-16 22:21:40,094 INFO SenderThread:783720 [dir_watcher.py:finish():318] scan save: /home/dat/pino-roberta-base/wandb/run-20210716_095921-13hxxunp/files/wandb-metadata.json wandb-metadata.json
+2021-07-16 22:21:40,098 INFO SenderThread:783720 [dir_watcher.py:finish():318] scan save: /home/dat/pino-roberta-base/wandb/run-20210716_095921-13hxxunp/files/config.yaml config.yaml
+2021-07-16 22:21:40,098 INFO SenderThread:783720 [dir_watcher.py:finish():318] scan save: /home/dat/pino-roberta-base/wandb/run-20210716_095921-13hxxunp/files/wandb-summary.json wandb-summary.json
+2021-07-16 22:21:40,098 INFO SenderThread:783720 [file_pusher.py:finish():177] shutting down file pusher
+2021-07-16 22:21:40,098 INFO SenderThread:783720 [file_pusher.py:join():182] waiting for file pusher
+2021-07-16 22:21:40,111 INFO MainThread:783720 [internal.py:handle_exit():78] Internal process exited
wandb/run-20210716_095921-13hxxunp/logs/debug.log CHANGED
@@ -24,3 +24,5 @@ config: {}
 2021-07-16 09:59:24,061 INFO MainThread:782479 [wandb_run.py:_config_callback():872] config_cb None None {'output_dir': './', 'overwrite_output_dir': True, 'do_train': False, 'do_eval': False, 'do_predict': False, 'evaluation_strategy': 'IntervalStrategy.NO', 'prediction_loss_only': False, 'per_device_train_batch_size': 1, 'per_device_eval_batch_size': 1, 'per_gpu_train_batch_size': None, 'per_gpu_eval_batch_size': None, 'gradient_accumulation_steps': 1, 'eval_accumulation_steps': None, 'learning_rate': 3e-05, 'weight_decay': 0.0095, 'adam_beta1': 0.9, 'adam_beta2': 0.98, 'adam_epsilon': 1e-08, 'max_grad_norm': 1.0, 'num_train_epochs': 5.0, 'max_steps': -1, 'lr_scheduler_type': 'SchedulerType.LINEAR', 'warmup_ratio': 0.0, 'warmup_steps': 10000, 'log_level': -1, 'log_level_replica': -1, 'log_on_each_node': True, 'logging_dir': './runs/Jul16_09-59-13_t1v-n-f5c06ea1-w-0', 'logging_strategy': 'IntervalStrategy.STEPS', 'logging_first_step': False, 'logging_steps': 50, 'save_strategy': 'IntervalStrategy.STEPS', 'save_steps': 15000, 'save_total_limit': 20, 'save_on_each_node': False, 'no_cuda': False, 'seed': 42, 'fp16': False, 'fp16_opt_level': 'O1', 'fp16_backend': 'auto', 'fp16_full_eval': False, 'local_rank': -1, 'tpu_num_cores': None, 'tpu_metrics_debug': False, 'debug': [], 'dataloader_drop_last': False, 'eval_steps': 6000, 'dataloader_num_workers': 0, 'past_index': -1, 'run_name': './', 'disable_tqdm': False, 'remove_unused_columns': True, 'label_names': None, 'load_best_model_at_end': False, 'metric_for_best_model': None, 'greater_is_better': None, 'ignore_data_skip': False, 'sharded_ddp': [], 'deepspeed': None, 'label_smoothing_factor': 0.0, 'adafactor': False, 'group_by_length': False, 'length_column_name': 'length', 'report_to': ['tensorboard', 'wandb'], 'ddp_find_unused_parameters': None, 'dataloader_pin_memory': True, 'skip_memory_metrics': True, 'use_legacy_prediction_loop': False, 'push_to_hub': True, 'resume_from_checkpoint': './', 'push_to_hub_model_id': '', 'push_to_hub_organization': None, 'push_to_hub_token': None, 'mp_parameters': '', '_n_gpu': 0, '__cached__setup_devices': 'cpu'}
 2021-07-16 09:59:24,063 INFO MainThread:782479 [wandb_run.py:_config_callback():872] config_cb None None {'model_name_or_path': None, 'model_type': 'big_bird', 'config_name': './', 'tokenizer_name': './', 'cache_dir': None, 'use_fast_tokenizer': True, 'dtype': 'float32'}
 2021-07-16 09:59:24,065 INFO MainThread:782479 [wandb_run.py:_config_callback():872] config_cb None None {'dataset_name': None, 'dataset_config_name': None, 'train_ref_file': None, 'validation_ref_file': None, 'overwrite_cache': False, 'validation_split_percentage': 5, 'max_seq_length': 4096, 'preprocessing_num_workers': 96, 'mlm_probability': 0.15, 'pad_to_max_length': False, 'line_by_line': False, 'max_eval_samples': 4000}
+2021-07-16 22:21:39,193 INFO MainThread:782479 [wandb_run.py:_atexit_cleanup():1593] got exitcode: 255
+2021-07-16 22:21:39,193 INFO MainThread:782479 [wandb_run.py:_restore():1565] restore
wandb/run-20210716_095921-13hxxunp/run-13hxxunp.wandb CHANGED
Binary files a/wandb/run-20210716_095921-13hxxunp/run-13hxxunp.wandb and b/wandb/run-20210716_095921-13hxxunp/run-13hxxunp.wandb differ
wandb/run-20210716_222528-3qk3dij4/files/config.yaml ADDED
@@ -0,0 +1,308 @@
+wandb_version: 1
+
+__cached__setup_devices:
+  desc: null
+  value: cpu
+_n_gpu:
+  desc: null
+  value: 0
+_wandb:
+  desc: null
+  value:
+    cli_version: 0.10.33
+    framework: huggingface
+    huggingface_version: 4.9.0.dev0
+    is_jupyter_run: false
+    is_kaggle_kernel: false
+    python_version: 3.8.10
+    t:
+      1:
+      - 1
+      - 3
+      - 11
+      4: 3.8.10
+      5: 0.10.33
+      6: 4.9.0.dev0
+      8:
+      - 5
+adafactor:
+  desc: null
+  value: false
+adam_beta1:
+  desc: null
+  value: 0.9
+adam_beta2:
+  desc: null
+  value: 0.98
+adam_epsilon:
+  desc: null
+  value: 1.0e-08
+cache_dir:
+  desc: null
+  value: null
+config_name:
+  desc: null
+  value: ./
+dataloader_drop_last:
+  desc: null
+  value: false
+dataloader_num_workers:
+  desc: null
+  value: 0
+dataloader_pin_memory:
+  desc: null
+  value: true
+dataset_config_name:
+  desc: null
+  value: null
+dataset_name:
+  desc: null
+  value: null
+ddp_find_unused_parameters:
+  desc: null
+  value: null
+debug:
+  desc: null
+  value: []
+deepspeed:
+  desc: null
+  value: null
+disable_tqdm:
+  desc: null
+  value: false
+do_eval:
+  desc: null
+  value: false
+do_predict:
+  desc: null
+  value: false
+do_train:
+  desc: null
+  value: false
+dtype:
+  desc: null
+  value: float32
+eval_accumulation_steps:
+  desc: null
+  value: null
+eval_steps:
+  desc: null
+  value: 10000
+evaluation_strategy:
+  desc: null
+  value: IntervalStrategy.NO
+fp16:
+  desc: null
+  value: false
+fp16_backend:
+  desc: null
+  value: auto
+fp16_full_eval:
+  desc: null
+  value: false
+fp16_opt_level:
+  desc: null
+  value: O1
+gradient_accumulation_steps:
+  desc: null
+  value: 1
+greater_is_better:
+  desc: null
+  value: null
+group_by_length:
+  desc: null
+  value: false
+ignore_data_skip:
+  desc: null
+  value: false
+label_names:
+  desc: null
+  value: null
+label_smoothing_factor:
+  desc: null
+  value: 0.0
+learning_rate:
+  desc: null
+  value: 3.0e-05
+length_column_name:
+  desc: null
+  value: length
+line_by_line:
+  desc: null
+  value: false
+load_best_model_at_end:
+  desc: null
+  value: false
+local_rank:
+  desc: null
+  value: -1
+log_level:
+  desc: null
+  value: -1
+log_level_replica:
+  desc: null
+  value: -1
+log_on_each_node:
+  desc: null
+  value: true
+logging_dir:
+  desc: null
+  value: ./runs/Jul16_22-25-20_t1v-n-f5c06ea1-w-0
+logging_first_step:
+  desc: null
+  value: false
+logging_steps:
+  desc: null
+  value: 50
+logging_strategy:
+  desc: null
+  value: IntervalStrategy.STEPS
+lr_scheduler_type:
+  desc: null
+  value: SchedulerType.LINEAR
+max_eval_samples:
+  desc: null
+  value: 4000
+max_grad_norm:
+  desc: null
+  value: 1.0
+max_seq_length:
+  desc: null
+  value: 4096
+max_steps:
+  desc: null
+  value: -1
+metric_for_best_model:
+  desc: null
+  value: null
+mlm_probability:
+  desc: null
+  value: 0.15
+model_name_or_path:
+  desc: null
+  value: null
+model_type:
+  desc: null
+  value: big_bird
+mp_parameters:
+  desc: null
+  value: ''
+no_cuda:
+  desc: null
+  value: false
+num_train_epochs:
+  desc: null
+  value: 5.0
+output_dir:
+  desc: null
+  value: ./
+overwrite_cache:
+  desc: null
+  value: false
+overwrite_output_dir:
+  desc: null
+  value: true
+pad_to_max_length:
+  desc: null
+  value: false
+past_index:
+  desc: null
+  value: -1
+per_device_eval_batch_size:
+  desc: null
+  value: 1
+per_device_train_batch_size:
+  desc: null
+  value: 1
+per_gpu_eval_batch_size:
+  desc: null
+  value: null
+per_gpu_train_batch_size:
+  desc: null
+  value: null
+prediction_loss_only:
+  desc: null
+  value: false
+preprocessing_num_workers:
+  desc: null
+  value: 96
+push_to_hub:
+  desc: null
+  value: true
+push_to_hub_model_id:
+  desc: null
+  value: ''
+push_to_hub_organization:
+  desc: null
+  value: null
+push_to_hub_token:
+  desc: null
+  value: null
+remove_unused_columns:
+  desc: null
+  value: true
+report_to:
+  desc: null
+  value:
+  - tensorboard
+  - wandb
+resume_from_checkpoint:
+  desc: null
+  value: ./
+run_name:
+  desc: null
+  value: ./
+save_on_each_node:
+  desc: null
+  value: false
+save_steps:
+  desc: null
+  value: 15000
+save_strategy:
+  desc: null
+  value: IntervalStrategy.STEPS
+save_total_limit:
+  desc: null
+  value: 50
+seed:
+  desc: null
+  value: 42
+sharded_ddp:
+  desc: null
+  value: []
+skip_memory_metrics:
+  desc: null
+  value: true
+tokenizer_name:
+  desc: null
+  value: ./
+tpu_metrics_debug:
+  desc: null
+  value: false
+tpu_num_cores:
+  desc: null
+  value: null
+train_ref_file:
+  desc: null
+  value: null
+use_fast_tokenizer:
+  desc: null
+  value: true
+use_legacy_prediction_loop:
+  desc: null
+  value: false
+validation_ref_file:
+  desc: null
+  value: null
+validation_split_percentage:
+  desc: null
+  value: 5
+warmup_ratio:
+  desc: null
+  value: 0.0
+warmup_steps:
+  desc: null
+  value: 10000
+weight_decay:
+  desc: null
+  value: 0.0095
wandb/run-20210716_222528-3qk3dij4/files/output.log ADDED
@@ -0,0 +1,6 @@
+[22:25:43] - INFO - absl - Restoring checkpoint from ./checkpoint_330000
+tcmalloc: large alloc 1530273792 bytes == 0x9d9a6000 @ 0x7fd1459cb680 0x7fd1459ec824 0x5b9a14 0x50b2ae 0x50cb1b 0x5a6f17 0x5f3010 0x56fd36 0x568d9a 0x5f5b33 0x56aadf 0x568d9a 0x68cdc7 0x67e161 0x67e1df 0x67e281 0x67e627 0x6b6e62 0x6b71ed 0x7fd1457e00b3 0x5f96de
+/home/dat/pino/lib/python3.8/site-packages/jax/lib/xla_bridge.py:386: UserWarning: jax.host_count has been renamed to jax.process_count. This alias will eventually be removed; please update your code.
+  warnings.warn(
+/home/dat/pino/lib/python3.8/site-packages/jax/lib/xla_bridge.py:373: UserWarning: jax.host_id has been renamed to jax.process_index. This alias will eventually be removed; please update your code.
+  warnings.warn(
wandb/run-20210716_222528-3qk3dij4/files/requirements.txt ADDED
@@ -0,0 +1,95 @@
+absl-py==0.13.0
+aiohttp==3.7.4.post0
+astunparse==1.6.3
+async-timeout==3.0.1
+attrs==21.2.0
+cachetools==4.2.2
+certifi==2021.5.30
+chardet==4.0.0
+charset-normalizer==2.0.1
+chex==0.0.8
+click==8.0.1
+configparser==5.0.2
+cycler==0.10.0
+datasets==1.9.1.dev0
+dill==0.3.4
+dm-tree==0.1.6
+docker-pycreds==0.4.0
+filelock==3.0.12
+flatbuffers==1.12
+flax==0.3.4
+fsspec==2021.7.0
+gast==0.4.0
+gitdb==4.0.7
+gitpython==3.1.18
+google-auth-oauthlib==0.4.4
+google-auth==1.32.1
+google-pasta==0.2.0
+grpcio==1.34.1
+h5py==3.1.0
+huggingface-hub==0.0.12
+idna==3.2
+install==1.3.4
+jax==0.2.17
+jaxlib==0.1.68
+joblib==1.0.1
+keras-nightly==2.5.0.dev2021032900
+keras-preprocessing==1.1.2
+kiwisolver==1.3.1
+libtpu-nightly==0.1.dev20210615
+markdown==3.3.4
+matplotlib==3.4.2
+msgpack==1.0.2
+multidict==5.1.0
+multiprocess==0.70.12.2
+numpy==1.19.5
+oauthlib==3.1.1
+opt-einsum==3.3.0
+optax==0.0.9
+packaging==21.0
+pandas==1.3.0
+pathtools==0.1.2
+pillow==8.3.1
+pip==20.0.2
+pkg-resources==0.0.0
+promise==2.3
+protobuf==3.17.3
+psutil==5.8.0
+pyarrow==4.0.1
+pyasn1-modules==0.2.8
+pyasn1==0.4.8
+pyparsing==2.4.7
+python-dateutil==2.8.1
+pytz==2021.1
+pyyaml==5.4.1
+regex==2021.7.6
+requests-oauthlib==1.3.0
+requests==2.26.0
+rsa==4.7.2
+sacremoses==0.0.45
+scipy==1.7.0
+sentry-sdk==1.3.0
+setuptools==44.0.0
+shortuuid==1.0.1
+six==1.15.0
+smmap==4.0.0
+subprocess32==3.5.4
+tensorboard-data-server==0.6.1
+tensorboard-plugin-wit==1.8.0
+tensorboard==2.5.0
+tensorflow-estimator==2.5.0
+tensorflow==2.5.0
+termcolor==1.1.0
+tokenizers==0.10.3
+toolz==0.11.1
+torch==1.9.0
+tqdm==4.61.2
+transformers==4.9.0.dev0
+typing-extensions==3.7.4.3
+urllib3==1.26.6
+wandb==0.10.33
+werkzeug==2.0.1
+wheel==0.36.2
+wrapt==1.12.1
+xxhash==2.0.2
+yarl==1.6.3
wandb/run-20210716_222528-3qk3dij4/files/wandb-metadata.json ADDED
@@ -0,0 +1,45 @@
+{
+  "os": "Linux-5.4.0-1043-gcp-x86_64-with-glibc2.29",
+  "python": "3.8.10",
+  "heartbeatAt": "2021-07-16T22:25:30.712229",
+  "startedAt": "2021-07-16T22:25:28.616115",
+  "docker": null,
+  "cpu_count": 96,
+  "cuda": null,
+  "args": [
+    "--push_to_hub",
+    "--output_dir=./",
+    "--model_type=big_bird",
+    "--config_name=./",
+    "--tokenizer_name=./",
+    "--max_seq_length=4096",
+    "--weight_decay=0.0095",
+    "--warmup_steps=10000",
+    "--overwrite_output_dir",
+    "--adam_beta1=0.9",
+    "--adam_beta2=0.98",
+    "--logging_steps=50",
+    "--eval_steps=10000",
+    "--num_train_epochs=5",
+    "--preprocessing_num_workers=96",
+    "--save_steps=15000",
+    "--learning_rate=3e-5",
+    "--per_device_train_batch_size=1",
+    "--per_device_eval_batch_size=1",
+    "--save_total_limit=50",
+    "--max_eval_samples=4000",
+    "--resume_from_checkpoint=./"
+  ],
+  "state": "running",
+  "program": "./run_mlm_flax_no_accum.py",
+  "codePath": "run_mlm_flax_no_accum.py",
+  "git": {
+    "remote": "https://huggingface.co/flax-community/pino-roberta-base",
+    "commit": "def9a456105f36b517155343f42ff643df2d20ce"
+  },
+  "email": null,
+  "root": "/home/dat/pino-roberta-base",
+  "host": "t1v-n-f5c06ea1-w-0",
+  "username": "dat",
+  "executable": "/home/dat/pino/bin/python"
+}
wandb/run-20210716_222528-3qk3dij4/files/wandb-summary.json ADDED
@@ -0,0 +1 @@
+{}
wandb/run-20210716_222528-3qk3dij4/logs/debug-internal.log ADDED
@@ -0,0 +1,54 @@
+2021-07-16 22:25:29,348 INFO MainThread:795833 [internal.py:wandb_internal():88] W&B internal server running at pid: 795833, started at: 2021-07-16 22:25:29.348306
+2021-07-16 22:25:29,350 DEBUG HandlerThread:795833 [handler.py:handle_request():124] handle_request: check_version
+2021-07-16 22:25:29,351 INFO WriterThread:795833 [datastore.py:open_for_write():80] open: /home/dat/pino-roberta-base/wandb/run-20210716_222528-3qk3dij4/run-3qk3dij4.wandb
+2021-07-16 22:25:29,352 DEBUG SenderThread:795833 [sender.py:send():179] send: header
+2021-07-16 22:25:29,352 DEBUG SenderThread:795833 [sender.py:send_request():193] send_request: check_version
+2021-07-16 22:25:29,393 DEBUG SenderThread:795833 [sender.py:send():179] send: run
+2021-07-16 22:25:29,571 INFO SenderThread:795833 [dir_watcher.py:__init__():168] watching files in: /home/dat/pino-roberta-base/wandb/run-20210716_222528-3qk3dij4/files
+2021-07-16 22:25:29,571 INFO SenderThread:795833 [sender.py:_start_run_threads():716] run started: 3qk3dij4 with start time 1626474328
+2021-07-16 22:25:29,571 DEBUG SenderThread:795833 [sender.py:send():179] send: summary
+2021-07-16 22:25:29,572 DEBUG HandlerThread:795833 [handler.py:handle_request():124] handle_request: run_start
+2021-07-16 22:25:29,573 INFO SenderThread:795833 [sender.py:_save_file():841] saving file wandb-summary.json with policy end
+2021-07-16 22:25:30,574 INFO Thread-8 :795833 [dir_watcher.py:_on_file_created():216] file/dir created: /home/dat/pino-roberta-base/wandb/run-20210716_222528-3qk3dij4/files/wandb-summary.json
+2021-07-16 22:25:30,711 DEBUG HandlerThread:795833 [meta.py:__init__():39] meta init
+2021-07-16 22:25:30,712 DEBUG HandlerThread:795833 [meta.py:__init__():53] meta init done
+2021-07-16 22:25:30,712 DEBUG HandlerThread:795833 [meta.py:probe():210] probe
+2021-07-16 22:25:30,713 DEBUG HandlerThread:795833 [meta.py:_setup_git():200] setup git
+2021-07-16 22:25:30,743 DEBUG HandlerThread:795833 [meta.py:_setup_git():207] setup git done
+2021-07-16 22:25:30,744 DEBUG HandlerThread:795833 [meta.py:_save_pip():57] save pip
+2021-07-16 22:25:30,744 DEBUG HandlerThread:795833 [meta.py:_save_pip():71] save pip done
+2021-07-16 22:25:30,744 DEBUG HandlerThread:795833 [meta.py:probe():252] probe done
+2021-07-16 22:25:30,748 DEBUG SenderThread:795833 [sender.py:send():179] send: files
+2021-07-16 22:25:30,748 INFO SenderThread:795833 [sender.py:_save_file():841] saving file wandb-metadata.json with policy now
+2021-07-16 22:25:30,756 DEBUG HandlerThread:795833 [handler.py:handle_request():124] handle_request: stop_status
+2021-07-16 22:25:30,756 DEBUG SenderThread:795833 [sender.py:send_request():193] send_request: stop_status
+2021-07-16 22:25:30,883 DEBUG SenderThread:795833 [sender.py:send():179] send: config
+2021-07-16 22:25:30,884 DEBUG SenderThread:795833 [sender.py:send():179] send: config
+2021-07-16 22:25:30,884 DEBUG SenderThread:795833 [sender.py:send():179] send: config
+2021-07-16 22:25:31,207 INFO Thread-11 :795833 [upload_job.py:push():137] Uploaded file /tmp/tmp88lb3201wandb/38nccn88-wandb-metadata.json
+2021-07-16 22:25:31,573 INFO Thread-8 :795833 [dir_watcher.py:_on_file_created():216] file/dir created: /home/dat/pino-roberta-base/wandb/run-20210716_222528-3qk3dij4/files/wandb-metadata.json
+2021-07-16 22:25:31,573 INFO Thread-8 :795833 [dir_watcher.py:_on_file_created():216] file/dir created: /home/dat/pino-roberta-base/wandb/run-20210716_222528-3qk3dij4/files/requirements.txt
+2021-07-16 22:25:31,574 INFO Thread-8 :795833 [dir_watcher.py:_on_file_created():216] file/dir created: /home/dat/pino-roberta-base/wandb/run-20210716_222528-3qk3dij4/files/output.log
+2021-07-16 22:25:45,579 INFO Thread-8 :795833 [dir_watcher.py:_on_file_modified():229] file/dir modified: /home/dat/pino-roberta-base/wandb/run-20210716_222528-3qk3dij4/files/output.log
+2021-07-16 22:25:45,929 DEBUG HandlerThread:795833 [handler.py:handle_request():124] handle_request: stop_status
+2021-07-16 22:25:45,929 DEBUG SenderThread:795833 [sender.py:send_request():193] send_request: stop_status
+2021-07-16 22:25:47,580 INFO Thread-8 :795833 [dir_watcher.py:_on_file_modified():229] file/dir modified: /home/dat/pino-roberta-base/wandb/run-20210716_222528-3qk3dij4/files/output.log
+2021-07-16 22:25:58,797 DEBUG SenderThread:795833 [sender.py:send():179] send: stats
+2021-07-16 22:26:00,585 INFO Thread-8 :795833 [dir_watcher.py:_on_file_modified():229] file/dir modified: /home/dat/pino-roberta-base/wandb/run-20210716_222528-3qk3dij4/files/config.yaml
+2021-07-16 22:26:01,115 DEBUG HandlerThread:795833 [handler.py:handle_request():124] handle_request: stop_status
+2021-07-16 22:26:01,115 DEBUG SenderThread:795833 [sender.py:send_request():193] send_request: stop_status
+2021-07-16 22:26:02,749 WARNING MainThread:795833 [internal.py:wandb_internal():147] Internal process interrupt: 1
+2021-07-16 22:26:02,938 WARNING MainThread:795833 [internal.py:wandb_internal():147] Internal process interrupt: 2
+2021-07-16 22:26:02,938 ERROR MainThread:795833 [internal.py:wandb_internal():150] Internal process interrupted.
+2021-07-16 22:26:03,122 INFO HandlerThread:795833 [handler.py:finish():638] shutting down handler
+2021-07-16 22:26:03,245 INFO SenderThread:795833 [sender.py:finish():945] shutting down sender
+2021-07-16 22:26:03,245 INFO SenderThread:795833 [dir_watcher.py:finish():282] shutting down directory watcher
+2021-07-16 22:26:03,587 INFO SenderThread:795833 [dir_watcher.py:finish():312] scan: /home/dat/pino-roberta-base/wandb/run-20210716_222528-3qk3dij4/files
+2021-07-16 22:26:03,587 INFO SenderThread:795833 [dir_watcher.py:finish():318] scan save: /home/dat/pino-roberta-base/wandb/run-20210716_222528-3qk3dij4/files/requirements.txt requirements.txt
+2021-07-16 22:26:03,587 INFO SenderThread:795833 [dir_watcher.py:finish():318] scan save: /home/dat/pino-roberta-base/wandb/run-20210716_222528-3qk3dij4/files/output.log output.log
+2021-07-16 22:26:03,587 INFO SenderThread:795833 [dir_watcher.py:finish():318] scan save: /home/dat/pino-roberta-base/wandb/run-20210716_222528-3qk3dij4/files/wandb-metadata.json wandb-metadata.json
+2021-07-16 22:26:03,588 INFO SenderThread:795833 [dir_watcher.py:finish():318] scan save: /home/dat/pino-roberta-base/wandb/run-20210716_222528-3qk3dij4/files/config.yaml config.yaml
+2021-07-16 22:26:03,588 INFO SenderThread:795833 [dir_watcher.py:finish():318] scan save: /home/dat/pino-roberta-base/wandb/run-20210716_222528-3qk3dij4/files/wandb-summary.json wandb-summary.json
+2021-07-16 22:26:03,588 INFO SenderThread:795833 [file_pusher.py:finish():177] shutting down file pusher
+2021-07-16 22:26:03,588 INFO SenderThread:795833 [file_pusher.py:join():182] waiting for file pusher
+2021-07-16 22:26:03,685 INFO MainThread:795833 [internal.py:handle_exit():78] Internal process exited
wandb/run-20210716_222528-3qk3dij4/logs/debug.log ADDED
@@ -0,0 +1,28 @@
+2021-07-16 22:25:28,617 INFO MainThread:794570 [wandb_setup.py:_flush():69] setting env: {}
+2021-07-16 22:25:28,617 INFO MainThread:794570 [wandb_setup.py:_flush():69] setting login settings: {}
+2021-07-16 22:25:28,617 INFO MainThread:794570 [wandb_init.py:_log_setup():337] Logging user logs to /home/dat/pino-roberta-base/wandb/run-20210716_222528-3qk3dij4/logs/debug.log
+2021-07-16 22:25:28,617 INFO MainThread:794570 [wandb_init.py:_log_setup():338] Logging internal logs to /home/dat/pino-roberta-base/wandb/run-20210716_222528-3qk3dij4/logs/debug-internal.log
+2021-07-16 22:25:28,618 INFO MainThread:794570 [wandb_init.py:init():370] calling init triggers
+2021-07-16 22:25:28,618 INFO MainThread:794570 [wandb_init.py:init():375] wandb.init called with sweep_config: {}
+config: {}
+2021-07-16 22:25:28,618 INFO MainThread:794570 [wandb_init.py:init():419] starting backend
+2021-07-16 22:25:28,618 INFO MainThread:794570 [backend.py:_multiprocessing_setup():70] multiprocessing start_methods=fork,spawn,forkserver, using: spawn
+2021-07-16 22:25:28,669 INFO MainThread:794570 [backend.py:ensure_launched():135] starting backend process...
+2021-07-16 22:25:28,721 INFO MainThread:794570 [backend.py:ensure_launched():139] started backend process with pid: 795833
+2021-07-16 22:25:28,723 INFO MainThread:794570 [wandb_init.py:init():424] backend started and connected
+2021-07-16 22:25:28,727 INFO MainThread:794570 [wandb_init.py:init():472] updated telemetry
+2021-07-16 22:25:28,728 INFO MainThread:794570 [wandb_init.py:init():491] communicating current version
+2021-07-16 22:25:29,392 INFO MainThread:794570 [wandb_init.py:init():496] got version response upgrade_message: "wandb version 0.11.0 is available!  To upgrade, please run:\n $ pip install wandb --upgrade"
+
+2021-07-16 22:25:29,392 INFO MainThread:794570 [wandb_init.py:init():504] communicating run to backend with 30 second timeout
+2021-07-16 22:25:29,571 INFO MainThread:794570 [wandb_init.py:init():529] starting run threads in backend
+2021-07-16 22:25:30,752 INFO MainThread:794570 [wandb_run.py:_console_start():1623] atexit reg
+2021-07-16 22:25:30,752 INFO MainThread:794570 [wandb_run.py:_redirect():1497] redirect: SettingsConsole.REDIRECT
+2021-07-16 22:25:30,753 INFO MainThread:794570 [wandb_run.py:_redirect():1502] Redirecting console.
+2021-07-16 22:25:30,755 INFO MainThread:794570 [wandb_run.py:_redirect():1558] Redirects installed.
+2021-07-16 22:25:30,755 INFO MainThread:794570 [wandb_init.py:init():554] run started, returning control to user process
+2021-07-16 22:25:30,761 INFO MainThread:794570 [wandb_run.py:_config_callback():872] config_cb None None {'output_dir': './', 'overwrite_output_dir': True, 'do_train': False, 'do_eval': False, 'do_predict': False, 'evaluation_strategy': 'IntervalStrategy.NO', 'prediction_loss_only': False, 'per_device_train_batch_size': 1, 'per_device_eval_batch_size': 1, 'per_gpu_train_batch_size': None, 'per_gpu_eval_batch_size': None, 'gradient_accumulation_steps': 1, 'eval_accumulation_steps': None, 'learning_rate': 3e-05, 'weight_decay': 0.0095, 'adam_beta1': 0.9, 'adam_beta2': 0.98, 'adam_epsilon': 1e-08, 'max_grad_norm': 1.0, 'num_train_epochs': 5.0, 'max_steps': -1, 'lr_scheduler_type': 'SchedulerType.LINEAR', 'warmup_ratio': 0.0, 'warmup_steps': 10000, 'log_level': -1, 'log_level_replica': -1, 'log_on_each_node': True, 'logging_dir': './runs/Jul16_22-25-20_t1v-n-f5c06ea1-w-0', 'logging_strategy': 'IntervalStrategy.STEPS', 'logging_first_step': False, 'logging_steps': 50, 'save_strategy': 'IntervalStrategy.STEPS', 'save_steps': 15000, 'save_total_limit': 50, 'save_on_each_node': False, 'no_cuda': False, 'seed': 42, 'fp16': False, 'fp16_opt_level': 'O1', 'fp16_backend': 'auto', 'fp16_full_eval': False, 'local_rank': -1, 'tpu_num_cores': None, 'tpu_metrics_debug': False, 'debug': [], 'dataloader_drop_last': False, 'eval_steps': 10000, 'dataloader_num_workers': 0, 'past_index': -1, 'run_name': './', 'disable_tqdm': False, 'remove_unused_columns': True, 'label_names': None, 'load_best_model_at_end': False, 'metric_for_best_model': None, 'greater_is_better': None, 'ignore_data_skip': False, 'sharded_ddp': [], 'deepspeed': None, 'label_smoothing_factor': 0.0, 'adafactor': False, 'group_by_length': False, 'length_column_name': 'length', 'report_to': ['tensorboard', 'wandb'], 'ddp_find_unused_parameters': None, 'dataloader_pin_memory': True, 'skip_memory_metrics': True, 'use_legacy_prediction_loop': False, 'push_to_hub': True, 'resume_from_checkpoint': './', 'push_to_hub_model_id': '', 'push_to_hub_organization': None, 'push_to_hub_token': None, 'mp_parameters': '', '_n_gpu': 0, '__cached__setup_devices': 'cpu'}
+2021-07-16 22:25:30,763 INFO MainThread:794570 [wandb_run.py:_config_callback():872] config_cb None None {'model_name_or_path': None, 'model_type': 'big_bird', 'config_name': './', 'tokenizer_name': './', 'cache_dir': None, 'use_fast_tokenizer': True, 'dtype': 'float32'}
+2021-07-16 22:25:30,764 INFO MainThread:794570 [wandb_run.py:_config_callback():872] config_cb None None {'dataset_name': None, 'dataset_config_name': None, 'train_ref_file': None, 'validation_ref_file': None, 'overwrite_cache': False, 'validation_split_percentage': 5, 'max_seq_length': 4096, 'preprocessing_num_workers': 96, 'mlm_probability': 0.15, 'pad_to_max_length': False, 'line_by_line': False, 'max_eval_samples': 4000}
+2021-07-16 22:26:02,781 INFO MainThread:794570 [wandb_run.py:_atexit_cleanup():1593] got exitcode: 255
+2021-07-16 22:26:02,781 INFO MainThread:794570 [wandb_run.py:_restore():1565] restore
wandb/run-20210716_222528-3qk3dij4/run-3qk3dij4.wandb ADDED
File without changes
wandb/run-20210716_222651-1lrzcta0/files/config.yaml ADDED
@@ -0,0 +1,308 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
+ wandb_version: 1
+
+ __cached__setup_devices:
+   desc: null
+   value: cpu
+ _n_gpu:
+   desc: null
+   value: 0
+ _wandb:
+   desc: null
+   value:
+     cli_version: 0.10.33
+     framework: huggingface
+     huggingface_version: 4.9.0.dev0
+     is_jupyter_run: false
+     is_kaggle_kernel: false
+     python_version: 3.8.10
+     t:
+       1:
+       - 1
+       - 3
+       - 11
+       4: 3.8.10
+       5: 0.10.33
+       6: 4.9.0.dev0
+       8:
+       - 5
+ adafactor:
+   desc: null
+   value: false
+ adam_beta1:
+   desc: null
+   value: 0.9
+ adam_beta2:
+   desc: null
+   value: 0.98
+ adam_epsilon:
+   desc: null
+   value: 1.0e-08
+ cache_dir:
+   desc: null
+   value: null
+ config_name:
+   desc: null
+   value: ./
+ dataloader_drop_last:
+   desc: null
+   value: false
+ dataloader_num_workers:
+   desc: null
+   value: 0
+ dataloader_pin_memory:
+   desc: null
+   value: true
+ dataset_config_name:
+   desc: null
+   value: null
+ dataset_name:
+   desc: null
+   value: null
+ ddp_find_unused_parameters:
+   desc: null
+   value: null
+ debug:
+   desc: null
+   value: []
+ deepspeed:
+   desc: null
+   value: null
+ disable_tqdm:
+   desc: null
+   value: false
+ do_eval:
+   desc: null
+   value: false
+ do_predict:
+   desc: null
+   value: false
+ do_train:
+   desc: null
+   value: false
+ dtype:
+   desc: null
+   value: float32
+ eval_accumulation_steps:
+   desc: null
+   value: null
+ eval_steps:
+   desc: null
+   value: 10000
+ evaluation_strategy:
+   desc: null
+   value: IntervalStrategy.NO
+ fp16:
+   desc: null
+   value: false
+ fp16_backend:
+   desc: null
+   value: auto
+ fp16_full_eval:
+   desc: null
+   value: false
+ fp16_opt_level:
+   desc: null
+   value: O1
+ gradient_accumulation_steps:
+   desc: null
+   value: 1
+ greater_is_better:
+   desc: null
+   value: null
+ group_by_length:
+   desc: null
+   value: false
+ ignore_data_skip:
+   desc: null
+   value: false
+ label_names:
+   desc: null
+   value: null
+ label_smoothing_factor:
+   desc: null
+   value: 0.0
+ learning_rate:
+   desc: null
+   value: 3.0e-05
+ length_column_name:
+   desc: null
+   value: length
+ line_by_line:
+   desc: null
+   value: false
+ load_best_model_at_end:
+   desc: null
+   value: false
+ local_rank:
+   desc: null
+   value: -1
+ log_level:
+   desc: null
+   value: -1
+ log_level_replica:
+   desc: null
+   value: -1
+ log_on_each_node:
+   desc: null
+   value: true
+ logging_dir:
+   desc: null
+   value: ./runs/Jul16_22-26-42_t1v-n-f5c06ea1-w-0
+ logging_first_step:
+   desc: null
+   value: false
+ logging_steps:
+   desc: null
+   value: 50
+ logging_strategy:
+   desc: null
+   value: IntervalStrategy.STEPS
+ lr_scheduler_type:
+   desc: null
+   value: SchedulerType.LINEAR
+ max_eval_samples:
+   desc: null
+   value: 4000
+ max_grad_norm:
+   desc: null
+   value: 1.0
+ max_seq_length:
+   desc: null
+   value: 4096
+ max_steps:
+   desc: null
+   value: -1
+ metric_for_best_model:
+   desc: null
+   value: null
+ mlm_probability:
+   desc: null
+   value: 0.15
+ model_name_or_path:
+   desc: null
+   value: null
+ model_type:
+   desc: null
+   value: big_bird
+ mp_parameters:
+   desc: null
+   value: ''
+ no_cuda:
+   desc: null
+   value: false
+ num_train_epochs:
+   desc: null
+   value: 4.0
+ output_dir:
+   desc: null
+   value: ./
+ overwrite_cache:
+   desc: null
+   value: false
+ overwrite_output_dir:
+   desc: null
+   value: true
+ pad_to_max_length:
+   desc: null
+   value: false
+ past_index:
+   desc: null
+   value: -1
+ per_device_eval_batch_size:
+   desc: null
+   value: 1
+ per_device_train_batch_size:
+   desc: null
+   value: 1
+ per_gpu_eval_batch_size:
+   desc: null
+   value: null
+ per_gpu_train_batch_size:
+   desc: null
+   value: null
+ prediction_loss_only:
+   desc: null
+   value: false
+ preprocessing_num_workers:
+   desc: null
+   value: 96
+ push_to_hub:
+   desc: null
+   value: true
+ push_to_hub_model_id:
+   desc: null
+   value: ''
+ push_to_hub_organization:
+   desc: null
+   value: null
+ push_to_hub_token:
+   desc: null
+   value: null
+ remove_unused_columns:
+   desc: null
+   value: true
+ report_to:
+   desc: null
+   value:
+   - tensorboard
+   - wandb
+ resume_from_checkpoint:
+   desc: null
+   value: ./
+ run_name:
+   desc: null
+   value: ./
+ save_on_each_node:
+   desc: null
+   value: false
+ save_steps:
+   desc: null
+   value: 15000
+ save_strategy:
+   desc: null
+   value: IntervalStrategy.STEPS
+ save_total_limit:
+   desc: null
+   value: 50
+ seed:
+   desc: null
+   value: 42
+ sharded_ddp:
+   desc: null
+   value: []
+ skip_memory_metrics:
+   desc: null
+   value: true
+ tokenizer_name:
+   desc: null
+   value: ./
+ tpu_metrics_debug:
+   desc: null
+   value: false
+ tpu_num_cores:
+   desc: null
+   value: null
+ train_ref_file:
+   desc: null
+   value: null
+ use_fast_tokenizer:
+   desc: null
+   value: true
+ use_legacy_prediction_loop:
+   desc: null
+   value: false
+ validation_ref_file:
+   desc: null
+   value: null
+ validation_split_percentage:
+   desc: null
+   value: 5
+ warmup_ratio:
+   desc: null
+   value: 0.0
+ warmup_steps:
+   desc: null
+   value: 10000
+ weight_decay:
+   desc: null
+   value: 0.0095
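
The optimizer recorded in this config (Adam with b1=0.9, b2=0.98, eps=1e-8, weight decay 0.0095, peak learning rate 3e-5, 10000 linear warmup steps, linear scheduler) matches the standard optax setup used by the Flax MLM example scripts. A minimal sketch of that schedule and optimizer, assuming an illustrative `total_train_steps` (the real value is derived from the dataset size and `num_train_epochs=4`):

```python
# Minimal sketch of the optimizer implied by the config above.
# Assumption: total_train_steps is illustrative, not taken from this run.
import optax

learning_rate = 3e-5
warmup_steps = 10_000
total_train_steps = 340_000  # hypothetical

# Linear warmup from 0 to the peak LR, then linear decay back to 0.
warmup_fn = optax.linear_schedule(
    init_value=0.0, end_value=learning_rate, transition_steps=warmup_steps)
decay_fn = optax.linear_schedule(
    init_value=learning_rate, end_value=0.0,
    transition_steps=total_train_steps - warmup_steps)
schedule_fn = optax.join_schedules(
    schedules=[warmup_fn, decay_fn], boundaries=[warmup_steps])

optimizer = optax.adamw(
    learning_rate=schedule_fn,
    b1=0.9, b2=0.98, eps=1e-8,
    weight_decay=0.0095)
```

Passing the schedule function itself as `learning_rate` lets optax recompute the rate at every step, which is what produces the smoothly decaying values seen later in the training logs (e.g. 2.29e-05 at step 340000).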
wandb/run-20210716_222651-1lrzcta0/files/output.log ADDED
@@ -0,0 +1,8 @@
+
+ [22:27:05] - INFO - absl - Restoring checkpoint from ./checkpoint_330000
+ tcmalloc: large alloc 1530273792 bytes == 0x9c650000 @ 0x7f9d1bb89680 0x7f9d1bbaa824 0x5b9a14 0x50b2ae 0x50cb1b 0x5a6f17 0x5f3010 0x56fd36 0x568d9a 0x5f5b33 0x56aadf 0x568d9a 0x68cdc7 0x67e161 0x67e1df 0x67e281 0x67e627 0x6b6e62 0x6b71ed 0x7f9d1b99e0b3 0x5f96de
+ /home/dat/pino/lib/python3.8/site-packages/jax/lib/xla_bridge.py:386: UserWarning: jax.host_count has been renamed to jax.process_count. This alias will eventually be removed; please update your code.
+   warnings.warn(
+ /home/dat/pino/lib/python3.8/site-packages/jax/lib/xla_bridge.py:373: UserWarning: jax.host_id has been renamed to jax.process_index. This alias will eventually be removed; please update your code.
+   warnings.warn(
+ Epoch ... (1/4): 0%| | 0/4 [00:00<?, ?it/s]
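
The two UserWarnings in this log come from JAX's multi-host API rename: `jax.host_count` and `jax.host_id` still work as aliases in the pinned jax==0.2.17, but the current names are `jax.process_count` and `jax.process_index`. A minimal sketch of the updated calls:

```python
import jax

# Deprecated aliases that trigger the warnings in the log above:
#   jax.host_count(), jax.host_id()
# Current names:
num_processes = jax.process_count()  # number of participating hosts
process_id = jax.process_index()     # this host's index, in [0, num_processes)
```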
wandb/run-20210716_222651-1lrzcta0/files/requirements.txt ADDED
@@ -0,0 +1,95 @@
+ absl-py==0.13.0
+ aiohttp==3.7.4.post0
+ astunparse==1.6.3
+ async-timeout==3.0.1
+ attrs==21.2.0
+ cachetools==4.2.2
+ certifi==2021.5.30
+ chardet==4.0.0
+ charset-normalizer==2.0.1
+ chex==0.0.8
+ click==8.0.1
+ configparser==5.0.2
+ cycler==0.10.0
+ datasets==1.9.1.dev0
+ dill==0.3.4
+ dm-tree==0.1.6
+ docker-pycreds==0.4.0
+ filelock==3.0.12
+ flatbuffers==1.12
+ flax==0.3.4
+ fsspec==2021.7.0
+ gast==0.4.0
+ gitdb==4.0.7
+ gitpython==3.1.18
+ google-auth-oauthlib==0.4.4
+ google-auth==1.32.1
+ google-pasta==0.2.0
+ grpcio==1.34.1
+ h5py==3.1.0
+ huggingface-hub==0.0.12
+ idna==3.2
+ install==1.3.4
+ jax==0.2.17
+ jaxlib==0.1.68
+ joblib==1.0.1
+ keras-nightly==2.5.0.dev2021032900
+ keras-preprocessing==1.1.2
+ kiwisolver==1.3.1
+ libtpu-nightly==0.1.dev20210615
+ markdown==3.3.4
+ matplotlib==3.4.2
+ msgpack==1.0.2
+ multidict==5.1.0
+ multiprocess==0.70.12.2
+ numpy==1.19.5
+ oauthlib==3.1.1
+ opt-einsum==3.3.0
+ optax==0.0.9
+ packaging==21.0
+ pandas==1.3.0
+ pathtools==0.1.2
+ pillow==8.3.1
+ pip==20.0.2
+ pkg-resources==0.0.0
+ promise==2.3
+ protobuf==3.17.3
+ psutil==5.8.0
+ pyarrow==4.0.1
+ pyasn1-modules==0.2.8
+ pyasn1==0.4.8
+ pyparsing==2.4.7
+ python-dateutil==2.8.1
+ pytz==2021.1
+ pyyaml==5.4.1
+ regex==2021.7.6
+ requests-oauthlib==1.3.0
+ requests==2.26.0
+ rsa==4.7.2
+ sacremoses==0.0.45
+ scipy==1.7.0
+ sentry-sdk==1.3.0
+ setuptools==44.0.0
+ shortuuid==1.0.1
+ six==1.15.0
+ smmap==4.0.0
+ subprocess32==3.5.4
+ tensorboard-data-server==0.6.1
+ tensorboard-plugin-wit==1.8.0
+ tensorboard==2.5.0
+ tensorflow-estimator==2.5.0
+ tensorflow==2.5.0
+ termcolor==1.1.0
+ tokenizers==0.10.3
+ toolz==0.11.1
+ torch==1.9.0
+ tqdm==4.61.2
+ transformers==4.9.0.dev0
+ typing-extensions==3.7.4.3
+ urllib3==1.26.6
+ wandb==0.10.33
+ werkzeug==2.0.1
+ wheel==0.36.2
+ wrapt==1.12.1
+ xxhash==2.0.2
+ yarl==1.6.3
wandb/run-20210716_222651-1lrzcta0/files/wandb-metadata.json ADDED
@@ -0,0 +1,45 @@
+ {
+     "os": "Linux-5.4.0-1043-gcp-x86_64-with-glibc2.29",
+     "python": "3.8.10",
+     "heartbeatAt": "2021-07-16T22:26:53.104485",
+     "startedAt": "2021-07-16T22:26:51.031817",
+     "docker": null,
+     "cpu_count": 96,
+     "cuda": null,
+     "args": [
+         "--push_to_hub",
+         "--output_dir=./",
+         "--model_type=big_bird",
+         "--config_name=./",
+         "--tokenizer_name=./",
+         "--max_seq_length=4096",
+         "--weight_decay=0.0095",
+         "--warmup_steps=10000",
+         "--overwrite_output_dir",
+         "--adam_beta1=0.9",
+         "--adam_beta2=0.98",
+         "--logging_steps=50",
+         "--eval_steps=10000",
+         "--num_train_epochs=4",
+         "--preprocessing_num_workers=96",
+         "--save_steps=15000",
+         "--learning_rate=3e-5",
+         "--per_device_train_batch_size=1",
+         "--per_device_eval_batch_size=1",
+         "--save_total_limit=50",
+         "--max_eval_samples=4000",
+         "--resume_from_checkpoint=./"
+     ],
+     "state": "running",
+     "program": "./run_mlm_flax_no_accum.py",
+     "codePath": "run_mlm_flax_no_accum.py",
+     "git": {
+         "remote": "https://huggingface.co/flax-community/pino-roberta-base",
+         "commit": "def9a456105f36b517155343f42ff643df2d20ce"
+     },
+     "email": null,
+     "root": "/home/dat/pino-roberta-base",
+     "host": "t1v-n-f5c06ea1-w-0",
+     "username": "dat",
+     "executable": "/home/dat/pino/bin/python"
+ }
wandb/run-20210716_222651-1lrzcta0/files/wandb-summary.json ADDED
@@ -0,0 +1 @@
+ {}
wandb/run-20210716_222651-1lrzcta0/logs/debug-internal.log ADDED
@@ -0,0 +1,111 @@
+ 2021-07-16 22:26:51,733 INFO MainThread:797496 [internal.py:wandb_internal():88] W&B internal server running at pid: 797496, started at: 2021-07-16 22:26:51.733071
+ 2021-07-16 22:26:51,735 DEBUG HandlerThread:797496 [handler.py:handle_request():124] handle_request: check_version
+ 2021-07-16 22:26:51,735 INFO WriterThread:797496 [datastore.py:open_for_write():80] open: /home/dat/pino-roberta-base/wandb/run-20210716_222651-1lrzcta0/run-1lrzcta0.wandb
+ 2021-07-16 22:26:51,736 DEBUG SenderThread:797496 [sender.py:send():179] send: header
+ 2021-07-16 22:26:51,736 DEBUG SenderThread:797496 [sender.py:send_request():193] send_request: check_version
+ 2021-07-16 22:26:51,775 DEBUG SenderThread:797496 [sender.py:send():179] send: run
+ 2021-07-16 22:26:51,967 INFO SenderThread:797496 [dir_watcher.py:__init__():168] watching files in: /home/dat/pino-roberta-base/wandb/run-20210716_222651-1lrzcta0/files
+ 2021-07-16 22:26:51,968 INFO SenderThread:797496 [sender.py:_start_run_threads():716] run started: 1lrzcta0 with start time 1626474411
+ 2021-07-16 22:26:51,968 DEBUG SenderThread:797496 [sender.py:send():179] send: summary
+ 2021-07-16 22:26:51,968 DEBUG HandlerThread:797496 [handler.py:handle_request():124] handle_request: run_start
+ 2021-07-16 22:26:51,969 INFO SenderThread:797496 [sender.py:_save_file():841] saving file wandb-summary.json with policy end
+ 2021-07-16 22:26:52,972 INFO Thread-8 :797496 [dir_watcher.py:_on_file_created():216] file/dir created: /home/dat/pino-roberta-base/wandb/run-20210716_222651-1lrzcta0/files/wandb-summary.json
+ 2021-07-16 22:26:53,104 DEBUG HandlerThread:797496 [meta.py:__init__():39] meta init
+ 2021-07-16 22:26:53,104 DEBUG HandlerThread:797496 [meta.py:__init__():53] meta init done
+ 2021-07-16 22:26:53,104 DEBUG HandlerThread:797496 [meta.py:probe():210] probe
+ 2021-07-16 22:26:53,105 DEBUG HandlerThread:797496 [meta.py:_setup_git():200] setup git
+ 2021-07-16 22:26:53,134 DEBUG HandlerThread:797496 [meta.py:_setup_git():207] setup git done
+ 2021-07-16 22:26:53,135 DEBUG HandlerThread:797496 [meta.py:_save_pip():57] save pip
+ 2021-07-16 22:26:53,135 DEBUG HandlerThread:797496 [meta.py:_save_pip():71] save pip done
+ 2021-07-16 22:26:53,135 DEBUG HandlerThread:797496 [meta.py:probe():252] probe done
+ 2021-07-16 22:26:53,138 DEBUG SenderThread:797496 [sender.py:send():179] send: files
+ 2021-07-16 22:26:53,138 INFO SenderThread:797496 [sender.py:_save_file():841] saving file wandb-metadata.json with policy now
+ 2021-07-16 22:26:53,146 DEBUG HandlerThread:797496 [handler.py:handle_request():124] handle_request: stop_status
+ 2021-07-16 22:26:53,147 DEBUG SenderThread:797496 [sender.py:send_request():193] send_request: stop_status
+ 2021-07-16 22:26:53,278 DEBUG SenderThread:797496 [sender.py:send():179] send: config
+ 2021-07-16 22:26:53,280 DEBUG SenderThread:797496 [sender.py:send():179] send: config
+ 2021-07-16 22:26:53,280 DEBUG SenderThread:797496 [sender.py:send():179] send: config
+ 2021-07-16 22:26:53,574 INFO Thread-11 :797496 [upload_job.py:push():137] Uploaded file /tmp/tmpq24rtm31wandb/2zocqi83-wandb-metadata.json
+ 2021-07-16 22:26:53,971 INFO Thread-8 :797496 [dir_watcher.py:_on_file_created():216] file/dir created: /home/dat/pino-roberta-base/wandb/run-20210716_222651-1lrzcta0/files/requirements.txt
+ 2021-07-16 22:26:53,972 INFO Thread-8 :797496 [dir_watcher.py:_on_file_created():216] file/dir created: /home/dat/pino-roberta-base/wandb/run-20210716_222651-1lrzcta0/files/wandb-metadata.json
+ 2021-07-16 22:26:53,972 INFO Thread-8 :797496 [dir_watcher.py:_on_file_created():216] file/dir created: /home/dat/pino-roberta-base/wandb/run-20210716_222651-1lrzcta0/files/output.log
+ 2021-07-16 22:27:07,978 INFO Thread-8 :797496 [dir_watcher.py:_on_file_modified():229] file/dir modified: /home/dat/pino-roberta-base/wandb/run-20210716_222651-1lrzcta0/files/output.log
+ 2021-07-16 22:27:08,331 DEBUG HandlerThread:797496 [handler.py:handle_request():124] handle_request: stop_status
+ 2021-07-16 22:27:08,331 DEBUG SenderThread:797496 [sender.py:send_request():193] send_request: stop_status
+ 2021-07-16 22:27:09,979 INFO Thread-8 :797496 [dir_watcher.py:_on_file_modified():229] file/dir modified: /home/dat/pino-roberta-base/wandb/run-20210716_222651-1lrzcta0/files/output.log
+ 2021-07-16 22:27:11,980 INFO Thread-8 :797496 [dir_watcher.py:_on_file_modified():229] file/dir modified: /home/dat/pino-roberta-base/wandb/run-20210716_222651-1lrzcta0/files/output.log
+ 2021-07-16 22:27:21,188 DEBUG SenderThread:797496 [sender.py:send():179] send: stats
+ 2021-07-16 22:27:22,985 INFO Thread-8 :797496 [dir_watcher.py:_on_file_modified():229] file/dir modified: /home/dat/pino-roberta-base/wandb/run-20210716_222651-1lrzcta0/files/config.yaml
+ 2021-07-16 22:27:23,496 DEBUG HandlerThread:797496 [handler.py:handle_request():124] handle_request: stop_status
+ 2021-07-16 22:27:23,496 DEBUG SenderThread:797496 [sender.py:send_request():193] send_request: stop_status
+ 2021-07-16 22:27:38,629 DEBUG HandlerThread:797496 [handler.py:handle_request():124] handle_request: stop_status
+ 2021-07-16 22:27:38,629 DEBUG SenderThread:797496 [sender.py:send_request():193] send_request: stop_status
+ 2021-07-16 22:27:51,269 DEBUG SenderThread:797496 [sender.py:send():179] send: stats
+ 2021-07-16 22:27:53,761 DEBUG HandlerThread:797496 [handler.py:handle_request():124] handle_request: stop_status
+ 2021-07-16 22:27:53,762 DEBUG SenderThread:797496 [sender.py:send_request():193] send_request: stop_status
+ 2021-07-16 22:28:08,896 DEBUG HandlerThread:797496 [handler.py:handle_request():124] handle_request: stop_status
+ 2021-07-16 22:28:08,896 DEBUG SenderThread:797496 [sender.py:send_request():193] send_request: stop_status
+ 2021-07-16 22:28:21,343 DEBUG SenderThread:797496 [sender.py:send():179] send: stats
+ 2021-07-16 22:28:24,028 DEBUG HandlerThread:797496 [handler.py:handle_request():124] handle_request: stop_status
+ 2021-07-16 22:28:24,029 DEBUG SenderThread:797496 [sender.py:send_request():193] send_request: stop_status
+ 2021-07-16 22:28:39,163 DEBUG HandlerThread:797496 [handler.py:handle_request():124] handle_request: stop_status
+ 2021-07-16 22:28:39,163 DEBUG SenderThread:797496 [sender.py:send_request():193] send_request: stop_status
+ 2021-07-16 22:28:51,416 DEBUG SenderThread:797496 [sender.py:send():179] send: stats
+ 2021-07-16 22:28:54,295 DEBUG HandlerThread:797496 [handler.py:handle_request():124] handle_request: stop_status
+ 2021-07-16 22:28:54,295 DEBUG SenderThread:797496 [sender.py:send_request():193] send_request: stop_status
+ 2021-07-16 22:29:09,427 DEBUG HandlerThread:797496 [handler.py:handle_request():124] handle_request: stop_status
+ 2021-07-16 22:29:09,428 DEBUG SenderThread:797496 [sender.py:send_request():193] send_request: stop_status
+ 2021-07-16 22:29:21,488 DEBUG SenderThread:797496 [sender.py:send():179] send: stats
+ 2021-07-16 22:29:24,558 DEBUG HandlerThread:797496 [handler.py:handle_request():124] handle_request: stop_status
+ 2021-07-16 22:29:24,559 DEBUG SenderThread:797496 [sender.py:send_request():193] send_request: stop_status
+ 2021-07-16 22:29:39,688 DEBUG HandlerThread:797496 [handler.py:handle_request():124] handle_request: stop_status
+ 2021-07-16 22:29:39,689 DEBUG SenderThread:797496 [sender.py:send_request():193] send_request: stop_status
+ 2021-07-16 22:29:51,560 DEBUG SenderThread:797496 [sender.py:send():179] send: stats
+ 2021-07-16 22:29:54,818 DEBUG HandlerThread:797496 [handler.py:handle_request():124] handle_request: stop_status
+ 2021-07-16 22:29:54,818 DEBUG SenderThread:797496 [sender.py:send_request():193] send_request: stop_status
+ 2021-07-16 22:30:09,948 DEBUG HandlerThread:797496 [handler.py:handle_request():124] handle_request: stop_status
+ 2021-07-16 22:30:09,948 DEBUG SenderThread:797496 [sender.py:send_request():193] send_request: stop_status
+ 2021-07-16 22:30:21,629 DEBUG SenderThread:797496 [sender.py:send():179] send: stats
+ 2021-07-16 22:30:25,078 DEBUG HandlerThread:797496 [handler.py:handle_request():124] handle_request: stop_status
+ 2021-07-16 22:30:25,078 DEBUG SenderThread:797496 [sender.py:send_request():193] send_request: stop_status
+ 2021-07-16 22:30:40,210 DEBUG HandlerThread:797496 [handler.py:handle_request():124] handle_request: stop_status
+ 2021-07-16 22:30:40,211 DEBUG SenderThread:797496 [sender.py:send_request():193] send_request: stop_status
+ 2021-07-16 22:30:51,688 DEBUG SenderThread:797496 [sender.py:send():179] send: stats
+ 2021-07-16 22:30:55,351 DEBUG HandlerThread:797496 [handler.py:handle_request():124] handle_request: stop_status
+ 2021-07-16 22:30:55,351 DEBUG SenderThread:797496 [sender.py:send_request():193] send_request: stop_status
+ 2021-07-16 22:31:10,485 DEBUG HandlerThread:797496 [handler.py:handle_request():124] handle_request: stop_status
+ 2021-07-16 22:31:10,485 DEBUG SenderThread:797496 [sender.py:send_request():193] send_request: stop_status
+ 2021-07-16 22:31:21,758 DEBUG SenderThread:797496 [sender.py:send():179] send: stats
+ 2021-07-16 22:31:25,617 DEBUG HandlerThread:797496 [handler.py:handle_request():124] handle_request: stop_status
+ 2021-07-16 22:31:25,617 DEBUG SenderThread:797496 [sender.py:send_request():193] send_request: stop_status
+ 2021-07-16 22:31:40,750 DEBUG HandlerThread:797496 [handler.py:handle_request():124] handle_request: stop_status
+ 2021-07-16 22:31:40,750 DEBUG SenderThread:797496 [sender.py:send_request():193] send_request: stop_status
+ 2021-07-16 22:31:51,829 DEBUG SenderThread:797496 [sender.py:send():179] send: stats
+ 2021-07-16 22:31:55,880 DEBUG HandlerThread:797496 [handler.py:handle_request():124] handle_request: stop_status
+ 2021-07-16 22:31:55,881 DEBUG SenderThread:797496 [sender.py:send_request():193] send_request: stop_status
+ 2021-07-16 22:32:11,013 DEBUG HandlerThread:797496 [handler.py:handle_request():124] handle_request: stop_status
+ 2021-07-16 22:32:11,014 DEBUG SenderThread:797496 [sender.py:send_request():193] send_request: stop_status
+ 2021-07-16 22:32:21,898 DEBUG SenderThread:797496 [sender.py:send():179] send: stats
+ 2021-07-16 22:32:26,146 DEBUG HandlerThread:797496 [handler.py:handle_request():124] handle_request: stop_status
+ 2021-07-16 22:32:26,147 DEBUG SenderThread:797496 [sender.py:send_request():193] send_request: stop_status
+ 2021-07-16 22:32:39,632 WARNING MainThread:797496 [internal.py:wandb_internal():147] Internal process interrupt: 1
+ 2021-07-16 22:32:40,112 INFO Thread-8 :797496 [dir_watcher.py:_on_file_modified():229] file/dir modified: /home/dat/pino-roberta-base/wandb/run-20210716_222651-1lrzcta0/files/output.log
+ 2021-07-16 22:32:40,157 WARNING MainThread:797496 [internal.py:wandb_internal():147] Internal process interrupt: 2
+ 2021-07-16 22:32:40,157 ERROR MainThread:797496 [internal.py:wandb_internal():150] Internal process interrupted.
+ 2021-07-16 22:32:40,814 INFO WriterThread:797496 [datastore.py:close():288] close: /home/dat/pino-roberta-base/wandb/run-20210716_222651-1lrzcta0/run-1lrzcta0.wandb
+ 2021-07-16 22:32:40,815 INFO SenderThread:797496 [sender.py:finish():945] shutting down sender
+ 2021-07-16 22:32:40,815 INFO SenderThread:797496 [dir_watcher.py:finish():282] shutting down directory watcher
+ 2021-07-16 22:32:40,815 INFO HandlerThread:797496 [handler.py:finish():638] shutting down handler
+ 2021-07-16 22:32:41,113 INFO SenderThread:797496 [dir_watcher.py:finish():312] scan: /home/dat/pino-roberta-base/wandb/run-20210716_222651-1lrzcta0/files
+ 2021-07-16 22:32:41,113 INFO SenderThread:797496 [dir_watcher.py:finish():318] scan save: /home/dat/pino-roberta-base/wandb/run-20210716_222651-1lrzcta0/files/requirements.txt requirements.txt
+ 2021-07-16 22:32:41,114 INFO SenderThread:797496 [dir_watcher.py:finish():318] scan save: /home/dat/pino-roberta-base/wandb/run-20210716_222651-1lrzcta0/files/output.log output.log
+ 2021-07-16 22:32:41,114 INFO SenderThread:797496 [dir_watcher.py:finish():318] scan save: /home/dat/pino-roberta-base/wandb/run-20210716_222651-1lrzcta0/files/wandb-metadata.json wandb-metadata.json
+ 2021-07-16 22:32:41,117 INFO SenderThread:797496 [dir_watcher.py:finish():318] scan save: /home/dat/pino-roberta-base/wandb/run-20210716_222651-1lrzcta0/files/config.yaml config.yaml
+ 2021-07-16 22:32:41,118 INFO SenderThread:797496 [dir_watcher.py:finish():318] scan save: /home/dat/pino-roberta-base/wandb/run-20210716_222651-1lrzcta0/files/wandb-summary.json wandb-summary.json
+ 2021-07-16 22:32:41,121 INFO SenderThread:797496 [file_pusher.py:finish():177] shutting down file pusher
+ 2021-07-16 22:32:41,121 INFO SenderThread:797496 [file_pusher.py:join():182] waiting for file pusher
+ 2021-07-16 22:32:41,606 INFO Thread-14 :797496 [upload_job.py:push():137] Uploaded file /home/dat/pino-roberta-base/wandb/run-20210716_222651-1lrzcta0/files/config.yaml
+ 2021-07-16 22:32:41,611 INFO Thread-13 :797496 [upload_job.py:push():137] Uploaded file /home/dat/pino-roberta-base/wandb/run-20210716_222651-1lrzcta0/files/output.log
+ 2021-07-16 22:32:41,674 INFO Thread-12 :797496 [upload_job.py:push():137] Uploaded file /home/dat/pino-roberta-base/wandb/run-20210716_222651-1lrzcta0/files/requirements.txt
+ 2021-07-16 22:32:41,777 INFO Thread-15 :797496 [upload_job.py:push():137] Uploaded file /home/dat/pino-roberta-base/wandb/run-20210716_222651-1lrzcta0/files/wandb-summary.json
+ 2021-07-16 22:32:42,440 INFO MainThread:797496 [internal.py:handle_exit():78] Internal process exited
wandb/run-20210716_222651-1lrzcta0/logs/debug.log ADDED
@@ -0,0 +1,28 @@
+ 2021-07-16 22:26:51,033 INFO MainThread:796231 [wandb_setup.py:_flush():69] setting env: {}
+ 2021-07-16 22:26:51,033 INFO MainThread:796231 [wandb_setup.py:_flush():69] setting login settings: {}
+ 2021-07-16 22:26:51,033 INFO MainThread:796231 [wandb_init.py:_log_setup():337] Logging user logs to /home/dat/pino-roberta-base/wandb/run-20210716_222651-1lrzcta0/logs/debug.log
+ 2021-07-16 22:26:51,033 INFO MainThread:796231 [wandb_init.py:_log_setup():338] Logging internal logs to /home/dat/pino-roberta-base/wandb/run-20210716_222651-1lrzcta0/logs/debug-internal.log
+ 2021-07-16 22:26:51,033 INFO MainThread:796231 [wandb_init.py:init():370] calling init triggers
+ 2021-07-16 22:26:51,034 INFO MainThread:796231 [wandb_init.py:init():375] wandb.init called with sweep_config: {}
+ config: {}
+ 2021-07-16 22:26:51,034 INFO MainThread:796231 [wandb_init.py:init():419] starting backend
+ 2021-07-16 22:26:51,034 INFO MainThread:796231 [backend.py:_multiprocessing_setup():70] multiprocessing start_methods=fork,spawn,forkserver, using: spawn
+ 2021-07-16 22:26:51,081 INFO MainThread:796231 [backend.py:ensure_launched():135] starting backend process...
+ 2021-07-16 22:26:51,130 INFO MainThread:796231 [backend.py:ensure_launched():139] started backend process with pid: 797496
+ 2021-07-16 22:26:51,132 INFO MainThread:796231 [wandb_init.py:init():424] backend started and connected
+ 2021-07-16 22:26:51,135 INFO MainThread:796231 [wandb_init.py:init():472] updated telemetry
+ 2021-07-16 22:26:51,136 INFO MainThread:796231 [wandb_init.py:init():491] communicating current version
+ 2021-07-16 22:26:51,773 INFO MainThread:796231 [wandb_init.py:init():496] got version response upgrade_message: "wandb version 0.11.0 is available!  To upgrade, please run:\n $ pip install wandb --upgrade"
+
+ 2021-07-16 22:26:51,773 INFO MainThread:796231 [wandb_init.py:init():504] communicating run to backend with 30 second timeout
+ 2021-07-16 22:26:51,967 INFO MainThread:796231 [wandb_init.py:init():529] starting run threads in backend
+ 2021-07-16 22:26:53,141 INFO MainThread:796231 [wandb_run.py:_console_start():1623] atexit reg
+ 2021-07-16 22:26:53,142 INFO MainThread:796231 [wandb_run.py:_redirect():1497] redirect: SettingsConsole.REDIRECT
+ 2021-07-16 22:26:53,142 INFO MainThread:796231 [wandb_run.py:_redirect():1502] Redirecting console.
+ 2021-07-16 22:26:53,144 INFO MainThread:796231 [wandb_run.py:_redirect():1558] Redirects installed.
+ 2021-07-16 22:26:53,145 INFO MainThread:796231 [wandb_init.py:init():554] run started, returning control to user process
+ 2021-07-16 22:26:53,151 INFO MainThread:796231 [wandb_run.py:_config_callback():872] config_cb None None {'output_dir': './', 'overwrite_output_dir': True, 'do_train': False, 'do_eval': False, 'do_predict': False, 'evaluation_strategy': 'IntervalStrategy.NO', 'prediction_loss_only': False, 'per_device_train_batch_size': 1, 'per_device_eval_batch_size': 1, 'per_gpu_train_batch_size': None, 'per_gpu_eval_batch_size': None, 'gradient_accumulation_steps': 1, 'eval_accumulation_steps': None, 'learning_rate': 3e-05, 'weight_decay': 0.0095, 'adam_beta1': 0.9, 'adam_beta2': 0.98, 'adam_epsilon': 1e-08, 'max_grad_norm': 1.0, 'num_train_epochs': 4.0, 'max_steps': -1, 'lr_scheduler_type': 'SchedulerType.LINEAR', 'warmup_ratio': 0.0, 'warmup_steps': 10000, 'log_level': -1, 'log_level_replica': -1, 'log_on_each_node': True, 'logging_dir': './runs/Jul16_22-26-42_t1v-n-f5c06ea1-w-0', 'logging_strategy': 'IntervalStrategy.STEPS', 'logging_first_step': False, 'logging_steps': 50, 'save_strategy': 'IntervalStrategy.STEPS', 'save_steps': 15000, 'save_total_limit': 50, 'save_on_each_node': False, 'no_cuda': False, 'seed': 42, 'fp16': False, 'fp16_opt_level': 'O1', 'fp16_backend': 'auto', 'fp16_full_eval': False, 'local_rank': -1, 'tpu_num_cores': None, 'tpu_metrics_debug': False, 'debug': [], 'dataloader_drop_last': False, 'eval_steps': 10000, 'dataloader_num_workers': 0, 'past_index': -1, 'run_name': './', 'disable_tqdm': False, 'remove_unused_columns': True, 'label_names': None, 'load_best_model_at_end': False, 'metric_for_best_model': None, 'greater_is_better': None, 'ignore_data_skip': False, 'sharded_ddp': [], 'deepspeed': None, 'label_smoothing_factor': 0.0, 'adafactor': False, 'group_by_length': False, 'length_column_name': 'length', 'report_to': ['tensorboard', 'wandb'], 'ddp_find_unused_parameters': None, 'dataloader_pin_memory': True, 'skip_memory_metrics': True, 'use_legacy_prediction_loop': False, 'push_to_hub': True, 'resume_from_checkpoint': './', 'push_to_hub_model_id': '', 'push_to_hub_organization': None, 'push_to_hub_token': None, 'mp_parameters': '', '_n_gpu': 0, '__cached__setup_devices': 'cpu'}
+ 2021-07-16 22:26:53,153 INFO MainThread:796231 [wandb_run.py:_config_callback():872] config_cb None None {'model_name_or_path': None, 'model_type': 'big_bird', 'config_name': './', 'tokenizer_name': './', 'cache_dir': None, 'use_fast_tokenizer': True, 'dtype': 'float32'}
+ 2021-07-16 22:26:53,154 INFO MainThread:796231 [wandb_run.py:_config_callback():872] config_cb None None {'dataset_name': None, 'dataset_config_name': None, 'train_ref_file': None, 'validation_ref_file': None, 'overwrite_cache': False, 'validation_split_percentage': 5, 'max_seq_length': 4096, 'preprocessing_num_workers': 96, 'mlm_probability': 0.15, 'pad_to_max_length': False, 'line_by_line': False, 'max_eval_samples': 4000}
+ 2021-07-16 22:32:39,717 INFO MainThread:796231 [wandb_run.py:_atexit_cleanup():1593] got exitcode: 255
+ 2021-07-16 22:32:39,718 INFO MainThread:796231 [wandb_run.py:_restore():1565] restore
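
This debug.log traces the usual wandb client lifecycle: `wandb.init()` spawns the backend process (pid 797496 here), three `config_cb` callbacks push the training, model, and data arguments into the run config, and the run is torn down with exit code 255 after the interrupt recorded in debug-internal.log. A minimal sketch of the client-side calls behind this sequence (project name and logged values are illustrative, not taken from the run):

```python
import wandb

# wandb.init() spawns the internal backend process and creates the
# run directory wandb/run-<timestamp>-<id>/ seen throughout this commit.
run = wandb.init(project="pino-roberta-base")  # project name is an assumption

# Config updates appear in debug.log as config_cb entries and are
# persisted to files/config.yaml.
run.config.update({"learning_rate": 3e-5, "warmup_steps": 10000})

run.log({"train_loss": 2.05})  # streamed to the backend by the SenderThread
run.finish()                   # flushes pending files and stops the backend
```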
wandb/run-20210716_222651-1lrzcta0/run-1lrzcta0.wandb ADDED
Binary file (6.6 kB)
wandb/run-20210716_223350-8eukt20m/files/config.yaml ADDED
@@ -0,0 +1,308 @@
+ wandb_version: 1
+
+ __cached__setup_devices:
+   desc: null
+   value: cpu
+ _n_gpu:
+   desc: null
+   value: 0
+ _wandb:
+   desc: null
+   value:
+     cli_version: 0.10.33
+     framework: huggingface
+     huggingface_version: 4.9.0.dev0
+     is_jupyter_run: false
+     is_kaggle_kernel: false
+     python_version: 3.8.10
+     t:
+       1:
+       - 1
+       - 3
+       - 11
+       4: 3.8.10
+       5: 0.10.33
+       6: 4.9.0.dev0
+       8:
+       - 5
+ adafactor:
+   desc: null
+   value: false
+ adam_beta1:
+   desc: null
+   value: 0.9
+ adam_beta2:
+   desc: null
+   value: 0.98
+ adam_epsilon:
+   desc: null
+   value: 1.0e-08
+ cache_dir:
+   desc: null
+   value: null
+ config_name:
+   desc: null
+   value: ./
+ dataloader_drop_last:
+   desc: null
+   value: false
+ dataloader_num_workers:
+   desc: null
+   value: 0
+ dataloader_pin_memory:
+   desc: null
+   value: true
+ dataset_config_name:
+   desc: null
+   value: null
+ dataset_name:
+   desc: null
+   value: null
+ ddp_find_unused_parameters:
+   desc: null
+   value: null
+ debug:
+   desc: null
+   value: []
+ deepspeed:
+   desc: null
+   value: null
+ disable_tqdm:
+   desc: null
+   value: false
+ do_eval:
+   desc: null
+   value: false
+ do_predict:
+   desc: null
+   value: false
+ do_train:
+   desc: null
+   value: false
+ dtype:
+   desc: null
+   value: float32
+ eval_accumulation_steps:
+   desc: null
+   value: null
+ eval_steps:
+   desc: null
+   value: 10000
+ evaluation_strategy:
+   desc: null
+   value: IntervalStrategy.NO
+ fp16:
+   desc: null
+   value: false
+ fp16_backend:
+   desc: null
+   value: auto
+ fp16_full_eval:
+   desc: null
+   value: false
+ fp16_opt_level:
+   desc: null
+   value: O1
+ gradient_accumulation_steps:
+   desc: null
+   value: 1
+ greater_is_better:
+   desc: null
+   value: null
+ group_by_length:
+   desc: null
+   value: false
+ ignore_data_skip:
+   desc: null
+   value: false
+ label_names:
+   desc: null
+   value: null
+ label_smoothing_factor:
+   desc: null
+   value: 0.0
+ learning_rate:
+   desc: null
+   value: 3.0e-05
+ length_column_name:
+   desc: null
+   value: length
+ line_by_line:
+   desc: null
+   value: false
+ load_best_model_at_end:
+   desc: null
+   value: false
+ local_rank:
+   desc: null
+   value: -1
+ log_level:
+   desc: null
+   value: -1
+ log_level_replica:
+   desc: null
+   value: -1
+ log_on_each_node:
+   desc: null
+   value: true
+ logging_dir:
+   desc: null
+   value: ./runs/Jul16_22-33-42_t1v-n-f5c06ea1-w-0
+ logging_first_step:
+   desc: null
+   value: false
+ logging_steps:
+   desc: null
+   value: 50
+ logging_strategy:
+   desc: null
+   value: IntervalStrategy.STEPS
+ lr_scheduler_type:
+   desc: null
+   value: SchedulerType.LINEAR
+ max_eval_samples:
+   desc: null
+   value: 4000
+ max_grad_norm:
+   desc: null
+   value: 1.0
+ max_seq_length:
+   desc: null
+   value: 4096
+ max_steps:
+   desc: null
+   value: -1
+ metric_for_best_model:
+   desc: null
+   value: null
+ mlm_probability:
+   desc: null
+   value: 0.15
+ model_name_or_path:
+   desc: null
+   value: null
+ model_type:
+   desc: null
+   value: big_bird
+ mp_parameters:
+   desc: null
+   value: ''
+ no_cuda:
+   desc: null
+   value: false
+ num_train_epochs:
+   desc: null
+   value: 4.0
+ output_dir:
+   desc: null
+   value: ./
+ overwrite_cache:
+   desc: null
+   value: false
+ overwrite_output_dir:
+   desc: null
+   value: true
+ pad_to_max_length:
+   desc: null
+   value: false
+ past_index:
+   desc: null
+   value: -1
+ per_device_eval_batch_size:
+   desc: null
+   value: 1
+ per_device_train_batch_size:
+   desc: null
+   value: 1
+ per_gpu_eval_batch_size:
+   desc: null
+   value: null
+ per_gpu_train_batch_size:
+   desc: null
+   value: null
+ prediction_loss_only:
+   desc: null
+   value: false
+ preprocessing_num_workers:
+   desc: null
+   value: 96
+ push_to_hub:
+   desc: null
+   value: true
+ push_to_hub_model_id:
+   desc: null
+   value: ''
+ push_to_hub_organization:
+   desc: null
+   value: null
+ push_to_hub_token:
+   desc: null
+   value: null
+ remove_unused_columns:
+   desc: null
+   value: true
+ report_to:
+   desc: null
+   value:
+   - tensorboard
+   - wandb
+ resume_from_checkpoint:
+   desc: null
+   value: ./
+ run_name:
+   desc: null
+   value: ./
+ save_on_each_node:
+   desc: null
+   value: false
+ save_steps:
+   desc: null
+   value: 15000
+ save_strategy:
+   desc: null
+   value: IntervalStrategy.STEPS
+ save_total_limit:
+   desc: null
+   value: 50
+ seed:
+   desc: null
+   value: 42
+ sharded_ddp:
+   desc: null
+   value: []
+ skip_memory_metrics:
+   desc: null
+   value: true
+ tokenizer_name:
+   desc: null
+   value: ./
+ tpu_metrics_debug:
+   desc: null
+   value: false
+ tpu_num_cores:
+   desc: null
+   value: null
+ train_ref_file:
+   desc: null
+   value: null
+ use_fast_tokenizer:
+   desc: null
+   value: true
+ use_legacy_prediction_loop:
+   desc: null
+   value: false
+ validation_ref_file:
+   desc: null
+   value: null
+ validation_split_percentage:
+   desc: null
+   value: 5
+ warmup_ratio:
+   desc: null
+   value: 0.0
+ warmup_steps:
+   desc: null
+   value: 10000
+ weight_decay:
+   desc: null
+   value: 0.0095
wandb/run-20210716_223350-8eukt20m/files/output.log ADDED
@@ -0,0 +1,1646 @@
1
+ [22:34:05] - INFO - absl - Restoring checkpoint from ./checkpoint_330000
2
+ tcmalloc: large alloc 1530273792 bytes == 0x9c4ae000 @ 0x7f2f3656a680 0x7f2f3658b824 0x5b9a14 0x50b2ae 0x50cb1b 0x5a6f17 0x5f3010 0x56fd36 0x568d9a 0x5f5b33 0x56aadf 0x568d9a 0x68cdc7 0x67e161 0x67e1df 0x67e281 0x67e627 0x6b6e62 0x6b71ed 0x7f2f3637f0b3 0x5f96de
3
+ /home/dat/pino/lib/python3.8/site-packages/jax/lib/xla_bridge.py:386: UserWarning: jax.host_count has been renamed to jax.process_count. This alias will eventually be removed; please update your code.
4
+ warnings.warn(
5
+ /home/dat/pino/lib/python3.8/site-packages/jax/lib/xla_bridge.py:373: UserWarning: jax.host_id has been renamed to jax.process_index. This alias will eventually be removed; please update your code.
6
+ warnings.warn(
7
+ Epoch ... (1/4): 0%| | 0/4 [00:00<?, ?it/s]
8
+ Training...: 0it [00:00, ?it/s]
9
+
10
+
11
+
12
+
13
+
14
+
15
+
16
+
17
+ Training...: 49it [04:31, 3.54it/s]
18
+
19
+
20
+
21
+
22
+
23
+
24
+
25
+
26
+ Training...: 100it [04:58, 2.04s/it]
27
+
28
+
29
+
30
+
31
+
32
+
33
+
34
+
35
+ Training...: 150it [05:18, 2.15s/it]
36
+
37
+
38
+
39
+
40
+
41
+
42
+
43
+
44
+ Training...: 202it [05:38, 1.19s/it]
45
+
46
+
47
+
48
+
49
+
50
+
51
+
52
+ Training...: 249it [05:52, 3.26it/s]
53
+
54
+
55
+
56
+
57
+
58
+
59
+
60
+
61
+ Training...: 299it [06:12, 3.37it/s]
62
+
63
+
64
+
65
+
66
+
67
+
68
+
69
+
70
+ Training...: 349it [06:32, 3.95it/s]
71
+
72
+
73
+
74
+
75
+
76
+
77
+
78
+ Training...: 399it [06:52, 3.88it/s]
79
+
80
+
81
+
82
+
83
+
84
+
85
+
86
+
87
+ Training...: 449it [07:13, 3.26it/s]
88
+
89
+
90
+
91
+
92
+
93
+
94
+
95
+
96
+ Training...: 500it [07:39, 2.27s/it]
97
+
98
+
99
+
100
+
101
+
102
+
103
+
104
+
105
+ Training...: 551it [08:00, 1.69s/it]
106
+
107
+
108
+
109
+
110
+
111
+
112
+
113
+
114
+ Training...: 603it [08:21, 1.02it/s]
115
+
116
+
117
+
118
+
119
+
120
+
121
+
122
+
123
+ Training...: 653it [08:41, 1.04it/s]
124
+
125
+
126
+
127
+
128
+
129
+
130
+
131
+ Training...: 699it [08:54, 3.99it/s]
132
+
133
+
134
+
135
+
136
+
137
+
138
+
139
+ Training...: 749it [09:13, 3.65it/s]
140
+
141
+
142
+
143
+
144
+
145
+
146
+
147
+ Training...: 799it [09:34, 3.86it/s]
148
+
149
+
150
+
151
+
152
+
153
+
154
+
155
+ Training...: 849it [09:54, 4.29it/s]
156
+
157
+
158
+
159
+
160
+
161
+
162
+
163
+ Training...: 899it [10:14, 4.42it/s]
164
+
165
+
166
+
167
+
168
+
169
+
170
+
171
+
172
+ Training...: 950it [10:41, 2.24s/it]
173
+
174
+
175
+
176
+
177
+
178
+
179
+
180
+
181
+ Training...: 1001it [11:02, 1.78s/it]
182
+
183
+
184
+
185
+
186
+
187
+
188
+
189
+ Training...: 1049it [11:15, 3.52it/s]
190
+
191
+
192
+
193
+
194
+
195
+
196
+
197
+ Training...: 1099it [11:34, 3.59it/s]
198
+
199
+
200
+
201
+
202
+
203
+
204
+
205
+ Training...: 1149it [11:55, 3.88it/s]
206
+
207
+
208
+
209
+
210
+
211
+
212
+
213
+ Training...: 1199it [12:16, 3.57it/s]
214
+
215
+
216
+
217
+
218
+
219
+
220
+
221
+ Training...: 1249it [12:35, 3.90it/s]
222
+
223
+
224
+
225
+
226
+
227
+
228
+
229
+
230
+ Training...: 1300it [13:03, 2.49s/it]
231
+
232
+
233
+
234
+
235
+
236
+
237
+
238
+
239
+ Training...: 1350it [13:23, 2.47s/it]
240
+
241
+
242
+
243
+
244
+
245
+
246
+ Training...: 1399it [13:36, 3.15it/s]
247
+
248
+
249
+
250
+
251
+
252
+
253
+
254
+
255
+ Training...: 1449it [13:57, 3.80it/s]
256
+
257
+
258
+
259
+
260
+
261
+
262
+
263
+
264
+ Training...: 1499it [14:18, 3.49it/s]
265
+
266
+
267
+
268
+
269
+
270
+
271
+
272
+ Training...: 1549it [14:37, 4.04it/s]
273
+
274
+
275
+
276
+
277
+
278
+
279
+
280
+ Training...: 1599it [14:58, 3.46it/s]
281
+
282
+
283
+
284
+
285
+
286
+
287
+
288
+ Training...: 1649it [15:17, 3.58it/s]
289
+
290
+
291
+
292
+
293
+
294
+
295
+
296
+
297
+ Training...: 1700it [15:45, 2.38s/it]
298
+
299
+
300
+
301
+
302
+
303
+
304
+
305
+
306
+ Training...: 1750it [16:05, 2.42s/it]
307
+
308
+
309
+
310
+
311
+
312
+
313
+
314
+ Training...: 1802it [16:26, 1.40s/it]
315
+
316
+
317
+
318
+
319
+
320
+
321
+ Training...: 1849it [16:39, 3.88it/s]
322
+
323
+
324
+
325
+
326
+
327
+
328
+
329
+ Training...: 1899it [16:58, 3.93it/s]
330
+
331
+
332
+
333
+
334
+
335
+
336
+
337
+ Training...: 1949it [17:18, 3.92it/s]
338
+
339
+
340
+
341
+
342
+
343
+
344
+
345
+ Training...: 1999it [17:39, 3.55it/s]
346
+
347
+
348
+
349
+
350
+
351
+
352
+ Training...: 2049it [17:58, 4.16it/s]
353
+
354
+
355
+
356
+
357
+
358
+
359
+
360
+ Training...: 2099it [18:19, 4.03it/s]
361
+
362
+
363
+
364
+
365
+
366
+
367
+
368
+ Training...: 2150it [18:47, 2.55s/it]
369
+
370
+
371
+
372
+
373
+
374
+
375
+
376
+
377
+ Training...: 2202it [19:08, 1.25s/it]
378
+
379
+
380
+
381
+
382
+
383
+
384
+
385
+ Training...: 2251it [19:28, 1.91s/it]
386
+
387
+
388
+
389
+
390
+
391
+
392
+ Training...: 2299it [19:41, 3.47it/s]
393
+
394
+
395
+
396
+
397
+
398
+
399
+
400
+ Training...: 2349it [20:00, 4.14it/s]
401
+
402
+
403
+
404
+
405
+
406
+
407
+
408
+ Training...: 2399it [20:21, 4.23it/s]
409
+
410
+
411
+
412
+
413
+
414
+
415
+
416
+ Training...: 2449it [20:41, 4.51it/s]
417
+
418
+
419
+
420
+
421
+
422
+
423
+
424
+ Training...: 2499it [21:01, 4.35it/s]
425
+
426
+
427
+
428
+
429
+
430
+
431
+
432
+ Training...: 2549it [21:22, 4.17it/s]
433
+
434
+
435
+
436
+
437
+
438
+
439
+ Training...: 2599it [21:41, 5.19it/s]
440
+
441
+
442
+
443
+
444
+
445
+
446
+
447
+ Training...: 2649it [22:02, 3.83it/s]
448
+
449
+
450
+
451
+
452
+
453
+
454
+
455
+ Training...: 2699it [22:22, 3.85it/s]
456
+
457
+
458
+
459
+
460
+
461
+
462
+
463
+ Training...: 2749it [22:42, 3.32it/s]
464
+
465
+
466
+
467
+
468
+
469
+
470
+ Training...: 2799it [23:02, 3.71it/s]
471
+
472
+
473
+
474
+
475
+
476
+
477
+ Training...: 2849it [23:22, 3.98it/s]
478
+
479
+
480
+
481
+
482
+
483
+
484
+
485
+ Training...: 2899it [23:43, 4.62it/s]
486
+
487
+
488
+
489
+
490
+
491
+
492
+ Training...: 2949it [24:02, 4.47it/s]
493
+
494
+
495
+
496
+
497
+
498
+
499
+
500
+
501
+ Training...: 3001it [24:31, 1.80s/it]
502
+
503
+
504
+
505
+
506
+
507
+
508
+
509
+
510
+ Training...: 3050it [24:51, 2.53s/it]
511
+
512
+
513
+
514
+
515
+
516
+
517
+
518
+ Training...: 3099it [25:04, 3.12it/s]
519
+
520
+
521
+
522
+
523
+
524
+
525
+
526
+ Training...: 3149it [25:24, 4.20it/s]
527
+
528
+
529
+
530
+
531
+
532
+
533
+
534
+ Training...: 3199it [25:44, 4.03it/s]
535
+
536
+
537
+
538
+
539
+
540
+
541
+
542
+ Training...: 3249it [26:04, 4.35it/s]
543
+
544
+
545
+
546
+
547
+
548
+
549
+
550
+ Training...: 3299it [26:25, 4.08it/s]
551
+
552
+
553
+
554
+
555
+
556
+
557
+
558
+
559
+ Training...: 3355it [26:54, 1.51it/s]
560
+
561
+
562
+
563
+
564
+
565
+
566
+
567
+ Training...: 3405it [27:14, 1.65it/s]
568
+
569
+
570
+
571
+
572
+
573
+
574
+
575
+ Training...: 3455it [27:35, 1.53it/s]
576
+
577
+
578
+
579
+
580
+
581
+
582
+ Training...: 3499it [27:45, 4.18it/s]
583
+
584
+
585
+
586
+
587
+
588
+
589
+ Training...: 3549it [28:06, 4.75it/s]
590
+
591
+
592
+
593
+
594
+
595
+
596
+
597
+
598
+ Training...: 3600it [28:34, 2.52s/it]
599
+
600
+
601
+
602
+
603
+
604
+
605
+ Training...: 3649it [28:46, 4.11it/s]
606
+
607
+
608
+
609
+
610
+
611
+
612
+
613
+
614
+ Training...: 3703it [29:15, 1.08s/it]
615
+
616
+
617
+
618
+
619
+
620
+
621
+
622
+ Training...: 3755it [29:36, 1.55it/s]
623
+
624
+
625
+
626
+
627
+
628
+
629
+
630
+ Training...: 3805it [29:56, 1.46it/s]
631
+
632
+
633
+
634
+
635
+
636
+
637
+
638
+ Training...: 3856it [30:17, 1.92it/s]
639
+
640
+
641
+
642
+
643
+
644
+ Training...: 3899it [30:27, 4.17it/s]
645
+
646
+
647
+
648
+
649
+
650
+
651
+
652
+ Training...: 3950it [30:55, 2.71s/it]
653
+
654
+
655
+
656
+
657
+
658
+
659
+
660
+ Training...: 4001it [31:16, 2.00s/it]
661
+
662
+
663
+
664
+
665
+
666
+
667
+ Training...: 4049it [31:28, 4.39it/s]
668
+
669
+
670
+
671
+
672
+
673
+
674
+
675
+
676
+ Training...: 4102it [31:57, 1.50s/it]
677
+
678
+
679
+
680
+
681
+
682
+
683
+
684
+ Training...: 4153it [32:17, 1.03s/it]
685
+
686
+
687
+
688
+
689
+
690
+
691
+ Training...: 4199it [32:28, 3.92it/s]
692
+
693
+
694
+
695
+
696
+
697
+
698
+
699
+
700
+ Training...: 4255it [32:58, 1.53it/s]
701
+
702
+
703
+
704
+
705
+
706
+
707
+ Training...: 4299it [33:09, 4.63it/s]
708
+
709
+
710
+
711
+
712
+
713
+
714
+
715
+ Training...: 4356it [33:39, 1.95it/s]
716
+
717
+
718
+
719
+
720
+
721
+
722
+ Training...: 4407it [33:59, 2.24it/s]
723
+
724
+
725
+
726
+
727
+
728
+
729
+ Training...: 4449it [34:10, 3.99it/s]
730
+
731
+
732
+
733
+
734
+
735
+
736
+
737
+
738
+ Training...: 4501it [34:38, 2.05s/it]
739
+
740
+
741
+
742
+
743
+
744
+
745
+
746
+ Training...: 4551it [34:58, 1.99s/it]
747
+
748
+
749
+
750
+
751
+
752
+
753
+ Training...: 4599it [35:10, 4.03it/s]
754
+
755
+
756
+
757
+
758
+
759
+
760
+
761
+ Training...: 4649it [35:30, 3.36it/s]
762
+
763
+
764
+
765
+
766
+
767
+
768
+
769
+ Training...: 4699it [35:50, 3.58it/s]
770
+
771
+
772
+
773
+
774
+
775
+
776
+
777
+ Training...: 4749it [36:11, 3.65it/s]
778
+
779
+
780
+
781
+
782
+
783
+
784
+ Training...: 4799it [36:30, 4.12it/s]
785
+
786
+
787
+
788
+
789
+
790
+
791
+ Training...: 4849it [36:51, 4.20it/s]
792
+
793
+
794
+
795
+
796
+
797
+
798
+
799
+ Training...: 4901it [37:19, 1.95s/it]
800
+
801
+
802
+
803
+
804
+
805
+
806
+ Training...: 4949it [37:31, 4.48it/s]
807
+
808
+
809
+
810
+
811
+
812
+
813
+
814
+ Training...: 4999it [37:51, 4.76it/s]
815
+
816
+
817
+
818
+
819
+
820
+
821
+
822
+ Training...: 5049it [38:11, 4.47it/s]
823
+
824
+
825
+
826
+
827
+
828
+
829
+ Training...: 5099it [38:31, 4.10it/s]
830
+
831
+
832
+
833
+
834
+
835
+
836
+ Training...: 5149it [38:51, 5.07it/s]
837
+
838
+
839
+
840
+
841
+
842
+
843
+
844
+ Training...: 5199it [39:12, 4.14it/s]
845
+
846
+
847
+
848
+
849
+
850
+
851
+
852
+ Training...: 5249it [39:32, 4.37it/s]
853
+
854
+
855
+
856
+
857
+
858
+
859
+
860
+ Training...: 5300it [40:01, 2.86s/it]
861
+
862
+
863
+
864
+
865
+
866
+
867
+ Training...: 5349it [40:13, 4.06it/s]
868
+
869
+
870
+
871
+
872
+
873
+
874
+
875
+ Training...: 5399it [40:33, 4.06it/s]
876
+
877
+
878
+
879
+
880
+
881
+
882
+
883
+ Training...: 5449it [40:53, 4.52it/s]
884
+
885
+
886
+
887
+
888
+
889
+
890
+
891
+ Training...: 5499it [41:13, 4.03it/s]
892
+
893
+
894
+
895
+
896
+
897
+
898
+
899
+ Training...: 5549it [41:33, 4.36it/s]
900
+
901
+
902
+
903
+
904
+
905
+
906
+ Training...: 5599it [41:53, 4.64it/s]
907
+
908
+
909
+
910
+
911
+
912
+
913
+
914
+ Training...: 5649it [42:14, 3.88it/s]
915
+
916
+
917
+
918
+
919
+
920
+
921
+ Training...: 5699it [42:34, 4.44it/s]
922
+
923
+
924
+
925
+
926
+
927
+
928
+
929
+ Training...: 5758it [43:05, 2.51it/s]
930
+
931
+
932
+
933
+
934
+
935
+ Training...: 5799it [43:15, 4.46it/s]
936
+
937
+
938
+
939
+
940
+
941
+
942
+ Training...: 5849it [43:34, 3.52it/s]
943
+
944
+
945
+
946
+
947
+
948
+
949
+ Training...: 5899it [43:54, 3.85it/s]
950
+
951
+
952
+
953
+
954
+
955
+
956
+
957
+ Training...: 5949it [44:15, 3.37it/s]
958
+
959
+
960
+
961
+
962
+
963
+
964
+
965
+ Training...: 5999it [44:35, 4.29it/s]
966
+
967
+
968
+
969
+
970
+
971
+
972
+
973
+ Training...: 6054it [45:05, 1.10it/s]
974
+
975
+
976
+
977
+
978
+
979
+ Training...: 6099it [45:15, 4.25it/s]
980
+
981
+
982
+
983
+
984
+
985
+
986
+
987
+ Training...: 6149it [45:36, 3.81it/s]
988
+
989
+
990
+
991
+
992
+
993
+
994
+
995
+ Training...: 6206it [46:06, 1.92it/s]
996
+
997
+
998
+
999
+
1000
+
1001
+ Training...: 6249it [46:16, 4.23it/s]
1002
+
1003
+
1004
+
1005
+
1006
+
1007
+
1008
+
1009
+ Training...: 6300it [46:45, 2.89s/it]
1010
+
1011
+
1012
+
1013
+
1014
+
1015
+ Training...: 6349it [46:55, 5.06it/s]
1016
+
1017
+
1018
+
1019
+
1020
+
1021
+
1022
+ Training...: 6399it [47:17, 4.44it/s]
1023
+
1024
+
1025
+
1026
+
1027
+
1028
+
1029
+ Training...: 6449it [47:36, 4.63it/s]
1030
+
1031
+
1032
+
1033
+
1034
+
1035
+
1036
+
1037
+ Training...: 6499it [47:57, 5.20it/s]
1038
+
1039
+
1040
+
1041
+
1042
+
1043
+
1044
+ Training...: 6549it [48:17, 4.47it/s]
1045
+
1046
+
1047
+
1048
+
1049
+
1050
+
1051
+ Training...: 6599it [48:37, 4.40it/s]
1052
+
1053
+
1054
+
1055
+
1056
+
1057
+
1058
+
1059
+ Training...: 6649it [48:58, 3.63it/s]
1060
+
1061
+
1062
+
1063
+
1064
+
1065
+
1066
+ Training...: 6699it [49:17, 5.70it/s]
1067
+
1068
+
1069
+
1070
+
1071
+
1072
+
1073
+ Training...: 6749it [49:37, 4.09it/s]
1074
+
1075
+
1076
+
1077
+
1078
+
1079
+
1080
+ Training...: 6799it [49:58, 5.26it/s]
1081
+
1082
+
1083
+
1084
+
1085
+
1086
+
1087
+ Training...: 6849it [50:18, 4.74it/s]
1088
+
1089
+
1090
+
1091
+
1092
+
1093
+
1094
+ Training...: 6899it [50:38, 4.45it/s]
1095
+
1096
+
1097
+
1098
+
1099
+
1100
+
1101
+ Training...: 6949it [50:58, 4.79it/s]
1102
+
1103
+
1104
+
1105
+
1106
+
1107
+
1108
+ Training...: 6999it [51:18, 4.90it/s]
1109
+
1110
+
1111
+
1112
+
1113
+
1114
+
1115
+
1116
+ Training...: 7050it [51:48, 2.98s/it]
1117
+
1118
+
1119
+
1120
+
1121
+
1122
+
1123
+
1124
+ Training...: 7103it [52:08, 1.06s/it]
1125
+
1126
+
1127
+
1128
+
1129
+
1130
+
1131
+ Training...: 7151it [52:28, 2.23s/it]
1132
+
1133
+
1134
+
1135
+
1136
+
1137
+
1138
+ Training...: 7199it [52:39, 4.10it/s]
1139
+
1140
+
1141
+
1142
+
1143
+
1144
+
1145
+ Training...: 7249it [52:59, 4.83it/s]
1146
+
1147
+
1148
+
1149
+
1150
+
1151
+
1152
+ Training...: 7299it [53:19, 3.87it/s]
1153
+
1154
+
1155
+
1156
+
1157
+
1158
+
1159
+ Training...: 7349it [53:39, 3.92it/s]
1160
+
1161
+
1162
+
1163
+
1164
+
1165
+
1166
+ Training...: 7399it [53:59, 5.25it/s]
1167
+
1168
+
1169
+
1170
+
1171
+
1172
+
1173
+ Training...: 7449it [54:20, 4.91it/s]
1174
+
1175
+
1176
+
1177
+
1178
+
1179
+
1180
+ Training...: 7499it [54:41, 4.09it/s]
1181
+
1182
+
1183
+
1184
+
1185
+
1186
+
1187
+ Training...: 7549it [55:01, 4.48it/s]
1188
+
1189
+
1190
+
1191
+
1192
+
1193
+
1194
+ Training...: 7599it [55:21, 4.50it/s]
1195
+
1196
+
1197
+
1198
+
1199
+
1200
+
1201
+ Training...: 7649it [55:42, 3.76it/s]
1202
+
1203
+
1204
+
1205
+
1206
+
1207
+
1208
+ Training...: 7699it [56:01, 4.86it/s]
1209
+
1210
+
1211
+
1212
+
1213
+
1214
+
1215
+ Training...: 7749it [56:21, 4.62it/s]
1216
+
1217
+
1218
+
1219
+
1220
+
1221
+
1222
+
1223
+ Training...: 7800it [56:50, 2.99s/it]
1224
+
1225
+
1226
+
1227
+
1228
+
1229
+
1230
+
1231
+ Training...: 7851it [57:11, 2.11s/it]
1232
+
1233
+
1234
+
1235
+
1236
+
1237
+
1238
+
1239
+ Training...: 7901it [57:31, 2.21s/it]
1240
+
1241
+
1242
+
1243
+
1244
+
1245
+
1246
+ Training...: 7949it [57:42, 3.98it/s]
1247
+
1248
+
1249
+
1250
+
1251
+
1252
+
1253
+ Training...: 7999it [58:02, 4.47it/s]
1254
+
1255
+
1256
+
1257
+
1258
+
1259
+
1260
+ Training...: 8049it [58:22, 4.42it/s]
1261
+
1262
+
1263
+
1264
+
1265
+
1266
+
1267
+ Training...: 8099it [58:42, 5.26it/s]
1268
+
1269
+
1270
+
1271
+
1272
+
1273
+
1274
+ Training...: 8149it [59:03, 4.28it/s]
1275
+
1276
+
1277
+
1278
+
1279
+
1280
+
1281
+ Training...: 8199it [59:23, 4.92it/s]
1282
+
1283
+
1284
+
1285
+
1286
+
1287
+
1288
+ Training...: 8249it [59:43, 4.31it/s]
1289
+
1290
+
1291
+
1292
+
1293
+
1294
+
1295
+ Training...: 8299it [1:00:02, 4.23it/s]
1296
+
1297
+
1298
+
1299
+
1300
+
1301
+
1302
+ Training...: 8349it [1:00:24, 4.62it/s]
1303
+
1304
+
1305
+
1306
+
1307
+
1308
+
1309
+ Training...: 8399it [1:00:43, 4.49it/s]
1310
+
1311
+
1312
+
1313
+
1314
+
1315
+
1316
+
1317
+ Training...: 8450it [1:01:13, 3.12s/it]
1318
+
1319
+
1320
+
1321
+
1322
+
1323
+
1324
+ Training...: 8499it [1:01:24, 4.54it/s]
1325
+
1326
+
1327
+
1328
+
1329
+
1330
+
1331
+
1332
+ Training...: 8550it [1:01:53, 2.94s/it]
1333
+
1334
+ Training...: 8602it [1:02:14, 1.53s/it]
+ Training...: 8649it [1:02:24, 5.00it/s]
+ Training...: 8699it [1:02:45, 4.03it/s]
+ Training...: 8749it [1:03:04, 4.99it/s]
+ Training...: 8799it [1:03:24, 4.86it/s]
+ Training...: 8849it [1:03:45, 4.86it/s]
+ Training...: 8899it [1:04:04, 5.70it/s]
+ Training...: 8949it [1:04:25, 4.74it/s]
+ Training...: 8999it [1:04:45, 3.77it/s]
+ Training...: 9049it [1:05:05, 4.84it/s]
+ Training...: 9099it [1:05:25, 5.31it/s]
+ Training...: 9149it [1:05:46, 5.07it/s]
+ Training...: 9199it [1:06:06, 3.92it/s]
+ Training...: 9249it [1:06:27, 5.23it/s]
+ Training...: 9300it [1:06:56, 3.00s/it]
+ Training...: 9349it [1:07:06, 4.30it/s]
+ Training...: 9401it [1:07:36, 2.25s/it]
+ Training...: 9451it [1:07:57, 2.31s/it]
+ Training...: 9500it [1:08:17, 3.12s/it]
+ Training...: 9549it [1:08:28, 5.13it/s]
+ Training...: 9599it [1:08:47, 5.11it/s]
+ Training...: 9649it [1:09:07, 4.70it/s]
+ Training...: 9699it [1:09:28, 4.87it/s]
+ Training...: 9749it [1:09:47, 4.71it/s]
+ Training...: 9800it [1:10:18, 3.26s/it]
+ Training...: 9851it [1:10:38, 2.35s/it]
+ Training...: 9900it [1:10:58, 2.97s/it]
+ Training...: 9950it [1:11:18, 2.99s/it]
+ Training...: 9999it [1:11:39, 4.76it/s]
+ Step... (340000 | Loss: 2.0486693382263184, Learning Rate: 2.29339420911856e-05)
+ ██████████| 500/500 [01:17<00:00, 7.90it/s]
+ Training...: 10049it [1:13:11, 3.97it/s]
+ Training...: 10099it [1:13:31, 3.23it/s]
+ Training...: 10151it [1:14:01, 2.26s/it]
+ Training...: 10203it [1:14:22, 1.23s/it]
+ Training...: 10252it [1:14:42, 1.59s/it]
+ Training...: 10301it [1:15:02, 2.32s/it]
+ Training...: 10353it [1:15:22, 1.25s/it]
+ Training...: 10402it [1:15:42, 1.64s/it]
+ Training...: 10449it [1:15:53, 4.42it/s]
+ Training...: 10499it [1:16:12, 4.34it/s]
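The output.log excerpt above interleaves three streams: the tqdm training bar, a `Step... (step | Loss, Learning Rate)` summary line, and a 500-batch evaluation bar that fires at step 340000 (matching `--eval_steps=10000`). Below is a minimal sketch of the loop shape that produces such output; it is not the actual run_mlm_flax_no_accum.py — the `train_step`/`eval_step` bodies are stubs, and the bar labels and `start` offset are illustrative.

```python
# Minimal sketch of the logging cadence seen in output.log above.
# train_step/eval_step are stand-ins, not the real pmapped functions.
from tqdm import tqdm

logging_steps, eval_steps, num_eval_batches = 50, 10_000, 500

def train_step(step: int) -> dict:
    # Stand-in for the real update; returns the metrics the log prints.
    return {"loss": 2.0487, "learning_rate": 2.2934e-05}

def eval_step(batch: int) -> None:
    pass  # stand-in for a forward pass on one evaluation batch

start = 330_001  # illustrative offset for a resumed run
for step in tqdm(range(start, start + 10_500), desc="Training..."):
    metrics = train_step(step)
    if step % logging_steps == 0:
        print(f"Step... ({step} | Loss: {metrics['loss']}, "
              f"Learning Rate: {metrics['learning_rate']})")
    if step % eval_steps == 0:
        # The 500/500 bar above comes from a pass like this one.
        for b in tqdm(range(num_eval_batches), desc="Evaluating ..."):
            eval_step(b)
```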
wandb/run-20210716_223350-8eukt20m/files/requirements.txt ADDED
@@ -0,0 +1,95 @@
+ absl-py==0.13.0
+ aiohttp==3.7.4.post0
+ astunparse==1.6.3
+ async-timeout==3.0.1
+ attrs==21.2.0
+ cachetools==4.2.2
+ certifi==2021.5.30
+ chardet==4.0.0
+ charset-normalizer==2.0.1
+ chex==0.0.8
+ click==8.0.1
+ configparser==5.0.2
+ cycler==0.10.0
+ datasets==1.9.1.dev0
+ dill==0.3.4
+ dm-tree==0.1.6
+ docker-pycreds==0.4.0
+ filelock==3.0.12
+ flatbuffers==1.12
+ flax==0.3.4
+ fsspec==2021.7.0
+ gast==0.4.0
+ gitdb==4.0.7
+ gitpython==3.1.18
+ google-auth-oauthlib==0.4.4
+ google-auth==1.32.1
+ google-pasta==0.2.0
+ grpcio==1.34.1
+ h5py==3.1.0
+ huggingface-hub==0.0.12
+ idna==3.2
+ install==1.3.4
+ jax==0.2.17
+ jaxlib==0.1.68
+ joblib==1.0.1
+ keras-nightly==2.5.0.dev2021032900
+ keras-preprocessing==1.1.2
+ kiwisolver==1.3.1
+ libtpu-nightly==0.1.dev20210615
+ markdown==3.3.4
+ matplotlib==3.4.2
+ msgpack==1.0.2
+ multidict==5.1.0
+ multiprocess==0.70.12.2
+ numpy==1.19.5
+ oauthlib==3.1.1
+ opt-einsum==3.3.0
+ optax==0.0.9
+ packaging==21.0
+ pandas==1.3.0
+ pathtools==0.1.2
+ pillow==8.3.1
+ pip==20.0.2
+ pkg-resources==0.0.0
+ promise==2.3
+ protobuf==3.17.3
+ psutil==5.8.0
+ pyarrow==4.0.1
+ pyasn1-modules==0.2.8
+ pyasn1==0.4.8
+ pyparsing==2.4.7
+ python-dateutil==2.8.1
+ pytz==2021.1
+ pyyaml==5.4.1
+ regex==2021.7.6
+ requests-oauthlib==1.3.0
+ requests==2.26.0
+ rsa==4.7.2
+ sacremoses==0.0.45
+ scipy==1.7.0
+ sentry-sdk==1.3.0
+ setuptools==44.0.0
+ shortuuid==1.0.1
+ six==1.15.0
+ smmap==4.0.0
+ subprocess32==3.5.4
+ tensorboard-data-server==0.6.1
+ tensorboard-plugin-wit==1.8.0
+ tensorboard==2.5.0
+ tensorflow-estimator==2.5.0
+ tensorflow==2.5.0
+ termcolor==1.1.0
+ tokenizers==0.10.3
+ toolz==0.11.1
+ torch==1.9.0
+ tqdm==4.61.2
+ transformers==4.9.0.dev0
+ typing-extensions==3.7.4.3
+ urllib3==1.26.6
+ wandb==0.10.33
+ werkzeug==2.0.1
+ wheel==0.36.2
+ wrapt==1.12.1
+ xxhash==2.0.2
+ yarl==1.6.3
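The pins above describe the full TPU-VM environment: JAX/Flax (`jax==0.2.17`, `flax==0.3.4`, `optax==0.0.9`) for training, `torch==1.9.0` presumably for exporting pytorch_model.bin, and `wandb==0.10.33` for logging; note that `transformers==4.9.0.dev0` is a development build and will not install from PyPI by that pin alone. A hypothetical stdlib-only spot-check that a recreated environment matches a few key pins:

```python
# Hypothetical spot-check against the pins above; the subset is illustrative.
from importlib.metadata import PackageNotFoundError, version  # stdlib on 3.8+

PINS = {"jax": "0.2.17", "jaxlib": "0.1.68", "flax": "0.3.4",
        "optax": "0.0.9", "wandb": "0.10.33", "torch": "1.9.0"}

for name, expected in PINS.items():
    try:
        found = version(name)
    except PackageNotFoundError:
        found = None
    status = "ok" if found == expected else f"MISMATCH (want {expected})"
    print(f"{name}=={found}: {status}")
```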
wandb/run-20210716_223350-8eukt20m/files/wandb-metadata.json ADDED
@@ -0,0 +1,45 @@
+ {
+   "os": "Linux-5.4.0-1043-gcp-x86_64-with-glibc2.29",
+   "python": "3.8.10",
+   "heartbeatAt": "2021-07-16T22:33:52.760670",
+   "startedAt": "2021-07-16T22:33:50.716895",
+   "docker": null,
+   "cpu_count": 96,
+   "cuda": null,
+   "args": [
+     "--push_to_hub",
+     "--output_dir=./",
+     "--model_type=big_bird",
+     "--config_name=./",
+     "--tokenizer_name=./",
+     "--max_seq_length=4096",
+     "--weight_decay=0.0095",
+     "--warmup_steps=10000",
+     "--overwrite_output_dir",
+     "--adam_beta1=0.9",
+     "--adam_beta2=0.98",
+     "--logging_steps=50",
+     "--eval_steps=10000",
+     "--num_train_epochs=4",
+     "--preprocessing_num_workers=96",
+     "--save_steps=15000",
+     "--learning_rate=3e-5",
+     "--per_device_train_batch_size=1",
+     "--per_device_eval_batch_size=1",
+     "--save_total_limit=50",
+     "--max_eval_samples=4000",
+     "--resume_from_checkpoint=./"
+   ],
+   "state": "running",
+   "program": "./run_mlm_flax_no_accum.py",
+   "codePath": "run_mlm_flax_no_accum.py",
+   "git": {
+     "remote": "https://huggingface.co/flax-community/pino-roberta-base",
+     "commit": "def9a456105f36b517155343f42ff643df2d20ce"
+   },
+   "email": null,
+   "root": "/home/dat/pino-roberta-base",
+   "host": "t1v-n-f5c06ea1-w-0",
+   "username": "dat",
+   "executable": "/home/dat/pino/bin/python"
+ }
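The `args` array records the exact invocation: BigBird with `max_seq_length` 4096, per-device batch size 1, peak learning rate 3e-5 after 10000 warmup steps, evaluation every 10000 steps, checkpoints every 15000, resuming from the checkpoint in `./`. A minimal sketch of how such flags are typically consumed, assuming the usual `HfArgumentParser` pattern under the pinned transformers; the `ModelArguments` dataclass here is a stub, not the script's real one:

```python
# Sketch of the usual HfArgumentParser pattern; ModelArguments is a stub and
# the flag subset is abbreviated from the full args list above.
from dataclasses import dataclass
from typing import Optional
from transformers import HfArgumentParser, TrainingArguments

@dataclass
class ModelArguments:
    model_type: Optional[str] = None
    config_name: Optional[str] = None
    tokenizer_name: Optional[str] = None

parser = HfArgumentParser((ModelArguments, TrainingArguments))
model_args, training_args = parser.parse_args_into_dataclasses([
    "--output_dir=./", "--model_type=big_bird", "--config_name=./",
    "--tokenizer_name=./", "--learning_rate=3e-5", "--warmup_steps=10000",
    "--per_device_train_batch_size=1", "--num_train_epochs=4",
])
print(model_args.model_type, training_args.learning_rate)  # big_bird 3e-05
```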
wandb/run-20210716_223350-8eukt20m/files/wandb-summary.json ADDED
@@ -0,0 +1 @@
+ {"training_step": 340500, "learning_rate": 2.2923233700566925e-05, "train_loss": 1.8997719287872314, "_runtime": 4935, "_timestamp": 1626479765, "_step": 210, "eval_step": 340000, "eval_accuracy": 0.6408576965332031, "eval_loss": 1.8213207721710205}
wandb/run-20210716_223350-8eukt20m/logs/debug-internal.log ADDED
The diff for this file is too large to render. See raw diff
wandb/run-20210716_223350-8eukt20m/logs/debug.log ADDED
@@ -0,0 +1,26 @@
+ 2021-07-16 22:33:50,718 INFO MainThread:798495 [wandb_setup.py:_flush():69] setting env: {}
+ 2021-07-16 22:33:50,718 INFO MainThread:798495 [wandb_setup.py:_flush():69] setting login settings: {}
+ 2021-07-16 22:33:50,718 INFO MainThread:798495 [wandb_init.py:_log_setup():337] Logging user logs to /home/dat/pino-roberta-base/wandb/run-20210716_223350-8eukt20m/logs/debug.log
+ 2021-07-16 22:33:50,718 INFO MainThread:798495 [wandb_init.py:_log_setup():338] Logging internal logs to /home/dat/pino-roberta-base/wandb/run-20210716_223350-8eukt20m/logs/debug-internal.log
+ 2021-07-16 22:33:50,718 INFO MainThread:798495 [wandb_init.py:init():370] calling init triggers
+ 2021-07-16 22:33:50,719 INFO MainThread:798495 [wandb_init.py:init():375] wandb.init called with sweep_config: {}
+ config: {}
+ 2021-07-16 22:33:50,719 INFO MainThread:798495 [wandb_init.py:init():419] starting backend
+ 2021-07-16 22:33:50,719 INFO MainThread:798495 [backend.py:_multiprocessing_setup():70] multiprocessing start_methods=fork,spawn,forkserver, using: spawn
+ 2021-07-16 22:33:50,767 INFO MainThread:798495 [backend.py:ensure_launched():135] starting backend process...
+ 2021-07-16 22:33:50,816 INFO MainThread:798495 [backend.py:ensure_launched():139] started backend process with pid: 799749
+ 2021-07-16 22:33:50,818 INFO MainThread:798495 [wandb_init.py:init():424] backend started and connected
+ 2021-07-16 22:33:50,821 INFO MainThread:798495 [wandb_init.py:init():472] updated telemetry
+ 2021-07-16 22:33:50,822 INFO MainThread:798495 [wandb_init.py:init():491] communicating current version
+ 2021-07-16 22:33:51,460 INFO MainThread:798495 [wandb_init.py:init():496] got version response upgrade_message: "wandb version 0.11.0 is available! To upgrade, please run:\n $ pip install wandb --upgrade"
+ 
+ 2021-07-16 22:33:51,460 INFO MainThread:798495 [wandb_init.py:init():504] communicating run to backend with 30 second timeout
+ 2021-07-16 22:33:51,635 INFO MainThread:798495 [wandb_init.py:init():529] starting run threads in backend
+ 2021-07-16 22:33:52,798 INFO MainThread:798495 [wandb_run.py:_console_start():1623] atexit reg
+ 2021-07-16 22:33:52,799 INFO MainThread:798495 [wandb_run.py:_redirect():1497] redirect: SettingsConsole.REDIRECT
+ 2021-07-16 22:33:52,799 INFO MainThread:798495 [wandb_run.py:_redirect():1502] Redirecting console.
+ 2021-07-16 22:33:52,801 INFO MainThread:798495 [wandb_run.py:_redirect():1558] Redirects installed.
+ 2021-07-16 22:33:52,801 INFO MainThread:798495 [wandb_init.py:init():554] run started, returning control to user process
+ 2021-07-16 22:33:52,807 INFO MainThread:798495 [wandb_run.py:_config_callback():872] config_cb None None {'output_dir': './', 'overwrite_output_dir': True, 'do_train': False, 'do_eval': False, 'do_predict': False, 'evaluation_strategy': 'IntervalStrategy.NO', 'prediction_loss_only': False, 'per_device_train_batch_size': 1, 'per_device_eval_batch_size': 1, 'per_gpu_train_batch_size': None, 'per_gpu_eval_batch_size': None, 'gradient_accumulation_steps': 1, 'eval_accumulation_steps': None, 'learning_rate': 3e-05, 'weight_decay': 0.0095, 'adam_beta1': 0.9, 'adam_beta2': 0.98, 'adam_epsilon': 1e-08, 'max_grad_norm': 1.0, 'num_train_epochs': 4.0, 'max_steps': -1, 'lr_scheduler_type': 'SchedulerType.LINEAR', 'warmup_ratio': 0.0, 'warmup_steps': 10000, 'log_level': -1, 'log_level_replica': -1, 'log_on_each_node': True, 'logging_dir': './runs/Jul16_22-33-42_t1v-n-f5c06ea1-w-0', 'logging_strategy': 'IntervalStrategy.STEPS', 'logging_first_step': False, 'logging_steps': 50, 'save_strategy': 'IntervalStrategy.STEPS', 'save_steps': 15000, 'save_total_limit': 50, 'save_on_each_node': False, 'no_cuda': False, 'seed': 42, 'fp16': False, 'fp16_opt_level': 'O1', 'fp16_backend': 'auto', 'fp16_full_eval': False, 'local_rank': -1, 'tpu_num_cores': None, 'tpu_metrics_debug': False, 'debug': [], 'dataloader_drop_last': False, 'eval_steps': 10000, 'dataloader_num_workers': 0, 'past_index': -1, 'run_name': './', 'disable_tqdm': False, 'remove_unused_columns': True, 'label_names': None, 'load_best_model_at_end': False, 'metric_for_best_model': None, 'greater_is_better': None, 'ignore_data_skip': False, 'sharded_ddp': [], 'deepspeed': None, 'label_smoothing_factor': 0.0, 'adafactor': False, 'group_by_length': False, 'length_column_name': 'length', 'report_to': ['tensorboard', 'wandb'], 'ddp_find_unused_parameters': None, 'dataloader_pin_memory': True, 'skip_memory_metrics': True, 'use_legacy_prediction_loop': False, 'push_to_hub': True, 'resume_from_checkpoint': './', 'push_to_hub_model_id': '', 'push_to_hub_organization': None, 'push_to_hub_token': None, 'mp_parameters': '', '_n_gpu': 0, '__cached__setup_devices': 'cpu'}
+ 2021-07-16 22:33:52,809 INFO MainThread:798495 [wandb_run.py:_config_callback():872] config_cb None None {'model_name_or_path': None, 'model_type': 'big_bird', 'config_name': './', 'tokenizer_name': './', 'cache_dir': None, 'use_fast_tokenizer': True, 'dtype': 'float32'}
+ 2021-07-16 22:33:52,811 INFO MainThread:798495 [wandb_run.py:_config_callback():872] config_cb None None {'dataset_name': None, 'dataset_config_name': None, 'train_ref_file': None, 'validation_ref_file': None, 'overwrite_cache': False, 'validation_split_percentage': 5, 'max_seq_length': 4096, 'preprocessing_num_workers': 96, 'mlm_probability': 0.15, 'pad_to_max_length': False, 'line_by_line': False, 'max_eval_samples': 4000}
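The three `config_cb` entries at the end record the script pushing its training, model, and data arguments into the run config immediately after `wandb.init`. A hedged sketch of the equivalent explicit calls, with each dict abbreviated to a few keys taken from the log; the project name and offline mode are again assumptions:

```python
# Hedged sketch of the config updates the three config_cb lines record.
import os
os.environ["WANDB_MODE"] = "offline"  # keep the sketch self-contained

import wandb

run = wandb.init(project="pino-roberta-base")  # project name is an assumption
run.config.update({"learning_rate": 3e-05, "weight_decay": 0.0095,
                   "warmup_steps": 10000, "save_steps": 15000})    # training args
run.config.update({"model_type": "big_bird", "dtype": "float32"})  # model args
run.config.update({"max_seq_length": 4096, "mlm_probability": 0.15,
                   "max_eval_samples": 4000})                      # data args
```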
wandb/run-20210716_223350-8eukt20m/run-8eukt20m.wandb ADDED
Binary file (225 kB). View file