dat committed
Commit 25565f9
1 parent: def9a45

update readme and pt model
Files changed (40)
  1. README.md +64 -0
  2. events.out.tfevents.1626429561.t1v-n-f5c06ea1-w-0.782479.3.v2 +2 -2
  3. events.out.tfevents.1626474327.t1v-n-f5c06ea1-w-0.794570.3.v2 +3 -0
  4. events.out.tfevents.1626474410.t1v-n-f5c06ea1-w-0.796231.3.v2 +3 -0
  5. events.out.tfevents.1626474829.t1v-n-f5c06ea1-w-0.798495.3.v2 +3 -0
  6. pytorch_model.bin +1 -1
  7. run.sh +3 -3
  8. run_mlm_flax_no_accum.py +3 -2
  9. wandb/debug-internal.log +1 -1
  10. wandb/debug.log +1 -1
  11. wandb/latest-run +1 -1
  12. wandb/run-20210716_095921-13hxxunp/files/output.log +17 -0
  13. wandb/run-20210716_095921-13hxxunp/files/wandb-summary.json +1 -1
  14. wandb/run-20210716_095921-13hxxunp/logs/debug-internal.log +64 -0
  15. wandb/run-20210716_095921-13hxxunp/logs/debug.log +2 -0
  16. wandb/run-20210716_095921-13hxxunp/run-13hxxunp.wandb +0 -0
  17. wandb/run-20210716_222528-3qk3dij4/files/config.yaml +308 -0
  18. wandb/run-20210716_222528-3qk3dij4/files/output.log +6 -0
  19. wandb/run-20210716_222528-3qk3dij4/files/requirements.txt +95 -0
  20. wandb/run-20210716_222528-3qk3dij4/files/wandb-metadata.json +45 -0
  21. wandb/run-20210716_222528-3qk3dij4/files/wandb-summary.json +1 -0
  22. wandb/run-20210716_222528-3qk3dij4/logs/debug-internal.log +54 -0
  23. wandb/run-20210716_222528-3qk3dij4/logs/debug.log +28 -0
  24. wandb/run-20210716_222528-3qk3dij4/run-3qk3dij4.wandb +0 -0
  25. wandb/run-20210716_222651-1lrzcta0/files/config.yaml +308 -0
  26. wandb/run-20210716_222651-1lrzcta0/files/output.log +8 -0
  27. wandb/run-20210716_222651-1lrzcta0/files/requirements.txt +95 -0
  28. wandb/run-20210716_222651-1lrzcta0/files/wandb-metadata.json +45 -0
  29. wandb/run-20210716_222651-1lrzcta0/files/wandb-summary.json +1 -0
  30. wandb/run-20210716_222651-1lrzcta0/logs/debug-internal.log +111 -0
  31. wandb/run-20210716_222651-1lrzcta0/logs/debug.log +28 -0
  32. wandb/run-20210716_222651-1lrzcta0/run-1lrzcta0.wandb +0 -0
  33. wandb/run-20210716_223350-8eukt20m/files/config.yaml +308 -0
  34. wandb/run-20210716_223350-8eukt20m/files/output.log +1646 -0
  35. wandb/run-20210716_223350-8eukt20m/files/requirements.txt +95 -0
  36. wandb/run-20210716_223350-8eukt20m/files/wandb-metadata.json +45 -0
  37. wandb/run-20210716_223350-8eukt20m/files/wandb-summary.json +1 -0
  38. wandb/run-20210716_223350-8eukt20m/logs/debug-internal.log +0 -0
  39. wandb/run-20210716_223350-8eukt20m/logs/debug.log +26 -0
  40. wandb/run-20210716_223350-8eukt20m/run-8eukt20m.wandb +0 -0
README.md ADDED
@@ -0,0 +1,64 @@
+---
+language: nl
+datasets:
+- mC4
+- Dutch_news
+---
+
+# Pino (BigBird) base model
+
+Dat Nguyen & Yeb Havinga
+
+BigBird is a sparse-attention-based transformer that extends Transformer-based models, such as BERT, to much longer sequences. Moreover, BigBird comes with a theoretical understanding of which capabilities of a complete transformer the sparse model can retain.
+
+It is pretrained on Dutch using a masked language modeling (MLM) objective. BigBird was introduced in this [paper](https://arxiv.org/abs/2007.14062) and first released in this [repository](https://github.com/google-research/bigbird).
+
+## Model description
+
+BigBird relies on **block sparse attention** instead of normal attention (i.e. BERT's attention) and can handle sequences of up to 4096 tokens at a much lower compute cost than BERT. It has achieved state-of-the-art results on various tasks involving very long sequences, such as long-document summarization and question answering with long contexts.
+
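+To make the compute saving concrete, here is a rough back-of-the-envelope comparison. The block-sparse cost model below is a simplification of the paper's attention pattern (a 3-block sliding window, `num_random_blocks` random blocks, 2 global blocks), not the exact kernel cost:
+
+```python
+seq_len, block_size, num_random_blocks = 4096, 64, 3
+
+# full attention: every query position attends to every key position
+full_scores = seq_len * seq_len
+
+# block sparse attention (rough model): each query block attends to a
+# sliding window of 3 blocks, plus random and global blocks
+num_blocks = seq_len // block_size
+attended_blocks = 3 + num_random_blocks + 2
+sparse_scores = num_blocks * attended_blocks * block_size * block_size
+
+print(full_scores / sparse_scores)  # => 8.0, i.e. ~8x fewer attention scores
+```
+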
+## How to use
+
+Here is how to use this model to get the features of a given text in PyTorch:
+
+```python
+from transformers import BigBirdModel
+
+# by default the model is in `block_sparse` mode with num_random_blocks=3, block_size=64
+model = BigBirdModel.from_pretrained("flax-community/pino-roberta-base")
+
+# you can change `attention_type` to full attention like this:
+model = BigBirdModel.from_pretrained("flax-community/pino-roberta-base", attention_type="original_full")
+
+# you can change `block_size` & `num_random_blocks` like this:
+model = BigBirdModel.from_pretrained("flax-community/pino-roberta-base", block_size=16, num_random_blocks=2)
+```
+
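+For an end-to-end feature-extraction call, a minimal sketch (this assumes the repository ships a compatible fast tokenizer that `AutoTokenizer` can load; the example sentence is arbitrary):
+
+```python
+import torch
+from transformers import AutoTokenizer, BigBirdModel
+
+tokenizer = AutoTokenizer.from_pretrained("flax-community/pino-roberta-base")
+model = BigBirdModel.from_pretrained("flax-community/pino-roberta-base")
+
+# encode a (potentially very long) Dutch text; the model accepts up to 4096 tokens
+inputs = tokenizer("Het is vandaag een mooie dag.", return_tensors="pt")
+with torch.no_grad():
+    outputs = model(**inputs)
+
+# one feature vector per token
+features = outputs.last_hidden_state  # shape: (1, sequence_length, hidden_size)
+```
+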
+## Training Data
+
+This model was pre-trained on publicly available data: the **mC4** dataset and **Dutch news** scraped from NRC and Nu.nl. It uses the fast universal Byte-level BPE (BBPE) tokenizer and the same style of vocabulary as RoBERTa (which is in turn borrowed from GPT-2), rather than a SentencePiece tokenizer.
+
+## Training Procedure
+
+The data was cleaned as follows (a sketch of these filters appears after the list):
+
+- Remove texts containing HTML code, JavaScript code, lorem ipsum, or policy boilerplate
+- Remove lines without an end mark (sentence-final punctuation)
+- Remove texts and words that are too short
+- Remove texts and words that are too long
+- Remove texts containing bad words
+
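+A minimal sketch of what such filters could look like; it is illustrative only, and the thresholds, end marks, and bad-word list below are assumptions rather than the exact values used:
+
+```python
+import re
+
+END_MARKS = (".", "!", "?", '"')  # assumed sentence-final punctuation
+HTML_JS_RE = re.compile(r"<[a-zA-Z/][^>]*>|</?script|function\s*\(", re.IGNORECASE)
+BAD_WORDS = {"..."}  # placeholder; the actual list is not published here
+
+def keep_line(line: str) -> bool:
+    """Return True if a line survives the cleaning rules above (assumed thresholds)."""
+    words = line.split()
+    if HTML_JS_RE.search(line) or "lorem ipsum" in line.lower():
+        return False  # HTML / JavaScript / lorem ipsum / boilerplate
+    if not line.rstrip().endswith(END_MARKS):
+        return False  # no end mark
+    if not (3 <= len(words) <= 1000):
+        return False  # text too short or too long
+    if any(len(w) > 25 for w in words):
+        return False  # implausibly long "words"
+    if any(w.lower().strip(".,!?") in BAD_WORDS for w in words):
+        return False  # bad words
+    return True
+```
+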
+## BibTeX entry and citation info
+
+```tex
+@misc{zaheer2021big,
+  title={Big Bird: Transformers for Longer Sequences},
+  author={Manzil Zaheer and Guru Guruganesh and Avinava Dubey and Joshua Ainslie and Chris Alberti and Santiago Ontanon and Philip Pham and Anirudh Ravula and Qifan Wang and Li Yang and Amr Ahmed},
+  year={2021},
+  eprint={2007.14062},
+  archivePrefix={arXiv},
+  primaryClass={cs.LG}
+}
+```
events.out.tfevents.1626429561.t1v-n-f5c06ea1-w-0.782479.3.v2 CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:968e47ce5036297240debd5f269c8afd988281dc089a30a1cbea0d3083893fc3
-size 15794596
+oid sha256:d746ea6c7a1002bb44f887ef5f4e46ed8344c8164b769381c0b9da5a54fbacdc
+size 15817156
events.out.tfevents.1626474327.t1v-n-f5c06ea1-w-0.794570.3.v2 ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:3c3120bcdd6eb007b39508ccc87d58e9f28da9c4173915e10832c085a9525130
+size 40
events.out.tfevents.1626474410.t1v-n-f5c06ea1-w-0.796231.3.v2 ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:3c3f249799957fa144a13564f964d2d639a475b7bc1e5567aba7616209a57bb8
+size 40
events.out.tfevents.1626474829.t1v-n-f5c06ea1-w-0.798495.3.v2 ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:4a8aebbd12cebb7e110121ed87c8f1b9dcdef4c391cbb90c97f8e37653c5d3d0
+size 1579452
pytorch_model.bin CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:2bb67c5dbe6876a3e485f3c060c6381b0c583763f51c53e14dfed57d103ca218
+oid sha256:b790aa29e6afdb9ee10d9eb4a2b45f5db49b0f6ebfac95e8c220af4c5c68954f
 size 512555623
run.sh CHANGED
@@ -15,14 +15,14 @@ python ./run_mlm_flax_no_accum.py \
     --adam_beta1="0.9" \
     --adam_beta2="0.98" \
    --logging_steps="50" \
-    --eval_steps="6000" \
-    --num_train_epochs="5" \
+    --eval_steps="10000" \
+    --num_train_epochs="4" \
     --preprocessing_num_workers="96" \
     --save_steps="15000" \
     --learning_rate="3e-5" \
     --per_device_train_batch_size="1" \
     --per_device_eval_batch_size="1" \
-    --save_total_limit="20" \
+    --save_total_limit="50" \
     --max_eval_samples="4000" \
     --resume_from_checkpoint="./" \
     #--gradient_accumulation_steps="4" \
run_mlm_flax_no_accum.py CHANGED
@@ -422,9 +422,9 @@ if __name__ == "__main__":
422
  tokenized_datasets = DatasetDict.load_from_disk("/data/tokenized_data")
423
  logger.info("Setting max validation examples to ")
424
  print(f"Number of validation examples {data_args.max_eval_samples}")
425
- tokenized_datasets["train"]= tokenized_datasets["train"].select(range(int(0.35*len(tokenized_datasets["train"]))))
426
  if data_args.max_eval_samples is not None:
427
- tokenized_datasets["validation"] = tokenized_datasets["validation"].select(range(data_args.max_eval_samples))
428
  else:
429
  if training_args.do_train:
430
  column_names = datasets["train"].column_names
@@ -703,6 +703,7 @@ if __name__ == "__main__":
703
  cur_step = epoch * (num_train_samples // train_batch_size) + step
704
  if cur_step == resume_step:
705
  logging.info('Initial compilation completed.')
 
706
  #if cur_step < resume_step:
707
  # continue
708
 
422
  tokenized_datasets = DatasetDict.load_from_disk("/data/tokenized_data")
423
  logger.info("Setting max validation examples to ")
424
  print(f"Number of validation examples {data_args.max_eval_samples}")
425
+ tokenized_datasets["train"]= tokenized_datasets["train"].select(range(int(0.35*len(tokenized_datasets["train"])),int(0.7*len(tokenized_datasets["train"]))))
426
  if data_args.max_eval_samples is not None:
427
+ tokenized_datasets["validation"] = tokenized_datasets["validation"].select(range(data_args.max_eval_samples,2 * data_args.max_eval_samples))
428
  else:
429
  if training_args.do_train:
430
  column_names = datasets["train"].column_names
703
  cur_step = epoch * (num_train_samples // train_batch_size) + step
704
  if cur_step == resume_step:
705
  logging.info('Initial compilation completed.')
706
+ resume_step = 0
707
  #if cur_step < resume_step:
708
  # continue
709
 
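The data-selection change above moves training from the first 35% of the pre-tokenized corpus to the next window (35%–70%) and shifts the evaluation slice accordingly, while `resume_step = 0` stops the resume logic from skipping further steps once the initial compilation check has passed. A minimal sketch of the windowed `select` pattern on a toy `datasets.Dataset` (the toy data here is made up for illustration):

```python
from datasets import Dataset

ds = Dataset.from_dict({"ids": list(range(100))})

# first shard: rows [0, 35), as in the old code
first = ds.select(range(int(0.35 * len(ds))))
# next shard: rows [35, 70), as in the new code
second = ds.select(range(int(0.35 * len(ds)), int(0.7 * len(ds))))

print(first["ids"][0], first["ids"][-1])    # 0 34
print(second["ids"][0], second["ids"][-1])  # 35 69
```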
wandb/debug-internal.log CHANGED
@@ -1 +1 @@
-run-20210716_095921-13hxxunp/logs/debug-internal.log
+run-20210716_223350-8eukt20m/logs/debug-internal.log
wandb/debug.log CHANGED
@@ -1 +1 @@
-run-20210716_095921-13hxxunp/logs/debug.log
+run-20210716_223350-8eukt20m/logs/debug.log
wandb/latest-run CHANGED
@@ -1 +1 @@
-run-20210716_095921-13hxxunp
+run-20210716_223350-8eukt20m
wandb/run-20210716_095921-13hxxunp/files/output.log CHANGED
@@ -12129,3 +12129,20 @@ Training...: 104999it [12:12:55, 2.71it/s]
 [22:19:21] - INFO - absl - Saved checkpoint at checkpoint_330000
 [22:19:22] - INFO - huggingface_hub.repository - git version 2.25.1
 git-lfs/2.9.2 (GitHub; linux amd64; go 1.13.5)
+[22:19:23] - DEBUG - huggingface_hub.repository - [Repository] is a valid git repo
+[22:20:34] - INFO - huggingface_hub.repository - Uploading LFS objects: 100% (3/3), 2.1 GB | 46 MB/s, done.
+
+
+
+Training...: 105049it [12:15:27, 2.77it/s]
+
+
+
+
+Training...: 105099it [12:15:48, 2.91it/s]
+
+
+
+
+Training...: 105149it [12:16:08, 2.81it/s]
+
wandb/run-20210716_095921-13hxxunp/files/wandb-summary.json CHANGED
@@ -1 +1 @@
-{"training_step": 330000, "learning_rate": 2.4526265406166203e-05, "train_loss": 2.006321430206299, "_runtime": 44391, "_timestamp": 1626473953, "_step": 2117, "eval_step": 330000, "eval_accuracy": 0.6395835280418396, "eval_loss": 1.8301599025726318}
+{"training_step": 330150, "learning_rate": 2.45236988121178e-05, "train_loss": 1.8688039779663086, "_runtime": 44533, "_timestamp": 1626474095, "_step": 2120, "eval_step": 330000, "eval_accuracy": 0.6395835280418396, "eval_loss": 1.8301599025726318}
wandb/run-20210716_095921-13hxxunp/logs/debug-internal.log CHANGED
@@ -26741,3 +26741,67 @@
 2021-07-16 22:19:21,321 DEBUG SenderThread:783720 [sender.py:send_request():193] send_request: stop_status
 2021-07-16 22:19:24,036 INFO Thread-8 :783720 [dir_watcher.py:_on_file_modified():229] file/dir modified: /home/dat/pino-roberta-base/wandb/run-20210716_095921-13hxxunp/files/output.log
 2021-07-16 22:19:26,037 INFO Thread-8 :783720 [dir_watcher.py:_on_file_modified():229] file/dir modified: /home/dat/pino-roberta-base/wandb/run-20210716_095921-13hxxunp/files/output.log
+2021-07-16 22:19:36,453 DEBUG HandlerThread:783720 [handler.py:handle_request():124] handle_request: stop_status
+2021-07-16 22:19:36,565 DEBUG SenderThread:783720 [sender.py:send_request():193] send_request: stop_status
+2021-07-16 22:19:45,717 DEBUG SenderThread:783720 [sender.py:send():179] send: stats
+2021-07-16 22:19:51,701 DEBUG HandlerThread:783720 [handler.py:handle_request():124] handle_request: stop_status
+2021-07-16 22:19:51,702 DEBUG SenderThread:783720 [sender.py:send_request():193] send_request: stop_status
+2021-07-16 22:20:06,833 DEBUG HandlerThread:783720 [handler.py:handle_request():124] handle_request: stop_status
+2021-07-16 22:20:06,833 DEBUG SenderThread:783720 [sender.py:send_request():193] send_request: stop_status
+2021-07-16 22:20:15,796 DEBUG SenderThread:783720 [sender.py:send():179] send: stats
+2021-07-16 22:20:21,963 DEBUG HandlerThread:783720 [handler.py:handle_request():124] handle_request: stop_status
+2021-07-16 22:20:21,964 DEBUG SenderThread:783720 [sender.py:send_request():193] send_request: stop_status
+2021-07-16 22:20:36,065 INFO Thread-8 :783720 [dir_watcher.py:_on_file_modified():229] file/dir modified: /home/dat/pino-roberta-base/wandb/run-20210716_095921-13hxxunp/files/output.log
+2021-07-16 22:20:37,107 DEBUG HandlerThread:783720 [handler.py:handle_request():124] handle_request: stop_status
+2021-07-16 22:20:37,107 DEBUG SenderThread:783720 [sender.py:send_request():193] send_request: stop_status
+2021-07-16 22:20:38,066 INFO Thread-8 :783720 [dir_watcher.py:_on_file_modified():229] file/dir modified: /home/dat/pino-roberta-base/wandb/run-20210716_095921-13hxxunp/files/output.log
+2021-07-16 22:20:40,067 INFO Thread-8 :783720 [dir_watcher.py:_on_file_modified():229] file/dir modified: /home/dat/pino-roberta-base/wandb/run-20210716_095921-13hxxunp/files/output.log
+2021-07-16 22:20:42,068 INFO Thread-8 :783720 [dir_watcher.py:_on_file_modified():229] file/dir modified: /home/dat/pino-roberta-base/wandb/run-20210716_095921-13hxxunp/files/output.log
+2021-07-16 22:20:44,068 INFO Thread-8 :783720 [dir_watcher.py:_on_file_modified():229] file/dir modified: /home/dat/pino-roberta-base/wandb/run-20210716_095921-13hxxunp/files/output.log
+2021-07-16 22:20:45,875 DEBUG SenderThread:783720 [sender.py:send():179] send: stats
+2021-07-16 22:20:52,269 DEBUG HandlerThread:783720 [handler.py:handle_request():124] handle_request: stop_status
+2021-07-16 22:20:52,269 DEBUG SenderThread:783720 [sender.py:send_request():193] send_request: stop_status
+2021-07-16 22:20:55,801 DEBUG SenderThread:783720 [sender.py:send():179] send: history
+2021-07-16 22:20:55,802 DEBUG SenderThread:783720 [sender.py:send():179] send: summary
+2021-07-16 22:20:55,802 INFO SenderThread:783720 [sender.py:_save_file():841] saving file wandb-summary.json with policy end
+2021-07-16 22:20:56,074 INFO Thread-8 :783720 [dir_watcher.py:_on_file_modified():229] file/dir modified: /home/dat/pino-roberta-base/wandb/run-20210716_095921-13hxxunp/files/wandb-summary.json
+2021-07-16 22:20:58,075 INFO Thread-8 :783720 [dir_watcher.py:_on_file_modified():229] file/dir modified: /home/dat/pino-roberta-base/wandb/run-20210716_095921-13hxxunp/files/output.log
+2021-07-16 22:21:00,076 INFO Thread-8 :783720 [dir_watcher.py:_on_file_modified():229] file/dir modified: /home/dat/pino-roberta-base/wandb/run-20210716_095921-13hxxunp/files/output.log
+2021-07-16 22:21:02,076 INFO Thread-8 :783720 [dir_watcher.py:_on_file_modified():229] file/dir modified: /home/dat/pino-roberta-base/wandb/run-20210716_095921-13hxxunp/files/output.log
+2021-07-16 22:21:04,077 INFO Thread-8 :783720 [dir_watcher.py:_on_file_modified():229] file/dir modified: /home/dat/pino-roberta-base/wandb/run-20210716_095921-13hxxunp/files/output.log
+2021-07-16 22:21:07,399 DEBUG HandlerThread:783720 [handler.py:handle_request():124] handle_request: stop_status
+2021-07-16 22:21:07,400 DEBUG SenderThread:783720 [sender.py:send_request():193] send_request: stop_status
+2021-07-16 22:21:15,889 DEBUG SenderThread:783720 [sender.py:send():179] send: history
+2021-07-16 22:21:15,889 DEBUG SenderThread:783720 [sender.py:send():179] send: summary
+2021-07-16 22:21:15,892 INFO SenderThread:783720 [sender.py:_save_file():841] saving file wandb-summary.json with policy end
+2021-07-16 22:21:15,956 DEBUG SenderThread:783720 [sender.py:send():179] send: stats
+2021-07-16 22:21:16,083 INFO Thread-8 :783720 [dir_watcher.py:_on_file_modified():229] file/dir modified: /home/dat/pino-roberta-base/wandb/run-20210716_095921-13hxxunp/files/wandb-summary.json
+2021-07-16 22:21:18,084 INFO Thread-8 :783720 [dir_watcher.py:_on_file_modified():229] file/dir modified: /home/dat/pino-roberta-base/wandb/run-20210716_095921-13hxxunp/files/output.log
+2021-07-16 22:21:20,084 INFO Thread-8 :783720 [dir_watcher.py:_on_file_modified():229] file/dir modified: /home/dat/pino-roberta-base/wandb/run-20210716_095921-13hxxunp/files/output.log
+2021-07-16 22:21:22,085 INFO Thread-8 :783720 [dir_watcher.py:_on_file_modified():229] file/dir modified: /home/dat/pino-roberta-base/wandb/run-20210716_095921-13hxxunp/files/output.log
+2021-07-16 22:21:22,531 DEBUG HandlerThread:783720 [handler.py:handle_request():124] handle_request: stop_status
+2021-07-16 22:21:22,531 DEBUG SenderThread:783720 [sender.py:send_request():193] send_request: stop_status
+2021-07-16 22:21:24,086 INFO Thread-8 :783720 [dir_watcher.py:_on_file_modified():229] file/dir modified: /home/dat/pino-roberta-base/wandb/run-20210716_095921-13hxxunp/files/output.log
+2021-07-16 22:21:35,930 DEBUG SenderThread:783720 [sender.py:send():179] send: history
+2021-07-16 22:21:35,931 DEBUG SenderThread:783720 [sender.py:send():179] send: summary
+2021-07-16 22:21:35,931 INFO SenderThread:783720 [sender.py:_save_file():841] saving file wandb-summary.json with policy end
+2021-07-16 22:21:36,091 INFO Thread-8 :783720 [dir_watcher.py:_on_file_modified():229] file/dir modified: /home/dat/pino-roberta-base/wandb/run-20210716_095921-13hxxunp/files/wandb-summary.json
+2021-07-16 22:21:37,691 DEBUG HandlerThread:783720 [handler.py:handle_request():124] handle_request: stop_status
+2021-07-16 22:21:37,691 DEBUG SenderThread:783720 [sender.py:send_request():193] send_request: stop_status
+2021-07-16 22:21:38,092 INFO Thread-8 :783720 [dir_watcher.py:_on_file_modified():229] file/dir modified: /home/dat/pino-roberta-base/wandb/run-20210716_095921-13hxxunp/files/output.log
+2021-07-16 22:21:39,177 WARNING MainThread:783720 [internal.py:wandb_internal():147] Internal process interrupt: 1
+2021-07-16 22:21:39,673 WARNING MainThread:783720 [internal.py:wandb_internal():147] Internal process interrupt: 2
+2021-07-16 22:21:39,674 ERROR MainThread:783720 [internal.py:wandb_internal():150] Internal process interrupted.
+2021-07-16 22:21:39,833 INFO SenderThread:783720 [sender.py:finish():945] shutting down sender
+2021-07-16 22:21:39,833 INFO WriterThread:783720 [datastore.py:close():288] close: /home/dat/pino-roberta-base/wandb/run-20210716_095921-13hxxunp/run-13hxxunp.wandb
+2021-07-16 22:21:39,833 INFO SenderThread:783720 [dir_watcher.py:finish():282] shutting down directory watcher
+2021-07-16 22:21:39,834 INFO HandlerThread:783720 [handler.py:finish():638] shutting down handler
+2021-07-16 22:21:40,093 INFO SenderThread:783720 [dir_watcher.py:finish():312] scan: /home/dat/pino-roberta-base/wandb/run-20210716_095921-13hxxunp/files
+2021-07-16 22:21:40,093 INFO SenderThread:783720 [dir_watcher.py:finish():318] scan save: /home/dat/pino-roberta-base/wandb/run-20210716_095921-13hxxunp/files/requirements.txt requirements.txt
+2021-07-16 22:21:40,093 INFO SenderThread:783720 [dir_watcher.py:finish():318] scan save: /home/dat/pino-roberta-base/wandb/run-20210716_095921-13hxxunp/files/output.log output.log
+2021-07-16 22:21:40,094 INFO SenderThread:783720 [dir_watcher.py:finish():318] scan save: /home/dat/pino-roberta-base/wandb/run-20210716_095921-13hxxunp/files/wandb-metadata.json wandb-metadata.json
+2021-07-16 22:21:40,098 INFO SenderThread:783720 [dir_watcher.py:finish():318] scan save: /home/dat/pino-roberta-base/wandb/run-20210716_095921-13hxxunp/files/config.yaml config.yaml
+2021-07-16 22:21:40,098 INFO SenderThread:783720 [dir_watcher.py:finish():318] scan save: /home/dat/pino-roberta-base/wandb/run-20210716_095921-13hxxunp/files/wandb-summary.json wandb-summary.json
+2021-07-16 22:21:40,098 INFO SenderThread:783720 [file_pusher.py:finish():177] shutting down file pusher
+2021-07-16 22:21:40,098 INFO SenderThread:783720 [file_pusher.py:join():182] waiting for file pusher
+2021-07-16 22:21:40,111 INFO MainThread:783720 [internal.py:handle_exit():78] Internal process exited
wandb/run-20210716_095921-13hxxunp/logs/debug.log CHANGED
@@ -24,3 +24,5 @@ config: {}
 2021-07-16 09:59:24,061 INFO MainThread:782479 [wandb_run.py:_config_callback():872] config_cb None None {'output_dir': './', 'overwrite_output_dir': True, 'do_train': False, 'do_eval': False, 'do_predict': False, 'evaluation_strategy': 'IntervalStrategy.NO', 'prediction_loss_only': False, 'per_device_train_batch_size': 1, 'per_device_eval_batch_size': 1, 'per_gpu_train_batch_size': None, 'per_gpu_eval_batch_size': None, 'gradient_accumulation_steps': 1, 'eval_accumulation_steps': None, 'learning_rate': 3e-05, 'weight_decay': 0.0095, 'adam_beta1': 0.9, 'adam_beta2': 0.98, 'adam_epsilon': 1e-08, 'max_grad_norm': 1.0, 'num_train_epochs': 5.0, 'max_steps': -1, 'lr_scheduler_type': 'SchedulerType.LINEAR', 'warmup_ratio': 0.0, 'warmup_steps': 10000, 'log_level': -1, 'log_level_replica': -1, 'log_on_each_node': True, 'logging_dir': './runs/Jul16_09-59-13_t1v-n-f5c06ea1-w-0', 'logging_strategy': 'IntervalStrategy.STEPS', 'logging_first_step': False, 'logging_steps': 50, 'save_strategy': 'IntervalStrategy.STEPS', 'save_steps': 15000, 'save_total_limit': 20, 'save_on_each_node': False, 'no_cuda': False, 'seed': 42, 'fp16': False, 'fp16_opt_level': 'O1', 'fp16_backend': 'auto', 'fp16_full_eval': False, 'local_rank': -1, 'tpu_num_cores': None, 'tpu_metrics_debug': False, 'debug': [], 'dataloader_drop_last': False, 'eval_steps': 6000, 'dataloader_num_workers': 0, 'past_index': -1, 'run_name': './', 'disable_tqdm': False, 'remove_unused_columns': True, 'label_names': None, 'load_best_model_at_end': False, 'metric_for_best_model': None, 'greater_is_better': None, 'ignore_data_skip': False, 'sharded_ddp': [], 'deepspeed': None, 'label_smoothing_factor': 0.0, 'adafactor': False, 'group_by_length': False, 'length_column_name': 'length', 'report_to': ['tensorboard', 'wandb'], 'ddp_find_unused_parameters': None, 'dataloader_pin_memory': True, 'skip_memory_metrics': True, 'use_legacy_prediction_loop': False, 'push_to_hub': True, 'resume_from_checkpoint': './', 'push_to_hub_model_id': '', 'push_to_hub_organization': None, 'push_to_hub_token': None, 'mp_parameters': '', '_n_gpu': 0, '__cached__setup_devices': 'cpu'}
 2021-07-16 09:59:24,063 INFO MainThread:782479 [wandb_run.py:_config_callback():872] config_cb None None {'model_name_or_path': None, 'model_type': 'big_bird', 'config_name': './', 'tokenizer_name': './', 'cache_dir': None, 'use_fast_tokenizer': True, 'dtype': 'float32'}
 2021-07-16 09:59:24,065 INFO MainThread:782479 [wandb_run.py:_config_callback():872] config_cb None None {'dataset_name': None, 'dataset_config_name': None, 'train_ref_file': None, 'validation_ref_file': None, 'overwrite_cache': False, 'validation_split_percentage': 5, 'max_seq_length': 4096, 'preprocessing_num_workers': 96, 'mlm_probability': 0.15, 'pad_to_max_length': False, 'line_by_line': False, 'max_eval_samples': 4000}
+2021-07-16 22:21:39,193 INFO MainThread:782479 [wandb_run.py:_atexit_cleanup():1593] got exitcode: 255
+2021-07-16 22:21:39,193 INFO MainThread:782479 [wandb_run.py:_restore():1565] restore
wandb/run-20210716_095921-13hxxunp/run-13hxxunp.wandb CHANGED
Binary files a/wandb/run-20210716_095921-13hxxunp/run-13hxxunp.wandb and b/wandb/run-20210716_095921-13hxxunp/run-13hxxunp.wandb differ
wandb/run-20210716_222528-3qk3dij4/files/config.yaml ADDED
@@ -0,0 +1,308 @@
+wandb_version: 1
+
+__cached__setup_devices:
+  desc: null
+  value: cpu
+_n_gpu:
+  desc: null
+  value: 0
+_wandb:
+  desc: null
+  value:
+    cli_version: 0.10.33
+    framework: huggingface
+    huggingface_version: 4.9.0.dev0
+    is_jupyter_run: false
+    is_kaggle_kernel: false
+    python_version: 3.8.10
+    t:
+      1:
+      - 1
+      - 3
+      - 11
+      4: 3.8.10
+      5: 0.10.33
+      6: 4.9.0.dev0
+      8:
+      - 5
+adafactor:
+  desc: null
+  value: false
+adam_beta1:
+  desc: null
+  value: 0.9
+adam_beta2:
+  desc: null
+  value: 0.98
+adam_epsilon:
+  desc: null
+  value: 1.0e-08
+cache_dir:
+  desc: null
+  value: null
+config_name:
+  desc: null
+  value: ./
+dataloader_drop_last:
+  desc: null
+  value: false
+dataloader_num_workers:
+  desc: null
+  value: 0
+dataloader_pin_memory:
+  desc: null
+  value: true
+dataset_config_name:
+  desc: null
+  value: null
+dataset_name:
+  desc: null
+  value: null
+ddp_find_unused_parameters:
+  desc: null
+  value: null
+debug:
+  desc: null
+  value: []
+deepspeed:
+  desc: null
+  value: null
+disable_tqdm:
+  desc: null
+  value: false
+do_eval:
+  desc: null
+  value: false
+do_predict:
+  desc: null
+  value: false
+do_train:
+  desc: null
+  value: false
+dtype:
+  desc: null
+  value: float32
+eval_accumulation_steps:
+  desc: null
+  value: null
+eval_steps:
+  desc: null
+  value: 10000
+evaluation_strategy:
+  desc: null
+  value: IntervalStrategy.NO
+fp16:
+  desc: null
+  value: false
+fp16_backend:
+  desc: null
+  value: auto
+fp16_full_eval:
+  desc: null
+  value: false
+fp16_opt_level:
+  desc: null
+  value: O1
+gradient_accumulation_steps:
+  desc: null
+  value: 1
+greater_is_better:
+  desc: null
+  value: null
+group_by_length:
+  desc: null
+  value: false
+ignore_data_skip:
+  desc: null
+  value: false
+label_names:
+  desc: null
+  value: null
+label_smoothing_factor:
+  desc: null
+  value: 0.0
+learning_rate:
+  desc: null
+  value: 3.0e-05
+length_column_name:
+  desc: null
+  value: length
+line_by_line:
+  desc: null
+  value: false
+load_best_model_at_end:
+  desc: null
+  value: false
+local_rank:
+  desc: null
+  value: -1
+log_level:
+  desc: null
+  value: -1
+log_level_replica:
+  desc: null
+  value: -1
+log_on_each_node:
+  desc: null
+  value: true
+logging_dir:
+  desc: null
+  value: ./runs/Jul16_22-25-20_t1v-n-f5c06ea1-w-0
+logging_first_step:
+  desc: null
+  value: false
+logging_steps:
+  desc: null
+  value: 50
+logging_strategy:
+  desc: null
+  value: IntervalStrategy.STEPS
+lr_scheduler_type:
+  desc: null
+  value: SchedulerType.LINEAR
+max_eval_samples:
+  desc: null
+  value: 4000
+max_grad_norm:
+  desc: null
+  value: 1.0
+max_seq_length:
+  desc: null
+  value: 4096
+max_steps:
+  desc: null
+  value: -1
+metric_for_best_model:
+  desc: null
+  value: null
+mlm_probability:
+  desc: null
+  value: 0.15
+model_name_or_path:
+  desc: null
+  value: null
+model_type:
+  desc: null
+  value: big_bird
+mp_parameters:
+  desc: null
+  value: ''
+no_cuda:
+  desc: null
+  value: false
+num_train_epochs:
+  desc: null
+  value: 5.0
+output_dir:
+  desc: null
+  value: ./
+overwrite_cache:
+  desc: null
+  value: false
+overwrite_output_dir:
+  desc: null
+  value: true
+pad_to_max_length:
+  desc: null
+  value: false
+past_index:
+  desc: null
+  value: -1
+per_device_eval_batch_size:
+  desc: null
+  value: 1
+per_device_train_batch_size:
+  desc: null
+  value: 1
+per_gpu_eval_batch_size:
+  desc: null
+  value: null
+per_gpu_train_batch_size:
+  desc: null
+  value: null
+prediction_loss_only:
+  desc: null
+  value: false
+preprocessing_num_workers:
+  desc: null
+  value: 96
+push_to_hub:
+  desc: null
+  value: true
+push_to_hub_model_id:
+  desc: null
+  value: ''
+push_to_hub_organization:
+  desc: null
+  value: null
+push_to_hub_token:
+  desc: null
+  value: null
+remove_unused_columns:
+  desc: null
+  value: true
+report_to:
+  desc: null
+  value:
+  - tensorboard
+  - wandb
+resume_from_checkpoint:
+  desc: null
+  value: ./
+run_name:
+  desc: null
+  value: ./
+save_on_each_node:
+  desc: null
+  value: false
+save_steps:
+  desc: null
+  value: 15000
+save_strategy:
+  desc: null
+  value: IntervalStrategy.STEPS
+save_total_limit:
+  desc: null
+  value: 50
+seed:
+  desc: null
+  value: 42
+sharded_ddp:
+  desc: null
+  value: []
+skip_memory_metrics:
+  desc: null
+  value: true
+tokenizer_name:
+  desc: null
+  value: ./
+tpu_metrics_debug:
+  desc: null
+  value: false
+tpu_num_cores:
+  desc: null
+  value: null
+train_ref_file:
+  desc: null
+  value: null
+use_fast_tokenizer:
+  desc: null
+  value: true
+use_legacy_prediction_loop:
+  desc: null
+  value: false
+validation_ref_file:
+  desc: null
+  value: null
+validation_split_percentage:
+  desc: null
+  value: 5
+warmup_ratio:
+  desc: null
+  value: 0.0
+warmup_steps:
+  desc: null
+  value: 10000
+weight_decay:
+  desc: null
+  value: 0.0095
wandb/run-20210716_222528-3qk3dij4/files/output.log ADDED
@@ -0,0 +1,6 @@
+[22:25:43] - INFO - absl - Restoring checkpoint from ./checkpoint_330000
+tcmalloc: large alloc 1530273792 bytes == 0x9d9a6000 @ 0x7fd1459cb680 0x7fd1459ec824 0x5b9a14 0x50b2ae 0x50cb1b 0x5a6f17 0x5f3010 0x56fd36 0x568d9a 0x5f5b33 0x56aadf 0x568d9a 0x68cdc7 0x67e161 0x67e1df 0x67e281 0x67e627 0x6b6e62 0x6b71ed 0x7fd1457e00b3 0x5f96de
+/home/dat/pino/lib/python3.8/site-packages/jax/lib/xla_bridge.py:386: UserWarning: jax.host_count has been renamed to jax.process_count. This alias will eventually be removed; please update your code.
+  warnings.warn(
+/home/dat/pino/lib/python3.8/site-packages/jax/lib/xla_bridge.py:373: UserWarning: jax.host_id has been renamed to jax.process_index. This alias will eventually be removed; please update your code.
+  warnings.warn(
wandb/run-20210716_222528-3qk3dij4/files/requirements.txt ADDED
@@ -0,0 +1,95 @@
+absl-py==0.13.0
+aiohttp==3.7.4.post0
+astunparse==1.6.3
+async-timeout==3.0.1
+attrs==21.2.0
+cachetools==4.2.2
+certifi==2021.5.30
+chardet==4.0.0
+charset-normalizer==2.0.1
+chex==0.0.8
+click==8.0.1
+configparser==5.0.2
+cycler==0.10.0
+datasets==1.9.1.dev0
+dill==0.3.4
+dm-tree==0.1.6
+docker-pycreds==0.4.0
+filelock==3.0.12
+flatbuffers==1.12
+flax==0.3.4
+fsspec==2021.7.0
+gast==0.4.0
+gitdb==4.0.7
+gitpython==3.1.18
+google-auth-oauthlib==0.4.4
+google-auth==1.32.1
+google-pasta==0.2.0
+grpcio==1.34.1
+h5py==3.1.0
+huggingface-hub==0.0.12
+idna==3.2
+install==1.3.4
+jax==0.2.17
+jaxlib==0.1.68
+joblib==1.0.1
+keras-nightly==2.5.0.dev2021032900
+keras-preprocessing==1.1.2
+kiwisolver==1.3.1
+libtpu-nightly==0.1.dev20210615
+markdown==3.3.4
+matplotlib==3.4.2
+msgpack==1.0.2
+multidict==5.1.0
+multiprocess==0.70.12.2
+numpy==1.19.5
+oauthlib==3.1.1
+opt-einsum==3.3.0
+optax==0.0.9
+packaging==21.0
+pandas==1.3.0
+pathtools==0.1.2
+pillow==8.3.1
+pip==20.0.2
+pkg-resources==0.0.0
+promise==2.3
+protobuf==3.17.3
+psutil==5.8.0
+pyarrow==4.0.1
+pyasn1-modules==0.2.8
+pyasn1==0.4.8
+pyparsing==2.4.7
+python-dateutil==2.8.1
+pytz==2021.1
+pyyaml==5.4.1
+regex==2021.7.6
+requests-oauthlib==1.3.0
+requests==2.26.0
+rsa==4.7.2
+sacremoses==0.0.45
+scipy==1.7.0
+sentry-sdk==1.3.0
+setuptools==44.0.0
+shortuuid==1.0.1
+six==1.15.0
+smmap==4.0.0
+subprocess32==3.5.4
+tensorboard-data-server==0.6.1
+tensorboard-plugin-wit==1.8.0
+tensorboard==2.5.0
+tensorflow-estimator==2.5.0
+tensorflow==2.5.0
+termcolor==1.1.0
+tokenizers==0.10.3
+toolz==0.11.1
+torch==1.9.0
+tqdm==4.61.2
+transformers==4.9.0.dev0
+typing-extensions==3.7.4.3
+urllib3==1.26.6
+wandb==0.10.33
+werkzeug==2.0.1
+wheel==0.36.2
+wrapt==1.12.1
+xxhash==2.0.2
+yarl==1.6.3
wandb/run-20210716_222528-3qk3dij4/files/wandb-metadata.json ADDED
@@ -0,0 +1,45 @@
+{
+  "os": "Linux-5.4.0-1043-gcp-x86_64-with-glibc2.29",
+  "python": "3.8.10",
+  "heartbeatAt": "2021-07-16T22:25:30.712229",
+  "startedAt": "2021-07-16T22:25:28.616115",
+  "docker": null,
+  "cpu_count": 96,
+  "cuda": null,
+  "args": [
+    "--push_to_hub",
+    "--output_dir=./",
+    "--model_type=big_bird",
+    "--config_name=./",
+    "--tokenizer_name=./",
+    "--max_seq_length=4096",
+    "--weight_decay=0.0095",
+    "--warmup_steps=10000",
+    "--overwrite_output_dir",
+    "--adam_beta1=0.9",
+    "--adam_beta2=0.98",
+    "--logging_steps=50",
+    "--eval_steps=10000",
+    "--num_train_epochs=5",
+    "--preprocessing_num_workers=96",
+    "--save_steps=15000",
+    "--learning_rate=3e-5",
+    "--per_device_train_batch_size=1",
+    "--per_device_eval_batch_size=1",
+    "--save_total_limit=50",
+    "--max_eval_samples=4000",
+    "--resume_from_checkpoint=./"
+  ],
+  "state": "running",
+  "program": "./run_mlm_flax_no_accum.py",
+  "codePath": "run_mlm_flax_no_accum.py",
+  "git": {
+    "remote": "https://huggingface.co/flax-community/pino-roberta-base",
+    "commit": "def9a456105f36b517155343f42ff643df2d20ce"
+  },
+  "email": null,
+  "root": "/home/dat/pino-roberta-base",
+  "host": "t1v-n-f5c06ea1-w-0",
+  "username": "dat",
+  "executable": "/home/dat/pino/bin/python"
+}
wandb/run-20210716_222528-3qk3dij4/files/wandb-summary.json ADDED
@@ -0,0 +1 @@
+{}
wandb/run-20210716_222528-3qk3dij4/logs/debug-internal.log ADDED
@@ -0,0 +1,54 @@
+2021-07-16 22:25:29,348 INFO MainThread:795833 [internal.py:wandb_internal():88] W&B internal server running at pid: 795833, started at: 2021-07-16 22:25:29.348306
+2021-07-16 22:25:29,350 DEBUG HandlerThread:795833 [handler.py:handle_request():124] handle_request: check_version
+2021-07-16 22:25:29,351 INFO WriterThread:795833 [datastore.py:open_for_write():80] open: /home/dat/pino-roberta-base/wandb/run-20210716_222528-3qk3dij4/run-3qk3dij4.wandb
+2021-07-16 22:25:29,352 DEBUG SenderThread:795833 [sender.py:send():179] send: header
+2021-07-16 22:25:29,352 DEBUG SenderThread:795833 [sender.py:send_request():193] send_request: check_version
+2021-07-16 22:25:29,393 DEBUG SenderThread:795833 [sender.py:send():179] send: run
+2021-07-16 22:25:29,571 INFO SenderThread:795833 [dir_watcher.py:__init__():168] watching files in: /home/dat/pino-roberta-base/wandb/run-20210716_222528-3qk3dij4/files
+2021-07-16 22:25:29,571 INFO SenderThread:795833 [sender.py:_start_run_threads():716] run started: 3qk3dij4 with start time 1626474328
+2021-07-16 22:25:29,571 DEBUG SenderThread:795833 [sender.py:send():179] send: summary
+2021-07-16 22:25:29,572 DEBUG HandlerThread:795833 [handler.py:handle_request():124] handle_request: run_start
+2021-07-16 22:25:29,573 INFO SenderThread:795833 [sender.py:_save_file():841] saving file wandb-summary.json with policy end
+2021-07-16 22:25:30,574 INFO Thread-8 :795833 [dir_watcher.py:_on_file_created():216] file/dir created: /home/dat/pino-roberta-base/wandb/run-20210716_222528-3qk3dij4/files/wandb-summary.json
+2021-07-16 22:25:30,711 DEBUG HandlerThread:795833 [meta.py:__init__():39] meta init
+2021-07-16 22:25:30,712 DEBUG HandlerThread:795833 [meta.py:__init__():53] meta init done
+2021-07-16 22:25:30,712 DEBUG HandlerThread:795833 [meta.py:probe():210] probe
+2021-07-16 22:25:30,713 DEBUG HandlerThread:795833 [meta.py:_setup_git():200] setup git
+2021-07-16 22:25:30,743 DEBUG HandlerThread:795833 [meta.py:_setup_git():207] setup git done
+2021-07-16 22:25:30,744 DEBUG HandlerThread:795833 [meta.py:_save_pip():57] save pip
+2021-07-16 22:25:30,744 DEBUG HandlerThread:795833 [meta.py:_save_pip():71] save pip done
+2021-07-16 22:25:30,744 DEBUG HandlerThread:795833 [meta.py:probe():252] probe done
+2021-07-16 22:25:30,748 DEBUG SenderThread:795833 [sender.py:send():179] send: files
+2021-07-16 22:25:30,748 INFO SenderThread:795833 [sender.py:_save_file():841] saving file wandb-metadata.json with policy now
+2021-07-16 22:25:30,756 DEBUG HandlerThread:795833 [handler.py:handle_request():124] handle_request: stop_status
+2021-07-16 22:25:30,756 DEBUG SenderThread:795833 [sender.py:send_request():193] send_request: stop_status
+2021-07-16 22:25:30,883 DEBUG SenderThread:795833 [sender.py:send():179] send: config
+2021-07-16 22:25:30,884 DEBUG SenderThread:795833 [sender.py:send():179] send: config
+2021-07-16 22:25:30,884 DEBUG SenderThread:795833 [sender.py:send():179] send: config
+2021-07-16 22:25:31,207 INFO Thread-11 :795833 [upload_job.py:push():137] Uploaded file /tmp/tmp88lb3201wandb/38nccn88-wandb-metadata.json
+2021-07-16 22:25:31,573 INFO Thread-8 :795833 [dir_watcher.py:_on_file_created():216] file/dir created: /home/dat/pino-roberta-base/wandb/run-20210716_222528-3qk3dij4/files/wandb-metadata.json
+2021-07-16 22:25:31,573 INFO Thread-8 :795833 [dir_watcher.py:_on_file_created():216] file/dir created: /home/dat/pino-roberta-base/wandb/run-20210716_222528-3qk3dij4/files/requirements.txt
+2021-07-16 22:25:31,574 INFO Thread-8 :795833 [dir_watcher.py:_on_file_created():216] file/dir created: /home/dat/pino-roberta-base/wandb/run-20210716_222528-3qk3dij4/files/output.log
+2021-07-16 22:25:45,579 INFO Thread-8 :795833 [dir_watcher.py:_on_file_modified():229] file/dir modified: /home/dat/pino-roberta-base/wandb/run-20210716_222528-3qk3dij4/files/output.log
+2021-07-16 22:25:45,929 DEBUG HandlerThread:795833 [handler.py:handle_request():124] handle_request: stop_status
+2021-07-16 22:25:45,929 DEBUG SenderThread:795833 [sender.py:send_request():193] send_request: stop_status
+2021-07-16 22:25:47,580 INFO Thread-8 :795833 [dir_watcher.py:_on_file_modified():229] file/dir modified: /home/dat/pino-roberta-base/wandb/run-20210716_222528-3qk3dij4/files/output.log
+2021-07-16 22:25:58,797 DEBUG SenderThread:795833 [sender.py:send():179] send: stats
+2021-07-16 22:26:00,585 INFO Thread-8 :795833 [dir_watcher.py:_on_file_modified():229] file/dir modified: /home/dat/pino-roberta-base/wandb/run-20210716_222528-3qk3dij4/files/config.yaml
+2021-07-16 22:26:01,115 DEBUG HandlerThread:795833 [handler.py:handle_request():124] handle_request: stop_status
+2021-07-16 22:26:01,115 DEBUG SenderThread:795833 [sender.py:send_request():193] send_request: stop_status
+2021-07-16 22:26:02,749 WARNING MainThread:795833 [internal.py:wandb_internal():147] Internal process interrupt: 1
+2021-07-16 22:26:02,938 WARNING MainThread:795833 [internal.py:wandb_internal():147] Internal process interrupt: 2
+2021-07-16 22:26:02,938 ERROR MainThread:795833 [internal.py:wandb_internal():150] Internal process interrupted.
+2021-07-16 22:26:03,122 INFO HandlerThread:795833 [handler.py:finish():638] shutting down handler
+2021-07-16 22:26:03,245 INFO SenderThread:795833 [sender.py:finish():945] shutting down sender
+2021-07-16 22:26:03,245 INFO SenderThread:795833 [dir_watcher.py:finish():282] shutting down directory watcher
+2021-07-16 22:26:03,587 INFO SenderThread:795833 [dir_watcher.py:finish():312] scan: /home/dat/pino-roberta-base/wandb/run-20210716_222528-3qk3dij4/files
+2021-07-16 22:26:03,587 INFO SenderThread:795833 [dir_watcher.py:finish():318] scan save: /home/dat/pino-roberta-base/wandb/run-20210716_222528-3qk3dij4/files/requirements.txt requirements.txt
+2021-07-16 22:26:03,587 INFO SenderThread:795833 [dir_watcher.py:finish():318] scan save: /home/dat/pino-roberta-base/wandb/run-20210716_222528-3qk3dij4/files/output.log output.log
+2021-07-16 22:26:03,587 INFO SenderThread:795833 [dir_watcher.py:finish():318] scan save: /home/dat/pino-roberta-base/wandb/run-20210716_222528-3qk3dij4/files/wandb-metadata.json wandb-metadata.json
+2021-07-16 22:26:03,588 INFO SenderThread:795833 [dir_watcher.py:finish():318] scan save: /home/dat/pino-roberta-base/wandb/run-20210716_222528-3qk3dij4/files/config.yaml config.yaml
+2021-07-16 22:26:03,588 INFO SenderThread:795833 [dir_watcher.py:finish():318] scan save: /home/dat/pino-roberta-base/wandb/run-20210716_222528-3qk3dij4/files/wandb-summary.json wandb-summary.json
+2021-07-16 22:26:03,588 INFO SenderThread:795833 [file_pusher.py:finish():177] shutting down file pusher
+2021-07-16 22:26:03,588 INFO SenderThread:795833 [file_pusher.py:join():182] waiting for file pusher
+2021-07-16 22:26:03,685 INFO MainThread:795833 [internal.py:handle_exit():78] Internal process exited
wandb/run-20210716_222528-3qk3dij4/logs/debug.log ADDED
@@ -0,0 +1,28 @@
+2021-07-16 22:25:28,617 INFO MainThread:794570 [wandb_setup.py:_flush():69] setting env: {}
+2021-07-16 22:25:28,617 INFO MainThread:794570 [wandb_setup.py:_flush():69] setting login settings: {}
+2021-07-16 22:25:28,617 INFO MainThread:794570 [wandb_init.py:_log_setup():337] Logging user logs to /home/dat/pino-roberta-base/wandb/run-20210716_222528-3qk3dij4/logs/debug.log
+2021-07-16 22:25:28,617 INFO MainThread:794570 [wandb_init.py:_log_setup():338] Logging internal logs to /home/dat/pino-roberta-base/wandb/run-20210716_222528-3qk3dij4/logs/debug-internal.log
+2021-07-16 22:25:28,618 INFO MainThread:794570 [wandb_init.py:init():370] calling init triggers
+2021-07-16 22:25:28,618 INFO MainThread:794570 [wandb_init.py:init():375] wandb.init called with sweep_config: {}
+config: {}
+2021-07-16 22:25:28,618 INFO MainThread:794570 [wandb_init.py:init():419] starting backend
+2021-07-16 22:25:28,618 INFO MainThread:794570 [backend.py:_multiprocessing_setup():70] multiprocessing start_methods=fork,spawn,forkserver, using: spawn
+2021-07-16 22:25:28,669 INFO MainThread:794570 [backend.py:ensure_launched():135] starting backend process...
+2021-07-16 22:25:28,721 INFO MainThread:794570 [backend.py:ensure_launched():139] started backend process with pid: 795833
+2021-07-16 22:25:28,723 INFO MainThread:794570 [wandb_init.py:init():424] backend started and connected
+2021-07-16 22:25:28,727 INFO MainThread:794570 [wandb_init.py:init():472] updated telemetry
+2021-07-16 22:25:28,728 INFO MainThread:794570 [wandb_init.py:init():491] communicating current version
+2021-07-16 22:25:29,392 INFO MainThread:794570 [wandb_init.py:init():496] got version response upgrade_message: "wandb version 0.11.0 is available!  To upgrade, please run:\n $ pip install wandb --upgrade"
+
+2021-07-16 22:25:29,392 INFO MainThread:794570 [wandb_init.py:init():504] communicating run to backend with 30 second timeout
+2021-07-16 22:25:29,571 INFO MainThread:794570 [wandb_init.py:init():529] starting run threads in backend
+2021-07-16 22:25:30,752 INFO MainThread:794570 [wandb_run.py:_console_start():1623] atexit reg
+2021-07-16 22:25:30,752 INFO MainThread:794570 [wandb_run.py:_redirect():1497] redirect: SettingsConsole.REDIRECT
+2021-07-16 22:25:30,753 INFO MainThread:794570 [wandb_run.py:_redirect():1502] Redirecting console.
+2021-07-16 22:25:30,755 INFO MainThread:794570 [wandb_run.py:_redirect():1558] Redirects installed.
+2021-07-16 22:25:30,755 INFO MainThread:794570 [wandb_init.py:init():554] run started, returning control to user process
+2021-07-16 22:25:30,761 INFO MainThread:794570 [wandb_run.py:_config_callback():872] config_cb None None {'output_dir': './', 'overwrite_output_dir': True, 'do_train': False, 'do_eval': False, 'do_predict': False, 'evaluation_strategy': 'IntervalStrategy.NO', 'prediction_loss_only': False, 'per_device_train_batch_size': 1, 'per_device_eval_batch_size': 1, 'per_gpu_train_batch_size': None, 'per_gpu_eval_batch_size': None, 'gradient_accumulation_steps': 1, 'eval_accumulation_steps': None, 'learning_rate': 3e-05, 'weight_decay': 0.0095, 'adam_beta1': 0.9, 'adam_beta2': 0.98, 'adam_epsilon': 1e-08, 'max_grad_norm': 1.0, 'num_train_epochs': 5.0, 'max_steps': -1, 'lr_scheduler_type': 'SchedulerType.LINEAR', 'warmup_ratio': 0.0, 'warmup_steps': 10000, 'log_level': -1, 'log_level_replica': -1, 'log_on_each_node': True, 'logging_dir': './runs/Jul16_22-25-20_t1v-n-f5c06ea1-w-0', 'logging_strategy': 'IntervalStrategy.STEPS', 'logging_first_step': False, 'logging_steps': 50, 'save_strategy': 'IntervalStrategy.STEPS', 'save_steps': 15000, 'save_total_limit': 50, 'save_on_each_node': False, 'no_cuda': False, 'seed': 42, 'fp16': False, 'fp16_opt_level': 'O1', 'fp16_backend': 'auto', 'fp16_full_eval': False, 'local_rank': -1, 'tpu_num_cores': None, 'tpu_metrics_debug': False, 'debug': [], 'dataloader_drop_last': False, 'eval_steps': 10000, 'dataloader_num_workers': 0, 'past_index': -1, 'run_name': './', 'disable_tqdm': False, 'remove_unused_columns': True, 'label_names': None, 'load_best_model_at_end': False, 'metric_for_best_model': None, 'greater_is_better': None, 'ignore_data_skip': False, 'sharded_ddp': [], 'deepspeed': None, 'label_smoothing_factor': 0.0, 'adafactor': False, 'group_by_length': False, 'length_column_name': 'length', 'report_to': ['tensorboard', 'wandb'], 'ddp_find_unused_parameters': None, 'dataloader_pin_memory': True, 'skip_memory_metrics': True, 'use_legacy_prediction_loop': False, 'push_to_hub': True, 'resume_from_checkpoint': './', 'push_to_hub_model_id': '', 'push_to_hub_organization': None, 'push_to_hub_token': None, 'mp_parameters': '', '_n_gpu': 0, '__cached__setup_devices': 'cpu'}
+2021-07-16 22:25:30,763 INFO MainThread:794570 [wandb_run.py:_config_callback():872] config_cb None None {'model_name_or_path': None, 'model_type': 'big_bird', 'config_name': './', 'tokenizer_name': './', 'cache_dir': None, 'use_fast_tokenizer': True, 'dtype': 'float32'}
+2021-07-16 22:25:30,764 INFO MainThread:794570 [wandb_run.py:_config_callback():872] config_cb None None {'dataset_name': None, 'dataset_config_name': None, 'train_ref_file': None, 'validation_ref_file': None, 'overwrite_cache': False, 'validation_split_percentage': 5, 'max_seq_length': 4096, 'preprocessing_num_workers': 96, 'mlm_probability': 0.15, 'pad_to_max_length': False, 'line_by_line': False, 'max_eval_samples': 4000}
+2021-07-16 22:26:02,781 INFO MainThread:794570 [wandb_run.py:_atexit_cleanup():1593] got exitcode: 255
+2021-07-16 22:26:02,781 INFO MainThread:794570 [wandb_run.py:_restore():1565] restore
wandb/run-20210716_222528-3qk3dij4/run-3qk3dij4.wandb ADDED
File without changes
wandb/run-20210716_222651-1lrzcta0/files/config.yaml ADDED
@@ -0,0 +1,308 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
+ wandb_version: 1
+
+ __cached__setup_devices:
+   desc: null
+   value: cpu
+ _n_gpu:
+   desc: null
+   value: 0
+ _wandb:
+   desc: null
+   value:
+     cli_version: 0.10.33
+     framework: huggingface
+     huggingface_version: 4.9.0.dev0
+     is_jupyter_run: false
+     is_kaggle_kernel: false
+     python_version: 3.8.10
+     t:
+       1:
+       - 1
+       - 3
+       - 11
+       4: 3.8.10
+       5: 0.10.33
+       6: 4.9.0.dev0
+       8:
+       - 5
+ adafactor:
+   desc: null
+   value: false
+ adam_beta1:
+   desc: null
+   value: 0.9
+ adam_beta2:
+   desc: null
+   value: 0.98
+ adam_epsilon:
+   desc: null
+   value: 1.0e-08
+ cache_dir:
+   desc: null
+   value: null
+ config_name:
+   desc: null
+   value: ./
+ dataloader_drop_last:
+   desc: null
+   value: false
+ dataloader_num_workers:
+   desc: null
+   value: 0
+ dataloader_pin_memory:
+   desc: null
+   value: true
+ dataset_config_name:
+   desc: null
+   value: null
+ dataset_name:
+   desc: null
+   value: null
+ ddp_find_unused_parameters:
+   desc: null
+   value: null
+ debug:
+   desc: null
+   value: []
+ deepspeed:
+   desc: null
+   value: null
+ disable_tqdm:
+   desc: null
+   value: false
+ do_eval:
+   desc: null
+   value: false
+ do_predict:
+   desc: null
+   value: false
+ do_train:
+   desc: null
+   value: false
+ dtype:
+   desc: null
+   value: float32
+ eval_accumulation_steps:
+   desc: null
+   value: null
+ eval_steps:
+   desc: null
+   value: 10000
+ evaluation_strategy:
+   desc: null
+   value: IntervalStrategy.NO
+ fp16:
+   desc: null
+   value: false
+ fp16_backend:
+   desc: null
+   value: auto
+ fp16_full_eval:
+   desc: null
+   value: false
+ fp16_opt_level:
+   desc: null
+   value: O1
+ gradient_accumulation_steps:
+   desc: null
+   value: 1
+ greater_is_better:
+   desc: null
+   value: null
+ group_by_length:
+   desc: null
+   value: false
+ ignore_data_skip:
+   desc: null
+   value: false
+ label_names:
+   desc: null
+   value: null
+ label_smoothing_factor:
+   desc: null
+   value: 0.0
+ learning_rate:
+   desc: null
+   value: 3.0e-05
+ length_column_name:
+   desc: null
+   value: length
+ line_by_line:
+   desc: null
+   value: false
+ load_best_model_at_end:
+   desc: null
+   value: false
+ local_rank:
+   desc: null
+   value: -1
+ log_level:
+   desc: null
+   value: -1
+ log_level_replica:
+   desc: null
+   value: -1
+ log_on_each_node:
+   desc: null
+   value: true
+ logging_dir:
+   desc: null
+   value: ./runs/Jul16_22-26-42_t1v-n-f5c06ea1-w-0
+ logging_first_step:
+   desc: null
+   value: false
+ logging_steps:
+   desc: null
+   value: 50
+ logging_strategy:
+   desc: null
+   value: IntervalStrategy.STEPS
+ lr_scheduler_type:
+   desc: null
+   value: SchedulerType.LINEAR
+ max_eval_samples:
+   desc: null
+   value: 4000
+ max_grad_norm:
+   desc: null
+   value: 1.0
+ max_seq_length:
+   desc: null
+   value: 4096
+ max_steps:
+   desc: null
+   value: -1
+ metric_for_best_model:
+   desc: null
+   value: null
+ mlm_probability:
+   desc: null
+   value: 0.15
+ model_name_or_path:
+   desc: null
+   value: null
+ model_type:
+   desc: null
+   value: big_bird
+ mp_parameters:
+   desc: null
+   value: ''
+ no_cuda:
+   desc: null
+   value: false
+ num_train_epochs:
+   desc: null
+   value: 4.0
+ output_dir:
+   desc: null
+   value: ./
+ overwrite_cache:
+   desc: null
+   value: false
+ overwrite_output_dir:
+   desc: null
+   value: true
+ pad_to_max_length:
+   desc: null
+   value: false
+ past_index:
+   desc: null
+   value: -1
+ per_device_eval_batch_size:
+   desc: null
+   value: 1
+ per_device_train_batch_size:
+   desc: null
+   value: 1
+ per_gpu_eval_batch_size:
+   desc: null
+   value: null
+ per_gpu_train_batch_size:
+   desc: null
+   value: null
+ prediction_loss_only:
+   desc: null
+   value: false
+ preprocessing_num_workers:
+   desc: null
+   value: 96
+ push_to_hub:
+   desc: null
+   value: true
+ push_to_hub_model_id:
+   desc: null
+   value: ''
+ push_to_hub_organization:
+   desc: null
+   value: null
+ push_to_hub_token:
+   desc: null
+   value: null
+ remove_unused_columns:
+   desc: null
+   value: true
+ report_to:
+   desc: null
+   value:
+   - tensorboard
+   - wandb
+ resume_from_checkpoint:
+   desc: null
+   value: ./
+ run_name:
+   desc: null
+   value: ./
+ save_on_each_node:
+   desc: null
+   value: false
+ save_steps:
+   desc: null
+   value: 15000
+ save_strategy:
+   desc: null
+   value: IntervalStrategy.STEPS
+ save_total_limit:
+   desc: null
+   value: 50
+ seed:
+   desc: null
+   value: 42
+ sharded_ddp:
+   desc: null
+   value: []
+ skip_memory_metrics:
+   desc: null
+   value: true
+ tokenizer_name:
+   desc: null
+   value: ./
+ tpu_metrics_debug:
+   desc: null
+   value: false
+ tpu_num_cores:
+   desc: null
+   value: null
+ train_ref_file:
+   desc: null
+   value: null
+ use_fast_tokenizer:
+   desc: null
+   value: true
+ use_legacy_prediction_loop:
+   desc: null
+   value: false
+ validation_ref_file:
+   desc: null
+   value: null
+ validation_split_percentage:
+   desc: null
+   value: 5
+ warmup_ratio:
+   desc: null
+   value: 0.0
+ warmup_steps:
+   desc: null
+   value: 10000
+ weight_decay:
+   desc: null
+   value: 0.0095
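
The optimizer recorded in this config (Adam with b1=0.9, b2=0.98, eps=1e-8, weight decay 0.0095, peak learning rate 3e-5, 10000 linear warmup steps, linear scheduler) matches the standard optax setup used by the Flax MLM example scripts. A minimal sketch of that schedule and optimizer, assuming an illustrative `total_train_steps` (the real value is derived from the dataset size and `num_train_epochs=4`):

```python
# Minimal sketch of the optimizer implied by the config above.
# Assumption: total_train_steps is illustrative, not taken from this run.
import optax

learning_rate = 3e-5
warmup_steps = 10_000
total_train_steps = 340_000  # hypothetical

# Linear warmup from 0 to the peak LR, then linear decay back to 0.
warmup_fn = optax.linear_schedule(
    init_value=0.0, end_value=learning_rate, transition_steps=warmup_steps)
decay_fn = optax.linear_schedule(
    init_value=learning_rate, end_value=0.0,
    transition_steps=total_train_steps - warmup_steps)
schedule_fn = optax.join_schedules(
    schedules=[warmup_fn, decay_fn], boundaries=[warmup_steps])

optimizer = optax.adamw(
    learning_rate=schedule_fn,
    b1=0.9, b2=0.98, eps=1e-8,
    weight_decay=0.0095)
```

Passing the schedule function itself as `learning_rate` lets optax recompute the rate at every step, which is what produces the smoothly decaying values seen later in the training logs (e.g. 2.29e-05 at step 340000).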
wandb/run-20210716_222651-1lrzcta0/files/output.log ADDED
@@ -0,0 +1,8 @@
+
+ [22:27:05] - INFO - absl - Restoring checkpoint from ./checkpoint_330000
+ tcmalloc: large alloc 1530273792 bytes == 0x9c650000 @ 0x7f9d1bb89680 0x7f9d1bbaa824 0x5b9a14 0x50b2ae 0x50cb1b 0x5a6f17 0x5f3010 0x56fd36 0x568d9a 0x5f5b33 0x56aadf 0x568d9a 0x68cdc7 0x67e161 0x67e1df 0x67e281 0x67e627 0x6b6e62 0x6b71ed 0x7f9d1b99e0b3 0x5f96de
+ /home/dat/pino/lib/python3.8/site-packages/jax/lib/xla_bridge.py:386: UserWarning: jax.host_count has been renamed to jax.process_count. This alias will eventually be removed; please update your code.
+   warnings.warn(
+ /home/dat/pino/lib/python3.8/site-packages/jax/lib/xla_bridge.py:373: UserWarning: jax.host_id has been renamed to jax.process_index. This alias will eventually be removed; please update your code.
+   warnings.warn(
+ Epoch ... (1/4): 0%| | 0/4 [00:00<?, ?it/s]
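
The two UserWarnings in this log come from JAX's multi-host API rename: `jax.host_count` and `jax.host_id` still work as aliases in the pinned jax==0.2.17, but the current names are `jax.process_count` and `jax.process_index`. A minimal sketch of the updated calls:

```python
import jax

# Deprecated aliases that trigger the warnings in the log above:
#   jax.host_count(), jax.host_id()
# Current names:
num_processes = jax.process_count()  # number of participating hosts
process_id = jax.process_index()     # this host's index, in [0, num_processes)
```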
wandb/run-20210716_222651-1lrzcta0/files/requirements.txt ADDED
@@ -0,0 +1,95 @@
+ absl-py==0.13.0
+ aiohttp==3.7.4.post0
+ astunparse==1.6.3
+ async-timeout==3.0.1
+ attrs==21.2.0
+ cachetools==4.2.2
+ certifi==2021.5.30
+ chardet==4.0.0
+ charset-normalizer==2.0.1
+ chex==0.0.8
+ click==8.0.1
+ configparser==5.0.2
+ cycler==0.10.0
+ datasets==1.9.1.dev0
+ dill==0.3.4
+ dm-tree==0.1.6
+ docker-pycreds==0.4.0
+ filelock==3.0.12
+ flatbuffers==1.12
+ flax==0.3.4
+ fsspec==2021.7.0
+ gast==0.4.0
+ gitdb==4.0.7
+ gitpython==3.1.18
+ google-auth-oauthlib==0.4.4
+ google-auth==1.32.1
+ google-pasta==0.2.0
+ grpcio==1.34.1
+ h5py==3.1.0
+ huggingface-hub==0.0.12
+ idna==3.2
+ install==1.3.4
+ jax==0.2.17
+ jaxlib==0.1.68
+ joblib==1.0.1
+ keras-nightly==2.5.0.dev2021032900
+ keras-preprocessing==1.1.2
+ kiwisolver==1.3.1
+ libtpu-nightly==0.1.dev20210615
+ markdown==3.3.4
+ matplotlib==3.4.2
+ msgpack==1.0.2
+ multidict==5.1.0
+ multiprocess==0.70.12.2
+ numpy==1.19.5
+ oauthlib==3.1.1
+ opt-einsum==3.3.0
+ optax==0.0.9
+ packaging==21.0
+ pandas==1.3.0
+ pathtools==0.1.2
+ pillow==8.3.1
+ pip==20.0.2
+ pkg-resources==0.0.0
+ promise==2.3
+ protobuf==3.17.3
+ psutil==5.8.0
+ pyarrow==4.0.1
+ pyasn1-modules==0.2.8
+ pyasn1==0.4.8
+ pyparsing==2.4.7
+ python-dateutil==2.8.1
+ pytz==2021.1
+ pyyaml==5.4.1
+ regex==2021.7.6
+ requests-oauthlib==1.3.0
+ requests==2.26.0
+ rsa==4.7.2
+ sacremoses==0.0.45
+ scipy==1.7.0
+ sentry-sdk==1.3.0
+ setuptools==44.0.0
+ shortuuid==1.0.1
+ six==1.15.0
+ smmap==4.0.0
+ subprocess32==3.5.4
+ tensorboard-data-server==0.6.1
+ tensorboard-plugin-wit==1.8.0
+ tensorboard==2.5.0
+ tensorflow-estimator==2.5.0
+ tensorflow==2.5.0
+ termcolor==1.1.0
+ tokenizers==0.10.3
+ toolz==0.11.1
+ torch==1.9.0
+ tqdm==4.61.2
+ transformers==4.9.0.dev0
+ typing-extensions==3.7.4.3
+ urllib3==1.26.6
+ wandb==0.10.33
+ werkzeug==2.0.1
+ wheel==0.36.2
+ wrapt==1.12.1
+ xxhash==2.0.2
+ yarl==1.6.3
wandb/run-20210716_222651-1lrzcta0/files/wandb-metadata.json ADDED
@@ -0,0 +1,45 @@
+ {
+     "os": "Linux-5.4.0-1043-gcp-x86_64-with-glibc2.29",
+     "python": "3.8.10",
+     "heartbeatAt": "2021-07-16T22:26:53.104485",
+     "startedAt": "2021-07-16T22:26:51.031817",
+     "docker": null,
+     "cpu_count": 96,
+     "cuda": null,
+     "args": [
+         "--push_to_hub",
+         "--output_dir=./",
+         "--model_type=big_bird",
+         "--config_name=./",
+         "--tokenizer_name=./",
+         "--max_seq_length=4096",
+         "--weight_decay=0.0095",
+         "--warmup_steps=10000",
+         "--overwrite_output_dir",
+         "--adam_beta1=0.9",
+         "--adam_beta2=0.98",
+         "--logging_steps=50",
+         "--eval_steps=10000",
+         "--num_train_epochs=4",
+         "--preprocessing_num_workers=96",
+         "--save_steps=15000",
+         "--learning_rate=3e-5",
+         "--per_device_train_batch_size=1",
+         "--per_device_eval_batch_size=1",
+         "--save_total_limit=50",
+         "--max_eval_samples=4000",
+         "--resume_from_checkpoint=./"
+     ],
+     "state": "running",
+     "program": "./run_mlm_flax_no_accum.py",
+     "codePath": "run_mlm_flax_no_accum.py",
+     "git": {
+         "remote": "https://huggingface.co/flax-community/pino-roberta-base",
+         "commit": "def9a456105f36b517155343f42ff643df2d20ce"
+     },
+     "email": null,
+     "root": "/home/dat/pino-roberta-base",
+     "host": "t1v-n-f5c06ea1-w-0",
+     "username": "dat",
+     "executable": "/home/dat/pino/bin/python"
+ }
wandb/run-20210716_222651-1lrzcta0/files/wandb-summary.json ADDED
@@ -0,0 +1 @@
+ {}
wandb/run-20210716_222651-1lrzcta0/logs/debug-internal.log ADDED
@@ -0,0 +1,111 @@
+ 2021-07-16 22:26:51,733 INFO MainThread:797496 [internal.py:wandb_internal():88] W&B internal server running at pid: 797496, started at: 2021-07-16 22:26:51.733071
+ 2021-07-16 22:26:51,735 DEBUG HandlerThread:797496 [handler.py:handle_request():124] handle_request: check_version
+ 2021-07-16 22:26:51,735 INFO WriterThread:797496 [datastore.py:open_for_write():80] open: /home/dat/pino-roberta-base/wandb/run-20210716_222651-1lrzcta0/run-1lrzcta0.wandb
+ 2021-07-16 22:26:51,736 DEBUG SenderThread:797496 [sender.py:send():179] send: header
+ 2021-07-16 22:26:51,736 DEBUG SenderThread:797496 [sender.py:send_request():193] send_request: check_version
+ 2021-07-16 22:26:51,775 DEBUG SenderThread:797496 [sender.py:send():179] send: run
+ 2021-07-16 22:26:51,967 INFO SenderThread:797496 [dir_watcher.py:__init__():168] watching files in: /home/dat/pino-roberta-base/wandb/run-20210716_222651-1lrzcta0/files
+ 2021-07-16 22:26:51,968 INFO SenderThread:797496 [sender.py:_start_run_threads():716] run started: 1lrzcta0 with start time 1626474411
+ 2021-07-16 22:26:51,968 DEBUG SenderThread:797496 [sender.py:send():179] send: summary
+ 2021-07-16 22:26:51,968 DEBUG HandlerThread:797496 [handler.py:handle_request():124] handle_request: run_start
+ 2021-07-16 22:26:51,969 INFO SenderThread:797496 [sender.py:_save_file():841] saving file wandb-summary.json with policy end
+ 2021-07-16 22:26:52,972 INFO Thread-8 :797496 [dir_watcher.py:_on_file_created():216] file/dir created: /home/dat/pino-roberta-base/wandb/run-20210716_222651-1lrzcta0/files/wandb-summary.json
+ 2021-07-16 22:26:53,104 DEBUG HandlerThread:797496 [meta.py:__init__():39] meta init
+ 2021-07-16 22:26:53,104 DEBUG HandlerThread:797496 [meta.py:__init__():53] meta init done
+ 2021-07-16 22:26:53,104 DEBUG HandlerThread:797496 [meta.py:probe():210] probe
+ 2021-07-16 22:26:53,105 DEBUG HandlerThread:797496 [meta.py:_setup_git():200] setup git
+ 2021-07-16 22:26:53,134 DEBUG HandlerThread:797496 [meta.py:_setup_git():207] setup git done
+ 2021-07-16 22:26:53,135 DEBUG HandlerThread:797496 [meta.py:_save_pip():57] save pip
+ 2021-07-16 22:26:53,135 DEBUG HandlerThread:797496 [meta.py:_save_pip():71] save pip done
+ 2021-07-16 22:26:53,135 DEBUG HandlerThread:797496 [meta.py:probe():252] probe done
+ 2021-07-16 22:26:53,138 DEBUG SenderThread:797496 [sender.py:send():179] send: files
+ 2021-07-16 22:26:53,138 INFO SenderThread:797496 [sender.py:_save_file():841] saving file wandb-metadata.json with policy now
+ 2021-07-16 22:26:53,146 DEBUG HandlerThread:797496 [handler.py:handle_request():124] handle_request: stop_status
+ 2021-07-16 22:26:53,147 DEBUG SenderThread:797496 [sender.py:send_request():193] send_request: stop_status
+ 2021-07-16 22:26:53,278 DEBUG SenderThread:797496 [sender.py:send():179] send: config
+ 2021-07-16 22:26:53,280 DEBUG SenderThread:797496 [sender.py:send():179] send: config
+ 2021-07-16 22:26:53,280 DEBUG SenderThread:797496 [sender.py:send():179] send: config
+ 2021-07-16 22:26:53,574 INFO Thread-11 :797496 [upload_job.py:push():137] Uploaded file /tmp/tmpq24rtm31wandb/2zocqi83-wandb-metadata.json
+ 2021-07-16 22:26:53,971 INFO Thread-8 :797496 [dir_watcher.py:_on_file_created():216] file/dir created: /home/dat/pino-roberta-base/wandb/run-20210716_222651-1lrzcta0/files/requirements.txt
+ 2021-07-16 22:26:53,972 INFO Thread-8 :797496 [dir_watcher.py:_on_file_created():216] file/dir created: /home/dat/pino-roberta-base/wandb/run-20210716_222651-1lrzcta0/files/wandb-metadata.json
+ 2021-07-16 22:26:53,972 INFO Thread-8 :797496 [dir_watcher.py:_on_file_created():216] file/dir created: /home/dat/pino-roberta-base/wandb/run-20210716_222651-1lrzcta0/files/output.log
+ 2021-07-16 22:27:07,978 INFO Thread-8 :797496 [dir_watcher.py:_on_file_modified():229] file/dir modified: /home/dat/pino-roberta-base/wandb/run-20210716_222651-1lrzcta0/files/output.log
+ 2021-07-16 22:27:08,331 DEBUG HandlerThread:797496 [handler.py:handle_request():124] handle_request: stop_status
+ 2021-07-16 22:27:08,331 DEBUG SenderThread:797496 [sender.py:send_request():193] send_request: stop_status
+ 2021-07-16 22:27:09,979 INFO Thread-8 :797496 [dir_watcher.py:_on_file_modified():229] file/dir modified: /home/dat/pino-roberta-base/wandb/run-20210716_222651-1lrzcta0/files/output.log
+ 2021-07-16 22:27:11,980 INFO Thread-8 :797496 [dir_watcher.py:_on_file_modified():229] file/dir modified: /home/dat/pino-roberta-base/wandb/run-20210716_222651-1lrzcta0/files/output.log
+ 2021-07-16 22:27:21,188 DEBUG SenderThread:797496 [sender.py:send():179] send: stats
+ 2021-07-16 22:27:22,985 INFO Thread-8 :797496 [dir_watcher.py:_on_file_modified():229] file/dir modified: /home/dat/pino-roberta-base/wandb/run-20210716_222651-1lrzcta0/files/config.yaml
+ 2021-07-16 22:27:23,496 DEBUG HandlerThread:797496 [handler.py:handle_request():124] handle_request: stop_status
+ 2021-07-16 22:27:23,496 DEBUG SenderThread:797496 [sender.py:send_request():193] send_request: stop_status
+ 2021-07-16 22:27:38,629 DEBUG HandlerThread:797496 [handler.py:handle_request():124] handle_request: stop_status
+ 2021-07-16 22:27:38,629 DEBUG SenderThread:797496 [sender.py:send_request():193] send_request: stop_status
+ 2021-07-16 22:27:51,269 DEBUG SenderThread:797496 [sender.py:send():179] send: stats
+ 2021-07-16 22:27:53,761 DEBUG HandlerThread:797496 [handler.py:handle_request():124] handle_request: stop_status
+ 2021-07-16 22:27:53,762 DEBUG SenderThread:797496 [sender.py:send_request():193] send_request: stop_status
+ 2021-07-16 22:28:08,896 DEBUG HandlerThread:797496 [handler.py:handle_request():124] handle_request: stop_status
+ 2021-07-16 22:28:08,896 DEBUG SenderThread:797496 [sender.py:send_request():193] send_request: stop_status
+ 2021-07-16 22:28:21,343 DEBUG SenderThread:797496 [sender.py:send():179] send: stats
+ 2021-07-16 22:28:24,028 DEBUG HandlerThread:797496 [handler.py:handle_request():124] handle_request: stop_status
+ 2021-07-16 22:28:24,029 DEBUG SenderThread:797496 [sender.py:send_request():193] send_request: stop_status
+ 2021-07-16 22:28:39,163 DEBUG HandlerThread:797496 [handler.py:handle_request():124] handle_request: stop_status
+ 2021-07-16 22:28:39,163 DEBUG SenderThread:797496 [sender.py:send_request():193] send_request: stop_status
+ 2021-07-16 22:28:51,416 DEBUG SenderThread:797496 [sender.py:send():179] send: stats
+ 2021-07-16 22:28:54,295 DEBUG HandlerThread:797496 [handler.py:handle_request():124] handle_request: stop_status
+ 2021-07-16 22:28:54,295 DEBUG SenderThread:797496 [sender.py:send_request():193] send_request: stop_status
+ 2021-07-16 22:29:09,427 DEBUG HandlerThread:797496 [handler.py:handle_request():124] handle_request: stop_status
+ 2021-07-16 22:29:09,428 DEBUG SenderThread:797496 [sender.py:send_request():193] send_request: stop_status
+ 2021-07-16 22:29:21,488 DEBUG SenderThread:797496 [sender.py:send():179] send: stats
+ 2021-07-16 22:29:24,558 DEBUG HandlerThread:797496 [handler.py:handle_request():124] handle_request: stop_status
+ 2021-07-16 22:29:24,559 DEBUG SenderThread:797496 [sender.py:send_request():193] send_request: stop_status
+ 2021-07-16 22:29:39,688 DEBUG HandlerThread:797496 [handler.py:handle_request():124] handle_request: stop_status
+ 2021-07-16 22:29:39,689 DEBUG SenderThread:797496 [sender.py:send_request():193] send_request: stop_status
+ 2021-07-16 22:29:51,560 DEBUG SenderThread:797496 [sender.py:send():179] send: stats
+ 2021-07-16 22:29:54,818 DEBUG HandlerThread:797496 [handler.py:handle_request():124] handle_request: stop_status
+ 2021-07-16 22:29:54,818 DEBUG SenderThread:797496 [sender.py:send_request():193] send_request: stop_status
+ 2021-07-16 22:30:09,948 DEBUG HandlerThread:797496 [handler.py:handle_request():124] handle_request: stop_status
+ 2021-07-16 22:30:09,948 DEBUG SenderThread:797496 [sender.py:send_request():193] send_request: stop_status
+ 2021-07-16 22:30:21,629 DEBUG SenderThread:797496 [sender.py:send():179] send: stats
+ 2021-07-16 22:30:25,078 DEBUG HandlerThread:797496 [handler.py:handle_request():124] handle_request: stop_status
+ 2021-07-16 22:30:25,078 DEBUG SenderThread:797496 [sender.py:send_request():193] send_request: stop_status
+ 2021-07-16 22:30:40,210 DEBUG HandlerThread:797496 [handler.py:handle_request():124] handle_request: stop_status
+ 2021-07-16 22:30:40,211 DEBUG SenderThread:797496 [sender.py:send_request():193] send_request: stop_status
+ 2021-07-16 22:30:51,688 DEBUG SenderThread:797496 [sender.py:send():179] send: stats
+ 2021-07-16 22:30:55,351 DEBUG HandlerThread:797496 [handler.py:handle_request():124] handle_request: stop_status
+ 2021-07-16 22:30:55,351 DEBUG SenderThread:797496 [sender.py:send_request():193] send_request: stop_status
+ 2021-07-16 22:31:10,485 DEBUG HandlerThread:797496 [handler.py:handle_request():124] handle_request: stop_status
+ 2021-07-16 22:31:10,485 DEBUG SenderThread:797496 [sender.py:send_request():193] send_request: stop_status
+ 2021-07-16 22:31:21,758 DEBUG SenderThread:797496 [sender.py:send():179] send: stats
+ 2021-07-16 22:31:25,617 DEBUG HandlerThread:797496 [handler.py:handle_request():124] handle_request: stop_status
+ 2021-07-16 22:31:25,617 DEBUG SenderThread:797496 [sender.py:send_request():193] send_request: stop_status
+ 2021-07-16 22:31:40,750 DEBUG HandlerThread:797496 [handler.py:handle_request():124] handle_request: stop_status
+ 2021-07-16 22:31:40,750 DEBUG SenderThread:797496 [sender.py:send_request():193] send_request: stop_status
+ 2021-07-16 22:31:51,829 DEBUG SenderThread:797496 [sender.py:send():179] send: stats
+ 2021-07-16 22:31:55,880 DEBUG HandlerThread:797496 [handler.py:handle_request():124] handle_request: stop_status
+ 2021-07-16 22:31:55,881 DEBUG SenderThread:797496 [sender.py:send_request():193] send_request: stop_status
+ 2021-07-16 22:32:11,013 DEBUG HandlerThread:797496 [handler.py:handle_request():124] handle_request: stop_status
+ 2021-07-16 22:32:11,014 DEBUG SenderThread:797496 [sender.py:send_request():193] send_request: stop_status
+ 2021-07-16 22:32:21,898 DEBUG SenderThread:797496 [sender.py:send():179] send: stats
+ 2021-07-16 22:32:26,146 DEBUG HandlerThread:797496 [handler.py:handle_request():124] handle_request: stop_status
+ 2021-07-16 22:32:26,147 DEBUG SenderThread:797496 [sender.py:send_request():193] send_request: stop_status
+ 2021-07-16 22:32:39,632 WARNING MainThread:797496 [internal.py:wandb_internal():147] Internal process interrupt: 1
+ 2021-07-16 22:32:40,112 INFO Thread-8 :797496 [dir_watcher.py:_on_file_modified():229] file/dir modified: /home/dat/pino-roberta-base/wandb/run-20210716_222651-1lrzcta0/files/output.log
+ 2021-07-16 22:32:40,157 WARNING MainThread:797496 [internal.py:wandb_internal():147] Internal process interrupt: 2
+ 2021-07-16 22:32:40,157 ERROR MainThread:797496 [internal.py:wandb_internal():150] Internal process interrupted.
+ 2021-07-16 22:32:40,814 INFO WriterThread:797496 [datastore.py:close():288] close: /home/dat/pino-roberta-base/wandb/run-20210716_222651-1lrzcta0/run-1lrzcta0.wandb
+ 2021-07-16 22:32:40,815 INFO SenderThread:797496 [sender.py:finish():945] shutting down sender
+ 2021-07-16 22:32:40,815 INFO SenderThread:797496 [dir_watcher.py:finish():282] shutting down directory watcher
+ 2021-07-16 22:32:40,815 INFO HandlerThread:797496 [handler.py:finish():638] shutting down handler
+ 2021-07-16 22:32:41,113 INFO SenderThread:797496 [dir_watcher.py:finish():312] scan: /home/dat/pino-roberta-base/wandb/run-20210716_222651-1lrzcta0/files
+ 2021-07-16 22:32:41,113 INFO SenderThread:797496 [dir_watcher.py:finish():318] scan save: /home/dat/pino-roberta-base/wandb/run-20210716_222651-1lrzcta0/files/requirements.txt requirements.txt
+ 2021-07-16 22:32:41,114 INFO SenderThread:797496 [dir_watcher.py:finish():318] scan save: /home/dat/pino-roberta-base/wandb/run-20210716_222651-1lrzcta0/files/output.log output.log
+ 2021-07-16 22:32:41,114 INFO SenderThread:797496 [dir_watcher.py:finish():318] scan save: /home/dat/pino-roberta-base/wandb/run-20210716_222651-1lrzcta0/files/wandb-metadata.json wandb-metadata.json
+ 2021-07-16 22:32:41,117 INFO SenderThread:797496 [dir_watcher.py:finish():318] scan save: /home/dat/pino-roberta-base/wandb/run-20210716_222651-1lrzcta0/files/config.yaml config.yaml
+ 2021-07-16 22:32:41,118 INFO SenderThread:797496 [dir_watcher.py:finish():318] scan save: /home/dat/pino-roberta-base/wandb/run-20210716_222651-1lrzcta0/files/wandb-summary.json wandb-summary.json
+ 2021-07-16 22:32:41,121 INFO SenderThread:797496 [file_pusher.py:finish():177] shutting down file pusher
+ 2021-07-16 22:32:41,121 INFO SenderThread:797496 [file_pusher.py:join():182] waiting for file pusher
+ 2021-07-16 22:32:41,606 INFO Thread-14 :797496 [upload_job.py:push():137] Uploaded file /home/dat/pino-roberta-base/wandb/run-20210716_222651-1lrzcta0/files/config.yaml
+ 2021-07-16 22:32:41,611 INFO Thread-13 :797496 [upload_job.py:push():137] Uploaded file /home/dat/pino-roberta-base/wandb/run-20210716_222651-1lrzcta0/files/output.log
+ 2021-07-16 22:32:41,674 INFO Thread-12 :797496 [upload_job.py:push():137] Uploaded file /home/dat/pino-roberta-base/wandb/run-20210716_222651-1lrzcta0/files/requirements.txt
+ 2021-07-16 22:32:41,777 INFO Thread-15 :797496 [upload_job.py:push():137] Uploaded file /home/dat/pino-roberta-base/wandb/run-20210716_222651-1lrzcta0/files/wandb-summary.json
+ 2021-07-16 22:32:42,440 INFO MainThread:797496 [internal.py:handle_exit():78] Internal process exited
wandb/run-20210716_222651-1lrzcta0/logs/debug.log ADDED
@@ -0,0 +1,28 @@
+ 2021-07-16 22:26:51,033 INFO MainThread:796231 [wandb_setup.py:_flush():69] setting env: {}
+ 2021-07-16 22:26:51,033 INFO MainThread:796231 [wandb_setup.py:_flush():69] setting login settings: {}
+ 2021-07-16 22:26:51,033 INFO MainThread:796231 [wandb_init.py:_log_setup():337] Logging user logs to /home/dat/pino-roberta-base/wandb/run-20210716_222651-1lrzcta0/logs/debug.log
+ 2021-07-16 22:26:51,033 INFO MainThread:796231 [wandb_init.py:_log_setup():338] Logging internal logs to /home/dat/pino-roberta-base/wandb/run-20210716_222651-1lrzcta0/logs/debug-internal.log
+ 2021-07-16 22:26:51,033 INFO MainThread:796231 [wandb_init.py:init():370] calling init triggers
+ 2021-07-16 22:26:51,034 INFO MainThread:796231 [wandb_init.py:init():375] wandb.init called with sweep_config: {}
+ config: {}
+ 2021-07-16 22:26:51,034 INFO MainThread:796231 [wandb_init.py:init():419] starting backend
+ 2021-07-16 22:26:51,034 INFO MainThread:796231 [backend.py:_multiprocessing_setup():70] multiprocessing start_methods=fork,spawn,forkserver, using: spawn
+ 2021-07-16 22:26:51,081 INFO MainThread:796231 [backend.py:ensure_launched():135] starting backend process...
+ 2021-07-16 22:26:51,130 INFO MainThread:796231 [backend.py:ensure_launched():139] started backend process with pid: 797496
+ 2021-07-16 22:26:51,132 INFO MainThread:796231 [wandb_init.py:init():424] backend started and connected
+ 2021-07-16 22:26:51,135 INFO MainThread:796231 [wandb_init.py:init():472] updated telemetry
+ 2021-07-16 22:26:51,136 INFO MainThread:796231 [wandb_init.py:init():491] communicating current version
+ 2021-07-16 22:26:51,773 INFO MainThread:796231 [wandb_init.py:init():496] got version response upgrade_message: "wandb version 0.11.0 is available!  To upgrade, please run:\n $ pip install wandb --upgrade"
+
+ 2021-07-16 22:26:51,773 INFO MainThread:796231 [wandb_init.py:init():504] communicating run to backend with 30 second timeout
+ 2021-07-16 22:26:51,967 INFO MainThread:796231 [wandb_init.py:init():529] starting run threads in backend
+ 2021-07-16 22:26:53,141 INFO MainThread:796231 [wandb_run.py:_console_start():1623] atexit reg
+ 2021-07-16 22:26:53,142 INFO MainThread:796231 [wandb_run.py:_redirect():1497] redirect: SettingsConsole.REDIRECT
+ 2021-07-16 22:26:53,142 INFO MainThread:796231 [wandb_run.py:_redirect():1502] Redirecting console.
+ 2021-07-16 22:26:53,144 INFO MainThread:796231 [wandb_run.py:_redirect():1558] Redirects installed.
+ 2021-07-16 22:26:53,145 INFO MainThread:796231 [wandb_init.py:init():554] run started, returning control to user process
+ 2021-07-16 22:26:53,151 INFO MainThread:796231 [wandb_run.py:_config_callback():872] config_cb None None {'output_dir': './', 'overwrite_output_dir': True, 'do_train': False, 'do_eval': False, 'do_predict': False, 'evaluation_strategy': 'IntervalStrategy.NO', 'prediction_loss_only': False, 'per_device_train_batch_size': 1, 'per_device_eval_batch_size': 1, 'per_gpu_train_batch_size': None, 'per_gpu_eval_batch_size': None, 'gradient_accumulation_steps': 1, 'eval_accumulation_steps': None, 'learning_rate': 3e-05, 'weight_decay': 0.0095, 'adam_beta1': 0.9, 'adam_beta2': 0.98, 'adam_epsilon': 1e-08, 'max_grad_norm': 1.0, 'num_train_epochs': 4.0, 'max_steps': -1, 'lr_scheduler_type': 'SchedulerType.LINEAR', 'warmup_ratio': 0.0, 'warmup_steps': 10000, 'log_level': -1, 'log_level_replica': -1, 'log_on_each_node': True, 'logging_dir': './runs/Jul16_22-26-42_t1v-n-f5c06ea1-w-0', 'logging_strategy': 'IntervalStrategy.STEPS', 'logging_first_step': False, 'logging_steps': 50, 'save_strategy': 'IntervalStrategy.STEPS', 'save_steps': 15000, 'save_total_limit': 50, 'save_on_each_node': False, 'no_cuda': False, 'seed': 42, 'fp16': False, 'fp16_opt_level': 'O1', 'fp16_backend': 'auto', 'fp16_full_eval': False, 'local_rank': -1, 'tpu_num_cores': None, 'tpu_metrics_debug': False, 'debug': [], 'dataloader_drop_last': False, 'eval_steps': 10000, 'dataloader_num_workers': 0, 'past_index': -1, 'run_name': './', 'disable_tqdm': False, 'remove_unused_columns': True, 'label_names': None, 'load_best_model_at_end': False, 'metric_for_best_model': None, 'greater_is_better': None, 'ignore_data_skip': False, 'sharded_ddp': [], 'deepspeed': None, 'label_smoothing_factor': 0.0, 'adafactor': False, 'group_by_length': False, 'length_column_name': 'length', 'report_to': ['tensorboard', 'wandb'], 'ddp_find_unused_parameters': None, 'dataloader_pin_memory': True, 'skip_memory_metrics': True, 'use_legacy_prediction_loop': False, 'push_to_hub': True, 'resume_from_checkpoint': './', 'push_to_hub_model_id': '', 'push_to_hub_organization': None, 'push_to_hub_token': None, 'mp_parameters': '', '_n_gpu': 0, '__cached__setup_devices': 'cpu'}
+ 2021-07-16 22:26:53,153 INFO MainThread:796231 [wandb_run.py:_config_callback():872] config_cb None None {'model_name_or_path': None, 'model_type': 'big_bird', 'config_name': './', 'tokenizer_name': './', 'cache_dir': None, 'use_fast_tokenizer': True, 'dtype': 'float32'}
+ 2021-07-16 22:26:53,154 INFO MainThread:796231 [wandb_run.py:_config_callback():872] config_cb None None {'dataset_name': None, 'dataset_config_name': None, 'train_ref_file': None, 'validation_ref_file': None, 'overwrite_cache': False, 'validation_split_percentage': 5, 'max_seq_length': 4096, 'preprocessing_num_workers': 96, 'mlm_probability': 0.15, 'pad_to_max_length': False, 'line_by_line': False, 'max_eval_samples': 4000}
+ 2021-07-16 22:32:39,717 INFO MainThread:796231 [wandb_run.py:_atexit_cleanup():1593] got exitcode: 255
+ 2021-07-16 22:32:39,718 INFO MainThread:796231 [wandb_run.py:_restore():1565] restore
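
This debug.log traces the usual wandb client lifecycle: `wandb.init()` spawns the backend process (pid 797496 here), three `config_cb` callbacks push the training, model, and data arguments into the run config, and the run is torn down with exit code 255 after the interrupt recorded in debug-internal.log. A minimal sketch of the client-side calls behind this sequence (project name and logged values are illustrative, not taken from the run):

```python
import wandb

# wandb.init() spawns the internal backend process and creates the
# run directory wandb/run-<timestamp>-<id>/ seen throughout this commit.
run = wandb.init(project="pino-roberta-base")  # project name is an assumption

# Config updates appear in debug.log as config_cb entries and are
# persisted to files/config.yaml.
run.config.update({"learning_rate": 3e-5, "warmup_steps": 10000})

run.log({"train_loss": 2.05})  # streamed to the backend by the SenderThread
run.finish()                   # flushes pending files and stops the backend
```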
wandb/run-20210716_222651-1lrzcta0/run-1lrzcta0.wandb ADDED
Binary file (6.6 kB)
wandb/run-20210716_223350-8eukt20m/files/config.yaml ADDED
@@ -0,0 +1,308 @@
+ wandb_version: 1
+
+ __cached__setup_devices:
+   desc: null
+   value: cpu
+ _n_gpu:
+   desc: null
+   value: 0
+ _wandb:
+   desc: null
+   value:
+     cli_version: 0.10.33
+     framework: huggingface
+     huggingface_version: 4.9.0.dev0
+     is_jupyter_run: false
+     is_kaggle_kernel: false
+     python_version: 3.8.10
+     t:
+       1:
+       - 1
+       - 3
+       - 11
+       4: 3.8.10
+       5: 0.10.33
+       6: 4.9.0.dev0
+       8:
+       - 5
+ adafactor:
+   desc: null
+   value: false
+ adam_beta1:
+   desc: null
+   value: 0.9
+ adam_beta2:
+   desc: null
+   value: 0.98
+ adam_epsilon:
+   desc: null
+   value: 1.0e-08
+ cache_dir:
+   desc: null
+   value: null
+ config_name:
+   desc: null
+   value: ./
+ dataloader_drop_last:
+   desc: null
+   value: false
+ dataloader_num_workers:
+   desc: null
+   value: 0
+ dataloader_pin_memory:
+   desc: null
+   value: true
+ dataset_config_name:
+   desc: null
+   value: null
+ dataset_name:
+   desc: null
+   value: null
+ ddp_find_unused_parameters:
+   desc: null
+   value: null
+ debug:
+   desc: null
+   value: []
+ deepspeed:
+   desc: null
+   value: null
+ disable_tqdm:
+   desc: null
+   value: false
+ do_eval:
+   desc: null
+   value: false
+ do_predict:
+   desc: null
+   value: false
+ do_train:
+   desc: null
+   value: false
+ dtype:
+   desc: null
+   value: float32
+ eval_accumulation_steps:
+   desc: null
+   value: null
+ eval_steps:
+   desc: null
+   value: 10000
+ evaluation_strategy:
+   desc: null
+   value: IntervalStrategy.NO
+ fp16:
+   desc: null
+   value: false
+ fp16_backend:
+   desc: null
+   value: auto
+ fp16_full_eval:
+   desc: null
+   value: false
+ fp16_opt_level:
+   desc: null
+   value: O1
+ gradient_accumulation_steps:
+   desc: null
+   value: 1
+ greater_is_better:
+   desc: null
+   value: null
+ group_by_length:
+   desc: null
+   value: false
+ ignore_data_skip:
+   desc: null
+   value: false
+ label_names:
+   desc: null
+   value: null
+ label_smoothing_factor:
+   desc: null
+   value: 0.0
+ learning_rate:
+   desc: null
+   value: 3.0e-05
+ length_column_name:
+   desc: null
+   value: length
+ line_by_line:
+   desc: null
+   value: false
+ load_best_model_at_end:
+   desc: null
+   value: false
+ local_rank:
+   desc: null
+   value: -1
+ log_level:
+   desc: null
+   value: -1
+ log_level_replica:
+   desc: null
+   value: -1
+ log_on_each_node:
+   desc: null
+   value: true
+ logging_dir:
+   desc: null
+   value: ./runs/Jul16_22-33-42_t1v-n-f5c06ea1-w-0
+ logging_first_step:
+   desc: null
+   value: false
+ logging_steps:
+   desc: null
+   value: 50
+ logging_strategy:
+   desc: null
+   value: IntervalStrategy.STEPS
+ lr_scheduler_type:
+   desc: null
+   value: SchedulerType.LINEAR
+ max_eval_samples:
+   desc: null
+   value: 4000
+ max_grad_norm:
+   desc: null
+   value: 1.0
+ max_seq_length:
+   desc: null
+   value: 4096
+ max_steps:
+   desc: null
+   value: -1
+ metric_for_best_model:
+   desc: null
+   value: null
+ mlm_probability:
+   desc: null
+   value: 0.15
+ model_name_or_path:
+   desc: null
+   value: null
+ model_type:
+   desc: null
+   value: big_bird
+ mp_parameters:
+   desc: null
+   value: ''
+ no_cuda:
+   desc: null
+   value: false
+ num_train_epochs:
+   desc: null
+   value: 4.0
+ output_dir:
+   desc: null
+   value: ./
+ overwrite_cache:
+   desc: null
+   value: false
+ overwrite_output_dir:
+   desc: null
+   value: true
+ pad_to_max_length:
+   desc: null
+   value: false
+ past_index:
+   desc: null
+   value: -1
+ per_device_eval_batch_size:
+   desc: null
+   value: 1
+ per_device_train_batch_size:
+   desc: null
+   value: 1
+ per_gpu_eval_batch_size:
+   desc: null
+   value: null
+ per_gpu_train_batch_size:
+   desc: null
+   value: null
+ prediction_loss_only:
+   desc: null
+   value: false
+ preprocessing_num_workers:
+   desc: null
+   value: 96
+ push_to_hub:
+   desc: null
+   value: true
+ push_to_hub_model_id:
+   desc: null
+   value: ''
+ push_to_hub_organization:
+   desc: null
+   value: null
+ push_to_hub_token:
+   desc: null
+   value: null
+ remove_unused_columns:
+   desc: null
+   value: true
+ report_to:
+   desc: null
+   value:
+   - tensorboard
+   - wandb
+ resume_from_checkpoint:
+   desc: null
+   value: ./
+ run_name:
+   desc: null
+   value: ./
+ save_on_each_node:
+   desc: null
+   value: false
+ save_steps:
+   desc: null
+   value: 15000
+ save_strategy:
+   desc: null
+   value: IntervalStrategy.STEPS
+ save_total_limit:
+   desc: null
+   value: 50
+ seed:
+   desc: null
+   value: 42
+ sharded_ddp:
+   desc: null
+   value: []
+ skip_memory_metrics:
+   desc: null
+   value: true
+ tokenizer_name:
+   desc: null
+   value: ./
+ tpu_metrics_debug:
+   desc: null
+   value: false
+ tpu_num_cores:
+   desc: null
+   value: null
+ train_ref_file:
+   desc: null
+   value: null
+ use_fast_tokenizer:
+   desc: null
+   value: true
+ use_legacy_prediction_loop:
+   desc: null
+   value: false
+ validation_ref_file:
+   desc: null
+   value: null
+ validation_split_percentage:
+   desc: null
+   value: 5
+ warmup_ratio:
+   desc: null
+   value: 0.0
+ warmup_steps:
+   desc: null
+   value: 10000
+ weight_decay:
+   desc: null
+   value: 0.0095
wandb/run-20210716_223350-8eukt20m/files/output.log ADDED
@@ -0,0 +1,1646 @@
1
+ [22:34:05] - INFO - absl - Restoring checkpoint from ./checkpoint_330000
2
+ tcmalloc: large alloc 1530273792 bytes == 0x9c4ae000 @ 0x7f2f3656a680 0x7f2f3658b824 0x5b9a14 0x50b2ae 0x50cb1b 0x5a6f17 0x5f3010 0x56fd36 0x568d9a 0x5f5b33 0x56aadf 0x568d9a 0x68cdc7 0x67e161 0x67e1df 0x67e281 0x67e627 0x6b6e62 0x6b71ed 0x7f2f3637f0b3 0x5f96de
3
+ /home/dat/pino/lib/python3.8/site-packages/jax/lib/xla_bridge.py:386: UserWarning: jax.host_count has been renamed to jax.process_count. This alias will eventually be removed; please update your code.
4
+ warnings.warn(
5
+ /home/dat/pino/lib/python3.8/site-packages/jax/lib/xla_bridge.py:373: UserWarning: jax.host_id has been renamed to jax.process_index. This alias will eventually be removed; please update your code.
6
+ warnings.warn(
7
+ Epoch ... (1/4): 0%| | 0/4 [00:00<?, ?it/s]
8
+ Training...: 0it [00:00, ?it/s]
9
+
10
+
11
+
12
+
13
+
14
+
15
+
16
+
17
+ Training...: 49it [04:31, 3.54it/s]
18
+
19
+
20
+
21
+
22
+
23
+
24
+
25
+
26
+ Training...: 100it [04:58, 2.04s/it]
27
+
28
+
29
+
30
+
31
+
32
+
33
+
34
+
35
+ Training...: 150it [05:18, 2.15s/it]
36
+
37
+
38
+
39
+
40
+
41
+
42
+
43
+
44
+ Training...: 202it [05:38, 1.19s/it]
45
+
46
+
47
+
48
+
49
+
50
+
51
+
52
+ Training...: 249it [05:52, 3.26it/s]
53
+
54
+
55
+
56
+
57
+
58
+
59
+
60
+
61
+ Training...: 299it [06:12, 3.37it/s]
62
+
63
+
64
+
65
+
66
+
67
+
68
+
69
+
70
+ Training...: 349it [06:32, 3.95it/s]
71
+
72
+
73
+
74
+
75
+
76
+
77
+
78
+ Training...: 399it [06:52, 3.88it/s]
79
+
80
+
81
+
82
+
83
+
84
+
85
+
86
+
87
+ Training...: 449it [07:13, 3.26it/s]
88
+
89
+
90
+
91
+
92
+
93
+
94
+
95
+
96
+ Training...: 500it [07:39, 2.27s/it]
97
+
98
+
99
+
100
+
101
+
102
+
103
+
104
+
105
+ Training...: 551it [08:00, 1.69s/it]
106
+
107
+
108
+
109
+
110
+
111
+
112
+
113
+
114
+ Training...: 603it [08:21, 1.02it/s]
115
+
116
+
117
+
118
+
119
+
120
+
121
+
122
+
123
+ Training...: 653it [08:41, 1.04it/s]
124
+
125
+
126
+
127
+
128
+
129
+
130
+
131
+ Training...: 699it [08:54, 3.99it/s]
132
+
133
+
134
+
135
+
136
+
137
+
138
+
139
+ Training...: 749it [09:13, 3.65it/s]
140
+
141
+
142
+
143
+
144
+
145
+
146
+
147
+ Training...: 799it [09:34, 3.86it/s]
148
+
149
+
150
+
151
+
152
+
153
+
154
+
155
+ Training...: 849it [09:54, 4.29it/s]
156
+
157
+
158
+
159
+
160
+
161
+
162
+
163
+ Training...: 899it [10:14, 4.42it/s]
164
+
165
+
166
+
167
+
168
+
169
+
170
+
171
+
172
+ Training...: 950it [10:41, 2.24s/it]
173
+
174
+
175
+
176
+
177
+
178
+
179
+
180
+
181
+ Training...: 1001it [11:02, 1.78s/it]
182
+
183
+
184
+
185
+
186
+
187
+
188
+
189
+ Training...: 1049it [11:15, 3.52it/s]
190
+
191
+
192
+
193
+
194
+
195
+
196
+
197
+ Training...: 1099it [11:34, 3.59it/s]
198
+
199
+
200
+
201
+
202
+
203
+
204
+
205
+ Training...: 1149it [11:55, 3.88it/s]
206
+
207
+
208
+
209
+
210
+
211
+
212
+
213
+ Training...: 1199it [12:16, 3.57it/s]
214
+
215
+
216
+
217
+
218
+
219
+
220
+
221
+ Training...: 1249it [12:35, 3.90it/s]
222
+
223
+
224
+
225
+
226
+
227
+
228
+
229
+
230
+ Training...: 1300it [13:03, 2.49s/it]
231
+
232
+
233
+
234
+
235
+
236
+
237
+
238
+
239
+ Training...: 1350it [13:23, 2.47s/it]
240
+
241
+
242
+
243
+
244
+
245
+
246
+ Training...: 1399it [13:36, 3.15it/s]
247
+
248
+
249
+
250
+
251
+
252
+
253
+
254
+
255
+ Training...: 1449it [13:57, 3.80it/s]
256
+
257
+
258
+
259
+
260
+
261
+
262
+
263
+
264
+ Training...: 1499it [14:18, 3.49it/s]
265
+
266
+
267
+
268
+
269
+
270
+
271
+
272
+ Training...: 1549it [14:37, 4.04it/s]
273
+
274
+
275
+
276
+
277
+
278
+
279
+
280
+ Training...: 1599it [14:58, 3.46it/s]
281
+
282
+
283
+
284
+
285
+
286
+
287
+
288
+ Training...: 1649it [15:17, 3.58it/s]
289
+
290
+
291
+
292
+
293
+
294
+
295
+
296
+
297
+ Training...: 1700it [15:45, 2.38s/it]
298
+
299
+
300
+
301
+
302
+
303
+
304
+
305
+
306
+ Training...: 1750it [16:05, 2.42s/it]
307
+
308
+
309
+
310
+
311
+
312
+
313
+
314
+ Training...: 1802it [16:26, 1.40s/it]
315
+
316
+
317
+
318
+
319
+
320
+
321
+ Training...: 1849it [16:39, 3.88it/s]
322
+
323
+
324
+
325
+
326
+
327
+
328
+
329
+ Training...: 1899it [16:58, 3.93it/s]
330
+
331
+
332
+
333
+
334
+
335
+
336
+
337
+ Training...: 1949it [17:18, 3.92it/s]
338
+
339
+
340
+
341
+
342
+
343
+
344
+
345
+ Training...: 1999it [17:39, 3.55it/s]
346
+
347
+
348
+
349
+
350
+
351
+
352
+ Training...: 2049it [17:58, 4.16it/s]
353
+
354
+
355
+
356
+
357
+
358
+
359
+
360
+ Training...: 2099it [18:19, 4.03it/s]
361
+
362
+
363
+
364
+
365
+
366
+
367
+
368
+ Training...: 2150it [18:47, 2.55s/it]
369
+
370
+
371
+
372
+
373
+
374
+
375
+
376
+
377
+ Training...: 2202it [19:08, 1.25s/it]
378
+
379
+
380
+
381
+
382
+
383
+
384
+
385
+ Training...: 2251it [19:28, 1.91s/it]
386
+
387
+
388
+
389
+
390
+
391
+
392
+ Training...: 2299it [19:41, 3.47it/s]
393
+
394
+
395
+
396
+
397
+
398
+
399
+
400
+ Training...: 2349it [20:00, 4.14it/s]
401
+
402
+
403
+
404
+
405
+
406
+
407
+
408
+ Training...: 2399it [20:21, 4.23it/s]
409
+
410
+
411
+
412
+
413
+
414
+
415
+
416
+ Training...: 2449it [20:41, 4.51it/s]
417
+
418
+
419
+
420
+
421
+
422
+
423
+
424
+ Training...: 2499it [21:01, 4.35it/s]
425
+
426
+
427
+
428
+
429
+
430
+
431
+
432
+ Training...: 2549it [21:22, 4.17it/s]
433
+
434
+
435
+
436
+
437
+
438
+
439
+ Training...: 2599it [21:41, 5.19it/s]
440
+
441
+
442
+
443
+
444
+
445
+
446
+
447
+ Training...: 2649it [22:02, 3.83it/s]
448
+
449
+
450
+
451
+
452
+
453
+
454
+
455
+ Training...: 2699it [22:22, 3.85it/s]
456
+
457
+
458
+
459
+
460
+
461
+
462
+
463
+ Training...: 2749it [22:42, 3.32it/s]
464
+
465
+
466
+
467
+
468
+
469
+
470
+ Training...: 2799it [23:02, 3.71it/s]
471
+
472
+
473
+
474
+
475
+
476
+
477
+ Training...: 2849it [23:22, 3.98it/s]
478
+
479
+
480
+
481
+
482
+
483
+
484
+
485
+ Training...: 2899it [23:43, 4.62it/s]
486
+
487
+
488
+
489
+
490
+
491
+
492
+ Training...: 2949it [24:02, 4.47it/s]
493
+
494
+
495
+
496
+
497
+
498
+
499
+
500
+
501
+ Training...: 3001it [24:31, 1.80s/it]
502
+
503
+
504
+
505
+
506
+
507
+
508
+
509
+
510
+ Training...: 3050it [24:51, 2.53s/it]
511
+
512
+
513
+
514
+
515
+
516
+
517
+
518
+ Training...: 3099it [25:04, 3.12it/s]
519
+
520
+
521
+
522
+
523
+
524
+
525
+
526
+ Training...: 3149it [25:24, 4.20it/s]
527
+
528
+
529
+
530
+
531
+
532
+
533
+
534
+ Training...: 3199it [25:44, 4.03it/s]
535
+
536
+
537
+
538
+
539
+
540
+
541
+
542
+ Training...: 3249it [26:04, 4.35it/s]
543
+
544
+
545
+
546
+
547
+
548
+
549
+
550
+ Training...: 3299it [26:25, 4.08it/s]
551
+
552
+
553
+
554
+
555
+
556
+
557
+
558
+
559
+ Training...: 3355it [26:54, 1.51it/s]
560
+
561
+
562
+
563
+
564
+
565
+
566
+
567
+ Training...: 3405it [27:14, 1.65it/s]
568
+
569
+
570
+
571
+
572
+
573
+
574
+
575
+ Training...: 3455it [27:35, 1.53it/s]
576
+
577
+
578
+
579
+
580
+
581
+
582
+ Training...: 3499it [27:45, 4.18it/s]
583
+
584
+
585
+
586
+
587
+
588
+
589
+ Training...: 3549it [28:06, 4.75it/s]
590
+
591
+
592
+
593
+
594
+
595
+
596
+
597
+
598
+ Training...: 3600it [28:34, 2.52s/it]
599
+
600
+
601
+
602
+
603
+
604
+
605
+ Training...: 3649it [28:46, 4.11it/s]
606
+
607
+
608
+
609
+
610
+
611
+
612
+
613
+
614
+ Training...: 3703it [29:15, 1.08s/it]
615
+
616
+
617
+
618
+
619
+
620
+
621
+
622
+ Training...: 3755it [29:36, 1.55it/s]
623
+
624
+
625
+
626
+
627
+
628
+
629
+
630
+ Training...: 3805it [29:56, 1.46it/s]
631
+
632
+
633
+
634
+
635
+
636
+
637
+
638
+ Training...: 3856it [30:17, 1.92it/s]
639
+
640
+
641
+
642
+
643
+
644
+ Training...: 3899it [30:27, 4.17it/s]
645
+
646
+
647
+
648
+
649
+
650
+
651
+
652
+ Training...: 3950it [30:55, 2.71s/it]
653
+
654
+
655
+
656
+
657
+
658
+
659
+
660
+ Training...: 4001it [31:16, 2.00s/it]
661
+
662
+
663
+
664
+
665
+
666
+
667
+ Training...: 4049it [31:28, 4.39it/s]
668
+
669
+
670
+
671
+
672
+
673
+
674
+
675
+
676
+ Training...: 4102it [31:57, 1.50s/it]
677
+
678
+
679
+
680
+
681
+
682
+
683
+
684
+ Training...: 4153it [32:17, 1.03s/it]
685
+
686
+
687
+
688
+
689
+
690
+
691
+ Training...: 4199it [32:28, 3.92it/s]
692
+
693
+
694
+
695
+
696
+
697
+
698
+
699
+
700
+ Training...: 4255it [32:58, 1.53it/s]
701
+
702
+
703
+
704
+
705
+
706
+
707
+ Training...: 4299it [33:09, 4.63it/s]
708
+
709
+
710
+
711
+
712
+
713
+
714
+
715
+ Training...: 4356it [33:39, 1.95it/s]
716
+
717
+
718
+
719
+
720
+
721
+
722
+ Training...: 4407it [33:59, 2.24it/s]
723
+
724
+
725
+
726
+
727
+
728
+
729
+ Training...: 4449it [34:10, 3.99it/s]
730
+
731
+
732
+
733
+
734
+
735
+
736
+
737
+
738
+ Training...: 4501it [34:38, 2.05s/it]
739
+
740
+
741
+
742
+
743
+
744
+
745
+
746
+ Training...: 4551it [34:58, 1.99s/it]
747
+
748
+
749
+
750
+
751
+
752
+
753
+ Training...: 4599it [35:10, 4.03it/s]
754
+
755
+
756
+
757
+
758
+
759
+
760
+
761
+ Training...: 4649it [35:30, 3.36it/s]
762
+
763
+
764
+
765
+
766
+
767
+
768
+
769
+ Training...: 4699it [35:50, 3.58it/s]
770
+
771
+
772
+
773
+
774
+
775
+
776
+
777
+ Training...: 4749it [36:11, 3.65it/s]
778
+
779
+
780
+
781
+
782
+
783
+
784
+ Training...: 4799it [36:30, 4.12it/s]
785
+
786
+
787
+
788
+
789
+
790
+
791
+ Training...: 4849it [36:51, 4.20it/s]
792
+
793
+
794
+
795
+
796
+
797
+
798
+
799
+ Training...: 4901it [37:19, 1.95s/it]
800
+
801
+
802
+
803
+
804
+
805
+
806
+ Training...: 4949it [37:31, 4.48it/s]
807
+
808
+
809
+
810
+
811
+
812
+
813
+
814
+ Training...: 4999it [37:51, 4.76it/s]
815
+
816
+
817
+
818
+
819
+
820
+
821
+
822
+ Training...: 5049it [38:11, 4.47it/s]
823
+
824
+
825
+
826
+
827
+
828
+
829
+ Training...: 5099it [38:31, 4.10it/s]
830
+
831
+
832
+
833
+
834
+
835
+
836
+ Training...: 5149it [38:51, 5.07it/s]
837
+
838
+
839
+
840
+
841
+
842
+
843
+
844
+ Training...: 5199it [39:12, 4.14it/s]
845
+
846
+
847
+
848
+
849
+
850
+
851
+
852
+ Training...: 5249it [39:32, 4.37it/s]
853
+
854
+
855
+
856
+
857
+
858
+
859
+
860
+ Training...: 5300it [40:01, 2.86s/it]
861
+
862
+
863
+
864
+
865
+
866
+
867
+ Training...: 5349it [40:13, 4.06it/s]
868
+
869
+
870
+
871
+
872
+
873
+
874
+
875
+ Training...: 5399it [40:33, 4.06it/s]
876
+
877
+
878
+
879
+
880
+
881
+
882
+
883
+ Training...: 5449it [40:53, 4.52it/s]
884
+
885
+
886
+
887
+
888
+
889
+
890
+
891
+ Training...: 5499it [41:13, 4.03it/s]
892
+
893
+
894
+
895
+
896
+
897
+
898
+
899
+ Training...: 5549it [41:33, 4.36it/s]
900
+
901
+
902
+
903
+
904
+
905
+
906
+ Training...: 5599it [41:53, 4.64it/s]
907
+
908
+
909
+
910
+
911
+
912
+
913
+
914
+ Training...: 5649it [42:14, 3.88it/s]
915
+
916
+
917
+
918
+
919
+
920
+
921
+ Training...: 5699it [42:34, 4.44it/s]
922
+
923
+
924
+
925
+
926
+
927
+
928
+
929
+ Training...: 5758it [43:05, 2.51it/s]
930
+
931
+
932
+
933
+
934
+
935
+ Training...: 5799it [43:15, 4.46it/s]
936
+
937
+
938
+
939
+
940
+
941
+
942
+ Training...: 5849it [43:34, 3.52it/s]
943
+
944
+
945
+
946
+
947
+
948
+
949
+ Training...: 5899it [43:54, 3.85it/s]
950
+
951
+
952
+
953
+
954
+
955
+
956
+
957
+ Training...: 5949it [44:15, 3.37it/s]
958
+
959
+
960
+
961
+
962
+
963
+
964
+
965
+ Training...: 5999it [44:35, 4.29it/s]
966
+
967
+
968
+
969
+
970
+
971
+
972
+
973
+ Training...: 6054it [45:05, 1.10it/s]
974
+
975
+
976
+
977
+
978
+
979
+ Training...: 6099it [45:15, 4.25it/s]
980
+
981
+
982
+
983
+
984
+
985
+
986
+
987
+ Training...: 6149it [45:36, 3.81it/s]
988
+
989
+
990
+
991
+
992
+
993
+
994
+
995
+ Training...: 6206it [46:06, 1.92it/s]
996
+
997
+
998
+
999
+
1000
+
1001
+ Training...: 6249it [46:16, 4.23it/s]
1002
+
1003
+
1004
+
1005
+
1006
+
1007
+
1008
+
1009
+ Training...: 6300it [46:45, 2.89s/it]
1010
+
1011
+
1012
+
1013
+
1014
+
1015
+ Training...: 6349it [46:55, 5.06it/s]
1016
+
1017
+
1018
+
1019
+
1020
+
1021
+
1022
+ Training...: 6399it [47:17, 4.44it/s]
1023
+
1024
+
1025
+
1026
+
1027
+
1028
+
1029
+ Training...: 6449it [47:36, 4.63it/s]
1030
+
1031
+
1032
+
1033
+
1034
+
1035
+
1036
+
1037
+ Training...: 6499it [47:57, 5.20it/s]
1038
+
1039
+
1040
+
1041
+
1042
+
1043
+
1044
+ Training...: 6549it [48:17, 4.47it/s]
1045
+
1046
+
1047
+
1048
+
1049
+
1050
+
1051
+ Training...: 6599it [48:37, 4.40it/s]
1052
+
1053
+
1054
+
1055
+
1056
+
1057
+
1058
+
1059
+ Training...: 6649it [48:58, 3.63it/s]
1060
+
1061
+
1062
+
1063
+
1064
+
1065
+
1066
+ Training...: 6699it [49:17, 5.70it/s]
1067
+
1068
+
1069
+
1070
+
1071
+
1072
+
1073
+ Training...: 6749it [49:37, 4.09it/s]
1074
+
1075
+
1076
+
1077
+
1078
+
1079
+
1080
+ Training...: 6799it [49:58, 5.26it/s]
1081
+
1082
+
1083
+
1084
+
1085
+
1086
+
1087
+ Training...: 6849it [50:18, 4.74it/s]
1088
+
1089
+
1090
+
1091
+
1092
+
1093
+
1094
+ Training...: 6899it [50:38, 4.45it/s]
1095
+
1096
+
1097
+
1098
+
1099
+
1100
+
1101
+ Training...: 6949it [50:58, 4.79it/s]
1102
+
1103
+
1104
+
1105
+
1106
+
1107
+
1108
+ Training...: 6999it [51:18, 4.90it/s]
1109
+
1110
+
1111
+
1112
+
1113
+
1114
+
1115
+
1116
+ Training...: 7050it [51:48, 2.98s/it]
1117
+
1118
+
1119
+
1120
+
1121
+
1122
+
1123
+
1124
+ Training...: 7103it [52:08, 1.06s/it]
1125
+
1126
+
1127
+
1128
+
1129
+
1130
+
1131
+ Training...: 7151it [52:28, 2.23s/it]
1132
+
1133
+
1134
+
1135
+
1136
+
1137
+
1138
+ Training...: 7199it [52:39, 4.10it/s]
1139
+
1140
+
1141
+
1142
+
1143
+
1144
+
1145
+ Training...: 7249it [52:59, 4.83it/s]
1146
+
1147
+
1148
+
1149
+
1150
+
1151
+
1152
+ Training...: 7299it [53:19, 3.87it/s]
1153
+
1154
+
1155
+
1156
+
1157
+
1158
+
1159
+ Training...: 7349it [53:39, 3.92it/s]
1160
+
1161
+
1162
+
1163
+
1164
+
1165
+
1166
+ Training...: 7399it [53:59, 5.25it/s]
1167
+
1168
+
1169
+
1170
+
1171
+
1172
+
1173
+ Training...: 7449it [54:20, 4.91it/s]
1174
+
1175
+
1176
+
1177
+
1178
+
1179
+
1180
+ Training...: 7499it [54:41, 4.09it/s]
1181
+
1182
+
1183
+
1184
+
1185
+
1186
+
1187
+ Training...: 7549it [55:01, 4.48it/s]
1188
+
1189
+
1190
+
1191
+
1192
+
1193
+
1194
+ Training...: 7599it [55:21, 4.50it/s]
1195
+
1196
+
1197
+
1198
+
1199
+
1200
+
1201
+ Training...: 7649it [55:42, 3.76it/s]
1202
+
1203
+
1204
+
1205
+
1206
+
1207
+
1208
+ Training...: 7699it [56:01, 4.86it/s]
1209
+
1210
+
1211
+
1212
+
1213
+
1214
+
1215
+ Training...: 7749it [56:21, 4.62it/s]
1216
+
1217
+
1218
+
1219
+
1220
+
1221
+
1222
+
1223
+ Training...: 7800it [56:50, 2.99s/it]
1224
+
1225
+
1226
+
1227
+
1228
+
1229
+
1230
+
1231
+ Training...: 7851it [57:11, 2.11s/it]
1232
+
1233
+
1234
+
1235
+
1236
+
1237
+
1238
+
1239
+ Training...: 7901it [57:31, 2.21s/it]
1240
+
1241
+
1242
+
1243
+
1244
+
1245
+
1246
+ Training...: 7949it [57:42, 3.98it/s]
1247
+
1248
+
1249
+
1250
+
1251
+
1252
+
1253
+ Training...: 7999it [58:02, 4.47it/s]
1254
+
1255
+
1256
+
1257
+
1258
+
1259
+
1260
+ Training...: 8049it [58:22, 4.42it/s]
1261
+
1262
+
1263
+
1264
+
1265
+
1266
+
1267
+ Training...: 8099it [58:42, 5.26it/s]
1268
+
1269
+
1270
+
1271
+
1272
+
1273
+
1274
+ Training...: 8149it [59:03, 4.28it/s]
1275
+
1276
+
1277
+
1278
+
1279
+
1280
+
1281
+ Training...: 8199it [59:23, 4.92it/s]
1282
+
1283
+
1284
+
1285
+
1286
+
1287
+
1288
+ Training...: 8249it [59:43, 4.31it/s]
1289
+
1290
+
1291
+
1292
+
1293
+
1294
+
1295
+ Training...: 8299it [1:00:02, 4.23it/s]
1296
+
1297
+
1298
+
1299
+
1300
+
1301
+
1302
+ Training...: 8349it [1:00:24, 4.62it/s]
1303
+
1304
+
1305
+
1306
+
1307
+
1308
+
1309
+ Training...: 8399it [1:00:43, 4.49it/s]
1310
+
1311
+
1312
+
1313
+
1314
+
1315
+
1316
+
1317
+ Training...: 8450it [1:01:13, 3.12s/it]
1318
+
1319
+
1320
+
1321
+
1322
+
1323
+
1324
+ Training...: 8499it [1:01:24, 4.54it/s]
1325
+
1326
+
1327
+
1328
+
1329
+
1330
+
1331
+
1332
+ Training...: 8550it [1:01:53, 2.94s/it]
1333
+
1334
+ Training...: 8602it [1:02:14, 1.53s/it]
+ Training...: 8649it [1:02:24, 5.00it/s]
+ Training...: 8699it [1:02:45, 4.03it/s]
+ Training...: 8749it [1:03:04, 4.99it/s]
+ Training...: 8799it [1:03:24, 4.86it/s]
+ Training...: 8849it [1:03:45, 4.86it/s]
+ Training...: 8899it [1:04:04, 5.70it/s]
+ Training...: 8949it [1:04:25, 4.74it/s]
+ Training...: 8999it [1:04:45, 3.77it/s]
+ Training...: 9049it [1:05:05, 4.84it/s]
+ Training...: 9099it [1:05:25, 5.31it/s]
+ Training...: 9149it [1:05:46, 5.07it/s]
+ Training...: 9199it [1:06:06, 3.92it/s]
+ Training...: 9249it [1:06:27, 5.23it/s]
+ Training...: 9300it [1:06:56, 3.00s/it]
+ Training...: 9349it [1:07:06, 4.30it/s]
+ Training...: 9401it [1:07:36, 2.25s/it]
+ Training...: 9451it [1:07:57, 2.31s/it]
+ Training...: 9500it [1:08:17, 3.12s/it]
+ Training...: 9549it [1:08:28, 5.13it/s]
+ Training...: 9599it [1:08:47, 5.11it/s]
+ Training...: 9649it [1:09:07, 4.70it/s]
+ Training...: 9699it [1:09:28, 4.87it/s]
+ Training...: 9749it [1:09:47, 4.71it/s]
+ Training...: 9800it [1:10:18, 3.26s/it]
+ Training...: 9851it [1:10:38, 2.35s/it]
+ Training...: 9900it [1:10:58, 2.97s/it]
+ Training...: 9950it [1:11:18, 2.99s/it]
+ Training...: 9999it [1:11:39, 4.76it/s]
+ Step... (340000 | Loss: 2.0486693382263184, Learning Rate: 2.29339420911856e-05)
+ ██████████| 500/500 [01:17<00:00, 7.90it/s]
+ Training...: 10049it [1:13:11, 3.97it/s]
+ Training...: 10099it [1:13:31, 3.23it/s]
+ Training...: 10151it [1:14:01, 2.26s/it]
+ Training...: 10203it [1:14:22, 1.23s/it]
+ Training...: 10252it [1:14:42, 1.59s/it]
+ Training...: 10301it [1:15:02, 2.32s/it]
+ Training...: 10353it [1:15:22, 1.25s/it]
+ Training...: 10402it [1:15:42, 1.64s/it]
+ Training...: 10449it [1:15:53, 4.42it/s]
+ Training...: 10499it [1:16:12, 4.34it/s]
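The output.log excerpt above interleaves three streams: the tqdm training bar, a `Step... (step | Loss, Learning Rate)` summary line, and a 500-batch evaluation bar that fires at step 340000 (matching `--eval_steps=10000`). Below is a minimal sketch of the loop shape that produces such output; it is not the actual run_mlm_flax_no_accum.py — the `train_step`/`eval_step` bodies are stubs, and the bar labels and `start` offset are illustrative.

```python
# Minimal sketch of the logging cadence seen in output.log above.
# train_step/eval_step are stand-ins, not the real pmapped functions.
from tqdm import tqdm

logging_steps, eval_steps, num_eval_batches = 50, 10_000, 500

def train_step(step: int) -> dict:
    # Stand-in for the real update; returns the metrics the log prints.
    return {"loss": 2.0487, "learning_rate": 2.2934e-05}

def eval_step(batch: int) -> None:
    pass  # stand-in for a forward pass on one evaluation batch

start = 330_001  # illustrative offset for a resumed run
for step in tqdm(range(start, start + 10_500), desc="Training..."):
    metrics = train_step(step)
    if step % logging_steps == 0:
        print(f"Step... ({step} | Loss: {metrics['loss']}, "
              f"Learning Rate: {metrics['learning_rate']})")
    if step % eval_steps == 0:
        # The 500/500 bar above comes from a pass like this one.
        for b in tqdm(range(num_eval_batches), desc="Evaluating ..."):
            eval_step(b)
```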
wandb/run-20210716_223350-8eukt20m/files/requirements.txt ADDED
@@ -0,0 +1,95 @@
+ absl-py==0.13.0
+ aiohttp==3.7.4.post0
+ astunparse==1.6.3
+ async-timeout==3.0.1
+ attrs==21.2.0
+ cachetools==4.2.2
+ certifi==2021.5.30
+ chardet==4.0.0
+ charset-normalizer==2.0.1
+ chex==0.0.8
+ click==8.0.1
+ configparser==5.0.2
+ cycler==0.10.0
+ datasets==1.9.1.dev0
+ dill==0.3.4
+ dm-tree==0.1.6
+ docker-pycreds==0.4.0
+ filelock==3.0.12
+ flatbuffers==1.12
+ flax==0.3.4
+ fsspec==2021.7.0
+ gast==0.4.0
+ gitdb==4.0.7
+ gitpython==3.1.18
+ google-auth-oauthlib==0.4.4
+ google-auth==1.32.1
+ google-pasta==0.2.0
+ grpcio==1.34.1
+ h5py==3.1.0
+ huggingface-hub==0.0.12
+ idna==3.2
+ install==1.3.4
+ jax==0.2.17
+ jaxlib==0.1.68
+ joblib==1.0.1
+ keras-nightly==2.5.0.dev2021032900
+ keras-preprocessing==1.1.2
+ kiwisolver==1.3.1
+ libtpu-nightly==0.1.dev20210615
+ markdown==3.3.4
+ matplotlib==3.4.2
+ msgpack==1.0.2
+ multidict==5.1.0
+ multiprocess==0.70.12.2
+ numpy==1.19.5
+ oauthlib==3.1.1
+ opt-einsum==3.3.0
+ optax==0.0.9
+ packaging==21.0
+ pandas==1.3.0
+ pathtools==0.1.2
+ pillow==8.3.1
+ pip==20.0.2
+ pkg-resources==0.0.0
+ promise==2.3
+ protobuf==3.17.3
+ psutil==5.8.0
+ pyarrow==4.0.1
+ pyasn1-modules==0.2.8
+ pyasn1==0.4.8
+ pyparsing==2.4.7
+ python-dateutil==2.8.1
+ pytz==2021.1
+ pyyaml==5.4.1
+ regex==2021.7.6
+ requests-oauthlib==1.3.0
+ requests==2.26.0
+ rsa==4.7.2
+ sacremoses==0.0.45
+ scipy==1.7.0
+ sentry-sdk==1.3.0
+ setuptools==44.0.0
+ shortuuid==1.0.1
+ six==1.15.0
+ smmap==4.0.0
+ subprocess32==3.5.4
+ tensorboard-data-server==0.6.1
+ tensorboard-plugin-wit==1.8.0
+ tensorboard==2.5.0
+ tensorflow-estimator==2.5.0
+ tensorflow==2.5.0
+ termcolor==1.1.0
+ tokenizers==0.10.3
+ toolz==0.11.1
+ torch==1.9.0
+ tqdm==4.61.2
+ transformers==4.9.0.dev0
+ typing-extensions==3.7.4.3
+ urllib3==1.26.6
+ wandb==0.10.33
+ werkzeug==2.0.1
+ wheel==0.36.2
+ wrapt==1.12.1
+ xxhash==2.0.2
+ yarl==1.6.3
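The pins above describe the full TPU-VM environment: JAX/Flax (`jax==0.2.17`, `flax==0.3.4`, `optax==0.0.9`) for training, `torch==1.9.0` presumably for exporting pytorch_model.bin, and `wandb==0.10.33` for logging; note that `transformers==4.9.0.dev0` is a development build and will not install from PyPI by that pin alone. A hypothetical stdlib-only spot-check that a recreated environment matches a few key pins:

```python
# Hypothetical spot-check against the pins above; the subset is illustrative.
from importlib.metadata import PackageNotFoundError, version  # stdlib on 3.8+

PINS = {"jax": "0.2.17", "jaxlib": "0.1.68", "flax": "0.3.4",
        "optax": "0.0.9", "wandb": "0.10.33", "torch": "1.9.0"}

for name, expected in PINS.items():
    try:
        found = version(name)
    except PackageNotFoundError:
        found = None
    status = "ok" if found == expected else f"MISMATCH (want {expected})"
    print(f"{name}=={found}: {status}")
```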
wandb/run-20210716_223350-8eukt20m/files/wandb-metadata.json ADDED
@@ -0,0 +1,45 @@
+ {
+   "os": "Linux-5.4.0-1043-gcp-x86_64-with-glibc2.29",
+   "python": "3.8.10",
+   "heartbeatAt": "2021-07-16T22:33:52.760670",
+   "startedAt": "2021-07-16T22:33:50.716895",
+   "docker": null,
+   "cpu_count": 96,
+   "cuda": null,
+   "args": [
+     "--push_to_hub",
+     "--output_dir=./",
+     "--model_type=big_bird",
+     "--config_name=./",
+     "--tokenizer_name=./",
+     "--max_seq_length=4096",
+     "--weight_decay=0.0095",
+     "--warmup_steps=10000",
+     "--overwrite_output_dir",
+     "--adam_beta1=0.9",
+     "--adam_beta2=0.98",
+     "--logging_steps=50",
+     "--eval_steps=10000",
+     "--num_train_epochs=4",
+     "--preprocessing_num_workers=96",
+     "--save_steps=15000",
+     "--learning_rate=3e-5",
+     "--per_device_train_batch_size=1",
+     "--per_device_eval_batch_size=1",
+     "--save_total_limit=50",
+     "--max_eval_samples=4000",
+     "--resume_from_checkpoint=./"
+   ],
+   "state": "running",
+   "program": "./run_mlm_flax_no_accum.py",
+   "codePath": "run_mlm_flax_no_accum.py",
+   "git": {
+     "remote": "https://huggingface.co/flax-community/pino-roberta-base",
+     "commit": "def9a456105f36b517155343f42ff643df2d20ce"
+   },
+   "email": null,
+   "root": "/home/dat/pino-roberta-base",
+   "host": "t1v-n-f5c06ea1-w-0",
+   "username": "dat",
+   "executable": "/home/dat/pino/bin/python"
+ }
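The `args` array records the exact invocation: BigBird with `max_seq_length` 4096, per-device batch size 1, peak learning rate 3e-5 after 10000 warmup steps, evaluation every 10000 steps, checkpoints every 15000, resuming from the checkpoint in `./`. A minimal sketch of how such flags are typically consumed, assuming the usual `HfArgumentParser` pattern under the pinned transformers; the `ModelArguments` dataclass here is a stub, not the script's real one:

```python
# Sketch of the usual HfArgumentParser pattern; ModelArguments is a stub and
# the flag subset is abbreviated from the full args list above.
from dataclasses import dataclass
from typing import Optional
from transformers import HfArgumentParser, TrainingArguments

@dataclass
class ModelArguments:
    model_type: Optional[str] = None
    config_name: Optional[str] = None
    tokenizer_name: Optional[str] = None

parser = HfArgumentParser((ModelArguments, TrainingArguments))
model_args, training_args = parser.parse_args_into_dataclasses([
    "--output_dir=./", "--model_type=big_bird", "--config_name=./",
    "--tokenizer_name=./", "--learning_rate=3e-5", "--warmup_steps=10000",
    "--per_device_train_batch_size=1", "--num_train_epochs=4",
])
print(model_args.model_type, training_args.learning_rate)  # big_bird 3e-05
```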
wandb/run-20210716_223350-8eukt20m/files/wandb-summary.json ADDED
@@ -0,0 +1 @@
+ {"training_step": 340500, "learning_rate": 2.2923233700566925e-05, "train_loss": 1.8997719287872314, "_runtime": 4935, "_timestamp": 1626479765, "_step": 210, "eval_step": 340000, "eval_accuracy": 0.6408576965332031, "eval_loss": 1.8213207721710205}
wandb/run-20210716_223350-8eukt20m/logs/debug-internal.log ADDED
The diff for this file is too large to render. See raw diff
wandb/run-20210716_223350-8eukt20m/logs/debug.log ADDED
@@ -0,0 +1,26 @@
+ 2021-07-16 22:33:50,718 INFO MainThread:798495 [wandb_setup.py:_flush():69] setting env: {}
+ 2021-07-16 22:33:50,718 INFO MainThread:798495 [wandb_setup.py:_flush():69] setting login settings: {}
+ 2021-07-16 22:33:50,718 INFO MainThread:798495 [wandb_init.py:_log_setup():337] Logging user logs to /home/dat/pino-roberta-base/wandb/run-20210716_223350-8eukt20m/logs/debug.log
+ 2021-07-16 22:33:50,718 INFO MainThread:798495 [wandb_init.py:_log_setup():338] Logging internal logs to /home/dat/pino-roberta-base/wandb/run-20210716_223350-8eukt20m/logs/debug-internal.log
+ 2021-07-16 22:33:50,718 INFO MainThread:798495 [wandb_init.py:init():370] calling init triggers
+ 2021-07-16 22:33:50,719 INFO MainThread:798495 [wandb_init.py:init():375] wandb.init called with sweep_config: {}
+ config: {}
+ 2021-07-16 22:33:50,719 INFO MainThread:798495 [wandb_init.py:init():419] starting backend
+ 2021-07-16 22:33:50,719 INFO MainThread:798495 [backend.py:_multiprocessing_setup():70] multiprocessing start_methods=fork,spawn,forkserver, using: spawn
+ 2021-07-16 22:33:50,767 INFO MainThread:798495 [backend.py:ensure_launched():135] starting backend process...
+ 2021-07-16 22:33:50,816 INFO MainThread:798495 [backend.py:ensure_launched():139] started backend process with pid: 799749
+ 2021-07-16 22:33:50,818 INFO MainThread:798495 [wandb_init.py:init():424] backend started and connected
+ 2021-07-16 22:33:50,821 INFO MainThread:798495 [wandb_init.py:init():472] updated telemetry
+ 2021-07-16 22:33:50,822 INFO MainThread:798495 [wandb_init.py:init():491] communicating current version
+ 2021-07-16 22:33:51,460 INFO MainThread:798495 [wandb_init.py:init():496] got version response upgrade_message: "wandb version 0.11.0 is available! To upgrade, please run:\n $ pip install wandb --upgrade"
+ 
+ 2021-07-16 22:33:51,460 INFO MainThread:798495 [wandb_init.py:init():504] communicating run to backend with 30 second timeout
+ 2021-07-16 22:33:51,635 INFO MainThread:798495 [wandb_init.py:init():529] starting run threads in backend
+ 2021-07-16 22:33:52,798 INFO MainThread:798495 [wandb_run.py:_console_start():1623] atexit reg
+ 2021-07-16 22:33:52,799 INFO MainThread:798495 [wandb_run.py:_redirect():1497] redirect: SettingsConsole.REDIRECT
+ 2021-07-16 22:33:52,799 INFO MainThread:798495 [wandb_run.py:_redirect():1502] Redirecting console.
+ 2021-07-16 22:33:52,801 INFO MainThread:798495 [wandb_run.py:_redirect():1558] Redirects installed.
+ 2021-07-16 22:33:52,801 INFO MainThread:798495 [wandb_init.py:init():554] run started, returning control to user process
+ 2021-07-16 22:33:52,807 INFO MainThread:798495 [wandb_run.py:_config_callback():872] config_cb None None {'output_dir': './', 'overwrite_output_dir': True, 'do_train': False, 'do_eval': False, 'do_predict': False, 'evaluation_strategy': 'IntervalStrategy.NO', 'prediction_loss_only': False, 'per_device_train_batch_size': 1, 'per_device_eval_batch_size': 1, 'per_gpu_train_batch_size': None, 'per_gpu_eval_batch_size': None, 'gradient_accumulation_steps': 1, 'eval_accumulation_steps': None, 'learning_rate': 3e-05, 'weight_decay': 0.0095, 'adam_beta1': 0.9, 'adam_beta2': 0.98, 'adam_epsilon': 1e-08, 'max_grad_norm': 1.0, 'num_train_epochs': 4.0, 'max_steps': -1, 'lr_scheduler_type': 'SchedulerType.LINEAR', 'warmup_ratio': 0.0, 'warmup_steps': 10000, 'log_level': -1, 'log_level_replica': -1, 'log_on_each_node': True, 'logging_dir': './runs/Jul16_22-33-42_t1v-n-f5c06ea1-w-0', 'logging_strategy': 'IntervalStrategy.STEPS', 'logging_first_step': False, 'logging_steps': 50, 'save_strategy': 'IntervalStrategy.STEPS', 'save_steps': 15000, 'save_total_limit': 50, 'save_on_each_node': False, 'no_cuda': False, 'seed': 42, 'fp16': False, 'fp16_opt_level': 'O1', 'fp16_backend': 'auto', 'fp16_full_eval': False, 'local_rank': -1, 'tpu_num_cores': None, 'tpu_metrics_debug': False, 'debug': [], 'dataloader_drop_last': False, 'eval_steps': 10000, 'dataloader_num_workers': 0, 'past_index': -1, 'run_name': './', 'disable_tqdm': False, 'remove_unused_columns': True, 'label_names': None, 'load_best_model_at_end': False, 'metric_for_best_model': None, 'greater_is_better': None, 'ignore_data_skip': False, 'sharded_ddp': [], 'deepspeed': None, 'label_smoothing_factor': 0.0, 'adafactor': False, 'group_by_length': False, 'length_column_name': 'length', 'report_to': ['tensorboard', 'wandb'], 'ddp_find_unused_parameters': None, 'dataloader_pin_memory': True, 'skip_memory_metrics': True, 'use_legacy_prediction_loop': False, 'push_to_hub': True, 'resume_from_checkpoint': './', 'push_to_hub_model_id': '', 'push_to_hub_organization': None, 'push_to_hub_token': None, 'mp_parameters': '', '_n_gpu': 0, '__cached__setup_devices': 'cpu'}
+ 2021-07-16 22:33:52,809 INFO MainThread:798495 [wandb_run.py:_config_callback():872] config_cb None None {'model_name_or_path': None, 'model_type': 'big_bird', 'config_name': './', 'tokenizer_name': './', 'cache_dir': None, 'use_fast_tokenizer': True, 'dtype': 'float32'}
+ 2021-07-16 22:33:52,811 INFO MainThread:798495 [wandb_run.py:_config_callback():872] config_cb None None {'dataset_name': None, 'dataset_config_name': None, 'train_ref_file': None, 'validation_ref_file': None, 'overwrite_cache': False, 'validation_split_percentage': 5, 'max_seq_length': 4096, 'preprocessing_num_workers': 96, 'mlm_probability': 0.15, 'pad_to_max_length': False, 'line_by_line': False, 'max_eval_samples': 4000}
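The three `config_cb` entries at the end record the script pushing its training, model, and data arguments into the run config immediately after `wandb.init`. A hedged sketch of the equivalent explicit calls, with each dict abbreviated to a few keys taken from the log; the project name and offline mode are again assumptions:

```python
# Hedged sketch of the config updates the three config_cb lines record.
import os
os.environ["WANDB_MODE"] = "offline"  # keep the sketch self-contained

import wandb

run = wandb.init(project="pino-roberta-base")  # project name is an assumption
run.config.update({"learning_rate": 3e-05, "weight_decay": 0.0095,
                   "warmup_steps": 10000, "save_steps": 15000})    # training args
run.config.update({"model_type": "big_bird", "dtype": "float32"})  # model args
run.config.update({"max_seq_length": 4096, "mlm_probability": 0.15,
                   "max_eval_samples": 4000})                      # data args
```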
wandb/run-20210716_223350-8eukt20m/run-8eukt20m.wandb ADDED
Binary file (225 kB). View file