Muennighoff committed on
Commit
928e583
1 Parent(s): 5131329
logs/2105757.err ADDED
@@ -0,0 +1,589 @@
+ 0: loading file tokenizer.json from cache at /users/muennighoff/.cache/huggingface/hub/models--bigscience--tokenizer/snapshots/d43158eabd9ae01d7cc562a364a87f79b09e46f7/tokenizer.json
+ 0: loading file added_tokens.json from cache at None
+ 0: loading file special_tokens_map.json from cache at /users/muennighoff/.cache/huggingface/hub/models--bigscience--tokenizer/snapshots/d43158eabd9ae01d7cc562a364a87f79b09e46f7/special_tokens_map.json
+ 0: loading file tokenizer_config.json from cache at /users/muennighoff/.cache/huggingface/hub/models--bigscience--tokenizer/snapshots/d43158eabd9ae01d7cc562a364a87f79b09e46f7/tokenizer_config.json
+ 0: Successfully preprocessed all matching files.
+ 0: Detected CUDA files, patching ldflags
+ 0: Emitting ninja build file /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-mtf/Megatron-DeepSpeed/megatron/fused_kernels/build/build.ninja...
+ 0: Building extension module scaled_upper_triang_masked_softmax_cuda...
+ 0: Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
+ 0: Loading extension module scaled_upper_triang_masked_softmax_cuda...
+ 0: Successfully preprocessed all matching files.
+ 0: Detected CUDA files, patching ldflags
+ 0: Emitting ninja build file /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-mtf/Megatron-DeepSpeed/megatron/fused_kernels/build/build.ninja...
+ 0: Building extension module scaled_masked_softmax_cuda...
+ 0: Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
+ 0: Loading extension module scaled_masked_softmax_cuda...
+ 0: Successfully preprocessed all matching files.
+ 0: Detected CUDA files, patching ldflags
+ 0: Emitting ninja build file /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-mtf/Megatron-DeepSpeed/megatron/fused_kernels/build/build.ninja...
+ 0: Building extension module fused_mix_prec_layer_norm_cuda...
+ 0: Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
+ 0: Loading extension module fused_mix_prec_layer_norm_cuda...
+ 0: Successfully preprocessed all matching files.
+ 0: Successfully preprocessed all matching files.
+ 0: Successfully preprocessed all matching files.
+ 0: Successfully preprocessed all matching files.
+ 0: Successfully preprocessed all matching files.
+ 0: Successfully preprocessed all matching files.
+ 5: Successfully preprocessed all matching files.
+ 5: Successfully preprocessed all matching files.
+ 2: Successfully preprocessed all matching files.
+ 2: Successfully preprocessed all matching files.
+ 2: Successfully preprocessed all matching files.
+ 3: Successfully preprocessed all matching files.
+ 6: Successfully preprocessed all matching files.
+ 4: Successfully preprocessed all matching files.
+ 1: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/distributed_c10d.py:429: UserWarning: torch.distributed.distributed_c10d._get_global_rank is deprecated please use torch.distributed.distributed_c10d.get_global_rank instead
+ 1: warnings.warn(
+ 1: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/distributed_c10d.py:429: UserWarning: torch.distributed.distributed_c10d._get_global_rank is deprecated please use torch.distributed.distributed_c10d.get_global_rank instead
+ 1: warnings.warn(
+ 4: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/distributed_c10d.py:429: UserWarning: torch.distributed.distributed_c10d._get_global_rank is deprecated please use torch.distributed.distributed_c10d.get_global_rank instead
+ 4: warnings.warn(
+ 4: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/distributed_c10d.py:429: UserWarning: torch.distributed.distributed_c10d._get_global_rank is deprecated please use torch.distributed.distributed_c10d.get_global_rank instead
+ 4: warnings.warn(
+ 4: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/distributed_c10d.py:429: UserWarning: torch.distributed.distributed_c10d._get_global_rank is deprecated please use torch.distributed.distributed_c10d.get_global_rank instead
+ 4: warnings.warn(
+ 4: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/distributed_c10d.py:429: UserWarning: torch.distributed.distributed_c10d._get_global_rank is deprecated please use torch.distributed.distributed_c10d.get_global_rank instead
+ 4: warnings.warn(
+ 4: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/distributed_c10d.py:429: UserWarning: torch.distributed.distributed_c10d._get_global_rank is deprecated please use torch.distributed.distributed_c10d.get_global_rank instead
+ 4: warnings.warn(
+ 4: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/distributed_c10d.py:429: UserWarning: torch.distributed.distributed_c10d._get_global_rank is deprecated please use torch.distributed.distributed_c10d.get_global_rank instead
+ 4: warnings.warn(
+ 4: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/distributed_c10d.py:429: UserWarning: torch.distributed.distributed_c10d._get_global_rank is deprecated please use torch.distributed.distributed_c10d.get_global_rank instead
+ 4: warnings.warn(
+ 4: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/distributed_c10d.py:429: UserWarning: torch.distributed.distributed_c10d._get_global_rank is deprecated please use torch.distributed.distributed_c10d.get_global_rank instead
+ 4: warnings.warn(
+ 1: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/distributed_c10d.py:429: UserWarning: torch.distributed.distributed_c10d._get_global_rank is deprecated please use torch.distributed.distributed_c10d.get_global_rank instead
+ 1: warnings.warn(
+ 1: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/distributed_c10d.py:429: UserWarning: torch.distributed.distributed_c10d._get_global_rank is deprecated please use torch.distributed.distributed_c10d.get_global_rank instead
+ 1: warnings.warn(
+ 1: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/distributed_c10d.py:429: UserWarning: torch.distributed.distributed_c10d._get_global_rank is deprecated please use torch.distributed.distributed_c10d.get_global_rank instead
+ 1: warnings.warn(
+ 1: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/distributed_c10d.py:429: UserWarning: torch.distributed.distributed_c10d._get_global_rank is deprecated please use torch.distributed.distributed_c10d.get_global_rank instead
+ 1: warnings.warn(
+ 6: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/distributed_c10d.py:429: UserWarning: torch.distributed.distributed_c10d._get_global_rank is deprecated please use torch.distributed.distributed_c10d.get_global_rank instead
+ 6: warnings.warn(
+ 1: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/distributed_c10d.py:429: UserWarning: torch.distributed.distributed_c10d._get_global_rank is deprecated please use torch.distributed.distributed_c10d.get_global_rank instead
+ 1: warnings.warn(
+ 7: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/distributed_c10d.py:429: UserWarning: torch.distributed.distributed_c10d._get_global_rank is deprecated please use torch.distributed.distributed_c10d.get_global_rank instead
+ 7: warnings.warn(
+ 7: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/distributed_c10d.py:429: UserWarning: torch.distributed.distributed_c10d._get_global_rank is deprecated please use torch.distributed.distributed_c10d.get_global_rank instead
+ 7: warnings.warn(
+ 7: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/distributed_c10d.py:429: UserWarning: torch.distributed.distributed_c10d._get_global_rank is deprecated please use torch.distributed.distributed_c10d.get_global_rank instead
+ 7: warnings.warn(
+ 7: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/distributed_c10d.py:429: UserWarning: torch.distributed.distributed_c10d._get_global_rank is deprecated please use torch.distributed.distributed_c10d.get_global_rank instead
+ 7: warnings.warn(
+ 7: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/distributed_c10d.py:429: UserWarning: torch.distributed.distributed_c10d._get_global_rank is deprecated please use torch.distributed.distributed_c10d.get_global_rank instead
+ 7: warnings.warn(
+ 6: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/distributed_c10d.py:429: UserWarning: torch.distributed.distributed_c10d._get_global_rank is deprecated please use torch.distributed.distributed_c10d.get_global_rank instead
+ 6: warnings.warn(
+ 1: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/distributed_c10d.py:429: UserWarning: torch.distributed.distributed_c10d._get_global_rank is deprecated please use torch.distributed.distributed_c10d.get_global_rank instead
+ 1: warnings.warn(
+ 0: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/distributed_c10d.py:429: UserWarning: torch.distributed.distributed_c10d._get_global_rank is deprecated please use torch.distributed.distributed_c10d.get_global_rank instead
+ 0: warnings.warn(
+ 6: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/distributed_c10d.py:429: UserWarning: torch.distributed.distributed_c10d._get_global_rank is deprecated please use torch.distributed.distributed_c10d.get_global_rank instead
+ 6: warnings.warn(
+ 6: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/distributed_c10d.py:429: UserWarning: torch.distributed.distributed_c10d._get_global_rank is deprecated please use torch.distributed.distributed_c10d.get_global_rank instead
+ 6: warnings.warn(
+ 0: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/distributed_c10d.py:429: UserWarning: torch.distributed.distributed_c10d._get_global_rank is deprecated please use torch.distributed.distributed_c10d.get_global_rank instead
+ 0: warnings.warn(
+ 6: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/distributed_c10d.py:429: UserWarning: torch.distributed.distributed_c10d._get_global_rank is deprecated please use torch.distributed.distributed_c10d.get_global_rank instead
+ 6: warnings.warn(
+ 6: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/distributed_c10d.py:429: UserWarning: torch.distributed.distributed_c10d._get_global_rank is deprecated please use torch.distributed.distributed_c10d.get_global_rank instead
+ 6: warnings.warn(
+ 0: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/distributed_c10d.py:429: UserWarning: torch.distributed.distributed_c10d._get_global_rank is deprecated please use torch.distributed.distributed_c10d.get_global_rank instead
+ 0: warnings.warn(
+ 7: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/distributed_c10d.py:429: UserWarning: torch.distributed.distributed_c10d._get_global_rank is deprecated please use torch.distributed.distributed_c10d.get_global_rank instead
+ 7: warnings.warn(
+ 6: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/distributed_c10d.py:429: UserWarning: torch.distributed.distributed_c10d._get_global_rank is deprecated please use torch.distributed.distributed_c10d.get_global_rank instead
+ 6: warnings.warn(
+ 0: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/distributed_c10d.py:429: UserWarning: torch.distributed.distributed_c10d._get_global_rank is deprecated please use torch.distributed.distributed_c10d.get_global_rank instead
+ 0: warnings.warn(
+ 2: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/distributed_c10d.py:429: UserWarning: torch.distributed.distributed_c10d._get_global_rank is deprecated please use torch.distributed.distributed_c10d.get_global_rank instead
+ 2: warnings.warn(
+ 6: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/distributed_c10d.py:429: UserWarning: torch.distributed.distributed_c10d._get_global_rank is deprecated please use torch.distributed.distributed_c10d.get_global_rank instead
+ 6: warnings.warn(
+ 5: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/distributed_c10d.py:429: UserWarning: torch.distributed.distributed_c10d._get_global_rank is deprecated please use torch.distributed.distributed_c10d.get_global_rank instead
+ 5: warnings.warn(
+ 7: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/distributed_c10d.py:429: UserWarning: torch.distributed.distributed_c10d._get_global_rank is deprecated please use torch.distributed.distributed_c10d.get_global_rank instead
+ 7: warnings.warn(
+ 5: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/distributed_c10d.py:429: UserWarning: torch.distributed.distributed_c10d._get_global_rank is deprecated please use torch.distributed.distributed_c10d.get_global_rank instead
+ 5: warnings.warn(
+ 5: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/distributed_c10d.py:429: UserWarning: torch.distributed.distributed_c10d._get_global_rank is deprecated please use torch.distributed.distributed_c10d.get_global_rank instead
+ 5: warnings.warn(
+ 7: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/distributed_c10d.py:429: UserWarning: torch.distributed.distributed_c10d._get_global_rank is deprecated please use torch.distributed.distributed_c10d.get_global_rank instead
+ 7: warnings.warn(
+ 5: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/distributed_c10d.py:429: UserWarning: torch.distributed.distributed_c10d._get_global_rank is deprecated please use torch.distributed.distributed_c10d.get_global_rank instead
+ 5: warnings.warn(
+ 5: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/distributed_c10d.py:429: UserWarning: torch.distributed.distributed_c10d._get_global_rank is deprecated please use torch.distributed.distributed_c10d.get_global_rank instead
+ 5: warnings.warn(
+ 5: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/distributed_c10d.py:429: UserWarning: torch.distributed.distributed_c10d._get_global_rank is deprecated please use torch.distributed.distributed_c10d.get_global_rank instead
+ 5: warnings.warn(
+ 0: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/distributed_c10d.py:429: UserWarning: torch.distributed.distributed_c10d._get_global_rank is deprecated please use torch.distributed.distributed_c10d.get_global_rank instead
+ 0: warnings.warn(
+ 5: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/distributed_c10d.py:429: UserWarning: torch.distributed.distributed_c10d._get_global_rank is deprecated please use torch.distributed.distributed_c10d.get_global_rank instead
+ 5: warnings.warn(
+ 5: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/distributed_c10d.py:429: UserWarning: torch.distributed.distributed_c10d._get_global_rank is deprecated please use torch.distributed.distributed_c10d.get_global_rank instead
+ 5: warnings.warn(
+ 0: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/distributed_c10d.py:429: UserWarning: torch.distributed.distributed_c10d._get_global_rank is deprecated please use torch.distributed.distributed_c10d.get_global_rank instead
+ 0: warnings.warn(
+ 2: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/distributed_c10d.py:429: UserWarning: torch.distributed.distributed_c10d._get_global_rank is deprecated please use torch.distributed.distributed_c10d.get_global_rank instead
+ 2: warnings.warn(
+ 2: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/distributed_c10d.py:429: UserWarning: torch.distributed.distributed_c10d._get_global_rank is deprecated please use torch.distributed.distributed_c10d.get_global_rank instead
+ 2: warnings.warn(
+ 2: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/distributed_c10d.py:429: UserWarning: torch.distributed.distributed_c10d._get_global_rank is deprecated please use torch.distributed.distributed_c10d.get_global_rank instead
+ 2: warnings.warn(
+ 2: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/distributed_c10d.py:429: UserWarning: torch.distributed.distributed_c10d._get_global_rank is deprecated please use torch.distributed.distributed_c10d.get_global_rank instead
+ 2: warnings.warn(
+ 3: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/distributed_c10d.py:429: UserWarning: torch.distributed.distributed_c10d._get_global_rank is deprecated please use torch.distributed.distributed_c10d.get_global_rank instead
+ 3: warnings.warn(
+ 2: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/distributed_c10d.py:429: UserWarning: torch.distributed.distributed_c10d._get_global_rank is deprecated please use torch.distributed.distributed_c10d.get_global_rank instead
+ 2: warnings.warn(
+ 3: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/distributed_c10d.py:429: UserWarning: torch.distributed.distributed_c10d._get_global_rank is deprecated please use torch.distributed.distributed_c10d.get_global_rank instead
+ 3: warnings.warn(
+ 0: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/distributed_c10d.py:429: UserWarning: torch.distributed.distributed_c10d._get_global_rank is deprecated please use torch.distributed.distributed_c10d.get_global_rank instead
+ 0: warnings.warn(
+ 3: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/distributed_c10d.py:429: UserWarning: torch.distributed.distributed_c10d._get_global_rank is deprecated please use torch.distributed.distributed_c10d.get_global_rank instead
+ 3: warnings.warn(
+ 2: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/distributed_c10d.py:429: UserWarning: torch.distributed.distributed_c10d._get_global_rank is deprecated please use torch.distributed.distributed_c10d.get_global_rank instead
+ 2: warnings.warn(
+ 3: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/distributed_c10d.py:429: UserWarning: torch.distributed.distributed_c10d._get_global_rank is deprecated please use torch.distributed.distributed_c10d.get_global_rank instead
+ 3: warnings.warn(
+ 3: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/distributed_c10d.py:429: UserWarning: torch.distributed.distributed_c10d._get_global_rank is deprecated please use torch.distributed.distributed_c10d.get_global_rank instead
+ 3: warnings.warn(
+ 3: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/distributed_c10d.py:429: UserWarning: torch.distributed.distributed_c10d._get_global_rank is deprecated please use torch.distributed.distributed_c10d.get_global_rank instead
+ 3: warnings.warn(
+ 3: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/distributed_c10d.py:429: UserWarning: torch.distributed.distributed_c10d._get_global_rank is deprecated please use torch.distributed.distributed_c10d.get_global_rank instead
+ 3: warnings.warn(
+ 3: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/distributed_c10d.py:429: UserWarning: torch.distributed.distributed_c10d._get_global_rank is deprecated please use torch.distributed.distributed_c10d.get_global_rank instead
+ 3: warnings.warn(
+ 2: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/distributed_c10d.py:429: UserWarning: torch.distributed.distributed_c10d._get_global_rank is deprecated please use torch.distributed.distributed_c10d.get_global_rank instead
+ 2: warnings.warn(
+ 0: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/distributed_c10d.py:429: UserWarning: torch.distributed.distributed_c10d._get_global_rank is deprecated please use torch.distributed.distributed_c10d.get_global_rank instead
+ 0: warnings.warn(
+ 4: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
+ 1: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
+ 0: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
+ 4: Emitting ninja build file /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu/utils/build.ninja...
+ 4: Building extension module utils...
+ 4: Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
+ 0: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
+ 0: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
+ 0: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
+ 0: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
+ 0: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
+ 0: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
+ 0: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
+ 1: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
+ 1: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
+ 1: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
+ 1: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
+ 1: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
+ 1: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
+ 2: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
+ 2: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
+ 2: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
+ 2: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
+ 2: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
+ 2: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
+ 1: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
+ 2: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
+ 2: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
+ 3: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
+ 3:
+ 4: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
+ 3: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
+ 4: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
+ 3: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
+ 3:
+ 3: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
+ 3: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
+ 4: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
+ 3: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
+ 4: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
+ 4: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
+ 4: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
+ 4: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
+ 5: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
+ 5:
+ 5:
+ 5:
+ 5: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
+ 5:
+ 5: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
+ 5: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
+ 6: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
+ 6:
+ 6: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
+ 6: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
+ 6: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
+ 6:
+ 6: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
+ 6: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
+ 7: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
+ 7: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
226
+ 7:
227
+ 7:
228
+ 7:
229
+ 7:
230
+ 7:
231
+ 7:
232
+ 4: Loading extension module utils...
233
+ 7: Emitting ninja build file /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu/utils/build.ninja...
234
+ 7: Building extension module utils...
235
+ 7: Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
236
+ 7: Loading extension module utils...
237
+ 0: Loading extension module utils...
238
+ 0: Loading extension module utils...
239
+ 0: Loading extension module utils...
240
+ 0: Loading extension module utils...
241
+ 0: Loading extension module utils...
242
+ 1: Loading extension module utils...
243
+ 0: Loading extension module utils...
244
+ 0: Loading extension module utils...
245
+ 1: Loading extension module utils...
246
+ 1: Loading extension module utils...
247
+ 1: Loading extension module utils...
248
+ 1: Loading extension module utils...
249
+ 1: Loading extension module utils...
250
+ 1: Loading extension module utils...
251
+ 4: Loading extension module utils...
252
+ 4: Loading extension module utils...
253
+ 4: Loading extension module utils...
254
+ 4: Loading extension module utils...
255
+ 4: Loading extension module utils...
256
+ 4: Loading extension module utils...
257
+ 4: Loading extension module utils...
258
+ 2: Loading extension module utils...
259
+ 2: Loading extension module utils...
260
+ 3: Loading extension module utils...
261
+ 2: Loading extension module utils...
262
+ 3: Loading extension module utils...
263
+ 2: Loading extension module utils...
264
+ 3: Loading extension module utils...
265
+ 2: Loading extension module utils...
266
+ 3: Loading extension module utils...
267
+ 2: Loading extension module utils...
268
+ 3: Loading extension module utils...
269
+ 2: Loading extension module utils...
270
+ 3: Loading extension module utils...
271
+ 3: Loading extension module utils...
272
+ 2: Loading extension module utils...
273
+ 3: Loading extension module utils...
274
+ 5: Loading extension module utils...
275
+ 5: Loading extension module utils...
276
+ 5: Loading extension module utils...
277
+ 5: Loading extension module utils...
278
+ 5: Loading extension module utils...
279
+ 5: Loading extension module utils...
280
+ 5: Loading extension module utils...
281
+ 5: Loading extension module utils...
282
+ 6: Loading extension module utils...
283
+ 6: Loading extension module utils...
284
+ 6: Loading extension module utils...
285
+ 6: Loading extension module utils...
286
+ 6: Loading extension module utils...
287
+ 6: Loading extension module utils...
288
+ 6: Loading extension module utils...
289
+ 6: Loading extension module utils...
290
+ 7: Loading extension module utils...
291
+ 7: Loading extension module utils...
292
+ 7: Loading extension module utils...
293
+ 7: Loading extension module utils...
294
+ 7: Loading extension module utils...
295
+ 7: Loading extension module utils...
296
+ 7: Loading extension module utils...
297
+ 1: Loading extension module utils...
298
+ 0: Loading extension module utils...
299
+ 5: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
300
+ 5: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
301
+ 5: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
302
+ 5: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
303
+ 6: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
304
+ 5: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
305
+ 5: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
306
+ 6: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
307
+ 6:
308
+ 6:
309
+ 5: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
310
+ 6: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
311
+ 6:
312
+ 5: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
313
+ 6: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
314
+ 6: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
315
+ 3: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
316
+ 3:
317
+ 3: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
318
+ 3: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
319
+ 3:
320
+ 3:
321
+ 5: No modifications detected for re-loaded extension module utils, skipping build step...
322
+ 5: Loading extension module utils...
323
+ 3: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
324
+ 3: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
325
+ 6: No modifications detected for re-loaded extension module utils, skipping build step...
326
+ 6: Loading extension module utils...
327
+ 6: No modifications detected for re-loaded extension module utils, skipping build step...
328
+ 6: Loading extension module utils...
329
+ 3: No modifications detected for re-loaded extension module utils, skipping build step...
330
+ 3: Loading extension module utils...
331
+ 6: No modifications detected for re-loaded extension module utils, skipping build step...
332
+ 6: Loading extension module utils...
333
+ 5: No modifications detected for re-loaded extension module utils, skipping build step...
334
+ 5: Loading extension module utils...
335
+ 5: No modifications detected for re-loaded extension module utils, skipping build step...
336
+ 5: Loading extension module utils...
337
+ 5: No modifications detected for re-loaded extension module utils, skipping build step...
338
+ 5: Loading extension module utils...
339
+ 5: No modifications detected for re-loaded extension module utils, skipping build step...
340
+ 5: Loading extension module utils...
341
+ 5: No modifications detected for re-loaded extension module utils, skipping build step...
342
+ 5: Loading extension module utils...
343
+ 5: No modifications detected for re-loaded extension module utils, skipping build step...
344
+ 5: Loading extension module utils...
345
+ 5: No modifications detected for re-loaded extension module utils, skipping build step...
346
+ 5: Loading extension module utils...
347
+ 0: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
348
+ 6: No modifications detected for re-loaded extension module utils, skipping build step...
349
+ 6: Loading extension module utils...
350
+ 0: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
351
+ 6: No modifications detected for re-loaded extension module utils, skipping build step...
352
+ 6: Loading extension module utils...
353
+ 6: No modifications detected for re-loaded extension module utils, skipping build step...
354
+ 6: Loading extension module utils...
355
+ 6: No modifications detected for re-loaded extension module utils, skipping build step...
356
+ 6: Loading extension module utils...
357
+ 0: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
358
+ 3: No modifications detected for re-loaded extension module utils, skipping build step...
359
+ 3: Loading extension module utils...
360
+ 0: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
361
+ 6: No modifications detected for re-loaded extension module utils, skipping build step...
362
+ 3: No modifications detected for re-loaded extension module utils, skipping build step...
363
+ 3: Loading extension module utils...
364
+ 3: No modifications detected for re-loaded extension module utils, skipping build step...
365
+ 6: Loading extension module utils...
366
+ 3: Loading extension module utils...
367
+ 3: No modifications detected for re-loaded extension module utils, skipping build step...
368
+ 3: Loading extension module utils...
369
+ 3: No modifications detected for re-loaded extension module utils, skipping build step...
370
+ 3: Loading extension module utils...
371
+ 0: No modifications detected for re-loaded extension module utils, skipping build step...
372
+ 0: Loading extension module utils...
373
+ 3: No modifications detected for re-loaded extension module utils, skipping build step...
374
+ 3: Loading extension module utils...
375
+ 2: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
376
+ 0: No modifications detected for re-loaded extension module utils, skipping build step...
377
+ 0: Loading extension module utils...
378
+ 3: No modifications detected for re-loaded extension module utils, skipping build step...
379
+ 3: Loading extension module utils...
380
+ 2: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
381
+ 0: No modifications detected for re-loaded extension module utils, skipping build step...No modifications detected for re-loaded extension module utils, skipping build step...
382
+ 0:
383
+ 0: Loading extension module utils...Loading extension module utils...
384
+ 0:
385
+ 0: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
386
+ 0: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
387
+ 2: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
388
+ 0: No modifications detected for re-loaded extension module utils, skipping build step...
389
+ 0: Loading extension module utils...
390
+ 2: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
391
+ 2: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
392
+ 2:
393
+ 2:
394
+ 0: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
395
+ 0: No modifications detected for re-loaded extension module utils, skipping build step...
396
+ 0: Loading extension module utils...
397
+ 2: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
398
+ 0: No modifications detected for re-loaded extension module utils, skipping build step...
399
+ 0: Loading extension module utils...
400
+ 2: No modifications detected for re-loaded extension module utils, skipping build step...
401
+ 2: Loading extension module utils...
402
+ 2: No modifications detected for re-loaded extension module utils, skipping build step...
403
+ 2: Loading extension module utils...
404
+ 2: No modifications detected for re-loaded extension module utils, skipping build step...
405
+ 2: Loading extension module utils...
406
+ 2: No modifications detected for re-loaded extension module utils, skipping build step...
407
+ 2: No modifications detected for re-loaded extension module utils, skipping build step...Loading extension module utils...
408
+ 2:
409
+ 2: Loading extension module utils...
410
+ 2: No modifications detected for re-loaded extension module utils, skipping build step...
411
+ 2: Loading extension module utils...
412
+ 2: No modifications detected for re-loaded extension module utils, skipping build step...
413
+ 2: Loading extension module utils...
414
+ 2: No modifications detected for re-loaded extension module utils, skipping build step...
415
+ 2: Loading extension module utils...
416
+ 1: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
417
+ 1: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
418
+ 1: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
419
+ 1: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
420
+ 1:
421
+ 1: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
422
+ 1: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
423
+ 1: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
424
+ 1: No modifications detected for re-loaded extension module utils, skipping build step...
425
+ 1: Loading extension module utils...
426
+ 1: No modifications detected for re-loaded extension module utils, skipping build step...
427
+ 1: Loading extension module utils...
428
+ 1: No modifications detected for re-loaded extension module utils, skipping build step...No modifications detected for re-loaded extension module utils, skipping build step...
429
+ 1:
430
+ 1: Loading extension module utils...Loading extension module utils...
431
+ 1:
432
+ 1: No modifications detected for re-loaded extension module utils, skipping build step...
433
+ 1: Loading extension module utils...
434
+ 1: No modifications detected for re-loaded extension module utils, skipping build step...
435
+ 1: Loading extension module utils...
436
+ 1: No modifications detected for re-loaded extension module utils, skipping build step...
437
+ 1: Loading extension module utils...
438
+ 1: No modifications detected for re-loaded extension module utils, skipping build step...
439
+ 1: Loading extension module utils...
440
+ 4: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
441
+ 4:
442
+ 4:
443
+ 4: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
444
+ 4: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
445
+ 4: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
446
+ 4: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
447
+ 4: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
448
+ 7: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
449
+ 4: No modifications detected for re-loaded extension module utils, skipping build step...
450
+ 4: Loading extension module utils...
451
+ 4: No modifications detected for re-loaded extension module utils, skipping build step...
452
+ 4: Loading extension module utils...
453
+ 7: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
454
+ 7: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
455
+ 7:
456
+ 4: No modifications detected for re-loaded extension module utils, skipping build step...
457
+ 4: Loading extension module utils...
458
+ 7: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
459
+ 7:
460
+ 7: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
461
+ 4: No modifications detected for re-loaded extension module utils, skipping build step...
462
+ 4: No modifications detected for re-loaded extension module utils, skipping build step...No modifications detected for re-loaded extension module utils, skipping build step...Loading extension module utils...
463
+ 4:
464
+ 4:
465
+ 4: Loading extension module utils...Loading extension module utils...
466
+ 4:
467
+ 4: No modifications detected for re-loaded extension module utils, skipping build step...
468
+ 4: Loading extension module utils...
469
+ 4: No modifications detected for re-loaded extension module utils, skipping build step...
470
+ 4: Loading extension module utils...
471
+ 7: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
472
+ 7: No modifications detected for re-loaded extension module utils, skipping build step...No modifications detected for re-loaded extension module utils, skipping build step...
473
+ 7:
474
+ 7: Loading extension module utils...Loading extension module utils...
475
+ 7:
476
+ 7: No modifications detected for re-loaded extension module utils, skipping build step...
477
+ 7: Loading extension module utils...
478
+ 7: No modifications detected for re-loaded extension module utils, skipping build step...
479
+ 7: Loading extension module utils...
480
+ 7: No modifications detected for re-loaded extension module utils, skipping build step...
481
+ 7: No modifications detected for re-loaded extension module utils, skipping build step...No modifications detected for re-loaded extension module utils, skipping build step...Loading extension module utils...
482
+ 7:
483
+ 7:
484
+ 7: Loading extension module utils...
485
+ 7: Loading extension module utils...
486
+ 7: No modifications detected for re-loaded extension module utils, skipping build step...
487
+ 7: Loading extension module utils...
488
+ 0: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
489
+ 0: No modifications detected for re-loaded extension module utils, skipping build step...
490
+ 0: Loading extension module utils...
491
+ 0: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-mtf/Megatron-DeepSpeed/megatron/utils.py:356: UserWarning: Parameter count with the embeddings will be inaccurate with PP > 1, as the first and last stage hold several copies of the embeddings
492
+ 0: warnings.warn("Parameter count with the embeddings will be inaccurate with PP > 1, as the first and last stage hold several copies of the embeddings")
493
+ 4: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/distributed_c10d.py:429: UserWarning: torch.distributed.distributed_c10d._get_global_rank is deprecated please use torch.distributed.distributed_c10d.get_global_rank instead
494
+ 4: warnings.warn(
495
+ 7: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/distributed_c10d.py:429: UserWarning: torch.distributed.distributed_c10d._get_global_rank is deprecated please use torch.distributed.distributed_c10d.get_global_rank instead
496
+ 7: warnings.warn(
497
+ 6: Fatal Python error: Bus error
498
+ 6:
499
+ 6: Thread 0x0000144db9784700 (most recent call first):
500
+ 6: <no Python frame>
501
+ 6:
502
+ 6: Thread 0x0000144db9583700 (most recent call first):
503
+ 6: <no Python frame>
504
+ 6:
505
+ 6: Thread 0x0000144db9382700 (most recent call first):
506
+ 6: <no Python frame>
507
+ 6:
508
+ 6: Thread 0x0000144db9181700 (most recent call first):
509
+ 6: <no Python frame>
510
+ 6:
511
+ 6: Thread 0x0000144db8f80700 (most recent call first):
512
+ 6: Memory access fault by GPU node-11 (Agent handle: 0x737af60) on address (nil)(may not be exact address). Reason: DRAM ECC failure.
513
+ 6: <no Python frame>
514
+ 6:
515
+ 6: Thread 0x0000144db8d7f700 (most recent call first):
516
+ 6: <no Python frame>
517
+ 6:
518
+ 6: Thread 0x0000144db8b7e700 (most recent call first):
519
+ 6: <no Python frame>
520
+ 6:
521
+ 6: Thread 0x0000144db9985700 (most recent call first):
522
+ 6: File "/pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/autograd/__init__.py", line 197 in Fatal Python error: backward
523
+ 6: Aborted File
524
+ 6:
525
+ 6: "/pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/deepspeed/runtime/activation_checkpointing/checkpointing.py", line 725 in backward
526
+ 6: File "/pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/autograd/function.py", line 267 in WARNING:torch.distributed.elastic.multiprocessing.api:Sending process 83868 closing signal SIGTERM
527
+ 6: WARNING:torch.distributed.elastic.multiprocessing.api:Sending process 83869 closing signal SIGTERM
528
+ 6: WARNING:torch.distributed.elastic.multiprocessing.api:Sending process 83870 closing signal SIGTERM
529
+ 6: WARNING:torch.distributed.elastic.multiprocessing.api:Sending process 83871 closing signal SIGTERM
530
+ 6: WARNING:torch.distributed.elastic.multiprocessing.api:Sending process 83872 closing signal SIGTERM
531
+ 6: WARNING:torch.distributed.elastic.multiprocessing.api:Sending process 83873 closing signal SIGTERM
532
+ 6: WARNING:torch.distributed.elastic.multiprocessing.api:Sending process 83874 closing signal SIGTERM
533
+ 6: WARNING:torch.distributed.elastic.multiprocessing.api:Unable to shutdown process 83868 via 15, forcefully exitting via 9
534
+ 6: ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: -6) local_rank: 7 (pid: 83875) of binary: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/bin/python
535
+ 6: ERROR:torch.distributed.elastic.agent.server.api:Error waiting on exit barrier. Elapsed: 314.5357172489166 seconds
536
+ 6: Traceback (most recent call last):
537
+ 6: File "/pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/elastic/agent/server/api.py", line 906, in _exit_barrier
538
+ 6: store_util.barrier(
539
+ 6: File "/pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/elastic/utils/store.py", line 78, in barrier
540
+ 6: synchronize(store, data, rank, world_size, key_prefix, barrier_timeout)
541
+ 6: File "/pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/elastic/utils/store.py", line 64, in synchronize
542
+ 6: agent_data = get_all(store, rank, key_prefix, world_size)
543
+ 6: File "/pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/elastic/utils/store.py", line 34, in get_all
544
+ 6: data = store.get(f"{prefix}{idx}")
545
+ 6: RuntimeError: Socket Timeout
546
+ 6: Traceback (most recent call last):
547
+ 6: File "/opt/cray/pe/python/3.9.12.1/lib/python3.9/runpy.py", line 197, in _run_module_as_main
548
+ 6: return _run_code(code, main_globals, None,
549
+ 6: File "/opt/cray/pe/python/3.9.12.1/lib/python3.9/runpy.py", line 87, in _run_code
550
+ 6: exec(code, run_globals)
551
+ 6: File "/pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/run.py", line 766, in <module>
552
+ 6: main()
553
+ 6: File "/pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 346, in wrapper
554
+ 6: return f(*args, **kwargs)
555
+ 6: File "/pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/run.py", line 762, in main
556
+ 6: run(args)
557
+ 6: File "/pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/run.py", line 753, in run
558
+ 6: elastic_launch(
559
+ 6: File "/pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/launcher/api.py", line 132, in __call__
560
+ 6: return launch_agent(self._config, self._entrypoint, list(args))
561
+ 6: File "/pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/launcher/api.py", line 246, in launch_agent
562
+ 6: raise ChildFailedError(
563
+ 6: torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
564
+ 6: ======================================================
565
+ 6: Megatron-DeepSpeed/finetune_t0.py FAILED
566
+ 6: ------------------------------------------------------
567
+ 6: Failures:
568
+ 6: <NO_OTHER_FAILURES>
569
+ 6: ------------------------------------------------------
570
+ 6: Root Cause (first observed failure):
571
+ 6: [0]:
572
+ 6: time : 2022-12-04_14:59:33
573
+ 6: host : nid007349
574
+ 6: rank : 55 (local_rank: 7)
575
+ 6: exitcode : -6 (pid: 83875)
576
+ 6: error_file: <N/A>
577
+ 6: traceback : Signal 6 (SIGABRT) received by PID 83875
578
+ 6: ======================================================
579
+ srun: error: nid007349: task 6: Exited with exit code 1
580
+ srun: launch/slurm: _step_signal: Terminating StepId=2105757.0
581
+ 0: slurmstepd: error: *** STEP 2105757.0 ON nid007343 CANCELLED AT 2022-12-04T15:05:57 ***
582
+ srun: error: nid007344: task 1: Terminated
583
+ srun: error: nid007346: task 3: Terminated
584
+ srun: error: nid007345: task 2: Terminated
585
+ srun: error: nid007350: task 7: Terminated
586
+ srun: error: nid007348: task 5: Terminated
587
+ srun: error: nid007343: task 0: Terminated
588
+ srun: error: nid007347: task 4: Terminated
589
+ srun: Force Terminated StepId=2105757.0
logs/2105757.out ADDED
tensorboard_7b1xp3ru/events.out.tfevents.1670096733.nid005410.107538.0 ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:7b11c87fd7a66e5b9dd531fec1856d38e98b6a1389a31b3b5ba0ea8bd01224ee
3
+ size 44746
tensorboard_7b1xp3ru/events.out.tfevents.1670097634.nid007350.79011.0 ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:fd602e24b40cd7b8073772299551c9bf0e951805ee6dd8a1705b923b7bc7bc9a
3
+ size 2283615