bloomz-7b1-4b-xp3ru / logs /2105757.err
Muennighoff's picture
Add logs
928e583
raw
history blame contribute delete
No virus
55 kB
0: loading file tokenizer.json from cache at /users/muennighoff/.cache/huggingface/hub/models--bigscience--tokenizer/snapshots/d43158eabd9ae01d7cc562a364a87f79b09e46f7/tokenizer.json
0: loading file added_tokens.json from cache at None
0: loading file special_tokens_map.json from cache at /users/muennighoff/.cache/huggingface/hub/models--bigscience--tokenizer/snapshots/d43158eabd9ae01d7cc562a364a87f79b09e46f7/special_tokens_map.json
0: loading file tokenizer_config.json from cache at /users/muennighoff/.cache/huggingface/hub/models--bigscience--tokenizer/snapshots/d43158eabd9ae01d7cc562a364a87f79b09e46f7/tokenizer_config.json
0: Successfully preprocessed all matching files.
0: Detected CUDA files, patching ldflags
0: Emitting ninja build file /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-mtf/Megatron-DeepSpeed/megatron/fused_kernels/build/build.ninja...
0: Building extension module scaled_upper_triang_masked_softmax_cuda...
0: Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
0: Loading extension module scaled_upper_triang_masked_softmax_cuda...
0: Successfully preprocessed all matching files.
0: Detected CUDA files, patching ldflags
0: Emitting ninja build file /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-mtf/Megatron-DeepSpeed/megatron/fused_kernels/build/build.ninja...
0: Building extension module scaled_masked_softmax_cuda...
0: Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
0: Loading extension module scaled_masked_softmax_cuda...
0: Successfully preprocessed all matching files.
0: Detected CUDA files, patching ldflags
0: Emitting ninja build file /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-mtf/Megatron-DeepSpeed/megatron/fused_kernels/build/build.ninja...
0: Building extension module fused_mix_prec_layer_norm_cuda...
0: Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
0: Loading extension module fused_mix_prec_layer_norm_cuda...
0: Successfully preprocessed all matching files.
0: Successfully preprocessed all matching files.
0: Successfully preprocessed all matching files.
0: Successfully preprocessed all matching files.
0: Successfully preprocessed all matching files.
0: Successfully preprocessed all matching files.
5: Successfully preprocessed all matching files.
5: Successfully preprocessed all matching files.
2: Successfully preprocessed all matching files.
2: Successfully preprocessed all matching files.
2: Successfully preprocessed all matching files.
3: Successfully preprocessed all matching files.
6: Successfully preprocessed all matching files.
4: Successfully preprocessed all matching files.
1: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/distributed_c10d.py:429: UserWarning: torch.distributed.distributed_c10d._get_global_rank is deprecated please use torch.distributed.distributed_c10d.get_global_rank instead
1: warnings.warn(
1: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/distributed_c10d.py:429: UserWarning: torch.distributed.distributed_c10d._get_global_rank is deprecated please use torch.distributed.distributed_c10d.get_global_rank instead
1: warnings.warn(
4: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/distributed_c10d.py:429: UserWarning: torch.distributed.distributed_c10d._get_global_rank is deprecated please use torch.distributed.distributed_c10d.get_global_rank instead
4: warnings.warn(
4: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/distributed_c10d.py:429: UserWarning: torch.distributed.distributed_c10d._get_global_rank is deprecated please use torch.distributed.distributed_c10d.get_global_rank instead
4: warnings.warn(
4: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/distributed_c10d.py:429: UserWarning: torch.distributed.distributed_c10d._get_global_rank is deprecated please use torch.distributed.distributed_c10d.get_global_rank instead
4: warnings.warn(
4: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/distributed_c10d.py:429: UserWarning: torch.distributed.distributed_c10d._get_global_rank is deprecated please use torch.distributed.distributed_c10d.get_global_rank instead
4: warnings.warn(
4: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/distributed_c10d.py:429: UserWarning: torch.distributed.distributed_c10d._get_global_rank is deprecated please use torch.distributed.distributed_c10d.get_global_rank instead
4: warnings.warn(
4: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/distributed_c10d.py:429: UserWarning: torch.distributed.distributed_c10d._get_global_rank is deprecated please use torch.distributed.distributed_c10d.get_global_rank instead
4: warnings.warn(
4: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/distributed_c10d.py:429: UserWarning: torch.distributed.distributed_c10d._get_global_rank is deprecated please use torch.distributed.distributed_c10d.get_global_rank instead
4: warnings.warn(
4: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/distributed_c10d.py:429: UserWarning: torch.distributed.distributed_c10d._get_global_rank is deprecated please use torch.distributed.distributed_c10d.get_global_rank instead
4: warnings.warn(
1: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/distributed_c10d.py:429: UserWarning: torch.distributed.distributed_c10d._get_global_rank is deprecated please use torch.distributed.distributed_c10d.get_global_rank instead
1: warnings.warn(
1: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/distributed_c10d.py:429: UserWarning: torch.distributed.distributed_c10d._get_global_rank is deprecated please use torch.distributed.distributed_c10d.get_global_rank instead
1: warnings.warn(
1: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/distributed_c10d.py:429: UserWarning: torch.distributed.distributed_c10d._get_global_rank is deprecated please use torch.distributed.distributed_c10d.get_global_rank instead
1: warnings.warn(
1: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/distributed_c10d.py:429: UserWarning: torch.distributed.distributed_c10d._get_global_rank is deprecated please use torch.distributed.distributed_c10d.get_global_rank instead
1: warnings.warn(
6: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/distributed_c10d.py:429: UserWarning: torch.distributed.distributed_c10d._get_global_rank is deprecated please use torch.distributed.distributed_c10d.get_global_rank instead
6: warnings.warn(
1: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/distributed_c10d.py:429: UserWarning: torch.distributed.distributed_c10d._get_global_rank is deprecated please use torch.distributed.distributed_c10d.get_global_rank instead
1: warnings.warn(
7: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/distributed_c10d.py:429: UserWarning: torch.distributed.distributed_c10d._get_global_rank is deprecated please use torch.distributed.distributed_c10d.get_global_rank instead
7: warnings.warn(
7: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/distributed_c10d.py:429: UserWarning: torch.distributed.distributed_c10d._get_global_rank is deprecated please use torch.distributed.distributed_c10d.get_global_rank instead
7: warnings.warn(
7: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/distributed_c10d.py:429: UserWarning: torch.distributed.distributed_c10d._get_global_rank is deprecated please use torch.distributed.distributed_c10d.get_global_rank instead
7: warnings.warn(
7: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/distributed_c10d.py:429: UserWarning: torch.distributed.distributed_c10d._get_global_rank is deprecated please use torch.distributed.distributed_c10d.get_global_rank instead
7: warnings.warn(
7: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/distributed_c10d.py:429: UserWarning: torch.distributed.distributed_c10d._get_global_rank is deprecated please use torch.distributed.distributed_c10d.get_global_rank instead
7: warnings.warn(
6: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/distributed_c10d.py:429: UserWarning: torch.distributed.distributed_c10d._get_global_rank is deprecated please use torch.distributed.distributed_c10d.get_global_rank instead
6: warnings.warn(
1: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/distributed_c10d.py:429: UserWarning: torch.distributed.distributed_c10d._get_global_rank is deprecated please use torch.distributed.distributed_c10d.get_global_rank instead
1: warnings.warn(
0: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/distributed_c10d.py:429: UserWarning: torch.distributed.distributed_c10d._get_global_rank is deprecated please use torch.distributed.distributed_c10d.get_global_rank instead
0: warnings.warn(
6: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/distributed_c10d.py:429: UserWarning: torch.distributed.distributed_c10d._get_global_rank is deprecated please use torch.distributed.distributed_c10d.get_global_rank instead
6: warnings.warn(
6: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/distributed_c10d.py:429: UserWarning: torch.distributed.distributed_c10d._get_global_rank is deprecated please use torch.distributed.distributed_c10d.get_global_rank instead
6: warnings.warn(
0: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/distributed_c10d.py:429: UserWarning: torch.distributed.distributed_c10d._get_global_rank is deprecated please use torch.distributed.distributed_c10d.get_global_rank instead
0: warnings.warn(
6: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/distributed_c10d.py:429: UserWarning: torch.distributed.distributed_c10d._get_global_rank is deprecated please use torch.distributed.distributed_c10d.get_global_rank instead
6: warnings.warn(
6: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/distributed_c10d.py:429: UserWarning: torch.distributed.distributed_c10d._get_global_rank is deprecated please use torch.distributed.distributed_c10d.get_global_rank instead
6: warnings.warn(
0: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/distributed_c10d.py:429: UserWarning: torch.distributed.distributed_c10d._get_global_rank is deprecated please use torch.distributed.distributed_c10d.get_global_rank instead
0: warnings.warn(
7: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/distributed_c10d.py:429: UserWarning: torch.distributed.distributed_c10d._get_global_rank is deprecated please use torch.distributed.distributed_c10d.get_global_rank instead
7: warnings.warn(
6: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/distributed_c10d.py:429: UserWarning: torch.distributed.distributed_c10d._get_global_rank is deprecated please use torch.distributed.distributed_c10d.get_global_rank instead
6: warnings.warn(
0: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/distributed_c10d.py:429: UserWarning: torch.distributed.distributed_c10d._get_global_rank is deprecated please use torch.distributed.distributed_c10d.get_global_rank instead
0: warnings.warn(
2: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/distributed_c10d.py:429: UserWarning: torch.distributed.distributed_c10d._get_global_rank is deprecated please use torch.distributed.distributed_c10d.get_global_rank instead
2: warnings.warn(
6: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/distributed_c10d.py:429: UserWarning: torch.distributed.distributed_c10d._get_global_rank is deprecated please use torch.distributed.distributed_c10d.get_global_rank instead
6: warnings.warn(
5: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/distributed_c10d.py:429: UserWarning: torch.distributed.distributed_c10d._get_global_rank is deprecated please use torch.distributed.distributed_c10d.get_global_rank instead
5: warnings.warn(
7: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/distributed_c10d.py:429: UserWarning: torch.distributed.distributed_c10d._get_global_rank is deprecated please use torch.distributed.distributed_c10d.get_global_rank instead
7: warnings.warn(
5: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/distributed_c10d.py:429: UserWarning: torch.distributed.distributed_c10d._get_global_rank is deprecated please use torch.distributed.distributed_c10d.get_global_rank instead
5: warnings.warn(
5: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/distributed_c10d.py:429: UserWarning: torch.distributed.distributed_c10d._get_global_rank is deprecated please use torch.distributed.distributed_c10d.get_global_rank instead
5: warnings.warn(
7: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/distributed_c10d.py:429: UserWarning: torch.distributed.distributed_c10d._get_global_rank is deprecated please use torch.distributed.distributed_c10d.get_global_rank instead
7: warnings.warn(
5: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/distributed_c10d.py:429: UserWarning: torch.distributed.distributed_c10d._get_global_rank is deprecated please use torch.distributed.distributed_c10d.get_global_rank instead
5: warnings.warn(
5: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/distributed_c10d.py:429: UserWarning: torch.distributed.distributed_c10d._get_global_rank is deprecated please use torch.distributed.distributed_c10d.get_global_rank instead
5: warnings.warn(
5: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/distributed_c10d.py:429: UserWarning: torch.distributed.distributed_c10d._get_global_rank is deprecated please use torch.distributed.distributed_c10d.get_global_rank instead
5: warnings.warn(
0: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/distributed_c10d.py:429: UserWarning: torch.distributed.distributed_c10d._get_global_rank is deprecated please use torch.distributed.distributed_c10d.get_global_rank instead
0: warnings.warn(
5: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/distributed_c10d.py:429: UserWarning: torch.distributed.distributed_c10d._get_global_rank is deprecated please use torch.distributed.distributed_c10d.get_global_rank instead
5: warnings.warn(
5: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/distributed_c10d.py:429: UserWarning: torch.distributed.distributed_c10d._get_global_rank is deprecated please use torch.distributed.distributed_c10d.get_global_rank instead
5: warnings.warn(
0: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/distributed_c10d.py:429: UserWarning: torch.distributed.distributed_c10d._get_global_rank is deprecated please use torch.distributed.distributed_c10d.get_global_rank instead
0: warnings.warn(
2: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/distributed_c10d.py:429: UserWarning: torch.distributed.distributed_c10d._get_global_rank is deprecated please use torch.distributed.distributed_c10d.get_global_rank instead
2: warnings.warn(
2: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/distributed_c10d.py:429: UserWarning: torch.distributed.distributed_c10d._get_global_rank is deprecated please use torch.distributed.distributed_c10d.get_global_rank instead
2: warnings.warn(
2: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/distributed_c10d.py:429: UserWarning: torch.distributed.distributed_c10d._get_global_rank is deprecated please use torch.distributed.distributed_c10d.get_global_rank instead
2: warnings.warn(
2: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/distributed_c10d.py:429: UserWarning: torch.distributed.distributed_c10d._get_global_rank is deprecated please use torch.distributed.distributed_c10d.get_global_rank instead
2: warnings.warn(
3: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/distributed_c10d.py:429: UserWarning: torch.distributed.distributed_c10d._get_global_rank is deprecated please use torch.distributed.distributed_c10d.get_global_rank instead
3: warnings.warn(
2: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/distributed_c10d.py:429: UserWarning: torch.distributed.distributed_c10d._get_global_rank is deprecated please use torch.distributed.distributed_c10d.get_global_rank instead
2: warnings.warn(
3: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/distributed_c10d.py:429: UserWarning: torch.distributed.distributed_c10d._get_global_rank is deprecated please use torch.distributed.distributed_c10d.get_global_rank instead
3: warnings.warn(
0: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/distributed_c10d.py:429: UserWarning: torch.distributed.distributed_c10d._get_global_rank is deprecated please use torch.distributed.distributed_c10d.get_global_rank instead
0: warnings.warn(
3: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/distributed_c10d.py:429: UserWarning: torch.distributed.distributed_c10d._get_global_rank is deprecated please use torch.distributed.distributed_c10d.get_global_rank instead
3: warnings.warn(
2: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/distributed_c10d.py:429: UserWarning: torch.distributed.distributed_c10d._get_global_rank is deprecated please use torch.distributed.distributed_c10d.get_global_rank instead
2: warnings.warn(
3: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/distributed_c10d.py:429: UserWarning: torch.distributed.distributed_c10d._get_global_rank is deprecated please use torch.distributed.distributed_c10d.get_global_rank instead
3: warnings.warn(
3: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/distributed_c10d.py:429: UserWarning: torch.distributed.distributed_c10d._get_global_rank is deprecated please use torch.distributed.distributed_c10d.get_global_rank instead
3: warnings.warn(
3: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/distributed_c10d.py:429: UserWarning: torch.distributed.distributed_c10d._get_global_rank is deprecated please use torch.distributed.distributed_c10d.get_global_rank instead
3: warnings.warn(
3: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/distributed_c10d.py:429: UserWarning: torch.distributed.distributed_c10d._get_global_rank is deprecated please use torch.distributed.distributed_c10d.get_global_rank instead
3: warnings.warn(
3: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/distributed_c10d.py:429: UserWarning: torch.distributed.distributed_c10d._get_global_rank is deprecated please use torch.distributed.distributed_c10d.get_global_rank instead
3: warnings.warn(
2: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/distributed_c10d.py:429: UserWarning: torch.distributed.distributed_c10d._get_global_rank is deprecated please use torch.distributed.distributed_c10d.get_global_rank instead
2: warnings.warn(
0: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/distributed_c10d.py:429: UserWarning: torch.distributed.distributed_c10d._get_global_rank is deprecated please use torch.distributed.distributed_c10d.get_global_rank instead
0: warnings.warn(
4: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
1: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
0: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
4: Emitting ninja build file /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu/utils/build.ninja...
4: Building extension module utils...
4: Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
0: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
0: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
0: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
0: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
0: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
0: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
0: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
1: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
1: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
1: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
1: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
1: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
1: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
2: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
2: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
2: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
2: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
2: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
2: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
1: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
2: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
2: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
3: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
3:
4: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
3: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
4: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
3: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
3:
3: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
3: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
4: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
3: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
4: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
4: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
4: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
4: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
5: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
5:
5:
5:
5: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
5:
5: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
5: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
6: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
6:
6: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
6: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
6: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
6:
6: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
6: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
7: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
7: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
7:
7:
7:
7:
7:
7:
4: Loading extension module utils...
7: Emitting ninja build file /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu/utils/build.ninja...
7: Building extension module utils...
7: Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
7: Loading extension module utils...
0: Loading extension module utils...
0: Loading extension module utils...
0: Loading extension module utils...
0: Loading extension module utils...
0: Loading extension module utils...
1: Loading extension module utils...
0: Loading extension module utils...
0: Loading extension module utils...
1: Loading extension module utils...
1: Loading extension module utils...
1: Loading extension module utils...
1: Loading extension module utils...
1: Loading extension module utils...
1: Loading extension module utils...
4: Loading extension module utils...
4: Loading extension module utils...
4: Loading extension module utils...
4: Loading extension module utils...
4: Loading extension module utils...
4: Loading extension module utils...
4: Loading extension module utils...
2: Loading extension module utils...
2: Loading extension module utils...
3: Loading extension module utils...
2: Loading extension module utils...
3: Loading extension module utils...
2: Loading extension module utils...
3: Loading extension module utils...
2: Loading extension module utils...
3: Loading extension module utils...
2: Loading extension module utils...
3: Loading extension module utils...
2: Loading extension module utils...
3: Loading extension module utils...
3: Loading extension module utils...
2: Loading extension module utils...
3: Loading extension module utils...
5: Loading extension module utils...
5: Loading extension module utils...
5: Loading extension module utils...
5: Loading extension module utils...
5: Loading extension module utils...
5: Loading extension module utils...
5: Loading extension module utils...
5: Loading extension module utils...
6: Loading extension module utils...
6: Loading extension module utils...
6: Loading extension module utils...
6: Loading extension module utils...
6: Loading extension module utils...
6: Loading extension module utils...
6: Loading extension module utils...
6: Loading extension module utils...
7: Loading extension module utils...
7: Loading extension module utils...
7: Loading extension module utils...
7: Loading extension module utils...
7: Loading extension module utils...
7: Loading extension module utils...
7: Loading extension module utils...
1: Loading extension module utils...
0: Loading extension module utils...
5: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
5: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
5: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
5: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
6: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
5: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
5: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
6: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
6:
6:
5: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
6: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
6:
5: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
6: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
6: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
3: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
3:
3: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
3: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
3:
3:
5: No modifications detected for re-loaded extension module utils, skipping build step...
5: Loading extension module utils...
3: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
3: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
6: No modifications detected for re-loaded extension module utils, skipping build step...
6: Loading extension module utils...
6: No modifications detected for re-loaded extension module utils, skipping build step...
6: Loading extension module utils...
3: No modifications detected for re-loaded extension module utils, skipping build step...
3: Loading extension module utils...
6: No modifications detected for re-loaded extension module utils, skipping build step...
6: Loading extension module utils...
5: No modifications detected for re-loaded extension module utils, skipping build step...
5: Loading extension module utils...
5: No modifications detected for re-loaded extension module utils, skipping build step...
5: Loading extension module utils...
5: No modifications detected for re-loaded extension module utils, skipping build step...
5: Loading extension module utils...
5: No modifications detected for re-loaded extension module utils, skipping build step...
5: Loading extension module utils...
5: No modifications detected for re-loaded extension module utils, skipping build step...
5: Loading extension module utils...
5: No modifications detected for re-loaded extension module utils, skipping build step...
5: Loading extension module utils...
5: No modifications detected for re-loaded extension module utils, skipping build step...
5: Loading extension module utils...
0: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
6: No modifications detected for re-loaded extension module utils, skipping build step...
6: Loading extension module utils...
0: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
6: No modifications detected for re-loaded extension module utils, skipping build step...
6: Loading extension module utils...
6: No modifications detected for re-loaded extension module utils, skipping build step...
6: Loading extension module utils...
6: No modifications detected for re-loaded extension module utils, skipping build step...
6: Loading extension module utils...
0: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
3: No modifications detected for re-loaded extension module utils, skipping build step...
3: Loading extension module utils...
0: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
6: No modifications detected for re-loaded extension module utils, skipping build step...
3: No modifications detected for re-loaded extension module utils, skipping build step...
3: Loading extension module utils...
3: No modifications detected for re-loaded extension module utils, skipping build step...
6: Loading extension module utils...
3: Loading extension module utils...
3: No modifications detected for re-loaded extension module utils, skipping build step...
3: Loading extension module utils...
3: No modifications detected for re-loaded extension module utils, skipping build step...
3: Loading extension module utils...
0: No modifications detected for re-loaded extension module utils, skipping build step...
0: Loading extension module utils...
3: No modifications detected for re-loaded extension module utils, skipping build step...
3: Loading extension module utils...
2: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
0: No modifications detected for re-loaded extension module utils, skipping build step...
0: Loading extension module utils...
3: No modifications detected for re-loaded extension module utils, skipping build step...
3: Loading extension module utils...
2: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
0: No modifications detected for re-loaded extension module utils, skipping build step...No modifications detected for re-loaded extension module utils, skipping build step...
0:
0: Loading extension module utils...Loading extension module utils...
0:
0: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
0: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
2: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
0: No modifications detected for re-loaded extension module utils, skipping build step...
0: Loading extension module utils...
2: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
2: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
2:
2:
0: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
0: No modifications detected for re-loaded extension module utils, skipping build step...
0: Loading extension module utils...
2: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
0: No modifications detected for re-loaded extension module utils, skipping build step...
0: Loading extension module utils...
2: No modifications detected for re-loaded extension module utils, skipping build step...
2: Loading extension module utils...
2: No modifications detected for re-loaded extension module utils, skipping build step...
2: Loading extension module utils...
2: No modifications detected for re-loaded extension module utils, skipping build step...
2: Loading extension module utils...
2: No modifications detected for re-loaded extension module utils, skipping build step...
2: No modifications detected for re-loaded extension module utils, skipping build step...Loading extension module utils...
2:
2: Loading extension module utils...
2: No modifications detected for re-loaded extension module utils, skipping build step...
2: Loading extension module utils...
2: No modifications detected for re-loaded extension module utils, skipping build step...
2: Loading extension module utils...
2: No modifications detected for re-loaded extension module utils, skipping build step...
2: Loading extension module utils...
1: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
1: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
1: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
1: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
1:
1: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
1: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
1: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
1: No modifications detected for re-loaded extension module utils, skipping build step...
1: Loading extension module utils...
1: No modifications detected for re-loaded extension module utils, skipping build step...
1: Loading extension module utils...
1: No modifications detected for re-loaded extension module utils, skipping build step...No modifications detected for re-loaded extension module utils, skipping build step...
1:
1: Loading extension module utils...Loading extension module utils...
1:
1: No modifications detected for re-loaded extension module utils, skipping build step...
1: Loading extension module utils...
1: No modifications detected for re-loaded extension module utils, skipping build step...
1: Loading extension module utils...
1: No modifications detected for re-loaded extension module utils, skipping build step...
1: Loading extension module utils...
1: No modifications detected for re-loaded extension module utils, skipping build step...
1: Loading extension module utils...
4: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
4:
4:
4: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
4: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
4: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
4: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
4: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
7: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
4: No modifications detected for re-loaded extension module utils, skipping build step...
4: Loading extension module utils...
4: No modifications detected for re-loaded extension module utils, skipping build step...
4: Loading extension module utils...
7: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
7: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
7:
4: No modifications detected for re-loaded extension module utils, skipping build step...
4: Loading extension module utils...
7: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
7:
7: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
4: No modifications detected for re-loaded extension module utils, skipping build step...
4: No modifications detected for re-loaded extension module utils, skipping build step...No modifications detected for re-loaded extension module utils, skipping build step...Loading extension module utils...
4:
4:
4: Loading extension module utils...Loading extension module utils...
4:
4: No modifications detected for re-loaded extension module utils, skipping build step...
4: Loading extension module utils...
4: No modifications detected for re-loaded extension module utils, skipping build step...
4: Loading extension module utils...
7: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
7: No modifications detected for re-loaded extension module utils, skipping build step...No modifications detected for re-loaded extension module utils, skipping build step...
7:
7: Loading extension module utils...Loading extension module utils...
7:
7: No modifications detected for re-loaded extension module utils, skipping build step...
7: Loading extension module utils...
7: No modifications detected for re-loaded extension module utils, skipping build step...
7: Loading extension module utils...
7: No modifications detected for re-loaded extension module utils, skipping build step...
7: No modifications detected for re-loaded extension module utils, skipping build step...No modifications detected for re-loaded extension module utils, skipping build step...Loading extension module utils...
7:
7:
7: Loading extension module utils...
7: Loading extension module utils...
7: No modifications detected for re-loaded extension module utils, skipping build step...
7: Loading extension module utils...
0: Using /pfs/lustrep4/users/muennighoff/.cache/torch_extensions/py39_cpu as PyTorch extensions root...
0: No modifications detected for re-loaded extension module utils, skipping build step...
0: Loading extension module utils...
0: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-mtf/Megatron-DeepSpeed/megatron/utils.py:356: UserWarning: Parameter count with the embeddings will be inaccurate with PP > 1, as the first and last stage hold several copies of the embeddings
0: warnings.warn("Parameter count with the embeddings will be inaccurate with PP > 1, as the first and last stage hold several copies of the embeddings")
4: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/distributed_c10d.py:429: UserWarning: torch.distributed.distributed_c10d._get_global_rank is deprecated please use torch.distributed.distributed_c10d.get_global_rank instead
4: warnings.warn(
7: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/distributed_c10d.py:429: UserWarning: torch.distributed.distributed_c10d._get_global_rank is deprecated please use torch.distributed.distributed_c10d.get_global_rank instead
7: warnings.warn(
6: Fatal Python error: Bus error
6:
6: Thread 0x0000144db9784700 (most recent call first):
6: <no Python frame>
6:
6: Thread 0x0000144db9583700 (most recent call first):
6: <no Python frame>
6:
6: Thread 0x0000144db9382700 (most recent call first):
6: <no Python frame>
6:
6: Thread 0x0000144db9181700 (most recent call first):
6: <no Python frame>
6:
6: Thread 0x0000144db8f80700 (most recent call first):
6: Memory access fault by GPU node-11 (Agent handle: 0x737af60) on address (nil)(may not be exact address). Reason: DRAM ECC failure.
6: <no Python frame>
6:
6: Thread 0x0000144db8d7f700 (most recent call first):
6: <no Python frame>
6:
6: Thread 0x0000144db8b7e700 (most recent call first):
6: <no Python frame>
6:
6: Thread 0x0000144db9985700 (most recent call first):
6: File "/pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/autograd/__init__.py", line 197 in Fatal Python error: backward
6: Aborted File
6:
6: "/pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/deepspeed/runtime/activation_checkpointing/checkpointing.py", line 725 in backward
6: File "/pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/autograd/function.py", line 267 in WARNING:torch.distributed.elastic.multiprocessing.api:Sending process 83868 closing signal SIGTERM
6: WARNING:torch.distributed.elastic.multiprocessing.api:Sending process 83869 closing signal SIGTERM
6: WARNING:torch.distributed.elastic.multiprocessing.api:Sending process 83870 closing signal SIGTERM
6: WARNING:torch.distributed.elastic.multiprocessing.api:Sending process 83871 closing signal SIGTERM
6: WARNING:torch.distributed.elastic.multiprocessing.api:Sending process 83872 closing signal SIGTERM
6: WARNING:torch.distributed.elastic.multiprocessing.api:Sending process 83873 closing signal SIGTERM
6: WARNING:torch.distributed.elastic.multiprocessing.api:Sending process 83874 closing signal SIGTERM
6: WARNING:torch.distributed.elastic.multiprocessing.api:Unable to shutdown process 83868 via 15, forcefully exitting via 9
6: ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: -6) local_rank: 7 (pid: 83875) of binary: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/bin/python
6: ERROR:torch.distributed.elastic.agent.server.api:Error waiting on exit barrier. Elapsed: 314.5357172489166 seconds
6: Traceback (most recent call last):
6: File "/pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/elastic/agent/server/api.py", line 906, in _exit_barrier
6: store_util.barrier(
6: File "/pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/elastic/utils/store.py", line 78, in barrier
6: synchronize(store, data, rank, world_size, key_prefix, barrier_timeout)
6: File "/pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/elastic/utils/store.py", line 64, in synchronize
6: agent_data = get_all(store, rank, key_prefix, world_size)
6: File "/pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/elastic/utils/store.py", line 34, in get_all
6: data = store.get(f"{prefix}{idx}")
6: RuntimeError: Socket Timeout
6: Traceback (most recent call last):
6: File "/opt/cray/pe/python/3.9.12.1/lib/python3.9/runpy.py", line 197, in _run_module_as_main
6: return _run_code(code, main_globals, None,
6: File "/opt/cray/pe/python/3.9.12.1/lib/python3.9/runpy.py", line 87, in _run_code
6: exec(code, run_globals)
6: File "/pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/run.py", line 766, in <module>
6: main()
6: File "/pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 346, in wrapper
6: return f(*args, **kwargs)
6: File "/pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/run.py", line 762, in main
6: run(args)
6: File "/pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/run.py", line 753, in run
6: elastic_launch(
6: File "/pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/launcher/api.py", line 132, in __call__
6: return launch_agent(self._config, self._entrypoint, list(args))
6: File "/pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/distributed/launcher/api.py", line 246, in launch_agent
6: raise ChildFailedError(
6: torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
6: ======================================================
6: Megatron-DeepSpeed/finetune_t0.py FAILED
6: ------------------------------------------------------
6: Failures:
6: <NO_OTHER_FAILURES>
6: ------------------------------------------------------
6: Root Cause (first observed failure):
6: [0]:
6: time : 2022-12-04_14:59:33
6: host : nid007349
6: rank : 55 (local_rank: 7)
6: exitcode : -6 (pid: 83875)
6: error_file: <N/A>
6: traceback : Signal 6 (SIGABRT) received by PID 83875
6: ======================================================
srun: error: nid007349: task 6: Exited with exit code 1
srun: launch/slurm: _step_signal: Terminating StepId=2105757.0
0: slurmstepd: error: *** STEP 2105757.0 ON nid007343 CANCELLED AT 2022-12-04T15:05:57 ***
srun: error: nid007344: task 1: Terminated
srun: error: nid007346: task 3: Terminated
srun: error: nid007345: task 2: Terminated
srun: error: nid007350: task 7: Terminated
srun: error: nid007348: task 5: Terminated
srun: error: nid007343: task 0: Terminated
srun: error: nid007347: task 4: Terminated
srun: Force Terminated StepId=2105757.0