Why can't I use Llama 2 on macOS Ventura 10.14?

#13
by Yerrramsetty - opened

torchrun --nproc_per_node 1 example_text_completion.py --ckpt_dir llama-2-7b/ --tokenizer_path tokenizer.model --max_seq_len 128 --max_batch_size 4
NOTE: Redirects are currently not supported in Windows or MacOs.
Traceback (most recent call last):
File "/Users/jayasaikrishnayerramsetty/llama2/llama/example_text_completion.py", line 69, in <module>
fire.Fire(main)
File "/Users/jayasaikrishnayerramsetty/anaconda3/lib/python3.11/site-packages/fire/core.py", line 141, in Fire
component_trace = _Fire(component, args, parsed_flag_args, context, name)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/jayasaikrishnayerramsetty/anaconda3/lib/python3.11/site-packages/fire/core.py", line 466, in _Fire
component, remaining_args = _CallAndUpdateTrace(
^^^^^^^^^^^^^^^^^^^^
File "/Users/jayasaikrishnayerramsetty/anaconda3/lib/python3.11/site-packages/fire/core.py", line 681, in _CallAndUpdateTrace
component = fn(*varargs, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^
File "/Users/jayasaikrishnayerramsetty/llama2/llama/example_text_completion.py", line 32, in main
generator = Llama.build(
^^^^^^^^^^^^
File "/Users/jayasaikrishnayerramsetty/llama2/llama/llama/generation.py", line 84, in build
torch.distributed.init_process_group("nccl")
File "/Users/jayasaikrishnayerramsetty/anaconda3/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py", line 907, in init_process_group
default_pg = _new_process_group_helper(
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/jayasaikrishnayerramsetty/anaconda3/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py", line 1013, in _new_process_group_helper
raise RuntimeError("Distributed package doesn't have NCCL " "built in")
RuntimeError: Distributed package doesn't have NCCL built in
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 0 (pid: 74324) of binary: /Users/jayasaikrishnayerramsetty/anaconda3/bin/python
Traceback (most recent call last):
File "/Users/jayasaikrishnayerramsetty/anaconda3/bin/torchrun", line 33, in <module>
sys.exit(load_entry_point('torch==2.0.1', 'console_scripts', 'torchrun')())
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/jayasaikrishnayerramsetty/anaconda3/lib/python3.11/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 346, in wrapper
return f(*args, **kwargs)
^^^^^^^^^^^^^^^^^^
File "/Users/jayasaikrishnayerramsetty/anaconda3/lib/python3.11/site-packages/torch/distributed/run.py", line 794, in main
run(args)
File "/Users/jayasaikrishnayerramsetty/anaconda3/lib/python3.11/site-packages/torch/distributed/run.py", line 785, in run
elastic_launch(
File "/Users/jayasaikrishnayerramsetty/anaconda3/lib/python3.11/site-packages/torch/distributed/launcher/api.py", line 134, in __call__
return launch_agent(self._config, self._entrypoint, list(args))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/jayasaikrishnayerramsetty/anaconda3/lib/python3.11/site-packages/torch/distributed/launcher/api.py", line 250, in launch_agent
raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:

example_text_completion.py FAILED

Failures:

Root Cause (first observed failure):
[0]:
time : 2023-09-04_02:18:55
host : 1.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.ip6.arpa
rank : 0 (local_rank: 0)
exitcode : 1 (pid: 74324)
error_file: <N/A>
traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html

Any response is much appreciated.
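
For anyone reading the traceback: the root failure is the call `torch.distributed.init_process_group("nccl")` in `generation.py`. NCCL is NVIDIA's GPU communication library, and PyTorch builds for macOS do not include it, which is what the `RuntimeError` reports. A minimal sketch of choosing a backend at runtime is below. The helper names `pick_backend` and `detect_backend` are hypothetical and not part of the Llama code, and falling back to `gloo` is an assumption about what a workaround could look like, not a confirmed fix.

```python
def pick_backend(cuda_available: bool, nccl_available: bool) -> str:
    """Return "nccl" only when both CUDA and NCCL are usable, else "gloo".

    Hypothetical helper: the original script hardcodes "nccl".
    """
    return "nccl" if (cuda_available and nccl_available) else "gloo"


def detect_backend() -> str:
    """Query the installed PyTorch build and pick a backend for it."""
    # Imported lazily so pick_backend() can be used without torch installed.
    import torch
    import torch.distributed as dist

    # is_nccl_available() is only meaningful if the distributed package exists.
    nccl = dist.is_available() and dist.is_nccl_available()
    return pick_backend(torch.cuda.is_available(), nccl)
```

With this, the failing line could in principle become `torch.distributed.init_process_group(detect_backend())`. That change only addresses the backend error shown above; other parts of the example script may still assume a CUDA device and would need separate changes to run on a Mac.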
