Run fails on my macOS M3
Loading checkpoint shards: 0%| | 0/2 [00:00<?, ?it/s]
Traceback (most recent call last):
File "/Users/bytedance/Public/Code/DragonBall/Octopus-v2/test.py", line 20, in
model = GemmaForCausalLM.from_pretrained(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/homebrew/lib/python3.12/site-packages/transformers/modeling_utils.py", line 3531, in from_pretrained
) = cls._load_pretrained_model(
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/homebrew/lib/python3.12/site-packages/transformers/modeling_utils.py", line 3958, in _load_pretrained_model
new_error_msgs, offload_index, state_dict_index = _load_state_dict_into_meta_model(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/homebrew/lib/python3.12/site-packages/transformers/modeling_utils.py", line 812, in _load_state_dict_into_meta_model
set_module_tensor_to_device(model, param_name, param_device, **set_module_kwargs)
File "/opt/homebrew/lib/python3.12/site-packages/accelerate/utils/modeling.py", line 399, in set_module_tensor_to_device
new_value = value.to(device)
^^^^^^^^^^^^^^^^
TypeError: BFloat16 is not supported on MPS
Try changing torch_dtype to float32.
We will provide a solution for this need in the future. Sorry, we will work harder to accelerate this.
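A minimal sketch of that float32 workaround, assuming the checkpoint is the `NexaAIDev/Octopus-v2` repo and the same `GemmaForCausalLM` entry point shown in the traceback (substitute whatever model path your test.py uses):

```python
import torch
from transformers import AutoTokenizer, GemmaForCausalLM

# BFloat16 is not supported by the MPS backend, so load the weights in float32.
# The model id below is an assumption; point it at your own checkpoint if different.
model_id = "NexaAIDev/Octopus-v2"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = GemmaForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float32,  # avoids "BFloat16 is not supported on MPS"
).to("mps")
```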
This helps, thanks @twhongyujiang .
I now get this error:
NotImplementedError: The operator 'aten::isin.Tensor_Tensor_out' is not currently implemented for the MPS device. If you want this op to be added in priority during the prototype phase of this feature, please comment on https://github.com/pytorch/pytorch/issues/77764. As a temporary fix, you can set the environment variable `PYTORCH_ENABLE_MPS_FALLBACK=1` to use the CPU as a fallback for this op. WARNING: this will be slower than running natively on MPS.
Which is sad...
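The error message above already names the temporary workaround: set `PYTORCH_ENABLE_MPS_FALLBACK=1` so the unsupported op runs on the CPU. A minimal sketch, assuming the variable is set at the very top of test.py, before torch is imported:

```python
import os

# Must be in the environment before torch is imported so the MPS backend picks it up.
os.environ["PYTORCH_ENABLE_MPS_FALLBACK"] = "1"

import torch  # noqa: E402  (deliberately imported after setting the variable)

# Ops without an MPS kernel (e.g. aten::isin.Tensor_Tensor_out) now fall back to CPU.
# This is slower than running natively on MPS, but avoids the NotImplementedError.
```

Alternatively, export the variable in your shell before launching the script.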
@Circle9
We have a GGUF converter (https://huggingface.co/spaces/NexaAIDev/gguf-convertor). Please try converting the safetensors to GGUF and running it with Ollama. We have an example of Octopus-V4 in GGUF:
https://huggingface.co/NexaAIDev/octopus-v4-gguf
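For the Ollama route, a rough sketch of fetching a GGUF file from the example repo with `huggingface_hub`; the filename below is a placeholder, so check the repo's file list for the quantization you actually want:

```python
from huggingface_hub import hf_hub_download

# Repo from the link above; the filename is hypothetical -- pick one of the
# .gguf files actually listed in the repository.
gguf_path = hf_hub_download(
    repo_id="NexaAIDev/octopus-v4-gguf",
    filename="octopus-v4-Q4_K_M.gguf",
)
print(gguf_path)

# To serve it with Ollama, write a Modelfile that points at the downloaded file
# (FROM /path/to/the/downloaded.gguf), then:
#   ollama create octopus-v4 -f Modelfile
#   ollama run octopus-v4
```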