
M2 32 GB keeps crashing unfortunately. Not sure why.

#2
by BlenderSushi - opened

/Users/myname/MLX/mlx-examples/zephyr-7b-beta
[INFO] Loading model from disk.
[INFO] Starting generation...
My name islibc++abi: terminating due to uncaught exception of type std::runtime_error: [malloc_or_wait] Unable to allocate 524288000 bytes.
zsh: abort python llms/mistral/mistral.py --model_path zephyr-7b-beta --prompt
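As a side note on the log above, the size of the failed request can be converted to a more readable unit. The snippet below is only a small arithmetic sketch; the helper name `bytes_to_mib` is made up for illustration and is not part of MLX. It shows that the allocation that failed was a single 500 MiB buffer, which is small next to 32GB and suggests (without proving it) that memory was already nearly exhausted by the time generation started.

```python
def bytes_to_mib(n_bytes: int) -> float:
    """Convert a byte count to mebibytes (1 MiB = 1024 * 1024 bytes)."""
    return n_bytes / (1024 ** 2)


# Byte count copied from the [malloc_or_wait] error message in the log.
failed_allocation = 524288000

print(f"{bytes_to_mib(failed_allocation):.0f} MiB")  # prints "500 MiB"
```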

MLX Community org

Hmm, weird that it's crashing with 32GB. Are you monitoring the MPS usage?

MLX Community org

I've double-checked on a Mac with an M2 Pro and 32GB, and it crashes there too, while https://huggingface.co/mlx-community/Mistral-7B-Instruct-v0.2 works fine. The issue occurs while generating the response: the model does fit, but generation fails because it runs out of memory. Not sure if there is a quick fix for it. I could reproduce the same issue using https://huggingface.co/mlx-community/mistral-7B-v0.1/tree/main on the M2 Pro 32GB.

cc @pcuenq

MLX Community org

This is already solved, @BlenderSushi. I think it was due to an issue with the key mapping when converting the weights, but that should be resolved now!
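The thread does not show the actual fix, so the following is only a rough sketch of how a key-mapping problem in converted weights could be spotted, assuming the expected parameter names and the names found in the converted checkpoint are both available as lists of strings. The function `diff_keys` and the example key names are illustrative assumptions, not taken from the conversion script.

```python
def diff_keys(expected, found):
    """Return (missing, unexpected) parameter names, each sorted.

    missing:    names the model expects but the checkpoint lacks
    unexpected: names in the checkpoint that the model does not expect
    """
    expected_set, found_set = set(expected), set(found)
    return sorted(expected_set - found_set), sorted(found_set - expected_set)


# Illustrative names only: one key was left unmapped during conversion.
expected = [
    "layers.0.attention.wq.weight",
    "layers.0.attention.wk.weight",
]
found = [
    "layers.0.attention.wq.weight",
    "model.layers.0.self_attn.k_proj.weight",
]

missing, unexpected = diff_keys(expected, found)
print("missing:", missing)
print("unexpected:", unexpected)
```

If both lists come back empty, the names line up; any entry in either list points at a key that was not mapped as expected.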

alvarobartt changed discussion status to closed
