`std::runtime_error: [Matmul::eval_cpu] Currently only supports float32`

#2
by adhishthite - opened

Hello,

I am getting this error on my Mac M1 Pro running Sonoma 14.3, with Python 3.11 and the latest PyTorch.

Please visit this link for the full output: https://app.warp.dev/block/NMbYuCAkwfcxcQ7zjhZv8n

In [5]: response = generate(model, tokenizer, prompt="<step>Source: user Fibonacci series in Python<step> Source: assistant Destination: user", verbose=True)
   ...:
   ...:
==========
Prompt: <step>Source: user Fibonacci series in Python<step> Source: assistant Destination: user

libc++abi: terminating due to uncaught exception of type std::runtime_error: [Matmul::eval_cpu] Currently only supports float32.
[1]    9782 abort      ipython
MLX Community org

How much RAM does your M1 Pro have? This model requires a machine with at least 64GB of RAM to run with q4.
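A rough back-of-envelope check illustrates that requirement. Assuming MLX's default 4-bit quantization (group size 64, so roughly 4.5 bits per weight once the per-group fp16 scales and biases are counted — an approximation, not an exact measurement of this model), the weights alone of a 70B-class model land in the mid-30s of GiB, while a 7B model's weights are only a few GiB:

```python
def q4_weight_gib(n_params: float, group_size: int = 64) -> float:
    """Approximate on-disk/in-memory size of 4-bit quantized weights, in GiB.

    Assumes 4 bits per weight plus two fp16 values (scale and bias)
    per quantization group, matching MLX's default q4 scheme.
    """
    bits_per_weight = 4 + 2 * 16 / group_size  # = 4.5 for group_size=64
    total_bytes = n_params * bits_per_weight / 8
    return total_bytes / (1024 ** 3)

print(round(q4_weight_gib(70e9), 1))  # → 36.7 (weights of a 70B model)
print(round(q4_weight_gib(7e9), 1))   # → 3.7  (weights of a 7B model)
```

Activations, the KV cache, and runtime overhead come on top of the raw weights, which is why a ~37 GiB weight footprint translates into a 64GB machine in practice.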

edited Feb 28

@ivanfioravanti I have the 16GB M1 Pro, but I still got this error with a 7B Q4 quantized model.

Please see my GitHub issue if you know how to fix it: https://github.com/ml-explore/mlx/issues/753. Many thanks.
