BF16 in MPS from Apple

#9
by DAgir - opened

Hi guys. This is great news, thank you for your hard work. I will definitely try the 3B model.
But it would be great if models with the BF16 tensor type ran without problems on Apple's MPS backend.
PyTorch 2.3 solved some of the problems, but there are still bottlenecks where operations fall back to the CPU, and that fallback is why tokens/s drops.
Of course MLX is a great tool, but it would be nice to support seamless development (the cloud is CUDA).
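For context, here is a minimal sketch of what I mean, assuming PyTorch >= 2.3 on an Apple Silicon Mac: a bf16 matmul should stay on the MPS device, and any missing kernel either errors out or, with the fallback env var set, silently runs on the CPU.

```python
import torch

# Minimal sketch: check whether bf16 ops stay on MPS or fall back to the CPU.
# Assumes PyTorch >= 2.3 on an Apple Silicon Mac (macOS recent enough for bf16 on MPS).
assert torch.backends.mps.is_available(), "MPS backend not available"

x = torch.randn(1024, 1024, device="mps", dtype=torch.bfloat16)
y = torch.randn(1024, 1024, device="mps", dtype=torch.bfloat16)

# If a kernel is missing, PyTorch raises NotImplementedError unless
# PYTORCH_ENABLE_MPS_FALLBACK=1 is set, in which case the op silently
# runs on the CPU -- the kind of bottleneck that drags tokens/s down.
z = x @ y
print(z.dtype, z.device)  # expected: torch.bfloat16 mps:0
```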

wrong forum for this
