Apple Silicon Macs support

#66
by quantoser - opened

Does this model work on Macs with Silicon chips? I'm running it on a Mac Pro M1 and it gets stuck with:

UserWarning: Using the model-agnostic default max_length (=20) to control the generation length. We recommend setting max_new_tokens to control the maximum length of the generation.
warnings.warn(

The process just sits there eating up CPU and memory, but no output is ever produced.

Google org

Hi @quantoser !
In which precision are you running the generation? The 7B model needs ~30GB of RAM just to be loaded on the CPU in float32. Can you try loading the model in bfloat16 instead?
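For reference, a minimal sketch of loading in bfloat16 and setting an explicit `max_new_tokens` (the model id below is an assumption, substitute the repo this discussion belongs to):

```python
# Sketch: load in bfloat16 to roughly halve RAM vs. float32, and pass
# max_new_tokens explicitly to silence the max_length warning.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "google/gemma-7b"  # assumption: replace with the actual repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # ~15GB instead of ~30GB in float32
    device_map="auto",           # lets accelerate place weights (MPS/CPU on a Mac)
)

inputs = tokenizer("Hello, my name is", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Note that `device_map="auto"` requires `accelerate` to be installed; on Apple Silicon the weights can land on the MPS backend rather than the CPU, which is usually much faster.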
