How to get this running using FastChat on an M1 Mac?

#7 · opened by kkostecky

Hi there, can someone give me directions on how to get a GGML model like this running with FastChat on an M1 Mac? I have the regular Vicuna 7B and 13B models running, but these are not PyTorch files. Thanks!

FastChat doesn't support GGML as far as I know. You're going to have to use either oobabooga or llama.cpp.

Also, since you're on an M1, make sure to get the q4_2 models. They perform well on Apple Silicon.
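
If you'd rather drive it from Python than the llama.cpp CLI, here's a minimal sketch using the llama-cpp-python bindings. The model path and prompt template below are placeholders; point them at your actual q4_2 file and the prompt format your Vicuna build expects:

```python
# Minimal sketch, assuming llama-cpp-python is installed
# (pip install llama-cpp-python).
from llama_cpp import Llama

llm = Llama(
    model_path="./models/ggml-vicuna-7b-q4_2.bin",  # placeholder: your q4_2 file
    n_ctx=2048,    # context window size
    n_threads=8,   # tune to your M1's core count
)

# Placeholder prompt template; adjust to match your model's expected format.
prompt = "### Human: Hello, how are you?\n### Assistant:"
out = llm(prompt, max_tokens=128, stop=["### Human:"])
print(out["choices"][0]["text"])
```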

Thanks. Yeah, I already have it running on llama.cpp with the q4_2 model.

OK, fair enough regarding FastChat. Thank you!
