Memory requirement

#4 opened by jmjzz

Hello, I’m trying to load the model on my server with 4 A4500 GPUs (20 GB of memory each), but I always get an OOM error.

What’s the suggested memory requirement?

Hey, are you trying to load it for training or inference? The model itself is ~96 GB. Loading in 4-bit works for inference on a single A6000 card. I'm going to be pushing an update to the modeling_gemmoe file here in a couple of hours, and that should make things a bit more stable.
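
For reference, a minimal sketch of what 4-bit inference loading typically looks like with transformers + bitsandbytes. The repo id below is a placeholder (not the actual checkpoint name), and `trust_remote_code=True` is assumed because of the custom modeling_gemmoe file; treat this as a rough starting point rather than the official instructions.

```python
# Minimal sketch: load a large checkpoint in 4-bit for inference.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "your-org/gemmoe-checkpoint"  # placeholder repo id, substitute the real one

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # quantize weights to 4-bit at load time
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,  # matmuls still run in bf16
)

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",       # spreads layers across all visible GPUs if one is not enough
    trust_remote_code=True,  # needed for the custom modeling_gemmoe code
)

inputs = tokenizer("Hello, how are you?", return_tensors="pt").to(model.device)
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```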

I see. Could you also share sample code for loading the model correctly?

Yeah I'm currently writing up a whole document with all the code/info. Will be out by this evening.
