int8 model consumes the same GPU memory as the default model

#15
by Iamexperimenting - opened

Hi team, when I try to load the flan-t5-xl model in int8, I see the same GPU memory being consumed as with the default model. Could you please help me here? I'm using SageMaker Studio with an ml.g4dn.xlarge instance.

for default - it consumes - 11448MiB/15109MiB
for float16 - it consumes - 7532MiB/15109MiB
for int8 - it consumes - 11448MiB/15109MiB

Thanks
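
For context on the numbers above, here is a rough back-of-the-envelope sketch of how much memory the weights alone would be expected to take at each precision. The parameter count (about 2.85 billion for flan-t5-xl) is an assumption, not a figure from this thread, and the helper name `estimate_weight_mib` is hypothetical. The estimate ignores activations, the CUDA context, and allocator caching, so real readings will be higher.

```python
# Back-of-the-envelope estimate of weight memory per precision.
# ASSUMPTION: flan-t5-xl has roughly 2.85e9 parameters (approximate figure,
# not taken from this thread). Adjust if you know the exact count.
ASSUMED_PARAM_COUNT = 2_850_000_000

# Bytes needed to store one parameter in each precision.
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "int8": 1}


def estimate_weight_mib(param_count: int, dtype: str) -> float:
    """Return the approximate size of the weights alone, in MiB."""
    return param_count * BYTES_PER_PARAM[dtype] / (1024 ** 2)


if __name__ == "__main__":
    for dtype in BYTES_PER_PARAM:
        size = estimate_weight_mib(ASSUMED_PARAM_COUNT, dtype)
        print(f"{dtype}: ~{size:,.0f} MiB (weights only)")
```

Under these assumptions, the int8 reading being identical to the default reading (11448MiB) sits close to the float32 estimate, which may mean the 8-bit setting did not take effect at load time. That is only a guess from the numbers; it would need to be checked against the actual loading code.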

I'll just share a model that might be helpful to you:
https://huggingface.co/limcheekin/flan-t5-xl-ct2

Hi, good day. May I ask how much RAM it consumes when it is used on CPU?
