
Why is the size of the model ~275 GB?

#26
by chakibb - opened

Hi everyone,

According to the model card, the model should be 70B params × 16 bits (2 bytes) = 140 GB, but the storage required is 275 GB.
Are the parameters stored in 32-bit or 16-bit precision?
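The arithmetic above can be sketched quickly; note that 70B parameters at 4 bytes each comes to ~280 GB, close to the 275 GB observed (the parameter count and byte sizes below are the same assumptions as in the question):

```python
# Rough size arithmetic for a 70B-parameter model.
params = 70e9            # assumed parameter count from the model card
fp16_gb = params * 2 / 1e9   # 2 bytes/param (16-bit)
fp32_gb = params * 4 / 1e9   # 4 bytes/param (32-bit)
print(f"fp16: {fp16_gb:.0f} GB, fp32: {fp32_gb:.0f} GB")
# fp16 gives 140 GB; fp32 gives 280 GB, roughly matching the 275 GB on disk
```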

Yes, my bet is that if you load the model and save it back in 16-bit precision, nothing will be lost and the file size will match Llama 2's.

Would be nice to see this fixed though.

What a waste of storage and bandwidth!
