How much disk does each of the bloom models require?

#50
by dgaff - opened

Hey all - I'm looking for a listing of the new models by disk usage, but I can't seem to find one. Is there anywhere I can look that up?

BigScience Workshop org
edited Jul 18, 2022

A good rule of thumb for autoregressive transformers is roughly 13 bytes per parameter for training and 2 bytes per parameter for inference (i.e. 13x and 2x the number of parameters, in bytes).

It needed around 400GB just to fit all the weight files. The model card lists the sizes of the weights and checkpoints under the Training section.
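
To make the rule of thumb concrete, here is a minimal sketch of the arithmetic. It assumes the 13x and 2x multipliers are bytes per parameter, and that the largest BLOOM model has about 176 billion parameters; the function and constant names are made up for illustration, and the results are rough estimates rather than measured sizes.

```python
# Rough size estimate from a parameter count.
# Assumption: the rule-of-thumb multipliers are bytes per parameter.
INFERENCE_BYTES_PER_PARAM = 2   # e.g. half-precision weights
TRAINING_BYTES_PER_PARAM = 13   # rule-of-thumb figure from this thread


def estimate_gb(num_params: int, bytes_per_param: float) -> float:
    """Return an estimated size in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


# Assumption: ~176 billion parameters for the largest BLOOM model.
bloom_params = 176_000_000_000

inference_gb = estimate_gb(bloom_params, INFERENCE_BYTES_PER_PARAM)
training_gb = estimate_gb(bloom_params, TRAINING_BYTES_PER_PARAM)

print(f"inference: ~{inference_gb:.0f} GB")
print(f"training:  ~{training_gb:.0f} GB")
```

Under these assumptions the inference estimate comes out at about 352 GB, which is in the same ballpark as the roughly 400GB quoted above for the weight files. For smaller models, swap in their parameter counts.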

BigScience Workshop org

Closing as this seems resolved.

TimeRobber changed discussion status to closed
