Can I load these weights into a model using 8 GPUs?

#2
by bournezz - opened

I'm new to DeepSpeed and still don't understand how the weight sharding works. I assume these weights are intended for users with 4 A100 80GB GPUs, because there are 4 groups of tp_**.pt files. Since I only have V100s, I need more than 4 GPUs to host these weights. Can I use these weights in my program? If not, how can I reshard them into 8 partitions?

You should be fine just setting training_mp_size=4 in deepspeed.init_inference. However, I see that it works even without setting anything with deepspeed>=0.8.0. I suppose that under the hood DeepSpeed splits every tensor once more to obtain 8 shards.
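To illustrate the idea of "splitting every tensor another time", here is a minimal pure-Python sketch. It is not DeepSpeed's code: `split_shard` and `reshard` are hypothetical helpers, the "weight" is a toy list of 16 numbers, and it assumes every tensor is split evenly along a single dimension. Real tensor-parallel layouts split different layers along different dimensions, so treat this only as a picture of going from 4 to 8 shards.

```python
def split_shard(shard, factor):
    """Split one shard (a list of values) into `factor` equal contiguous pieces."""
    if len(shard) % factor != 0:
        raise ValueError("shard length must be divisible by factor")
    step = len(shard) // factor
    return [shard[i * step:(i + 1) * step] for i in range(factor)]


def reshard(shards, new_count):
    """Re-partition `shards` into `new_count` shards by splitting each one further."""
    if new_count % len(shards) != 0:
        raise ValueError("new_count must be a multiple of the current shard count")
    factor = new_count // len(shards)
    # Keep the original ordering so that shard i of 4 becomes shards 2i and 2i+1 of 8.
    return [piece for shard in shards for piece in split_shard(shard, factor)]


# Toy "weight" of 16 elements, stored as 4 tensor-parallel shards (hypothetical data).
full = list(range(16))
tp4 = split_shard(full, 4)

# Split each of the 4 shards once more to obtain 8 shards, one per GPU.
tp8 = reshard(tp4, 8)

# Concatenating the 8 shards in order recovers the original weight.
restored = [x for shard in tp8 for x in shard]
```

Because each of the 4 shards is cut into two contiguous halves, no data has to move between the original shards, which is why going from 4 to 8 partitions is the easy direction.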

Hi @bournezz, did you succeed in using V100s to run BLOOM inference? How many V100s did you use?
