Is BLOOM-560M also trained in BF16?

#14
by patrickramos - opened

The BLOOM training README says that BLOOM was trained in bf16, and the model card for bigscience/bloom also mentions bf16 weights, but I can't find anything in this model card about the data type of the weights. I assume BLOOM-560M was also trained in bf16, since the model card still links to the same training README, but I just want to make sure. Thanks!

Only the 176B model was trained in BF16. The smaller models, including BLOOM-560M, were all trained in FP16.
https://github.com/bigscience-workshop/Megatron-DeepSpeed/issues/343#issuecomment-1267299209
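For anyone wondering why the distinction matters, here is a rough, standard-library-only sketch of how the two 16-bit formats differ: BF16 keeps float32's 8 exponent bits (wide range, coarse precision), while FP16 has 5 exponent bits and 10 mantissa bits (narrow range, finer precision). The helper names `to_bf16` and `to_fp16` are made up for this illustration and are not part of any BLOOM or Megatron-DeepSpeed code. The BF16 helper truncates rather than rounds, which is a simplification of what real hardware and frameworks do.

```python
import struct


def to_bf16(x: float) -> float:
    """Round-trip x through bfloat16 (hypothetical helper).

    Simplification: keeps the top 16 bits of the float32 encoding,
    i.e. truncation instead of round-to-nearest.
    """
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    return struct.unpack(">f", struct.pack(">I", bits & 0xFFFF0000))[0]


def to_fp16(x: float) -> float:
    """Round-trip x through IEEE half precision (hypothetical helper).

    struct raises OverflowError for values beyond the FP16 range,
    so those are mapped to a signed infinity here.
    """
    try:
        return struct.unpack(">e", struct.pack(">e", x))[0]
    except OverflowError:
        return float("inf") if x > 0 else float("-inf")


if __name__ == "__main__":
    for value in (70000.0, 1.001):
        print(value, "bf16:", to_bf16(value), "fp16:", to_fp16(value))
```

A value like 70000.0 is beyond the largest finite FP16 value (65504) but still fits in BF16, while a value like 1.001 keeps more of its fractional part in FP16 than in BF16. That trade-off is presumably why it is worth knowing which format a given checkpoint was trained in before choosing a dtype for fine-tuning or inference.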

patrickramos changed discussion status to closed
