Is BLOOM-560M also trained in BF16?
#14, opened by patrickramos
The BLOOM training README says that BLOOM was trained in bf16, and the model card for bigscience/bloom also mentions bf16 weights, but I can't find anything in this model card about the data type of the weights. I assume BLOOM-560M was also trained in bf16, since its model card links to the same training README, but I just want to make sure. Thanks!
Only the 176B model was trained in BF16. The smaller models were all trained in FP16.
https://github.com/bigscience-workshop/Megatron-DeepSpeed/issues/343#issuecomment-1267299209
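If you want to see which data type a checkpoint file actually stores, one option is to read the header of a `.safetensors` file, which lists a dtype string (such as `F16`, `BF16`, or `F32`) for every tensor. The sketch below is only an illustration: it assumes you have a local `.safetensors` file, and the function name `safetensors_dtypes` is made up for this example. Keep in mind that the stored dtype is not necessarily the training dtype, since weights can be converted before upload.

```python
import json
import struct
from collections import Counter


def safetensors_dtypes(path):
    """Count tensors per stored dtype by reading only the file header.

    Hypothetical helper for illustration; `path` is assumed to point at a
    local .safetensors file.
    """
    with open(path, "rb") as f:
        # The first 8 bytes hold the header length as a little-endian
        # unsigned 64-bit integer.
        (header_len,) = struct.unpack("<Q", f.read(8))
        # The header itself is JSON: tensor name -> dtype, shape, offsets.
        header = json.loads(f.read(header_len))
    # "__metadata__" is an optional free-form entry, not a tensor.
    return Counter(
        entry["dtype"]
        for name, entry in header.items()
        if name != "__metadata__"
    )
```

For example, `print(safetensors_dtypes("model.safetensors"))` prints a count per dtype; the filename here is an assumption, so point it at whatever checkpoint file you downloaded.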
Thanks!
patrickramos changed discussion status to closed