sam-mosaic committed
Commit 01548f3
1 Parent(s): e7119f3

add training configuration section

Files changed (1):
  1. README.md +5 -0
README.md CHANGED
@@ -152,6 +152,11 @@ For more details on the pretraining process, see [MPT-7B](https://huggingface.co
 
 The data was tokenized using the [EleutherAI/gpt-neox-20b](https://huggingface.co/EleutherAI/gpt-neox-20b) tokenizer.
 
+### Training Configuration
+
+This model was trained on 8 A100-40GBs for about 2.3 hours using the [MosaicML Platform](https://www.mosaicml.com/platform).
+The model was trained with sharded data parallelism using [FSDP](https://pytorch.org/docs/stable/fsdp.html) and used the AdamW optimizer.
+
 ## Limitations and Biases
 
 _The following language is modified from [EleutherAI's GPT-NeoX-20B](https://huggingface.co/EleutherAI/gpt-neox-20b)_
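
For readers unfamiliar with the setup the new section describes, here is a minimal sketch of sharded data parallel training with PyTorch FSDP and the AdamW optimizer. It is an illustration only, not MosaicML's actual training code: the checkpoint name, launch command, and learning rate are placeholder assumptions.

```python
import os
import torch
import torch.distributed as dist
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP
from transformers import AutoModelForCausalLM

def main():
    # One process per GPU, e.g. launched with `torchrun --nproc_per_node=8 train.py`.
    dist.init_process_group(backend="nccl")
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)

    # "mosaicml/mpt-7b" is a placeholder for whichever checkpoint is being trained.
    model = AutoModelForCausalLM.from_pretrained(
        "mosaicml/mpt-7b", trust_remote_code=True
    )

    # FSDP shards parameters, gradients, and optimizer state across the ranks.
    model = FSDP(model, device_id=local_rank)

    # Build the optimizer after wrapping so it tracks the sharded parameters;
    # the learning rate here is an assumed placeholder, not the run's actual value.
    optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)

    # A training loop would follow: forward pass, loss.backward(), optimizer.step().

if __name__ == "__main__":
    main()
```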