sam-mosaic committed
Commit 6efab79
Parent: c8d4750

Update README.md

Files changed (1):
  1. README.md +5 -0
README.md CHANGED
@@ -138,6 +138,11 @@ The model has been modified from a standard transformer in the following ways:
  | vocab size | 50432 |
  | sequence length | 2048 |
 
+ ### Training Configuration
+
+ This model was trained on 8 A100-80GBs for about 8.2 hours, followed by training for 6.7 hours on 32 A100-40GBs using the [MosaicML Platform](https://www.mosaicml.com/platform).
+ The model was trained with sharded data parallelism using [FSDP](https://pytorch.org/docs/stable/fsdp.html) and used the AdamW optimizer.
+
  ## Limitations and Biases
 
  _The following language is modified from [EleutherAI's GPT-NeoX-20B](https://huggingface.co/EleutherAI/gpt-neox-20b)_
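
For context, the added paragraph pairs FSDP sharded data parallelism with the AdamW optimizer. Below is a minimal sketch of that combination in plain PyTorch; it is not the llm-foundry/Composer code used for the actual run, and the stand-in model, learning rate, and launch details are placeholder assumptions.

```python
# Minimal sketch: sharded data parallelism with FSDP + AdamW in plain PyTorch.
# Illustrative only -- the real run used llm-foundry/Composer on the MosaicML
# Platform; the model, hyperparameters, and structure here are assumptions.
import torch
import torch.distributed as dist
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

def main():
    # One process per GPU, launched with torchrun; NCCL backend for A100s.
    dist.init_process_group(backend="nccl")
    local_rank = dist.get_rank() % torch.cuda.device_count()  # single-node layout
    torch.cuda.set_device(local_rank)

    # Stand-in model; the actual run trained an MPT transformer.
    model = torch.nn.TransformerEncoder(
        torch.nn.TransformerEncoderLayer(d_model=512, nhead=8, batch_first=True),
        num_layers=6,
    ).cuda()

    # Wrap with FSDP so parameters, gradients, and optimizer state are
    # sharded across ranks rather than fully replicated on each GPU.
    model = FSDP(model)

    # Construct the optimizer *after* wrapping so it sees the sharded params.
    optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)  # lr is a placeholder

    # One illustrative training step on random data.
    x = torch.randn(4, 128, 512, device="cuda")
    loss = model(x).pow(2).mean()
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()

    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```

Under these assumptions the script would be launched with one process per GPU, e.g. `torchrun --nproc_per_node=8 sketch.py` on a single 8-GPU node; the run described in the diff additionally spanned multiple nodes for the 32-GPU phase.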