jon-tow committed
Commit: 16f7880
Parent: 33923f6

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -108,7 +108,7 @@ The dataset is comprised of a filtered mixture of open-source large-scale datasets

 ### Training Procedure

-The model is pre-trained on the aforementioned datasets in `bfloat16` precision, optimized with AdamW, and trained using the Arcade100k tokenizer with a vocabulary size of 100,352. We outline the complete hyperparameter choices in the project's [GitHub repository - config*](https://github.com/Stability-AI/StableLM/blob/main/configs/stablelm-2-1.6b.yml). The final checkpoint of pre-training, before cooldown, is provided in the `global_step420000` [branch](https://huggingface.co/stabilityai/stablelm-2-1_6b/blob/global_step420000/README.md).
+The model is pre-trained on the aforementioned datasets in `bfloat16` precision, optimized with AdamW, and trained using the Arcade100k tokenizer with a vocabulary size of 100,352. We outline the complete hyperparameter choices in the project's [GitHub repository - config*](https://github.com/Stability-AI/StableLM/blob/main/configs/stablelm-2-1_6b.yml). The final checkpoint of pre-training, before cooldown, is provided in the `global_step420000` [branch](https://huggingface.co/stabilityai/stablelm-2-1_6b/blob/global_step420000/README.md).

 ### Training Infrastructure
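
As a minimal sketch of how the `global_step420000` branch referenced above could be consumed, the pre-cooldown checkpoint can be pinned via the `revision` argument of the standard `transformers` `from_pretrained` API. The `trust_remote_code` flag is an assumption here; recent `transformers` releases with native `stablelm` support may not need it.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

REPO_ID = "stabilityai/stablelm-2-1_6b"
BRANCH = "global_step420000"  # pre-cooldown checkpoint branch named in the README

# Pin `revision` to the branch so the pre-cooldown weights are fetched
# instead of the default `main` checkpoint.
tokenizer = AutoTokenizer.from_pretrained(
    REPO_ID,
    revision=BRANCH,
    trust_remote_code=True,  # assumption; may be unnecessary on newer releases
)
model = AutoModelForCausalLM.from_pretrained(
    REPO_ID,
    revision=BRANCH,
    torch_dtype=torch.bfloat16,  # matches the bfloat16 pre-training precision
    trust_remote_code=True,
)

# The Arcade100k tokenizer should report the 100,352-entry vocabulary noted above.
print(len(tokenizer))
```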