---
license: other
language:
- en
tags:
- causal-lm
---

# `Stable LM 2 1.6B` (global_step420000)

## Description

`Stable LM 2 1.6B` is a 1.6 billion parameter decoder-only language model pre-trained on 2 trillion tokens of diverse multilingual and code datasets for two epochs.

## Usage

This branch contains the training checkpoint for `Stable LM 2 1.6B` at step 420,000. It is the final checkpoint taken before the learning rate cooldown.

We provide the following contents in the [`global_step420000`](https://huggingface.co/stabilityai/stablelm-2-1_6b/tree/global_step420000/global_step420000) directory:

- `bf16_zero_pp_mp_rank_00_optim_states.pt`: The Adam optimizer states and FP32 weights for each parameter. You will need to port these to your optimizer format when importing the checkpoint into your training framework (see the loading sketch after this list).

- `mp_rank_00_model_states.pt`: The model weights, following the [GPT-NeoX](https://github.com/EleutherAI/gpt-neox) convention.

- `config.yml`: The pre-training configuration file for this checkpoint. The learning rate should be cooled down linearly from `lr=0.0002529` to `lr=0.0` (see the scheduler sketch after this list).

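As a rough starting point for working with these files, the sketch below loads the two `.pt` files with `torch.load` and prints their top-level keys before anything is ported into your own optimizer format. The file paths assume the `global_step420000` directory has been downloaded locally, and the exact key layout depends on the DeepSpeed/GPT-NeoX versions used, so verify the structure against your own setup.

```python
import torch

# Paths assume the `global_step420000` directory was downloaded locally.
OPTIM_PATH = "global_step420000/bf16_zero_pp_mp_rank_00_optim_states.pt"
MODEL_PATH = "global_step420000/mp_rank_00_model_states.pt"

# Both files are ordinary PyTorch pickles; load them on CPU for inspection.
# weights_only=False because training checkpoints may contain non-tensor metadata.
optim_states = torch.load(OPTIM_PATH, map_location="cpu", weights_only=False)
model_states = torch.load(MODEL_PATH, map_location="cpu", weights_only=False)

# Print the top-level keys to see where the Adam states, FP32 weights, and
# model weights live before porting them to your training framework.
print(list(optim_states.keys()))
print(list(model_states.keys()))
```
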
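And for the cooldown noted in the `config.yml` bullet, here is a minimal sketch of the linear schedule shape using `torch.optim.lr_scheduler.LambdaLR`. The peak value `2.529e-4` comes from the bullet above, but `COOLDOWN_STEPS` and the throwaway parameter/optimizer are purely illustrative; take the real step count and optimizer settings from `config.yml`.

```python
import torch

PEAK_LR = 2.529e-4       # learning rate at step 420,000 (see the bullet above)
COOLDOWN_STEPS = 20_000  # hypothetical length; read the real value from config.yml

# Throwaway parameter/optimizer, just to demonstrate the schedule shape.
params = [torch.nn.Parameter(torch.zeros(1))]
optimizer = torch.optim.AdamW(params, lr=PEAK_LR)

# Linear cooldown: lr(step) = PEAK_LR * (1 - step / COOLDOWN_STEPS), floored at 0.
scheduler = torch.optim.lr_scheduler.LambdaLR(
    optimizer,
    lr_lambda=lambda step: max(0.0, 1.0 - step / COOLDOWN_STEPS),
)

for _ in range(COOLDOWN_STEPS):
    optimizer.step()   # your training step would go here
    scheduler.step()

print(optimizer.param_groups[0]["lr"])  # 0.0 at the end of the cooldown
```
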
The model weights are also converted to the Hugging Face `transformers` format and can be loaded with the following code:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("stabilityai/stablelm-2-1_6b", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    "stabilityai/stablelm-2-1_6b",
    trust_remote_code=True,
    torch_dtype="auto",
    revision="global_step420000",
)
model.cuda()
```

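Once loaded, the checkpoint can be sanity-checked like any other causal LM in `transformers`. The short generation sketch below continues from the snippet above; the prompt text and sampling parameters are illustrative, not recommended settings.

```python
# Continues from the loading snippet above (assumes `model` and `tokenizer` exist).
inputs = tokenizer("The weather is always wonderful", return_tensors="pt").to(model.device)
tokens = model.generate(
    **inputs,
    max_new_tokens=64,
    temperature=0.7,
    top_p=0.95,
    do_sample=True,
)
print(tokenizer.decode(tokens[0], skip_special_tokens=True))
```
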
## License

* **License**: [Stability AI Non-Commercial Research Community License](https://huggingface.co/stabilityai/stablelm-2-1_6b/blob/main/LICENSE). If you'd like to use this model for commercial products or purposes, please contact us [here](https://stability.ai/membership) to learn more.

## Acknowledgements

- Dakota Mahan for creating the ZeRO optimizer state merging script.

## Citation

```bibtex
@misc{StableLM-2-1.6B,
  url={https://huggingface.co/stabilityai/stablelm-2-1_6b},
  title={Stable LM 2 1.6B},
  author={Stability AI Language Team}
}
```