---
license: other
language:
- en
tags:
- causal-lm
---

# `Stable LM 2 1.6B` (global_step420000)

## Description

`Stable LM 2 1.6B` is a 1.6 billion parameter decoder-only language model pre-trained on 2 trillion tokens of diverse multilingual and code datasets for two epochs.

## Usage

This branch contains the training checkpoint for `Stable LM 2 1.6B` at step 420,000. It is the final checkpoint taken before the learning rate cooldown.

We provide the following contents in the [`global_step420000`](https://huggingface.co/stabilityai/stablelm-2-1_6b/tree/global_step420000/global_step420000) directory:

- `bf16_zero_pp_mp_rank_00_optim_states.pt`: The Adam optimizer states and FP32 weights for each parameter. You will need to port these to your optimizer format when importing the checkpoint into your training framework (see the loading sketch after this list).

- `mp_rank_00_model_states.pt`: The model weights, following the [GPT-NeoX](https://github.com/EleutherAI/gpt-neox) convention.

- `config.yml`: The pre-training configuration file for this checkpoint. The learning rate should be cooled down linearly from `lr=0.0002529` to `lr=0.0` (see the scheduler sketch after this list).

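As a rough starting point for working with these files, the sketch below loads the two `.pt` files with `torch.load` and prints their top-level keys before anything is ported into your own optimizer format. The file paths assume the `global_step420000` directory has been downloaded locally, and the exact key layout depends on the DeepSpeed/GPT-NeoX versions used, so verify the structure against your own setup.

```python
import torch

# Paths assume the `global_step420000` directory was downloaded locally.
OPTIM_PATH = "global_step420000/bf16_zero_pp_mp_rank_00_optim_states.pt"
MODEL_PATH = "global_step420000/mp_rank_00_model_states.pt"

# Both files are ordinary PyTorch pickles; load them on CPU for inspection.
# weights_only=False because training checkpoints may contain non-tensor metadata.
optim_states = torch.load(OPTIM_PATH, map_location="cpu", weights_only=False)
model_states = torch.load(MODEL_PATH, map_location="cpu", weights_only=False)

# Print the top-level keys to see where the Adam states, FP32 weights, and
# model weights live before porting them to your training framework.
print(list(optim_states.keys()))
print(list(model_states.keys()))
```
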
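And for the cooldown noted in the `config.yml` bullet, here is a minimal sketch of the linear schedule shape using `torch.optim.lr_scheduler.LambdaLR`. The peak value `2.529e-4` comes from the bullet above, but `COOLDOWN_STEPS` and the throwaway parameter/optimizer are purely illustrative; take the real step count and optimizer settings from `config.yml`.

```python
import torch

PEAK_LR = 2.529e-4       # learning rate at step 420,000 (see the bullet above)
COOLDOWN_STEPS = 20_000  # hypothetical length; read the real value from config.yml

# Throwaway parameter/optimizer, just to demonstrate the schedule shape.
params = [torch.nn.Parameter(torch.zeros(1))]
optimizer = torch.optim.AdamW(params, lr=PEAK_LR)

# Linear cooldown: lr(step) = PEAK_LR * (1 - step / COOLDOWN_STEPS), floored at 0.
scheduler = torch.optim.lr_scheduler.LambdaLR(
    optimizer,
    lr_lambda=lambda step: max(0.0, 1.0 - step / COOLDOWN_STEPS),
)

for _ in range(COOLDOWN_STEPS):
    optimizer.step()   # your training step would go here
    scheduler.step()

print(optimizer.param_groups[0]["lr"])  # 0.0 at the end of the cooldown
```
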
The model weights are also converted to the Hugging Face `transformers` format and can be loaded with the following code:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("stabilityai/stablelm-2-1_6b", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    "stabilityai/stablelm-2-1_6b",
    trust_remote_code=True,
    torch_dtype="auto",
    revision="global_step420000",
)
model.cuda()
```

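Once loaded, the checkpoint can be sanity-checked like any other causal LM in `transformers`. The short generation sketch below continues from the snippet above; the prompt text and sampling parameters are illustrative, not recommended settings.

```python
# Continues from the loading snippet above (assumes `model` and `tokenizer` exist).
inputs = tokenizer("The weather is always wonderful", return_tensors="pt").to(model.device)
tokens = model.generate(
    **inputs,
    max_new_tokens=64,
    temperature=0.7,
    top_p=0.95,
    do_sample=True,
)
print(tokenizer.decode(tokens[0], skip_special_tokens=True))
```
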
## License

* **License**: [Stability AI Non-Commercial Research Community License](https://huggingface.co/stabilityai/stablelm-2-1_6b/blob/main/LICENSE). If you'd like to use this model for commercial products or purposes, please contact us [here](https://stability.ai/membership) to learn more.

## Acknowledgements

- Dakota Mahan for creating the ZeRO optimizer state merging script.

## Citation

```bibtex
@misc{StableLM-2-1.6B,
  url={https://huggingface.co/stabilityai/stablelm-2-1_6b},
  title={Stable LM 2 1.6B},
  author={Stability AI Language Team}
}
```