
Gemstone-1536x50

Gemstone-1536x50 is part of the Gemstone Suite of Models, a set of models trained with varying widths and depths.

Training

We train with litgpt and AxoNN on AMD MI250X GPUs on Frontier at Oak Ridge National Laboratory, with a global batch size of 2048.

Data

Training and validation data are taken from non-overlapping subsets of dolma. As such, this is not an instruction model. The model is trained for 350 billion tokens, and we upload checkpoints every 2 billion tokens (477 steps).
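As a quick sanity check on the checkpoint cadence, the sketch below reproduces the 477-step figure from the stated global batch size of 2048. The sequence length of 2048 tokens is an assumption (it is not stated in this card); it is simply the value under which the numbers line up. The checkpoint count is likewise a rough estimate derived from the 350 billion token budget.

```python
import math

# Stated in this card.
GLOBAL_BATCH_SIZE = 2048                 # sequences per optimizer step
TOKENS_PER_CHECKPOINT = 2_000_000_000    # checkpoint every 2 billion tokens
TOTAL_TOKENS = 350_000_000_000           # full training budget

# ASSUMPTION: sequence length is not given in the card; 2048 is a guess
# that happens to reproduce the stated 477 steps per checkpoint.
ASSUMED_SEQ_LEN = 2048

tokens_per_step = GLOBAL_BATCH_SIZE * ASSUMED_SEQ_LEN

# Smallest number of steps that covers at least 2 billion tokens.
steps_per_checkpoint = math.ceil(TOKENS_PER_CHECKPOINT / tokens_per_step)

# Approximate number of checkpoints over the full run.
num_checkpoints = TOTAL_TOKENS // TOKENS_PER_CHECKPOINT

print(f"tokens per step:      {tokens_per_step:,}")
print(f"steps per checkpoint: {steps_per_checkpoint}")
print(f"approx. checkpoints:  {num_checkpoints}")
```

Under that assumption, each step consumes 4,194,304 tokens, so 477 steps is the first step count that passes the 2 billion token mark.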

Using Gemstone-1536x50

The Gemstones are based on the gemma-2b architecture and use modeling_gemma.py to run using the transformers library.
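A minimal sketch of loading the model through the standard transformers auto classes follows. The repository id `Gemstone-Models/Gemstone-1536x50` is an assumption based on the model name in this card, and `generate_text` is a hypothetical helper written for illustration, not part of the release. Because this is a base model rather than an instruction model, the prompt is plain text to be continued.

```python
def generate_text(
    prompt,
    repo_id="Gemstone-Models/Gemstone-1536x50",  # ASSUMPTION: repo id inferred from the card title
    max_new_tokens=32,
):
    """Continue `prompt` with a Gemstone checkpoint (illustrative helper)."""
    # Imported lazily so the helper can be defined without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    model = AutoModelForCausalLM.from_pretrained(repo_id)

    # Tokenize to PyTorch tensors and generate a continuation.
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)

# Example call (downloads the weights on first use):
# print(generate_text("Once upon a time,"))
```

If the checkpoint does not load with the stock Gemma classes in your transformers version, check the repository files for the expected loading procedure.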

Licence

This model is released under the apache-2.0 licence.

Contact

Please feel free to contact us with any questions or open a discussion thread.

Citation

@article{mcleish2024gemstones,
    title={Gemstones: A Model Suite for Multi-Faceted Scaling Laws}, 
    author={Sean McLeish and John Kirchenbauer and David Yu Miller and Siddharth Singh and Abhinav Bhatele and Micah Goldblum and Ashwinee Panda and Tom Goldstein},
    journal={arXiv preprint arXiv:2502.},
    year={2025}
}
Model size: 2B params (Safetensors), tensor type F32