Mamba 2.8b Slim Pyjama - bf16 (16-bit)
This is a 16-bit (bf16) version of Mamba-2.8b-slimpj.
Mamba-2.8b-slimpj is a model using the Mamba architecture, with 2.8B parameters, trained for 600B tokens on the SlimPajama dataset.
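As a rough guide to what the 16-bit format means in practice, the sketch below estimates the storage needed for the weights alone. It assumes the nominal 2.8B parameter count is exact and ignores activations, the recurrent state, and any framework overhead, so actual memory use will be higher. The helper name `weight_bytes` is hypothetical and not part of the Mamba code.

```python
def weight_bytes(num_params: float, bits_per_param: int) -> float:
    """Bytes needed to store the weights alone at the given precision."""
    return num_params * bits_per_param / 8


# Nominal count taken from the model name; the exact figure may differ.
NUM_PARAMS = 2.8e9

for label, bits in [("bf16", 16), ("fp32", 32)]:
    size = weight_bytes(NUM_PARAMS, bits)
    # Report both decimal gigabytes and binary gibibytes.
    print(f"{label}: {size / 1e9:.1f} GB ({size / 2**30:.2f} GiB)")
```

Under these assumptions, the bf16 weights take half the space of an fp32 copy of the same parameters.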
Model code: https://github.com/state-spaces/mamba/tree/main
To load the model, follow the installation instructions in the code repo, and then:
from mamba_ssm.models.mixer_seq_simple import MambaLMHeadModel
model = MambaLMHeadModel.from_pretrained("state-spaces/mamba-2.8b-slimpj")
Inference Notebook (Colab)