
This is the state-spaces mamba-2.8b model, fine-tuned with supervised fine-tuning (SFT) on the llama-2-7b-miniguanaco dataset.

To run inference with this model, use the following code:

import torch
from transformers import AutoTokenizer
from mamba_ssm.models.mixer_seq_simple import MambaLMHeadModel

device = "cuda"

# Mamba checkpoints ship without their own tokenizer; the base state-spaces
# models use the GPT-NeoX tokenizer
tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-neox-20b")

# Load the model
model = MambaLMHeadModel.from_pretrained("walebadr/mamba-2.8b-SFT", dtype=torch.bfloat16, device=device)

# Build the prompt and generate a reply
user_message = "[INST] what is a language model? [/INST]"
input_ids = tokenizer(user_message, return_tensors="pt").input_ids.to(device)
out = model.generate(input_ids=input_ids, max_length=500, temperature=0.9, top_p=0.7, eos_token_id=tokenizer.eos_token_id)
decoded = tokenizer.batch_decode(out)

print("Model:", decoded[0])
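The snippet above generates a single reply. For a multi-turn conversation you need to carry the transcript between calls. A minimal sketch, assuming the tokenizer and model loaded above and the same [INST] ... [/INST] prompt format (the `build_prompt` helper and `chat` loop are illustrative, not part of the released model):

```python
def build_prompt(messages, user_input):
    # Concatenate prior turns and wrap the new user input in [INST] tags
    return "".join(messages) + f"[INST] {user_input} [/INST]"

def chat(model, tokenizer, device="cuda", max_length=500):
    messages = []  # running transcript of earlier turns
    while True:
        user_input = input("You: ")
        if not user_input:
            break
        prompt = build_prompt(messages, user_input)
        input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to(device)
        out = model.generate(input_ids=input_ids, max_length=max_length,
                             temperature=0.9, top_p=0.7,
                             eos_token_id=tokenizer.eos_token_id)
        # Keep only the newly generated tokens for display
        reply = tokenizer.batch_decode(out[:, input_ids.shape[1]:])[0]
        messages.append(prompt + reply)
        print("Model:", reply)
```

Note that each call re-encodes the full transcript, so long conversations will eventually exceed `max_length`; trim old turns from `messages` if that happens.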

Model Evaluation

Coming soon
