---
license: apache-2.0
language:
  - en
tags:
  - mamba-hf
---

# Mamba-790M

## mamba-hf

Mamba models with Hugging Face `transformers` integration. See the mamba-hf GitHub repo for the integration code.

## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the model and tokenizer (the custom Mamba code requires trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained('Q-bert/Mamba-790M', trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained('Q-bert/Mamba-790M')

text = "Hi"

# Tokenize the prompt
input_ids = tokenizer.encode(text, return_tensors="pt")

# Generate with beam search, blocking repeated bigrams
output = model.generate(input_ids, max_length=20, num_beams=5, no_repeat_ngram_size=2)

generated_text = tokenizer.decode(output[0], skip_special_tokens=True)
print(generated_text)
```

Example output:

```
Hi, I'm looking for a new job. I've been working at a company for about a year now.
```
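For longer or more varied completions, sampling can be used instead of beam search. Below is a minimal sketch using standard `transformers` generation options; the device handling and sampling parameters (`top_p`, `temperature`) are illustrative assumptions, not values from the model card.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Illustrative sketch: run on GPU if available and sample instead of beam search
device = "cuda" if torch.cuda.is_available() else "cpu"

model = AutoModelForCausalLM.from_pretrained('Q-bert/Mamba-790M', trust_remote_code=True).to(device)
tokenizer = AutoTokenizer.from_pretrained('Q-bert/Mamba-790M')

input_ids = tokenizer.encode("Hi", return_tensors="pt").to(device)

# Nucleus sampling; these values are example defaults, not tuned for this model
output = model.generate(
    input_ids,
    max_length=50,
    do_sample=True,
    top_p=0.9,
    temperature=0.8,
)

print(tokenizer.decode(output[0], skip_special_tokens=True))
```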

## Credits

https://huggingface.co/state-spaces

Special thanks to Albert Gu and Tri Dao for their paper (https://arxiv.org/abs/2312.00752).