Tokenizer

by hujunc - opened Dec 5, 2023

Discussion

hujunc

Dec 5, 2023

Why is there no tokenizer file?

gfouilhe

Dec 5, 2023

The released code does not load a tokenizer from the model repo. It loads one from a different repo; see the benchmark script (https://github.com/state-spaces/mamba/blob/main/benchmarks/benchmark_generation_mamba_simple.py).

(Line 33)
is_mamba = args.model_name.startswith("state-spaces/mamba-")
if is_mamba:
    tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-neox-20b")
    model = MambaLMHeadModel.from_pretrained(args.model_name, device=device, dtype=dtype)

So you should load the tokenizer with AutoTokenizer.from_pretrained("EleutherAI/gpt-neox-20b"), where AutoTokenizer comes from the transformers library.
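If it helps, the selection logic from the snippet above can be wrapped in a small helper. This is only a sketch: tokenizer_repo_for and load_tokenizer are hypothetical names, not part of the mamba repo, and falling back to the model's own repo for non-Mamba models is an assumption on my part.

```python
# Repo that the benchmark script loads the tokenizer from for Mamba checkpoints.
GPT_NEOX_TOKENIZER = "EleutherAI/gpt-neox-20b"
# Prefix the benchmark script uses to detect a Mamba checkpoint.
MAMBA_PREFIX = "state-spaces/mamba-"


def tokenizer_repo_for(model_name: str) -> str:
    """Return the Hub repo to load the tokenizer from (hypothetical helper)."""
    if model_name.startswith(MAMBA_PREFIX):
        # Mamba repos have no tokenizer files, so reuse the GPT-NeoX one.
        return GPT_NEOX_TOKENIZER
    # Assumption: any other model ships a tokenizer in its own repo.
    return model_name


def load_tokenizer(model_name: str):
    """Load the tokenizer for model_name (hypothetical helper)."""
    # Imported lazily so tokenizer_repo_for works without transformers installed.
    from transformers import AutoTokenizer

    return AutoTokenizer.from_pretrained(tokenizer_repo_for(model_name))
```

For example, load_tokenizer("state-spaces/mamba-130m") would download the GPT-NeoX tokenizer files from the Hub on first use, and the returned tokenizer can then be called on a prompt string as usual.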
