Tokenizer

#3
by hujunc - opened

Why is there no tokenizer file?

According to the released code (https://github.com/state-spaces/mamba/blob/main/benchmarks/benchmark_generation_mamba_simple.py)

(Line 33)
is_mamba = args.model_name.startswith("state-spaces/mamba-")
if is_mamba:
    tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-neox-20b")
    model = MambaLMHeadModel.from_pretrained(args.model_name, device=device, dtype=dtype)

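So you should load the tokenizer with AutoTokenizer.from_pretrained("EleutherAI/gpt-neox-20b"). As the snippet shows, the benchmark script uses the GPT-NeoX-20B tokenizer for the Mamba checkpoints, which is why this repo has no tokenizer file of its own.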
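
As a minimal sketch of that same selection logic in your own code: the helper names below (tokenizer_repo_for, load_tokenizer) are hypothetical and not part of the mamba repo, and falling back to the model's own repo for non-Mamba models is an assumption on my part.

```python
MAMBA_PREFIX = "state-spaces/mamba-"
NEOX_TOKENIZER = "EleutherAI/gpt-neox-20b"


def tokenizer_repo_for(model_name: str) -> str:
    """Return the Hub repo id to load the tokenizer from (hypothetical helper)."""
    if model_name.startswith(MAMBA_PREFIX):
        # Mamba checkpoints: use the GPT-NeoX-20B tokenizer, as the benchmark script does.
        return NEOX_TOKENIZER
    # Assumption: any other model ships its own tokenizer files.
    return model_name


def load_tokenizer(model_name: str):
    """Load the tokenizer for model_name; downloads from the Hub on first use."""
    # Imported lazily so tokenizer_repo_for works without transformers installed.
    from transformers import AutoTokenizer

    return AutoTokenizer.from_pretrained(tokenizer_repo_for(model_name))
```

For example, load_tokenizer("state-spaces/mamba-2.8b") would fetch the tokenizer from EleutherAI/gpt-neox-20b (that model name is only an illustration, and the call needs network access and the transformers package).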
