ValueError: Couldn't instantiate the backend tokenizer
Hi,
I am unable to use the model. I tried the same code from the documentation, and I am running it on Google Colab.
Code:
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = "GOAT-AI/GOAT-7B-Community"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
)
Error:
ValueError: Couldn't instantiate the backend tokenizer from one of:
(1) a tokenizers library serialization file,
(2) a slow tokenizer instance to convert or
(3) an equivalent slow tokenizer class to instantiate and convert.
You need to have sentencepiece installed to convert a slow tokenizer to a fast one.
Hi, you forgot to install the sentencepiece package; see the last line of the error: "You need to have sentencepiece installed to convert a slow tokenizer to a fast one."
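In Colab you can install it from a notebook cell and then rerun the loading code; a minimal sketch, assuming a Colab notebook (restart the runtime if transformers was already imported before the install):

# Install the missing dependency (the leading ! runs a shell command in Colab)
!pip install sentencepiece

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = "GOAT-AI/GOAT-7B-Community"

# With sentencepiece available, the fast tokenizer can be built from the slow one
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
)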