This model does not work

#2
by chrisxx - opened

Running the code provided in the README throws the following error:

ValueError: Couldn't instantiate the backend tokenizer from one of:
(1) a tokenizers library serialization file,
(2) a slow tokenizer instance to convert or
(3) an equivalent slow tokenizer class to instantiate and convert.
You need to have sentencepiece installed to convert a slow tokenizer to a fast one.

It is possible to run the model by loading the processor from one of the larger models' repos instead:

from transformers import TrOCRProcessor, VisionEncoderDecoderModel

processor = TrOCRProcessor.from_pretrained('microsoft/trocr-base-handwritten')
model = VisionEncoderDecoderModel.from_pretrained('microsoft/trocr-small-handwritten')

This works.

However, I am not sure whether there is a difference between the processors of different model sizes.
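One way to check whether the processors actually differ is to load both and compare their tokenizers directly. This is just a sketch, and it assumes sentencepiece is already installed so the small checkpoint's processor loads at all:

```python
from transformers import TrOCRProcessor

# Load the processors shipped with the small and base checkpoints.
small = TrOCRProcessor.from_pretrained('microsoft/trocr-small-handwritten')
base = TrOCRProcessor.from_pretrained('microsoft/trocr-base-handwritten')

# Compare tokenizer class and vocabulary size; a mismatch would mean the
# base processor decodes the small model's token IDs with the wrong vocab.
print(type(small.tokenizer).__name__, small.tokenizer.vocab_size)
print(type(base.tokenizer).__name__, base.tokenizer.vocab_size)
```

If the classes or vocab sizes differ, swapping in the base processor could silently produce wrong decodings even though nothing crashes.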

Hi Chrisxx,

I had the same problem and fixed it with "pip install sentencepiece"

Found this solution here: https://stackoverflow.com/questions/65431837/transformers-v4-x-convert-slow-tokenizer-to-fast-tokenizer
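For completeness, after running pip install sentencepiece, the small checkpoint should load with its own processor end to end. The sketch below roughly follows the usual TrOCR inference pattern; the image URL is an example IAM line image and is only a placeholder:

```python
from PIL import Image
import requests
from transformers import TrOCRProcessor, VisionEncoderDecoderModel

# Example handwritten line image (placeholder URL; substitute your own).
url = 'https://fki.tic.heia-fr.ch/static/img/a01-122-02-00.jpg'
image = Image.open(requests.get(url, stream=True).raw).convert("RGB")

# With sentencepiece installed, the small repo's processor loads directly.
processor = TrOCRProcessor.from_pretrained('microsoft/trocr-small-handwritten')
model = VisionEncoderDecoderModel.from_pretrained('microsoft/trocr-small-handwritten')

# Preprocess, generate token IDs, and decode them back to text.
pixel_values = processor(images=image, return_tensors="pt").pixel_values
generated_ids = model.generate(pixel_values)
text = processor.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(text)
```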

chrisxx changed discussion status to closed
