This model does not work
Running the code provided in the readme throws the following error:
```text
ValueError: Couldn't instantiate the backend tokenizer from one of:
(1) a tokenizers library serialization file,
(2) a slow tokenizer instance to convert or
(3) an equivalent slow tokenizer class to instantiate and convert.
You need to have sentencepiece installed to convert a slow tokenizer to a fast one.
```
It is possible to run the model by using the processor from one of the larger models' repos:

```python
processor = TrOCRProcessor.from_pretrained('microsoft/trocr-base-handwritten')
model = VisionEncoderDecoderModel.from_pretrained('microsoft/trocr-small-handwritten')
```

This works.
However, I am not sure whether there is a difference between the processors of different model sizes.
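For reference, here is roughly the full inference snippet with that combination, as a minimal sketch: the image path is a placeholder, and whether the base processor's tokenizer is truly interchangeable with the small model's is exactly the question above.

```python
from PIL import Image
from transformers import TrOCRProcessor, VisionEncoderDecoderModel

# Workaround: processor from the base checkpoint, model from the small one.
processor = TrOCRProcessor.from_pretrained('microsoft/trocr-base-handwritten')
model = VisionEncoderDecoderModel.from_pretrained('microsoft/trocr-small-handwritten')

image = Image.open('handwriting_sample.png').convert('RGB')  # placeholder image path
pixel_values = processor(images=image, return_tensors='pt').pixel_values
generated_ids = model.generate(pixel_values)
print(processor.batch_decode(generated_ids, skip_special_tokens=True)[0])
```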
Hi Chrisxx,
I had the same problem and fixed it with `pip install sentencepiece`.
I found this solution at https://stackoverflow.com/questions/65431837/transformers-v4-x-convert-slow-tokenizer-to-fast-tokenizer
Works!
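For anyone landing here later, a minimal sketch of the fix (assuming a fresh environment and the checkpoints mentioned in this thread):

```python
# Run `pip install sentencepiece` first; the small checkpoint's own
# processor should then load without the ValueError above.
from transformers import TrOCRProcessor, VisionEncoderDecoderModel

processor = TrOCRProcessor.from_pretrained('microsoft/trocr-small-handwritten')
model = VisionEncoderDecoderModel.from_pretrained('microsoft/trocr-small-handwritten')
```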