How to use nielsr/lilt-xlm-roberta-base for inference with a document image?

#2
by pierreguillou - opened

Hi @nielsr .

Many thanks for this model, which you fine-tuned in your notebook Fine_tune_LiLT_on_a_custom_dataset,_in_any_language.ipynb.

However, you did not provide inference code for a raw document image (i.e., one without bounding-box coordinates).

I tried to adapt the code from @philschmid's blog post "Document AI: LiLT a better language agnostic LayoutLM model", but it does not work.

Here is this code:

from transformers import LayoutLMv3FeatureExtractor, AutoTokenizer, LayoutLMv3Processor

model_id="nielsr/lilt-xlm-roberta-base"

# use the LayoutLMv3 feature extractor with OCR enabled, since a raw document image comes without words or boxes
feature_extractor = LayoutLMv3FeatureExtractor(apply_ocr=True)
tokenizer = AutoTokenizer.from_pretrained(model_id)
# cannot use from_pretrained since the processor is not saved in the base model
processor = LayoutLMv3Processor(feature_extractor, tokenizer)

... and the error message:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-5-caabe3c64173> in <module>
      7 tokenizer = AutoTokenizer.from_pretrained(model_id)
      8 # cannot use from_pretrained since the processor is not saved in the base model
----> 9 processor = LayoutLMv3Processor(feature_extractor, tokenizer)

/usr/local/lib/python3.8/dist-packages/transformers/processing_utils.py in __init__(self, *args, **kwargs)
     82 
     83             if not isinstance(arg, proper_class):
---> 84                 raise ValueError(
     85                     f"Received a {type(arg).__name__} for argument {attribute_name}, but a {class_name} was expected."
     86                 )

ValueError: Received a XLMRobertaTokenizerFast for argument tokenizer, but a ('LayoutLMv3Tokenizer', 'LayoutLMv3TokenizerFast') was expected.

Any help is welcome :-) Thank you.
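
For reference, here is a minimal sketch of one possible workaround, not a confirmed answer from this thread. The error comes from LayoutLMv3Processor, which only accepts a LayoutLMv3 tokenizer. One option is to skip the processor, run the feature extractor (with apply_ocr=True) and the tokenizer separately, and then copy each word's box onto its sub-word tokens. The helper name align_boxes_to_tokens, the [0, 0, 0, 0] padding box, and the toy inputs are assumptions for illustration only. In practice, word_ids would come from a fast tokenizer called with is_split_into_words=True (via encoding.word_ids()), and the boxes from the feature extractor output.

```python
from typing import List, Optional, Sequence

# Assumption: special tokens (<s>, </s>, padding) get an all-zero box.
PAD_BOX = [0, 0, 0, 0]


def align_boxes_to_tokens(
    word_ids: Sequence[Optional[int]],
    word_boxes: Sequence[Sequence[int]],
    pad_box: Sequence[int] = PAD_BOX,
) -> List[List[int]]:
    """Give every token the bounding box of the word it was split from.

    word_ids   -- one entry per token: the index of the source word,
                  or None for special tokens.
    word_boxes -- one [x0, y0, x1, y1] box per OCR word.
    """
    token_boxes = []
    for word_id in word_ids:
        if word_id is None:
            # Special token: no source word, so use the padding box.
            token_boxes.append(list(pad_box))
        else:
            # Sub-word token: reuse the box of the word it belongs to.
            token_boxes.append(list(word_boxes[word_id]))
    return token_boxes


# Toy example (hypothetical values): two OCR words, the first one split
# into two sub-word tokens, wrapped in two special tokens.
word_boxes = [[10, 20, 110, 40], [120, 20, 200, 40]]
word_ids = [None, 0, 0, 1, None]

token_boxes = align_boxes_to_tokens(word_ids, word_boxes)
print(token_boxes)
```

The resulting list has one box per token, so it could then be turned into a tensor and passed to the model as the bbox argument alongside input_ids and attention_mask. I have not tested that part end to end here.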

pierreguillou changed discussion title from How to use nielsr/lilt-xlm-roberta-base for inference with a a document image? to How to use nielsr/lilt-xlm-roberta-base for inference with a document image?
nielsr changed discussion status to closed
