How can I get token level bounding boxes from XLMRobertaTokenizer?

#10
by zeroman0112 - opened

The tokenizer for this model seems to be XLMRobertaTokenizer.
However, the LiLT model seems to use the LayoutLMv3Tokenizer, as can be seen in the example code.
The encode function of that tokenizer includes an algorithm that converts word-level bounding boxes to token-level bounding boxes, which the XLMRobertaTokenizer does not have.
How can I get token-level bounding boxes with XLMRobertaTokenizer?
Any help would be appreciated.
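
For reference, here is a minimal sketch of the kind of conversion being asked about, not the LayoutLMv3Tokenizer implementation itself. It assumes a token-to-word mapping is available, for example from `encoding.word_ids()` on a fast tokenizer such as `XLMRobertaTokenizerFast` called with `is_split_into_words=True`. The helper name `expand_boxes_to_tokens` and the `[0, 0, 0, 0]` placeholder box for special tokens are assumptions made for illustration.

```python
from typing import List, Optional, Sequence


def expand_boxes_to_tokens(
    word_ids: Sequence[Optional[int]],
    word_boxes: Sequence[Sequence[int]],
    special_box: Sequence[int] = (0, 0, 0, 0),
) -> List[List[int]]:
    """Copy each word's box onto every sub-word token of that word.

    word_ids    -- one entry per token: the index of the source word,
                   or None for special tokens (<s>, </s>, padding).
    word_boxes  -- one [x0, y0, x1, y1] box per word.
    special_box -- placeholder box for tokens with no source word
                   (assumed value, adjust to what the model expects).
    """
    token_boxes = []
    for word_id in word_ids:
        if word_id is None:
            token_boxes.append(list(special_box))
        else:
            token_boxes.append(list(word_boxes[word_id]))
    return token_boxes


# Hand-written mapping standing in for encoding.word_ids():
# <s>, two sub-word tokens of word 0, one token of word 1, </s>.
example_word_ids = [None, 0, 0, 1, None]
example_word_boxes = [[10, 10, 50, 20], [60, 10, 90, 20]]

example_token_boxes = expand_boxes_to_tokens(example_word_ids, example_word_boxes)
for word_id, box in zip(example_word_ids, example_token_boxes):
    print(word_id, box)
```

With a real tokenizer, the hand-written `example_word_ids` would be replaced by the list returned from `encoding.word_ids()`. That method is only available on fast tokenizers, so the slow `XLMRobertaTokenizer` would not work with this approach.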
