Are Image features used in this LayoutLM based model?

#8
by KshitizM - opened

Hello,

I just looked at the code for LayoutLMForQuestionAnswering here: https://github.com/huggingface/transformers/blob/f1e8c48c5eebf899a5c79b2c48c0ef8456e6bddc/src/transformers/models/layoutlm/modeling_layoutlm.py#L1248

I don't think the document image features are used anywhere in that class, yet image is a non-optional argument in the DocumentQuestionAnsweringPipeline here:
https://github.com/huggingface/transformers/blob/b2c863a3196150850d17548f25ee0575bccb8224/src/transformers/pipelines/document_question_answering.py#L188
I understand the image may be needed for OCR (Tesseract), but if I provide word_boxes and use a LayoutLM (v1) based model, the image features should serve no purpose.

So I just want to confirm: are image features actually used in this LayoutLM (v1) based model?

Thanks :)

Impira org

You can pass None as the image for LayoutLMv1, and the pipeline will succeed as long as you provide word_boxes.
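As a rough sketch of what that call might look like: the helper names below (normalize_word_boxes, answer_without_image) are hypothetical, and the model id impira/layoutlm-document-qa is an assumption, since this thread does not name the checkpoint. The pipeline documentation describes word_boxes as (word, box) pairs with boxes normalized to the 0-1000 range, so the first helper rescales pixel coordinates accordingly.

```python
from typing import List, Sequence, Tuple

PixelBox = Tuple[float, float, float, float]  # (x0, y0, x1, y1) in pixels


def normalize_word_boxes(
    words_with_pixel_boxes: Sequence[Tuple[str, PixelBox]],
    page_width: float,
    page_height: float,
) -> List[Tuple[str, List[int]]]:
    """Rescale pixel boxes to the 0-1000 range used for word_boxes."""
    normalized = []
    for word, (x0, y0, x1, y1) in words_with_pixel_boxes:
        box = [
            int(1000 * x0 / page_width),
            int(1000 * y0 / page_height),
            int(1000 * x1 / page_width),
            int(1000 * y1 / page_height),
        ]
        normalized.append((word, box))
    return normalized


def answer_without_image(
    question: str,
    word_boxes: List[Tuple[str, List[int]]],
    model_id: str = "impira/layoutlm-document-qa",  # assumed checkpoint
):
    """Run document QA on pre-extracted words, passing image=None."""
    # Imported lazily so the helper above works without transformers installed.
    from transformers import pipeline

    qa = pipeline("document-question-answering", model=model_id)
    # With word_boxes supplied, OCR is skipped, so no image is needed.
    return qa(image=None, question=question, word_boxes=word_boxes)


# Example input: words from your own OCR or PDF parser, in pixel coordinates.
example_boxes = normalize_word_boxes(
    [("Invoice", (100, 50, 300, 100)), ("1234", (320, 50, 420, 100))],
    page_width=1000,
    page_height=500,
)
```

Calling answer_without_image("What is the invoice number?", example_boxes) downloads the checkpoint on first use, so it is left out of the snippet itself.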

KshitizM changed discussion status to closed
