
LayoutXLM

Multimodal (text + layout/format + image) pre-training for document AI

LayoutXLM is a multilingual variant of LayoutLMv2.

Documentation for this model is available in the Transformers library documentation.

Microsoft Document AI | GitHub

Introduction

LayoutXLM is a multimodal pre-trained model for multilingual document understanding that aims to bridge language barriers in visually-rich document understanding. Experimental results show that it significantly outperforms existing SOTA cross-lingual pre-trained models on the XFUND dataset.
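
To make the "text + layout" input concrete, here is a minimal sketch of one common preprocessing step for the LayoutLM family: scaling each word's pixel bounding box onto a 0-1000 grid so that layout coordinates do not depend on page size. The helper name `normalize_bbox`, the sample words, and the page dimensions are illustrative assumptions, not part of this model card; consult the Transformers documentation for the exact input format the model expects.

```python
from typing import List, Sequence


def normalize_bbox(bbox: Sequence[int], page_width: int, page_height: int) -> List[int]:
    """Scale a pixel-space (x0, y0, x1, y1) box onto a 0-1000 grid."""
    x0, y0, x1, y1 = bbox
    # Integer arithmetic avoids floating-point rounding surprises.
    scaled = [
        1000 * x0 // page_width,
        1000 * y0 // page_height,
        1000 * x1 // page_width,
        1000 * y1 // page_height,
    ]
    # Clamp so boxes that slightly overshoot the page stay in range.
    return [min(1000, max(0, v)) for v in scaled]


# Hypothetical OCR output: two words with pixel boxes on a 1000 x 500 page.
words = ["Nom", "Adresse"]
pixel_boxes = [(50, 100, 200, 150), (50, 200, 400, 250)]
layout_boxes = [normalize_bbox(box, 1000, 500) for box in pixel_boxes]
```

Each entry in `layout_boxes` pairs with the word at the same index in `words`, which is the kind of word-level text and layout pairing a multimodal document model consumes alongside the page image.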

LayoutXLM: Multimodal Pre-training for Multilingual Visually-rich Document Understanding

Yiheng Xu, Tengchao Lv, Lei Cui, Guoxin Wang, Yijuan Lu, Dinei Florencio, Cha Zhang, Furu Wei, arXiv Preprint 2021

