DocExpert-1B

Model Summary

DocExpert-1B is a compact OCR model for document element recognition. It takes cropped document regions as input and recognizes the corresponding textual or structured content.

The architecture is adapted from MinerU2.5 with a more compact structure, and the model achieves competitive performance in text, table, and formula recognition.

Unlike end-to-end document parsing systems, this model does not perform layout analysis. It does not detect text blocks, tables, formulas, figures, or reading-order structure from a full page image. Instead, it is intended to be used as the recognition component in a document parsing pipeline, where upstream layout detection or external preprocessing has already produced bounding boxes and crops.

The model supports recognition for the following element types:

Text regions
Table regions
Formula regions

A typical workflow is:

document image
  -> external layout analyzer / provided bounding boxes
  -> crop text/table/formula regions
  -> DocExpert-1B recognition
  -> structured OCR output
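
The workflow above can be sketched in code. This is a minimal illustration under stated assumptions, not the model's actual API: the official usage is not yet documented, so `placeholder_recognizer` is a hypothetical stand-in for a DocExpert-1B call, and `Region`, `crop`, and `recognize_page` are names invented for this example. The page is modeled as rows of pixel values so the sketch runs without an image library.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence, Tuple

# A page is modeled as rows of pixel values (assumption made for this sketch).
Image = List[List[int]]
# (left, top, right, bottom) in pixels; right and bottom are exclusive.
BBox = Tuple[int, int, int, int]

# Element types the model card lists as supported.
SUPPORTED_TYPES = ("text", "table", "formula")


@dataclass
class Region:
    """One element reported by an external layout analyzer (hypothetical type)."""
    kind: str
    bbox: BBox


def crop(page: Image, bbox: BBox) -> Image:
    """Cut the bounding box out of the page, clamping it to the page bounds."""
    left, top, right, bottom = bbox
    height = len(page)
    width = len(page[0]) if page else 0
    left, right = max(0, left), min(width, right)
    top, bottom = max(0, top), min(height, bottom)
    return [row[left:right] for row in page[top:bottom]]


def recognize_page(
    page: Image,
    regions: Sequence[Region],
    recognize: Callable[[Image, str], str],
) -> List[Dict[str, object]]:
    """Crop each supported region and pass the crop to the recognizer."""
    results: List[Dict[str, object]] = []
    for region in regions:
        if region.kind not in SUPPORTED_TYPES:
            continue  # e.g. figures are not recognized by this component
        patch = crop(page, region.bbox)
        results.append(
            {
                "type": region.kind,
                "bbox": region.bbox,
                "content": recognize(patch, region.kind),
            }
        )
    return results


def placeholder_recognizer(patch: Image, kind: str) -> str:
    """Hypothetical stand-in for DocExpert-1B inference; returns a dummy label."""
    width = len(patch[0]) if patch else 0
    return f"<{kind} {width}x{len(patch)}>"


if __name__ == "__main__":
    page = [[0] * 8 for _ in range(6)]
    regions = [
        Region("text", (0, 0, 4, 2)),
        Region("figure", (0, 2, 8, 4)),
        Region("table", (2, 3, 10, 6)),
    ]
    for item in recognize_page(page, regions, placeholder_recognizer):
        print(item)
```

Passing the recognizer in as a callable keeps layout handling separate from inference, so the placeholder can be replaced by a real model call once the usage instructions are published.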

Usage

Under construction...

Acknowledgements

We would like to acknowledge the contributions of the open-source document AI community. This model and its development process were inspired by the technical experience, engineering practices, open-source code, and data resources from projects and teams including OmniDocBench, MinerU, OpenOCR, and PaddleOCR.

We also gratefully acknowledge the open-source datasets that have supported research and development in document understanding and OCR, including DocBank, UniRec40M, PubTable, and related public resources. These projects and datasets provide an important foundation for advancing reproducible research, benchmark-driven evaluation, and practical document parsing systems.

Model size: 1B params
Tensor type: BF16 (Safetensors)

Model tree for linglongOCR-group/DocExpert-1B

Quantizations: 1 model

Evaluation results

Self-reported results on OmniDocBench v1.5:

text_edit_dist: 0.046
formula_cdm: 89.094
table_teds: 93.375
table_teds_struct: 96.168
reading_order_edit_dist: 0.044
overall: 92.623
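
For orientation, edit-distance metrics such as text_edit_dist are usually normalized to the range 0 to 1, where lower is better. The sketch below assumes a character-level Levenshtein distance divided by the length of the longer string. The exact normalization and text preprocessing used by OmniDocBench may differ, and the function names are illustrative only.

```python
def levenshtein(a: str, b: str) -> int:
    """Character-level edit distance (insertions, deletions, substitutions)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(
                min(
                    prev[j] + 1,         # delete ca
                    curr[j - 1] + 1,     # insert cb
                    prev[j - 1] + cost,  # substitute or match
                )
            )
        prev = curr
    return prev[-1]


def normalized_edit_distance(prediction: str, reference: str) -> float:
    """Edit distance divided by the longer length (assumed normalization)."""
    longest = max(len(prediction), len(reference))
    if longest == 0:
        return 0.0  # two empty strings are treated as identical
    return levenshtein(prediction, reference) / longest


if __name__ == "__main__":
    print(normalized_edit_distance("kitten", "sitting"))
```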