DocExpert-1B
Model Summary
DocExpert-1B is a compact OCR model for document element recognition. It takes cropped document regions as input and recognizes their textual or structured content.
The model is adapted from the MinerU2.5 architecture with a more compact structure, and it achieves competitive performance in text, table, and formula recognition.
Unlike end-to-end document parsing systems, this model does not perform layout analysis. It does not detect text blocks, tables, formulas, figures, or reading-order structure from a full page image. Instead, it is intended to be used as the recognition component in a document parsing pipeline, where upstream layout detection or external preprocessing has already produced bounding boxes and crops.
The model supports recognition for the following element types:
- Text regions
- Table regions
- Formula regions
A typical workflow is:
document image
-> external layout analyzer / provided bounding boxes
-> crop text/table/formula regions
-> DocExpert-1B recognition
-> structured OCR output
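The workflow above can be sketched as follows. This is a minimal illustration, not the model's API: usage instructions are not yet published, so `recognize` is a hypothetical callable standing in for DocExpert-1B, and `Region`, `crop`, and `recognize_page` are names invented for this sketch. The page is modeled as a list of pixel rows so the example runs without an image library; with Pillow, `Image.crop((left, upper, right, lower))` would play the role of `crop`.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence

# A page is modeled as a list of pixel rows (assumption made for this sketch).
Image = List[List[int]]

# Element types the model card lists as supported.
ELEMENT_TYPES = ("text", "table", "formula")


@dataclass
class Region:
    """One region from an upstream layout analyzer (hypothetical schema)."""
    kind: str            # "text", "table", "formula", or anything else
    box: Sequence[int]   # (left, top, right, bottom) in pixels


def crop(page: Image, box: Sequence[int]) -> Image:
    """Crop a region, clamping the box to the page bounds."""
    height = len(page)
    width = len(page[0]) if height else 0
    left, top, right, bottom = box
    left, right = max(0, left), min(width, right)
    top, bottom = max(0, top), min(height, bottom)
    if left >= right or top >= bottom:
        raise ValueError(f"empty crop for box {tuple(box)}")
    return [row[left:right] for row in page[top:bottom]]


def recognize_page(
    page: Image,
    regions: Sequence[Region],
    recognize: Callable[[Image, str], str],
) -> List[Dict[str, object]]:
    """Crop each supported region and pass it to the recognizer."""
    results = []
    for region in regions:
        if region.kind not in ELEMENT_TYPES:
            # Figures and other types are outside the recognizer's scope.
            continue
        content = recognize(crop(page, region.box), region.kind)
        results.append(
            {"type": region.kind, "box": tuple(region.box), "content": content}
        )
    return results


def fake_recognize(image: Image, kind: str) -> str:
    """Stand-in recognizer that only reports the crop size."""
    return f"<{kind} {len(image[0])}x{len(image)}>"


page = [[0] * 100 for _ in range(50)]  # blank page, 100 px wide and 50 px tall
regions = [
    Region("text", (10, 5, 60, 20)),
    Region("figure", (0, 0, 10, 10)),     # skipped: not a supported type
    Region("table", (50, 30, 120, 60)),   # clamped to the page bounds
]
output = recognize_page(page, regions, fake_recognize)
```

Passing the element type alongside the crop reflects the assumption that the recognizer needs to know which kind of content to produce; how DocExpert-1B actually receives that information is not specified here.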
Usage
Under construction...
Acknowledgements
We would like to acknowledge the contributions of the open-source document AI community. This model and its development process were inspired by the technical experience, engineering practices, open-source code, and data resources from projects and teams including OmniDocBench, MinerU, OpenOCR, and PaddleOCR.
We also gratefully acknowledge the open-source datasets that have supported research and development in document understanding and OCR, including DocBank, UniRec40M, PubTable, and related public resources. These projects and datasets provide an important foundation for advancing reproducible research, benchmark-driven evaluation, and practical document parsing systems.
Model tree for linglongOCR-group/DocExpert-1B
Evaluation results
- text_edit_dist on OmniDocBench v1.5 (self-reported): 0.046
- formula_cdm on OmniDocBench v1.5 (self-reported): 89.094
- table_teds on OmniDocBench v1.5 (self-reported): 93.375
- table_teds_struct on OmniDocBench v1.5 (self-reported): 96.168
- reading_order_edit_dist on OmniDocBench v1.5 (self-reported): 0.044
- overall on OmniDocBench v1.5 (self-reported): 92.623
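For readers unfamiliar with edit-distance metrics such as text_edit_dist, the sketch below shows one common definition: Levenshtein distance divided by the length of the longer string, so that 0 means an exact match and lower is better. This is an assumption for illustration only. The exact normalization and text preprocessing used by OmniDocBench are not stated in this card, and the function names are invented here.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance using the standard two-row dynamic program."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            substitution = previous[j - 1] + (0 if char_a == char_b else 1)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1]


def normalized_edit_distance(prediction: str, reference: str) -> float:
    """Edit distance scaled to [0, 1] by the longer string's length."""
    longest = max(len(prediction), len(reference))
    if longest == 0:
        return 0.0  # two empty strings match exactly
    return edit_distance(prediction, reference) / longest


score = normalized_edit_distance("kitten", "sitting")  # 3 edits over 7 characters
```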