ABot-OCR

ABot-OCR is a document image OCR model that converts PDF/document page images into structured Markdown output, supporting recognition and reconstruction of text, mathematical formulas (LaTeX), tables (HTML), and other elements.

Benchmarks

[Figure: ABot-OCR benchmark results]

Requirements

Python 3.11 is recommended. Install the following dependencies:

pip install vllm==0.18.0 torch==2.10.0
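After installing, it can help to confirm that the environment actually matches the pins above. The following is a minimal sketch using the standard library's `importlib.metadata`; the helper name `find_mismatches` and the `REQUIRED` table are illustrative and not part of this repo.

```python
from importlib.metadata import PackageNotFoundError, version

# Pins copied from the pip command above.
REQUIRED = {"vllm": "0.18.0", "torch": "2.10.0"}


def find_mismatches(required, get_version=version):
    """Return {package: (expected, found)} for every pin that is not satisfied.

    `found` is None when the package is not installed.
    `get_version` is injectable so the check can be exercised without the packages.
    """
    mismatches = {}
    for name, expected in required.items():
        try:
            found = get_version(name)
        except PackageNotFoundError:
            found = None
        # Compare only the public version, ignoring a local build tag such as "+cu128".
        if found is None or found.split("+")[0] != expected:
            mismatches[name] = (expected, found)
    return mismatches


if __name__ == "__main__":
    for name, (expected, found) in find_mismatches(REQUIRED).items():
        print(f"{name}: expected {expected}, found {found or 'not installed'}")
```

An empty result means both pins are satisfied; anything printed points at a package to reinstall.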

Note: Inference loads the model through vLLM and needs a GPU with enough memory: about 4 GB for the model weights, plus additional memory that depends on batch_size and image resolution.
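The ~4 GB figure is consistent with the model card metadata (2B params stored as BF16, which uses 2 bytes per parameter). A rough back-of-the-envelope sketch, assuming the rounded 2B parameter count; it covers weights only, so runtime memory for the KV cache and activations comes on top:

```python
def weight_memory_gib(num_params, bytes_per_param):
    """Memory needed to hold the raw weights, in GiB (1 GiB = 1024**3 bytes)."""
    return num_params * bytes_per_param / 1024**3


NUM_PARAMS = 2_000_000_000  # "2B params" from the model card (rounded, so an approximation)
BYTES_PER_BF16 = 2          # BF16 stores each parameter in 16 bits

weights_gib = weight_memory_gib(NUM_PARAMS, BYTES_PER_BF16)
print(f"weights: {NUM_PARAMS * BYTES_PER_BF16 / 1e9:.1f} GB ({weights_gib:.2f} GiB)")
# prints "weights: 4.0 GB (3.73 GiB)"
```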


Inference

Inference script: abot-ocr-infer.py

1. Configure Model Path

Update the default model path in the script:

MODEL_PATH = "./abot-ocr"  # Path to the model directory in this repo
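Before running inference, a quick sanity check on the directory can catch a wrong path early. This is a sketch only: `check_model_dir` is a hypothetical helper, not part of abot-ocr-infer.py, and it assumes the usual Hugging Face layout of a `config.json` next to one or more `*.safetensors` weight files.

```python
from pathlib import Path


def check_model_dir(model_path):
    """Return a list of problems with the model directory; an empty list means it looks usable."""
    root = Path(model_path)
    if not root.is_dir():
        return [f"{root} is not a directory"]
    problems = []
    # Assumed layout: config.json plus at least one safetensors weight file.
    if not (root / "config.json").is_file():
        problems.append("config.json not found")
    if not any(root.glob("*.safetensors")):
        problems.append("no *.safetensors weight files found")
    return problems


if __name__ == "__main__":
    for problem in check_model_dir("./abot-ocr"):
        print("model path problem:", problem)
```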

2. Run from Command Line

Edit the parameters in the __main__ block at the bottom of abot-ocr-infer.py, then run:

python abot-ocr-infer.py

Acknowledgements

Our work is inspired by many excellent open-source projects. We sincerely thank the developers of Qwen-VL, PaddleOCR-VL, MinerU, and the broader OCR community.

Model size: 2B params (Safetensors), tensor type BF16