Doc-UFCN - Generic page detection

The generic page detection model detects individual pages in document images.

Model description

The model was trained with the Doc-UFCN library on the Horae and READ-BAD datasets. Training images were resized so that their largest dimension equals 768 pixels, keeping the original aspect ratio.
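To make the resizing rule concrete, here is a minimal sketch of how the target dimensions could be computed for an arbitrary input image. The function name `resize_dims` and the rounding behaviour are assumptions made for illustration; the Doc-UFCN library may round or interpolate differently.

```python
from typing import Tuple


def resize_dims(width: int, height: int, target: int = 768) -> Tuple[int, int]:
    """Return (new_width, new_height) so that the largest side equals `target`.

    Hypothetical helper: the aspect ratio is preserved up to rounding.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    # One scale factor for both sides keeps the aspect ratio.
    scale = target / max(width, height)
    # Round to the nearest pixel and never let a side collapse to 0.
    new_width = max(1, round(width * scale))
    new_height = max(1, round(height * scale))
    return new_width, new_height


landscape = resize_dims(3000, 2000)
portrait = resize_dims(1000, 1500)
```

With these inputs, a 3000x2000 landscape scan maps to 768x512 and a 1000x1500 portrait scan maps to 512x768.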

Evaluation results

The model achieves the following results:

Dataset  Set       IoU    F1     AP@[.5]  AP@[.75]  AP@[.5,.95]
HOME     test      93.92  95.84  98.98    98.98     97.61
Horae    test      96.68  98.31  99.76    98.49     98.08
Horae    test-300  95.66  97.27  98.87    98.45     97.38
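
For readers unfamiliar with the IoU and F1 columns, the sketch below shows one common pixel-level definition of both metrics for a single predicted mask and its ground truth. It is an illustration only: the helper name `pixel_iou_f1` is hypothetical, and the exact evaluation protocol behind the table (for example, how scores are averaged over images) is not specified here.

```python
from typing import List, Tuple

Mask = List[List[int]]  # binary mask: 1 = page pixel, 0 = background


def pixel_iou_f1(pred: Mask, truth: Mask) -> Tuple[float, float]:
    """Return (IoU, F1) computed over pixels of two same-sized binary masks."""
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1  # pixel predicted as page and labelled as page
            elif p:
                fp += 1  # predicted as page, labelled as background
            elif t:
                fn += 1  # labelled as page, missed by the prediction
    union = tp + fp + fn
    if union == 0:
        # Both masks are empty: treated as perfect agreement (a convention).
        return 1.0, 1.0
    iou = tp / union
    f1 = 2 * tp / (2 * tp + fp + fn)
    return iou, f1


pred = [[1, 1, 0],
        [1, 1, 0]]
truth = [[0, 1, 1],
         [0, 1, 1]]
iou, f1 = pixel_iou_f1(pred, truth)
```

In this toy case the masks share 2 pixels out of a union of 6, so the IoU is 1/3 and the F1 score is 0.5.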

How to use?

Please refer to the Doc-UFCN library page to use this model.

Cite us!

@inproceedings{doc_ufcn2021,
    author = {Boillet, Mélodie and Kermorvant, Christopher and Paquet, Thierry},
    title = {{Multiple Document Datasets Pre-training Improves Text Line Detection With
              Deep Neural Networks}},
    booktitle = {2020 25th International Conference on Pattern Recognition (ICPR)},
    year = {2021},
    month = Jan,
    pages = {2134-2141},
    doi = {10.1109/ICPR48806.2021.9412447}
}