Paragraph classifier
The classifier performs binary classification of text lines in PDF or scanned documents.
For each document line, it determines whether:
the line is the beginning of a new paragraph, or
the line is a continuation of the previous paragraph.
For each line, a feature vector is formed from the line's text and formatting; see dedoc/structure_extractors/feature_extractors/paragraph_feature_extractor.py in dedoc.
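To give a rough idea of what a per-line feature vector built from text and formatting might look like, here is a minimal sketch. It is not the dedoc implementation: the `Line` record, the `line_features` function, and the specific features chosen are all hypothetical and exist only for illustration. The real feature set is defined in the file referenced above.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Line:
    # Hypothetical line record for illustration; not dedoc's actual line class.
    text: str
    indent: float     # left offset of the line, in arbitrary page units
    font_size: float


def line_features(line: Line, prev: Optional[Line]) -> List[float]:
    """Build an illustrative feature vector from a line's text and formatting."""
    stripped = line.text.strip()
    prev_text = prev.text.strip() if prev is not None else ""
    return [
        # 1.0 if the line starts with a capital letter
        float(stripped[:1].isupper()),
        # 1.0 if there is no previous line or it ended with sentence-final punctuation
        float(prev is None or prev_text.endswith((".", "!", "?", ":"))),
        # change in indentation relative to the previous line
        line.indent - (prev.indent if prev is not None else 0.0),
        # change in font size relative to the previous line
        line.font_size - (prev.font_size if prev is not None else line.font_size),
        # line length in characters
        float(len(stripped)),
    ]


lines = [
    Line("The classifier labels every line", indent=20.0, font_size=12.0),
    Line("of the document in turn.", indent=0.0, font_size=12.0),
]
# Pair each line with its predecessor (None for the first line).
vectors = [line_features(cur, prev) for prev, cur in zip([None] + lines[:-1], lines)]
```

A binary classifier would then take each vector and output one of the two labels (new paragraph or continuation). In this toy input, the first line is indented and capitalized, while the second starts in lowercase after an unfinished sentence, which is the kind of signal such features are meant to expose.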