# PDF2Markdown
**Demo (left: input image; right: rendered markdown):**

1. Extract PDF features with the following tasks:
   - Layout Detection: uses the YOLOv8 model to detect regions such as images, tables, titles, and text;
   - Formula Detection: uses YOLOv8 to detect formulas, both inline and isolated;
   - Formula Recognition: uses UniMERNet to recognize formulas;
   - Table Recognition: uses StructEqTable to recognize tables;
   - Optical Character Recognition: uses PaddleOCR to recognize text.
2. Convert the features to a markdown file:
   Simple rules convert the recognized results to markdown (*Note: this is a simple conversion script that supports only single-column PDFs; see [MinerU](https://github.com/opendatalab/MinerU) for more complex layouts*).
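
The rule-based conversion in step 2 can be pictured with a minimal sketch. It is not the project's actual code: the `Region` class, its category labels, and both helper functions are hypothetical names chosen for illustration. It assumes each detected region carries a category, a bounding box, and its recognized content, and that a single-column page can be read top to bottom.

```python
from dataclasses import dataclass


@dataclass
class Region:
    """Hypothetical container for one detected region (not the project's real type)."""
    category: str  # assumed labels: "title", "text", "formula", "table", "image"
    bbox: tuple    # (x0, y0, x1, y1) in page coordinates, origin at the top left
    content: str   # recognized text, LaTeX, table markup, or an image path


def region_to_markdown(region):
    """Map one region to a markdown fragment using simple per-category rules."""
    text = region.content.strip()
    if region.category == "title":
        return f"## {text}"
    if region.category == "formula":
        return f"$$\n{text}\n$$"
    if region.category == "image":
        return f"![]({text})"
    # Plain text and table markup pass through unchanged.
    return text


def page_to_markdown(regions):
    """Order regions top to bottom (valid only for one-column pages) and join them."""
    ordered = sorted(regions, key=lambda r: (r.bbox[1], r.bbox[0]))
    return "\n\n".join(region_to_markdown(r) for r in ordered)
```

Sorting by the top edge of each bounding box is what restricts this kind of approach to single-column pages: in a multi-column layout, regions from different columns would be interleaved.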
# Usage
```
python project/pdf2markdown/scripts/run_project.py --config project/pdf2markdown/configs/pdf2markdown.yaml
```