Update README.md
README.md
CHANGED
|
@@ -1,387 +1,474 @@
|
|
| 1 |
-
|
| 2 |
|
| 3 |
-
|
| 4 |
|
| 5 |
-
**
|
| 6 |
-
**Converted by**: Community contribution
|
| 7 |
**Format**: ONNX (optimized for inference)
|
| 8 |
**License**: Apache 2.0
|
| 9 |
|
| 10 |
---
|
| 11 |
|
| 12 |
-
##
|
| 13 |
|
| 14 |
-
|
| 15 |
-
|
| 16 |
-
-
|
| 17 |
-
- **7 Recognition Models** - Reads text in 39+ languages
|
| 18 |
-
- **3 Preprocessing Models** - Fixes rotated or distorted documents (optional)
|
| 19 |
-
|
| 20 |
-
**Total Size**: ~258 MB
|
| 21 |
-
**Languages**: English, French, German, Spanish, Italian, Portuguese, Russian, Ukrainian, Korean, Chinese, Japanese, Thai, Greek, and 25+ more!
|
| 22 |
|
| 23 |
---
|
| 24 |
|
| 25 |
-
##
|
| 26 |
|
| 27 |
-
###
|
| 28 |
|
| 29 |
```bash
|
| 30 |
-
pip install rapidocr-onnxruntime
|
| 31 |
```
|
| 32 |
|
| 33 |
-
|
| 34 |
-
|
| 35 |
-
### Basic Usage - English
|
| 36 |
|
| 37 |
```python
|
| 38 |
from rapidocr_onnxruntime import RapidOCR
|
| 39 |
|
| 40 |
-
|
| 41 |
-
ocr = RapidOCR(
|
| 42 |
-
det_model_path="detection/PP-OCRv5_server_det.onnx",
|
| 43 |
-
rec_model_path="english/en_PP-OCRv5_mobile_rec.onnx",
|
| 44 |
-
rec_keys_path="english/ppocrv5_en_dict.txt"
|
| 45 |
-
)
|
| 46 |
|
| 47 |
-
|
| 48 |
-
|
| 49 |
|
| 50 |
-
|
| 51 |
-
|
| 52 |
-
|
| 53 |
-
|
| 54 |
-
|
| 55 |
```
|
| 56 |
|
| 57 |
-
|
| 58 |
|
| 59 |
-
|
| 60 |
|
| 61 |
-
```
|
| 62 |
-
|
| 63 |
-
|
| 64 |
-
|
| 65 |
-
rec_model_path="latin/latin_PP-OCRv5_mobile_rec.onnx",
|
| 66 |
-
rec_keys_path="latin/ppocrv5_latin_dict.txt"
|
| 67 |
-
)
|
| 68 |
|
| 69 |
-
|
| 70 |
-
ocr = RapidOCR(
|
| 71 |
-
det_model_path="detection/PP-OCRv5_server_det.onnx",
|
| 72 |
-
rec_model_path="eslav/eslav_PP-OCRv5_mobile_rec.onnx",
|
| 73 |
-
rec_keys_path="eslav/ppocrv5_eslav_dict.txt"
|
| 74 |
-
)
|
| 75 |
|
| 76 |
-
|
| 77 |
-
ocr = RapidOCR(
|
| 78 |
-
det_model_path="detection/PP-OCRv5_server_det.onnx",
|
| 79 |
-
rec_model_path="korean/korean_PP-OCRv5_mobile_rec.onnx",
|
| 80 |
-
rec_keys_path="korean/ppocrv5_korean_dict.txt"
|
| 81 |
-
)
|
| 82 |
|
| 83 |
-
|
| 84 |
-
|
| 85 |
-
det_model_path="detection/PP-OCRv5_server_det.onnx",
|
| 86 |
-
rec_model_path="chinese/PP-OCRv5_server_rec.onnx",
|
| 87 |
-
rec_keys_path="chinese/ppocrv5_dict.txt"
|
| 88 |
-
)
|
| 89 |
|
| 90 |
-
# Thai
|
| 91 |
ocr = RapidOCR(
|
| 92 |
-
det_model_path="detection/
|
| 93 |
-
rec_model_path="
|
| 94 |
-
rec_keys_path="
|
| 95 |
)
|
| 96 |
|
| 97 |
-
|
| 98 |
-
|
| 99 |
-
|
| 100 |
-
rec_model_path="greek/el_PP-OCRv5_mobile_rec.onnx",
|
| 101 |
-
rec_keys_path="greek/ppocrv5_el_dict.txt"
|
| 102 |
-
)
|
| 103 |
```
|
| 104 |
|
| 105 |
---
|
| 106 |
|
| 107 |
-
##
|
| 108 |
|
| 109 |
-
###
|
| 110 |
|
| 111 |
-
|
|
| 112 |
-
|
| 113 |
-
|
|
| 114 |
-
|
|
| 115 |
-
|
|
| 116 |
-
|
|
| 117 |
-
|
|
| 118 |
-
|
|
| 119 |
-
|
|
| 120 |
|
| 121 |
-
###
|
| 122 |
|
| 123 |
-
|
| 124 |
|
| 125 |
-
###
|
| 126 |
|
| 127 |
-
|
| 128 |
|
| 129 |
-
|
| 130 |
-
|
| 131 |
-
|
| 132 |
|
| 133 |
---
|
| 134 |
|
| 135 |
-
##
|
| 136 |
|
| 137 |
-
###
|
| 138 |
-
English • French • German • Spanish • Italian • Portuguese • Dutch • Polish • Czech • Slovak • Croatian • Bosnian • Serbian (Latin) • Slovenian • Danish • Norwegian • Swedish • Icelandic • Estonian • Lithuanian • Hungarian • Albanian • Welsh • Irish • Turkish • Indonesian • Malay • Afrikaans • Swahili • Tagalog • Uzbek • Latin
|
| 139 |
|
| 140 |
-
|
| 141 |
-
- **English** - English (optimized)
|
| 142 |
-
- **East Slavic** - Russian • Bulgarian • Ukrainian • Belarusian
|
| 143 |
-
- **Korean** - Korean
|
| 144 |
-
- **Chinese/Japanese** - Simplified Chinese • Traditional Chinese • Pinyin • Japanese (Hiragana, Katakana, Kanji)
|
| 145 |
-
- **Thai** - Thai
|
| 146 |
-
- **Greek** - Greek
|
| 147 |
|
| 148 |
-
|
| 149 |
|
| 150 |
-
|
| 151 |
|
| 152 |
-
|
| 153 |
-
|
| 154 |
-
|
| 155 |
-
│   ├── PP-OCRv5_server_det.onnx
│   └── config.json
│
├── english/                 # English (7.5 MB)
│   ├── en_PP-OCRv5_mobile_rec.onnx
│   ├── ppocrv5_en_dict.txt
│   └── config.json
│
├── latin/                   # 32 languages (7.5 MB)
│   ├── latin_PP-OCRv5_mobile_rec.onnx
│   ├── ppocrv5_latin_dict.txt
│   └── config.json
│
├── eslav/                   # Russian/Ukrainian (7.5 MB)
│   ├── eslav_PP-OCRv5_mobile_rec.onnx
│   ├── ppocrv5_eslav_dict.txt
│   └── config.json
│
├── korean/                  # Korean (13 MB)
│   ├── korean_PP-OCRv5_mobile_rec.onnx
│   ├── ppocrv5_korean_dict.txt
│   └── config.json
│
├── chinese/                 # Chinese/Japanese (81 MB)
│   ├── PP-OCRv5_server_rec.onnx
│   ├── ppocrv5_dict.txt
│   └── config.json
│
├── thai/                    # Thai (7.5 MB)
│   ├── th_PP-OCRv5_mobile_rec.onnx
│   ├── ppocrv5_th_dict.txt
│   └── config.json
│
├── greek/                   # Greek (7.4 MB)
│   ├── el_PP-OCRv5_mobile_rec.onnx
│   ├── ppocrv5_el_dict.txt
│   └── config.json
│
└── preprocessing/           # Optional (43 MB)
    ├── doc-orientation/
    ├── textline-orientation/
    └── doc-unwarping/
|
| 197 |
-
```
|
| 198 |
|
| 199 |
-
|
| 200 |
-
|
| 201 |
-
|
| 202 |
-
|
| 203 |
|
| 204 |
---
|
| 205 |
|
| 206 |
-
##
|
| 207 |
|
| 208 |
-
|
| 209 |
|
| 210 |
-
|
| 211 |
-
|
| 212 |
-
|
| 213 |
-
|
| 214 |
-
|
| 215 |
-
|
| 216 |
|
| 217 |
-
|
| 218 |
|
| 219 |
-
|
| 220 |
-
|
| 221 |
-
|
| 222 |
|
| 223 |
-
|
| 224 |
|
| 225 |
-
|
| 226 |
|
| 227 |
-
|
| 228 |
|
| 229 |
-
|
| 230 |
-
|
| 231 |
```
|
| 232 |
|
| 233 |
-
|
| 234 |
|
| 235 |
-
|
| 236 |
|
| 237 |
```python
|
| 238 |
from rapidocr_onnxruntime import RapidOCR
|
| 239 |
-
import glob
|
| 240 |
|
| 241 |
ocr = RapidOCR(
|
| 242 |
-
det_model_path="detection/
|
| 243 |
-
rec_model_path="
|
| 244 |
-
rec_keys_path="
|
| 245 |
)
|
| 246 |
|
| 247 |
-
#
|
| 248 |
-
|
| 249 |
-
|
| 250 |
-
|
| 251 |
-
|
| 252 |
-
|
| 253 |
-
```
|
| 254 |
|
| 255 |
-
|
| 256 |
|
| 257 |
-
|
| 258 |
-
# Enable angle classification for rotated text
|
| 259 |
ocr = RapidOCR(
|
| 260 |
-
det_model_path="detection/
|
| 261 |
-
rec_model_path="
|
| 262 |
-
rec_keys_path="
|
| 263 |
-
use_angle_cls=True,
|
| 264 |
-
angle_cls_model_path="preprocessing/textline-orientation/PP-LCNet_x1_0_textline_ori.onnx"
|
| 265 |
)
|
| 266 |
```
|
| 267 |
|
| 268 |
---
|
| 269 |
|
| 270 |
-
##
|
| 271 |
|
| 272 |
-
|
| 273 |
|
| 274 |
-
|
| 275 |
-
2. **Recognition** - Reads text from each region using language-specific model
|
| 276 |
-
3. **Decoding** - Converts model output to text using character dictionary
|
| 277 |
|
| 278 |
-
|
| 279 |
|
| 280 |
-
|
| 281 |
-
|
| 282 |
-
|
| 283 |
-
|
| 284 |
-
|
| 285 |
|
| 286 |
-
|
| 287 |
|
| 288 |
-
|
| 289 |
|
| 290 |
-
|
| 291 |
-
- Korean: 88.0%
|
| 292 |
-
- English: 85.25%
|
| 293 |
-
- Latin: 84.7%
|
| 294 |
-
- Thai: 82.68%
|
| 295 |
-
- East Slavic: 81.6%
|
| 296 |
|
| 297 |
---
|
| 298 |
|
| 299 |
-
##
|
| 300 |
|
| 301 |
-
|
| 302 |
-
|
| 303 |
-
|
| 304 |
-
|
| 305 |
-
|
| 306 |
-
|
| 307 |
|
| 308 |
---
|
| 309 |
|
| 310 |
-
##
|
| 311 |
-
|
| 312 |
-
|
|
| 313 |
-
|
| 314 |
-
| English
|
| 315 |
-
| French, German, Spanish, Italian,
|
| 316 |
-
| Russian, Bulgarian, Ukrainian, Belarusian | `eslav/` |
|
| 317 |
-
| Korean | `korean/` |
|
| 318 |
-
| Chinese
|
| 319 |
-
| Thai | `thai/` |
|
| 320 |
-
| Greek | `greek/` |
|
| 321 |
-
|
|
| 322 |
-
|
| 323 |
-
|
| 324 |
|
| 325 |
---
|
| 326 |
|
| 327 |
-
##
|
| 328 |
|
| 329 |
-
**
|
| 330 |
-
|
| 331 |
|
| 332 |
-
|
| 333 |
-
|
| 334 |
|
| 335 |
-
|
| 336 |
-
A: Use the `latin/` model - it supports French and 31 other languages.
|
| 337 |
|
| 338 |
-
|
| 339 |
-
A: Yes! Licensed under Apache 2.0.
|
| 340 |
|
| 341 |
-
|
| 342 |
-
A: Very accurate! PP-OCRv5 has 30% better accuracy than PP-OCRv3.
|
| 343 |
|
| 344 |
-
|
| 345 |
-
|
| 346 |
|
| 347 |
---
|
| 348 |
|
| 349 |
-
##
|
| 350 |
|
| 351 |
-
|
| 352 |
-
-
|
| 353 |
-
- **Documentation**: [PaddleOCR Docs](https://paddlepaddle.github.io/PaddleOCR/)
|
| 354 |
-
- **RapidOCR**: [github.com/RapidAI/RapidOCR](https://github.com/RapidAI/RapidOCR)
|
| 355 |
-
- **ONNX Runtime**: [onnxruntime.ai](https://onnxruntime.ai/)
|
| 356 |
|
| 357 |
-
|
| 358 |
|
| 359 |
-
|
| 360 |
|
| 361 |
-
|
| 362 |
-
|
| 363 |
-
- **Based on**: [PP-OCRv5 Official Collection](https://huggingface.co/collections/PaddlePaddle/pp-ocrv5-684a5356aef5b4b1d7b85e4b)
|
| 364 |
|
| 365 |
---
|
| 366 |
|
| 367 |
-
##
|
| 368 |
-
|
| 369 |
-
Apache License 2.0 (inherited from PaddleOCR)
|
| 370 |
|
| 371 |
-
|
| 372 |
-
-
|
| 373 |
-
-
|
| 374 |
-
- ✅ Distribute
|
| 375 |
-
- ✅ Use privately
|
| 376 |
|
| 377 |
---
|
| 378 |
|
| 379 |
-
##
|
| 380 |
|
| 381 |
-
|
| 382 |
-
-
|
| 383 |
-
-
|
| 384 |
-
-
|
| 385 |
|
| 386 |
---
|
| 387 |
|
| 1 |
+
---
|
| 2 |
+
license: apache-2.0
|
| 3 |
+
language:
|
| 4 |
+
- en
|
| 5 |
+
- fr
|
| 6 |
+
- de
|
| 7 |
+
- es
|
| 8 |
+
- it
|
| 9 |
+
- pt
|
| 10 |
+
- nl
|
| 11 |
+
- pl
|
| 12 |
+
- cs
|
| 13 |
+
- sk
|
| 14 |
+
- hr
|
| 15 |
+
- bs
|
| 16 |
+
- sr
|
| 17 |
+
- sl
|
| 18 |
+
- da
|
| 19 |
+
- "no"
|
| 20 |
+
- sv
|
| 21 |
+
- is
|
| 22 |
+
- et
|
| 23 |
+
- lt
|
| 24 |
+
- hu
|
| 25 |
+
- sq
|
| 26 |
+
- cy
|
| 27 |
+
- ga
|
| 28 |
+
- tr
|
| 29 |
+
- id
|
| 30 |
+
- ms
|
| 31 |
+
- af
|
| 32 |
+
- sw
|
| 33 |
+
- tl
|
| 34 |
+
- uz
|
| 35 |
+
- la
|
| 36 |
+
- ru
|
| 37 |
+
- bg
|
| 38 |
+
- uk
|
| 39 |
+
- be
|
| 40 |
+
- ko
|
| 41 |
+
- zh
|
| 42 |
+
- ja
|
| 43 |
+
- th
|
| 44 |
+
- el
|
| 45 |
+
- hi
|
| 46 |
+
- mr
|
| 47 |
+
- ne
|
| 48 |
+
- sa
|
| 49 |
+
- ar
|
| 50 |
+
- ur
|
| 51 |
+
- fa
|
| 52 |
+
- ta
|
| 53 |
+
- te
|
| 54 |
+
tags:
|
| 55 |
+
- ocr
|
| 56 |
+
- optical-character-recognition
|
| 57 |
+
- text-detection
|
| 58 |
+
- text-recognition
|
| 59 |
+
- paddleocr
|
| 60 |
+
- onnx
|
| 61 |
+
- computer-vision
|
| 62 |
+
- document-ai
|
| 63 |
+
library_name: onnx
|
| 64 |
+
pipeline_tag: image-to-text
|
| 65 |
+
---
|
| 66 |
+
|
| 67 |
+
# PP-OCR ONNX Models
|
| 68 |
+
|
| 69 |
+
Multilingual OCR models from PaddleOCR, converted to ONNX format for production deployment.
|
| 70 |
|
| 71 |
+
**Use as a complete pipeline**: Integrate with [monkt.com](https://monkt.com) for end-to-end document processing.
|
| 72 |
|
| 73 |
+
**Source**: [PaddlePaddle PP-OCRv5 Collection](https://huggingface.co/collections/PaddlePaddle/pp-ocrv5-684a5356aef5b4b1d7b85e4b)
|
| 74 |
**Format**: ONNX (optimized for inference)
|
| 75 |
**License**: Apache 2.0
|
| 76 |
|
| 77 |
---
|
| 78 |
|
| 79 |
+
## Overview
|
| 80 |
|
| 81 |
+
**16 models** covering **48+ languages**:
|
| 82 |
+
- 11 PP-OCRv5 models (latest, highest accuracy)
|
| 83 |
+
- 5 PP-OCRv3 models (legacy, additional language support)
|
| 84 |
|
| 85 |
---
|
| 86 |
|
| 87 |
+
## Quick Start
|
| 88 |
|
| 89 |
+
### Download from HuggingFace
|
| 90 |
|
| 91 |
```bash
|
| 92 |
+
pip install huggingface_hub rapidocr-onnxruntime
|
| 93 |
```
|
| 94 |
|
| 95 |
+
<details>
|
| 96 |
+
<summary><b>Download specific language models</b></summary>
|
| 97 |
|
| 98 |
```python
|
| 99 |
+
from huggingface_hub import hf_hub_download
|
| 100 |
+
|
| 101 |
+
# Download English models
|
| 102 |
+
det_path = hf_hub_download("monkt/paddleocr-onnx", "detection/v5/det.onnx")
|
| 103 |
+
rec_path = hf_hub_download("monkt/paddleocr-onnx", "languages/english/rec.onnx")
|
| 104 |
+
dict_path = hf_hub_download("monkt/paddleocr-onnx", "languages/english/dict.txt")
|
| 105 |
+
|
| 106 |
+
# Use with RapidOCR
|
| 107 |
from rapidocr_onnxruntime import RapidOCR
|
| 108 |
+
ocr = RapidOCR(det_model_path=det_path, rec_model_path=rec_path, rec_keys_path=dict_path)
|
| 109 |
+
result, elapsed = ocr("document.jpg")
|
| 110 |
+
```
|
| 111 |
|
| 112 |
+
</details>
|
| 113 |
|
| 114 |
+
<details>
|
| 115 |
+
<summary><b>Download entire language folder</b></summary>
|
| 116 |
|
| 117 |
+
```python
|
| 118 |
+
from huggingface_hub import snapshot_download
|
| 119 |
+
|
| 120 |
+
# Download all French/German/Spanish (Latin) models
|
| 121 |
+
snapshot_download("monkt/paddleocr-onnx", allow_patterns=["detection/v5/*", "languages/latin/*"])
|
| 122 |
+
|
| 123 |
+
# Download Arabic models (v3)
|
| 124 |
+
snapshot_download("monkt/paddleocr-onnx", allow_patterns=["detection/v3/*", "languages/arabic/*"])
|
| 125 |
```
|
| 126 |
|
| 127 |
+
</details>
|
| 128 |
|
| 129 |
+
<details>
|
| 130 |
+
<summary><b>Clone entire repository</b></summary>
|
| 131 |
|
| 132 |
+
```bash
|
| 133 |
+
git clone https://huggingface.co/monkt/paddleocr-onnx
|
| 134 |
+
cd paddleocr-onnx
|
| 135 |
+
```
|
| 136 |
|
| 137 |
+
</details>
|
| 138 |
|
| 139 |
+
### Basic Usage
|
| 140 |
|
| 141 |
+
```python
|
| 142 |
+
from rapidocr_onnxruntime import RapidOCR
|
| 143 |
|
| 144 |
ocr = RapidOCR(
|
| 145 |
+
det_model_path="detection/v5/det.onnx",
|
| 146 |
+
rec_model_path="languages/english/rec.onnx",
|
| 147 |
+
rec_keys_path="languages/english/dict.txt"
|
| 148 |
)
|
| 149 |
|
| 150 |
+
result, elapsed = ocr("document.jpg")
|
| 151 |
+
for line in result or []:  # result is None when no text is detected
|
| 152 |
+
    print(line[1])  # Extracted text; each entry is [box, text, score]
|
| 153 |
```
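
The result list can be filtered before use. The following is a minimal sketch, assuming each entry has the `[box, text, score]` layout and that `result` is `None` when nothing is detected. `extract_text` and `min_score` are hypothetical names for illustration, not part of RapidOCR.

```python
from typing import List, Optional, Sequence


def extract_text(result: Optional[Sequence[Sequence]], min_score: float = 0.5) -> List[str]:
    """Return recognized strings whose confidence is at least min_score.

    Assumes each entry looks like [box, text, score].
    """
    if not result:  # None or empty list: no text detected
        return []
    # float() tolerates scores that arrive as strings.
    return [text for _box, text, score in result if float(score) >= min_score]


# Hand-written stand-in for an OCR result, for illustration only.
sample = [
    [[[0, 0], [10, 0], [10, 5], [0, 5]], "Invoice", 0.98],
    [[[0, 6], [10, 6], [10, 11], [0, 11]], "t0tal", 0.31],
]
print(extract_text(sample))
```

In real use, `sample` would be replaced by the `result` returned from `ocr("document.jpg")`.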
|
| 154 |
|
| 155 |
---
|
| 156 |
|
| 157 |
+
## Available Models
|
| 158 |
|
| 159 |
+
### PP-OCRv5 Recognition Models
|
| 160 |
|
| 161 |
+
| Language Group | Path | Languages | Accuracy | Size |
|
| 162 |
+
|----------------|------|-----------|----------|------|
|
| 163 |
+
| English | `languages/english/` | English | 85.25% | 7.5 MB |
|
| 164 |
+
| Latin | `languages/latin/` | French, German, Spanish, Italian, Portuguese, + 27 more | 84.7% | 7.5 MB |
|
| 165 |
+
| East Slavic | `languages/eslav/` | Russian, Bulgarian, Ukrainian, Belarusian | 81.6% | 7.5 MB |
|
| 166 |
+
| Korean | `languages/korean/` | Korean | 88.0% | 13 MB |
|
| 167 |
+
| Chinese/Japanese | `languages/chinese/` | Chinese, Japanese | - | 81 MB |
|
| 168 |
+
| Thai | `languages/thai/` | Thai | 82.68% | 7.5 MB |
|
| 169 |
+
| Greek | `languages/greek/` | Greek | 89.28% | 7.4 MB |
|
| 170 |
|
| 171 |
+
### PP-OCRv3 Recognition Models (Legacy)
|
| 172 |
|
| 173 |
+
| Language Group | Path | Languages | Version | Size |
|
| 174 |
+
|----------------|------|-----------|---------|------|
|
| 175 |
+
| Devanagari | `languages/hindi/` | Hindi, Marathi, Nepali, Sanskrit | v3 | 8.6 MB |
|
| 176 |
+
| Arabic | `languages/arabic/` | Arabic, Urdu, Persian/Farsi | v3 | 8.6 MB |
|
| 177 |
+
| Tamil | `languages/tamil/` | Tamil | v3 | 8.6 MB |
|
| 178 |
+
| Telugu | `languages/telugu/` | Telugu | v3 | 8.6 MB |
|
| 179 |
|
| 180 |
+
### Detection Models
|
| 181 |
|
| 182 |
+
| Model | Path | Version | Size |
|
| 183 |
+
|-------|------|---------|------|
|
| 184 |
+
| PP-OCRv5 Detection | `detection/v5/det.onnx` | v5 | 84 MB |
|
| 185 |
+
| PP-OCRv3 Detection | `detection/v3/det.onnx` | v3 | 2.3 MB |
|
| 186 |
|
| 187 |
+
**Note**: Use v5 detection with v5 recognition models. Use v3 detection with v3 recognition models.
|
| 188 |
+
|
| 189 |
+
### Preprocessing Models (Optional)
|
| 190 |
+
|
| 191 |
+
| Model | Path | Purpose | Accuracy | Size |
|
| 192 |
+
|-------|------|---------|----------|------|
|
| 193 |
+
| Document Orientation | `preprocessing/doc-orientation/` | Corrects rotated documents (0°, 90°, 180°, 270°) | 99.06% | 6.5 MB |
|
| 194 |
+
| Text Line Orientation | `preprocessing/textline-orientation/` | Corrects upside-down text (0°, 180°) | 98.85% | 6.5 MB |
|
| 195 |
+
| Document Unwarping | `preprocessing/doc-unwarping/` | Fixes curved/warped documents | - | 30 MB |
|
| 196 |
|
| 197 |
---
|
| 198 |
|
| 199 |
+
## Language Support
|
| 200 |
|
| 201 |
+
### PP-OCRv5 Languages (40+)
|
| 202 |
|
| 203 |
+
**Latin Script** (32 languages): English, French, German, Spanish, Italian, Portuguese, Dutch, Polish, Czech, Slovak, Croatian, Bosnian, Serbian, Slovenian, Danish, Norwegian, Swedish, Icelandic, Estonian, Lithuanian, Hungarian, Albanian, Welsh, Irish, Turkish, Indonesian, Malay, Afrikaans, Swahili, Tagalog, Uzbek, Latin
|
| 204 |
|
| 205 |
+
**Cyrillic**: Russian, Bulgarian, Ukrainian, Belarusian
|
| 206 |
|
| 207 |
+
**East Asian**: Chinese (Simplified, Traditional), Japanese (Hiragana, Katakana, Kanji), Korean
|
| 208 |
|
| 209 |
+
**Southeast Asian**: Thai
|
| 210 |
+
|
| 211 |
+
**Other**: Greek
|
| 212 |
|
| 213 |
+
### PP-OCRv3 Languages (9)
|
| 214 |
+
|
| 215 |
+
**South Asian**: Hindi, Marathi, Nepali, Sanskrit, Tamil, Telugu
|
| 216 |
+
|
| 217 |
+
**Middle Eastern**: Arabic, Urdu, Persian/Farsi
|
| 218 |
|
| 219 |
---
|
| 220 |
|
| 221 |
+
## Usage Examples
|
| 222 |
+
|
| 223 |
+
<details>
|
| 224 |
+
<summary><b>PP-OCRv5 Models (English, Latin, East Asian, etc.)</b></summary>
|
| 225 |
|
| 226 |
+
```python
|
| 227 |
+
from rapidocr_onnxruntime import RapidOCR
|
| 228 |
|
| 229 |
+
# English
|
| 230 |
+
ocr = RapidOCR(
|
| 231 |
+
det_model_path="detection/v5/det.onnx",
|
| 232 |
+
rec_model_path="languages/english/rec.onnx",
|
| 233 |
+
rec_keys_path="languages/english/dict.txt"
|
| 234 |
+
)
|
| 235 |
|
| 236 |
+
# French, German, Spanish, etc. (32 languages)
|
| 237 |
+
ocr = RapidOCR(
|
| 238 |
+
det_model_path="detection/v5/det.onnx",
|
| 239 |
+
rec_model_path="languages/latin/rec.onnx",
|
| 240 |
+
rec_keys_path="languages/latin/dict.txt"
|
| 241 |
+
)
|
| 242 |
|
| 243 |
+
# Russian, Bulgarian, Ukrainian, Belarusian
|
| 244 |
+
ocr = RapidOCR(
|
| 245 |
+
det_model_path="detection/v5/det.onnx",
|
| 246 |
+
rec_model_path="languages/eslav/rec.onnx",
|
| 247 |
+
rec_keys_path="languages/eslav/dict.txt"
|
| 248 |
+
)
|
| 249 |
|
| 250 |
+
# Korean
|
| 251 |
+
ocr = RapidOCR(
|
| 252 |
+
det_model_path="detection/v5/det.onnx",
|
| 253 |
+
rec_model_path="languages/korean/rec.onnx",
|
| 254 |
+
rec_keys_path="languages/korean/dict.txt"
|
| 255 |
+
)
|
| 256 |
|
| 257 |
+
# Chinese/Japanese
|
| 258 |
+
ocr = RapidOCR(
|
| 259 |
+
det_model_path="detection/v5/det.onnx",
|
| 260 |
+
rec_model_path="languages/chinese/rec.onnx",
|
| 261 |
+
rec_keys_path="languages/chinese/dict.txt"
|
| 262 |
+
)
|
| 263 |
|
| 264 |
+
# Thai
|
| 265 |
+
ocr = RapidOCR(
|
| 266 |
+
det_model_path="detection/v5/det.onnx",
|
| 267 |
+
rec_model_path="languages/thai/rec.onnx",
|
| 268 |
+
rec_keys_path="languages/thai/dict.txt"
|
| 269 |
+
)
|
| 270 |
|
| 271 |
+
# Greek
|
| 272 |
+
ocr = RapidOCR(
|
| 273 |
+
det_model_path="detection/v5/det.onnx",
|
| 274 |
+
rec_model_path="languages/greek/rec.onnx",
|
| 275 |
+
rec_keys_path="languages/greek/dict.txt"
|
| 276 |
+
)
|
| 277 |
```
|
| 278 |
|
| 279 |
+
</details>
|
| 280 |
|
| 281 |
+
<details>
|
| 282 |
+
<summary><b>PP-OCRv3 Models (Hindi, Arabic, Tamil, Telugu)</b></summary>
|
| 283 |
|
| 284 |
```python
|
| 285 |
from rapidocr_onnxruntime import RapidOCR
|
| 286 |
|
| 287 |
+
# Hindi, Marathi, Nepali, Sanskrit
|
| 288 |
ocr = RapidOCR(
|
| 289 |
+
det_model_path="detection/v3/det.onnx",
|
| 290 |
+
rec_model_path="languages/hindi/rec.onnx",
|
| 291 |
+
rec_keys_path="languages/hindi/dict.txt"
|
| 292 |
)
|
| 293 |
|
| 294 |
+
# Arabic, Urdu, Persian/Farsi
|
| 295 |
+
ocr = RapidOCR(
|
| 296 |
+
det_model_path="detection/v3/det.onnx",
|
| 297 |
+
rec_model_path="languages/arabic/rec.onnx",
|
| 298 |
+
rec_keys_path="languages/arabic/dict.txt"
|
| 299 |
+
)
|
| 300 |
|
| 301 |
+
# Tamil
|
| 302 |
+
ocr = RapidOCR(
|
| 303 |
+
det_model_path="detection/v3/det.onnx",
|
| 304 |
+
rec_model_path="languages/tamil/rec.onnx",
|
| 305 |
+
rec_keys_path="languages/tamil/dict.txt"
|
| 306 |
+
)
|
| 307 |
|
| 308 |
+
# Telugu
|
| 309 |
ocr = RapidOCR(
|
| 310 |
+
det_model_path="detection/v3/det.onnx",
|
| 311 |
+
rec_model_path="languages/telugu/rec.onnx",
|
| 312 |
+
rec_keys_path="languages/telugu/dict.txt"
|
| 313 |
)
|
| 314 |
```
|
| 315 |
|
| 316 |
+
</details>
|
| 317 |
+
|
| 318 |
---
|
| 319 |
|
| 320 |
+
## Full Pipeline with Preprocessing
|
| 321 |
|
| 322 |
+
<details>
|
| 323 |
+
<summary><b>Optional preprocessing for rotated/distorted documents</b></summary>
|
| 324 |
|
| 325 |
+
Preprocessing models improve accuracy on rotated or distorted documents:
|
| 326 |
|
| 327 |
+
```python
|
| 328 |
+
from rapidocr_onnxruntime import RapidOCR
|
| 329 |
|
| 330 |
+
# Complete pipeline with preprocessing
|
| 331 |
+
ocr = RapidOCR(
|
| 332 |
+
det_model_path="detection/v5/det.onnx",
|
| 333 |
+
rec_model_path="languages/english/rec.onnx",
|
| 334 |
+
rec_keys_path="languages/english/dict.txt",
|
| 335 |
+
# Optional preprocessing
|
| 336 |
+
use_angle_cls=True,
|
| 337 |
+
angle_cls_model_path="preprocessing/textline-orientation/PP-LCNet_x1_0_textline_ori.onnx"
|
| 338 |
+
)
|
| 339 |
+
|
| 340 |
+
result, elapsed = ocr("rotated_document.jpg")
|
| 341 |
+
```
|
| 342 |
|
| 343 |
+
**When to use preprocessing**:
|
| 344 |
+
- **Document Orientation** (`doc-orientation/`): Scanned documents with unknown rotation (0°/90°/180°/270°)
|
| 345 |
+
- **Text Line Orientation** (`textline-orientation/`): Upside-down text lines (0°/180°)
|
| 346 |
+
- **Document Unwarping** (`doc-unwarping/`): Curved pages, warped documents, camera photos
|
| 347 |
|
| 348 |
+
**Performance impact**: +10-30% accuracy on distorted images, minimal speed overhead.
|
| 349 |
|
| 350 |
+
</details>
|
| 351 |
|
| 352 |
---
|
| 353 |
|
| 354 |
+
## Repository Structure
|
| 355 |
|
| 356 |
+
```
.
├── detection/
│   ├── v5/
│   │   ├── det.onnx              # 84 MB - PP-OCRv5 detection
│   │   └── config.json
│   └── v3/
│       ├── det.onnx              # 2.3 MB - PP-OCRv3 detection
│       └── config.json
│
├── languages/
│   ├── english/
│   │   ├── rec.onnx              # 7.5 MB
│   │   ├── dict.txt
│   │   └── config.json
│   ├── latin/                    # 32 languages
│   ├── eslav/                    # Russian, Bulgarian, Ukrainian, Belarusian
│   ├── korean/
│   ├── chinese/                  # Chinese, Japanese
│   ├── thai/
│   ├── greek/
│   ├── hindi/                    # Hindi, Marathi, Nepali, Sanskrit (v3)
│   ├── arabic/                   # Arabic, Urdu, Persian (v3)
│   ├── tamil/                    # Tamil (v3)
│   └── telugu/                   # Telugu (v3)
│
└── preprocessing/
    ├── doc-orientation/
    ├── textline-orientation/
    └── doc-unwarping/
```
|
| 387 |
|
| 388 |
---
|
| 389 |
|
| 390 |
+
## Model Selection
|
| 391 |
+
|
| 392 |
+
| Document Language | Model Path |
|
| 393 |
+
|-------------------|------------|
|
| 394 |
+
| English | `languages/english/` |
|
| 395 |
+
| French, German, Spanish, Italian, Portuguese | `languages/latin/` |
|
| 396 |
+
| Russian, Bulgarian, Ukrainian, Belarusian | `languages/eslav/` |
|
| 397 |
+
| Korean | `languages/korean/` |
|
| 398 |
+
| Chinese, Japanese | `languages/chinese/` |
|
| 399 |
+
| Thai | `languages/thai/` |
|
| 400 |
+
| Greek | `languages/greek/` |
|
| 401 |
+
| Hindi, Marathi, Nepali, Sanskrit | `languages/hindi/` + `detection/v3/` |
|
| 402 |
+
| Arabic, Urdu, Persian/Farsi | `languages/arabic/` + `detection/v3/` |
|
| 403 |
+
| Tamil | `languages/tamil/` + `detection/v3/` |
|
| 404 |
+
| Telugu | `languages/telugu/` + `detection/v3/` |
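
The table above can be expressed as a small lookup that always pairs a recognition model with the matching detection version. This is a sketch only: `model_paths`, `V5_GROUPS`, and `V3_GROUPS` are hypothetical names, and the paths assume the repository layout described in this README.

```python
from pathlib import Path
from typing import Dict

# Language groups shipped as PP-OCRv5 and PP-OCRv3 models, per the tables above.
V5_GROUPS = {"english", "latin", "eslav", "korean", "chinese", "thai", "greek"}
V3_GROUPS = {"hindi", "arabic", "tamil", "telugu"}


def model_paths(group: str, root: str = ".") -> Dict[str, str]:
    """Map a language group to matching detection, recognition, and dictionary paths."""
    if group in V5_GROUPS:
        version = "v5"
    elif group in V3_GROUPS:
        version = "v3"
    else:
        raise ValueError(f"unknown language group: {group!r}")
    base = Path(root)
    return {
        "det_model_path": (base / "detection" / version / "det.onnx").as_posix(),
        "rec_model_path": (base / "languages" / group / "rec.onnx").as_posix(),
        "rec_keys_path": (base / "languages" / group / "dict.txt").as_posix(),
    }


print(model_paths("arabic"))
```

The returned dictionary uses the same keyword names as the usage examples, so it could be passed along as `RapidOCR(**model_paths("latin"))`.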
|
| 405 |
|
| 406 |
---
|
| 407 |
|
| 408 |
+
## Technical Specifications
|
| 409 |
|
| 410 |
+
- **Framework**: PaddleOCR → ONNX
|
| 411 |
+
- **ONNX Opset**: 11
|
| 412 |
+
- **Precision**: FP32
|
| 413 |
+
- **Input Format**: RGB images (dynamic size)
|
| 414 |
+
- **Inference**: CPU/GPU via onnxruntime
|
| 415 |
|
| 416 |
+
### Detection Model
|
| 417 |
+
- **Input**: `(batch, 3, height, width)` - dynamic
|
| 418 |
+
- **Output**: Text bounding boxes
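
Although the detection input is dynamic, detectors of this kind are usually fed images whose sides are capped and rounded to a multiple of the network stride. The sketch below illustrates that resizing arithmetic. The cap of 960 and the stride of 32 are assumptions drawn from common PaddleOCR-style preprocessing, not values stated in this README, and `det_input_size` is a hypothetical helper (RapidOCR performs its own resizing internally).

```python
from typing import Tuple


def det_input_size(height: int, width: int,
                   limit_side_len: int = 960, stride: int = 32) -> Tuple[int, int]:
    """Return (height, width) after capping the longest side and snapping to the stride."""
    longest = max(height, width)
    # Only shrink; images already within the cap keep their scale.
    ratio = limit_side_len / longest if longest > limit_side_len else 1.0

    def snap(value: int) -> int:
        # Round to the nearest stride multiple, never below one stride.
        return max(int(round(value * ratio / stride)) * stride, stride)

    return snap(height), snap(width)


print(det_input_size(1000, 2000))
```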
|
| 419 |
+
|
| 420 |
+
### Recognition Model
|
| 421 |
+
- **Input**: `(batch, 3, 32, width)` - height fixed at 32px
|
| 422 |
+
- **Output**: CTC logits → decoded with dictionary
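
To make the decoding step concrete, here is a minimal greedy CTC decoder in plain Python. It assumes the blank class sits at index 0 and that dictionary entry `i` corresponds to class `i + 1`, which is a common convention but is not confirmed by this README. `ctc_greedy_decode` is a hypothetical name, and RapidOCR already performs this step for you.

```python
from typing import List, Sequence

BLANK = 0  # assumed index of the CTC blank class


def ctc_greedy_decode(logits: Sequence[Sequence[float]], charset: Sequence[str]) -> str:
    """Pick the best class per time step, merge repeats, and drop blanks."""
    chars: List[str] = []
    previous = BLANK
    for step in logits:
        best = max(range(len(step)), key=step.__getitem__)  # argmax over classes
        if best != BLANK and best != previous:
            chars.append(charset[best - 1])  # class i + 1 maps to charset[i]
        previous = best
    return "".join(chars)


def one_hot(index: int, size: int = 3) -> List[float]:
    """Build a toy score row with a single winning class."""
    return [1.0 if i == index else 0.0 for i in range(size)]


# Class sequence 1,1,0,1,2,2,0 over the toy charset ["a", "b"].
toy_logits = [one_hot(i) for i in (1, 1, 0, 1, 2, 2, 0)]
print(ctc_greedy_decode(toy_logits, ["a", "b"]))
```

The blank between the two runs of class 1 is what allows a doubled letter to survive the repeat-merging step.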
|
| 423 |
|
| 424 |
+
---
|
| 425 |
|
| 426 |
+
## Performance
|
| 427 |
|
| 428 |
+
### Accuracy (PP-OCRv5)
|
| 429 |
|
| 430 |
+
| Model | Accuracy | Dataset |
|
| 431 |
+
|-------|----------|---------|
|
| 432 |
+
| Greek | 89.28% | 2,799 images |
|
| 433 |
+
| Korean | 88.0% | 5,007 images |
|
| 434 |
+
| English | 85.25% | 6,530 images |
|
| 435 |
+
| Latin | 84.7% | 3,111 images |
|
| 436 |
+
| Thai | 82.68% | 4,261 images |
|
| 437 |
+
| East Slavic | 81.6% | 7,031 images |
|
| 438 |
|
| 439 |
---
|
| 440 |
|
| 441 |
+
## FAQ
|
| 442 |
|
| 443 |
+
**Q: Which version should I use?**
|
| 444 |
+
A: Use PP-OCRv5 models for best accuracy. Use PP-OCRv3 only for the South Asian and Middle Eastern languages that v5 does not cover (the Hindi, Arabic, Tamil, and Telugu groups).
|
| 445 |
|
| 446 |
+
**Q: Can I mix v5 and v3 models?**
|
| 447 |
+
A: No. Use `detection/v5/det.onnx` with v5 recognition models, and `detection/v3/det.onnx` with v3 recognition models.
|
| 448 |
|
| 449 |
+
**Q: GPU acceleration?**
|
| 450 |
+
A: Install `onnxruntime-gpu` instead of `onnxruntime`. Inference can be up to about 10x faster, depending on the GPU and model.
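
Whether the GPU is actually used depends on which execution providers the installed runtime exposes. The sketch below shows one way to choose a provider list with a CPU fallback. `pick_providers` is a hypothetical helper; the provider names are the standard ONNX Runtime identifiers.

```python
from typing import List, Sequence


def pick_providers(available: Sequence[str]) -> List[str]:
    """Prefer CUDA when present, always falling back to CPU."""
    preferred = ["CUDAExecutionProvider", "CPUExecutionProvider"]
    chosen = [name for name in preferred if name in available]
    return chosen or ["CPUExecutionProvider"]


# Example with a CPU-only runtime.
print(pick_providers(["CPUExecutionProvider"]))
```

With ONNX Runtime installed, the `available` list would come from `onnxruntime.get_available_providers()`, and the result can be passed as the `providers` argument of `onnxruntime.InferenceSession`. How RapidOCR itself selects a provider varies by version, so check its configuration options.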
|
| 451 |
|
| 452 |
+
**Q: Commercial use?**
|
| 453 |
+
A: Yes. Apache 2.0 license allows commercial use.
|
| 454 |
|
| 455 |
---
|
| 456 |
|
| 457 |
+
## Credits
|
| 458 |
|
| 459 |
+
- **Original Models**: [PaddlePaddle Team](https://github.com/PaddlePaddle/PaddleOCR)
|
| 460 |
+
- **Conversion**: [paddle2onnx](https://github.com/PaddlePaddle/Paddle2ONNX)
|
| 461 |
+
- **Source**: [PP-OCRv5 Collection](https://huggingface.co/collections/PaddlePaddle/pp-ocrv5-684a5356aef5b4b1d7b85e4b)
|
| 462 |
|
| 463 |
---
|
| 464 |
|
| 465 |
+
## Links
|
| 466 |
|
| 467 |
+
- [PaddleOCR GitHub](https://github.com/PaddlePaddle/PaddleOCR)
|
| 468 |
+
- [PaddleOCR Documentation](https://paddlepaddle.github.io/PaddleOCR/)
|
| 469 |
+
- [ONNX Runtime](https://onnxruntime.ai/)
|
| 470 |
+
- [monkt.com](https://monkt.com) - Document processing pipeline
|
| 471 |
|
| 472 |
---
|
| 473 |
|
| 474 |
+
**License**: Apache 2.0
|