Commit 7b02d0a by s-emanuilov (verified)
1 parent: e1943cc

Update README.md

  1. README.md +351 -264
README.md CHANGED
@@ -1,387 +1,474 @@
1
- # PP-OCRv5 ONNX Models
2
 
3
- Fast and accurate multilingual OCR models from PaddleOCR, converted to ONNX format for easy deployment.
4
 
5
- **Original Models**: [PaddlePaddle PP-OCRv5 Collection](https://huggingface.co/collections/PaddlePaddle/pp-ocrv5-684a5356aef5b4b1d7b85e4b)
6
- **Converted by**: Community contribution
7
  **Format**: ONNX (optimized for inference)
8
  **License**: Apache 2.0
9
 
10
  ---
11
 
12
- ## 🎯 What's Inside
13
 
14
- This repository contains **11 production-ready ONNX models**:
15
-
16
- - **1 Detection Model** - Finds text in images (works with all languages)
17
- - **7 Recognition Models** - Reads text in 39+ languages
18
- - **3 Preprocessing Models** - Fixes rotated or distorted documents (optional)
19
-
20
- **Total Size**: ~258 MB
21
- **Languages**: English, French, German, Spanish, Italian, Portuguese, Russian, Ukrainian, Korean, Chinese, Japanese, Thai, Greek, and 25+ more!
22
 
23
  ---
24
 
25
- ## 🚀 Quick Start
26
 
27
- ### Installation
28
 
29
  ```bash
30
- pip install rapidocr-onnxruntime
31
  ```
32
 
33
- That's it! No PaddlePaddle, no CUDA required. Works on CPU out of the box.
34
-
35
- ### Basic Usage - English
36
 
37
  ```python
 
 
 
 
 
 
 
 
38
  from rapidocr_onnxruntime import RapidOCR
 
 
 
39
 
40
- # Initialize OCR
41
- ocr = RapidOCR(
42
- det_model_path="detection/PP-OCRv5_server_det.onnx",
43
- rec_model_path="english/en_PP-OCRv5_mobile_rec.onnx",
44
- rec_keys_path="english/ppocrv5_en_dict.txt"
45
- )
46
 
47
- # Run OCR
48
- result, elapsed = ocr("your_image.jpg")
49
 
50
- # Print results
51
- for line in result:
52
-     text = line[1][0]  # Extracted text
53
-     confidence = line[1][1]  # Confidence score
54
-     print(f"{text} (confidence: {confidence:.2%})")
 
 
 
55
  ```
56
 
57
- ### Other Languages
58
 
59
- Just change the model paths:
 
60
 
61
- ```python
62
- # French, German, Spanish, Italian, etc. (32 languages)
63
- ocr = RapidOCR(
64
- det_model_path="detection/PP-OCRv5_server_det.onnx",
65
- rec_model_path="latin/latin_PP-OCRv5_mobile_rec.onnx",
66
- rec_keys_path="latin/ppocrv5_latin_dict.txt"
67
- )
68
 
69
- # Russian, Bulgarian, Ukrainian, Belarusian
70
- ocr = RapidOCR(
71
- det_model_path="detection/PP-OCRv5_server_det.onnx",
72
- rec_model_path="eslav/eslav_PP-OCRv5_mobile_rec.onnx",
73
- rec_keys_path="eslav/ppocrv5_eslav_dict.txt"
74
- )
75
 
76
- # Korean
77
- ocr = RapidOCR(
78
- det_model_path="detection/PP-OCRv5_server_det.onnx",
79
- rec_model_path="korean/korean_PP-OCRv5_mobile_rec.onnx",
80
- rec_keys_path="korean/ppocrv5_korean_dict.txt"
81
- )
82
 
83
- # Chinese / Japanese
84
- ocr = RapidOCR(
85
- det_model_path="detection/PP-OCRv5_server_det.onnx",
86
- rec_model_path="chinese/PP-OCRv5_server_rec.onnx",
87
- rec_keys_path="chinese/ppocrv5_dict.txt"
88
- )
89
 
90
- # Thai
91
  ocr = RapidOCR(
92
- det_model_path="detection/PP-OCRv5_server_det.onnx",
93
- rec_model_path="thai/th_PP-OCRv5_mobile_rec.onnx",
94
- rec_keys_path="thai/ppocrv5_th_dict.txt"
95
  )
96
 
97
- # Greek
98
- ocr = RapidOCR(
99
- det_model_path="detection/PP-OCRv5_server_det.onnx",
100
- rec_model_path="greek/el_PP-OCRv5_mobile_rec.onnx",
101
- rec_keys_path="greek/ppocrv5_el_dict.txt"
102
- )
103
  ```
104
 
105
  ---
106
 
107
- ## 📦 Available Models
108
 
109
- ### Text Recognition Models
110
 
111
- | Model | Languages | Accuracy | Size | Best For |
112
- |-------|-----------|----------|------|----------|
113
- | **english/** | English | 85.25% | 7.5 MB | English documents |
114
- | **latin/** | French, German, Spanish, Italian, Portuguese, Dutch, Polish, Czech, + 24 more | 84.7% | 7.5 MB | European documents |
115
- | **eslav/** | Russian, Bulgarian, Ukrainian, Belarusian, English | 81.6% | 7.5 MB | Cyrillic scripts |
116
- | **korean/** | Korean, English | 88.0% | 13 MB | Korean documents |
117
- | **chinese/** | Chinese, Japanese, English | - | 81 MB | CJK documents |
118
- | **thai/** | Thai, English | 82.68% | 7.5 MB | Thai documents |
119
- | **greek/** | Greek, English | 89.28% | 7.4 MB | Greek documents |
120
 
121
- ### Detection Model
122
 
123
- - **detection/** - Universal text detection (84 MB) - Works with all languages
 
 
 
 
 
124
 
125
- ### Preprocessing Models (Optional)
126
 
127
- Enhance OCR accuracy on challenging documents:
 
 
 
128
 
129
- - **preprocessing/doc-orientation/** - Fixes rotated documents (6.5 MB, 99.06% accuracy)
130
- - **preprocessing/textline-orientation/** - Fixes upside-down text (6.5 MB, 98.85% accuracy)
131
- - **preprocessing/doc-unwarping/** - Fixes curved/warped pages (30 MB)
 
 
 
 
 
 
132
 
133
  ---
134
 
135
- ## 🌍 Supported Languages (39+)
136
 
137
- ### Latin Model (32 languages)
138
- English • French • German • Spanish • Italian • Portuguese • Dutch • Polish • Czech • Slovak • Croatian • Bosnian • Serbian (Latin) • Slovenian • Danish • Norwegian • Swedish • Icelandic • Estonian • Lithuanian • Hungarian • Albanian • Welsh • Irish • Turkish • Indonesian • Malay • Afrikaans • Swahili • Tagalog • Uzbek • Latin
139
 
140
- ### Other Models
141
- - **English** - English (optimized)
142
- - **East Slavic** - Russian • Bulgarian • Ukrainian • Belarusian
143
- - **Korean** - Korean
144
- - **Chinese/Japanese** - Simplified Chinese • Traditional Chinese • Pinyin • Japanese (Hiragana, Katakana, Kanji)
145
- - **Thai** - Thai
146
- - **Greek** - Greek
147
 
148
- ---
149
 
150
- ## 📁 Repository Structure
151
 
152
- ```
153
- .
154
- ├── detection/                      # Text detection (84 MB)
155
- │   ├── PP-OCRv5_server_det.onnx
156
- │   └── config.json
157
- │
158
- ├── english/                        # English (7.5 MB)
159
- │   ├── en_PP-OCRv5_mobile_rec.onnx
160
- │   ├── ppocrv5_en_dict.txt
161
- │   └── config.json
162
- │
163
- ├── latin/                          # 32 languages (7.5 MB)
164
- │   ├── latin_PP-OCRv5_mobile_rec.onnx
165
- │   ├── ppocrv5_latin_dict.txt
166
- │   └── config.json
167
- │
168
- ├── eslav/                          # Russian/Ukrainian (7.5 MB)
169
- │   ├── eslav_PP-OCRv5_mobile_rec.onnx
170
- │   ├── ppocrv5_eslav_dict.txt
171
- │   └── config.json
172
- │
173
- ├── korean/                         # Korean (13 MB)
174
- │   ├── korean_PP-OCRv5_mobile_rec.onnx
175
- │   ├── ppocrv5_korean_dict.txt
176
- │   └── config.json
177
- │
178
- ├── chinese/                        # Chinese/Japanese (81 MB)
179
- │   ├── PP-OCRv5_server_rec.onnx
180
- │   ├── ppocrv5_dict.txt
181
- │   └── config.json
182
- │
183
- ├── thai/                           # Thai (7.5 MB)
184
- │   ├── th_PP-OCRv5_mobile_rec.onnx
185
- │   ├── ppocrv5_th_dict.txt
186
- │   └── config.json
187
- │
188
- ├── greek/                          # Greek (7.4 MB)
189
- │   ├── el_PP-OCRv5_mobile_rec.onnx
190
- │   ├── ppocrv5_el_dict.txt
191
- │   └── config.json
192
- │
193
- └── preprocessing/                  # Optional (43 MB)
194
-     ├── doc-orientation/
195
-     ├── textline-orientation/
196
-     └── doc-unwarping/
197
- ```
198
 
199
- Each model directory contains:
200
- - **`.onnx`** - The model file
201
- - **`.txt`** - Character dictionary
202
- - **`config.json`** - Model metadata
 
203
 
204
  ---
205
 
206
- ## 💡 Why Use These Models?
 
 
 
207
 
208
- ### ✅ Advantages
 
209
 
210
- 1. **ONNX Format** - Fast inference, works on any platform (CPU/GPU)
211
- 2. **No PaddlePaddle Required** - Just install `rapidocr-onnxruntime`
212
- 3. **39+ Languages** - Multilingual support out of the box
213
- 4. **Production Ready** - All models tested and validated
214
- 5. **Complete Package** - Detection + Recognition + Dictionaries included
215
- 6. **Well Documented** - Every model has detailed config and usage info
216
 
217
- ### 📊 Performance
 
 
 
 
 
218
 
219
- - **Speed**: Fast inference on CPU (~100-300ms per image)
220
- - **Accuracy**: 30% improvement over PP-OCRv3
221
- - **Size**: Compact models (7-84 MB each)
 
 
 
222
 
223
- ---
 
 
 
 
 
224
 
225
- ## 🛠️ Advanced Usage
 
 
 
 
 
226
 
227
- ### With GPU Acceleration
 
 
 
 
 
228
 
229
- ```bash
230
- pip install onnxruntime-gpu
 
 
 
 
231
  ```
232
 
233
- Models will automatically use GPU if available for 10x faster inference.
234
 
235
- ### Batch Processing
 
236
 
237
  ```python
238
  from rapidocr_onnxruntime import RapidOCR
239
- import glob
240
 
 
241
  ocr = RapidOCR(
242
- det_model_path="detection/PP-OCRv5_server_det.onnx",
243
- rec_model_path="latin/latin_PP-OCRv5_mobile_rec.onnx",
244
- rec_keys_path="latin/ppocrv5_latin_dict.txt"
245
  )
246
 
247
- # Process all images in a folder
248
- for image_path in glob.glob("documents/*.jpg"):
249
-     result, elapsed = ocr(image_path)
250
-     print(f"Processed {image_path} in {elapsed:.2f}s")
251
-     for line in result:
252
-         print(f" {line[1][0]}")
253
- ```
254
 
255
- ### With Preprocessing (for rotated/distorted documents)
 
 
 
 
 
256
 
257
- ```python
258
- # Enable angle classification for rotated text
259
  ocr = RapidOCR(
260
- det_model_path="detection/PP-OCRv5_server_det.onnx",
261
- rec_model_path="english/en_PP-OCRv5_mobile_rec.onnx",
262
- rec_keys_path="english/ppocrv5_en_dict.txt",
263
- use_angle_cls=True,
264
- angle_cls_model_path="preprocessing/textline-orientation/PP-LCNet_x1_0_textline_ori.onnx"
265
  )
266
  ```
267
 
 
 
268
  ---
269
 
270
- ## 📖 Model Details
271
 
272
- ### How It Works
 
273
 
274
- 1. **Detection** - Finds all text regions in the image
275
- 2. **Recognition** - Reads text from each region using language-specific model
276
- 3. **Decoding** - Converts model output to text using character dictionary
277
 
278
- ### Model Specifications
 
279
 
280
- - **Framework**: Converted from PaddlePaddle to ONNX
281
- - **ONNX Opset**: 11
282
- - **Precision**: FP32
283
- - **Input**: RGB images (dynamic size)
284
- - **Output**: Text + confidence scores + bounding boxes
 
 
 
 
 
 
 
285
 
286
- ### Accuracy Benchmarks
 
 
 
287
 
288
- Tested on official PP-OCRv5 datasets:
289
 
290
- - Greek: 89.28%
291
- - Korean: 88.0%
292
- - English: 85.25%
293
- - Latin: 84.7%
294
- - Thai: 82.68%
295
- - East Slavic: 81.6%
296
 
297
  ---
298
 
299
- ## 🎯 Use Cases
300
 
301
- - **Document Digitization** - Scan and extract text from documents
302
- - **Multilingual OCR** - Process documents in 39+ languages
303
- - **Mobile Apps** - Lightweight models perfect for mobile deployment
304
- - **Batch Processing** - Process thousands of documents efficiently
305
- - **Real-time OCR** - Fast enough for real-time applications
306
- - **Custom Pipelines** - Integrate into your existing workflows
307
 
308
  ---
309
 
310
- ## 📝 Language Selection Guide
311
-
312
- | Your Document | Use This Model |
313
- |---------------|----------------|
314
- | English only | `english/` |
315
- | French, German, Spanish, Italian, etc. | `latin/` (best choice for European languages) |
316
- | Russian, Bulgarian, Ukrainian, Belarusian | `eslav/` |
317
- | Korean | `korean/` |
318
- | Chinese or Japanese | `chinese/` |
319
- | Thai | `thai/` |
320
- | Greek | `greek/` |
321
- | Mixed European languages | `latin/` (supports 32 languages!) |
322
-
323
- **Pro Tip**: The `latin/` model is the most versatile - it handles 32 different languages!
 
324
 
325
  ---
326
 
327
- ## ❓ FAQ
328
 
329
- **Q: Do I need PaddlePaddle installed?**
330
- A: No! These are ONNX models. Just install `rapidocr-onnxruntime`.
 
 
 
331
 
332
- **Q: Can I use GPU?**
333
- A: Yes! Install `onnxruntime-gpu` instead of `onnxruntime`.
 
 
 
 
 
334
 
335
- **Q: Which model should I use for French?**
336
- A: Use the `latin/` model - it supports French and 31 other languages.
337
 
338
- **Q: Are these models free to use?**
339
- A: Yes! Licensed under Apache 2.0.
340
 
341
- **Q: How accurate are these models?**
342
- A: Very accurate! PP-OCRv5 has 30% better accuracy than PP-OCRv3.
343
 
344
- **Q: Can I use these commercially?**
345
- A: Yes! Apache 2.0 license allows commercial use.
 
 
 
 
 
 
346
 
347
  ---
348
 
349
- ## 🔗 Links
350
 
351
- - **Original Models**: [PaddlePaddle PP-OCRv5 Collection](https://huggingface.co/collections/PaddlePaddle/pp-ocrv5-684a5356aef5b4b1d7b85e4b)
352
- - **PaddleOCR GitHub**: [github.com/PaddlePaddle/PaddleOCR](https://github.com/PaddlePaddle/PaddleOCR)
353
- - **Documentation**: [PaddleOCR Docs](https://paddlepaddle.github.io/PaddleOCR/)
354
- - **RapidOCR**: [github.com/RapidAI/RapidOCR](https://github.com/RapidAI/RapidOCR)
355
- - **ONNX Runtime**: [onnxruntime.ai](https://onnxruntime.ai/)
356
 
357
- ---
 
358
 
359
- ## 🙏 Credits
 
360
 
361
- - **Original Models**: [PaddlePaddle Team](https://github.com/PaddlePaddle/PaddleOCR)
362
- - **Conversion**: Community contribution using [paddle2onnx](https://github.com/PaddlePaddle/Paddle2ONNX)
363
- - **Based on**: [PP-OCRv5 Official Collection](https://huggingface.co/collections/PaddlePaddle/pp-ocrv5-684a5356aef5b4b1d7b85e4b)
364
 
365
  ---
366
 
367
- ## 📄 License
368
-
369
- Apache License 2.0 (inherited from PaddleOCR)
370
 
371
- You are free to:
372
- - ✅ Use commercially
373
- - ✅ Modify
374
- - ✅ Distribute
375
- - ✅ Use privately
376
 
377
  ---
378
 
379
- ## 🐛 Issues & Support
380
 
381
- For issues with:
382
- - **These ONNX models**: Open an issue in this repository
383
- - **Original PaddleOCR models**: [PaddleOCR Issues](https://github.com/PaddlePaddle/PaddleOCR/issues)
384
- - **ONNX Runtime**: [onnxruntime Issues](https://github.com/microsoft/onnxruntime/issues)
385
 
386
  ---
387
 
 
 
1
+ ---
2
+ license: apache-2.0
3
+ language:
4
+ - en
5
+ - fr
6
+ - de
7
+ - es
8
+ - it
9
+ - pt
10
+ - nl
11
+ - pl
12
+ - cs
13
+ - sk
14
+ - hr
15
+ - bs
16
+ - sr
17
+ - sl
18
+ - da
19
+ - "no"
20
+ - sv
21
+ - is
22
+ - et
23
+ - lt
24
+ - hu
25
+ - sq
26
+ - cy
27
+ - ga
28
+ - tr
29
+ - id
30
+ - ms
31
+ - af
32
+ - sw
33
+ - tl
34
+ - uz
35
+ - la
36
+ - ru
37
+ - bg
38
+ - uk
39
+ - be
40
+ - ko
41
+ - zh
42
+ - ja
43
+ - th
44
+ - el
45
+ - hi
46
+ - mr
47
+ - ne
48
+ - sa
49
+ - ar
50
+ - ur
51
+ - fa
52
+ - ta
53
+ - te
54
+ tags:
55
+ - ocr
56
+ - optical-character-recognition
57
+ - text-detection
58
+ - text-recognition
59
+ - paddleocr
60
+ - onnx
61
+ - computer-vision
62
+ - document-ai
63
+ library_name: onnx
64
+ pipeline_tag: image-to-text
65
+ ---
66
+
67
+ # PP-OCR ONNX Models
68
+
69
+ Multilingual OCR models from PaddleOCR, converted to ONNX format for production deployment.
70
 
71
+ **Use as a complete pipeline**: Integrate with [monkt.com](https://monkt.com) for end-to-end document processing.
72
 
73
+ **Source**: [PaddlePaddle PP-OCRv5 Collection](https://huggingface.co/collections/PaddlePaddle/pp-ocrv5-684a5356aef5b4b1d7b85e4b)
 
74
  **Format**: ONNX (optimized for inference)
75
  **License**: Apache 2.0
76
 
77
  ---
78
 
79
+ ## Overview
80
 
81
+ **16 models** covering **48+ languages**:
82
+ - 11 PP-OCRv5 models (latest, highest accuracy)
83
+ - 5 PP-OCRv3 models (legacy, additional language support)
 
 
 
 
 
84
 
85
  ---
86
 
87
+ ## Quick Start
88
 
89
+ ### Download from HuggingFace
90
 
91
  ```bash
92
+ pip install huggingface_hub rapidocr-onnxruntime
93
  ```
94
 
95
+ <details>
96
+ <summary><b>Download specific language models</b></summary>
 
97
 
98
  ```python
99
+ from huggingface_hub import hf_hub_download
100
+
101
+ # Download English models
102
+ det_path = hf_hub_download("monkt/paddleocr-onnx", "detection/v5/det.onnx")
103
+ rec_path = hf_hub_download("monkt/paddleocr-onnx", "languages/english/rec.onnx")
104
+ dict_path = hf_hub_download("monkt/paddleocr-onnx", "languages/english/dict.txt")
105
+
106
+ # Use with RapidOCR
107
  from rapidocr_onnxruntime import RapidOCR
108
+ ocr = RapidOCR(det_model_path=det_path, rec_model_path=rec_path, rec_keys_path=dict_path)
109
+ result, elapsed = ocr("document.jpg")
110
+ ```
111
 
112
+ </details>
 
 
 
 
 
113
 
114
+ <details>
115
+ <summary><b>Download entire language folder</b></summary>
116
 
117
+ ```python
118
+ from huggingface_hub import snapshot_download
119
+
120
+ # Download all French/German/Spanish (Latin) models
121
+ snapshot_download("monkt/paddleocr-onnx", allow_patterns=["detection/v5/*", "languages/latin/*"])
122
+
123
+ # Download Arabic models (v3)
124
+ snapshot_download("monkt/paddleocr-onnx", allow_patterns=["detection/v3/*", "languages/arabic/*"])
125
  ```
126
 
127
+ </details>
128
 
129
+ <details>
130
+ <summary><b>Clone entire repository</b></summary>
131
 
132
+ ```bash
133
+ git clone https://huggingface.co/monkt/paddleocr-onnx
134
+ cd paddleocr-onnx
135
+ ```
 
 
 
136
 
137
+ </details>
 
 
 
 
 
138
 
139
+ ### Basic Usage
 
 
 
 
 
140
 
141
+ ```python
142
+ from rapidocr_onnxruntime import RapidOCR
 
 
 
 
143
 
 
144
  ocr = RapidOCR(
145
+ det_model_path="detection/v5/det.onnx",
146
+ rec_model_path="languages/english/rec.onnx",
147
+ rec_keys_path="languages/english/dict.txt"
148
  )
149
 
150
+ result, elapsed = ocr("document.jpg")
151
+ for line in result:
152
+     print(line[1][0])  # Extracted text
 
 
 
153
  ```
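The loop above reads the text from `line[1][0]`. The layout of each entry is not documented in this README, so the following is a minimal defensive sketch that assumes an entry is either `[box, text, score]` or `[box, (text, score)]` depending on the installed RapidOCR version, and that `result` may be `None` when no text is found. The helper name `iter_text` and the sample data are hypothetical.

```python
from typing import Any, Iterator, Optional, Sequence, Tuple


def iter_text(result: Optional[Sequence[Any]]) -> Iterator[Tuple[str, float]]:
    """Yield (text, score) pairs from an OCR result list.

    Assumption: each entry is [box, text, score] or [box, (text, score)].
    """
    if not result:  # None or empty list: nothing was recognized
        return
    for entry in result:
        second = entry[1]
        if isinstance(second, str):
            # Flat layout: [box, text, score]
            yield second, float(entry[2])
        else:
            # Nested layout: [box, (text, score)]
            yield second[0], float(second[1])


# Stand-in data; real entries would come from `result, elapsed = ocr(...)`.
fake_result = [
    [[[0, 0], [10, 0], [10, 5], [0, 5]], "Hello", 0.98],
    [[[0, 6], [10, 6], [10, 11], [0, 11]], ("World", 0.91)],
]
for text, score in iter_text(fake_result):
    print(f"{text} ({score:.2%})")
```

Checking the entry shape once, in one place, keeps the rest of a pipeline independent of which layout the installed version returns.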
154
 
155
  ---
156
 
157
+ ## Available Models
158
 
159
+ ### PP-OCRv5 Recognition Models
160
 
161
+ | Language Group | Path | Languages | Accuracy | Size |
162
+ |----------------|------|-----------|----------|------|
163
+ | English | `languages/english/` | English | 85.25% | 7.5 MB |
164
+ | Latin | `languages/latin/` | French, German, Spanish, Italian, Portuguese, + 27 more | 84.7% | 7.5 MB |
165
+ | East Slavic | `languages/eslav/` | Russian, Bulgarian, Ukrainian, Belarusian | 81.6% | 7.5 MB |
166
+ | Korean | `languages/korean/` | Korean | 88.0% | 13 MB |
167
+ | Chinese/Japanese | `languages/chinese/` | Chinese, Japanese | - | 81 MB |
168
+ | Thai | `languages/thai/` | Thai | 82.68% | 7.5 MB |
169
+ | Greek | `languages/greek/` | Greek | 89.28% | 7.4 MB |
170
 
171
+ ### PP-OCRv3 Recognition Models (Legacy)
172
 
173
+ | Language Group | Path | Languages | Version | Size |
174
+ |----------------|------|-----------|---------|------|
175
+ | Devanagari | `languages/hindi/` | Hindi, Marathi, Nepali, Sanskrit | v3 | 8.6 MB |
176
+ | Arabic | `languages/arabic/` | Arabic, Urdu, Persian/Farsi | v3 | 8.6 MB |
177
+ | Tamil | `languages/tamil/` | Tamil | v3 | 8.6 MB |
178
+ | Telugu | `languages/telugu/` | Telugu | v3 | 8.6 MB |
179
 
180
+ ### Detection Models
181
 
182
+ | Model | Path | Version | Size |
183
+ |-------|------|---------|------|
184
+ | PP-OCRv5 Detection | `detection/v5/det.onnx` | v5 | 84 MB |
185
+ | PP-OCRv3 Detection | `detection/v3/det.onnx` | v3 | 2.3 MB |
186
 
187
+ **Note**: Use v5 detection with v5 recognition models. Use v3 detection with v3 recognition models.
188
+
189
+ ### Preprocessing Models (Optional)
190
+
191
+ | Model | Path | Purpose | Accuracy | Size |
192
+ |-------|------|---------|----------|------|
193
+ | Document Orientation | `preprocessing/doc-orientation/` | Corrects rotated documents (0°, 90°, 180°, 270°) | 99.06% | 6.5 MB |
194
+ | Text Line Orientation | `preprocessing/textline-orientation/` | Corrects upside-down text (0°, 180°) | 98.85% | 6.5 MB |
195
+ | Document Unwarping | `preprocessing/doc-unwarping/` | Fixes curved/warped documents | - | 30 MB |
196
 
197
  ---
198
 
199
+ ## Language Support
200
 
201
+ ### PP-OCRv5 Languages (40+)
 
202
 
203
+ **Latin Script** (32 languages): English, French, German, Spanish, Italian, Portuguese, Dutch, Polish, Czech, Slovak, Croatian, Bosnian, Serbian, Slovenian, Danish, Norwegian, Swedish, Icelandic, Estonian, Lithuanian, Hungarian, Albanian, Welsh, Irish, Turkish, Indonesian, Malay, Afrikaans, Swahili, Tagalog, Uzbek, Latin
 
 
 
 
 
 
204
 
205
+ **Cyrillic**: Russian, Bulgarian, Ukrainian, Belarusian
206
 
207
+ **East Asian**: Chinese (Simplified, Traditional), Japanese (Hiragana, Katakana, Kanji), Korean
208
 
209
+ **Southeast Asian**: Thai
210
+
211
+ **Other**: Greek
212
 
213
+ ### PP-OCRv3 Languages (9)
214
+
215
+ **South Asian**: Hindi, Marathi, Nepali, Sanskrit, Tamil, Telugu
216
+
217
+ **Middle Eastern**: Arabic, Urdu, Persian/Farsi
218
 
219
  ---
220
 
221
+ ## Usage Examples
222
+
223
+ <details>
224
+ <summary><b>PP-OCRv5 Models (English, Latin, East Asian, etc.)</b></summary>
225
 
226
+ ```python
227
+ from rapidocr_onnxruntime import RapidOCR
228
 
229
+ # English
230
+ ocr = RapidOCR(
231
+ det_model_path="detection/v5/det.onnx",
232
+ rec_model_path="languages/english/rec.onnx",
233
+ rec_keys_path="languages/english/dict.txt"
234
+ )
235
 
236
+ # French, German, Spanish, etc. (32 languages)
237
+ ocr = RapidOCR(
238
+ det_model_path="detection/v5/det.onnx",
239
+ rec_model_path="languages/latin/rec.onnx",
240
+ rec_keys_path="languages/latin/dict.txt"
241
+ )
242
 
243
+ # Russian, Bulgarian, Ukrainian, Belarusian
244
+ ocr = RapidOCR(
245
+ det_model_path="detection/v5/det.onnx",
246
+ rec_model_path="languages/eslav/rec.onnx",
247
+ rec_keys_path="languages/eslav/dict.txt"
248
+ )
249
 
250
+ # Korean
251
+ ocr = RapidOCR(
252
+ det_model_path="detection/v5/det.onnx",
253
+ rec_model_path="languages/korean/rec.onnx",
254
+ rec_keys_path="languages/korean/dict.txt"
255
+ )
256
 
257
+ # Chinese/Japanese
258
+ ocr = RapidOCR(
259
+ det_model_path="detection/v5/det.onnx",
260
+ rec_model_path="languages/chinese/rec.onnx",
261
+ rec_keys_path="languages/chinese/dict.txt"
262
+ )
263
 
264
+ # Thai
265
+ ocr = RapidOCR(
266
+ det_model_path="detection/v5/det.onnx",
267
+ rec_model_path="languages/thai/rec.onnx",
268
+ rec_keys_path="languages/thai/dict.txt"
269
+ )
270
 
271
+ # Greek
272
+ ocr = RapidOCR(
273
+ det_model_path="detection/v5/det.onnx",
274
+ rec_model_path="languages/greek/rec.onnx",
275
+ rec_keys_path="languages/greek/dict.txt"
276
+ )
277
  ```
278
 
279
+ </details>
280
 
281
+ <details>
282
+ <summary><b>PP-OCRv3 Models (Hindi, Arabic, Tamil, Telugu)</b></summary>
283
 
284
  ```python
285
  from rapidocr_onnxruntime import RapidOCR
 
286
 
287
+ # Hindi, Marathi, Nepali, Sanskrit
288
  ocr = RapidOCR(
289
+ det_model_path="detection/v3/det.onnx",
290
+ rec_model_path="languages/hindi/rec.onnx",
291
+ rec_keys_path="languages/hindi/dict.txt"
292
  )
293
 
294
+ # Arabic, Urdu, Persian/Farsi
295
+ ocr = RapidOCR(
296
+ det_model_path="detection/v3/det.onnx",
297
+ rec_model_path="languages/arabic/rec.onnx",
298
+ rec_keys_path="languages/arabic/dict.txt"
299
+ )
 
300
 
301
+ # Tamil
302
+ ocr = RapidOCR(
303
+ det_model_path="detection/v3/det.onnx",
304
+ rec_model_path="languages/tamil/rec.onnx",
305
+ rec_keys_path="languages/tamil/dict.txt"
306
+ )
307
 
308
+ # Telugu
 
309
  ocr = RapidOCR(
310
+ det_model_path="detection/v3/det.onnx",
311
+ rec_model_path="languages/telugu/rec.onnx",
312
+ rec_keys_path="languages/telugu/dict.txt"
 
 
313
  )
314
  ```
315
 
316
+ </details>
317
+
318
  ---
319
 
320
+ ## Full Pipeline with Preprocessing
321
 
322
+ <details>
323
+ <summary><b>Optional preprocessing for rotated/distorted documents</b></summary>
324
 
325
+ Preprocessing models improve accuracy on rotated or distorted documents:
 
 
326
 
327
+ ```python
328
+ from rapidocr_onnxruntime import RapidOCR
329
 
330
+ # Complete pipeline with preprocessing
331
+ ocr = RapidOCR(
332
+ det_model_path="detection/v5/det.onnx",
333
+ rec_model_path="languages/english/rec.onnx",
334
+ rec_keys_path="languages/english/dict.txt",
335
+ # Optional preprocessing
336
+ use_angle_cls=True,
337
+ angle_cls_model_path="preprocessing/textline-orientation/PP-LCNet_x1_0_textline_ori.onnx"
338
+ )
339
+
340
+ result, elapsed = ocr("rotated_document.jpg")
341
+ ```
342
 
343
+ **When to use preprocessing**:
344
+ - **Document Orientation** (`doc-orientation/`): Scanned documents with unknown rotation (0°/90°/180°/270°)
345
+ - **Text Line Orientation** (`textline-orientation/`): Upside-down text lines (0°/180°)
346
+ - **Document Unwarping** (`doc-unwarping/`): Curved pages, warped documents, camera photos
347
 
348
+ **Performance impact**: +10-30% accuracy on distorted images, minimal speed overhead.
349
 
350
+ </details>
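The orientation models only predict an angle class; the image still has to be rotated back before detection and recognition. The following toy sketch shows that correction step on a nested-list "image". It assumes the predicted angle is the clockwise rotation that was applied to the page, which is a labeling convention not confirmed by this README, and the function names are hypothetical.

```python
from typing import List

Image = List[List[int]]  # toy image: rows of pixel values


def rotate_cw(image: Image) -> Image:
    """Rotate a row-major image 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


def undo_rotation(image: Image, angle: int) -> Image:
    """Undo a clockwise rotation of `angle` degrees (0, 90, 180 or 270)."""
    if angle not in (0, 90, 180, 270):
        raise ValueError(f"unsupported angle: {angle}")
    # Rotating clockwise by the remaining (360 - angle) degrees restores the original.
    for _ in range(((360 - angle) % 360) // 90):
        image = rotate_cw(image)
    return image


page = [[1, 2], [3, 4]]
scanned = rotate_cw(page)          # pretend the scan came in rotated by 90 degrees
print(undo_rotation(scanned, 90))  # restores the upright page
```

With real images the same idea would typically be a single array rotation call; the nested lists here only keep the example dependency-free.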
 
 
 
 
 
351
 
352
  ---
353
 
354
+ ## Repository Structure
355
 
356
+ ```
357
+ .
358
+ ├── detection/
359
+ │   ├── v5/
360
+ │   │   ├── det.onnx          # 84 MB - PP-OCRv5 detection
361
+ │   │   └── config.json
362
+ │   └── v3/
363
+ │       ├── det.onnx          # 2.3 MB - PP-OCRv3 detection
364
+ │       └── config.json
365
+ │
366
+ ├── languages/
367
+ │   ├── english/
368
+ │   │   ├── rec.onnx          # 7.5 MB
369
+ │   │   ├── dict.txt
370
+ │   │   └── config.json
371
+ │   ├── latin/                # 32 languages
372
+ │   ├── eslav/                # Russian, Bulgarian, Ukrainian, Belarusian
373
+ │   ├── korean/
374
+ │   ├── chinese/              # Chinese, Japanese
375
+ │   ├── thai/
376
+ │   ├── greek/
377
+ │   ├── hindi/                # Hindi, Marathi, Nepali, Sanskrit (v3)
378
+ │   ├── arabic/               # Arabic, Urdu, Persian (v3)
379
+ │   ├── tamil/                # Tamil (v3)
380
+ │   └── telugu/               # Telugu (v3)
381
+ │
382
+ └── preprocessing/
383
+     ├── doc-orientation/
384
+     ├── textline-orientation/
385
+     └── doc-unwarping/
386
+ ```
387
 
388
  ---
389
 
390
+ ## Model Selection
391
+
392
+ | Document Language | Model Path |
393
+ |-------------------|------------|
394
+ | English | `languages/english/` |
395
+ | French, German, Spanish, Italian, Portuguese | `languages/latin/` |
396
+ | Russian, Bulgarian, Ukrainian, Belarusian | `languages/eslav/` |
397
+ | Korean | `languages/korean/` |
398
+ | Chinese, Japanese | `languages/chinese/` |
399
+ | Thai | `languages/thai/` |
400
+ | Greek | `languages/greek/` |
401
+ | Hindi, Marathi, Nepali, Sanskrit | `languages/hindi/` + `detection/v3/` |
402
+ | Arabic, Urdu, Persian/Farsi | `languages/arabic/` + `detection/v3/` |
403
+ | Tamil | `languages/tamil/` + `detection/v3/` |
404
+ | Telugu | `languages/telugu/` + `detection/v3/` |
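The table above, together with the rule that detection and recognition versions must match, can be captured in a small lookup. This is a minimal sketch: the function name `model_paths` is hypothetical, and the folder names and file names (`det.onnx`, `rec.onnx`, `dict.txt`) are taken from the repository structure shown in this README.

```python
from pathlib import PurePosixPath
from typing import Dict

# Language folders listed in the table above, grouped by model generation.
V5_LANGUAGES = {"english", "latin", "eslav", "korean", "chinese", "thai", "greek"}
V3_LANGUAGES = {"hindi", "arabic", "tamil", "telugu"}


def model_paths(language: str) -> Dict[str, str]:
    """Return repo-relative paths for a language folder, pairing det and rec versions."""
    if language in V5_LANGUAGES:
        version = "v5"
    elif language in V3_LANGUAGES:
        version = "v3"
    else:
        raise ValueError(f"unknown language folder: {language!r}")
    lang_dir = PurePosixPath("languages") / language
    return {
        "det_model_path": str(PurePosixPath("detection") / version / "det.onnx"),
        "rec_model_path": str(lang_dir / "rec.onnx"),
        "rec_keys_path": str(lang_dir / "dict.txt"),
    }


print(model_paths("arabic"))
```

The dictionary keys mirror the keyword arguments used in the examples in this README, so from the repository root the result could be passed as `RapidOCR(**model_paths("latin"))`.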
405
 
406
  ---
407
 
408
+ ## Technical Specifications
409
 
410
+ - **Framework**: PaddleOCR → ONNX
411
+ - **ONNX Opset**: 11
412
+ - **Precision**: FP32
413
+ - **Input Format**: RGB images (dynamic size)
414
+ - **Inference**: CPU/GPU via onnxruntime
415
 
416
+ ### Detection Model
417
+ - **Input**: `(batch, 3, height, width)` - dynamic
418
+ - **Output**: Text bounding boxes
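A dynamic `(height, width)` does not mean every size is equally practical. A common convention for detectors of this kind, assumed here and not stated in this README, is to cap the longer side and round both sides to a multiple of 32 before building the input tensor. The function name and the defaults (`limit_side=960`, `stride=32`) are hypothetical.

```python
from typing import Tuple


def det_input_size(height: int, width: int, limit_side: int = 960, stride: int = 32) -> Tuple[int, int]:
    """Scale (height, width) so the longer side is at most limit_side,
    then round each side to the nearest multiple of stride (at least one stride)."""
    scale = min(1.0, limit_side / max(height, width))
    new_h = max(stride, round(height * scale / stride) * stride)
    new_w = max(stride, round(width * scale / stride) * stride)
    return new_h, new_w


# A 1080x1920 frame is halved to fit the cap, then snapped to the stride grid.
print(det_input_size(1080, 1920))
```

Boxes predicted on the resized image then need to be scaled back by the inverse ratio to match the original image coordinates.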
419
+
420
+ ### Recognition Model
421
+ - **Input**: `(batch, 3, 32, width)` - height fixed at 32px
422
+ - **Output**: CTC logits → decoded with dictionary
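Decoding CTC logits with a dictionary can be sketched as a greedy pass: take the best class per time step, merge consecutive repeats, and drop the blank. This sketch assumes class index 0 is the CTC blank and class `i > 0` maps to line `i - 1` of `dict.txt`; that index layout is an assumption and should be checked against the model's `config.json`. The function name is hypothetical.

```python
from typing import List, Sequence

BLANK = 0  # assumed index of the CTC blank class


def ctc_greedy_decode(logits: Sequence[Sequence[float]], charset: List[str]) -> str:
    """Greedy CTC decoding: argmax per time step, merge repeats, drop blanks."""
    chars = []
    previous = BLANK
    for step in logits:
        best = max(range(len(step)), key=step.__getitem__)
        # Emit a character only when it is not blank and differs from the previous step.
        if best != BLANK and best != previous:
            chars.append(charset[best - 1])
        previous = best
    return "".join(chars)


# Toy example with a two-character dictionary; each row is one time step.
charset = ["a", "b"]
logits = [
    [0.1, 0.8, 0.1],    # a
    [0.1, 0.8, 0.1],    # a again, merged with the previous step
    [0.9, 0.05, 0.05],  # blank, separates the two runs of "a"
    [0.1, 0.8, 0.1],    # a
    [0.1, 0.1, 0.8],    # b
]
print(ctc_greedy_decode(logits, charset))
```

The blank between the two runs of `a` is what allows a doubled letter to survive the merge step.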
423
 
424
+ ---
 
425
 
426
+ ## Performance
 
427
 
428
+ ### Accuracy (PP-OCRv5)
 
429
 
430
+ | Model | Accuracy | Dataset |
431
+ |-------|----------|---------|
432
+ | Greek | 89.28% | 2,799 images |
433
+ | Korean | 88.0% | 5,007 images |
434
+ | English | 85.25% | 6,530 images |
435
+ | Latin | 84.7% | 3,111 images |
436
+ | Thai | 82.68% | 4,261 images |
437
+ | East Slavic | 81.6% | 7,031 images |
438
 
439
  ---
440
 
441
+ ## FAQ
442
 
443
+ **Q: Which version should I use?**
444
+ A: Use PP-OCRv5 models for best accuracy. Use PP-OCRv3 only for the languages that v5 does not cover here: Hindi, Marathi, Nepali, Sanskrit, Arabic, Urdu, Persian/Farsi, Tamil, and Telugu.
 
 
 
445
 
446
+ **Q: Can I mix v5 and v3 models?**
447
+ A: No. Use `detection/v5/det.onnx` with v5 recognition models, and `detection/v3/det.onnx` with v3 recognition models.
448
 
449
+ **Q: GPU acceleration?**
450
+ A: Install `onnxruntime-gpu` instead of `onnxruntime` for up to 10x faster inference, depending on the GPU and image size.
451
 
452
+ **Q: Commercial use?**
453
+ A: Yes. Apache 2.0 license allows commercial use.
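For the GPU question above, installing `onnxruntime-gpu` does not guarantee that a GPU is actually used at run time. A minimal sketch of choosing execution providers with a CPU fallback follows. The helper `pick_providers` is hypothetical; the provider names and the calls shown in the comments (`get_available_providers`, `InferenceSession(..., providers=...)`) are standard ONNX Runtime. Whether RapidOCR exposes a matching option is not covered here.

```python
from typing import List, Sequence

# Preference order; both names are standard ONNX Runtime execution providers.
PREFERRED = ("CUDAExecutionProvider", "CPUExecutionProvider")


def pick_providers(available: Sequence[str]) -> List[str]:
    """Keep preferred providers that are actually available, in preference order."""
    chosen = [name for name in PREFERRED if name in available]
    return chosen or ["CPUExecutionProvider"]


# With onnxruntime installed, the list would come from:
#   import onnxruntime as ort
#   providers = pick_providers(ort.get_available_providers())
#   session = ort.InferenceSession("detection/v5/det.onnx", providers=providers)
print(pick_providers(["CPUExecutionProvider"]))
```

Listing the CPU provider last keeps the same code working on machines without a GPU.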
 
454
 
455
  ---
456
 
457
+ ## Credits
 
 
458
 
459
+ - **Original Models**: [PaddlePaddle Team](https://github.com/PaddlePaddle/PaddleOCR)
460
+ - **Conversion**: [paddle2onnx](https://github.com/PaddlePaddle/Paddle2ONNX)
461
+ - **Source**: [PP-OCRv5 Collection](https://huggingface.co/collections/PaddlePaddle/pp-ocrv5-684a5356aef5b4b1d7b85e4b)
 
 
462
 
463
  ---
464
 
465
+ ## Links
466
 
467
+ - [PaddleOCR GitHub](https://github.com/PaddlePaddle/PaddleOCR)
468
+ - [PaddleOCR Documentation](https://paddlepaddle.github.io/PaddleOCR/)
469
+ - [ONNX Runtime](https://onnxruntime.ai/)
470
+ - [monkt.com](https://monkt.com) - Document processing pipeline
471
 
472
  ---
473
 
474
+ **License**: Apache 2.0