Introduction

This repository hosts the PaddleOCR document helper models (page-orientation classification, dewarping, and table-structure recognition) for the React Native ExecuTorch library. The models are fused into one multi-method .pte per backend for the ExecuTorch runtime (XNNPACK, CoreML, Vulkan). They are document pre/post-processing companions to react-native-executorch-paddleocr, not an OCR model on their own.

If you'd like to run these models in your own ExecuTorch runtime, refer to the official documentation for setup instructions.

Methods

The fused .pte exposes four methods. The .pte itself is pure tensor→tensor; the client handles normalization, argmax/softmax, grid sampling, and the decode loop:

| method | source model | input | output | purpose |
| --- | --- | --- | --- | --- |
| orientation | PP-LCNet doc_ori | [1,3,224,224] | logits[1,4] | page rotation 0 / 90 / 180 / 270° (argmax) |
| dewarp | UVDoc | [1,3,712,488] | grid[1,2,45,31] | sampling grid → grid_sample to unwarp a curved/folded page |
| table_encode | SLANeXt | [1,3,488,488] | feat[1,256,96] | encode a cropped table image (run once) |
| table_decode_step | SLANeXt decoder | (feat[1,256,96], hidden[1,256], onehot[1,50]) | (probs[1,50], hidden[1,256]) | one autoregressive structure-token step |
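
Because the .pte only maps tensors to tensors, the argmax and the decode loop live in client code. The following is a minimal sketch of those two pieces, not the library's implementation. The `DecodeStep` type, the `startToken` and `endToken` ids, the zero-initialized hidden state, and the `maxSteps` cap are all assumptions made for illustration; only the tensor shapes and the 0 / 90 / 180 / 270° class order come from the table above.

```typescript
// Hypothetical signature for one call to the `table_decode_step` method.
// How the .pte is actually invoked depends on your runtime binding.
type DecodeStep = (
  feat: Float32Array, // feat[1,256,96], the output of table_encode
  hidden: Float32Array, // hidden[1,256]
  onehot: Float32Array, // onehot[1,50]
) => { probs: Float32Array; hidden: Float32Array };

const VOCAB_SIZE = 50; // from the onehot[1,50] / probs[1,50] shapes
const HIDDEN_SIZE = 256; // from the hidden[1,256] shape

// Index of the largest value in a flat array.
function argmax(values: ArrayLike<number>): number {
  let best = 0;
  for (let i = 1; i < values.length; i++) {
    if (values[i] > values[best]) best = i;
  }
  return best;
}

// Map the orientation logits[1,4] to a rotation in degrees.
// Assumes the classes are ordered 0, 90, 180, 270 as listed in the table.
function orientationDegrees(logits: ArrayLike<number>): number {
  return [0, 90, 180, 270][argmax(logits)];
}

// Greedy decode: feed the previous token back as a one-hot vector until
// the end token appears or maxSteps is reached.
function decodeTableStructure(
  feat: Float32Array,
  step: DecodeStep,
  startToken: number, // assumption: id of the start token
  endToken: number, // assumption: id of the end token
  maxSteps = 500, // assumption: safety cap on sequence length
): number[] {
  // Assumption: the decoder starts from an all-zero hidden state.
  let hidden: Float32Array = new Float32Array(HIDDEN_SIZE);
  let prev = startToken;
  const tokens: number[] = [];
  for (let i = 0; i < maxSteps; i++) {
    const onehot = new Float32Array(VOCAB_SIZE);
    onehot[prev] = 1;
    const out = step(feat, hidden, onehot);
    hidden = out.hidden;
    prev = argmax(out.probs);
    if (prev === endToken) break;
    tokens.push(prev);
  }
  return tokens;
}
```

In this sketch, `table_encode` runs once per cropped table and its `feat` output is reused on every step, matching the "run once" note in the table. Mapping the returned token ids to structure tags requires the model's vocabulary, which is not described here.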

Backends & precision

| backend | target | precision | size |
| --- | --- | --- | --- |
| xnnpack | CPU | int8 | ~26 MB |
| coreml | Apple ANE | weight-only int8 | 11.9 MB |
| vulkan | Android GPU | fp16, except table_decode_step → XNNPACK | 23.9 MB |

Regardless of backend, table_decode_step is always computed in fp32 for autoregressive stability.

Compatibility

If you intend to use these models outside of React Native ExecuTorch, make sure your runtime is compatible with the ExecuTorch version used to export the .pte files. For more details, see the compatibility note in the ExecuTorch GitHub repository. If you work with React Native ExecuTorch, the library constants guarantee compatibility with the runtime used behind the scenes.
