TPS Localization Network (GGUF)

Thin-Plate Spline (TPS) localization CNN for document dewarping. Given a document image, it predicts the (x, y) coordinates of 20 control points (40 values in total), which are then used to compute a TPS warp that straightens curved or distorted text.
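
As a minimal sketch of how the network output might be consumed, the snippet below groups a flat 40-value vector into 20 (x, y) pairs. The interleaved layout (x0, y0, x1, y1, ...) and the helper name to_control_points are assumptions for illustration; check the converter's actual output layout before relying on this.

```python
def to_control_points(flat):
    """Group a flat list of 40 floats into 20 (x, y) tuples.

    Assumption: values are interleaved as x0, y0, x1, y1, ...
    """
    if len(flat) != 40:
        raise ValueError(f"expected 40 values, got {len(flat)}")
    return [(flat[i], flat[i + 1]) for i in range(0, 40, 2)]


# Dummy vector standing in for a real FC2 output.
dummy = [float(i) for i in range(40)]
points = to_control_points(dummy)
print(len(points), points[0], points[-1])
```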

Architecture

PaddleOCR RARE "small" variant (~108K params):

  • Conv0: 3->16, 3x3 + BN(folded) + ReLU + MaxPool2x2
  • Conv1: 16->32, 3x3 + BN(folded) + ReLU + MaxPool2x2
  • Conv2: 32->64, 3x3 + BN(folded) + ReLU + MaxPool2x2
  • Conv3: 64->128, 3x3 + BN(folded) + ReLU + AdaptiveAvgPool(1)
  • FC1: 128->64 + ReLU
  • FC2: 64->40 (20 control points x 2 coords)
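
The ~108K figure can be checked from the layer shapes listed above. This sketch assumes every conv layer keeps one bias per output channel after BN folding and that both FC layers have biases; under those assumptions the total is consistent with the stated parameter count.

```python
def conv_params(c_in, c_out, k=3):
    # k*k*c_in weights per output channel plus one bias per output channel
    # (assumption: BN folding leaves a bias term on every conv).
    return c_in * c_out * k * k + c_out


def fc_params(n_in, n_out):
    # Dense weight matrix plus one bias per output (assumed present).
    return n_in * n_out + n_out


layers = {
    "conv0": conv_params(3, 16),
    "conv1": conv_params(16, 32),
    "conv2": conv_params(32, 64),
    "conv3": conv_params(64, 128),
    "fc1": fc_params(128, 64),
    "fc2": fc_params(64, 40),
}

total = sum(layers.values())
f32_bytes = total * 4  # 4 bytes per F32 value, tensor data only

for name, count in layers.items():
    print(f"{name}: {count}")
print(f"total params: {total}")
print(f"F32 tensor data: {f32_bytes} bytes ({f32_bytes / 1024:.1f} KiB)")
```

The tensor data alone comes to roughly 423 KiB, so the listed file size leaves only a small remainder for the GGUF header and metadata.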

Files

File              Size    Description
tps-loc-f32.gguf  424 KB  F32 weights, 108K params
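
To sanity-check a downloaded file, the fixed-size GGUF header can be read with the standard library alone. This is a sketch that assumes GGUF version 2 or later (64-bit counts, little-endian); the function name read_gguf_header is illustrative and is not part of any library.

```python
import struct


def read_gguf_header(path):
    """Read magic, version, tensor count and metadata KV count from a GGUF file.

    Assumes GGUF v2+ layout: 4-byte magic, uint32 version,
    uint64 tensor_count, uint64 metadata_kv_count, all little-endian.
    """
    with open(path, "rb") as f:
        raw = f.read(24)
    if len(raw) < 24 or raw[:4] != b"GGUF":
        raise ValueError("not a GGUF file")
    version, tensor_count, kv_count = struct.unpack("<IQQ", raw[4:24])
    return {
        "version": version,
        "tensor_count": tensor_count,
        "metadata_kv_count": kv_count,
    }


# Example (requires the file to be present locally):
# print(read_gguf_header("tps-loc-f32.gguf"))
```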

Source

Extracted from PaddleOCR rec_mv3_tps_bilstm_att_v2.0 recognition model (Apache-2.0). BatchNorm folded into conv weights at conversion time.
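
BatchNorm folding uses the standard per-channel identity: scale = gamma / sqrt(var + eps), folded weight = weight * scale, folded bias = (bias - mean) * scale + beta. The sketch below illustrates that identity on a single output channel; it is not the conversion script itself, and eps=1e-5 and the function name fold_bn are assumptions.

```python
import math


def fold_bn(weights, biases, gamma, beta, mean, var, eps=1e-5):
    """Fold per-channel BatchNorm statistics into conv weights and biases.

    weights[oc] is the flat list of weights for output channel oc.
    eps=1e-5 is an assumed value; use the one stored with the source model.
    """
    folded_w, folded_b = [], []
    for oc, w_oc in enumerate(weights):
        scale = gamma[oc] / math.sqrt(var[oc] + eps)
        folded_w.append([w * scale for w in w_oc])
        folded_b.append((biases[oc] - mean[oc]) * scale + beta[oc])
    return folded_w, folded_b


# Tiny check on one output channel and a 3-element input patch.
x = [0.5, -1.0, 2.0]
w, b = [[0.2, -0.3, 0.1]], [0.05]
gamma, beta, mean, var = [1.5], [0.1], [0.02], [0.8]

# Reference: conv followed by BatchNorm.
conv_out = sum(wi * xi for wi, xi in zip(w[0], x)) + b[0]
bn_out = gamma[0] * (conv_out - mean[0]) / math.sqrt(var[0] + 1e-5) + beta[0]

# Folded: a single conv with adjusted weights and bias.
fw, fb = fold_bn(w, b, gamma, beta, mean, var)
folded_out = sum(wi * xi for wi, xi in zip(fw[0], x)) + fb[0]

print(abs(bn_out - folded_out) < 1e-9)
```

Folding removes the BatchNorm op at inference time, so the runtime only needs conv, ReLU, pooling and dense layers.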

Usage

Parity

C++ vs Python reference: cos=1.000000 (cosine similarity), max_abs=0.000000 (maximum absolute difference), i.e. an exact F32 match.
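
For reference, the two parity metrics can be computed as follows. This is a sketch of the usual definitions, not the project's test harness; the sample vectors are made up for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def max_abs_diff(a, b):
    """Largest element-wise absolute difference."""
    return max(abs(x - y) for x, y in zip(a, b))


# Made-up outputs standing in for the Python reference and the C++ result.
reference = [0.1, -0.25, 0.5, 0.75]
candidate = list(reference)

cos = cosine_similarity(reference, candidate)
diff = max_abs_diff(reference, candidate)
print(f"cos={cos:.6f}, max_abs={diff:.6f}")
```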
