FCENet

Fourier Contour Embedding for Arbitrary-Shaped Text Detection

Abstract

One of the main challenges for arbitrary-shaped text detection is to design a good text instance representation that allows networks to learn diverse text geometry variances. Most existing methods model text instances in the image spatial domain via masks or contour point sequences in the Cartesian or the polar coordinate system. However, the mask representation might lead to expensive post-processing, while the point-sequence representation may have limited capability to model texts with highly-curved shapes. To tackle these problems, we model text instances in the Fourier domain and propose a novel Fourier Contour Embedding (FCE) method to represent arbitrary-shaped text contours as compact signatures. We further construct FCENet with a backbone, a feature pyramid network (FPN) and a simple post-processing step with the Inverse Fourier Transformation (IFT) and Non-Maximum Suppression (NMS). Unlike previous methods, FCENet first predicts compact Fourier signatures of text instances, and then reconstructs text contours via IFT and NMS at test time. Extensive experiments demonstrate that FCE is accurate and robust in fitting contours of scene texts even with highly-curved shapes, and also validate the effectiveness and good generalization of FCENet for arbitrary-shaped text detection. Furthermore, experimental results show that our FCENet is superior to the state-of-the-art (SOTA) methods on CTW1500 and Total-Text, especially on the challenging highly-curved text subset.
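To make the idea of a Fourier signature concrete, here is a minimal NumPy sketch of the general technique: a closed contour is treated as a complex signal, only its lowest-frequency Fourier coefficients are kept, and the contour is rebuilt with the inverse transform. This is an illustration, not the FCENet or MMOCR code. The function names `fourier_signature` and `reconstruct_contour`, the assumption that the contour is already uniformly sampled, and the `degree` values are all hypothetical choices made for this example.

```python
import numpy as np


def fourier_signature(contour_xy, degree):
    """Return (ks, coeffs): Fourier coefficients c_k for k = -degree..degree.

    contour_xy is an (N, 2) array of points sampled along a closed contour.
    Assumption for this sketch: the points are already uniformly spaced.
    """
    pts = np.asarray(contour_xy, dtype=float)
    n = len(pts)
    if n < 2 * degree + 1:
        raise ValueError("need at least 2 * degree + 1 contour points")
    # Encode each point (x, y) as the complex number x + i*y.
    z = pts[:, 0] + 1j * pts[:, 1]
    # Discrete Fourier transform, normalised so that c_0 is the centroid.
    all_coeffs = np.fft.fft(z) / n
    ks = np.arange(-degree, degree + 1)
    # Negative frequencies are stored at the end of the FFT output.
    return ks, all_coeffs[ks % n]


def reconstruct_contour(ks, coeffs, num_points):
    """Inverse transform: evaluate sum_k c_k * exp(2*pi*i*k*t) on [0, 1)."""
    t = np.arange(num_points) / num_points
    basis = np.exp(2j * np.pi * t[:, None] * ks[None, :])
    z = (coeffs[None, :] * basis).sum(axis=1)
    return np.stack([z.real, z.imag], axis=1)


# Illustrative input: an ellipse sampled at 64 points.
theta = 2 * np.pi * np.arange(64) / 64
ellipse = np.stack([3 * np.cos(theta), np.sin(theta)], axis=1)

ks, coeffs = fourier_signature(ellipse, degree=2)
rebuilt = reconstruct_contour(ks, coeffs, num_points=64)
print("max reconstruction error:", np.abs(rebuilt - ellipse).max())
```

The signature holds `2 * degree + 1` complex numbers regardless of how many points were sampled, which is what makes the representation compact. Raising `degree` keeps more high-frequency detail, so sharper bends in the contour are reproduced more faithfully.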

Results and models

CTW1500

| Method | Backbone | Pretrained Model | Training set | Test set | #epochs | Test size | Precision | Recall | Hmean | Download |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| FCENet_r50dcn | ResNet50 + DCNv2 | - | CTW1500 Train | CTW1500 Test | 1500 | (736, 1080) | 0.8689 | 0.8296 | 0.8488 | model \| log |
| FCENet_r50-oclip | ResNet50-oCLIP | - | CTW1500 Train | CTW1500 Test | 1500 | (736, 1080) | 0.8383 | 0.801 | 0.8192 | model \| log |

ICDAR2015

| Method | Backbone | Pretrained Model | Training set | Test set | #epochs | Test size | Precision | Recall | Hmean | Download |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| FCENet_r50 | ResNet50 | - | IC15 Train | IC15 Test | 1500 | (2260, 2260) | 0.8243 | 0.8834 | 0.8528 | model \| log |
| FCENet_r50-oclip | ResNet50-oCLIP | - | IC15 Train | IC15 Test | 1500 | (2260, 2260) | 0.9176 | 0.8098 | 0.8604 | model \| log |

Total Text

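| Method | Backbone | Pretrained Model | Training set | Test set | #epochs | Test size | Precision | Recall | Hmean | Download |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| FCENet_r50 | ResNet50 | - | Totaltext Train | Totaltext Test | 1500 | (1280, 960) | 0.8485 | 0.7810 | 0.8134 | model \| log |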
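The Hmean column in the tables above can be read as the harmonic mean of precision and recall. The short sketch below recomputes it from the reported precision and recall values; the helper name `hmean` and the row labels are introduced here only for illustration.

```python
def hmean(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (label, precision, recall, reported Hmean), copied from the tables above.
rows = [
    ("CTW1500 / FCENet_r50dcn", 0.8689, 0.8296, 0.8488),
    ("CTW1500 / FCENet_r50-oclip", 0.8383, 0.801, 0.8192),
    ("ICDAR2015 / FCENet_r50", 0.8243, 0.8834, 0.8528),
    ("ICDAR2015 / FCENet_r50-oclip", 0.9176, 0.8098, 0.8604),
    ("Total Text / FCENet_r50", 0.8485, 0.7810, 0.8134),
]

for label, p, r, reported in rows:
    print(f"{label}: computed {hmean(p, r):.4f}, reported {reported:.4f}")
```

Because the inputs are themselves rounded to four decimals, a recomputed value may differ from the reported one in the last digit.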

Citation

@InProceedings{zhu2021fourier,
      title={Fourier Contour Embedding for Arbitrary-Shaped Text Detection},
      author={Yiqin Zhu and Jianyong Chen and Lingyu Liang and Zhanghui Kuang and Lianwen Jin and Wayne Zhang},
      year={2021},
      booktitle={CVPR}
}