Papers
arxiv:2407.06950

Spanish TrOCR: Leveraging Transfer Learning for Language Adaptation

Published on Jul 9

Abstract

This study explores the transfer learning capabilities of the TrOCR architecture to Spanish. TrOCR is a transformer-based Optical Character Recognition (OCR) model renowned for its state-of-the-art performance in English benchmarks. Inspired by Li et al. assertion regarding its adaptability to multilingual text recognition, we investigate two distinct approaches to adapt the model to a new language: integrating an English TrOCR encoder with a language specific decoder and train the model on this specific language, and fine-tuning the English base TrOCR model on a new language data. Due to the scarcity of publicly available datasets, we present a resource-efficient pipeline for creating OCR datasets in any language, along with a comprehensive benchmark of the different image generation methods employed with a focus on Visual Rich Documents (VRDs). Additionally, we offer a comparative analysis of the two approaches for the Spanish language, demonstrating that fine-tuning the English TrOCR on Spanish yields superior recognition than the language specific decoder for a fixed dataset size. We evaluate our model employing character and word error rate metrics on a public available printed dataset, comparing the performance against other open-source and cloud OCR spanish models. As far as we know, these resources represent the best open-source model for OCR in Spanish. The Spanish TrOCR models are publicly available on HuggingFace [20] and the code to generate the dataset is available on Github [25].

Community

Sign up or log in to comment

Models citing this paper 3

Datasets citing this paper 0

No dataset linking this paper

Cite arxiv.org/abs/2407.06950 in a dataset README.md to link it from this page.

Spaces citing this paper 0

No Space linking this paper

Cite arxiv.org/abs/2407.06950 in a Space README.md to link it from this page.

Collections including this paper 0

No Collection including this paper

Add this paper to a collection to link it from this page.