---
language:
  - en
  - zh
tags:
  - Image-to-Text
  - OCR
  - Image-Captioning
datasets:
  - priyank-m/text_recognition_en_zh_clean
metrics:
  - cer
---

Multilingual OCR (mOCR) is a VisionEncoderDecoder model, based on the TrOCR approach, for English and Chinese document text recognition.
It pairs a pre-trained vision encoder with a pre-trained language model used as the decoder.

Encoder model used: `facebook/vit-mae-large`

Decoder model used: `xlm-roberta-base`
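
As a rough illustration of how an encoder and decoder like these can be combined, the sketch below uses `VisionEncoderDecoderModel.from_encoder_decoder_pretrained` from the `transformers` library with the two checkpoint names above. It is not the training code for this model: the function name `build_encoder_decoder` is hypothetical, the choice of special-token ids is an assumption, and the resulting model has randomly initialized cross-attention weights, so it would still need fine-tuning on OCR data before producing useful text.

```python
ENCODER_CHECKPOINT = "facebook/vit-mae-large"
DECODER_CHECKPOINT = "xlm-roberta-base"


def build_encoder_decoder():
    """Combine the two pre-trained checkpoints into one encoder-decoder model.

    Hypothetical helper for illustration; calling it downloads both checkpoints.
    """
    # Imported here so the constants above can be used without transformers installed.
    from transformers import AutoTokenizer, VisionEncoderDecoderModel

    # Loads the encoder and decoder weights and adds cross-attention layers
    # to the decoder (those new layers start out randomly initialized).
    model = VisionEncoderDecoderModel.from_encoder_decoder_pretrained(
        ENCODER_CHECKPOINT, DECODER_CHECKPOINT
    )
    tokenizer = AutoTokenizer.from_pretrained(DECODER_CHECKPOINT)

    # Assumption: start decoding from the tokenizer's CLS token and pad with
    # its PAD token. The card does not say which ids the author used.
    model.config.decoder_start_token_id = tokenizer.cls_token_id
    model.config.pad_token_id = tokenizer.pad_token_id

    # Note: ViT-MAE configs apply random patch masking by default (`mask_ratio`).
    # Whether masking was disabled for this model is not stated on the card.
    return model, tokenizer
```

Calling `model, tokenizer = build_encoder_decoder()` returns a model that can then be fine-tuned on image and text pairs, for example from the dataset named in the metadata.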