---
language:
  - en
  - zh
tags:
  - Image-to-Text
  - OCR
  - Image-Captioning
datasets:
  - priyank-m/text_recognition_en_zh_clean
metrics:
  - cer
---

Multilingual OCR (mOCR) is a VisionEncoderDecoder model for English and Chinese document text recognition, following the TrOCR approach. It pairs a pre-trained vision encoder with a pre-trained language model that serves as the decoder.

Encoder model used: facebook/vit-mae-large

Decoder model used: xlm-roberta-base