---
language:
- en
- zh
tags:
- Image-to-Text
- OCR
- Image-Captioning
datasets:
- priyank-m/text_recognition_en_zh_clean
metrics:
- cer
---

Multilingual OCR (mOCR) is a VisionEncoderDecoder model, based on the concept of TrOCR, for English and Chinese document text recognition. It uses a pre-trained vision encoder and a pre-trained language model as the decoder.

Encoder model used: facebook/vit-mae-large

Decoder model used: xlm-roberta-base
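
The metadata above lists CER (character error rate) as the evaluation metric. The sketch below shows how CER is commonly defined: the character-level Levenshtein distance between the reference text and the predicted text, divided by the reference length. It is an illustration under that assumption, not necessarily the implementation used to evaluate this model, and the function names `levenshtein` and `cer` are hypothetical helpers introduced here.

```python
def levenshtein(ref: str, hyp: str) -> int:
    """Minimum number of single-character edits that turn ref into hyp."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to the empty string
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # delete r from the reference
                curr[j - 1] + 1,     # insert h into the reference
                prev[j - 1] + cost,  # substitute r with h, or match
            ))
        prev = curr
    return prev[-1]


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate of a single prediction (assumed definition)."""
    if not reference:
        raise ValueError("reference must be non-empty")
    return levenshtein(reference, hypothesis) / len(reference)


if __name__ == "__main__":
    print(cer("hello", "hallo"))        # one substitution in five characters
    print(cer("文字识别", "文字识剐"))    # one substitution in four characters
```

Because Python strings are sequences of Unicode code points, the same function handles English and Chinese text without any special casing. A value of 0 means the prediction matches the reference exactly, and values above 1 are possible when the prediction is much longer than the reference.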