
thai_trocr_thaigov_v2

Vision Encoder Decoder Models

  • Encoder: taken from microsoft/trocr-base-handwritten.
  • Decoder: airesearch/wangchanberta-base-att-spm-uncased.
  • Fine-tuned on a dataset of 250k synthetic text images built from the ThaiGov V2 Corpus.
  • The synthetic text images were generated with SynthTIGER.
  • Intended as a starting point for fine-tuning on Thai OCR tasks.
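
The card does not say how the encoder and decoder were wired together, so the following is only a sketch of one plausible way to do it with the `transformers` library. It assumes the vision encoder is pulled out of the TrOCR checkpoint and the WangchanBERTa checkpoint is reloaded as a causal decoder with cross-attention. The helper names `set_generation_tokens` and `build_model`, and the choice of CLS/SEP as start/end tokens, are assumptions for illustration and are not taken from the author's training code.

```python
ENCODER_SOURCE = "microsoft/trocr-base-handwritten"
DECODER_SOURCE = "airesearch/wangchanberta-base-att-spm-uncased"


def set_generation_tokens(config, tokenizer):
    """Copy the special-token ids the decoder needs onto the model config.

    Assumption: CLS starts a sequence and SEP ends it, as is common when a
    BERT/RoBERTa-style model is reused as a decoder.
    """
    config.decoder_start_token_id = tokenizer.cls_token_id
    config.pad_token_id = tokenizer.pad_token_id
    config.eos_token_id = tokenizer.sep_token_id
    return config


def build_model():
    """Assemble an encoder-decoder model from the two checkpoints (hypothetical helper)."""
    # Imported lazily so the pure helper above can be used without transformers.
    from transformers import (
        AutoModelForCausalLM,
        AutoTokenizer,
        VisionEncoderDecoderModel,
    )

    # The TrOCR checkpoint is itself an encoder-decoder model; keep only its encoder.
    trocr = VisionEncoderDecoderModel.from_pretrained(ENCODER_SOURCE)

    # Reload the Thai language model as a decoder with cross-attention layers.
    # The cross-attention weights are newly initialized and must be trained.
    decoder = AutoModelForCausalLM.from_pretrained(
        DECODER_SOURCE, is_decoder=True, add_cross_attention=True
    )

    model = VisionEncoderDecoderModel(encoder=trocr.encoder, decoder=decoder)
    tokenizer = AutoTokenizer.from_pretrained(DECODER_SOURCE)
    set_generation_tokens(model.config, tokenizer)
    return model, tokenizer
```

Calling `build_model()` downloads both checkpoints. The result is an untrained combination, not the published weights; it would still need fine-tuning on text-line images before it produces useful output.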

Usage

from PIL import Image
from transformers import TrOCRProcessor, VisionEncoderDecoderModel

# Load the image processor/tokenizer and the fine-tuned model from the Hub.
processor = TrOCRProcessor.from_pretrained("kkatiz/thai-trocr-thaigov-v2")
model = VisionEncoderDecoderModel.from_pretrained("kkatiz/thai-trocr-thaigov-v2")

# Open the image as RGB and convert it to model-ready pixel tensors.
image = Image.open("... your image path").convert("RGB")
pixel_values = processor(image, return_tensors="pt").pixel_values
generated_ids = model.generate(pixel_values)

# Decode the generated token ids back into text.
generated_text = processor.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(generated_text)
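
When there are many images to transcribe, the same three calls can be wrapped in a mini-batch loop. The sketch below is one possible way to do that. The function names `recognize_batch` and `demo`, the default `batch_size`, and the `max_new_tokens` cap are illustrative assumptions, not settings recommended by the model author.

```python
def recognize_batch(images, processor, model, batch_size=8, max_new_tokens=64):
    """Transcribe a list of RGB images in mini-batches and return one string per image."""
    texts = []
    for start in range(0, len(images), batch_size):
        batch = images[start:start + batch_size]
        # The processor accepts a list of images and stacks them into one tensor.
        pixel_values = processor(batch, return_tensors="pt").pixel_values
        generated_ids = model.generate(pixel_values, max_new_tokens=max_new_tokens)
        texts.extend(processor.batch_decode(generated_ids, skip_special_tokens=True))
    return texts


def demo(image_paths):
    """Load the published checkpoint and transcribe the given image files."""
    # Imported lazily so recognize_batch itself has no hard dependencies.
    from PIL import Image
    from transformers import TrOCRProcessor, VisionEncoderDecoderModel

    processor = TrOCRProcessor.from_pretrained("kkatiz/thai-trocr-thaigov-v2")
    model = VisionEncoderDecoderModel.from_pretrained("kkatiz/thai-trocr-thaigov-v2")
    images = [Image.open(path).convert("RGB") for path in image_paths]
    return recognize_batch(images, processor, model)
```

`demo` takes a list of file paths supplied by the caller and returns the transcriptions in the same order. A smaller `batch_size` reduces memory use at the cost of more forward passes.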

Model size: 220M params (Safetensors, tensor type F32)