anuashok/ocr-captcha-v2

This model is a fine-tuned version of microsoft/trocr-base-printed, trained on a custom dataset of captcha images.

Training Summary

  • CER (Character Error Rate): 0.02025931928687196 (about 2.03%)
  • Hyperparameters:
    • Learning Rate: 1.1081459294764632e-05
    • Batch Size: 4
    • Num Epochs: 3
    • Warmup Ratio: 0.07863134774153628
    • Weight Decay: 0.06248152825021373
    • Num Beams: 6
    • Length Penalty: 0.5095100725173662
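
The card does not say how the CER above was computed. As a rough guide to what the number means, here is a minimal sketch of the usual corpus-level definition: total character edits (insertions, deletions, substitutions) divided by the total number of reference characters. The function names `edit_distance` and `character_error_rate` and the sample strings are illustrative assumptions, not part of this model's training code.

```python
def edit_distance(reference: str, hypothesis: str) -> int:
    """Levenshtein distance between two strings, counted in characters."""
    # previous[j] holds the distance between the reference prefix processed
    # so far and the first j characters of the hypothesis.
    previous = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        current = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            substitution = previous[j - 1] + (ref_char != hyp_char)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1]


def character_error_rate(references, hypotheses) -> float:
    """Corpus-level CER: total edits divided by total reference characters."""
    if len(references) != len(hypotheses):
        raise ValueError("references and hypotheses must have the same length")
    total_chars = sum(len(ref) for ref in references)
    if total_chars == 0:
        raise ValueError("references must contain at least one character")
    total_edits = sum(
        edit_distance(ref, hyp) for ref, hyp in zip(references, hypotheses)
    )
    return total_edits / total_chars


# Hypothetical labels and predictions: one wrong character out of ten.
labels = ["x7Kp2", "9aBcD"]
predictions = ["x7Kp2", "9aBcO"]
print(character_error_rate(labels, predictions))  # 0.1
```

Under this definition, a CER of roughly 0.02 corresponds to about one character error per fifty reference characters. Other conventions exist (for example, averaging per-image CER), so the reported value may not match this formula exactly.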

Usage

from transformers import VisionEncoderDecoderModel, TrOCRProcessor
import torch
from PIL import Image

# Load model and processor
processor = TrOCRProcessor.from_pretrained("anuashok/ocr-captcha-v2")
model = VisionEncoderDecoderModel.from_pretrained("anuashok/ocr-captcha-v2")

# Load image
image = Image.open('path_to_your_image.jpg').convert("RGB")

# Prepare image
pixel_values = processor(image, return_tensors="pt").pixel_values

# Generate text
generated_ids = model.generate(pixel_values)
generated_text = processor.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(generated_text)
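
The Training Summary lists Num Beams and Length Penalty, but the snippet above calls `generate` with defaults. Whether the saved checkpoint already carries those values in its generation config is not stated here, so the sketch below passes them explicitly. It assumes the listed values are the intended decoding settings; `GENERATION_KWARGS`, `build_generation_kwargs`, and `read_captcha` are hypothetical helper names.

```python
# Decoding settings copied from the Training Summary (assumed to be the
# intended inference settings; the card does not confirm this).
GENERATION_KWARGS = {
    "num_beams": 6,
    "length_penalty": 0.5095100725173662,
}


def build_generation_kwargs(**overrides):
    """Return the default decoding settings, with any overrides applied."""
    return {**GENERATION_KWARGS, **overrides}


def read_captcha(image_path, model, processor, **overrides):
    """Decode one captcha image using explicit beam-search settings."""
    # Imported lazily so the helpers above work without Pillow installed.
    from PIL import Image

    image = Image.open(image_path).convert("RGB")
    pixel_values = processor(image, return_tensors="pt").pixel_values
    generated_ids = model.generate(
        pixel_values, **build_generation_kwargs(**overrides)
    )
    return processor.batch_decode(generated_ids, skip_special_tokens=True)[0]
```

With `model` and `processor` loaded as in the snippet above, `read_captcha('path_to_your_image.jpg', model, processor)` returns the decoded string. Passing `num_beams=1` falls back to greedy decoding, which is faster; note that `length_penalty` only affects beam-based generation.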

Model size: 334M params (Safetensors, tensor type F32)