Image-to-Text
Transformers
Safetensors
vision-encoder-decoder
image-text-to-text
ocr
trocr
handwriting
medical
How to use khedim/Medical-Prescription-OCR with Transformers:
```python
# Use a pipeline as a high-level helper
# Warning: Pipeline type "image-to-text" is no longer supported in transformers v5.
# You must load the model directly (see below) or downgrade to v4.x with:
#   pip install "transformers<5.0.0"
from transformers import pipeline

pipe = pipeline("image-to-text", model="khedim/Medical-Prescription-OCR")
```

```python
# Load model directly
from transformers import AutoTokenizer, AutoModelForImageTextToText

tokenizer = AutoTokenizer.from_pretrained("khedim/Medical-Prescription-OCR")
model = AutoModelForImageTextToText.from_pretrained("khedim/Medical-Prescription-OCR")
```
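The snippets above stop at loading the model. A minimal sketch of transcribing a single line crop could look like the following. It assumes the repository ships a TrOCR preprocessor config (if it does not, the processor of the base model `microsoft/trocr-small-handwritten` may work instead), and the function name `read_line` and the `max_new_tokens=64` limit are illustrative choices, not values taken from this card. The default of 4 beams mirrors the "Final eval beams" setting reported under Training setup.

```python
MODEL_ID = "khedim/Medical-Prescription-OCR"


def read_line(image_path, num_beams=4, max_new_tokens=64):
    """Transcribe one cropped prescription line image to text.

    Hypothetical helper: the name and max_new_tokens are illustrative.
    """
    # Heavy dependencies are imported lazily, only when the function is called.
    from PIL import Image
    from transformers import TrOCRProcessor, VisionEncoderDecoderModel

    # Assumption: the repo contains a preprocessor config usable by TrOCRProcessor.
    processor = TrOCRProcessor.from_pretrained(MODEL_ID)
    model = VisionEncoderDecoderModel.from_pretrained(MODEL_ID)
    model.eval()

    # TrOCR expects RGB input; the processor resizes and normalizes the crop.
    image = Image.open(image_path).convert("RGB")
    pixel_values = processor(images=image, return_tensors="pt").pixel_values

    # Beam search decoding, then strip special tokens from the output ids.
    generated_ids = model.generate(
        pixel_values, num_beams=num_beams, max_new_tokens=max_new_tokens
    )
    return processor.batch_decode(generated_ids, skip_special_tokens=True)[0]
```

The model was trained on line-level crops, so a full prescription page would likely need to be split into lines before calling a helper like this. For batch use, load the processor and model once outside the function rather than on every call.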
Medical Prescription OCR TrOCR Small
This model was fine-tuned on Kaggle using line-level medical prescription crops
exported through data/splits/image_annotations.csv.
Base model
microsoft/trocr-small-handwritten
Dataset summary
- Splits root: /kaggle/working/downloads/splits_extracted/splits
- Train lines: 20250
- Validation lines: 2526
- Test lines: 2517
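As a quick check on the split sizes listed above, the following sketch computes the total number of lines and each split's share. The counts come from this card; the variable names are illustrative.

```python
# Line counts as reported in the dataset summary.
split_sizes = {"train": 20250, "validation": 2526, "test": 2517}

total = sum(split_sizes.values())
fractions = {name: n / total for name, n in split_sizes.items()}

for name, n in split_sizes.items():
    print(f"{name:<10} {n:>6} lines  {fractions[name]:.1%}")
print(f"total      {total:>6} lines")
```

The counts add up to 25293 lines, which corresponds to roughly an 80/10/10 train/validation/test split.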
Training setup
- Epochs: 6
- Effective batch size: 24
- Learning rate: 4e-05
- Weight decay: 0.01
- GPUs seen: 2
- Validation beams: 2
- Final eval beams: 4
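The card reports only the effective batch size, not how it was composed. The sketch below derives the approximate number of optimizer steps from the reported values and lists the per-GPU batch size and gradient accumulation combinations that would yield 24. It assumes both GPUs were used in data-parallel training and that the final partial batch of each epoch was kept; neither is stated in the card.

```python
import math

# Values reported in this card.
TRAIN_LINES = 20250
EFFECTIVE_BATCH = 24
EPOCHS = 6
NUM_GPUS = 2  # "GPUs seen"; assumed here to all take part in training

# Optimizer steps, assuming the last partial batch of each epoch is not dropped.
steps_per_epoch = math.ceil(TRAIN_LINES / EFFECTIVE_BATCH)
total_steps = steps_per_epoch * EPOCHS

# effective batch = per-GPU batch * gradient accumulation steps * number of GPUs
per_gpu_product = EFFECTIVE_BATCH // NUM_GPUS
candidates = [
    (per_gpu_batch, per_gpu_product // per_gpu_batch)
    for per_gpu_batch in range(1, per_gpu_product + 1)
    if per_gpu_product % per_gpu_batch == 0
]

print(f"steps per epoch: {steps_per_epoch}, total steps: {total_steps}")
for per_gpu_batch, accumulation in candidates:
    print(f"per-GPU batch {per_gpu_batch:>2} x accumulation {accumulation:>2} x {NUM_GPUS} GPUs")
```

Under these assumptions training runs for 844 steps per epoch and 5064 steps in total. Which of the listed combinations was actually used is not recoverable from the card.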
Metrics
- Best validation CER: 0.0056
- Test line CER: 0.0184
- Test full-prescription CER: 0.0128
- Test line word accuracy: 0.9756
- Test full-prescription word accuracy: 0.9865
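The card does not say how these metrics were computed. A common definition of character error rate (CER) is the total Levenshtein edit distance divided by the total number of reference characters, and the sketch below follows that definition. Treating word accuracy as one minus the word-level error rate is an assumption, as are the function names and the made-up example strings.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or lists of words)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(references, hypotheses):
    """Character error rate pooled over all reference characters."""
    errors = sum(edit_distance(r, h) for r, h in zip(references, hypotheses))
    return errors / sum(len(r) for r in references)


def word_accuracy(references, hypotheses):
    """Assumed definition: 1 - word-level edit distance / reference word count."""
    errors = sum(edit_distance(r.split(), h.split())
                 for r, h in zip(references, hypotheses))
    return 1.0 - errors / sum(len(r.split()) for r in references)


# Made-up strings for illustration only, not taken from the dataset.
refs = ["Amoxicillin 500 mg", "Take twice daily"]
hyps = ["Amoxicilin 500 mg", "Take twice daily"]

print(f"CER: {cer(refs, hyps):.4f}")
print(f"Word accuracy: {word_accuracy(refs, hyps):.4f}")
```

Line-level and full-prescription scores can differ simply because errors are aggregated over different units, for example when lines are concatenated before scoring. The exact aggregation used for the numbers above is not documented here.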