YOLOv8 Patent Text Region Detection Model

Model Description

patent_text_regions is a YOLOv8 model fine-tuned on a custom dataset of page-level images drawn from historical patent specifications published by the British Patent Office. It is trained to detect all text regions on a patent specification page as a single class. We initialize from the pretrained weights in the official release of the small YOLOv8 model (yolov8s.pt) and fine-tune on our custom dataset.

Usage

This model can be used in the same way as any pre-trained YOLOv8 model by setting the model path to best.pt.
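As a minimal sketch of what this looks like with the ultralytics Python package: the function below loads best.pt and returns the detected text-region boxes for one page image. The helper names (boxes_to_records, detect_text_regions), the local path to best.pt, and the 0.25 confidence threshold are illustrative assumptions, not part of this repository.

```python
from typing import Dict, List, Sequence


def boxes_to_records(
    xyxy: Sequence[Sequence[float]], confs: Sequence[float]
) -> List[Dict[str, float]]:
    """Pair each [x1, y1, x2, y2] box with its confidence score."""
    records = []
    for (x1, y1, x2, y2), conf in zip(xyxy, confs):
        records.append({"x1": x1, "y1": y1, "x2": x2, "y2": y2, "conf": conf})
    return records


def detect_text_regions(image_path: str, weights: str = "best.pt", conf: float = 0.25):
    """Run the fine-tuned model on one page image and return its text-region boxes."""
    # Imported lazily so the helper above works without ultralytics installed.
    from ultralytics import YOLO

    model = YOLO(weights)  # assumed: best.pt has been downloaded locally
    result = model.predict(image_path, conf=conf)[0]  # one Results object per image
    # Boxes are in pixel coordinates of the input image; there is a single class.
    return boxes_to_records(result.boxes.xyxy.tolist(), result.boxes.conf.tolist())
```

Calling detect_text_regions("page.jpg") on a hypothetical page image would return one record per detected text region.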

Training Data

The dataset was created by randomly sampling 420 page images from British patent specifications published between 1850 and 1899. The images were randomly split 80-10-10 (train-val-test). Standard preprocessing (auto-orientation and stretching to 640 x 640 pixels) and the following data augmentations were then applied using Roboflow:

  • Crop: 0% Minimum Zoom, 20% Maximum Zoom
  • Grayscale: Apply to 15% of images
  • Saturation: Between -25% and +25%
  • Blur: Up to 2.5px
  • Noise: Up to 0.1% of pixels

The custom dataset consists of 1,092 labelled images in total, which are made available in this repository.
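The gap between the 420 sampled pages and the 1,092 labelled images can be reconciled with a short calculation. The sketch below assumes that augmentation produced three versions of each training image and left the validation and test images unaugmented. That multiplier is an assumption that happens to match the reported total; it is not stated in this card.

```python
def dataset_size(n_source: int, train_pct: int, val_pct: int, train_versions: int) -> int:
    """Total image count after splitting and augmenting only the training split."""
    n_train = n_source * train_pct // 100
    n_val = n_source * val_pct // 100
    n_test = n_source - n_train - n_val
    # Assumed: each training image yields `train_versions` augmented outputs,
    # while validation and test images are kept as single copies.
    return n_train * train_versions + n_val + n_test


# 420 pages, 80-10-10 split, 3 versions per training image (assumed multiplier)
total = dataset_size(420, 80, 10, 3)
print(total)  # 336 * 3 + 42 + 42 = 1092
```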

Hyperparameters

We train the model using the default hyperparameters, except for the batch size (128) and the number of epochs (300).
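A training run with these settings might look like the following sketch, which uses the ultralytics Python API. The data.yaml path is a hypothetical dataset config (for example, a Roboflow YOLOv8 export), and the function names are illustrative; everything not overridden is left at the library defaults.

```python
def training_overrides(data_yaml: str) -> dict:
    """Only the settings that differ from the ultralytics defaults."""
    return {"data": data_yaml, "epochs": 300, "batch": 128}


def fine_tune(data_yaml: str = "data.yaml"):
    """Fine-tune the pretrained YOLOv8s weights on the custom dataset."""
    # Imported lazily so training_overrides() works without ultralytics installed.
    from ultralytics import YOLO

    model = YOLO("yolov8s.pt")  # official small YOLOv8 checkpoint
    # data_yaml is a hypothetical path to the dataset config file.
    return model.train(**training_overrides(data_yaml))
```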

Evaluation

Evaluation metrics on the test set are reported below:

  • mAP50: 0.987
  • mAP50-95: 0.892
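
Metrics of this kind can be computed with the ultralytics validation routine. The sketch below assumes a local best.pt and a hypothetical data.yaml that defines a test split; the function names are illustrative, and the exact evaluation settings used for the figures above are not stated in this card.

```python
def summarize_box_metrics(metrics) -> dict:
    """Extract the two headline detection metrics from a validation result."""
    return {
        "mAP50": round(float(metrics.box.map50), 3),    # mAP at IoU 0.50
        "mAP50-95": round(float(metrics.box.map), 3),   # mAP averaged over IoU 0.50-0.95
    }


def evaluate_on_test(weights: str = "best.pt", data_yaml: str = "data.yaml") -> dict:
    """Validate the fine-tuned model on the held-out test split."""
    # Imported lazily so summarize_box_metrics() works without ultralytics installed.
    from ultralytics import YOLO

    model = YOLO(weights)
    # split="test" evaluates on the test images listed in the (hypothetical) data.yaml.
    metrics = model.val(data=data_yaml, split="test")
    return summarize_box_metrics(metrics)
```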