YOLOv8 Patent Text Region Detection Model

Model Description

patent_text_regions is a YOLOv8 model fine-tuned on a custom dataset of page-level images drawn from historical patent specifications published by the British Patent Office. It has been trained to detect all text regions on a page of a patent specification as a single class. We initialize from the weights of the official small YOLOv8 release (yolov8s.pt) and fine-tune on our custom dataset.

Usage

This model can be used in the same way as any pre-trained YOLOv8 model by setting the model path to best.pt.
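As an illustration of the usage described above, here is a minimal sketch that assumes the ultralytics package is installed and that best.pt has been downloaded to the working directory. The helper names (boxes_to_regions, detect_text_regions) and the 0.25 confidence threshold are illustrative choices, not part of this repository.

```python
def boxes_to_regions(xyxy, confidences):
    """Pair [x1, y1, x2, y2] boxes with their scores, sorted top-to-bottom."""
    regions = [
        {"x1": box[0], "y1": box[1], "x2": box[2], "y2": box[3], "conf": conf}
        for box, conf in zip(xyxy, confidences)
    ]
    # Sort by the top edge, then by the left edge, to approximate reading order.
    return sorted(regions, key=lambda r: (r["y1"], r["x1"]))


def detect_text_regions(image_path, weights="best.pt", min_conf=0.25):
    """Run the fine-tuned model on one page image and return its text regions."""
    # Imported lazily so the helper above can be used without ultralytics.
    from ultralytics import YOLO

    model = YOLO(weights)  # model path set to best.pt, as described above
    result = model.predict(source=image_path, conf=min_conf, imgsz=640)[0]
    return boxes_to_regions(result.boxes.xyxy.tolist(), result.boxes.conf.tolist())
```

Calling detect_text_regions on a page image (for example a hypothetical local file page.jpg) should return one dictionary per detected text region, with coordinates in pixels of the input image.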

Training Data

The dataset was created by randomly sampling 420 page images from British patent specifications published between 1850 and 1899. The data was randomly split 80-10-10 (train-val-test). Standard preprocessing (images were auto-oriented and stretched to 640 x 640 pixels) and the following data augmentations were then applied using Roboflow:

Crop: 0% Minimum Zoom, 20% Maximum Zoom
Grayscale: Apply to 15% of images
Saturation: Between -25% and +25%
Blur: Up to 2.5px
Noise: Up to 0.1% of pixels

The custom dataset consists of 1,092 labelled images in total, which are made available in this repository.
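The 80-10-10 split and the image counts can be sketched as follows. The split function is a generic illustration, not the Roboflow procedure actually used. The factor of three augmented versions per training image is an assumption: it is not stated here, but it is one reading that reproduces the reported total of 1,092 from the 420 sampled pages.

```python
import random


def split_80_10_10(items, seed=0):
    """Shuffle items and split them 80-10-10 into train, val, and test lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_train = len(shuffled) * 8 // 10  # integer arithmetic avoids float rounding
    n_val = len(shuffled) // 10
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test


def total_after_augmentation(n_train, n_val, n_test, versions_per_train_image=3):
    """Total image count if only training images are augmented (assumption)."""
    return n_train * versions_per_train_image + n_val + n_test


train, val, test = split_80_10_10(range(420))
total = total_after_augmentation(len(train), len(val), len(test))
print(len(train), len(val), len(test), total)
```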

Hyperparameters

We train the model using default hyperparameters, except for the batch size (128) and the number of epochs (300).
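A training run with these settings might look like the sketch below, assuming the ultralytics package. The dataset config name data.yaml is a hypothetical placeholder for a YOLO-format export of the dataset, and imgsz=640 simply restates the Ultralytics default, which matches the preprocessing size.

```python
def build_train_args(data_yaml="data.yaml"):
    """Collect the non-default training settings reported above."""
    return {
        "data": data_yaml,  # hypothetical path to the YOLO dataset config
        "epochs": 300,      # number of epochs reported above
        "batch": 128,       # batch size reported above
        "imgsz": 640,       # Ultralytics default, matches the 640 x 640 preprocessing
    }


def train(data_yaml="data.yaml"):
    """Fine-tune the official yolov8s.pt weights on the custom dataset."""
    # Imported lazily so build_train_args can be inspected without ultralytics.
    from ultralytics import YOLO

    model = YOLO("yolov8s.pt")  # official small YOLOv8 weights
    return model.train(**build_train_args(data_yaml))
```

A batch size of 128 at 640 x 640 pixels needs substantial GPU memory, so a smaller batch may be required on modest hardware, at the cost of departing from the reported configuration.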

Evaluation

Evaluation results on the test set are reported below:

mAP50: 0.987
mAP50-95: 0.892
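Here mAP50 is the mean average precision at an IoU threshold of 0.5, and mAP50-95 is the average over IoU thresholds from 0.5 to 0.95 in steps of 0.05. The sketch below shows one way such metrics could be computed with the ultralytics validation API. It assumes a hypothetical data.yaml that defines a test split and a local copy of best.pt, and it is not necessarily the exact procedure used to produce the numbers above.

```python
def format_metrics(metrics):
    """Render a metrics dictionary as one 'name: value' line per entry."""
    return "\n".join(f"{name}: {value:.3f}" for name, value in metrics.items())


def evaluate(weights="best.pt", data_yaml="data.yaml"):
    """Validate the model on the test split and return the two headline metrics."""
    # Imported lazily so format_metrics can be used without ultralytics.
    from ultralytics import YOLO

    model = YOLO(weights)
    results = model.val(data=data_yaml, split="test")
    return {
        "mAP50": float(results.box.map50),   # AP at IoU 0.5
        "mAP50-95": float(results.box.map),  # AP averaged over IoU 0.5 to 0.95
    }
```

Passing the result of evaluate() to format_metrics prints the metrics in the same layout as the list above.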

Citation

If you use our model or custom training/evaluation data in your research, please cite our accompanying paper as follows:

@article{bct2025,
  title = {300 Years of British Patents},
  author = {Enrico Berkes and Matthew Lee Chen and Matteo Tranchero},
  journal = {arXiv preprint arXiv:2401.12345},
  year = {2025},
  url = {https://arxiv.org/abs/2401.12345}
}
