
Faster-RCNN model

Pretrained on DocArtefacts. The Faster-RCNN architecture was introduced in the paper "Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks".

Model description

The core idea of the authors is to unify the region proposal mechanism with the detection module of Fast-RCNN into a single end-to-end network, so that both stages share the same convolutional features.
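Conceptually, the Region Proposal Network slides over the shared feature map and emits a set of anchor boxes at every spatial location, one per (scale, aspect-ratio) pair. A toy sketch of this anchor generation, with illustrative stride, scales, and ratios (not docTR's actual implementation):

```python
# Toy illustration of RPN-style anchor generation (values are illustrative,
# not taken from docTR). At each feature-map cell, one anchor box is emitted
# per (scale, ratio) combination, centered on that cell.

def generate_anchors(feat_h, feat_w, stride=16, scales=(64, 128), ratios=(0.5, 1.0)):
    """Return anchors as (x1, y1, x2, y2) tuples in image coordinates."""
    anchors = []
    for i in range(feat_h):
        for j in range(feat_w):
            # Center of this feature-map cell, projected back to the image.
            cx = j * stride + stride / 2
            cy = i * stride + stride / 2
            for scale in scales:
                for ratio in ratios:
                    # Keep area ~ scale**2 while varying the aspect ratio.
                    w = scale * (ratio ** 0.5)
                    h = scale / (ratio ** 0.5)
                    anchors.append((cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2))
    return anchors
```

Each anchor is then scored and regressed by the RPN head; here a 2x2 feature map with 2 scales and 2 ratios would yield 16 candidate boxes.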

Installation

Python 3.6 (or higher) and pip are required to install docTR.

Latest stable release

You can install the latest stable release of the package from PyPI as follows:

pip install python-doctr[torch]

Developer mode

Alternatively, if you wish to use the latest features of the project that haven't made their way to a release yet, you can install the package from source (install Git first):

git clone https://github.com/mindee/doctr.git
pip install -e doctr/.[torch]

Usage instructions

from PIL import Image
import torch
from torchvision.transforms import Compose, ConvertImageDtype, PILToTensor
from doctr.models.obj_detection.factory import from_hub

model = from_hub("mindee/fasterrcnn_mobilenet_v3_large_fpn").eval()

img = Image.open(path_to_an_image).convert("RGB")

# Preprocessing
transform = Compose([
    PILToTensor(),
    ConvertImageDtype(torch.float32),
])
input_tensor = transform(img).unsqueeze(0)

# Inference
with torch.inference_mode():
    output = model(input_tensor)
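The model follows the torchvision detection convention and returns one dict per image with "boxes", "labels", and "scores" entries; a common next step is to keep only confident detections. A minimal sketch of such score-based filtering, shown on plain Python lists with an arbitrary threshold (the helper name and threshold value are illustrative, not part of docTR):

```python
# Hypothetical post-processing step: keep detections whose confidence score
# clears a threshold. Follows the torchvision detection output convention
# (one dict per image with "boxes", "labels", "scores").

def filter_detections(pred, score_threshold=0.5):
    """Return (box, label, score) triples with score >= score_threshold."""
    return [
        (box, label, score)
        for box, label, score in zip(pred["boxes"], pred["labels"], pred["scores"])
        if score >= score_threshold
    ]

# Example on dummy values: only the first detection survives.
pred = {
    "boxes": [[10, 10, 50, 50], [0, 0, 5, 5]],
    "labels": [1, 2],
    "scores": [0.9, 0.2],
}
kept = filter_detections(pred)
```

With real model output, the tensors in `output[0]` can be converted with `.tolist()` before applying the same filter.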


Original paper

@article{DBLP:journals/corr/RenHG015,
  author    = {Shaoqing Ren and
               Kaiming He and
               Ross B. Girshick and
               Jian Sun},
  title     = {Faster {R-CNN:} Towards Real-Time Object Detection with Region Proposal
               Networks},
  journal   = {CoRR},
  volume    = {abs/1506.01497},
  year      = {2015},
  url       = {http://arxiv.org/abs/1506.01497},
  eprinttype = {arXiv},
  eprint    = {1506.01497},
  timestamp = {Mon, 13 Aug 2018 16:46:02 +0200},
  biburl    = {https://dblp.org/rec/journals/corr/RenHG015.bib},
  bibsource = {dblp computer science bibliography, https://dblp.org}
}

Source of this implementation

@misc{doctr2021,
    title = {docTR: Document Text Recognition},
    author = {Mindee},
    year = {2021},
    publisher = {GitHub},
    howpublished = {\url{https://github.com/mindee/doctr}}
}