fg-mindee committed on
Commit
c69889d
1 Parent(s): 147862f

feat: Added Pytorch model

Files changed (3)
  1. README.md +105 -0
  2. config.json +1 -0
  3. pytorch_model.bin +3 -0
README.md ADDED
@@ -0,0 +1,105 @@
+ ---
+ license: apache-2.0
+ tags:
+ - object-detection
+ - pytorch
+ datasets:
+ - docartefacts
+ ---
+
+
+ # Faster-RCNN model
+
+ Pretrained on [DocArtefacts](https://mindee.github.io/doctr/datasets.html#doctr.datasets.DocArtefacts). The Faster-RCNN architecture was introduced in [this paper](https://arxiv.org/pdf/1506.01497.pdf).
+
+
+ ## Model description
+
+ The core idea of the authors is to unify region proposal with the core detection module of Fast-RCNN: a Region Proposal Network shares convolutional features with the detector instead of running as a separate stage.
+
+
+ ## Installation
+
+ ### Prerequisites
+
+ Python 3.6 (or higher) and [pip](https://pip.pypa.io/en/stable/) are required to install docTR.
+
+ ### Latest stable release
+
+ You can install the latest stable release of the package from [PyPI](https://pypi.org/project/python-doctr/) as follows:
+
+ ```shell
+ pip install "python-doctr[torch]"
+ ```
+
+ ### Developer mode
+
+ Alternatively, if you wish to use the latest features of the project that haven't made their way to a release yet, you can install the package from source *(install [Git](https://git-scm.com/book/en/v2/Getting-Started-Installing-Git) first)*:
+
+ ```shell
+ git clone https://github.com/mindee/doctr.git
+ pip install -e "doctr/.[torch]"
+ ```
+
+
+ ## Usage instructions
+
+ ```python
+ from PIL import Image
+ import torch
+ from torchvision.transforms import Compose, ConvertImageDtype, PILToTensor
+ from holocron.models import model_from_hf_hub
+
+ model = model_from_hf_hub("mindee/fasterrcnn_mobilenet_v3_large_fpn").eval()
+
+ img = Image.open(path_to_an_image).convert("RGB")  # path_to_an_image: path to your image file
+
+ # Preprocessing
+ transform = Compose([
+     PILToTensor(),
+     ConvertImageDtype(torch.float32),
+ ])
+
+ input_tensor = transform(img).unsqueeze(0)
+
+ # Inference
+ with torch.inference_mode():
+     output = model(input_tensor)
+ ```
+
+
+ ## Citation
+
+ Original paper
+
+ ```bibtex
+ @article{DBLP:journals/corr/RenHG015,
+   author = {Shaoqing Ren and
+             Kaiming He and
+             Ross B. Girshick and
+             Jian Sun},
+   title = {Faster {R-CNN:} Towards Real-Time Object Detection with Region Proposal
+            Networks},
+   journal = {CoRR},
+   volume = {abs/1506.01497},
+   year = {2015},
+   url = {http://arxiv.org/abs/1506.01497},
+   eprinttype = {arXiv},
+   eprint = {1506.01497},
+   timestamp = {Mon, 13 Aug 2018 16:46:02 +0200},
+   biburl = {https://dblp.org/rec/journals/corr/RenHG015.bib},
+   bibsource = {dblp computer science bibliography, https://dblp.org}
+ }
+ ```
+
+ Source of this implementation
+
+ ```bibtex
+ @misc{doctr2021,
+   title={docTR: Document Text Recognition},
+   author={Mindee},
+   year={2021},
+   publisher = {GitHub},
+   howpublished = {\url{https://github.com/mindee/doctr}}
+ }
+ ```
config.json ADDED
@@ -0,0 +1 @@
+ {"mean": [0.485, 0.456, 0.406], "std": [0.229, 0.224, 0.225], "arch": "fasterrcnn_mobilenet_v3_large_fpn", "interpolation": "bilinear", "input_shape": [3, 1024, 1024], "classes": ["background", "qr_code", "bar_code", "logo", "photo"]}
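The `classes` list in this config fixes the label order, so integer label ids in a prediction can be mapped back to names. Below is a minimal sketch of that mapping, assuming the model output follows the torchvision detection convention (one dict per image with `boxes`, `labels`, and `scores`). The helper name `filter_detections`, the 0.5 threshold, and the sample values are hypothetical, and plain lists stand in for tensors so the snippet runs without PyTorch.

```python
import json

# Content of the config.json added in this commit
CONFIG_JSON = (
    '{"mean": [0.485, 0.456, 0.406], "std": [0.229, 0.224, 0.225], '
    '"arch": "fasterrcnn_mobilenet_v3_large_fpn", "interpolation": "bilinear", '
    '"input_shape": [3, 1024, 1024], '
    '"classes": ["background", "qr_code", "bar_code", "logo", "photo"]}'
)

config = json.loads(CONFIG_JSON)
class_names = config["classes"]  # index 0 is "background"


def filter_detections(detection, class_names, score_threshold=0.5):
    """Keep detections scoring at least `score_threshold` and attach class names.

    `detection` is assumed to be a dict with parallel sequences under
    "boxes", "labels" and "scores" (a hypothetical stand-in for model output).
    """
    kept = []
    for box, label, score in zip(
        detection["boxes"], detection["labels"], detection["scores"]
    ):
        if score >= score_threshold:
            kept.append(
                {
                    "box": list(box),
                    "class": class_names[int(label)],
                    "score": float(score),
                }
            )
    return kept


# Made-up example values, not real model output
sample = {
    "boxes": [[10.0, 20.0, 110.0, 120.0], [300.0, 40.0, 420.0, 90.0]],
    "labels": [1, 3],
    "scores": [0.92, 0.31],
}
results = filter_detections(sample, class_names)
print(results)
```

With real tensors, calling `.tolist()` on each field before passing the dict in should give the same structure.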
pytorch_model.bin ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:d5b2490d6f0185186fc6e76323aa5192bf79cc2231ff9e01589a1619ea02f428
+ size 76078985
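The three lines above are a Git LFS pointer, not the weights themselves: they record the hash algorithm, digest, and byte size of the real file. Below is a minimal sketch of checking a downloaded copy against the pointer, using only the standard library. The helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical, and the local file path is whatever your download produced.

```python
import hashlib
import os

# Pointer content added in this commit
POINTER_TEXT = """version https://git-lfs.github.com/spec/v1
oid sha256:d5b2490d6f0185186fc6e76323aa5192bf79cc2231ff9e01589a1619ea02f428
size 76078985
"""


def parse_lfs_pointer(text):
    """Parse the `key value` lines of a Git LFS pointer into a dict."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer, chunk_size=1 << 20):
    """Return True if the file at `path` has the size and digest in `pointer`."""
    if os.path.getsize(path) != pointer["size"]:
        return False
    hasher = hashlib.new(pointer["algo"])
    with open(path, "rb") as f:
        # Read in chunks so a large weights file is not loaded into memory at once
        for chunk in iter(lambda: f.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


pointer = parse_lfs_pointer(POINTER_TEXT)
print(pointer["algo"], pointer["size"])
```

After downloading the weights, `matches_pointer("pytorch_model.bin", pointer)` would return `True` only if both the size and the SHA-256 digest agree with the pointer.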