doc: add data
README.md
CHANGED
@@ -142,35 +142,41 @@ Some fonts are problematic during the generation process. The script has an manu
 
 On our synthesized dataset,
 
-| Backbone | Data Aug | Pretrained | Crop<br>Text<br>BBox | Output<br>Norm | Input Size | Hyper<br>Param | Accur | Commit | Dataset | Precision |
-| :-: | :-: | :-: | :-: | :-: | :-: | :-: | :-: | :-: | :-: | :-:|
-| DeepFont |
-| DeepFont |
-| ResNet-18 | ❌ | ❌ | ❌ | Sigmoid | 512x512 | I | 18.58% | 5c43f60 | I |
-| ResNet-18 | ❌ | ❌ | ❌ | Sigmoid | 512x512 | II<sup>2</sup> | 14.39% | 5a85fd3 | I |
-| ResNet-18 | ❌ | ❌ | ❌ | Tanh | 512x512 | II | 16.24% | ff82fe6 | I |
-| ResNet-18 |
-| ResNet-18 |
-| ResNet-18 |
-| ResNet-18 |
-| ResNet-18 |
-| ResNet-18 |
-| ResNet-34 |
-| ResNet-50 |
-| ResNet-18 |
-| ResNet-18 |
-| ResNet-18 |
-| ResNet-18 |
-| ResNet-50 |
-
+| Backbone | Data Aug | Pretrained | Crop<br>Text<br>BBox | Preserve<br>Aspect<br>Ratio | Output<br>Norm | Input Size | Hyper<br>Param | Accur | Commit | Dataset | Precision |
+| :-: | :-: | :-: | :-: | :-: | :-: | :-: | :-: | :-: | :-: | :-:| :-: |
+| DeepFont | ✔️* | ❌ | ✅ | ❌ | Sigmoid | 105x105 | I<sup>1</sup> | [Can't Converge] | 665559f | I<sup>5</sup> | bfloat16_3x |
+| DeepFont | ✔️* | ❌ | ✅ | ❌ | Sigmoid | 105x105 | IV<sup>4</sup> | [Can't Converge] | 665559f | I | bfloat16_3x |
+| ResNet-18 | ❌ | ❌ | ❌ | ❌ | Sigmoid | 512x512 | I | 18.58% | 5c43f60 | I | float32 |
+| ResNet-18 | ❌ | ❌ | ❌ | ❌ | Sigmoid | 512x512 | II<sup>2</sup> | 14.39% | 5a85fd3 | I | bfloat16_3x |
+| ResNet-18 | ❌ | ❌ | ❌ | ❌ | Tanh | 512x512 | II | 16.24% | ff82fe6 | I | bfloat16_3x |
+| ResNet-18 | ✅*<sup>7</sup> | ❌ | ❌ | ❌ | Tanh | 512x512 | II | 27.71% | a976004 | I | bfloat16_3x |
+| ResNet-18 | ✅* | ❌ | ❌ | ❌ | Tanh | 512x512 | I | 29.95% | 8364103 | I | bfloat16_3x |
+| ResNet-18 | ✅* | ❌ | ❌ | ❌ | Sigmoid | 512x512 | I | 29.37% [Early stop] | 8d2e833 | I | bfloat16_3x |
+| ResNet-18 | ✅* | ❌ | ❌ | ❌ | Sigmoid | 416x416 | I | [Lower Trend] | d5a3215 | I | bfloat16_3x |
+| ResNet-18 | ✅* | ❌ | ❌ | ❌ | Sigmoid | 320x320 | I | [Lower Trend] | afcdd80 | I | bfloat16_3x |
+| ResNet-18 | ✅* | ❌ | ❌ | ❌ | Sigmoid | 224x224 | I | [Lower Trend] | 8b9de80 | I | bfloat16_3x |
+| ResNet-34 | ✅* | ❌ | ❌ | ❌ | Sigmoid | 512x512 | I | 32.03% | 912d566 | I | bfloat16_3x |
+| ResNet-50 | ✅* | ❌ | ❌ | ❌ | Sigmoid | 512x512 | I | 34.21% | e980b66 | I | bfloat16_3x |
+| ResNet-18 | ✅* | ✅ | ❌ | ❌ | Sigmoid | 512x512 | I | 31.24% | 416c7bb | I | bfloat16_3x |
+| ResNet-18 | ✅* | ✅ | ✅ | ❌ | Sigmoid | 512x512 | I | 34.69% | 855e240 | I | bfloat16_3x |
+| ResNet-18 | ✔️*<sup>8</sup> | ✅ | ✅ | ❌ | Sigmoid | 512x512 | I | 38.32% | 1750035 | I | bfloat16_3x |
+| ResNet-18 | ✔️* | ✅ | ✅ | ❌ | Sigmoid | 512x512 | III<sup>3</sup> | 38.87% | 0693434 | I | bfloat16_3x |
+| ResNet-50 | ✔️* | ✅ | ✅ | ❌ | Sigmoid | 512x512 | III | 48.99% | bc0f7fc | II<sup>6</sup> | bfloat16_3x |
+| ResNet-50 | ✔️ | ✅ | ✅ | ✅<sup>10</sup> | Sigmoid | 512x512 | III | 46.12% | 0f071a5 | II | bfloat16 |
+| ResNet-50 | ✔<sup>9</sup> | ✅ | ✅ | ❌ | Sigmoid | 512x512 | III | 43.86% | 0f071a5 | II | bfloat16 |
+| ResNet-50 | ✔ | ✅ | ✅ | ✅ | Sigmoid | 512x512 | III | 41.35% | 0f071a5 | II | bfloat16 |
+
+* \* Bug in implementation
 * <sup>1</sup> `learning rate = 0.0001, lambda = (2, 0.5, 1)`
 * <sup>2</sup> `learning rate = 0.00005, lambda = (4, 0.5, 1)`
-* <sup>
-* <sup>
+* <sup>3</sup> `learning rate = 0.001, lambda = (2, 0.5, 1)`
+* <sup>4</sup> `learning rate = 0.01, lambda = (2, 0.5, 1)`
 * <sup>5</sup> Initial version of synthesized dataset
 * <sup>6</sup> Doubled synthesized dataset
 * <sup>7</sup> Data Augmentation v1: Color Jitter + Random Crop [81%-100%]
 * <sup>8</sup> Data Augmentation v2: Color Jitter + Random Crop [30%-130%] + Random Gaussian Blur + Random Gaussian Noise + Random Rotation [-15°, 15°]
+* <sup>9</sup> Data Augmentation v3: Color Jitter + Random Crop [30%-130%] + Random Gaussian Blur + Random Gaussian Noise + Random Rotation [-15°, 15°] + Random Horizontal Flip + Random Downsample [1, 2]
+* <sup>10</sup> Preserve Aspect Ratio by Random Cropping
 
 ## Related works and Resources
 