gyrojeff committed on
Commit b032bd8 • 1 Parent(s): a02f133

doc: add data

Files changed (1)
  1. README.md +29 -23
README.md CHANGED
@@ -142,35 +142,41 @@ Some fonts are problematic during the generation process. The script has an manu
 
  On our synthesized dataset,
 
- | Backbone | Data Aug | Pretrained | Crop<br>Text<br>BBox | Output<br>Norm | Input Size | Hyper<br>Param | Accur | Commit | Dataset | Precision |
- | :-: | :-: | :-: | :-: | :-: | :-: | :-: | :-: | :-: | :-: | :-:|
- | DeepFont | ✔️ | ❌ | ✅ | Sigmoid | 105x105 | I<sup>1</sup> | [Can't Converge] | 665559f | I<sup>5</sup> | Float32Matmul=High |
- | DeepFont | ✔️ | ❌ | ✅ | Sigmoid | 105x105 | IV<sup>4</sup> | [Can't Converge] | 665559f | I | Float32Matmul=High |
- | ResNet-18 | ❌ | ❌ | ❌ | Sigmoid | 512x512 | I | 18.58% | 5c43f60 | I | Float32Matmul=Highest |
- | ResNet-18 | ❌ | ❌ | ❌ | Sigmoid | 512x512 | II<sup>2</sup> | 14.39% | 5a85fd3 | I | Float32Matmul=High |
- | ResNet-18 | ❌ | ❌ | ❌ | Tanh | 512x512 | II | 16.24% | ff82fe6 | I | Float32Matmul=High |
- | ResNet-18 | ✅<sup>7</sup> | ❌ | ❌ | Tanh | 512x512 | II | 27.71% | a976004 | I | Float32Matmul=High |
- | ResNet-18 | ✅ | ❌ | ❌ | Tanh | 512x512 | I | 29.95% | 8364103 | I | Float32Matmul=High |
- | ResNet-18 | ✅ | ❌ | ❌ | Sigmoid | 512x512 | I | 29.37% [Early stop] | 8d2e833 | I | Float32Matmul=High |
- | ResNet-18 | ✅ | ❌ | ❌ | Sigmoid | 416x416 | I | [Lower Trend] | d5a3215 | I | Float32Matmul=High |
- | ResNet-18 | ✅ | ❌ | ❌ | Sigmoid | 320x320 | I | [Lower Trend] | afcdd80 | I | Float32Matmul=High |
- | ResNet-18 | ✅ | ❌ | ❌ | Sigmoid | 224x224 | I | [Lower Trend] | 8b9de80 | I | Float32Matmul=High |
- | ResNet-34 | ✅ | ❌ | ❌ | Sigmoid | 512x512 | I | 32.03% | 912d566 | I | Float32Matmul=High |
- | ResNet-50 | ✅ | ❌ | ❌ | Sigmoid | 512x512 | I | 34.21% | e980b66 | I | Float32Matmul=High |
- | ResNet-18 | ✅ | ✅ | ❌ | Sigmoid | 512x512 | I | 31.24% | 416c7bb | I | Float32Matmul=High |
- | ResNet-18 | ✅ | ✅ | ✅ | Sigmoid | 512x512 | I | 34.69% | 855e240 | I | Float32Matmul=High |
- | ResNet-18 | ✔️<sup>8</sup> | ✅ | ✅ | Sigmoid | 512x512 | I | 38.32% | 1750035 | I | Float32Matmul=High |
- | ResNet-18 | ✔️ | ✅ | ✅ | Sigmoid | 512x512 | III<sup>3</sup> | 38.87% | 0693434 | I | Float32Matmul=High |
- | ResNet-50 | ✔️ | ✅ | ✅ | Sigmoid | 512x512 | III | 48.99% | bc0f7fc | II<sup>6</sup> | Float32Matmul=High |
-
  * <sup>1</sup> `learning rate = 0.0001, lambda = (2, 0.5, 1)`
  * <sup>2</sup> `learning rate = 0.00005, lambda = (4, 0.5, 1)`
- * <sup>4</sup> `learning rate = 0.001, lambda = (2, 0.5, 1)`
- * <sup>3</sup> `learning rate = 0.01, lambda = (2, 0.5, 1)`
  * <sup>5</sup> Initial version of synthesized dataset
  * <sup>6</sup> Doubled synthesized dataset
  * <sup>7</sup> Data Augmentation v1: Color Jitter + Random Crop [81%-100%]
  * <sup>8</sup> Data Augmentation v2: Color Jitter + Random Crop [30%-130%] + Random Gaussian Blur + Random Gaussian Noise + Random Rotation [-15°, 15°]
 
  ## Related works and Resources
 
 
  On our synthesized dataset,
 
+ | Backbone | Data Aug | Pretrained | Crop<br>Text<br>BBox | Preserve<br>Aspect<br>Ratio | Output<br>Norm | Input Size | Hyper<br>Param | Accur | Commit | Dataset | Precision |
+ | :-: | :-: | :-: | :-: | :-: | :-: | :-: | :-: | :-: | :-: | :-:| :-: |
+ | DeepFont | ✔️* | ❌ | ✅ | ❌ | Sigmoid | 105x105 | I<sup>1</sup> | [Can't Converge] | 665559f | I<sup>5</sup> | bfloat16_3x |
+ | DeepFont | ✔️* | ❌ | ✅ | ❌ | Sigmoid | 105x105 | IV<sup>4</sup> | [Can't Converge] | 665559f | I | bfloat16_3x |
+ | ResNet-18 | ❌ | ❌ | ❌ | ❌ | Sigmoid | 512x512 | I | 18.58% | 5c43f60 | I | float32 |
+ | ResNet-18 | ❌ | ❌ | ❌ | ❌ | Sigmoid | 512x512 | II<sup>2</sup> | 14.39% | 5a85fd3 | I | bfloat16_3x |
+ | ResNet-18 | ❌ | ❌ | ❌ | ❌ | Tanh | 512x512 | II | 16.24% | ff82fe6 | I | bfloat16_3x |
+ | ResNet-18 | ✅*<sup>7</sup> | ❌ | ❌ | ❌ | Tanh | 512x512 | II | 27.71% | a976004 | I | bfloat16_3x |
+ | ResNet-18 | ✅* | ❌ | ❌ | ❌ | Tanh | 512x512 | I | 29.95% | 8364103 | I | bfloat16_3x |
+ | ResNet-18 | ✅* | ❌ | ❌ | ❌ | Sigmoid | 512x512 | I | 29.37% [Early stop] | 8d2e833 | I | bfloat16_3x |
+ | ResNet-18 | ✅* | ❌ | ❌ | ❌ | Sigmoid | 416x416 | I | [Lower Trend] | d5a3215 | I | bfloat16_3x |
+ | ResNet-18 | ✅* | ❌ | ❌ | ❌ | Sigmoid | 320x320 | I | [Lower Trend] | afcdd80 | I | bfloat16_3x |
+ | ResNet-18 | ✅* | ❌ | ❌ | ❌ | Sigmoid | 224x224 | I | [Lower Trend] | 8b9de80 | I | bfloat16_3x |
+ | ResNet-34 | ✅* | ❌ | ❌ | ❌ | Sigmoid | 512x512 | I | 32.03% | 912d566 | I | bfloat16_3x |
+ | ResNet-50 | ✅* | ❌ | ❌ | ❌ | Sigmoid | 512x512 | I | 34.21% | e980b66 | I | bfloat16_3x |
+ | ResNet-18 | ✅* | ✅ | ❌ | ❌ | Sigmoid | 512x512 | I | 31.24% | 416c7bb | I | bfloat16_3x |
+ | ResNet-18 | ✅* | ✅ | ✅ | ❌ | Sigmoid | 512x512 | I | 34.69% | 855e240 | I | bfloat16_3x |
+ | ResNet-18 | ✔️*<sup>8</sup> | ✅ | ✅ | ❌ | Sigmoid | 512x512 | I | 38.32% | 1750035 | I | bfloat16_3x |
+ | ResNet-18 | ✔️* | ✅ | ✅ | ❌ | Sigmoid | 512x512 | III<sup>3</sup> | 38.87% | 0693434 | I | bfloat16_3x |
+ | ResNet-50 | ✔️* | ✅ | ✅ | ❌ | Sigmoid | 512x512 | III | 48.99% | bc0f7fc | II<sup>6</sup> | bfloat16_3x |
+ | ResNet-50 | ✔️ | ✅ | ✅ | ✅<sup>10</sup> | Sigmoid | 512x512 | III | 46.12% | 0f071a5 | II | bfloat16 |
+ | ResNet-50 | ❕<sup>9</sup> | ✅ | ✅ | ❌ | Sigmoid | 512x512 | III | 43.86% | 0f071a5 | II | bfloat16 |
+ | ResNet-50 | ❕ | ✅ | ✅ | ✅ | Sigmoid | 512x512 | III | 41.35% | 0f071a5 | II | bfloat16 |
+
+ * \* Bug in implementation
  * <sup>1</sup> `learning rate = 0.0001, lambda = (2, 0.5, 1)`
  * <sup>2</sup> `learning rate = 0.00005, lambda = (4, 0.5, 1)`
+ * <sup>3</sup> `learning rate = 0.001, lambda = (2, 0.5, 1)`
+ * <sup>4</sup> `learning rate = 0.01, lambda = (2, 0.5, 1)`
  * <sup>5</sup> Initial version of synthesized dataset
  * <sup>6</sup> Doubled synthesized dataset
  * <sup>7</sup> Data Augmentation v1: Color Jitter + Random Crop [81%-100%]
  * <sup>8</sup> Data Augmentation v2: Color Jitter + Random Crop [30%-130%] + Random Gaussian Blur + Random Gaussian Noise + Random Rotation [-15°, 15°]
+ * <sup>9</sup> Data Augmentation v3: Color Jitter + Random Crop [30%-130%] + Random Gaussian Blur + Random Gaussian Noise + Random Rotation [-15°, 15°] + Random Horizontal Flip + Random Downsample [1, 2]
+ * <sup>10</sup> Preserve Aspect Ratio by Random Cropping
 
  ## Related works and Resources
 
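
Note on footnote 10: the diff does not show how "Preserve Aspect Ratio by Random Cropping" is implemented. One plausible reading, given the square 512x512 input size, is that a random square region is cut from the image before resizing, so the text is not stretched. The sketch below illustrates that reading only. The helper name `random_square_crop_box` and the choice of the shorter edge as the crop side are assumptions, not code from the repository.

```python
import random


def random_square_crop_box(width, height, rng=random):
    """Return (left, top, right, bottom) of a random square crop.

    Hypothetical helper (not from the repository): the crop side equals
    the shorter image edge, and the offset along the longer edge is drawn
    uniformly at random. Resizing the result to a square input then keeps
    the original aspect ratio of the glyphs.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    side = min(width, height)
    # randint is inclusive on both ends, so the box always fits the image.
    left = rng.randint(0, width - side)
    top = rng.randint(0, height - side)
    return left, top, left + side, top + side


# Example: a wide 1200x300 text-line image yields a 300x300 box.
box = random_square_crop_box(1200, 300, random.Random(0))
```

The returned box uses the (left, top, right, bottom) convention, so it could be handed to an image library's crop call before the resize step. Whether the repository crops before or after the other augmentations is not stated in this commit.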