Update README.md
README.md
CHANGED
````diff
@@ -27,8 +27,7 @@ It achieves the following results on the evaluation set:
 
 **Outputs:**
 - **Bounding Boxes:** The model outputs the location for the bounding box coordinates in the form of special <loc[value]> tokens, where value is a number that represents a normalized coordinate. Each detection is represented by four location coordinates in the order y_min, x_min, y_max, x_max, followed by the label that was detected in that box. To convert values to coordinates, you first need to divide the numbers by 1024, then multiply y by the image height and x by its width. This will give you the coordinates of the bounding boxes, relative to the original image size.
-If everything goes smoothly, the model will output a text similar to "<loc[value]><loc[value]><loc[value]><loc[value]> table; <loc[value]><loc[value]><loc[value]><loc[value]> table" depending on the number of tables detected in the image. Then, you can use the following script to convert the text output into PASCAL VOC
-
+If everything goes smoothly, the model will output a text similar to "<loc[value]><loc[value]><loc[value]><loc[value]> table; <loc[value]><loc[value]><loc[value]><loc[value]> table" depending on the number of tables detected in the image. Then, you can use the following script to convert the text output into PASCAL VOC formatted bounding boxes.
 ```python
 import re
 
@@ -42,7 +41,7 @@ def post_process(bbox_text, image_width, image_height):
     loc_values = loc_values[:4]
 
     loc_values = [value/1024 for value in loc_values]
-    #
+    # convert to (xmin, ymin, xmax, ymax)
     loc_values = [
         int(loc_values[1]*image_width), int(loc_values[0]*image_height),
         int(loc_values[3]*image_width), int(loc_values[2]*image_height),
````
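For reference, the conversion described in the hunks above can be sketched as a minimal, self-contained function. Only the last three statements appear in the diff; the regex and the surrounding scaffolding are assumptions about what the full README script looks like, not the script itself.

```python
import re

def post_process(bbox_text, image_width, image_height):
    """Convert one detection's <loc[value]> tokens to a PASCAL VOC box.

    Hypothetical completion of the diff snippet: only the slicing,
    normalization, and reordering lines come from the source.
    """
    # extract the numeric part of each <loc[value]> token, e.g. <loc0256> -> 256
    loc_values = [int(m) for m in re.findall(r"<loc(\d+)>", bbox_text)]
    loc_values = loc_values[:4]  # model order: y_min, x_min, y_max, x_max

    loc_values = [value / 1024 for value in loc_values]  # normalize to [0, 1]

    # convert to (xmin, ymin, xmax, ymax)
    loc_values = [
        int(loc_values[1] * image_width), int(loc_values[0] * image_height),
        int(loc_values[3] * image_width), int(loc_values[2] * image_height),
    ]
    return loc_values

# One detection from a string like "<loc...> table; <loc...> table"
# (multiple detections could be split on "; " and converted one by one):
box = post_process("<loc0256><loc0128><loc0512><loc0896> table", 1024, 768)
print(box)  # [128, 192, 896, 384]
```

The x/y swap in the final list is the whole point of the snippet: the model emits y before x, while PASCAL VOC expects `(xmin, ymin, xmax, ymax)` in pixel units.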