loliipopshock committed on
Commit 4d61931 · 1 Parent(s): 9aafcc3

Add documentation for prima training

  1. README.md +15 -0
README.md CHANGED
@@ -6,6 +6,21 @@
 - In `scripts/`, it lists specific command for running the code for processing the given dataset.
 - The `configs/` contains the configuration for different deep learning models, and is organized by datasets.
 
+## Supported Datasets
+
+- Prima Layout Analysis Dataset [`scripts/train_prima.sh`](https://github.com/Layout-Parser/layout-model-training/blob/master/scripts/train_prima.sh)
+    - You will need to download the dataset from the [official website](https://www.primaresearch.org/dataset/) and put it in the `data/prima` folder.
+    - As the original dataset is stored in the [PAGE format](https://www.primaresearch.org/tools/PAGEViewer), the script will use [`tools/convert_prima_to_coco.py`](https://github.com/Layout-Parser/layout-model-training/blob/master/tools/convert_prima_to_coco.py) to convert it to COCO format.
+    - The final dataset folder structure should look like:
+        ```bash
+        data/
+        └── prima/
+            ├── Images/
+            ├── XML/
+            ├── License.txt
+            └── annotations*.json
+        ```
+
 ## Reference
 
 - **[cocosplit](https://github.com/akarazniewicz/cocosplit)** A script that splits the coco annotations into train and test sets.
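
The folder layout documented in this commit can be checked before training with a short script. The following is a minimal sketch and not part of the repository: the function name `check_prima_layout` and its default path are assumptions made for illustration, and the check only covers the four entries shown in the README tree.

```python
from pathlib import Path


def check_prima_layout(root="data/prima"):
    """Return the entries missing from the documented Prima folder layout.

    Hypothetical helper: the expected entries are taken from the README tree
    (Images/, XML/, License.txt, annotations*.json), nothing more.
    """
    root = Path(root)
    missing = []

    # Directories that come with the downloaded dataset.
    for name in ("Images", "XML"):
        if not (root / name).is_dir():
            missing.append(f"{name}/")

    # License file listed in the documented tree.
    if not (root / "License.txt").is_file():
        missing.append("License.txt")

    # COCO annotation files; per the README these are produced by the
    # conversion step, so they are expected to be absent before it runs.
    if not any(root.glob("annotations*.json")):
        missing.append("annotations*.json")

    return missing


if __name__ == "__main__":
    problems = check_prima_layout()
    if problems:
        print("Missing from data/prima:", ", ".join(problems))
    else:
        print("data/prima matches the documented layout.")
```

Running it right after downloading the dataset should report only `annotations*.json` as missing, assuming the download contains the other three entries; an empty result after the conversion step suggests the folder matches the structure described above.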