Charles Kabui committed
Commit 390e6ec
1 Parent(s): 5eb5dd1

git rm -rf layout-parser

model/layout-parser/.gitignore DELETED
@@ -1,133 +0,0 @@
- # folder
- data
- data/
- credential
- credential/
- model
- model/
- result
- result*/
- outputs/
-
- # Mac Finder Configurations
- .DS_Store
-
- # IDEA configurations
- .idea/
-
- # IPython checkpoints
- .ipynb_checkpoints/
- log
-
- # Visual Studio Code
- .vscode/
-
- # Byte-compiled / optimized / DLL files
- __pycache__/
- *.py[cod]
- *$py.class
-
- # C extensions
- *.so
-
- # Distribution / packaging
- .Python
- build/
- develop-eggs/
- dist/
- downloads/
- eggs/
- .eggs/
- lib64/
- parts/
- sdist/
- var/
- wheels/
- *.egg-info/
- .installed.cfg
- *.egg
- MANIFEST
-
- # PyInstaller
- # Usually these files are written by a python script from a template
- # before PyInstaller builds the exe, so as to inject date/other infos into it.
- *.manifest
- *.spec
-
- # Installer logs
- pip-log.txt
- pip-delete-this-directory.txt
-
- # Unit test / coverage reports
- htmlcov/
- .tox/
- .coverage
- .coverage.*
- .cache
- nosetests.xml
- coverage.xml
- *.cover
- .hypothesis/
- .pytest_cache/
-
- # Translations
- *.mo
- *.pot
-
- # Django stuff:
- *.log
- local_settings.py
- db.sqlite3
-
- # Flask stuff:
- instance/
- .webassets-cache
-
- # Scrapy stuff:
- .scrapy
-
- # Sphinx documentation
- docs/_build/
-
- # PyBuilder
- target/
-
- # Jupyter Notebook
- .ipynb_checkpoints
-
- # IPython
- profile_default/
- ipython_config.py
-
- # pyenv
- .python-version
-
- # celery beat schedule file
- celerybeat-schedule
-
- # SageMath parsed files
- *.sage.py
-
- # Environments
- .env
- .venv
- env/
- venv/
- ENV/
- env.bak/
- venv.bak/
-
- # Spyder project settings
- .spyderproject
- .spyproject
-
- # Rope project settings
- .ropeproject
-
- # mkdocs documentation
- /site
-
- # mypy
- .mypy_cache/
- .dmypy.json
- dmypy.json
model/layout-parser/README.md DELETED
@@ -1,38 +0,0 @@
- # Scripts for training Layout Detection Models using Detectron2
-
- ## Usage
-
- ### Directory Structure
-
- - In `tools/`, we provide a series of handy scripts for converting data formats and training the models.
- - In `scripts/`, we list the specific commands for running the code on the given datasets.
- - `configs/` contains the configurations for the different deep learning models, organized by dataset.
-
- ### How to train the models?
-
- - Get the dataset and annotations -- if you are not sure how, feel free to check [this tutorial](https://github.com/Layout-Parser/layout-parser/tree/main/examples/Customizing%20Layout%20Models%20with%20Label%20Studio%20Annotation).
- - Duplicate and modify the config files and training scripts:
-   - For example, you might copy [`configs/prima/fast_rcnn_R_50_FPN_3x`](configs/prima/fast_rcnn_R_50_FPN_3x.yaml) to `configs/<your-dataset-name>/fast_rcnn_R_50_FPN_3x.yaml`, and create your own `scripts/train_<your-dataset-name>.sh` based on [`scripts/train_prima.sh`](scripts/train_prima.sh).
-   - Modify the `--dataset_name`, `--json_annotation_train`, `--image_path_train`, `--json_annotation_val`, `--image_path_val`, and `--config-file` args appropriately.
- - If your dataset has segmentation masks, you can try to train with the [`mask_rcnn` model](configs/prima/mask_rcnn_R_50_FPN_3x.yaml); otherwise you might want to start with the [`fast_rcnn` model](configs/prima/fast_rcnn_R_50_FPN_3x.yaml).
- - If you see the error `AttributeError: Cannot find field 'gt_masks' in the given Instances!` during training, your annotations contain no segmentation masks, so you should not use the `mask_rcnn` config -- switch to the `fast_rcnn` config instead (a quick check is sketched after this diff).
-
- ## Supported Datasets
-
- - Prima Layout Analysis Dataset [`scripts/train_prima.sh`](https://github.com/Layout-Parser/layout-model-training/blob/master/scripts/train_prima.sh)
-   - You will need to download the dataset from the [official website](https://www.primaresearch.org/dataset/) and put it in the `data/prima` folder.
-   - As the original dataset is stored in the [PAGE format](https://www.primaresearch.org/tools/PAGEViewer), the script uses [`tools/convert_prima_to_coco.py`](https://github.com/Layout-Parser/layout-model-training/blob/master/tools/convert_prima_to_coco.py) to convert it to COCO format.
-   - The final dataset folder structure should look like:
-     ```bash
-     data/
-     └── prima/
-         ├── Images/
-         ├── XML/
-         ├── License.txt
-         └── annotations*.json
-     ```
-
- ## Reference
-
- - **[cocosplit](https://github.com/akarazniewicz/cocosplit)**: a script that splits COCO annotations into train and test sets.
- - **[Detectron2](https://github.com/facebookresearch/detectron2)**: Facebook AI Research's next-generation library that implements state-of-the-art object detection algorithms.
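The `gt_masks` check mentioned in the README above can be done directly on the annotation file. Below is a minimal Python sketch, assuming a COCO-format annotation file like the one `tools/convert_prima_to_coco.py` writes; `has_segmentation_masks` is a hypothetical helper name and the path is illustrative.

```python
import json

def has_segmentation_masks(annotation_path: str) -> bool:
    """Return True if any annotation carries a non-empty 'segmentation' entry."""
    with open(annotation_path, "r") as fp:
        coco = json.load(fp)
    # COCO polygon masks live in the "segmentation" field of each annotation;
    # without them, a mask head has no gt_masks to train on.
    return any(anno.get("segmentation") for anno in coco.get("annotations", []))

# Illustrative path, matching the layout train_prima.sh produces:
print(has_segmentation_masks("data/prima/annotations-train.json"))
```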
model/layout-parser/configs/prima/fast_rcnn_R_50_FPN_3x.yaml DELETED
@@ -1,307 +0,0 @@
- CUDNN_BENCHMARK: false
- DATALOADER:
-   ASPECT_RATIO_GROUPING: true
-   FILTER_EMPTY_ANNOTATIONS: true
-   NUM_WORKERS: 4
-   REPEAT_THRESHOLD: 0.0
-   SAMPLER_TRAIN: TrainingSampler
- DATASETS:
-   PRECOMPUTED_PROPOSAL_TOPK_TEST: 1000
-   PRECOMPUTED_PROPOSAL_TOPK_TRAIN: 2000
-   PROPOSAL_FILES_TEST: []
-   PROPOSAL_FILES_TRAIN: []
-   TEST: []
-   TRAIN: []
- GLOBAL:
-   HACK: 1.0
- INPUT:
-   CROP:
-     ENABLED: false
-     SIZE:
-     - 0.9
-     - 0.9
-     TYPE: relative_range
-   FORMAT: BGR
-   MASK_FORMAT: polygon
-   MAX_SIZE_TEST: 1333
-   MAX_SIZE_TRAIN: 1333
-   MIN_SIZE_TEST: 800
-   MIN_SIZE_TRAIN:
-   - 640
-   - 672
-   - 704
-   - 736
-   - 768
-   - 800
-   MIN_SIZE_TRAIN_SAMPLING: choice
- MODEL:
-   ANCHOR_GENERATOR:
-     ANGLES:
-     - - -90
-       - 0
-       - 90
-     ASPECT_RATIOS:
-     - - 0.5
-       - 1.0
-       - 2.0
-     NAME: DefaultAnchorGenerator
-     OFFSET: 0.0
-     SIZES:
-     - - 32
-     - - 64
-     - - 128
-     - - 256
-     - - 512
-   BACKBONE:
-     FREEZE_AT: 2
-     NAME: build_resnet_fpn_backbone
-   DEVICE: cuda
-   FPN:
-     FUSE_TYPE: sum
-     IN_FEATURES:
-     - res2
-     - res3
-     - res4
-     - res5
-     NORM: ''
-     OUT_CHANNELS: 256
-   KEYPOINT_ON: false
-   LOAD_PROPOSALS: false
-   MASK_ON: false
-   META_ARCHITECTURE: GeneralizedRCNN
-   PANOPTIC_FPN:
-     COMBINE:
-       ENABLED: true
-       INSTANCES_CONFIDENCE_THRESH: 0.5
-       OVERLAP_THRESH: 0.5
-       STUFF_AREA_LIMIT: 4096
-     INSTANCE_LOSS_WEIGHT: 1.0
-   PIXEL_MEAN:
-   - 103.53
-   - 116.28
-   - 123.675
-   PIXEL_STD:
-   - 1.0
-   - 1.0
-   - 1.0
-   PROPOSAL_GENERATOR:
-     MIN_SIZE: 0
-     NAME: RPN
-   RESNETS:
-     DEFORM_MODULATED: false
-     DEFORM_NUM_GROUPS: 1
-     DEFORM_ON_PER_STAGE:
-     - false
-     - false
-     - false
-     - false
-     DEPTH: 50
-     NORM: FrozenBN
-     NUM_GROUPS: 1
-     OUT_FEATURES:
-     - res2
-     - res3
-     - res4
-     - res5
-     RES2_OUT_CHANNELS: 256
-     RES5_DILATION: 1
-     STEM_OUT_CHANNELS: 64
-     STRIDE_IN_1X1: true
-     WIDTH_PER_GROUP: 64
-   RETINANET:
-     BBOX_REG_WEIGHTS:
-     - 1.0
-     - 1.0
-     - 1.0
-     - 1.0
-     FOCAL_LOSS_ALPHA: 0.25
-     FOCAL_LOSS_GAMMA: 2.0
-     IN_FEATURES:
-     - p3
-     - p4
-     - p5
-     - p6
-     - p7
-     IOU_LABELS:
-     - 0
-     - -1
-     - 1
-     IOU_THRESHOLDS:
-     - 0.4
-     - 0.5
-     NMS_THRESH_TEST: 0.5
-     NUM_CLASSES: 80
-     NUM_CONVS: 4
-     PRIOR_PROB: 0.01
-     SCORE_THRESH_TEST: 0.05
-     SMOOTH_L1_LOSS_BETA: 0.1
-     TOPK_CANDIDATES_TEST: 1000
-   ROI_BOX_CASCADE_HEAD:
-     BBOX_REG_WEIGHTS:
-     - - 10.0
-       - 10.0
-       - 5.0
-       - 5.0
-     - - 20.0
-       - 20.0
-       - 10.0
-       - 10.0
-     - - 30.0
-       - 30.0
-       - 15.0
-       - 15.0
-     IOUS:
-     - 0.5
-     - 0.6
-     - 0.7
-   ROI_BOX_HEAD:
-     BBOX_REG_WEIGHTS:
-     - 10.0
-     - 10.0
-     - 5.0
-     - 5.0
-     CLS_AGNOSTIC_BBOX_REG: false
-     CONV_DIM: 256
-     FC_DIM: 1024
-     NAME: FastRCNNConvFCHead
-     NORM: ''
-     NUM_CONV: 0
-     NUM_FC: 2
-     POOLER_RESOLUTION: 7
-     POOLER_SAMPLING_RATIO: 0
-     POOLER_TYPE: ROIAlignV2
-     SMOOTH_L1_BETA: 0.0
-     TRAIN_ON_PRED_BOXES: false
-   ROI_HEADS:
-     BATCH_SIZE_PER_IMAGE: 512
-     IN_FEATURES:
-     - p2
-     - p3
-     - p4
-     - p5
-     IOU_LABELS:
-     - 0
-     - 1
-     IOU_THRESHOLDS:
-     - 0.5
-     NAME: StandardROIHeads
-     NMS_THRESH_TEST: 0.5
-     NUM_CLASSES: 80
-     POSITIVE_FRACTION: 0.25
-     PROPOSAL_APPEND_GT: true
-     SCORE_THRESH_TEST: 0.05
-   ROI_KEYPOINT_HEAD:
-     CONV_DIMS:
-     - 512
-     - 512
-     - 512
-     - 512
-     - 512
-     - 512
-     - 512
-     - 512
-     LOSS_WEIGHT: 1.0
-     MIN_KEYPOINTS_PER_IMAGE: 1
-     NAME: KRCNNConvDeconvUpsampleHead
-     NORMALIZE_LOSS_BY_VISIBLE_KEYPOINTS: true
-     NUM_KEYPOINTS: 17
-     POOLER_RESOLUTION: 14
-     POOLER_SAMPLING_RATIO: 0
-     POOLER_TYPE: ROIAlignV2
-   ROI_MASK_HEAD:
-     CLS_AGNOSTIC_MASK: false
-     CONV_DIM: 256
-     NAME: MaskRCNNConvUpsampleHead
-     NORM: ''
-     NUM_CONV: 4
-     POOLER_RESOLUTION: 14
-     POOLER_SAMPLING_RATIO: 0
-     POOLER_TYPE: ROIAlignV2
-   RPN:
-     BATCH_SIZE_PER_IMAGE: 256
-     BBOX_REG_WEIGHTS:
-     - 1.0
-     - 1.0
-     - 1.0
-     - 1.0
-     BOUNDARY_THRESH: -1
-     HEAD_NAME: StandardRPNHead
-     IN_FEATURES:
-     - p2
-     - p3
-     - p4
-     - p5
-     - p6
-     IOU_LABELS:
-     - 0
-     - -1
-     - 1
-     IOU_THRESHOLDS:
-     - 0.3
-     - 0.7
-     LOSS_WEIGHT: 1.0
-     NMS_THRESH: 0.7
-     POSITIVE_FRACTION: 0.5
-     POST_NMS_TOPK_TEST: 1000
-     POST_NMS_TOPK_TRAIN: 1000
-     PRE_NMS_TOPK_TEST: 1000
-     PRE_NMS_TOPK_TRAIN: 2000
-     SMOOTH_L1_BETA: 0.0
-   SEM_SEG_HEAD:
-     COMMON_STRIDE: 4
-     CONVS_DIM: 128
-     IGNORE_VALUE: 255
-     IN_FEATURES:
-     - p2
-     - p3
-     - p4
-     - p5
-     LOSS_WEIGHT: 1.0
-     NAME: SemSegFPNHead
-     NORM: GN
-     NUM_CLASSES: 54
-   WEIGHTS: detectron2://ImageNetPretrained/MSRA/R-50.pkl
- OUTPUT_DIR: ./output
- SEED: -1
- SOLVER:
-   BASE_LR: 0.02
-   BIAS_LR_FACTOR: 1.0
-   CHECKPOINT_PERIOD: 20000
-   GAMMA: 0.1
-   IMS_PER_BATCH: 16
-   LR_SCHEDULER_NAME: WarmupMultiStepLR
-   MAX_ITER: 60000
-   MOMENTUM: 0.9
-   STEPS:
-   - 210000
-   - 250000
-   WARMUP_FACTOR: 0.001
-   WARMUP_ITERS: 1000
-   WARMUP_METHOD: linear
-   WEIGHT_DECAY: 0.0001
-   WEIGHT_DECAY_BIAS: 0.0001
-   WEIGHT_DECAY_NORM: 0.0
- TEST:
-   AUG:
-     ENABLED: false
-     FLIP: true
-     MAX_SIZE: 4000
-     MIN_SIZES:
-     - 400
-     - 500
-     - 600
-     - 700
-     - 800
-     - 900
-     - 1000
-     - 1100
-     - 1200
-   DETECTIONS_PER_IMAGE: 100
-   EVAL_PERIOD: 0
-   EXPECTED_RESULTS: []
-   KEYPOINT_OKS_SIGMAS: []
-   PRECISE_BN:
-     ENABLED: false
-     NUM_ITER: 200
- VERSION: 2
- VIS_PERIOD: 0
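This file is a full dump of detectron2's config rather than a minimal override, so it can be loaded on its own. A minimal sketch of how it is consumed (mirroring what `tools/train_net.py` below does with `--config-file`; the relative path is illustrative):

```python
from detectron2.config import get_cfg

cfg = get_cfg()  # start from detectron2's built-in defaults
cfg.merge_from_file("configs/prima/fast_rcnn_R_50_FPN_3x.yaml")

# MODEL.MASK_ON is the only line that differs from the mask_rcnn variant below:
print(cfg.MODEL.MASK_ON)   # False here; True in mask_rcnn_R_50_FPN_3x.yaml
print(cfg.SOLVER.BASE_LR)  # 0.02
```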
model/layout-parser/configs/prima/mask_rcnn_R_50_FPN_3x.yaml DELETED
@@ -1,307 +0,0 @@
- CUDNN_BENCHMARK: false
- DATALOADER:
-   ASPECT_RATIO_GROUPING: true
-   FILTER_EMPTY_ANNOTATIONS: true
-   NUM_WORKERS: 4
-   REPEAT_THRESHOLD: 0.0
-   SAMPLER_TRAIN: TrainingSampler
- DATASETS:
-   PRECOMPUTED_PROPOSAL_TOPK_TEST: 1000
-   PRECOMPUTED_PROPOSAL_TOPK_TRAIN: 2000
-   PROPOSAL_FILES_TEST: []
-   PROPOSAL_FILES_TRAIN: []
-   TEST: []
-   TRAIN: []
- GLOBAL:
-   HACK: 1.0
- INPUT:
-   CROP:
-     ENABLED: false
-     SIZE:
-     - 0.9
-     - 0.9
-     TYPE: relative_range
-   FORMAT: BGR
-   MASK_FORMAT: polygon
-   MAX_SIZE_TEST: 1333
-   MAX_SIZE_TRAIN: 1333
-   MIN_SIZE_TEST: 800
-   MIN_SIZE_TRAIN:
-   - 640
-   - 672
-   - 704
-   - 736
-   - 768
-   - 800
-   MIN_SIZE_TRAIN_SAMPLING: choice
- MODEL:
-   ANCHOR_GENERATOR:
-     ANGLES:
-     - - -90
-       - 0
-       - 90
-     ASPECT_RATIOS:
-     - - 0.5
-       - 1.0
-       - 2.0
-     NAME: DefaultAnchorGenerator
-     OFFSET: 0.0
-     SIZES:
-     - - 32
-     - - 64
-     - - 128
-     - - 256
-     - - 512
-   BACKBONE:
-     FREEZE_AT: 2
-     NAME: build_resnet_fpn_backbone
-   DEVICE: cuda
-   FPN:
-     FUSE_TYPE: sum
-     IN_FEATURES:
-     - res2
-     - res3
-     - res4
-     - res5
-     NORM: ''
-     OUT_CHANNELS: 256
-   KEYPOINT_ON: false
-   LOAD_PROPOSALS: false
-   MASK_ON: true
-   META_ARCHITECTURE: GeneralizedRCNN
-   PANOPTIC_FPN:
-     COMBINE:
-       ENABLED: true
-       INSTANCES_CONFIDENCE_THRESH: 0.5
-       OVERLAP_THRESH: 0.5
-       STUFF_AREA_LIMIT: 4096
-     INSTANCE_LOSS_WEIGHT: 1.0
-   PIXEL_MEAN:
-   - 103.53
-   - 116.28
-   - 123.675
-   PIXEL_STD:
-   - 1.0
-   - 1.0
-   - 1.0
-   PROPOSAL_GENERATOR:
-     MIN_SIZE: 0
-     NAME: RPN
-   RESNETS:
-     DEFORM_MODULATED: false
-     DEFORM_NUM_GROUPS: 1
-     DEFORM_ON_PER_STAGE:
-     - false
-     - false
-     - false
-     - false
-     DEPTH: 50
-     NORM: FrozenBN
-     NUM_GROUPS: 1
-     OUT_FEATURES:
-     - res2
-     - res3
-     - res4
-     - res5
-     RES2_OUT_CHANNELS: 256
-     RES5_DILATION: 1
-     STEM_OUT_CHANNELS: 64
-     STRIDE_IN_1X1: true
-     WIDTH_PER_GROUP: 64
-   RETINANET:
-     BBOX_REG_WEIGHTS:
-     - 1.0
-     - 1.0
-     - 1.0
-     - 1.0
-     FOCAL_LOSS_ALPHA: 0.25
-     FOCAL_LOSS_GAMMA: 2.0
-     IN_FEATURES:
-     - p3
-     - p4
-     - p5
-     - p6
-     - p7
-     IOU_LABELS:
-     - 0
-     - -1
-     - 1
-     IOU_THRESHOLDS:
-     - 0.4
-     - 0.5
-     NMS_THRESH_TEST: 0.5
-     NUM_CLASSES: 80
-     NUM_CONVS: 4
-     PRIOR_PROB: 0.01
-     SCORE_THRESH_TEST: 0.05
-     SMOOTH_L1_LOSS_BETA: 0.1
-     TOPK_CANDIDATES_TEST: 1000
-   ROI_BOX_CASCADE_HEAD:
-     BBOX_REG_WEIGHTS:
-     - - 10.0
-       - 10.0
-       - 5.0
-       - 5.0
-     - - 20.0
-       - 20.0
-       - 10.0
-       - 10.0
-     - - 30.0
-       - 30.0
-       - 15.0
-       - 15.0
-     IOUS:
-     - 0.5
-     - 0.6
-     - 0.7
-   ROI_BOX_HEAD:
-     BBOX_REG_WEIGHTS:
-     - 10.0
-     - 10.0
-     - 5.0
-     - 5.0
-     CLS_AGNOSTIC_BBOX_REG: false
-     CONV_DIM: 256
-     FC_DIM: 1024
-     NAME: FastRCNNConvFCHead
-     NORM: ''
-     NUM_CONV: 0
-     NUM_FC: 2
-     POOLER_RESOLUTION: 7
-     POOLER_SAMPLING_RATIO: 0
-     POOLER_TYPE: ROIAlignV2
-     SMOOTH_L1_BETA: 0.0
-     TRAIN_ON_PRED_BOXES: false
-   ROI_HEADS:
-     BATCH_SIZE_PER_IMAGE: 512
-     IN_FEATURES:
-     - p2
-     - p3
-     - p4
-     - p5
-     IOU_LABELS:
-     - 0
-     - 1
-     IOU_THRESHOLDS:
-     - 0.5
-     NAME: StandardROIHeads
-     NMS_THRESH_TEST: 0.5
-     NUM_CLASSES: 80
-     POSITIVE_FRACTION: 0.25
-     PROPOSAL_APPEND_GT: true
-     SCORE_THRESH_TEST: 0.05
-   ROI_KEYPOINT_HEAD:
-     CONV_DIMS:
-     - 512
-     - 512
-     - 512
-     - 512
-     - 512
-     - 512
-     - 512
-     - 512
-     LOSS_WEIGHT: 1.0
-     MIN_KEYPOINTS_PER_IMAGE: 1
-     NAME: KRCNNConvDeconvUpsampleHead
-     NORMALIZE_LOSS_BY_VISIBLE_KEYPOINTS: true
-     NUM_KEYPOINTS: 17
-     POOLER_RESOLUTION: 14
-     POOLER_SAMPLING_RATIO: 0
-     POOLER_TYPE: ROIAlignV2
-   ROI_MASK_HEAD:
-     CLS_AGNOSTIC_MASK: false
-     CONV_DIM: 256
-     NAME: MaskRCNNConvUpsampleHead
-     NORM: ''
-     NUM_CONV: 4
-     POOLER_RESOLUTION: 14
-     POOLER_SAMPLING_RATIO: 0
-     POOLER_TYPE: ROIAlignV2
-   RPN:
-     BATCH_SIZE_PER_IMAGE: 256
-     BBOX_REG_WEIGHTS:
-     - 1.0
-     - 1.0
-     - 1.0
-     - 1.0
-     BOUNDARY_THRESH: -1
-     HEAD_NAME: StandardRPNHead
-     IN_FEATURES:
-     - p2
-     - p3
-     - p4
-     - p5
-     - p6
-     IOU_LABELS:
-     - 0
-     - -1
-     - 1
-     IOU_THRESHOLDS:
-     - 0.3
-     - 0.7
-     LOSS_WEIGHT: 1.0
-     NMS_THRESH: 0.7
-     POSITIVE_FRACTION: 0.5
-     POST_NMS_TOPK_TEST: 1000
-     POST_NMS_TOPK_TRAIN: 1000
-     PRE_NMS_TOPK_TEST: 1000
-     PRE_NMS_TOPK_TRAIN: 2000
-     SMOOTH_L1_BETA: 0.0
-   SEM_SEG_HEAD:
-     COMMON_STRIDE: 4
-     CONVS_DIM: 128
-     IGNORE_VALUE: 255
-     IN_FEATURES:
-     - p2
-     - p3
-     - p4
-     - p5
-     LOSS_WEIGHT: 1.0
-     NAME: SemSegFPNHead
-     NORM: GN
-     NUM_CLASSES: 54
-   WEIGHTS: detectron2://ImageNetPretrained/MSRA/R-50.pkl
- OUTPUT_DIR: ./output
- SEED: -1
- SOLVER:
-   BASE_LR: 0.02
-   BIAS_LR_FACTOR: 1.0
-   CHECKPOINT_PERIOD: 20000
-   GAMMA: 0.1
-   IMS_PER_BATCH: 16
-   LR_SCHEDULER_NAME: WarmupMultiStepLR
-   MAX_ITER: 60000
-   MOMENTUM: 0.9
-   STEPS:
-   - 210000
-   - 250000
-   WARMUP_FACTOR: 0.001
-   WARMUP_ITERS: 1000
-   WARMUP_METHOD: linear
-   WEIGHT_DECAY: 0.0001
-   WEIGHT_DECAY_BIAS: 0.0001
-   WEIGHT_DECAY_NORM: 0.0
- TEST:
-   AUG:
-     ENABLED: false
-     FLIP: true
-     MAX_SIZE: 4000
-     MIN_SIZES:
-     - 400
-     - 500
-     - 600
-     - 700
-     - 800
-     - 900
-     - 1000
-     - 1100
-     - 1200
-   DETECTIONS_PER_IMAGE: 100
-   EVAL_PERIOD: 0
-   EXPECTED_RESULTS: []
-   KEYPOINT_OKS_SIGMAS: []
-   PRECISE_BN:
-     ENABLED: false
-     NUM_ITER: 200
- VERSION: 2
- VIS_PERIOD: 0
model/layout-parser/requirements.txt DELETED
@@ -1,6 +0,0 @@
- layoutparser
- funcy
- bs4
- scikit-learn
- imagesize
- tqdm
model/layout-parser/scripts/train_prima.sh DELETED
@@ -1,17 +0,0 @@
- #!/bin/bash
-
- cd ../tools
-
- python convert_prima_to_coco.py \
-     --prima_datapath ../data/prima \
-     --anno_savepath ../data/prima/annotations.json
-
- python train_net.py \
-     --dataset_name prima-layout \
-     --json_annotation_train ../data/prima/annotations-train.json \
-     --image_path_train ../data/prima/Images \
-     --json_annotation_val ../data/prima/annotations-val.json \
-     --image_path_val ../data/prima/Images \
-     --config-file ../configs/prima/mask_rcnn_R_50_FPN_3x.yaml \
-     OUTPUT_DIR ../outputs/prima/mask_rcnn_R_50_FPN_3x/ \
-     SOLVER.IMS_PER_BATCH 2
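Note that the trailing `OUTPUT_DIR ...` and `SOLVER.IMS_PER_BATCH 2` tokens are not `--flags`: detectron2's `default_argument_parser` collects them as free-form `opts`, which `setup()` in `tools/train_net.py` applies with `cfg.merge_from_list`. A minimal sketch of the same override in Python:

```python
from detectron2.config import get_cfg

cfg = get_cfg()
# Alternating KEY VALUE tokens, exactly as passed on the command line above;
# merge_from_list parses the string "2" into the integer 2.
cfg.merge_from_list([
    "OUTPUT_DIR", "../outputs/prima/mask_rcnn_R_50_FPN_3x/",
    "SOLVER.IMS_PER_BATCH", "2",
])
print(cfg.OUTPUT_DIR, cfg.SOLVER.IMS_PER_BATCH)
```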
model/layout-parser/tools/convert_prima_to_coco.py DELETED
@@ -1,225 +0,0 @@
- import os, re, json
- import imagesize
- from glob import glob
- from bs4 import BeautifulSoup
- import numpy as np
- from PIL import Image
- import argparse
- from tqdm import tqdm
- import sys
- sys.path.append('..')
- from utils import cocosplit
-
- class NpEncoder(json.JSONEncoder):
-     def default(self, obj):
-         if isinstance(obj, np.integer):
-             return int(obj)
-         elif isinstance(obj, np.floating):
-             return float(obj)
-         elif isinstance(obj, np.ndarray):
-             return obj.tolist()
-         else:
-             return super(NpEncoder, self).default(obj)
-
- def cvt_coords_to_array(obj):
-
-     return np.array(
-         [(float(pt['x']), float(pt['y']))
-          for pt in obj.find_all("Point")]
-     )
-
- def cal_polyarea(points):
-     x = points[:, 0]
-     y = points[:, 1]
-     return 0.5 * np.abs(np.dot(x, np.roll(y, 1)) - np.dot(y, np.roll(x, 1)))
-
- def _create_category(schema=0):
-
-     if schema == 0:
-
-         categories = \
-             [{"supercategory": "layout", "id": 0, "name": "Background"},
-              {"supercategory": "layout", "id": 1, "name": "TextRegion"},
-              {"supercategory": "layout", "id": 2, "name": "ImageRegion"},
-              {"supercategory": "layout", "id": 3, "name": "TableRegion"},
-              {"supercategory": "layout", "id": 4, "name": "MathsRegion"},
-              {"supercategory": "layout", "id": 5, "name": "SeparatorRegion"},
-              {"supercategory": "layout", "id": 6, "name": "OtherRegion"}]
-
-         find_categories = lambda name: \
-             [val["id"] for val in categories if val['name'] == name][0]
-
-         conversion = \
-             {
-                 'TextRegion': find_categories("TextRegion"),
-                 'TableRegion': find_categories("TableRegion"),
-                 'MathsRegion': find_categories("MathsRegion"),
-                 'ChartRegion': find_categories("ImageRegion"),
-                 'GraphicRegion': find_categories("ImageRegion"),
-                 'ImageRegion': find_categories("ImageRegion"),
-                 'LineDrawingRegion': find_categories("OtherRegion"),
-                 'SeparatorRegion': find_categories("SeparatorRegion"),
-                 'NoiseRegion': find_categories("OtherRegion"),
-                 'FrameRegion': find_categories("OtherRegion"),
-             }
-
-         return categories, conversion
-
- _categories, _categories_conversion = _create_category(schema=0)
-
- _info = {
-     "description": "PRIMA Layout Analysis Dataset",
-     "url": "https://www.primaresearch.org/datasets/Layout_Analysis",
-     "version": "1.0",
-     "year": 2010,
-     "contributor": "PRIMA Research",
-     "date_created": "2020/09/01",
- }
-
- def _load_soup(filename):
-     with open(filename, "r") as fp:
-         soup = BeautifulSoup(fp.read(), 'xml')
-
-     return soup
-
- def _image_template(image_id, image_path):
-
-     width, height = imagesize.get(image_path)
-
-     return {
-         "file_name": os.path.basename(image_path),
-         "height": height,
-         "width": width,
-         "id": int(image_id)
-     }
-
- def _anno_template(anno_id, image_id, pts, obj_tag):
-
-     x_1, x_2 = pts[:, 0].min(), pts[:, 0].max()
-     y_1, y_2 = pts[:, 1].min(), pts[:, 1].max()
-     height = y_2 - y_1
-     width = x_2 - x_1
-
-     return {
-         "segmentation": [pts.flatten().tolist()],
-         "area": cal_polyarea(pts),
-         "iscrowd": 0,
-         "image_id": image_id,
-         "bbox": [x_1, y_1, width, height],
-         "category_id": _categories_conversion[obj_tag],
-         "id": anno_id
-     }
-
- class PRIMADataset():
-
-     def __init__(self, base_path, anno_path='XML',
-                  image_path='Images'):
-
-         self.base_path = base_path
-         self.anno_path = os.path.join(base_path, anno_path)
-         self.image_path = os.path.join(base_path, image_path)
-
-         self._ids = self.find_all_image_ids()
-
-     def __len__(self):
-         return len(self._ids)
-
-     def __getitem__(self, idx):
-         return self.load_image_and_annotation(idx)
-
-     def find_all_annotation_files(self):
-         return glob(os.path.join(self.anno_path, '*.xml'))
-
-     def find_all_image_ids(self):
-         replacer = lambda s: os.path.basename(s).replace('pc-', '').replace('.xml', '')
-         return [replacer(s) for s in self.find_all_annotation_files()]
-
-     def load_image_and_annotation(self, idx):
-
-         image_id = self._ids[idx]
-
-         image_path = os.path.join(self.image_path, f'{image_id}.tif')
-         image = Image.open(image_path)
-
-         anno = self.load_annotation(idx)
-
-         return image, anno
-
-     def load_annotation(self, idx):
-         image_id = self._ids[idx]
-
-         anno_path = os.path.join(self.anno_path, f'pc-{image_id}.xml')
-         # A dirty hack to load files both with and without the pc- prefix
-         if not os.path.exists(anno_path):
-             anno_path = os.path.join(self.anno_path, f'{image_id}.xml')
-         assert os.path.exists(anno_path), "Invalid path"
-         anno = _load_soup(anno_path)
-
-         return anno
-
-     def convert_to_COCO(self, save_path):
-
-         all_image_infos = []
-         all_anno_infos = []
-         anno_id = 0
-
-         for idx, image_id in enumerate(tqdm(self._ids)):
-
-             # We use the idx as the image id
-
-             image_path = os.path.join(self.image_path, f'{image_id}.tif')
-             image_info = _image_template(idx, image_path)
-             all_image_infos.append(image_info)
-
-             anno = self.load_annotation(idx)
-
-             for item in anno.find_all(re.compile(".*Region")):
-
-                 pts = cvt_coords_to_array(item.Coords)
-                 if 0 not in pts.shape:
-                     # Sometimes there are polygons with fewer than 4
-                     # points, which cannot be appropriately handled by
-                     # the COCO format. So we just drop them.
-                     if pts.shape[0] >= 4:
-                         anno_info = _anno_template(anno_id, idx, pts, item.name)
-                         all_anno_infos.append(anno_info)
-                         anno_id += 1
-
-
-         final_annotation = {
-             "info": _info,
-             "licenses": [],
-             "images": all_image_infos,
-             "annotations": all_anno_infos,
-             "categories": _categories}
-
-         with open(save_path, 'w') as fp:
-             json.dump(final_annotation, fp, cls=NpEncoder)
-
-         return final_annotation
-
-
- parser = argparse.ArgumentParser()
-
- parser.add_argument('--prima_datapath', type=str, default='./data/prima', help='the path to the prima data folders')
- parser.add_argument('--anno_savepath', type=str, default='./annotations.json', help='the path to save the new annotations')
-
-
- if __name__ == "__main__":
-     args = parser.parse_args()
-
-     print("Start running the conversion script")
-
-     print(f"Loading the information from the path {args.prima_datapath}")
-     dataset = PRIMADataset(args.prima_datapath)
-
-     print(f"Saving the annotation to {args.anno_savepath}")
-     res = dataset.convert_to_COCO(args.anno_savepath)
-
-     cocosplit.main(
-         args.anno_savepath,
-         split_ratio=0.8,
-         having_annotations=True,
-         train_save_path=args.anno_savepath.replace('.json', '-train.json'),
-         test_save_path=args.anno_savepath.replace('.json', '-val.json'),
-         random_state=24)
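The `cal_polyarea` helper above is the shoelace formula, area = 0.5 * |sum_i (x_i * y_{i+1} - x_{i+1} * y_i)|, which the `np.roll` calls express without an explicit loop. A standalone sanity check (numpy only):

```python
import numpy as np

def cal_polyarea(points):
    # Shoelace formula, same as the helper in convert_prima_to_coco.py
    x = points[:, 0]
    y = points[:, 1]
    return 0.5 * np.abs(np.dot(x, np.roll(y, 1)) - np.dot(y, np.roll(x, 1)))

# A 2 x 3 axis-aligned rectangle has area 6, regardless of vertex order:
rect = np.array([(0.0, 0.0), (2.0, 0.0), (2.0, 3.0), (0.0, 3.0)])
assert cal_polyarea(rect) == 6.0
```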
model/layout-parser/tools/train_net.py DELETED
@@ -1,229 +0,0 @@
- """
- The script is based on https://github.com/facebookresearch/detectron2/blob/master/tools/train_net.py.
- """
-
- import logging
- import os
- import json
- from collections import OrderedDict
- import detectron2.utils.comm as comm
- import detectron2.data.transforms as T
- from detectron2.checkpoint import DetectionCheckpointer
- from detectron2.config import get_cfg
- from detectron2.data import DatasetMapper, build_detection_train_loader
-
- from detectron2.data.datasets import register_coco_instances
-
- from detectron2.engine import (
-     DefaultTrainer,
-     default_argument_parser,
-     default_setup,
-     hooks,
-     launch,
- )
- from detectron2.evaluation import (
-     COCOEvaluator,
-     verify_results,
- )
- from detectron2.modeling import GeneralizedRCNNWithTTA
- import pandas as pd
-
-
- def get_augs(cfg):
-     """Add all the desired augmentations here. A list of available augmentations
-     can be found here:
-     https://detectron2.readthedocs.io/en/latest/modules/data_transforms.html
-     """
-     augs = [
-         T.ResizeShortestEdge(
-             cfg.INPUT.MIN_SIZE_TRAIN,
-             cfg.INPUT.MAX_SIZE_TRAIN,
-             cfg.INPUT.MIN_SIZE_TRAIN_SAMPLING,
-         )
-     ]
-     if cfg.INPUT.CROP.ENABLED:
-         augs.append(
-             T.RandomCrop_CategoryAreaConstraint(
-                 cfg.INPUT.CROP.TYPE,
-                 cfg.INPUT.CROP.SIZE,
-                 cfg.INPUT.CROP.SINGLE_CATEGORY_MAX_AREA,
-                 cfg.MODEL.SEM_SEG_HEAD.IGNORE_VALUE,
-             )
-         )
-     horizontal_flip: bool = cfg.INPUT.RANDOM_FLIP == "horizontal"
-     augs.append(T.RandomFlip(horizontal=horizontal_flip, vertical=not horizontal_flip))
-     # Rotate the image between -90 and 0 degrees clockwise around the centre
-     augs.append(T.RandomRotation(angle=[-90.0, 0.0]))
-     return augs
-
-
- class Trainer(DefaultTrainer):
-     """
-     We use the "DefaultTrainer" which contains pre-defined default logic for
-     the standard training workflow. It may not work for you, especially if you
-     are working on a new research project. In that case you can use the cleaner
-     "SimpleTrainer", or write your own training loop. You can use
-     "tools/plain_train_net.py" as an example.
-
-     Adapted from:
-     https://github.com/facebookresearch/detectron2/blob/master/projects/DeepLab/train_net.py
-     """
-
-     @classmethod
-     def build_train_loader(cls, cfg):
-         mapper = DatasetMapper(cfg, is_train=True, augmentations=get_augs(cfg))
-         return build_detection_train_loader(cfg, mapper=mapper)
-
-     @classmethod
-     def build_evaluator(cls, cfg, dataset_name, output_folder=None):
-         """
-         Returns:
-             DatasetEvaluator or None
-
-         It is not implemented by default.
-         """
-         return COCOEvaluator(dataset_name, cfg, True, output_folder)
-
-     @classmethod
-     def test_with_TTA(cls, cfg, model):
-         logger = logging.getLogger("detectron2.trainer")
-         # At the end of training, run an evaluation with TTA.
-         # Only supports some R-CNN models.
-         logger.info("Running inference with test-time augmentation ...")
-         model = GeneralizedRCNNWithTTA(cfg, model)
-         evaluators = [
-             cls.build_evaluator(
-                 cfg, name, output_folder=os.path.join(cfg.OUTPUT_DIR, "inference_TTA")
-             )
-             for name in cfg.DATASETS.TEST
-         ]
-         res = cls.test(cfg, model, evaluators)
-         res = OrderedDict({k + "_TTA": v for k, v in res.items()})
-         return res
-
-     @classmethod
-     def eval_and_save(cls, cfg, model):
-         evaluators = [
-             cls.build_evaluator(
-                 cfg, name, output_folder=os.path.join(cfg.OUTPUT_DIR, "inference")
-             )
-             for name in cfg.DATASETS.TEST
-         ]
-         res = cls.test(cfg, model, evaluators)
-         pd.DataFrame(res).to_csv(os.path.join(cfg.OUTPUT_DIR, "eval.csv"))
-         return res
-
-
- def setup(args):
-     """
-     Create configs and perform basic setups.
-     """
-     cfg = get_cfg()
-
-     if args.config_file != "":
-         cfg.merge_from_file(args.config_file)
-     cfg.merge_from_list(args.opts)
-
-     with open(args.json_annotation_train, "r") as fp:
-         anno_file = json.load(fp)
-
-     cfg.MODEL.ROI_HEADS.NUM_CLASSES = len(anno_file["categories"])
-     del anno_file
-
-     cfg.DATASETS.TRAIN = (f"{args.dataset_name}-train",)
-     cfg.DATASETS.TEST = (f"{args.dataset_name}-val",)
-     cfg.freeze()
-     default_setup(cfg, args)
-     return cfg
-
-
- def main(args):
-     # Register the datasets
-     register_coco_instances(
-         f"{args.dataset_name}-train",
-         {},
-         args.json_annotation_train,
-         args.image_path_train,
-     )
-
-     register_coco_instances(
-         f"{args.dataset_name}-val",
-         {},
-         args.json_annotation_val,
-         args.image_path_val,
-     )
-     cfg = setup(args)
-
-     if args.eval_only:
-         model = Trainer.build_model(cfg)
-         DetectionCheckpointer(model, save_dir=cfg.OUTPUT_DIR).resume_or_load(
-             cfg.MODEL.WEIGHTS, resume=args.resume
-         )
-         res = Trainer.test(cfg, model)
-
-         if cfg.TEST.AUG.ENABLED:
-             res.update(Trainer.test_with_TTA(cfg, model))
-         if comm.is_main_process():
-             verify_results(cfg, res)
-
-         # Save the evaluation results
-         pd.DataFrame(res).to_csv(f"{cfg.OUTPUT_DIR}/eval.csv")
-         return res
-
-     # Ensure that the output directory exists
-     os.makedirs(cfg.OUTPUT_DIR, exist_ok=True)
-
-     """
-     If you'd like to do anything fancier than the standard training logic,
-     consider writing your own training loop (see plain_train_net.py) or
-     subclassing the trainer.
-     """
-     trainer = Trainer(cfg)
-     trainer.resume_or_load(resume=args.resume)
-     trainer.register_hooks(
-         [hooks.EvalHook(0, lambda: trainer.eval_and_save(cfg, trainer.model))]
-     )
-     if cfg.TEST.AUG.ENABLED:
-         trainer.register_hooks(
-             [hooks.EvalHook(0, lambda: trainer.test_with_TTA(cfg, trainer.model))]
-         )
-     return trainer.train()
-
-
- if __name__ == "__main__":
-     parser = default_argument_parser()
-
-     # Extra configurations for dataset names and paths
-     parser.add_argument(
-         "--dataset_name",
-         help="The Dataset Name")
-     parser.add_argument(
-         "--json_annotation_train",
-         help="The path to the training set JSON annotation",
-     )
-     parser.add_argument(
-         "--image_path_train",
-         help="The path to the training set image folder",
-     )
-     parser.add_argument(
-         "--json_annotation_val",
-         help="The path to the validation set JSON annotation",
-     )
-     parser.add_argument(
-         "--image_path_val",
-         help="The path to the validation set image folder",
-     )
-     args = parser.parse_args()
-     print("Command Line Args:", args)
-
-     # Dataset registration is moved into the main function to support multi-GPU training.
-     # See ref https://github.com/facebookresearch/detectron2/issues/253#issuecomment-554216517
-
-     launch(
-         main,
-         args.num_gpus,
-         num_machines=args.num_machines,
-         machine_rank=args.machine_rank,
-         dist_url=args.dist_url,
-         args=(args,),
-     )
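With the configs shipped here (`CROP.ENABLED: false`, six training scales, `choice` sampling), `get_augs` effectively builds the list written out below. This is a sketch, assuming detectron2's default `INPUT.RANDOM_FLIP` value of `"horizontal"`:

```python
import detectron2.data.transforms as T

augs = [
    # Resize the shorter edge to one of MIN_SIZE_TRAIN, capped at MAX_SIZE_TRAIN
    T.ResizeShortestEdge((640, 672, 704, 736, 768, 800), 1333, "choice"),
    # Horizontal flip only, per the assumed RANDOM_FLIP default
    T.RandomFlip(horizontal=True, vertical=False),
    # Rotate between -90 and 0 degrees around the centre
    T.RandomRotation(angle=[-90.0, 0.0]),
]
```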
model/layout-parser/utils/__init__.py DELETED
File without changes
model/layout-parser/utils/cocosplit.py DELETED
@@ -1,112 +0,0 @@
- # Modified based on https://github.com/akarazniewicz/cocosplit/blob/master/cocosplit.py
-
- import json
- import argparse
- import funcy
- from sklearn.model_selection import train_test_split
-
- parser = argparse.ArgumentParser(
-     description="Splits a COCO annotations file into training and test sets."
- )
- parser.add_argument(
-     "--annotation-path",
-     metavar="coco_annotations",
-     type=str,
-     help="Path to COCO annotations file.",
- )
- parser.add_argument(
-     "--train", type=str, help="Where to store COCO training annotations"
- )
- parser.add_argument("--test", type=str, help="Where to store COCO test annotations")
- parser.add_argument(
-     "--split-ratio",
-     dest="split_ratio",
-     type=float,
-     required=True,
-     help="A percentage of a split; a number in (0, 1)",
- )
- parser.add_argument(
-     "--having-annotations",
-     dest="having_annotations",
-     action="store_true",
-     help="Ignore all images without annotations. Keep only those with at least one annotation",
- )
-
-
- def save_coco(file, tagged_data):
-     with open(file, "wt", encoding="UTF-8") as coco:
-         json.dump(tagged_data, coco, indent=2, sort_keys=True)
-
-
- def filter_annotations(annotations, images):
-     image_ids = funcy.lmap(lambda i: int(i["id"]), images)
-     return funcy.lfilter(lambda a: int(a["image_id"]) in image_ids, annotations)
-
-
- def main(
-     annotation_path,
-     split_ratio,
-     having_annotations,
-     train_save_path,
-     test_save_path,
-     random_state=None,
- ):
-
-     with open(annotation_path, "rt", encoding="UTF-8") as annotations:
-         coco = json.load(annotations)
-
-     images = coco["images"]
-     annotations = coco["annotations"]
-
-     ids_with_annotations = funcy.lmap(lambda a: int(a["image_id"]), annotations)
-
-     # Images with annotations
-     img_ann = funcy.lremove(lambda i: i["id"] not in ids_with_annotations, images)
-     tr_ann, ts_ann = train_test_split(
-         img_ann, train_size=split_ratio, random_state=random_state
-     )
-
-     img_wo_ann = funcy.lremove(lambda i: i["id"] in ids_with_annotations, images)
-     if len(img_wo_ann) > 0:
-         tr_wo_ann, ts_wo_ann = train_test_split(
-             img_wo_ann, train_size=split_ratio, random_state=random_state
-         )
-     else:
-         tr_wo_ann, ts_wo_ann = [], []  # Images without annotations
-
-     if having_annotations:
-         tr, ts = tr_ann, ts_ann
-
-     else:
-         # Merge the two image lists (i.e. with and without annotations)
-         tr_ann.extend(tr_wo_ann)
-         ts_ann.extend(ts_wo_ann)
-
-         tr, ts = tr_ann, ts_ann
-
-     # Train data
-     coco.update({"images": tr, "annotations": filter_annotations(annotations, tr)})
-     save_coco(train_save_path, coco)
-
-     # Test data
-     coco.update({"images": ts, "annotations": filter_annotations(annotations, ts)})
-     save_coco(test_save_path, coco)
-
-     print(
-         "Saved {} entries in {} and {} in {}".format(
-             len(tr), train_save_path, len(ts), test_save_path
-         )
-     )
-
-
- if __name__ == "__main__":
-     args = parser.parse_args()
-
-     main(
-         args.annotation_path,
-         args.split_ratio,
-         args.having_annotations,
-         args.train,
-         args.test,
-         random_state=24,
-     )
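Besides the CLI defined above, the module is also driven programmatically: the call below mirrors the one at the bottom of `tools/convert_prima_to_coco.py` (paths illustrative).

```python
from utils import cocosplit

cocosplit.main(
    "data/prima/annotations.json",
    split_ratio=0.8,                  # 80% train / 20% validation
    having_annotations=True,          # keep only images that have annotations
    train_save_path="data/prima/annotations-train.json",
    test_save_path="data/prima/annotations-val.json",
    random_state=24,                  # fixed seed for a reproducible split
)
```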