Tensorpacks Cascade-RCNN with FPN and Group Normalization on ResNext32xd4-50 trained on Pubtabnet for Semantic Segmentation of tables.

The model and its training code has been mainly taken from: Tensorpack .

Regarding the dataset, please check: Xu Zhong et. all. - Image-based table recognition: data, model, and evaluation.

The model has been trained on detecting cells from tables. Note, that the datasets contains tables only. Therefore, it is required to perform a table detection task before detecting cells.

The code has been adapted so that it can be used in a deepdoctection pipeline.

How this model can be used

This model can be used with the deepdoctection in a full pipeline, along with table recognition and OCR. Check the general instruction following this Get_started tutorial.

This is an inference model only

To reduce the size of the checkpoint we removed all variables that are not necessary for inference. Therefore it cannot be used for fine-tuning. To fine tune this model please check this model .

How this model was trained.

To recreate the model run on the deepdoctection framework, run:


>>> import os
>>> from deep_doctection.datasets import DatasetRegistry
>>> from deep_doctection.eval import MetricRegistry
>>> from deep_doctection.utils import get_configs_dir_path
>>> from deep_doctection.train import train_faster_rcnn

pubtabnet = DatasetRegistry.get_dataset("pubtabnet")
pubtabnet.dataflow.categories.filter_categories(categories="CELL")
    
path_config_yaml=os.path.join(get_configs_dir_path(),"tp/cell/conf_frcnn_cell.yaml")
path_weights = ""
    
dataset_train = pubtabnet
config_overwrite=["TRAIN.STEPS_PER_EPOCH=500","TRAIN.STARTING_EPOCH=1",
                  "TRAIN.CHECKPOINT_PERIOD=50","BACKBONE.FREEZE_AT=0", "PREPROC.TRAIN_SHORT_EDGE_SIZE=[200,600]"]
build_train_config=["max_datapoints=500000"]
dataset_val = pubtabnet
build_val_config = ["max_datapoints=4000"]
    
coco_metric = MetricRegistry.get_metric("coco")
coco_metric.set_params(max_detections=[50,200,600], area_range=[[0,1000000],[0,200],[200,800],[800,1000000]])

train_faster_rcnn(path_config_yaml=path_config_yaml,
                  dataset_train=dataset_train,
                  path_weights=path_weights,
                  config_overwrite=config_overwrite,
                  log_dir="/path/to/dir",
                  build_train_config=build_train_config,
                  dataset_val=dataset_val,
                  build_val_config=build_val_config,
                  metric=coco_metric,
                  pipeline_component_name="ImageLayoutService"
                  )

How to fine-tune this model

To fine tune this model, please check this Fine-tune tutorial.