---
license: mit
---

# LayoutReader

**TODO:**

1. Upload models to Hugging Face
2. Explain the motivation for this repo
3. Explain the new dataset
4. Build a Docker image

## Helper

### Build Dataset

```bash
python tools.py cache-dataset-spans --help
```

### Train

```bash
bash train.sh
```

### Eval

```bash
python eval.py --help
```

## Spans-Level Results

Each bbox contains multiple tokens. Bboxes are usually obtained by parsing a PDF file. Training data is generated by `tools.py`.

> Only the first part of the test file is used.

| Method                     | shuf | BLEU Idx | BLEU Token |
|----------------------------|------|----------|------------|
| Heuristic Method           | no   | 44.4     | 70.7       |
| LayoutReader (layout only) | no   | 95.3     | 97.8       |
| LayoutReader (layout only) | yes  | 95.0     | 97.6       |

## Tokens-Level Results

Each bbox contains only one token.

### New eval script

> Only the first part of the test file is used.

| Method                      | shuf | BLEU Idx | BLEU Token |
|-----------------------------|------|----------|------------|
| Heuristic Method            | no   | 78.3     | 79.4       |
| LayoutReader (layout only)  | no   | 98.0     | 98.2       |
| LayoutReader (layout only)  | yes  | 97.8     | 98.0       |
| LayoutReader (public model) | no   | 98.0     | 98.3       |

### Old eval script (from original paper)

* Evaluation results of LayoutReader on the reading order detection task, where the source side of the training/testing data is in left-to-right and top-to-bottom order.

| Method                     | Encoder                | BLEU   | ARD  |
|----------------------------|------------------------|--------|------|
| Heuristic Method           | -                      | 0.6972 | 8.46 |
| LayoutReader (layout only) | LayoutLM (layout only) | 0.9732 | 2.31 |
| LayoutReader               | LayoutLM               | 0.9819 | 1.75 |

* Input order study with left-to-right and top-to-bottom inputs in evaluation, where r is the proportion of shuffled samples in training.

| Method                     | BLEU (r=100%) | BLEU (r=50%) | BLEU (r=0%) | ARD (r=100%) | ARD (r=50%) | ARD (r=0%) |
|----------------------------|---------------|--------------|-------------|--------------|-------------|------------|
| LayoutReader (layout only) | 0.9701        | 0.9729       | 0.9732      | 2.85         | 2.61        | 2.31       |
| LayoutReader               | 0.9765        | 0.9788       | 0.9819      | 2.50         | 2.24        | 1.75       |

* Input order study with token-shuffled inputs in evaluation, where r is the proportion of shuffled samples in training.

| Method                     | BLEU (r=100%) | BLEU (r=50%) | BLEU (r=0%) | ARD (r=100%) | ARD (r=50%) | ARD (r=0%) |
|----------------------------|---------------|--------------|-------------|--------------|-------------|------------|
| LayoutReader (layout only) | 0.9718        | 0.9714       | 0.1331      | 2.72         | 2.82        | 105.40     |
| LayoutReader               | 0.9772        | 0.9770       | 0.1783      | 2.48         | 2.46        | 72.94      |
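
The two input orders compared in the tables above (left-to-right and top-to-bottom versus shuffled) can be illustrated with a minimal sketch. This is not the code used by `tools.py` or `train.sh`: the function names `sort_ltr_ttb` and `build_source_order`, the `(x0, y0, x1, y1)` bbox layout, and the per-sample coin flip used to approximate the proportion r are all assumptions made for illustration.

```python
import random
from typing import List, Sequence, Tuple

# Assumed bbox layout: (x0, y0, x1, y1) with the origin at the top-left corner.
Box = Tuple[int, int, int, int]


def sort_ltr_ttb(boxes: Sequence[Box]) -> List[int]:
    """Return box indices ordered top-to-bottom, then left-to-right.

    Hypothetical helper: ties on the top edge (y0) are broken by the
    left edge (x0). Real pages may need a tolerance on y0 to group lines.
    """
    return sorted(range(len(boxes)), key=lambda i: (boxes[i][1], boxes[i][0]))


def build_source_order(
    boxes: Sequence[Box], r: float, rng: random.Random
) -> List[int]:
    """Return the source-side order for one training sample.

    With probability r the order is shuffled, otherwise it stays in
    left-to-right, top-to-bottom order. Over many samples this
    approximates "a proportion r of shuffled samples".
    """
    order = sort_ltr_ttb(boxes)
    if rng.random() < r:
        rng.shuffle(order)  # in-place shuffle of the index list
    return order


if __name__ == "__main__":
    boxes = [(50, 10, 90, 20), (10, 10, 40, 20), (10, 30, 40, 40)]
    rng = random.Random(0)
    print(sort_ltr_ttb(boxes))
    print(build_source_order(boxes, r=0.5, rng=rng))
```

Under these assumptions, `r=0.0` always yields the sorted order and `r=1.0` always yields a permutation of it, which corresponds to the r=0% and r=100% columns in the tables.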