hantian/layoutreader

hantian committed on Feb 28

Commit ba3b3cb • 1 Parent(s): 28bbc69

Update README.md

Files changed (1)

README.md +83 -0


@@ -1,3 +1,86 @@
 ---
 license: mit
 ---

+# LayoutReader
+**TODO:**
+1. Upload models to Hugging Face
+2. Explain why this repo exists
+3. Explain the new dataset
+4. Build a Docker image
+## Helper
+### Build Dataset
+```bash
+python tools.py cache-dataset-spans --help
+```
+### Train
+```bash
+bash train.sh
+```
+### Eval
+```bash
+python eval.py --help
+```
+## Spans-Level Results
+Each bbox contains multiple tokens. Such bboxes are usually obtained by parsing a PDF file. The training data is generated by `tools.py`.
+
+> Only the first part of the test file is used.
+
+| Method                     | shuf | BLEU Idx | BLEU Token |
+|----------------------------|------|----------|------------|
+| Heuristic Method           | no   | 44.4     | 70.7       |
+| LayoutReader (layout only) | no   | 95.3     | 97.8       |
+| LayoutReader (layout only) | yes  | 95.0     | 97.6       |
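
The Heuristic Method row can be read as a purely geometric baseline. The following is a minimal sketch, assuming the heuristic sorts boxes top-to-bottom and then left-to-right; the `(left, top, right, bottom)` box layout and the function name `heuristic_order` are illustrative and are not taken from this repo.

```python
from typing import List, Tuple

# Assumed box layout for illustration: (left, top, right, bottom).
Box = Tuple[int, int, int, int]


def heuristic_order(boxes: List[Box]) -> List[int]:
    """Return box indices sorted by top edge first, then by left edge."""
    return sorted(range(len(boxes)), key=lambda i: (boxes[i][1], boxes[i][0]))


# Two boxes on the first line (right one listed first), one box on the second line.
boxes = [(300, 10, 400, 30), (10, 10, 100, 30), (10, 50, 100, 70)]
print(heuristic_order(boxes))  # [1, 0, 2]
```

A sort like this interleaves lines from adjacent columns on multi-column pages, which is one reason a learned reading-order model can outperform it.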
+## Tokens-Level Results
+Each bbox contains exactly one token.
+### New eval script
+> Only the first part of the test file is used.
+
+| Method                      | shuf | BLEU Idx | BLEU Token |
+|-----------------------------|------|----------|------------|
+| Heuristic Method            | no   | 78.3     | 79.4       |
+| LayoutReader (layout only)  | no   | 98.0     | 98.2       |
+| LayoutReader (layout only)  | yes  | 97.8     | 98.0       |
+| LayoutReader (public model) | no   | 98.0     | 98.3       |
+### Old eval script (from original paper)
+* Evaluation results of LayoutReader on the reading order detection task, where the source side of the training/testing
+  data is in left-to-right, top-to-bottom order.
+
+| Method                     | Encoder                | BLEU   | ARD  |
+|----------------------------|------------------------|--------|------|
+| Heuristic Method           | -                      | 0.6972 | 8.46 |
+| LayoutReader (layout only) | LayoutLM (layout only) | 0.9732 | 2.31 |
+| LayoutReader               | LayoutLM               | 0.9819 | 1.75 |
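
ARD in the table above stands for Average Relative Distance (lower is better). The following is a minimal sketch, assuming the definition from the original LayoutReader paper: each element of the reference order contributes the absolute difference between its reference position and its position in the predicted order, and an element missing from the prediction contributes `n`, the reference length. The function name `ard` is hypothetical, elements are assumed to be unique (for example token indices), and this is not the repo's eval script.

```python
from typing import Hashable, Sequence


def ard(reference: Sequence[Hashable], predicted: Sequence[Hashable]) -> float:
    """Average Relative Distance between a reference and a predicted order."""
    n = len(reference)
    if n == 0:
        return 0.0
    # Position of each element in the predicted order (elements assumed unique).
    position = {elem: idx for idx, elem in enumerate(predicted)}
    total = 0
    for k, elem in enumerate(reference):
        # Missing elements are penalised with n, the reference length.
        total += abs(k - position[elem]) if elem in position else n
    return total / n


print(ard([0, 1, 2, 3], [0, 1, 2, 3]))  # 0.0
print(ard([0, 1, 2, 3], [1, 0, 2, 3]))  # 0.5
print(ard([0, 1, 2], [0, 1]))           # 1.0
```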
+* Input order study with left-to-right and top-to-bottom inputs in evaluation, where r is the proportion of
+  shuffled samples in training.
+
+| Method                     | BLEU (r=100%) | BLEU (r=50%) | BLEU (r=0%) | ARD (r=100%) | ARD (r=50%) | ARD (r=0%) |
+|----------------------------|---------------|--------------|-------------|--------------|-------------|------------|
+| LayoutReader (layout only) | 0.9701        | 0.9729       | 0.9732      | 2.85         | 2.61        | 2.31       |
+| LayoutReader               | 0.9765        | 0.9788       | 0.9819      | 2.50         | 2.24        | 1.75       |
+* Input order study with token-shuffled inputs in evaluation, where r is the proportion of shuffled samples in training.
+
+| Method                     | BLEU (r=100%) | BLEU (r=50%) | BLEU (r=0%) | ARD (r=100%) | ARD (r=50%) | ARD (r=0%) |
+|----------------------------|---------------|--------------|-------------|--------------|-------------|------------|
+| LayoutReader (layout only) | 0.9718        | 0.9714       | 0.1331      | 2.72         | 2.82        | 105.40     |
+| LayoutReader               | 0.9772        | 0.9770       | 0.1783      | 2.48         | 2.46        | 72.94      |
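
The parameter r in the input order studies can be illustrated with a small data-preparation sketch. It assumes each training sample is a list of tokens and that a sample is shuffled with probability `r`; whether the original training code picks an exact fraction of samples or flips a coin per sample is not stated here, so the per-sample coin flip, the function name `shuffle_proportion`, and the `seed` parameter are assumptions for illustration only.

```python
import random
from typing import List, Sequence


def shuffle_proportion(samples: Sequence[Sequence[str]], r: float, seed: int = 0) -> List[List[str]]:
    """Return copies of the samples, shuffling each one's tokens with probability r."""
    rng = random.Random(seed)
    out = []
    for tokens in samples:
        tokens = list(tokens)  # copy so the input is left untouched
        if rng.random() < r:
            rng.shuffle(tokens)
        out.append(tokens)
    return out


samples = [["a", "b", "c", "d"], ["e", "f", "g"]]
print(shuffle_proportion(samples, r=0.0))  # [['a', 'b', 'c', 'd'], ['e', 'f', 'g']]
```

With `r=0.0` no training sample is shuffled, which corresponds to the r=0% columns; in the table above that setting drops to BLEU 0.1331 and 0.1783 when the evaluation inputs are token-shuffled.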