---
license: mit
---
# LayoutReader
**TODO:**
1. Upload models to Hugging Face
2. Explain the motivation for this repo
3. Describe the new dataset
4. Build a Docker image
## Helper
### Build Dataset
```bash
python tools.py cache-dataset-spans --help
```
### Train
```bash
bash train.sh
```
### Eval
```bash
python eval.py --help
```
## Span-Level Results
Each bbox contains multiple tokens. The bboxes are usually obtained by parsing a PDF file. Training data is generated by `tools.py`.
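To make the span-level setup concrete, here is a minimal sketch of a span as one bbox plus its tokens, and of how an ordering over spans induces an ordering over tokens. The `Span` class, its field names, and `tokens_in_order` are hypothetical and only for illustration; they are not the format that `tools.py` actually writes.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Span:
    # Hypothetical record layout, not taken from tools.py.
    bbox: Tuple[int, int, int, int]  # (left, top, right, bottom)
    tokens: List[str]                # all tokens covered by this bbox


def tokens_in_order(spans: List[Span], order: List[int]) -> List[str]:
    """Flatten spans into a token sequence, visiting spans in `order`."""
    return [tok for idx in order for tok in spans[idx].tokens]


spans = [
    Span((72, 90, 300, 104), ["Reading", "order", "detection"]),
    Span((72, 110, 180, 124), ["with", "LayoutReader"]),
]

# Reading the spans in index order 0, 1 gives the natural sentence.
print(tokens_in_order(spans, [0, 1]))
```

A predicted reading order is then just a permutation of span indices, and the token sequence follows from it.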
> Only the first part of the test file is used.
| Method | shuf | BLEU Idx | BLEU Token |
|----------------------------|------|----------|------------|
| Heuristic Method | no | 44.4 | 70.7 |
| LayoutReader (layout only) | no | 95.3 | 97.8 |
| LayoutReader (layout only) | yes | 95.0 | 97.6 |
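The Heuristic Method row in the table above is a sorting baseline. A minimal sketch of such a baseline follows, assuming it simply sorts bboxes top-to-bottom and then left-to-right, with boxes given as `(left, top, right, bottom)`. The function name `heuristic_order` is hypothetical, and the baseline actually used in `eval.py` may differ (for example, by grouping boxes into lines first).

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom), assumed layout


def heuristic_order(boxes: List[Box]) -> List[int]:
    """Return box indices sorted by top edge, then by left edge."""
    return sorted(range(len(boxes)), key=lambda i: (boxes[i][1], boxes[i][0]))


boxes = [
    (300, 50, 500, 70),   # index 0: first line, right side
    (50, 50, 250, 70),    # index 1: first line, left side
    (50, 100, 250, 120),  # index 2: second line
]
print(heuristic_order(boxes))  # [1, 0, 2]
```

A plain sort like this interleaves the columns of a multi-column page, which is one reason a learned model can score well above it.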
## Token-Level Results
Each bbox contains exactly one token.
### New eval script
> Only the first part of the test file is used.
| Method | shuf | BLEU Idx | BLEU Token |
|-----------------------------|------|----------|------------|
| Heuristic Method | no | 78.3 | 79.4 |
| LayoutReader (layout only) | no | 98.0 | 98.2 |
| LayoutReader (layout only) | yes | 97.8 | 98.0 |
| LayoutReader (public model) | no | 98.0 | 98.3 |
### Old eval script (from original paper)
* Evaluation results of LayoutReader on the reading order detection task, where the source side of the training/testing data is in left-to-right, top-to-bottom order.
| Method | Encoder | BLEU | ARD |
|----------------------------|------------------------|--------|------|
| Heuristic Method | - | 0.6972 | 8.46 |
| LayoutReader (layout only) | LayoutLM (layout only) | 0.9732 | 2.31 |
| LayoutReader | LayoutLM | 0.9819 | 1.75 |
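For readers unfamiliar with the ARD column, the sketch below computes Average Relative Distance under one assumed definition: for each element of the predicted sequence, take the absolute difference between its position in the prediction and its position in the reference, charge the sequence length `n` for elements missing from the reference, and average over `n`. This is an illustration under that assumption, not the code used by the old eval script; the function name is hypothetical.

```python
from typing import Dict, Hashable, Sequence


def average_relative_distance(pred: Sequence[Hashable],
                              ref: Sequence[Hashable]) -> float:
    """ARD under the assumed definition described in the lead-in."""
    n = len(pred)
    if n == 0:
        return 0.0
    # Position of each element in the reference (first occurrence wins).
    position: Dict[Hashable, int] = {}
    for i, element in enumerate(ref):
        position.setdefault(element, i)
    total = 0
    for k, element in enumerate(pred):
        if element in position:
            total += abs(k - position[element])
        else:
            total += n  # penalty for an element absent from the reference
    return total / n


# Two neighbouring elements swapped: distances are 0, 1, 1, 0.
print(average_relative_distance([0, 2, 1, 3], [0, 1, 2, 3]))  # 0.5
```

Lower is better: a perfect prediction scores 0, which matches the direction of the ARD values in the table.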
* Input order study with left-to-right and top-to-bottom inputs in evaluation, where r is the proportion of
shuffled samples in training.
| Method | BLEU (r=100%) | BLEU (r=50%) | BLEU (r=0%) | ARD (r=100%) | ARD (r=50%) | ARD (r=0%) |
|----------------------------|--------|--------|--------|--------|-------|------|
| LayoutReader (layout only) | 0.9701 | 0.9729 | 0.9732 | 2.85 | 2.61 | 2.31 |
| LayoutReader | 0.9765 | 0.9788 | 0.9819 | 2.50 | 2.24 | 1.75 |
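The proportion r can be pictured as follows: a fraction r of the training samples have their input order shuffled, and the rest keep their original order. The sketch below is one way to do that, assuming each sample is a list of boxes; `shuffle_inputs` and its return format are hypothetical and not taken from `tools.py` or `train.sh`.

```python
import random
from typing import List, Sequence, Tuple


def shuffle_inputs(samples: Sequence[Sequence[object]], r: float,
                   seed: int = 0) -> List[Tuple[List[object], List[int]]]:
    """Shuffle the input order of a proportion `r` of the samples.

    Returns, per sample, the (possibly permuted) items and the permutation
    used, so that reading-order targets can be remapped accordingly.
    """
    rng = random.Random(seed)
    n_shuffled = round(r * len(samples))
    chosen = set(rng.sample(range(len(samples)), n_shuffled))
    result = []
    for i, items in enumerate(samples):
        order = list(range(len(items)))
        if i in chosen:
            rng.shuffle(order)  # in-place permutation of indices
        result.append(([items[j] for j in order], order))
    return result


samples = [["a", "b", "c"], ["d", "e"], ["f", "g", "h", "i"], ["j"]]
untouched = shuffle_inputs(samples, r=0.0)
mixed = shuffle_inputs(samples, r=0.5)  # two of the four samples are selected
```

With `r=0.0` every sample keeps its order, and with `r=1.0` every sample is selected for shuffling (a shuffle can still happen to return the original order).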
* Input order study with token-shuffled inputs in evaluation, where r is the proportion of shuffled samples in training.
| Method | BLEU (r=100%) | BLEU (r=50%) | BLEU (r=0%) | ARD (r=100%) | ARD (r=50%) | ARD (r=0%) |
|----------------------------|--------|--------|--------|--------|-------|--------|
| LayoutReader (layout only) | 0.9718 | 0.9714 | 0.1331 | 2.72 | 2.82 | 105.40 |
| LayoutReader | 0.9772 | 0.9770 | 0.1783 | 2.48 | 2.46 | 72.94 |