---
license: mit
---

# LayoutReader

**TODO:**
1. Upload models to Hugging Face
2. Explain the motivation for this repo
3. Describe the new dataset
4. Build a Docker image

## Helper

### Build Dataset

```bash
python tools.py cache-dataset-spans --help
```

### Train

```bash
bash train.sh
```

### Eval

```bash
python eval.py --help
```

## Spans-Level Results

At the span level, one bbox contains multiple tokens. The bboxes are usually obtained by parsing a PDF file. The training data is generated by `tools.py`.

> Only the first part of the test file is used.

| Method                     | shuf | BLEU Idx | BLEU Token |
|----------------------------|------|----------|------------|
| Heuristic Method           | no   | 44.4     | 70.7       |
| LayoutReader (layout only) | no   | 95.3     | 97.8       |
| LayoutReader (layout only) | yes  | 95.0     | 97.6       |

## Tokens-Level Results

At the token level, each bbox contains exactly one token.

### New eval script

> Only the first part of the test file is used.

| Method                      | shuf | BLEU Idx | BLEU Token |
|-----------------------------|------|----------|------------|
| Heuristic Method            | no   | 78.3     | 79.4       |
| LayoutReader (layout only)  | no   | 98.0     | 98.2       |
| LayoutReader (layout only)  | yes  | 97.8     | 98.0       |
| LayoutReader (public model) | no   | 98.0     | 98.3       |

### Old eval script (from original paper)

* Evaluation results of LayoutReader on the reading order detection task, where the source side of the training/testing
  data is in left-to-right, top-to-bottom order.

| Method                     | Encoder                | BLEU   | ARD  |
|----------------------------|------------------------|--------|------|
| Heuristic Method           | -                      | 0.6972 | 8.46 |
| LayoutReader (layout only) | LayoutLM (layout only) | 0.9732 | 2.31 |
| LayoutReader               | LayoutLM               | 0.9819 | 1.75 |
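The left-to-right, top-to-bottom source order mentioned above can be illustrated with a plain coordinate sort. This is a minimal sketch, not code from this repo: the `Box` layout `(left, top, right, bottom)`, the top-left origin, and the function name `left_to_right_top_to_bottom` are all assumptions made for illustration.

```python
from typing import List, Tuple

# Hypothetical box layout: (left, top, right, bottom), origin at the top-left corner.
Box = Tuple[int, int, int, int]


def left_to_right_top_to_bottom(boxes: List[Box]) -> List[int]:
    """Return box indices sorted by top edge first, then by left edge."""
    return sorted(range(len(boxes)), key=lambda i: (boxes[i][1], boxes[i][0]))


# Two boxes on the first line (right one listed first) and one on the second line.
boxes = [(300, 50, 500, 70), (50, 50, 250, 70), (50, 100, 250, 120)]
print(left_to_right_top_to_bottom(boxes))  # [1, 0, 2]
```

On a multi-column page, a sort like this interleaves lines from different columns, which is the kind of case a learned reading-order model is meant to handle.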

* Input order study with left-to-right and top-to-bottom inputs in evaluation, where r is the proportion of
  shuffled samples in training.

| Method                     | BLEU (r=100%) | BLEU (r=50%) | BLEU (r=0%) | ARD (r=100%) | ARD (r=50%) | ARD (r=0%) |
|----------------------------|---------------|--------------|-------------|--------------|-------------|------------|
| LayoutReader (layout only) | 0.9701        | 0.9729       | 0.9732      | 2.85         | 2.61        | 2.31       |
| LayoutReader               | 0.9765        | 0.9788       | 0.9819      | 2.50         | 2.24        | 1.75       |
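This README does not define ARD. The sketch below assumes it stands for Average Relative Distance, as in the original paper: the mean absolute offset between each element's position in the predicted order and its position in the target order, with a penalty equal to the sequence length for elements missing from the target. The function name and the exact penalty are assumptions here, and `eval.py` may compute the metric differently.

```python
from typing import Sequence


def average_relative_distance(pred: Sequence[int], target: Sequence[int]) -> float:
    """Mean absolute position offset of each predicted element in the target.

    Assumes elements are unique ids (e.g. token indices). Elements missing
    from the target are charged the full sequence length (assumed penalty).
    """
    n = len(pred)
    if n == 0:
        return 0.0
    position = {elem: idx for idx, elem in enumerate(target)}
    total = 0
    for k, elem in enumerate(pred):
        total += abs(k - position[elem]) if elem in position else n
    return total / n


print(average_relative_distance([0, 1, 2, 3], [0, 1, 2, 3]))  # 0.0
print(average_relative_distance([0, 2, 1, 3], [0, 1, 2, 3]))  # 0.5
```

Under this reading, lower is better, and a perfect ordering scores 0.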

* Input order study with token-shuffled inputs in evaluation, where r is the proportion of shuffled samples in training.

| Method                     | BLEU (r=100%) | BLEU (r=50%) | BLEU (r=0%) | ARD (r=100%) | ARD (r=50%) | ARD (r=0%) |
|----------------------------|---------------|--------------|-------------|--------------|-------------|------------|
| LayoutReader (layout only) | 0.9718        | 0.9714       | 0.1331      | 2.72         | 2.82        | 105.40     |
| LayoutReader               | 0.9772        | 0.9770       | 0.1783      | 2.48         | 2.46        | 72.94      |
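To make the role of r concrete, here is a minimal sketch of shuffling the token order in a proportion r of the samples. It is illustrative only: the function name `shuffle_inputs`, the per-sample coin flip, and the fixed seed are assumptions, not the procedure used by `train.sh` or the original paper.

```python
import random
from typing import List, Sequence


def shuffle_inputs(samples: Sequence[Sequence[int]], r: float, seed: int = 0) -> List[List[int]]:
    """Shuffle the token order of roughly a proportion r of the samples."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    out = []
    for tokens in samples:
        tokens = list(tokens)  # copy so the caller's data is left untouched
        if rng.random() < r:   # each sample is shuffled with probability r
            rng.shuffle(tokens)
        out.append(tokens)
    return out


samples = [[0, 1, 2, 3], [0, 1, 2], [0, 1, 2, 3, 4]]
print(shuffle_inputs(samples, r=0.0))  # [[0, 1, 2, 3], [0, 1, 2], [0, 1, 2, 3, 4]]
```

With `r=0.0` every sample keeps its original order, and with `r=1.0` every sample is shuffled, which corresponds to the two extremes reported in the tables.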