---
license: cc-by-nc-sa-4.0
base_model: microsoft/layoutlmv2-base-uncased
tags:
- generated_from_trainer
datasets:
- cord
metrics:
- precision
- recall
- f1
- accuracy
model-index:
- name: layoutlmv2-finetuned-cord
  results:
  - task:
      name: Token Classification
      type: token-classification
    dataset:
      name: cord
      type: cord
      config: cord
      split: validation
      args: cord
    metrics:
    - name: Precision
      type: precision
      value: 0.9652945924132365
    - name: Recall
      type: recall
      value: 0.9676375404530745
    - name: F1
      type: f1
      value: 0.9664646464646465
    - name: Accuracy
      type: accuracy
      value: 0.9702653247941445
---

# Overfitting issue
I used this Colab notebook:
https://colab.research.google.com/drive/1AXh3G3-VmbMWlwbSvesVIurzNlcezTce?usp=sharing

to fine-tune LayoutLMv2ForTokenClassification on the CORD dataset.

Here is the resulting model:
https://huggingface.co/doc2txt/layoutlmv2-finetuned-cord

* F1: 0.9665

The results are indeed impressive when running on the test set; however, on any other receipt (printed or PDF) the predictions are completely off.

So for some reason the model is overfitting to the CORD dataset, even though I test on similar images.

I don't think there is any **data leakage**, unless the CORD dataset itself is not clean (which I assume it is).

What could be the reason for this?
Is it some inherent property of LayoutLM?
The LayoutLM models are somewhat old, and the project seems abandoned...

I don't have much experience, so I would appreciate any information.
Thanks

Here is example code showing how to run this model on a specific image folder (a minimal single-image sketch follows below):
https://huggingface.co/doc2txt/layoutlmv2-finetuned-cord/blob/main/LayoutLMv2Main_cord2_gOcr_folder.py
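
For reference, here is a minimal single-image version of the same idea (not the script above; `receipt.png` is a placeholder path, and `detectron2` plus `pytesseract` must be installed, since LayoutLMv2 needs detectron2 for its visual backbone and `LayoutLMv2Processor` runs Tesseract OCR by default):

```python
import torch
from PIL import Image
from transformers import LayoutLMv2Processor, LayoutLMv2ForTokenClassification

# The processor applies Tesseract OCR by default (apply_ocr=True).
processor = LayoutLMv2Processor.from_pretrained("microsoft/layoutlmv2-base-uncased")
model = LayoutLMv2ForTokenClassification.from_pretrained("doc2txt/layoutlmv2-finetuned-cord")

image = Image.open("receipt.png").convert("RGB")  # placeholder path
encoding = processor(image, return_tensors="pt", truncation=True)

with torch.no_grad():
    outputs = model(**encoding)

pred_ids = outputs.logits.argmax(-1).squeeze().tolist()
# Note: predictions still include special tokens ([CLS], [SEP]) and subword pieces.
print([model.config.id2label[i] for i in pred_ids])
```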

# layoutlmv2-finetuned-cord

This model is a fine-tuned version of [microsoft/layoutlmv2-base-uncased](https://huggingface.co/microsoft/layoutlmv2-base-uncased) on the cord dataset.
It achieves the following results on the evaluation set (a sketch of the metric computation follows the list):
- Loss: 0.2819
- Precision: 0.9653
- Recall: 0.9676
- F1: 0.9665
- Accuracy: 0.9703
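
The precision, recall, and F1 above are entity-level token-classification metrics of the kind typically computed with `seqeval`. A sketch of a typical `compute_metrics` function under that assumption (the exact function in the original notebook may differ; `id2label` is bound from the model config):

```python
from functools import partial

import numpy as np
import evaluate  # pip install evaluate seqeval

seqeval_metric = evaluate.load("seqeval")

def compute_metrics(eval_pred, id2label):
    logits, labels = eval_pred
    predictions = np.argmax(logits, axis=2)
    # Drop positions labeled -100 (special tokens and padding) before scoring.
    true_predictions = [
        [id2label[p] for p, l in zip(pred, lab) if l != -100]
        for pred, lab in zip(predictions, labels)
    ]
    true_labels = [
        [id2label[l] for p, l in zip(pred, lab) if l != -100]
        for pred, lab in zip(predictions, labels)
    ]
    results = seqeval_metric.compute(predictions=true_predictions, references=true_labels)
    return {
        "precision": results["overall_precision"],
        "recall": results["overall_recall"],
        "f1": results["overall_f1"],
        "accuracy": results["overall_accuracy"],
    }

# Usage with Trainer, binding the label mapping from the model config:
# trainer = Trainer(..., compute_metrics=partial(compute_metrics, id2label=model.config.id2label))
```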

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a `TrainingArguments` sketch follows the list):
- learning_rate: 5e-05
- train_batch_size: 2
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 5
- mixed_precision_training: Native AMP
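
For reproducibility, a hedged reconstruction of these settings as `TrainingArguments`; the exact arguments in the original Colab (e.g. `output_dir`, logging and saving options) are assumptions:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="layoutlmv2-finetuned-cord",  # assumed
    learning_rate=5e-5,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=8,
    seed=42,
    lr_scheduler_type="linear",
    warmup_ratio=0.1,
    num_train_epochs=5,
    fp16=True,  # "Native AMP" mixed precision
    evaluation_strategy="epoch",  # assumed; the table below reports metrics per epoch
    # Adam betas (0.9, 0.999) and epsilon 1e-08 are the TrainingArguments defaults.
)
```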

### Training results

| Training Loss | Epoch | Step | Validation Loss | Precision | Recall | F1     | Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:---------:|:------:|:------:|:--------:|
| No log        | 1.0   | 400  | 1.2752          | 0.8527    | 0.8382 | 0.8454 | 0.8481   |
| 1.9583        | 2.0   | 800  | 0.6372          | 0.8799    | 0.8948 | 0.8873 | 0.9021   |
| 0.7097        | 3.0   | 1200 | 0.4255          | 0.9241    | 0.9264 | 0.9253 | 0.9414   |
| 0.3845        | 4.0   | 1600 | 0.3021          | 0.9414    | 0.9482 | 0.9448 | 0.9611   |
| 0.2699        | 5.0   | 2000 | 0.2819          | 0.9653    | 0.9676 | 0.9665 | 0.9703   |


### Framework versions

- Transformers 4.37.2
- Pytorch 2.1.0+cu121
- Datasets 2.16.1
- Tokenizers 0.15.1