A LayoutLMv3-large model fine-tuned for Named Entity Recognition on invoice documents. This is a fine-tuning pipeline under development for production invoice processing, including Ghanaian and West African invoices.

This model extracts 15 structured entity types from invoice images using text content and spatial layout together. LayoutLMv3 is a multimodal transformer that processes words, bounding boxes, and image patches jointly, which typically makes it more accurate than text-only NER models on document images.
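LayoutLM-family models expect each word's bounding box on a 0 to 1000 integer grid relative to the page size. The sketch below shows that normalisation, assuming the OCR step yields absolute pixel coordinates `(x0, y0, x1, y1)`. The function name `normalize_box` and the sample words and boxes are illustrative and not taken from this model's pipeline.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x0, y0, x1, y1) in pixels


def normalize_box(box: Box, page_width: float, page_height: float) -> List[int]:
    """Scale a pixel-space box to the 0-1000 integer grid."""
    x0, y0, x1, y1 = box
    scaled = [
        1000 * x0 / page_width,
        1000 * y0 / page_height,
        1000 * x1 / page_width,
        1000 * y1 / page_height,
    ]
    # Clamp so boxes that spill slightly past the page edge stay in range.
    return [max(0, min(1000, int(v))) for v in scaled]


# Hypothetical OCR output for an 800 x 1000 pixel page.
words = ["Invoice", "INV-0042"]
pixel_boxes = [(50, 40, 210, 70), (230, 40, 400, 70)]
boxes = [normalize_box(b, page_width=800, page_height=1000) for b in pixel_boxes]
print(list(zip(words, boxes)))
```

If the OCR engine already returns coordinates as fractions of the page, the same function works with `page_width=1` and `page_height=1`.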

| Property | Detail |
|---|---|
| Source dataset | mychen76/invoices-and-receipts_ocr_v1 |
| Total samples | 2,043 |
| Kept after filtering | 416 (20.4% keep rate) |
| OCR engine | docTR (db_resnet50 + crnn_vgg16_bn) |
| Annotation method | Fuzzy match (threshold=90%) + spatial constraints |
| Min labelled tokens | 8 non-O tags per sample |
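The two filtering rules in the table (fuzzy match at 90% and at least 8 non-O tags) can be sketched as follows. This is an approximation under stated assumptions: the card does not name the fuzzy-matching library, so Python's standard `difflib.SequenceMatcher` stands in for it, the spatial constraints are omitted, and the function names are hypothetical.

```python
from difflib import SequenceMatcher
from typing import List

FUZZY_THRESHOLD = 90      # percent, from the table above
MIN_LABELLED_TOKENS = 8   # non-O tags per sample, from the table above


def fuzzy_score(a: str, b: str) -> float:
    """Case-insensitive similarity between two strings, in percent (0-100)."""
    return 100.0 * SequenceMatcher(None, a.lower(), b.lower()).ratio()


def is_match(ocr_text: str, target: str) -> bool:
    """True when the OCR text is close enough to the ground-truth value."""
    return fuzzy_score(ocr_text, target) >= FUZZY_THRESHOLD


def keep_sample(tags: List[str]) -> bool:
    """Keep a sample only if it carries enough entity (non-O) tags."""
    return sum(tag != "O" for tag in tags) >= MIN_LABELLED_TOKENS


# Keep rate reported in the table: 416 of 2,043 samples.
keep_rate = round(100 * 416 / 2043, 1)
print(keep_rate)
```

A strict threshold combined with a minimum tag count discards most samples, which is consistent with the low keep rate reported above.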

| Hyperparameter | Value |
|---|---|
| Base model | cuongdz01/layoutlmv3-large-cord |
| Learning rate | 1e-5 |
| Epochs | 5 |
| Batch size | 4 (2 per GPU × 2× T4) |
| Optimiser | AdamW (weight decay=0.01) |
| LR scheduler | Linear warmup (10%) + linear decay |
| Max sequence length | 512 |
| Train/val split | 90/10 (374 train / 42 val) |
| Platform | Kaggle 2× T4 GPU |
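The learning-rate schedule in the table (10% linear warmup, then linear decay) can be written as a function of the optimiser step. The sketch assumes one optimiser step per batch of 4 with no gradient accumulation, that the final partial batch of each epoch is kept, and that warmup covers 10% of all steps. It mirrors the shape of a standard warmup-and-decay schedule and is not claimed to be the exact training code.

```python
import math

PEAK_LR = 1e-5
EPOCHS = 5
BATCH_SIZE = 4
TRAIN_SAMPLES = 374
WARMUP_FRACTION = 0.10

steps_per_epoch = math.ceil(TRAIN_SAMPLES / BATCH_SIZE)
total_steps = steps_per_epoch * EPOCHS
warmup_steps = round(WARMUP_FRACTION * total_steps)


def lr_at(step: int) -> float:
    """Learning rate at a given optimiser step (0-indexed)."""
    if step < warmup_steps:
        # Linear ramp from 0 up to the peak learning rate.
        return PEAK_LR * step / warmup_steps
    # Linear decay from the peak down to 0 at the final step.
    remaining = total_steps - step
    return PEAK_LR * max(0.0, remaining / (total_steps - warmup_steps))


for step in (0, warmup_steps, total_steps // 2, total_steps):
    print(step, lr_at(step))
```

Under these assumptions the run has 94 steps per epoch and 470 steps in total, of which the first 47 are warmup.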

| Epoch | Train Loss | Val Loss | Val F1 |
|---|---|---|---|
| 1 | 0.7984 | 0.1447 | 0.8964 |
| 2 | 0.0918 | 0.0658 | 0.9176 |
| 3 | 0.0504 | 0.0484 | 0.9446 |
| 4 | 0.0393 | 0.0492 | 0.9472 |
| 5 | 0.0335 | 0.0478 | 0.9497 |

Best Val F1: 0.9497 (epoch 5)

This model recognises 15 entity types using BIO tagging (30 B- and I- labels in total, plus the O tag for tokens outside any entity):

| Entity | Description |
|---|---|
| INVOICE_NUMBER | Unique invoice identifier |
| INVOICE_DATE | Date invoice was issued |
| DUE_DATE | Payment due date |
| REFERENCE_NUMBER | PO or reference number |
| SELLER_NAME | Name of the selling entity |
| SELLER_ADDRESS | Address of the seller |
| CLIENT_NAME | Name of the buying entity |
| CLIENT_ADDRESS | Address of the client |
| ITEM_NAME | Name of line item product/service |
| ITEM_DESC | Description of line item |
| QTY | Quantity of line item |
| UNIT_PRICE | Price per unit |
| LINE_TOTAL | Total for a single line item |
| TOTAL_AMOUNT | Final invoice total |
| TAX_AMOUNT | Tax or VAT amount |
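To make the BIO scheme concrete, the sketch below builds a label list from the 15 entity types and groups tagged words into entity spans. The label ordering (`O` first, then B-/I- pairs) is an assumption, and the checkpoint's own `id2label` mapping may differ. The helper `decode_entities` and the sample words are hypothetical.

```python
from typing import List, Tuple

ENTITY_TYPES = [
    "INVOICE_NUMBER", "INVOICE_DATE", "DUE_DATE", "REFERENCE_NUMBER",
    "SELLER_NAME", "SELLER_ADDRESS", "CLIENT_NAME", "CLIENT_ADDRESS",
    "ITEM_NAME", "ITEM_DESC", "QTY", "UNIT_PRICE", "LINE_TOTAL",
    "TOTAL_AMOUNT", "TAX_AMOUNT",
]

# Assumed ordering: "O" first, then a B-/I- pair per entity type.
LABELS = ["O"] + [f"{p}-{e}" for e in ENTITY_TYPES for p in ("B", "I")]
id2label = dict(enumerate(LABELS))
label2id = {label: idx for idx, label in id2label.items()}


def decode_entities(words: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Group consecutive B-/I- tagged words into (entity_type, text) spans."""
    entities: List[Tuple[str, str]] = []
    current_type, current_words = None, []

    def flush() -> None:
        if current_words:
            entities.append((current_type, " ".join(current_words)))

    for word, tag in zip(words, tags):
        prefix, _, etype = tag.partition("-")
        if prefix == "B" or (prefix == "I" and etype != current_type):
            # A B- tag, or an I- tag with no matching open span, starts a new span.
            flush()
            current_type, current_words = etype, [word]
        elif prefix == "I":
            current_words.append(word)
        else:
            # "O" closes any open span.
            flush()
            current_type, current_words = None, []
    flush()
    return entities


words = ["Invoice", "No", "INV-0042", "Total", "GHS", "1,250.00"]
tags = ["O", "O", "B-INVOICE_NUMBER", "O", "B-TOTAL_AMOUNT", "I-TOTAL_AMOUNT"]
print(decode_entities(words, tags))
```

In this sketch the 30 B-/I- labels plus `O` give 31 label ids. Treating a stray `I-` tag as the start of a span is a lenient choice, and a stricter decoder could drop such tokens instead.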

Key design decisions that ensure label quality: