Invoice NER v2 — amaye15
A LayoutLMv3-large model fine-tuned for Named Entity Recognition on invoice documents. It is one stage in a sequential fine-tuning pipeline being developed for production invoice processing, including Ghanaian and West African invoices.
Built on top of invoice-ner-v1-mychen76, this model extends coverage to a much larger and more diverse invoice dataset (7,801 samples vs 416 in v1).
Model Description
Same architecture as v1 — LayoutLMv3-large processing words, bounding boxes, and image patches simultaneously. Fine-tuned with a reduced learning rate (5e-6 vs 1e-5) to preserve knowledge from v1 while adapting to the larger amaye15 dataset.
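LayoutLM-family models expect each word's bounding box on a 0-1000 grid relative to the page rather than in raw pixels. The sketch below shows one way to do that scaling. It assumes absolute `(x0, y0, x1, y1)` pixel boxes; the function name `normalize_box`, the sample words, and the page size are illustrative and are not taken from this model's preprocessing code.

```python
from typing import List, Tuple


def normalize_box(
    box: Tuple[float, float, float, float], width: float, height: float
) -> List[int]:
    """Scale an absolute (x0, y0, x1, y1) pixel box to the 0-1000 grid."""
    x0, y0, x1, y1 = box
    scaled = [
        int(1000 * x0 / width),
        int(1000 * y0 / height),
        int(1000 * x1 / width),
        int(1000 * y1 / height),
    ]
    # Clamp so OCR boxes that spill slightly past the page edge stay valid.
    return [min(1000, max(0, v)) for v in scaled]


# Hypothetical OCR output for a 1000 x 2000 pixel page.
words = ["Invoice", "INV-001"]
pixel_boxes = [(100, 50, 300, 90), (320, 50, 520, 90)]
page_width, page_height = 1000, 2000

boxes = [normalize_box(b, page_width, page_height) for b in pixel_boxes]
```

If the OCR engine already returns coordinates relative to the page (values between 0 and 1), the same function can be called with `width=1` and `height=1`.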
Training Data
| Property | Detail |
|---|---|
| Source dataset | amaye15/invoices-google-ocr |
| Total samples | 13,463 |
| Kept after filtering | 7,801 (57.9% keep rate) |
| OCR engine | docTR (db_resnet50 + crnn_vgg16_bn) |
| Annotation method | Regex extraction + fuzzy match (90%) + spatial constraints |
| Min labelled tokens | 8 non-O tags per sample |
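The minimum-label filter and the keep rate in the table above can be sketched as follows. This is a minimal illustration, not the author's filtering script: the helper names are hypothetical, and the `B-`/`I-` tag prefixes in the example are an assumption, since the card only states that unlabelled tokens carry the `O` tag.

```python
from typing import List

MIN_LABELLED_TOKENS = 8  # threshold from the training data table


def count_labelled(tags: List[str]) -> int:
    """Count tokens whose tag is anything other than 'O'."""
    return sum(1 for tag in tags if tag != "O")


def keep_sample(tags: List[str], min_labelled: int = MIN_LABELLED_TOKENS) -> bool:
    """Keep a sample only if it has at least `min_labelled` non-O tags."""
    return count_labelled(tags) >= min_labelled


# Hypothetical samples: one sparsely labelled, one with enough labels.
sparse = ["O"] * 30 + ["B-TOTAL_AMOUNT"] * 7
dense = ["O"] * 30 + ["B-INVOICE_NUMBER"] + ["I-INVOICE_NUMBER"] * 7

# Keep rate reported in the table: 7,801 of 13,463 samples.
total_samples, kept_samples = 13_463, 7_801
keep_rate_percent = round(100 * kept_samples / total_samples, 1)
```

Here `keep_sample(sparse)` is false and `keep_sample(dense)` is true, and `keep_rate_percent` reproduces the 57.9% figure.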
Training Configuration
| Hyperparameter | Value |
|---|---|
| Base model | albertosei/invoice-ner-v1-mychen76 |
| Learning rate | 5e-6 (reduced to prevent catastrophic forgetting) |
| Epochs | 3 |
| Batch size | 4 (2 per GPU × 2 T4 GPUs) |
| Optimiser | AdamW (weight decay=0.01) |
| LR scheduler | Linear warmup (10%) + linear decay |
| Max sequence length | 512 |
| Train/val split | 90/10 (7,021 train / 780 val) |
| Platform | Kaggle 2× T4 GPU |
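To make the schedule concrete, the sketch below reproduces a linear warmup followed by linear decay in plain Python, using the values from the table. It assumes no gradient accumulation and that the last partial batch of each epoch is kept; neither is stated in the card, so the step counts are illustrative. The function `lr_at` is a hypothetical helper, not part of any library.

```python
import math

BASE_LR = 5e-6
EPOCHS = 3
BATCH_SIZE = 4           # effective batch size across both GPUs
TRAIN_SAMPLES = 7_021
WARMUP_FRACTION = 0.10

# Assumption: the final partial batch is kept, so round up.
steps_per_epoch = math.ceil(TRAIN_SAMPLES / BATCH_SIZE)
total_steps = steps_per_epoch * EPOCHS
warmup_steps = int(WARMUP_FRACTION * total_steps)


def lr_at(step: int) -> float:
    """Learning rate at a given optimiser step."""
    if step < warmup_steps:
        # Linear ramp from 0 up to BASE_LR during warmup.
        return BASE_LR * step / max(1, warmup_steps)
    # Linear decay from BASE_LR down to 0 at the final step.
    remaining = max(0, total_steps - step)
    return BASE_LR * remaining / max(1, total_steps - warmup_steps)
```

Under these assumptions there are 1,756 steps per epoch and 5,268 steps in total, of which the first 526 are warmup; the learning rate peaks at 5e-6 at the end of warmup and reaches zero at the final step.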
Results
| Epoch | Train Loss | Val Loss | Val F1 |
|---|---|---|---|
| 1 | 0.1756 | 0.0721 | 0.8883 |
| 2 | 0.0622 | 0.0535 | 0.9071 |
| 3 | 0.0487 | 0.0510 | 0.9131 |
Best Val F1: 0.9131 (epoch 3)
Entity Types
Same 15 entity types as v1:
| Entity | Description |
|---|---|
| INVOICE_NUMBER | Unique invoice identifier |
| INVOICE_DATE | Date invoice was issued |
| DUE_DATE | Payment due date |
| REFERENCE_NUMBER | PO or reference number |
| SELLER_NAME | Name of the selling entity |
| SELLER_ADDRESS | Address of the seller |
| CLIENT_NAME | Name of the buying entity |
| CLIENT_ADDRESS | Address of the client |
| ITEM_NAME | Name of line item product/service |
| ITEM_DESC | Description of line item |
| QTY | Quantity of line item |
| UNIT_PRICE | Price per unit |
| LINE_TOTAL | Total for a single line item |
| TOTAL_AMOUNT | Final invoice total |
| TAX_AMOUNT | Tax or VAT amount |
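For token classification, the entity types above are usually expanded into a per-token label set. The sketch below assumes a BIO scheme (`B-` for the first token of an entity, `I-` for continuation tokens, `O` for everything else). The card confirms only the `O` tag, so the prefixes and the label ordering here are assumptions; in practice the mapping stored in the checkpoint's own config should be used.

```python
ENTITY_TYPES = [
    "INVOICE_NUMBER", "INVOICE_DATE", "DUE_DATE", "REFERENCE_NUMBER",
    "SELLER_NAME", "SELLER_ADDRESS", "CLIENT_NAME", "CLIENT_ADDRESS",
    "ITEM_NAME", "ITEM_DESC", "QTY", "UNIT_PRICE",
    "LINE_TOTAL", "TOTAL_AMOUNT", "TAX_AMOUNT",
]

# Assumed BIO expansion: one "O" label plus B-/I- variants per entity type.
labels = ["O"] + [
    f"{prefix}-{entity}" for entity in ENTITY_TYPES for prefix in ("B", "I")
]

# Hypothetical id mapping; the real order is defined by the trained checkpoint.
label2id = {label: idx for idx, label in enumerate(labels)}
id2label = {idx: label for label, idx in label2id.items()}
```

With 15 entity types this yields 31 labels in total.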
Why F1 is Lower Than v1
v1 scored 0.9497 on a 42-sample validation set of synthetic invoices similar to its training data, whereas v2 scores 0.9131 on a 780-sample validation set of much more diverse invoices. Because the two scores come from different validation sets, they are not directly comparable: the larger, harder validation set most likely accounts for the gap, and v2 is expected to be the stronger and more generalisable model overall.
Limitations
- Still primarily trained on synthetic/Western-style invoices
- Performance on real-world Ghanaian invoices is roughly 50-60% at this stage
- Intermediate checkpoint — superseded by v3, v4, v5
Sequential Fine-tuning Pipeline
Model tree for albertosei/invoice-ner-v2-amaye15
Base model
microsoft/layoutlmv3-large