Invoice NER v2 — amaye15

A LayoutLMv3-large model fine-tuned for named entity recognition (NER) on invoice documents. It is one stage in a sequential fine-tuning pipeline being developed for production invoice processing, including Ghanaian and West African invoices.

Built on top of invoice-ner-v1-mychen76, this model extends coverage to a much larger and more diverse invoice dataset (7,801 samples vs 416 in v1).

Model Description

The architecture is the same as v1: LayoutLMv3-large, which processes words, bounding boxes, and image patches jointly. The model was fine-tuned with a reduced learning rate (5e-6, versus 1e-5 for v1) to preserve knowledge from v1 while adapting to the larger amaye15 dataset.
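LayoutLMv3 takes each word's bounding box on a 0 to 1000 scale relative to the page. The sketch below shows that normalisation, assuming the OCR step yields pixel coordinates as (x0, y0, x1, y1). The function name `normalize_bbox` and the example page size are illustrative and do not come from this model's training code.

```python
def normalize_bbox(bbox, page_width, page_height):
    """Scale a pixel-space box (x0, y0, x1, y1) to the 0-1000 range LayoutLMv3 expects."""
    x0, y0, x1, y1 = bbox
    scaled = [
        int(1000 * x0 / page_width),
        int(1000 * y0 / page_height),
        int(1000 * x1 / page_width),
        int(1000 * y1 / page_height),
    ]
    # Clamp so OCR boxes that spill slightly off the page stay in the valid range.
    return [min(1000, max(0, v)) for v in scaled]


# Hypothetical word box on a 1240 x 1754 pixel page image.
print(normalize_bbox((124, 175, 372, 210), 1240, 1754))
```

If the OCR engine already reports coordinates relative to the page (0 to 1), the same function works with `page_width=1` and `page_height=1`.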

Training Data

| Property | Detail |
| --- | --- |
| Source dataset | amaye15/invoices-google-ocr |
| Total samples | 13,463 |
| Kept after filtering | 7,801 (57.9% keep rate) |
| OCR engine | docTR (db_resnet50 + crnn_vgg16_bn) |
| Annotation method | Regex extraction + fuzzy match (90%) + spatial constraints |
| Min labelled tokens | 8 non-O tags per sample |
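Two of the filtering rules listed above, the 90% fuzzy match and the minimum of 8 labelled tokens, can be sketched as follows. This is a rough illustration and not the actual annotation code. The card does not name the fuzzy-matching library, so the standard-library `difflib` stands in for it here. The helper names, the `B-`/`I-` tag prefixes, and the toy samples are assumptions.

```python
from difflib import SequenceMatcher

FUZZY_THRESHOLD = 0.90   # "fuzzy match (90%)"
MIN_LABELLED_TOKENS = 8  # minimum non-O tags per sample


def fuzzy_match(ocr_text, target, threshold=FUZZY_THRESHOLD):
    """Return True if the OCR text is at least `threshold` similar to the target value."""
    ratio = SequenceMatcher(None, ocr_text.lower(), target.lower()).ratio()
    return ratio >= threshold


def keep_sample(tags, min_labelled=MIN_LABELLED_TOKENS):
    """Keep a sample only if enough tokens carry a tag other than 'O'."""
    return sum(tag != "O" for tag in tags) >= min_labelled


# Toy samples: the first has 8 labelled tokens, the second only 2.
samples = [
    ["B-INVOICE_NUMBER"] * 8 + ["O"] * 20,
    ["B-TOTAL_AMOUNT", "I-TOTAL_AMOUNT"] + ["O"] * 30,
]
kept = [tags for tags in samples if keep_sample(tags)]
print(f"kept {len(kept)} of {len(samples)} samples")
```

A threshold of 0.90 tolerates a single OCR character error in a typical invoice number while still rejecting near-miss strings such as "Total" against "Subtotal".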

Training Configuration

| Hyperparameter | Value |
| --- | --- |
| Base model | albertosei/invoice-ner-v1-mychen76 |
| Learning rate | 5e-6 (reduced to prevent catastrophic forgetting) |
| Epochs | 3 |
| Batch size | 4 (2 per GPU × 2 T4 GPUs) |
| Optimiser | AdamW (weight decay = 0.01) |
| LR scheduler | Linear warmup (10%) + linear decay |
| Max sequence length | 512 |
| Train/val split | 90/10 (7,021 train / 780 val) |
| Platform | Kaggle 2× T4 GPU |
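To make the schedule concrete, the sketch below derives the step counts implied by the configuration above and evaluates a linear warmup followed by linear decay. This has the same shape as `get_linear_schedule_with_warmup` in the transformers library. It assumes no gradient accumulation and that the final partial batch of each epoch is kept. The card states neither, so the exact step counts may differ from the real run.

```python
import math

TRAIN_SAMPLES = 7021
BATCH_SIZE = 4
EPOCHS = 3
PEAK_LR = 5e-6
WARMUP_FRACTION = 0.10

# Assumption: one optimiser update per batch, last partial batch included.
steps_per_epoch = math.ceil(TRAIN_SAMPLES / BATCH_SIZE)
total_steps = steps_per_epoch * EPOCHS
warmup_steps = int(WARMUP_FRACTION * total_steps)


def lr_at(step):
    """Learning rate after `step` updates: linear warmup, then linear decay to zero."""
    if step < warmup_steps:
        return PEAK_LR * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return PEAK_LR * remaining / max(1, total_steps - warmup_steps)


print(steps_per_epoch, total_steps, warmup_steps)
for step in (0, warmup_steps, total_steps // 2, total_steps):
    print(step, f"{lr_at(step):.2e}")
```

Under these assumptions the learning rate peaks at 5e-6 partway through the first epoch and decays to zero by the end of epoch 3.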

Results

| Epoch | Train Loss | Val Loss | Val F1 |
| --- | --- | --- | --- |
| 1 | 0.1756 | 0.0721 | 0.8883 |
| 2 | 0.0622 | 0.0535 | 0.9071 |
| 3 | 0.0487 | 0.0510 | 0.9131 |

Best Val F1: 0.9131 (epoch 3)

Entity Types

Same 15 entity types as v1:

| Entity | Description |
| --- | --- |
| INVOICE_NUMBER | Unique invoice identifier |
| INVOICE_DATE | Date the invoice was issued |
| DUE_DATE | Payment due date |
| REFERENCE_NUMBER | PO or reference number |
| SELLER_NAME | Name of the selling entity |
| SELLER_ADDRESS | Address of the seller |
| CLIENT_NAME | Name of the buying entity |
| CLIENT_ADDRESS | Address of the client |
| ITEM_NAME | Name of the line-item product or service |
| ITEM_DESC | Description of the line item |
| QTY | Quantity of the line item |
| UNIT_PRICE | Price per unit |
| LINE_TOTAL | Total for a single line item |
| TOTAL_AMOUNT | Final invoice total |
| TAX_AMOUNT | Tax or VAT amount |
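For token classification, the 15 entity types above are expanded into a tag set. The sketch below builds one under an assumed BIO scheme: "O" plus a `B-` and an `I-` tag for each type. The card does not state the tagging scheme or the label order, so treat the checkpoint's own `id2label` config as authoritative.

```python
ENTITY_TYPES = [
    "INVOICE_NUMBER", "INVOICE_DATE", "DUE_DATE", "REFERENCE_NUMBER",
    "SELLER_NAME", "SELLER_ADDRESS", "CLIENT_NAME", "CLIENT_ADDRESS",
    "ITEM_NAME", "ITEM_DESC", "QTY", "UNIT_PRICE", "LINE_TOTAL",
    "TOTAL_AMOUNT", "TAX_AMOUNT",
]

# Assumed BIO scheme: "O" plus a B- and an I- tag per entity type.
labels = ["O"] + [f"{prefix}-{name}" for name in ENTITY_TYPES for prefix in ("B", "I")]
label2id = {label: i for i, label in enumerate(labels)}
id2label = {i: label for label, i in label2id.items()}

print(len(ENTITY_TYPES), "entity types ->", len(labels), "labels")
```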

Why F1 is Lower Than v1

v1 scored 0.9497 F1 on a 42-sample validation set of synthetic invoices similar to its training data. v2 scores 0.9131 F1 on a 780-sample validation set of much more diverse invoices. The two scores come from different validation sets, so they are not directly comparable. The larger, harder validation set most likely accounts for the drop, and v2 is expected to be the stronger and more generalisable model overall.

Limitations

  • Still trained primarily on synthetic, Western-style invoices
  • Performance on real-world Ghanaian invoices is roughly 50-60% at this stage
  • Intermediate checkpoint, superseded by v3, v4, and v5

Sequential Fine-tuning Pipeline
