Invoice NER v2 — amaye15

A LayoutLMv3-large model fine-tuned for Named Entity Recognition (NER) on invoice documents. It is one stage in a sequential fine-tuning pipeline being developed for production invoice processing, including Ghanaian and other West African invoices.

Built on top of invoice-ner-v1-mychen76, this model extends coverage to a much larger and more diverse invoice dataset (7,801 samples vs 416 in v1).

Model Description

The architecture is unchanged from v1: LayoutLMv3-large, which jointly processes words, their bounding boxes, and image patches. Fine-tuning used a reduced learning rate (5e-6, down from 1e-5 in v1) to preserve what v1 learned while adapting to the larger amaye15 dataset.
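LayoutLM-family models expect each word's bounding box on an integer 0-1000 grid rather than in raw pixels. The preprocessing code for this model is not shown in the card, so the following is only a minimal sketch of that normalisation step, assuming pixel-space `(x0, y0, x1, y1)` boxes; the function name and the page dimensions are hypothetical.

```python
def normalize_bbox(box, page_width, page_height):
    """Scale a pixel-space (x0, y0, x1, y1) box to the 0-1000 integer grid."""
    x0, y0, x1, y1 = box
    scaled = (
        int(1000 * x0 / page_width),
        int(1000 * y0 / page_height),
        int(1000 * x1 / page_width),
        int(1000 * y1 / page_height),
    )
    # Clamp so OCR boxes that spill slightly past the page edge stay valid.
    return tuple(min(1000, max(0, v)) for v in scaled)


# Hypothetical OCR word box on a 1240 x 1754 pixel page scan.
print(normalize_bbox((124, 175, 620, 877), 1240, 1754))  # -> (100, 99, 500, 500)
```

If the OCR engine already returns relative coordinates in the 0-1 range, the same idea applies with `page_width = page_height = 1`.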

Training Data

Property	Detail
Source dataset	`amaye15/invoices-google-ocr`
Total samples	13,463
Kept after filtering	7,801 (57.9% keep rate)
OCR engine	docTR (db_resnet50 + crnn_vgg16_bn)
Annotation method	Regex extraction + fuzzy match (90%) + spatial constraints
Min labelled tokens	8 non-O tags per sample
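The two numeric rules in the table (a 90% fuzzy match and a minimum of 8 non-O tags) can be sketched as follows. This is an illustration under assumptions, not the annotation code used for this model: the card does not name the fuzzy-matching library, so the standard-library `difflib` stands in for it, and the BIO-style tag strings and example invoice number are hypothetical.

```python
from difflib import SequenceMatcher

MIN_LABELLED_TOKENS = 8   # from the table: min non-O tags per sample
FUZZY_THRESHOLD = 0.90    # from the table: fuzzy match (90%)


def fuzzy_match(candidate, target, threshold=FUZZY_THRESHOLD):
    """True when two strings are at least `threshold` similar (case-insensitive)."""
    ratio = SequenceMatcher(None, candidate.lower(), target.lower()).ratio()
    return ratio >= threshold


def keep_sample(tags, min_labelled=MIN_LABELLED_TOKENS):
    """Keep a sample only if enough tokens carry a tag other than O."""
    return sum(tag != "O" for tag in tags) >= min_labelled


# An OCR slip (letter O instead of zero) still matches the regex-extracted value.
print(fuzzy_match("INV-2024-0O17", "INV-2024-0017"))

# A sample with only 3 labelled tokens would be filtered out.
sparse = ["O"] * 40 + ["B-INVOICE_NUMBER", "I-INVOICE_NUMBER", "I-INVOICE_NUMBER"]
print(keep_sample(sparse))

# Keep rate reported in the table: 7,801 of 13,463 samples.
print(f"keep rate: {7801 / 13463:.1%}")
```

The spatial constraints mentioned in the table are not described in enough detail to sketch here.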

Training Configuration

Hyperparameter	Value
Base model	`albertosei/invoice-ner-v1-mychen76`
Learning rate	5e-6 (reduced to prevent catastrophic forgetting)
Epochs	3
Batch size	4 (2 per GPU × 2 T4 GPUs)
Optimiser	AdamW (weight decay=0.01)
LR scheduler	Linear warmup (10%) + linear decay
Max sequence length	512
Train/val split	90/10 (7,021 train / 780 val)
Platform	Kaggle 2× T4 GPU
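To make the schedule concrete, the sketch below derives step counts from the table and evaluates a linear warmup followed by linear decay. It assumes no gradient accumulation and that the final partial batch of each epoch is kept; neither is stated in the card. The shape follows the usual linear-warmup schedule (as in `get_linear_schedule_with_warmup` from `transformers`), but the helper `lr_at` is written here for illustration only.

```python
import math

TRAIN_SAMPLES = 7021      # from the table
BATCH_SIZE = 4            # 2 per GPU x 2 GPUs
EPOCHS = 3
PEAK_LR = 5e-6
WARMUP_FRACTION = 0.10

# Assumption: the last partial batch is kept and there is no gradient accumulation.
steps_per_epoch = math.ceil(TRAIN_SAMPLES / BATCH_SIZE)
total_steps = steps_per_epoch * EPOCHS
warmup_steps = int(WARMUP_FRACTION * total_steps)


def lr_at(step):
    """Learning rate at an optimiser step: linear ramp up, then linear decay to zero."""
    if step < warmup_steps:
        return PEAK_LR * step / warmup_steps
    return PEAK_LR * max(0.0, (total_steps - step) / (total_steps - warmup_steps))


print(steps_per_epoch, total_steps, warmup_steps)
for step in (0, warmup_steps // 2, warmup_steps, total_steps):
    print(step, lr_at(step))
```

Under these assumptions the learning rate reaches its 5e-6 peak after the warmup steps and falls back to zero at the end of epoch 3.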

Results

Epoch	Train Loss	Val Loss	Val F1
1	0.1756	0.0721	0.8883
2	0.0622	0.0535	0.9071
3	0.0487	0.0510	0.9131

Best Val F1: 0.9131 (epoch 3)
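The card does not say whether Val F1 is computed per token or per entity, or how it is averaged. As one common convention for NER, the sketch below computes entity-level F1 on a single sequence, assuming BIO tags; the function names and example tags are hypothetical, and the reported numbers may have been produced differently.

```python
def extract_spans(tags):
    """Return a set of (entity_type, start, end) spans from a BIO tag sequence."""
    spans, start, current = set(), None, None
    for i, tag in enumerate(list(tags) + ["O"]):  # sentinel flushes the last span
        prefix, _, label = tag.partition("-")
        continues = prefix == "I" and label == current
        if current is not None and not continues:
            spans.add((current, start, i))
            start, current = None, None
        # A stray I- tag with no open span is treated leniently as a span start.
        if prefix == "B" or (prefix == "I" and current is None):
            start, current = i, label
    return spans


def entity_f1(gold_tags, pred_tags):
    """Exact-match entity F1: a span counts only if type and boundaries agree."""
    gold, pred = extract_spans(gold_tags), extract_spans(pred_tags)
    tp = len(gold & pred)
    precision = tp / len(pred) if pred else 0.0
    recall = tp / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


gold = ["B-INVOICE_NUMBER", "I-INVOICE_NUMBER", "O", "B-TOTAL_AMOUNT"]
pred = ["B-INVOICE_NUMBER", "I-INVOICE_NUMBER", "O", "O"]
print(round(entity_f1(gold, pred), 4))  # precision 1.0, recall 0.5
```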

Entity Types

Same 15 entity types as v1:

Entity	Description
INVOICE_NUMBER	Unique invoice identifier
INVOICE_DATE	Date invoice was issued
DUE_DATE	Payment due date
REFERENCE_NUMBER	PO or reference number
SELLER_NAME	Name of the selling entity
SELLER_ADDRESS	Address of the seller
CLIENT_NAME	Name of the buying entity
CLIENT_ADDRESS	Address of the client
ITEM_NAME	Name of line item product/service
ITEM_DESC	Description of line item
QTY	Quantity of line item
UNIT_PRICE	Price per unit
LINE_TOTAL	Total for a single line item
TOTAL_AMOUNT	Final invoice total
TAX_AMOUNT	Tax or VAT amount
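For token classification, entity types are usually expanded into per-token labels. The sketch below assumes a BIO scheme, which the card does not confirm, and the label ordering is hypothetical: when running the model, the mapping stored in the checkpoint's own config should be used instead.

```python
ENTITY_TYPES = [
    "INVOICE_NUMBER", "INVOICE_DATE", "DUE_DATE", "REFERENCE_NUMBER",
    "SELLER_NAME", "SELLER_ADDRESS", "CLIENT_NAME", "CLIENT_ADDRESS",
    "ITEM_NAME", "ITEM_DESC", "QTY", "UNIT_PRICE",
    "LINE_TOTAL", "TOTAL_AMOUNT", "TAX_AMOUNT",
]

# Assumed BIO scheme: one "O" label plus B-/I- variants for each entity type.
labels = ["O"] + [f"{prefix}-{entity}" for entity in ENTITY_TYPES for prefix in ("B", "I")]
label2id = {label: idx for idx, label in enumerate(labels)}
id2label = {idx: label for label, idx in label2id.items()}

print(len(ENTITY_TYPES), len(labels))  # 15 entity types -> 31 labels under BIO
```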

Why F1 is Lower Than v1

v1 scored 0.9497 on a 42-sample validation set of synthetic invoices similar to its training data. v2 scores 0.9131 on a 780-sample validation set of much more diverse invoices. The two scores are therefore not directly comparable: the drop reflects the larger, harder validation set rather than a weaker model, and v2 is expected to generalise better than v1.
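Validation-set size also affects how much each score can be trusted. As a rough illustration only, if per-sample scores are treated as independent draws (a simplifying assumption, since F1 is aggregated over entities rather than samples), the sampling noise of an average shrinks with the square root of the sample count:

```python
import math

V1_VAL_SAMPLES = 42    # v1 validation set size
V2_VAL_SAMPLES = 780   # v2 validation set size

# Under the independence assumption, noise scales as 1 / sqrt(n).
noise_ratio = math.sqrt(V2_VAL_SAMPLES / V1_VAL_SAMPLES)
print(f"v1's score is roughly {noise_ratio:.1f}x noisier than v2's")
```

This says nothing about the difficulty gap between the two sets; it only shows that the v1 figure rests on far fewer samples.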

Limitations

Still trained primarily on synthetic/Western-style invoices
Performance on real-world Ghanaian invoices is only about 50-60% at this stage
Intermediate checkpoint, superseded by v3, v4, and v5

Sequential Fine-tuning Pipeline

Fine-tuning lineage, from base model to this checkpoint:

microsoft/layoutlmv3-large → cuongdz01/layoutlmv3-large-cord → albertosei/invoice-ner-v1-mychen76 → albertosei/invoice-ner-v2-amaye15 (this model)

Model size: 0.4B params (Safetensors, F32)