qwen-receipt-extractor

LoRA fine-tuned Qwen2.5-0.5B-Instruct for structured JSON extraction from noisy OCR receipts and invoices.

Model description

Fine-tuned on ~2040 noisy OCR receipt/invoice examples using LoRA (rank 16). Takes raw, garbled OCR text and extracts structured JSON with company name, address, date, total amount, and line items.
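
As a rough illustration of the output shape described above, the record can be written as Python TypedDicts. The field names come from the extraction instruction in the Usage section; the value types (strings versus numbers) and the helper `missing_fields` are assumptions, since the card does not specify them.

```python
from typing import Any, List, Optional, TypedDict


class LineItem(TypedDict):
    item_name: Optional[str]
    quantity: Optional[float]  # assumed numeric; the card does not state the type
    price: Optional[float]     # assumed numeric; the card does not state the type


class Receipt(TypedDict):
    company_name: Optional[str]
    address: Optional[str]
    date: Optional[str]
    total_amount: Optional[float]  # assumed numeric
    line_items: Optional[List[LineItem]]


TOP_LEVEL_FIELDS = ("company_name", "address", "date", "total_amount", "line_items")
LINE_ITEM_FIELDS = ("item_name", "quantity", "price")


def missing_fields(record: Any) -> List[str]:
    """Return the names of expected keys that are absent from a parsed record.

    Hypothetical helper: a key whose value is None counts as present, because
    the instruction asks the model to use null for undeterminable fields.
    """
    if not isinstance(record, dict):
        return list(TOP_LEVEL_FIELDS)
    missing = [name for name in TOP_LEVEL_FIELDS if name not in record]
    for index, item in enumerate(record.get("line_items") or []):
        for name in LINE_ITEM_FIELDS:
            if not isinstance(item, dict) or name not in item:
                missing.append(f"line_items[{index}].{name}")
    return missing
```

An empty list from `missing_fields` means every expected key is present; it says nothing about whether the values are correct.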

Designed to run entirely locally: inference makes no API calls, so documents never leave your machine.

Performance

Evaluated on 204 held-out examples against the base Qwen2.5-0.5B-Instruct:

Field	Baseline	Fine-tuned	Δ (pp)
JSON valid	80.9%	99.5%	+18.6
Company name	38.2%	46.6%	+8.3
Date	15.2%	83.8%	+68.6
Total amount	0.0%	99.0%	+99.0
Line items	58.5%	97.0%	+38.5
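
The card does not say how each field was scored. As one plausible reading, the sketch below computes per-field exact-match accuracy over paired predictions and references, plus the change in percentage points. `field_accuracy`, `delta_pp`, and the case and whitespace normalization are assumptions, not the author's evaluation code.

```python
from typing import Any, Dict, List, Optional


def _normalize(value: Any) -> Any:
    """Compare strings case-insensitively with collapsed whitespace (assumed rule)."""
    if isinstance(value, str):
        return " ".join(value.split()).lower()
    return value


def field_accuracy(
    predictions: List[Optional[Dict[str, Any]]],
    references: List[Dict[str, Any]],
    field: str,
) -> float:
    """Percentage of examples whose predicted field equals the reference.

    A prediction of None stands for output that was not valid JSON and is
    counted as wrong.
    """
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    correct = 0
    for pred, ref in zip(predictions, references):
        if isinstance(pred, dict) and _normalize(pred.get(field)) == _normalize(ref.get(field)):
            correct += 1
    return 100.0 * correct / len(references)


def delta_pp(baseline: float, fine_tuned: float) -> float:
    """Difference between two accuracies, in percentage points."""
    return round(fine_tuned - baseline, 1)
```

For example, `delta_pp(80.9, 99.5)` gives the 18.6-point gain reported for JSON validity.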

Usage

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch
import json

BASE_MODEL = "Qwen/Qwen2.5-0.5B-Instruct"
ADAPTER = "avatar63/qwen-receipt-extractor"

INSTRUCTION = (
    "Extract the following fields from the OCR text as JSON: "
    "company_name, address, date, total_amount, line_items "
    "(each with item_name, quantity, price). "
    "Use null for any field that cannot be determined."
)

tokenizer = AutoTokenizer.from_pretrained(ADAPTER, trust_remote_code=True)
base = AutoModelForCausalLM.from_pretrained(
    BASE_MODEL, dtype=torch.float16, device_map="auto", trust_remote_code=True
)
model = PeftModel.from_pretrained(base, ADAPTER)
model.eval()

noisy_text = """
RELI4NCE FR3SH
Sh0p N0 12, 5ect0r 18
D4te: O5-ll-2O24
Net P4y4ble: 34O.OO
"""

messages = [
    {"role": "system", "content": INSTRUCTION},
    {"role": "user", "content": noisy_text}
]

text = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
inputs = tokenizer(text, return_tensors="pt").to(model.device)

with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=512,
        temperature=0.1,
        do_sample=True,
        pad_token_id=tokenizer.eos_token_id
    )

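# Drop the prompt tokens so only the newly generated completion is decoded.
generated = outputs[0][inputs["input_ids"].shape[1]:]
result = tokenizer.decode(generated, skip_special_tokens=True)
try:
    print(json.loads(result))
except json.JSONDecodeError:
    # A small share of outputs is not valid JSON (see Performance); show the raw text instead.
    print(result)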

Demo Space: https://huggingface.co/spaces/avatar63/receipt-extractor-demo
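
Because a small share of outputs is not valid JSON (99.5% validity in the Performance table), calling `json.loads` directly on the decoded text can raise. A minimal sketch of a more forgiving parser follows. `extract_json` is a hypothetical helper, and the assumption that failures look like stray text around a single JSON object has not been checked against this model.

```python
import json
from typing import Any, Optional


def extract_json(text: str) -> Optional[Any]:
    """Parse the first JSON object found in `text`, or return None.

    Tries the whole string first, then scans for a '{' from which a complete
    object can be decoded, ignoring anything before or after it.
    """
    stripped = text.strip()
    try:
        return json.loads(stripped)
    except json.JSONDecodeError:
        pass
    decoder = json.JSONDecoder()
    start = stripped.find("{")
    while start != -1:
        try:
            value, _end = decoder.raw_decode(stripped, start)
            return value
        except json.JSONDecodeError:
            # This '{' did not open a complete object; try the next one.
            start = stripped.find("{", start + 1)
    return None
```

In the usage script, `extract_json(result)` can stand in for `json.loads(result)`; a return value of None then signals that the output needs manual review.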

Training details

Parameter	Value
Base model	Qwen/Qwen2.5-0.5B-Instruct
Method	LoRA via HuggingFace PEFT + TRL
LoRA rank	16
LoRA alpha	32
Trainable parameters	8.8M / 502M (1.75%)
Training examples	~2040
Epochs	3
Learning rate	2e-4
Hardware	RTX 3060 12GB
Training time	~28 minutes
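
The trainable-parameter figure can be sanity-checked by hand. A LoRA adapter of rank r on a linear layer with d_in inputs and d_out outputs adds r × (d_in + d_out) weights. The sketch below applies that to Qwen2.5-0.5B. It assumes the layer sizes from the model's published config (hidden size 896, MLP size 4864, 24 layers, key/value projection width 128) and assumes all seven attention and MLP projections were adapted. The card does not list the target modules, so both are assumptions.

```python
RANK = 16        # LoRA rank from the table above
ALPHA = 32       # LoRA alpha from the table above
NUM_LAYERS = 24  # assumed: Qwen2.5-0.5B transformer layers
HIDDEN = 896     # assumed: hidden size
KV_DIM = 128     # assumed: 2 key/value heads x head dimension 64
MLP = 4864       # assumed: MLP intermediate size

# (d_in, d_out) of each projection assumed to carry an adapter.
PROJECTIONS = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, MLP),
    "up_proj": (HIDDEN, MLP),
    "down_proj": (MLP, HIDDEN),
}


def lora_params(d_in: int, d_out: int, rank: int) -> int:
    """Weights in the two low-rank matrices A (rank x d_in) and B (d_out x rank)."""
    return rank * (d_in + d_out)


per_layer = sum(lora_params(d_in, d_out, RANK) for d_in, d_out in PROJECTIONS.values())
total = per_layer * NUM_LAYERS

print(total)                             # 8798208, about 8.8M
print(f"{100 * total / 502e6:.2f}%")     # 1.75% of the 502M total in the table
print(ALPHA / RANK)                      # 2.0, the LoRA scaling factor alpha / r
```

Under these assumptions the count agrees with the 8.8M and 1.75% in the table, which suggests (but does not prove) that all linear projections were adapted.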

Limitations

Character-level denoising is only partial: item names and company suffixes may retain some OCR noise
Addresses may be hallucinated when the input is sparse or ambiguous
Some receipts are ambiguous between net payable and subtotal, so the extracted total may be the wrong one
Trained primarily on Malaysian and synthetic English receipts; other locales are likely to perform worse
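
One way to catch the net-payable versus subtotal confusion listed above is to compare the extracted total with the sum of the line items and flag large gaps for review. The sketch assumes that `price` is a unit price, that quantities and prices are numbers or numeric strings, and that a 1% tolerance is acceptable. `total_matches_items` is a hypothetical helper, and a mismatch can also be legitimate (tax, discounts, rounding).

```python
from typing import Any, Dict, Optional


def _to_float(value: Any) -> Optional[float]:
    """Convert a number or numeric string to float, or return None."""
    if value is None or isinstance(value, bool):
        return None
    try:
        # Assumes commas are thousands separators, e.g. "1,234.50".
        return float(str(value).replace(",", ""))
    except ValueError:
        return None


def total_matches_items(record: Dict[str, Any], rel_tol: float = 0.01) -> Optional[bool]:
    """Check whether total_amount is close to the sum of quantity * price.

    Returns None when the check cannot be made (missing total, no items,
    or an unreadable price); a missing quantity is treated as 1.
    """
    total = _to_float(record.get("total_amount"))
    items = record.get("line_items") or []
    if total is None or not items:
        return None
    item_sum = 0.0
    for item in items:
        if not isinstance(item, dict):
            return None
        price = _to_float(item.get("price"))
        quantity = _to_float(item.get("quantity"))
        if price is None:
            return None
        item_sum += price * (quantity if quantity is not None else 1.0)
    return abs(total - item_sum) <= rel_tol * max(abs(total), abs(item_sum))
```

A result of False is a prompt to look at the receipt, not proof of an extraction error.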

Datasets

SROIE — ICDAR 2019
High Quality Invoice Images for OCR — Kaggle

Base model

Qwen2.5-0.5B-Instruct — Qwen Team, Alibaba Cloud
