qwen-receipt-extractor
LoRA fine-tuned Qwen2.5-0.5B-Instruct for structured JSON extraction from noisy OCR receipts and invoices.
GitHub: avatar63/llm-doc-extract
Model description
Fine-tuned on ~2040 noisy OCR receipt/invoice examples using LoRA (rank 16). Takes raw, garbled OCR text and extracts structured JSON with company name, address, date, total amount, and line items.
Runs entirely locally: no API calls at inference time, so documents never leave your machine.
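The output shape described above can be sketched as a typed schema. The key names follow the prompt in the Usage section; the value types (strings and floats, all nullable) and the `missing_fields` helper are assumptions for illustration, not something the adapter guarantees.

```python
from typing import List, Optional, TypedDict


class LineItem(TypedDict):
    # Types are assumed; the model may also emit numbers as strings.
    item_name: Optional[str]
    quantity: Optional[float]
    price: Optional[float]


class Receipt(TypedDict):
    company_name: Optional[str]
    address: Optional[str]
    date: Optional[str]
    total_amount: Optional[float]
    line_items: Optional[List[LineItem]]


TOP_LEVEL_FIELDS = ("company_name", "address", "date", "total_amount", "line_items")
ITEM_FIELDS = ("item_name", "quantity", "price")


def missing_fields(record: dict) -> list:
    """Return the expected keys that are absent from a parsed record (hypothetical helper)."""
    missing = [key for key in TOP_LEVEL_FIELDS if key not in record]
    # A null or absent line_items list contributes no per-item checks.
    for index, item in enumerate(record.get("line_items") or []):
        missing += [f"line_items[{index}].{key}" for key in ITEM_FIELDS if key not in item]
    return missing
```

A key whose value is null counts as present, which matches the prompt's "use null for any field that cannot be determined" rule.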
Performance
Evaluated on 204 held-out examples vs base Qwen2.5-0.5B-Instruct:
| Field | Baseline | Fine-tuned | Δ |
|---|---|---|---|
| JSON valid | 80.9% | 99.5% | +18.6% |
| Company name | 38.2% | 46.6% | +8.3% |
| Date | 15.2% | 83.8% | +68.6% |
| Total amount | 0.0% | 99.0% | +99.0% |
| Line items | 58.5% | 97.0% | +38.5% |
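The card does not state how each field was scored. One plausible rule is exact match after whitespace and case normalization, sketched below; `normalize` and `field_accuracy` are hypothetical helpers, and the actual evaluation may have used a different criterion.

```python
def normalize(value):
    """Collapse whitespace and lowercase so trivial formatting differences do not count as errors."""
    if value is None:
        return None
    return " ".join(str(value).split()).lower()


def field_accuracy(predictions, references, field):
    """Percentage of examples whose predicted field equals the reference after normalization.

    A prediction of None (for example, output that was not valid JSON) is treated
    as an empty record. Two nulls compare equal and count as a hit.
    """
    if not references or len(predictions) != len(references):
        raise ValueError("predictions and references must be non-empty and the same length")
    hits = sum(
        normalize((pred or {}).get(field)) == normalize(ref.get(field))
        for pred, ref in zip(predictions, references)
    )
    return 100.0 * hits / len(references)
```

Under this rule, the Δ column is simply the fine-tuned score minus the baseline score, in percentage points.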
Usage
import json

import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

BASE_MODEL = "Qwen/Qwen2.5-0.5B-Instruct"
ADAPTER = "avatar63/qwen-receipt-extractor"

INSTRUCTION = (
    "Extract the following fields from the OCR text as JSON: "
    "company_name, address, date, total_amount, line_items "
    "(each with item_name, quantity, price). "
    "Use null for any field that cannot be determined."
)

# Load the tokenizer from the adapter repo and the weights from the base model.
# Note: older transformers releases expect `torch_dtype` instead of `dtype`.
tokenizer = AutoTokenizer.from_pretrained(ADAPTER, trust_remote_code=True)
base = AutoModelForCausalLM.from_pretrained(
    BASE_MODEL, dtype=torch.float16, device_map="auto", trust_remote_code=True
)

# Attach the LoRA adapter on top of the base model and switch to inference mode.
model = PeftModel.from_pretrained(base, ADAPTER)
model.eval()

noisy_text = """
RELI4NCE FR3SH
Sh0p N0 12, 5ect0r 18
D4te: O5-ll-2O24
Net P4y4ble: 34O.OO
"""

# The extraction instruction goes in the system turn, the raw OCR text in the user turn.
messages = [
    {"role": "system", "content": INSTRUCTION},
    {"role": "user", "content": noisy_text},
]
text = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
inputs = tokenizer(text, return_tensors="pt").to(model.device)

with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=512,
        temperature=0.1,
        do_sample=True,
        pad_token_id=tokenizer.eos_token_id,
    )

# Drop the prompt tokens so only the newly generated JSON is decoded.
generated = outputs[0][inputs["input_ids"].shape[1]:]
result = tokenizer.decode(generated, skip_special_tokens=True)
print(json.loads(result))
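The final `json.loads(result)` call raises if the model wraps the JSON in extra text, which the 99.5% JSON-valid rate suggests can still happen occasionally. A minimal sketch of a more tolerant parser follows; `parse_model_json` is a hypothetical helper, not part of this repository.

```python
import json


def parse_model_json(text):
    """Return the first JSON object found in `text`, or None if there is none.

    Tries each "{" as a candidate start and lets the standard-library decoder
    decide where the object ends, so surrounding prose or code fences are ignored.
    """
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start != -1:
        try:
            obj, _end = decoder.raw_decode(text, start)
            return obj
        except json.JSONDecodeError:
            # Not a valid object at this position; try the next opening brace.
            start = text.find("{", start + 1)
    return None
```

Calling `parse_model_json(result)` in place of `json.loads(result)` lets the caller handle a `None` return instead of catching an exception.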
Demo Space: https://huggingface.co/spaces/avatar63/receipt-extractor-demo
Training details
| Parameter | Value |
|---|---|
| Base model | Qwen/Qwen2.5-0.5B-Instruct |
| Method | LoRA via HuggingFace PEFT + TRL |
| LoRA rank | 16 |
| LoRA alpha | 32 |
| Trainable parameters | 8.8M / 502M (1.75%) |
| Training examples | ~2040 |
| Epochs | 3 |
| Learning rate | 2e-4 |
| Hardware | RTX 3060 12GB |
| Training time | ~28 minutes |
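The trainable-parameter figure can be sanity-checked with a little arithmetic: a rank-r LoRA adapter on a linear layer of shape (d_in, d_out) adds r × (d_in + d_out) parameters. The sketch below assumes the published Qwen2.5-0.5B dimensions and adapters on all seven projection matrices in every layer. The card does not list the target modules, so that choice is an assumption that happens to reproduce the ~8.8M figure.

```python
# Assumed Qwen2.5-0.5B dimensions; verify against the model's config.json.
HIDDEN = 896
INTERMEDIATE = 4864
KV_DIM = 128  # 2 key/value heads x 64 head dim
NUM_LAYERS = 24
RANK = 16  # from the table above

# (d_in, d_out) for each assumed LoRA target inside one transformer layer.
TARGET_SHAPES = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}


def lora_param_count(rank, shapes, num_layers):
    """Each adapter adds an A matrix (d_in x rank) and a B matrix (rank x d_out)."""
    per_layer = sum(rank * (d_in + d_out) for d_in, d_out in shapes.values())
    return per_layer * num_layers


trainable = lora_param_count(RANK, TARGET_SHAPES, NUM_LAYERS)
share = 100 * trainable / 502e6  # 502M total taken from the table above
print(f"{trainable:,} trainable parameters ({share:.2f}% of 502M)")
```

Under these assumptions the count is 8,798,208, consistent with the 8.8M and 1.75% reported in the table. With alpha 32 and rank 16, the LoRA update is scaled by alpha / rank = 2.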
Limitations
- Partial character-level denoising: item names and company suffixes may retain some OCR noise
- Address hallucination on sparse/ambiguous inputs
- Net payable vs subtotal ambiguity on some receipts
- Trained primarily on Malaysian and synthetic English receipts
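The net-payable versus subtotal ambiguity can be partly caught after extraction by comparing the reported total with the sum of the line items. The sketch below is one such check. It assumes `price` is a unit price and that amounts parse as plain decimals; `total_matches_items` is a hypothetical helper, and a mismatch may equally reflect tax, rounding, or discounts not captured in the line items.

```python
def _to_number(value):
    """Parse an int, float, or numeric string (commas allowed); return None otherwise."""
    try:
        return float(str(value).replace(",", ""))
    except (TypeError, ValueError):
        return None


def total_matches_items(record, tolerance=0.01):
    """Return True/False when the total can be checked, or None when it cannot.

    Assumes each line item's `price` is a unit price, so the expected total is
    the sum of quantity * price over all items.
    """
    total = _to_number(record.get("total_amount"))
    items = record.get("line_items") or []
    if total is None or not items:
        return None
    subtotal = 0.0
    for item in items:
        quantity = _to_number(item.get("quantity"))
        price = _to_number(item.get("price"))
        if quantity is None or price is None:
            return None  # Incomplete item: nothing reliable to compare against.
        subtotal += quantity * price
    return abs(subtotal - total) <= tolerance
```

Records that return `False` are candidates for manual review rather than automatic correction.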
Datasets
- SROIE (ICDAR 2019)
- High Quality Invoice Images for OCR (Kaggle)
Base model
- Qwen2.5-0.5B-Instruct (Qwen Team, Alibaba Cloud)