mitulshah/transaction-categorization
How to use maaz-zaidi/transaction-classifier-minilm with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline
pipe = pipeline("text-classification", model="maaz-zaidi/transaction-classifier-minilm")

# Load model directly
from transformers import AutoTokenizer, AutoModelForSequenceClassification
tokenizer = AutoTokenizer.from_pretrained("maaz-zaidi/transaction-classifier-minilm")
model = AutoModelForSequenceClassification.from_pretrained("maaz-zaidi/transaction-classifier-minilm")

How to use maaz-zaidi/transaction-classifier-minilm with sentence-transformers:
from sentence_transformers import SentenceTransformer
model = SentenceTransformer("maaz-zaidi/transaction-classifier-minilm")
sentences = [
    "The weather is lovely today.",
    "It's so sunny outside!",
    "He drove to the stadium.",
]
embeddings = model.encode(sentences)
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]

A fine-tuned sentence-transformers/all-MiniLM-L6-v2 model that classifies raw bank transaction strings into 10 budget categories using standard cross-entropy fine-tuning.
This is version 4 (Phase 4b) in a progressive model development series. It was the production model before being succeeded by the metadata-enriched variant (v7).
| Property | Value |
|---|---|
| Base model | sentence-transformers/all-MiniLM-L6-v2 (22M params) |
| Task | Multi-class text classification (10 categories) |
| Training samples | 8,000 |
| Epochs | 3 |
| Batch size | 64 |
| Learning rate | 2e-5 |
| Max sequence length | 64 tokens |
| Loss | Cross-entropy |
| Format | SafeTensors |
| Trained | 2026-03-29 |
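
The training budget implied by the table above can be worked out from the sample count, batch size, and epoch count. The following is a back-of-the-envelope sketch only: it assumes one optimizer step per batch (no gradient accumulation) and that a final partial batch would be kept, neither of which is stated in this card. The function name `training_steps` is hypothetical.

```python
import math

# Values copied from the training table above.
TRAIN_SAMPLES = 8_000
BATCH_SIZE = 64
EPOCHS = 3


def training_steps(samples: int, batch_size: int, epochs: int) -> tuple[int, int]:
    """Return (steps_per_epoch, total_steps).

    Assumption: one optimizer step per batch, no gradient accumulation,
    and a trailing partial batch counts as a full step.
    """
    steps_per_epoch = math.ceil(samples / batch_size)
    return steps_per_epoch, steps_per_epoch * epochs


steps_per_epoch, total_steps = training_steps(TRAIN_SAMPLES, BATCH_SIZE, EPOCHS)
print(steps_per_epoch, total_steps)  # 125 375
```

Under those assumptions the whole run is a few hundred updates, which is consistent with a light fine-tune of a small base model.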

Category labels:

| ID | Category |
|---|---|
| 0 | Food & Dining |
| 1 | Transportation |
| 2 | Shopping & Retail |
| 3 | Entertainment & Recreation |
| 4 | Healthcare & Medical |
| 5 | Utilities & Services |
| 6 | Financial Services |
| 7 | Income |
| 8 | Government & Legal |
| 9 | Charity & Donations |
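
The ID-to-category table above can be turned into lookup dictionaries. This is a minimal sketch, not the author's code: whether this model's config ships readable label names is not stated here, and Transformers falls back to generic `LABEL_<id>` strings when none are configured, so the hypothetical helper `resolve_label` accepts both forms.

```python
# Category names in ID order, copied from the table above.
CATEGORIES = [
    "Food & Dining", "Transportation", "Shopping & Retail",
    "Entertainment & Recreation", "Healthcare & Medical",
    "Utilities & Services", "Financial Services", "Income",
    "Government & Legal", "Charity & Donations",
]

ID2LABEL = dict(enumerate(CATEGORIES))
LABEL2ID = {name: idx for idx, name in ID2LABEL.items()}


def resolve_label(raw_label: str) -> str:
    """Map a classifier label to a category name.

    Accepts either a generic 'LABEL_<id>' string or an already
    readable category name; anything else raises ValueError.
    """
    if raw_label.startswith("LABEL_"):
        return ID2LABEL[int(raw_label.split("_", 1)[1])]
    if raw_label in LABEL2ID:
        return raw_label
    raise ValueError(f"Unknown label: {raw_label!r}")


print(resolve_label("LABEL_1"))  # Transportation
print(LABEL2ID["Income"])        # 7
```

The same two dictionaries could be passed as `id2label` and `label2id` when loading the model config, so that pipeline output carries readable names directly.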

Evaluated on 505 unique real-world RBC transactions (3,113 weighted, 2019-2026). Results shown are after Phase 4b preprocessing fixes.

| Metric | Score |
|---|---|
| Real-world accuracy (weighted) | 86.5% |
| ML-only accuracy | 78.7% |
| Validation accuracy | 93.0% |
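
The distinction between unique and weighted counts can be illustrated with a short sketch. It assumes that "weighted" means each unique transaction string is counted once per occurrence in the statement history; the card does not define the weighting, and the rows below are made-up placeholders, not evaluation data.

```python
from dataclasses import dataclass


@dataclass
class EvalRow:
    text: str      # unique transaction string
    correct: bool  # whether the prediction matched the reference label
    count: int     # occurrences of this string (assumed weight)


def accuracy(rows: list[EvalRow], weighted: bool) -> float:
    """Accuracy over unique strings, or weighted by occurrence count."""
    total = sum(r.count if weighted else 1 for r in rows)
    hits = sum((r.count if weighted else 1) for r in rows if r.correct)
    return hits / total


# Hypothetical rows for illustration only.
rows = [
    EvalRow("COFFEE SHOP #12", True, 40),
    EvalRow("TRANSIT PASS RELOAD", True, 9),
    EvalRow("UNKNOWN MERCHANT 8841", False, 1),
]

print(f"{accuracy(rows, weighted=False):.3f}")  # 0.667
print(f"{accuracy(rows, weighted=True):.3f}")   # 0.980
```

Under this reading, frequent merchants dominate a weighted score, so a rare string that is misclassified costs very little.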

Per-category accuracy:

| Category | Accuracy |
|---|---|
| Income | 100.0% |
| Healthcare & Medical | 100.0% |
| Financial Services | 94.7% |
| Food & Dining | 89.3% |
| Entertainment & Recreation | 88.6% |
| Transportation | 83.3% |
| Shopping & Retail | 78.9% |
| Utilities & Services | 68.4% |
| Government & Legal | 54.5% |
| Charity & Donations | 0.0% |
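
The per-category figures above can be summarized with an unweighted (macro) average, which treats every category equally regardless of how many transactions it contains. This is plain arithmetic on the table values, not a metric reported by the author, and since per-category sample counts are not given it should not be compared directly with the weighted accuracy.

```python
# Per-category accuracy (%), copied from the table above.
per_category = {
    "Income": 100.0,
    "Healthcare & Medical": 100.0,
    "Financial Services": 94.7,
    "Food & Dining": 89.3,
    "Entertainment & Recreation": 88.6,
    "Transportation": 83.3,
    "Shopping & Retail": 78.9,
    "Utilities & Services": 68.4,
    "Government & Legal": 54.5,
    "Charity & Donations": 0.0,
}

# Macro average: every category counts equally.
macro_accuracy = sum(per_category.values()) / len(per_category)

# The three categories with the lowest accuracy.
weakest = sorted(per_category, key=per_category.get)[:3]

print(f"Macro-average accuracy: {macro_accuracy:.1f}%")  # 75.8%
print(weakest)
# ['Charity & Donations', 'Government & Legal', 'Utilities & Services']
```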

from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

model_name = "maaz-zaidi/transaction-classifier-minilm"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

# List index corresponds to the class ID in the category table
categories = [
    "Food & Dining", "Transportation", "Shopping & Retail",
    "Entertainment & Recreation", "Healthcare & Medical",
    "Utilities & Services", "Financial Services", "Income",
    "Government & Legal", "Charity & Donations",
]

text = "UBER TRIP HELP.UBER.COM ON"
# Truncate to the 64-token maximum sequence length used in training
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=64)

# Inference only, so skip gradient tracking
with torch.no_grad():
    logits = model(**inputs).logits

predicted = torch.argmax(logits, dim=-1).item()
print(f"Category: {categories[predicted]}")
# Output: Category: Transportation
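
The usage example above reports only the arg-max class. Where a confidence estimate or a ranked list of candidates is useful, the logits can be passed through a softmax. The sketch below uses plain Python so it runs without the model; the logit values are hypothetical, and in practice they would come from something like `logits[0].tolist()` in the example above. The helper names `softmax` and `top_k` are illustrative.

```python
import math

CATEGORIES = [
    "Food & Dining", "Transportation", "Shopping & Retail",
    "Entertainment & Recreation", "Healthcare & Medical",
    "Utilities & Services", "Financial Services", "Income",
    "Government & Legal", "Charity & Donations",
]


def softmax(logits: list[float]) -> list[float]:
    """Numerically stable softmax: subtract the max before exponentiating."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k(logits: list[float], k: int = 3) -> list[tuple[str, float]]:
    """Return the k most probable (category, probability) pairs."""
    probs = softmax(logits)
    ranked = sorted(zip(CATEGORIES, probs), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


# Hypothetical logits for one transaction (not real model output).
example_logits = [0.2, 4.1, 0.5, -0.3, -1.0, 1.2, 0.1, -2.0, -0.8, -1.5]

for name, prob in top_k(example_logits):
    print(f"{name}: {prob:.3f}")
```

A low top probability could be used as a signal to route a transaction to manual review, although softmax scores from a fine-tuned classifier are not guaranteed to be well calibrated.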
See the Transaction Classifier collection for all 7 model versions.

@misc{zaidi2026txnclassifier,
  title={Transaction Classifier: Multi-Stage Bank Transaction Categorization},
  author={Maaz Zaidi},
  year={2026},
  url={https://huggingface.co/maaz-zaidi/transaction-classifier-minilm}
}