EcomIntent — E-commerce Support Ticket Intent Classifier

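Fine-tuned DistilBERT for classifying e-commerce customer support tickets into 9 intent categories. On the held-out test set it outperforms zero-shot GPT-4o mini at roughly 15x lower cost and 50x lower P95 latency (see Evaluation Results, including the note on methodology).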

Live API: RapidAPI Listing | GitHub: ecomintent-api

Model Description

EcomIntent is a DistilBERT-base model fine-tuned on two Bitext e-commerce/support datasets, augmented with Claude-generated synthetic examples. It classifies short English text (support tickets, chat messages, emails) into one of 9 intent categories specific to e-commerce customer support.

Base model: distilbert-base-uncased
Task: Multi-class text classification (9 classes)
Language: English
License: Apache 2.0

Intended Uses

Primary use: Routing incoming support tickets to the correct queue or agent
Secondary uses:
- Triggering automated responses (WISMO → send tracking link)
- Analytics and intent volume dashboards
- Gorgias / Zendesk / Shopify Inbox integrations
- Preprocessing before LLM-based response generation
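
As a rough sketch of the routing and auto-response uses listed above, the snippet below turns a predicted intent and confidence into a routing decision. The queue names, the 0.80 threshold, and the `route_ticket` helper are illustrative assumptions and not part of EcomIntent or its API.

```python
# Hypothetical routing table: queue names are placeholders, not part of EcomIntent.
QUEUES = {
    "WISMO": "shipping",
    "RETURN_REQUEST": "returns",
    "EXCHANGE_REQUEST": "returns",
    "CANCEL_ORDER": "orders",
    "DAMAGED_ITEM": "claims",
    "BILLING_DISPUTE": "billing",
    "PRODUCT_QUESTION": "presales",
    "ACCOUNT_ISSUE": "accounts",
    "OTHER": "general",
}

# Assumed cut-off below which a human triages the ticket; tune it on your own data.
MIN_CONFIDENCE = 0.80


def route_ticket(intent: str, confidence: float) -> dict:
    """Return a routing decision for one classified ticket."""
    if confidence < MIN_CONFIDENCE or intent not in QUEUES:
        return {"queue": "manual_triage", "auto_reply": None}
    # Example automation from the list above: WISMO -> send tracking link.
    auto_reply = "send_tracking_link" if intent == "WISMO" else None
    return {"queue": QUEUES[intent], "auto_reply": auto_reply}
```

Sending low-confidence predictions to a manual queue is one possible design choice; it trades some automation for fewer misrouted tickets.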

Out-of-Scope Uses

- Non-English support tickets (model is English-only)
- Multi-intent messages: v1 assigns a single intent, so a message containing several intents returns only the highest-confidence one
- Sensitive classification tasks (medical, legal, financial decisions)
- Domains outside e-commerce customer support

Intent Taxonomy

Label	Description	Example
`WISMO`	Where is my order / tracking / delivery status	"where is my package, it's been 5 days"
`RETURN_REQUEST`	Customer wants to return for refund	"I need to return these shoes for a refund"
`EXCHANGE_REQUEST`	Customer wants different size/color/variant	"can I swap this for a size large?"
`CANCEL_ORDER`	Cancel before shipment	"please cancel order #12345 immediately"
`DAMAGED_ITEM`	Broken, wrong, or missing item arrived	"my item arrived completely smashed"
`BILLING_DISPUTE`	Charge issues, refund status, payment problems	"I was charged twice for the same order"
`PRODUCT_QUESTION`	Specs, sizing, compatibility, availability	"does this fit a 2022 MacBook Pro?"
`ACCOUNT_ISSUE`	Login, password, account access	"I can't log into my account"
`OTHER`	Catch-all — OOS, greetings, spam	"hi there, quick question"

Training Data

Source	Examples	License
Bitext Retail Ecommerce	~26,000	CDLA-Sharing 1.0
Bitext Customer Support	~27,000	CDLA-Sharing 1.0
Synthetic (Claude claude-sonnet-4-6)	~2,250	Apache 2.0

Bitext source labels were mapped to the 9-class taxonomy. Synthetic data was generated via Claude API to fill gaps in underrepresented classes (EXCHANGE_REQUEST, OTHER, PRODUCT_QUESTION).
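
The label mapping itself is not published here, so the following is only a sketch of what such a mapping step might look like. The source label names, their targets, and the fallback to `OTHER` are all assumptions for illustration and should be checked against the actual Bitext label sets.

```python
# Illustrative only: these source label names and their targets are assumptions,
# not the mapping actually used to train EcomIntent.
LABEL_MAP = {
    "track_order": "WISMO",
    "cancel_order": "CANCEL_ORDER",
    "payment_issue": "BILLING_DISPUTE",
    "recover_password": "ACCOUNT_ISSUE",
}


def map_label(source_label: str) -> str:
    """Map a source label to the 9-class taxonomy; unmapped labels fall back to OTHER."""
    return LABEL_MAP.get(source_label.strip().lower(), "OTHER")
```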

Train/Val/Test split: 80/10/10, stratified by class.
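
A stratified 80/10/10 split can be sketched in plain Python as below. The `stratified_split` helper and the seed value are assumptions for illustration; the original split code and seed are not stated.

```python
import random
from collections import defaultdict


def stratified_split(examples, seed=42, ratios=(0.8, 0.1, 0.1)):
    """Split (text, label) pairs into train/val/test, preserving class proportions."""
    by_label = defaultdict(list)
    for example in examples:
        by_label[example[1]].append(example)

    rng = random.Random(seed)  # assumed seed, for reproducibility only
    train, val, test = [], [], []
    for label in sorted(by_label):
        items = by_label[label]
        rng.shuffle(items)
        n_train = int(len(items) * ratios[0])
        n_val = int(len(items) * ratios[1])
        train.extend(items[:n_train])
        val.extend(items[n_train:n_train + n_val])
        test.extend(items[n_train + n_val:])  # remainder goes to test
    return train, val, test
```

Splitting per class rather than over the whole pool keeps small classes such as EXCHANGE_REQUEST represented in all three sets.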

Training Procedure

Hyperparameters

Parameter	Value
Base model	distilbert-base-uncased
Epochs	8 (with early stopping, patience=3)
Batch size	32
Learning rate	2e-5
LR scheduler	Cosine
Warmup ratio	0.1
Weight decay	0.01
Max token length	128
Mixed precision	fp16
Hardware	NVIDIA RTX 4080 16GB
Training time	~25–40 min
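
To make the learning-rate settings concrete, here is a minimal sketch of the schedule they imply, assuming linear warmup over the first 10% of steps followed by cosine decay to zero. That shape is the common reading of "cosine with warmup"; the exact scheduler used in training is not stated, and `lr_at` is a hypothetical helper.

```python
import math

PEAK_LR = 2e-5       # "Learning rate" from the table
WARMUP_RATIO = 0.1   # "Warmup ratio" from the table


def lr_at(step: int, total_steps: int) -> float:
    """Learning rate at an optimizer step: linear warmup, then cosine decay to 0."""
    warmup_steps = int(total_steps * WARMUP_RATIO)
    if step < warmup_steps:
        return PEAK_LR * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return PEAK_LR * 0.5 * (1.0 + math.cos(math.pi * progress))
```

For example, with 1,000 total steps the rate climbs to 2e-5 at step 100 and then decays smoothly toward zero by step 1,000.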

Evaluation Results

Test Set Performance

Model	Accuracy	Weighted F1	P95 Latency	Cost/1k calls
EcomIntent (ours)	99.92%	0.9992	~8ms	~$0.001
GPT-4o mini (zero-shot)	84.5%	0.840	~450ms	$0.015
GPT-4o mini (5-shot)	88.0%	0.875	~700ms	$0.045
Forethought Triage	~88.5%	~0.880	~300ms	$30k+/yr flat
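
The headline cost and latency ratios appear to come from comparing the EcomIntent row with the GPT-4o mini zero-shot row. A quick arithmetic check, using only the figures in the table above:

```python
# Figures copied from the table above (GPT-4o mini zero-shot row vs. EcomIntent).
ecomintent = {"p95_latency_ms": 8, "cost_per_1k_usd": 0.001}
gpt4o_mini_zero_shot = {"p95_latency_ms": 450, "cost_per_1k_usd": 0.015}

cost_ratio = gpt4o_mini_zero_shot["cost_per_1k_usd"] / ecomintent["cost_per_1k_usd"]
latency_ratio = gpt4o_mini_zero_shot["p95_latency_ms"] / ecomintent["p95_latency_ms"]

print(f"cost: {cost_ratio:.0f}x lower, latency: {latency_ratio:.0f}x lower")
```

Both inputs are approximate ("~8ms", "~$0.001"), so the ratios are order-of-magnitude figures rather than precise measurements.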

Per-Class F1

Intent	F1	Precision	Recall
WISMO	0.9989	1.0000	0.9979
RETURN_REQUEST	1.0000	1.0000	1.0000
EXCHANGE_REQUEST	1.0000	1.0000	1.0000
CANCEL_ORDER	0.9979	0.9958	1.0000
DAMAGED_ITEM	0.9989	0.9979	1.0000
BILLING_DISPUTE	0.9985	1.0000	0.9970
PRODUCT_QUESTION	1.0000	1.0000	1.0000
ACCOUNT_ISSUE	0.9995	0.9989	1.0000
OTHER	0.9984	0.9979	0.9989
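
Each F1 value in the table is the harmonic mean of the precision and recall in the same row. The sketch below recomputes a few rows from the table; small last-digit differences are possible because the published precision and recall are already rounded.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall) pairs copied from the table above.
rows = {
    "WISMO": (1.0000, 0.9979),
    "CANCEL_ORDER": (0.9958, 1.0000),
    "BILLING_DISPUTE": (1.0000, 0.9970),
}

for intent, (precision, recall) in rows.items():
    print(intent, round(f1_score(precision, recall), 4))
```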

Note on evaluation methodology: These numbers are measured on a held-out 10% test split of the Bitext retail + support datasets, which are themselves synthetically generated from a fixed template inventory. Train and test share the same template distribution. Real-world accuracy on production e-commerce tickets is estimated at 87–93% based on the generalization gap observed during development. We recommend benchmarking against a sample of your own ticket data before relying on these figures for SLA commitments.
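
One way to run the recommended benchmark is to hand-label a sample of your own tickets, collect the model's predictions for them, and compute the same two metrics reported above. The helper below is a dependency-free sketch; `accuracy_and_weighted_f1` is a hypothetical name, and weighted F1 is taken here to mean per-class F1 weighted by each class's share of the true labels.

```python
from collections import Counter


def accuracy_and_weighted_f1(y_true, y_pred):
    """Accuracy and support-weighted F1 for two equal-length label lists."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and the same length")

    correct = sum(t == p for t, p in zip(y_true, y_pred))
    support = Counter(y_true)      # true examples per class
    predicted = Counter(y_pred)    # predictions per class
    true_pos = Counter(t for t, p in zip(y_true, y_pred) if t == p)

    weighted_f1 = 0.0
    for label, n in support.items():
        precision = true_pos[label] / predicted[label] if predicted[label] else 0.0
        recall = true_pos[label] / n
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        weighted_f1 += f1 * n / len(y_true)
    return correct / len(y_true), weighted_f1
```

Here `y_true` would hold your hand-assigned labels and `y_pred` the top label returned by the classifier for each ticket.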

Confusion Matrix

How to Use

With the Hosted API (Recommended)

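import requests

# Replace YOUR_MODAL_ENDPOINT with your deployment's host.
response = requests.post(
    "https://YOUR_MODAL_ENDPOINT/classify",
    json={"text": "where is my order, it's been 5 days"},
    timeout=10,  # avoid hanging indefinitely on network issues
)
response.raise_for_status()  # surface HTTP errors instead of parsing an error body
result = response.json()
print(result["intent"])      # e.g. "WISMO"
print(result["confidence"])  # e.g. 0.9872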

With the Transformers Library

from transformers import pipeline

classifier = pipeline(
    "text-classification",
    model="JohnBolognino/ecomintent-distilbert",
    top_k=None,
)

result = classifier("where is my order it's been a week")
# With top_k=None, result[0] holds a {label, score} dict for each of the 9 classes

top = max(result[0], key=lambda x: x["score"])  # highest-scoring class
print(top["label"])   # e.g. "WISMO"
print(top["score"])   # e.g. 0.9872

Batch Inference

from transformers import pipeline

classifier = pipeline(
    "text-classification",
    model="JohnBolognino/ecomintent-distilbert",
    device=0,  # first CUDA GPU; use device=-1 to run on CPU
    top_k=1,   # return only the highest-scoring class per ticket
)

tickets = [
    "where is my package",
    "I need to return this item",
    "cancel my order please",
]

results = classifier(tickets, batch_size=32)
for ticket, result in zip(tickets, results):
    print(f"{result[0]['label']}: {ticket}")

Limitations

- English only: performance degrades significantly on non-English text
- Single intent: one prediction per message; multi-intent messages get the dominant class
- Short texts: optimized for support tickets (3–200 words); inputs longer than the 128-token maximum are truncated
- Domain-specific: trained on e-commerce data; may underperform on other support domains (SaaS, healthcare, etc.)
- Training data bias: the Bitext datasets are themselves synthetic, so real-world ticket distributions may differ; the evaluation note above estimates 87–93% accuracy on production tickets
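
Because the model returns a single intent, one possible workaround for multi-intent messages is to inspect the full score list (the `top_k=None` output shown under "With the Transformers Library") and flag tickets where the runner-up class also scores highly. The `flag_possible_multi_intent` helper and the 0.25 threshold are assumptions for illustration; softmax scores from a single-label classifier are not calibrated multi-label probabilities, so treat this as a heuristic only.

```python
def flag_possible_multi_intent(scores, runner_up_min=0.25):
    """Return (top_label, is_ambiguous) from a list of {"label", "score"} dicts."""
    ranked = sorted(scores, key=lambda s: s["score"], reverse=True)
    top = ranked[0]
    runner_up = ranked[1] if len(ranked) > 1 else None
    # Assumed heuristic: a strong second-place score hints at a second intent.
    is_ambiguous = runner_up is not None and runner_up["score"] >= runner_up_min
    return top["label"], is_ambiguous
```

Flagged tickets could then be sent for human review rather than being routed automatically.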

Environmental Impact

Training was performed on a single NVIDIA RTX 4080 GPU for approximately 35 minutes. Estimated CO2 emissions: < 0.05 kg CO2 (negligible).

Citation

If you use this model in research, please cite:

@misc{ecomintent2024,
  title={EcomIntent: Fine-tuned DistilBERT for E-commerce Intent Classification},
  author={YOUR_NAME},
  year={2024},
  url={https://huggingface.co/JohnBolognino/ecomintent-distilbert}
}

License

Apache 2.0 — free for commercial use.

Model format: Safetensors
Model size: 67M params
Tensor type: F32
