RelayOps Intent Classifier (Qwen2.5-1.5B LoRA)

Tier-1 intent classifier for RelayOps, a production-shaped telecom customer-service agent. It maps a customer message to exactly one of six intents and emits strict JSON containing only the intent.

This is a small open-source model fine-tuned with LoRA; it is not Claude. RelayOps keeps frontier (Claude) models for Tier-2 reasoning. This cheap classifier handles the easy majority of routing so that the frontier model is reserved for hard, low-confidence, or action cases.

Model details

  • Base (inference): Qwen/Qwen2.5-1.5B-Instruct. RelayOps loads the adapter over this full-precision base.
  • Trained on: unsloth/qwen2.5-1.5b-instruct-unsloth-bnb-4bit (Unsloth 4-bit QLoRA); the adapter loads on either base.
  • Method: Unsloth + LoRA/QLoRA (adapter only)
  • Task: single-label intent classification, output {"intent": "<label>"}
  • Labels: reset_device, device_status, device_faq, billing, greeting, unknown
  • Dataset: 2,400 examples (400 per intent), built from curated seeds plus deterministic template paraphrases, with group ids so that paraphrase families do not leak across splits.
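
To make the group-id idea concrete, here is a minimal sketch of a group-aware split. It is not the project's actual export code: the field name `group_id`, the function names, and the 0.3 held-out fraction are assumptions for illustration. Only the seed value 13 comes from this card. The point is that the split is decided per group, so every paraphrase of one seed lands on the same side.

```python
import hashlib
from collections import defaultdict


def assign_split(group_id: str, seed: int = 13, heldout_fraction: float = 0.3) -> str:
    """Deterministically map a whole paraphrase group to one split.

    The held-out fraction is a hypothetical value, not taken from the card.
    """
    digest = hashlib.sha256(f"{seed}:{group_id}".encode("utf-8")).digest()
    # Turn the first 8 bytes of the hash into a number in [0, 1).
    bucket = int.from_bytes(digest[:8], "big") / 2**64
    return "heldout" if bucket < heldout_fraction else "train"


def split_examples(examples, seed: int = 13, heldout_fraction: float = 0.3):
    """Split examples so that no group id appears in more than one split."""
    splits = defaultdict(list)
    for ex in examples:
        splits[assign_split(ex["group_id"], seed, heldout_fraction)].append(ex)
    return splits


# Hypothetical rows: two paraphrases share group "g1", so they stay together.
examples = [
    {"text": "please restart my router", "intent": "reset_device", "group_id": "g1"},
    {"text": "can you reboot the router", "intent": "reset_device", "group_id": "g1"},
    {"text": "why is my bill higher", "intent": "billing", "group_id": "g2"},
]
splits = split_examples(examples)
```

Hashing the group id rather than shuffling rows keeps the assignment reproducible without storing a shuffled index, at the cost of split sizes that only approximate the requested fraction.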

Intended use & scope

  • Input: one customer chat message. Output: one intent label as JSON.
  • The model classifies intent only. It does not decide risk, route, permissions, billing, offers, or account access; those are enforced by RelayOps' deterministic access gate and router (policy stays out of model weights).
  • Confidence is read from the model's own token probabilities at inference, not baked into labels.
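
The output contract and the confidence reading above can be sketched as follows. This is an illustrative sketch, not RelayOps' implementation: `parse_intent` and `label_confidence` are hypothetical helpers, falling back to unknown on malformed output is an assumption, and taking the joint probability of the label tokens is one plausible way to turn token probabilities into a single score.

```python
import json
import math
from typing import Sequence

# The six labels listed in this card.
LABELS = {"reset_device", "device_status", "device_faq", "billing", "greeting", "unknown"}


def parse_intent(raw: str) -> str:
    """Return the label from strict JSON output.

    Assumption: anything that is not exactly {"intent": "<known label>"}
    is treated as "unknown" rather than raising.
    """
    try:
        obj = json.loads(raw)
    except json.JSONDecodeError:
        return "unknown"
    # Strict: a JSON object with the single key "intent" and nothing else.
    if not isinstance(obj, dict) or set(obj) != {"intent"}:
        return "unknown"
    label = obj["intent"]
    return label if isinstance(label, str) and label in LABELS else "unknown"


def label_confidence(token_logprobs: Sequence[float]) -> float:
    """Joint probability of the generated label tokens: exp(sum of log-probs)."""
    return math.exp(sum(token_logprobs))


print(parse_intent('{"intent": "billing"}'))                  # billing
print(parse_intent('{"intent": "billing", "risk": "low"}'))   # unknown
```

Because the score is computed from log-probabilities at inference time, it can be recalibrated or thresholded without retraining the adapter.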

Out-of-scope use

Do not use this model to make billing, payment, plan-change, access-control, offer, or customer-eligibility decisions. It predicts intent only; those decisions belong to RelayOps' deterministic access gate, router, and human escalation. It is not a general-purpose intent model: it is trained on six telecom intents over synthetic data.

Evaluation

Split                                           Accuracy   Macro-F1
Held-out (seed-13, group-aware, 726 ex)         0.999      0.999
Hand-written adversarial / paraphrase (24 ex)   0.958      0.804

Baseline accuracy on the same sets (held-out / adversarial): keyword 0.506 / 0.250; Complement NB 0.933 / 0.667.

Honest caveat. The held-out set is template-generated synthetic data, so high in-distribution scores are expected even with anti-leakage splits. Treat the held-out number as routing-slice validation, not a production benchmark; the adversarial set is the truer generalization signal, and the adversarial macro-F1 (0.804 < 0.958 accuracy) shows the model is still uneven on the hardest classes.
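
The gap between accuracy and macro-F1 on a 24-example set can be illustrated with a small worked example. The class counts and the single error below are invented for illustration and are not the actual adversarial set, so the result does not reproduce the reported 0.804. It only shows how one mistake on a rare class moves macro-F1 much more than accuracy.

```python
from collections import Counter

LABELS = ["reset_device", "device_status", "device_faq", "billing", "greeting", "unknown"]


def accuracy(y_true, y_pred):
    """Fraction of examples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred, labels):
    """Unweighted mean of per-class F1, with F1 = 2TP / (2TP + FP + FN)."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[t] += 1  # true class gains a false negative
    scores = []
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(labels)


# Hypothetical 24-example set: 5 + 5 + 5 + 4 + 4 + 1 examples per class.
counts = [5, 5, 5, 4, 4, 1]
y_true = [label for label, n in zip(LABELS, counts) for _ in range(n)]
# Everything is correct except the lone "unknown" example, predicted as "billing".
y_pred = ["billing" if label == "unknown" else label for label in y_true]

print(f"accuracy = {accuracy(y_true, y_pred):.3f}")          # accuracy = 0.958
print(f"macro-F1 = {macro_f1(y_true, y_pred, LABELS):.3f}")  # macro-F1 = 0.815
```

In this made-up case the missed class scores F1 = 0 and the class that absorbed the error drops to 8/9, so a single mistake costs about 4 points of accuracy but about 18 points of macro-F1.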

Limitations

  • Trained on synthetic telecom data for six intents; not a general intent model.
  • Out-of-taxonomy, mixed-intent, and abusive messages map to unknown, which RelayOps escalates; the model does not resolve them.
  • Adversarial set is small (24); per-class adversarial recall and a larger set are follow-ups.
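
As a rough sketch of how a downstream router might act on the classifier's output, the snippet below escalates unknown or low-confidence predictions and keeps the rest on the cheap path. It is not RelayOps' router: the function name, the return values, and the 0.80 threshold are all hypothetical. The decision lives in plain code outside the model, in line with the card's point that policy stays out of the weights.

```python
ESCALATE_THRESHOLD = 0.80  # hypothetical value, not taken from RelayOps


def route(intent: str, confidence: float, threshold: float = ESCALATE_THRESHOLD) -> str:
    """Decide whether a classified message stays on the Tier-1 path.

    Assumption: "unknown" and low-confidence predictions are escalated;
    everything else is handled by the cheap path.
    """
    if intent == "unknown" or confidence < threshold:
        return "escalate"
    return "tier1"


print(route("billing", 0.97))   # tier1
print(route("billing", 0.55))   # escalate
print(route("unknown", 0.99))   # escalate
```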

How to use (in RelayOps)

RELAYOPS_INTENT_MODEL=<this-repo-or-local-adapter-dir> \
  python -m src.eval.run_intent_eval

or in code:

from src.router.registry import get_classifier
clf = get_classifier("finetuned")   # reads RELAYOPS_INTENT_MODEL
clf.classify("my internet is down")  # -> Classification(intent=reset_device, ...)

Reproduce

Training recipe: src/router/finetune_train.py (Unsloth LoRA). Data export: src/eval/export_finetune_data.py. Colab notebook: notebooks/finetune_intent_colab.ipynb.

