MVA Call Classifier (v5_1)
Multi-class classifier for caller utterances on outbound AI-agent qualification calls
for personal injury (Motor Vehicle Accident) legal referrals in the United States.
Fine-tuned from distilbert-base-uncased on ~43k labeled utterances plus ~2k
synthetic counter-examples.
Use case
The model classifies short caller utterances (1-2 sentences, ASR-transcribed, lowercase) into one of 39 response types covering qualification answers (e.g. ACC, NACC, INJ, NINJ, AT, NAT), call-state labels (e.g. HOSTILE, CONF, BOT), and overrides (e.g. DNC, AM, BDNC).
Inputs
Lowercase, ASR-style transcripts. Truncated to 128 tokens.
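If the text comes from a source other than the ASR pipeline, it may help to bring it closer to the expected input form first. The sketch below assumes that "ASR-style" means lowercase text with punctuation removed and apostrophes kept; the card does not specify the exact normalization used in training, and `normalize_utterance` is a hypothetical helper, not part of this model's code.

```python
import re
import string

# Assumption: ASR-style text is lowercase with no punctuation except apostrophes.
# The exact normalization used for the training data is not documented here.
_PUNCT_TABLE = str.maketrans("", "", string.punctuation.replace("'", ""))


def normalize_utterance(text: str) -> str:
    """Lowercase, drop punctuation (keeping apostrophes), collapse whitespace."""
    text = text.lower().translate(_PUNCT_TABLE)
    return re.sub(r"\s+", " ", text).strip()


print(normalize_utterance("Yes, I was in an accident last month!"))
# prints: yes i was in an accident last month
```

Token-level truncation to 128 tokens is still left to the tokenizer call.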
```python
from transformers import DistilBertTokenizerFast, DistilBertForSequenceClassification
import torch

# Repository id as published on the Hub (note the underscore in "v5_1").
model_id = "a1hmad23/mva-call-classifier-v5_1"
tokenizer = DistilBertTokenizerFast.from_pretrained(model_id)
model = DistilBertForSequenceClassification.from_pretrained(model_id)
model.eval()  # disable dropout for inference

text = "yes i was in an accident last month"
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=128)
with torch.no_grad():
    logits = model(**inputs).logits
pred_id = logits.argmax(-1).item()
print(model.config.id2label[pred_id])
```
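The snippet above prints only the top label. To see how confident the model is, or which labels are close runners-up, the logits can be turned into probabilities with a softmax. The following is a minimal sketch in plain Python; `top_k_labels` is a hypothetical helper, and the three-label mapping and logit values are made up for illustration.

```python
import math


def top_k_labels(logits, id2label, k=3):
    """Return the k most probable (label, probability) pairs from raw logits."""
    # Numerically stable softmax: subtract the max logit before exponentiating.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Rank class indices by probability, highest first.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return [(id2label[i], probs[i]) for i in ranked[:k]]


# Hypothetical mapping and logits, for illustration only.
demo_id2label = {0: "ACC", 1: "NACC", 2: "N"}
for label, prob in top_k_labels([2.0, 0.5, -1.0], demo_id2label, k=2):
    print(f"{label}: {prob:.3f}")
```

With the real model, the call would look like `top_k_labels(logits[0].tolist(), model.config.id2label)`.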
Labels
41 classes. The full mapping is in label2id.json and embedded in config.json.
Label semantics, precedence rules, and confusable-neighbor decision rules are
documented internally and are not redistributed with this model.
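For code that works with label2id.json directly instead of through `model.config`, the mapping usually needs to be inverted. The sketch below assumes the file is a flat JSON object of the form `{"LABEL": id, ...}`, which this card does not confirm; `load_id2label` is a hypothetical helper name.

```python
import json
from pathlib import Path


def load_id2label(path):
    """Read a label2id JSON file and return the inverse id -> label mapping."""
    # Assumption: the file is a flat object mapping label strings to integer ids.
    label2id = json.loads(Path(path).read_text(encoding="utf-8"))
    id2label = {int(idx): label for label, idx in label2id.items()}
    # Sanity checks: no two labels share an id, and ids run 0..n-1 without gaps.
    if len(id2label) != len(label2id):
        raise ValueError("duplicate ids in label map")
    if sorted(id2label) != list(range(len(id2label))):
        raise ValueError("ids are not contiguous from 0")
    return id2label
```

After downloading the file from the repository, `len(load_id2label("label2id.json"))` gives the class count actually shipped with the model.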
Limitations
- Trained on US English ASR-style text only.
- Designed for short utterances (most under 25 tokens). Longer text is truncated.
- The catch-all label N (residual / filler) has lower recall (~0.40) by design: it absorbs ambiguous content that doesn't fit the other 38 categories.
- Test set was reviewed once for label noise but residual annotation errors remain.
Training data
Proprietary call transcripts. Not redistributed.
Citation
Internal model. No public citation.