---
language:
- en
license: apache-2.0
tags:
- text-classification
- customer-support
- ticket-classification
- distilbert
datasets:
- custom
metrics:
- accuracy
model-index:
- name: ticket-classification-v1
  results:
  - task:
      type: text-classification
      name: Text Classification
    dataset:
      name: Custom Ticket Dataset
      type: custom
    metrics:
    - name: Accuracy
      type: accuracy
      value: 0.9485
---

Model Card for Dragneel/ticket-classification-v1

This model is a fine-tuned version of DistilBERT base uncased that classifies customer support tickets into four categories. It achieves 94.85% accuracy on the evaluation dataset.

Model Details

Model Description

This model is designed to automatically categorize customer support tickets based on their content. It can classify tickets into the following categories:

  • Billing Question: Issues related to billing, payments, subscriptions, etc.
  • Feature Request: Suggestions for new features or improvements
  • General Inquiry: General questions about products or services
  • Technical Issue: Technical problems, bugs, errors, etc.

The model uses DistilBERT as its base architecture, a distilled version of BERT that is smaller and faster while retaining most of BERT's performance.

Uses

Direct Use

This model can be directly used for:

  • Automated ticket routing and prioritization
  • Customer support workflow optimization
  • Analytics on ticket categories
  • Real-time ticket classification

Out-of-Scope Use

This model should not be used for:

  • Processing sensitive customer information without proper privacy measures
  • Making final decisions without human review for complex or critical issues
  • Classifying tickets in languages other than English
  • Categorizing content outside the customer support domain

Bias, Risks, and Limitations

  • The model was trained on a specific dataset and may not generalize well to significantly different customer support contexts
  • Performance may degrade for very technical or domain-specific tickets not represented in the training data
  • Very short or ambiguous tickets might be misclassified

Recommendations

Users should review classifications for accuracy, especially for tickets that fall on the boundary between categories; one way to flag low-confidence predictions for human review is sketched below. Consider fine-tuning the model further on domain-specific data if using it in a specialized industry.
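
The sketch below shows one way to act on this recommendation by routing low-confidence predictions to human review; the 0.8 threshold and the classify_with_review helper are illustrative assumptions, not part of the released model.

from transformers import pipeline

classifier = pipeline("text-classification", model="Dragneel/ticket-classification-v1")

REVIEW_THRESHOLD = 0.8  # assumed cutoff; tune on your own validation data

def classify_with_review(ticket_text):
    # Return the predicted category plus a flag for human review on low confidence
    result = classifier(ticket_text)[0]
    return {
        "category": result["label"],
        "score": result["score"],
        "needs_review": result["score"] < REVIEW_THRESHOLD,
    }

print(classify_with_review("I can't log in and my last payment failed"))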

How to Get Started with the Model

Use the code below to get started with the model.

from transformers import pipeline

# Load the model
classifier = pipeline("text-classification", model="Dragneel/ticket-classification-v1")

# Example tickets
tickets = [
    "I was charged twice for my subscription this month. Can you help?",
    "The app keeps crashing whenever I try to upload a file",
    "Would it be possible to add dark mode to the dashboard?",
    "What are your business hours?"
]

# Classify tickets
for ticket in tickets:
    result = classifier(ticket)
    print(f"Ticket: {ticket}")
    print(f"Category: {result[0]['label']}")
    print(f"Confidence: {result[0]['score']:.4f}")
    print()

ID to Label Mapping

id_to_label = {
    0: 'Billing Question', 
    1: 'Feature Request', 
    2: 'General Inquiry', 
    3: 'Technical Issue'
}
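
If you work with the model and tokenizer directly instead of the pipeline, this mapping corresponds to the model's output indices. A minimal sketch, assuming the standard transformers classification API:

import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_name = "Dragneel/ticket-classification-v1"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)
model.eval()

id_to_label = {
    0: 'Billing Question',
    1: 'Feature Request',
    2: 'General Inquiry',
    3: 'Technical Issue'
}

# Tokenize a single ticket and pick the highest-scoring class index
inputs = tokenizer("My invoice shows a charge I don't recognize", return_tensors="pt", truncation=True)
with torch.no_grad():
    logits = model(**inputs).logits
predicted_id = logits.argmax(dim=-1).item()
print(id_to_label[predicted_id])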

Training Details

Training Data

The model was trained on a dataset of customer support tickets that include diverse examples across all four categories. Each ticket typically contains a title and description detailing the customer's issue or request.

Training Procedure

Training Hyperparameters

  • Learning rate: 0.001
  • Batch size: 2
  • Epochs: 10 (with early stopping)
  • Weight decay: 0.01
  • Early stopping patience: 2 epochs
  • Optimizer: AdamW
  • Training regime: fp32
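
A minimal sketch of how these hyperparameters might map onto the Hugging Face Trainer API; the toy dataset, tokenization settings, and column names are assumptions for illustration, not taken from the original training script. The Trainer's default optimizer is AdamW and it trains in fp32 unless mixed precision is requested.

from datasets import Dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          EarlyStoppingCallback, Trainer, TrainingArguments)

id2label = {0: 'Billing Question', 1: 'Feature Request', 2: 'General Inquiry', 3: 'Technical Issue'}

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased",
    num_labels=4,
    id2label=id2label,
    label2id={v: k for k, v in id2label.items()},
)

# Toy placeholder data; the real training set is the custom ticket dataset described above
data = Dataset.from_dict({
    "text": ["I was charged twice this month", "The app crashes when I upload a file"],
    "label": [0, 3],
})
data = data.map(lambda ex: tokenizer(ex["text"], truncation=True, padding="max_length", max_length=128))

args = TrainingArguments(
    output_dir="ticket-classification-v1",
    learning_rate=1e-3,               # learning rate 0.001
    per_device_train_batch_size=2,    # batch size 2
    num_train_epochs=10,              # up to 10 epochs
    weight_decay=0.01,
    eval_strategy="epoch",            # "evaluation_strategy" in older transformers releases
    save_strategy="epoch",
    load_best_model_at_end=True,      # required for early stopping
    metric_for_best_model="eval_loss",
    report_to="none",
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=data,
    eval_dataset=data,
    callbacks=[EarlyStoppingCallback(early_stopping_patience=2)],  # patience of 2 epochs
)
trainer.train()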

Evaluation

Testing Data, Factors & Metrics

Metrics

The model is evaluated using the following metrics:

  • Accuracy: Percentage of correctly classified tickets
  • Loss: Cross-entropy loss on the evaluation dataset
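
A minimal sketch of how these two metrics can be computed from the model's raw outputs; the example tickets and labels are illustrative assumptions, not the actual evaluation set.

import torch
import torch.nn.functional as F
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("Dragneel/ticket-classification-v1")
model = AutoModelForSequenceClassification.from_pretrained("Dragneel/ticket-classification-v1")
model.eval()

# Illustrative examples; real evaluation uses the held-out ticket dataset
texts = ["Please refund the duplicate charge", "Could you add CSV export to reports?"]
labels = torch.tensor([0, 1])  # Billing Question, Feature Request

inputs = tokenizer(texts, return_tensors="pt", padding=True, truncation=True)
with torch.no_grad():
    logits = model(**inputs).logits

accuracy = (logits.argmax(dim=-1) == labels).float().mean().item()
loss = F.cross_entropy(logits, labels).item()
print(f"Accuracy: {accuracy:.4f}  Cross-entropy loss: {loss:.4f}")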

Results

The model achieved the following metrics on the evaluation dataset:

  • Accuracy: 94.85%
  • Loss: 0.248
  • Runtime: 16.01 s
  • Samples per second: 23.05

Technical Specifications

Model Architecture and Objective

The model architecture is based on DistilBERT, a distilled version of BERT. It consists of the base DistilBERT model with a classification head on top, totalling roughly 67M parameters stored in float32. The model was fine-tuned with cross-entropy loss to predict the correct category for each ticket.
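
One quick way to confirm this structure is to inspect the published configuration and the classification head; a small sketch, assuming the standard transformers API:

from transformers import AutoConfig, AutoModelForSequenceClassification

config = AutoConfig.from_pretrained("Dragneel/ticket-classification-v1")
print(config.model_type)   # "distilbert"
print(config.num_labels)   # 4 ticket categories
print(config.id2label)     # class index to category name mapping

model = AutoModelForSequenceClassification.from_pretrained("Dragneel/ticket-classification-v1")
print(model.classifier)    # the classification head on top of the DistilBERT encoder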

Model Card Contact

For inquiries about this model, please open an issue on the model repository.

