---
language:
- en
license: apache-2.0
tags:
- text-classification
- customer-support
- ticket-classification
- distilbert
datasets:
- custom
metrics:
- accuracy
model-index:
- name: ticket-classification-v1
  results:
  - task:
      type: text-classification
      name: Text Classification
    dataset:
      name: Custom Ticket Dataset
      type: custom
    metrics:
    - name: Accuracy
      type: accuracy
      value: 0.9485
---

Model Card for Dragneel/ticket-classification-v1

This model is a fine-tuned version of DistilBERT base uncased that classifies customer support tickets into four categories. It achieves 94.85% accuracy on the evaluation dataset.

Model Details

Model Description

This model is designed to automatically categorize customer support tickets based on their content. It can classify tickets into the following categories:

  • Billing Question: Issues related to billing, payments, subscriptions, etc.
  • Feature Request: Suggestions for new features or improvements
  • General Inquiry: General questions about products or services
  • Technical Issue: Technical problems, bugs, errors, etc.

The model uses DistilBERT as its base architecture, a distilled version of BERT that is smaller and faster while retaining most of BERT's performance.

Uses

Direct Use

This model can be directly used for:

  • Automated ticket routing and prioritization
  • Customer support workflow optimization
  • Analytics on ticket categories
  • Real-time ticket classification

Out-of-Scope Use

This model should not be used for:

  • Processing sensitive customer information without proper privacy measures
  • Making final decisions without human review for complex or critical issues
  • Classifying tickets in languages other than English
  • Categorizing content outside the customer support domain

Bias, Risks, and Limitations

  • The model was trained on a specific dataset and may not generalize well to significantly different customer support contexts
  • Performance may degrade for very technical or domain-specific tickets not represented in the training data
  • Very short or ambiguous tickets might be misclassified

Recommendations

Users should review classifications for accuracy, especially for tickets that fall on the boundary between categories; one way to flag low-confidence predictions for human review is sketched below. Consider fine-tuning the model further on domain-specific data if using it in a specialized industry.
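
The sketch below shows one way to act on this recommendation by routing low-confidence predictions to human review; the 0.8 threshold and the classify_with_review helper are illustrative assumptions, not part of the released model.

from transformers import pipeline

classifier = pipeline("text-classification", model="Dragneel/ticket-classification-v1")

REVIEW_THRESHOLD = 0.8  # assumed cutoff; tune on your own validation data

def classify_with_review(ticket_text):
    # Return the predicted category plus a flag for human review on low confidence
    result = classifier(ticket_text)[0]
    return {
        "category": result["label"],
        "score": result["score"],
        "needs_review": result["score"] < REVIEW_THRESHOLD,
    }

print(classify_with_review("I can't log in and my last payment failed"))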

How to Get Started with the Model

Use the code below to get started with the model.

from transformers import pipeline

# Load the model
classifier = pipeline("text-classification", model="Dragneel/ticket-classification-v1")

# Example tickets
tickets = [
    "I was charged twice for my subscription this month. Can you help?",
    "The app keeps crashing whenever I try to upload a file",
    "Would it be possible to add dark mode to the dashboard?",
    "What are your business hours?"
]

# Classify tickets
for ticket in tickets:
    result = classifier(ticket)
    print(f"Ticket: {ticket}")
    print(f"Category: {result[0]['label']}")
    print(f"Confidence: {result[0]['score']:.4f}")
    print()

ID to Label Mapping

id_to_label = {
    0: 'Billing Question', 
    1: 'Feature Request', 
    2: 'General Inquiry', 
    3: 'Technical Issue'
}
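
If you work with the model and tokenizer directly instead of the pipeline, this mapping corresponds to the model's output indices. A minimal sketch, assuming the standard transformers classification API:

import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_name = "Dragneel/ticket-classification-v1"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)
model.eval()

id_to_label = {
    0: 'Billing Question',
    1: 'Feature Request',
    2: 'General Inquiry',
    3: 'Technical Issue'
}

# Tokenize a single ticket and pick the highest-scoring class index
inputs = tokenizer("My invoice shows a charge I don't recognize", return_tensors="pt", truncation=True)
with torch.no_grad():
    logits = model(**inputs).logits
predicted_id = logits.argmax(dim=-1).item()
print(id_to_label[predicted_id])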

Training Details

Training Data

The model was trained on a dataset of customer support tickets that include diverse examples across all four categories. Each ticket typically contains a title and description detailing the customer's issue or request.

Training Procedure

Training Hyperparameters

  • Learning rate: 0.001
  • Batch size: 2
  • Epochs: 10 (with early stopping)
  • Weight decay: 0.01
  • Early stopping patience: 2 epochs
  • Optimizer: AdamW
  • Training regime: fp32
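
A minimal sketch of how these hyperparameters might map onto the Hugging Face Trainer API; the toy dataset, tokenization settings, and column names are assumptions for illustration, not taken from the original training script. The Trainer's default optimizer is AdamW and it trains in fp32 unless mixed precision is requested.

from datasets import Dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          EarlyStoppingCallback, Trainer, TrainingArguments)

id2label = {0: 'Billing Question', 1: 'Feature Request', 2: 'General Inquiry', 3: 'Technical Issue'}

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased",
    num_labels=4,
    id2label=id2label,
    label2id={v: k for k, v in id2label.items()},
)

# Toy placeholder data; the real training set is the custom ticket dataset described above
data = Dataset.from_dict({
    "text": ["I was charged twice this month", "The app crashes when I upload a file"],
    "label": [0, 3],
})
data = data.map(lambda ex: tokenizer(ex["text"], truncation=True, padding="max_length", max_length=128))

args = TrainingArguments(
    output_dir="ticket-classification-v1",
    learning_rate=1e-3,               # learning rate 0.001
    per_device_train_batch_size=2,    # batch size 2
    num_train_epochs=10,              # up to 10 epochs
    weight_decay=0.01,
    eval_strategy="epoch",            # "evaluation_strategy" in older transformers releases
    save_strategy="epoch",
    load_best_model_at_end=True,      # required for early stopping
    metric_for_best_model="eval_loss",
    report_to="none",
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=data,
    eval_dataset=data,
    callbacks=[EarlyStoppingCallback(early_stopping_patience=2)],  # patience of 2 epochs
)
trainer.train()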

Evaluation

Testing Data, Factors & Metrics

Metrics

The model is evaluated using the following metrics:

  • Accuracy: Percentage of correctly classified tickets
  • Loss: Cross-entropy loss on the evaluation dataset
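
A minimal sketch of how these two metrics can be computed from the model's raw outputs; the example tickets and labels are illustrative assumptions, not the actual evaluation set.

import torch
import torch.nn.functional as F
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("Dragneel/ticket-classification-v1")
model = AutoModelForSequenceClassification.from_pretrained("Dragneel/ticket-classification-v1")
model.eval()

# Illustrative examples; real evaluation uses the held-out ticket dataset
texts = ["Please refund the duplicate charge", "Could you add CSV export to reports?"]
labels = torch.tensor([0, 1])  # Billing Question, Feature Request

inputs = tokenizer(texts, return_tensors="pt", padding=True, truncation=True)
with torch.no_grad():
    logits = model(**inputs).logits

accuracy = (logits.argmax(dim=-1) == labels).float().mean().item()
loss = F.cross_entropy(logits, labels).item()
print(f"Accuracy: {accuracy:.4f}  Cross-entropy loss: {loss:.4f}")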

Results

The model achieved the following metrics on the evaluation dataset:

  • Accuracy: 94.85%
  • Loss: 0.248
  • Runtime: 16.01 s
  • Samples per second: 23.05

Technical Specifications

Model Architecture and Objective

The model architecture is based on DistilBERT, a distilled version of BERT. It consists of the base DistilBERT model with a classification head on top, totalling roughly 67M parameters stored in float32. The model was fine-tuned with cross-entropy loss to predict the correct category for each ticket.
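
One quick way to confirm this structure is to inspect the published configuration and the classification head; a small sketch, assuming the standard transformers API:

from transformers import AutoConfig, AutoModelForSequenceClassification

config = AutoConfig.from_pretrained("Dragneel/ticket-classification-v1")
print(config.model_type)   # "distilbert"
print(config.num_labels)   # 4 ticket categories
print(config.id2label)     # class index to category name mapping

model = AutoModelForSequenceClassification.from_pretrained("Dragneel/ticket-classification-v1")
print(model.classifier)    # the classification head on top of the DistilBERT encoder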

Model Card Contact

For inquiries about this model, please open an issue on the model repository.

