Customer Support Tweet Classifier (DistilBERT)
Fine-tuned distilbert-base-uncased for routing customer support tweets into seven categories: billing, technical, account, delivery, product, support, general.
Trained on 50,000 stratified examples from the Customer Support on Twitter dataset.
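For readers unfamiliar with stratified sampling, the idea is to draw the training subset so that each category keeps roughly the share it has in the full dataset. The following is a minimal sketch in plain Python, not the author's sampling code; the function name `stratified_sample`, the toy labels, and their proportions are made up for illustration.

```python
import random
from collections import Counter, defaultdict

def stratified_sample(labels, n_total, seed=0):
    """Return indices of about n_total items, keeping each label's share of the data."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for idx, label in enumerate(labels):
        by_label[label].append(idx)

    picked = []
    for label, indices in by_label.items():
        # Proportional allocation, with at least one example per class.
        k = max(1, round(n_total * len(indices) / len(labels)))
        picked.extend(rng.sample(indices, min(k, len(indices))))
    rng.shuffle(picked)
    return picked

# Toy data with made-up proportions: 70% general, 20% billing, 10% technical.
labels = ["general"] * 700 + ["billing"] * 200 + ["technical"] * 100
sample = stratified_sample(labels, n_total=100)
print(Counter(labels[i] for i in sample))
```

Because each class is rounded separately, the total can differ from `n_total` by a few items when the proportions do not divide evenly.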
Results
- Accuracy: 99.5%
- Macro F1: 0.991
- Weighted F1: 0.994
Evaluated on a held-out test set of 307,569 tweets. The full analysis and a side-by-side comparison with a TF-IDF + Logistic Regression baseline are in the GitHub repository.
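The three metrics above differ in how they treat class imbalance: macro F1 averages per-class F1 with equal weight, while weighted F1 weights each class by its number of true examples. The sketch below computes all three on a tiny invented example; it is not the evaluation script from the repository, and `f1_scores` is a hypothetical helper name.

```python
from collections import Counter

def f1_scores(y_true, y_pred):
    """Return (accuracy, macro F1, weighted F1) for two equal-length label lists."""
    classes = sorted(set(y_true) | set(y_pred))
    support = Counter(y_true)  # number of true examples per class
    per_class = {}
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        per_class[c] = 2 * tp / denom if denom else 0.0
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    macro = sum(per_class.values()) / len(classes)
    weighted = sum(per_class[c] * support[c] for c in classes) / len(y_true)
    return accuracy, macro, weighted

# Invented toy labels: one of the two billing tweets is misrouted to general.
y_true = ["general"] * 8 + ["billing"] * 2
y_pred = ["general"] * 8 + ["general", "billing"]
acc, macro, weighted = f1_scores(y_true, y_pred)
print(round(acc, 3), round(macro, 3), round(weighted, 3))
# → 0.9 0.804 0.886
```

A single error in the small class pulls macro F1 down much more than weighted F1, which is why the two are reported separately.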
Usage
import pickle
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification
from huggingface_hub import hf_hub_download

REPO = "Vishesh062/customer-support-tweet-classifier"

tokenizer = AutoTokenizer.from_pretrained(REPO)
model = AutoModelForSequenceClassification.from_pretrained(REPO)
model.eval()

# Label encoder maps class indices to category names
le_path = hf_hub_download(repo_id=REPO, filename="label_encoder.pkl")
with open(le_path, "rb") as f:
    le = pickle.load(f)

def classify(text):
    inputs = tokenizer(text, return_tensors="pt", truncation=True, padding=True, max_length=128)
    with torch.no_grad():
        logits = model(**inputs).logits
    return le.inverse_transform([logits.argmax(-1).item()])[0]

classify("I've been overcharged on my last bill")
# → 'billing'
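`classify` returns only the top category. When routing, it can also help to see how confident the model is. The sketch below, in plain Python, applies a softmax to one row of logits and returns the best label with its probability. The helper name `top_label`, the example logits, and the alphabetical class order are assumptions for illustration; with the model above one would pass something like `logits[0].tolist()` and, assuming `le` is a scikit-learn `LabelEncoder`, `list(le.classes_)`.

```python
import math

def top_label(logits, class_names):
    """Softmax a single row of logits and return (label, probability) for the best class."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    best = max(range(len(probs)), key=probs.__getitem__)
    return class_names[best], probs[best]

# Assumed class order (alphabetical); the real order is whatever the label encoder stores.
names = ["account", "billing", "delivery", "general", "product", "support", "technical"]
# Made-up logits for one tweet.
label, prob = top_label([0.1, 4.2, 0.3, 1.0, 0.2, 0.5, 0.4], names)
print(label, round(prob, 3))
```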
Limitations
The training labels are synthetic: they were generated by keyword matching, not human annotation. This model is therefore a keyword detector with extra steps, and the scores above largely reflect how well it reproduces those keyword rules. Real-world deployment needs human-annotated training data to be meaningful.
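To make the limitation concrete, here is a minimal sketch of what keyword-based labelling looks like. The keyword lists and the `keyword_label` function are hypothetical and are not the rules used to build the training set; the point is only that a tweet with no matching keyword falls through to a default category, whatever it is actually about.

```python
# Hypothetical keyword lists, for illustration only; not the rules used for the real labels.
KEYWORDS = {
    "billing": ("bill", "charge", "refund"),
    "delivery": ("deliver", "shipping", "package"),
    "technical": ("error", "crash", "bug"),
}

def keyword_label(text, default="general"):
    """Return the first category whose keyword appears in the text, else the default."""
    lowered = text.lower()
    for category, words in KEYWORDS.items():
        if any(word in lowered for word in words):
            return category
    return default

print(keyword_label("I've been overcharged on my last bill"))  # billing
print(keyword_label("Why did money leave my account twice?"))  # general
```

The second tweet is a billing complaint, but it contains none of the listed keywords, so the rule labels it `general`. A model trained on such labels learns to repeat that mistake.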
See the GitHub repository for the full methodology, baseline comparison, error analysis, and known limitations.
Citation
If you use this model, please link to the main repository.