
English Sarcasm Detector

English Sarcasm Detector is a text classification model that detects sarcasm in news article titles. It is fine-tuned from bert-base-uncased, and the training data consists of a ready-made dataset available on Kaggle.

Labels: 0 -> Not Sarcastic; 1 -> Sarcastic
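
As a minimal sketch, the integer labels can be mapped back to readable names as shown here. The `ID2LABEL` dictionary and the `describe` helper are hypothetical names introduced for illustration and are not part of the model repository. The input is assumed to be a result dictionary shaped like the one built in the classification example in this card.

```python
# Hypothetical mapping from the class indices listed above to readable names.
ID2LABEL = {0: "Not Sarcastic", 1: "Sarcastic"}


def describe(result: dict) -> str:
    """Format a result dict such as {"is_sarcastic": 1, "confidence": 0.93}."""
    label = ID2LABEL[result["is_sarcastic"]]
    # Render the confidence as a percentage with one decimal place.
    return f"{label} ({result['confidence']:.1%})"


print(describe({"is_sarcastic": 1, "confidence": 0.9337}))
# Sarcastic (93.4%)
```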

Source Data

Datasets:

Training Dataset

Codebase:


Example of classification

import string

import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer


def preprocess_data(text: str) -> str:
    # Lowercase, remove ASCII punctuation, and trim surrounding whitespace
    return text.lower().translate(str.maketrans("", "", string.punctuation)).strip()


MODEL_PATH = "helinivan/english-sarcasm-detector"

tokenizer = AutoTokenizer.from_pretrained(MODEL_PATH)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_PATH)

text = "CIA Realizes It's Been Using Black Highlighters All These Years."
tokenized_text = tokenizer([preprocess_data(text)], padding=True, truncation=True, max_length=256, return_tensors="pt")

# Gradients are not needed for inference
with torch.no_grad():
    output = model(**tokenized_text)

# Convert logits to class probabilities and pick the most likely class
probs = output.logits.softmax(dim=-1).tolist()[0]
confidence = max(probs)
prediction = probs.index(confidence)
results = {"is_sarcastic": prediction, "confidence": confidence}
print(results)

Output:

{'is_sarcastic': 1, 'confidence': 0.9337034225463867}
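
To make the normalization step concrete, the sketch below runs the same `preprocess_data` function on the example headline and prints the text the tokenizer actually receives. The card does not state whether the training data went through this same preprocessing, so treat that as an assumption. The second call is an illustrative input, not taken from the card.

```python
import string


def preprocess_data(text: str) -> str:
    # Same normalization as in the example above:
    # lowercase, remove ASCII punctuation, trim surrounding whitespace.
    return text.lower().translate(str.maketrans("", "", string.punctuation)).strip()


headline = "CIA Realizes It's Been Using Black Highlighters All These Years."
print(preprocess_data(headline))
# cia realizes its been using black highlighters all these years

# string.punctuation covers ASCII characters only, so typographic
# (curly) quotes and apostrophes pass through unchanged.
print(preprocess_data("It’s “Fine”!"))
# it’s “fine”
```

Because the apostrophe is removed, "It's" becomes "its". Headlines copied from web pages often contain curly quotes, which this function leaves in place.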

Performance

| Model | F1 | Precision | Recall | Accuracy |
|---|---|---|---|---|
| helinivan/english-sarcasm-detector | 92.38 | 92.75 | 92.38 | 92.42 |
| helinivan/italian-sarcasm-detector | 88.26 | 87.66 | 89.66 | 88.69 |
| helinivan/multilingual-sarcasm-detector | 87.23 | 88.65 | 86.33 | 88.30 |
| helinivan/dutch-sarcasm-detector | 83.02 | 84.27 | 82.01 | 86.81 |
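
For reference, the four reported metrics can be computed from true and predicted labels roughly as follows. This is a sketch of the standard definitions for the positive (sarcastic) class, not the evaluation script used for the card. The card does not state which averaging scheme or test split produced its numbers. `binary_metrics` and the toy labels are hypothetical, and the values are fractions rather than percentages.

```python
def binary_metrics(y_true: list, y_pred: list) -> dict:
    """Precision, recall, F1 and accuracy, treating label 1 as positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives

    # Guard each ratio against an empty denominator.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return {"f1": f1, "precision": precision, "recall": recall, "accuracy": accuracy}


# Toy labels for illustration only (1 = Sarcastic, 0 = Not Sarcastic).
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
print(binary_metrics(y_true, y_pred))
```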