Report for cardiffnlp/twitter-roberta-base-irony

#117
by giskard-bot - opened
Giskard org

Hi Team,

This is a report from Giskard Bot Scan 🐢.

We have identified 3 potential vulnerabilities in your model based on an automated scan.

This automated analysis evaluated the model on the dataset tweet_eval (subset irony, split test).

👉Performance issues (3)

For records in the dataset where text contains "user", the Recall is 34.98% lower than the global Recall.

| Level | Data slice | Metric | Deviation |
|---|---|---|---|
| major 🔴 | text contains "user" | Recall = 0.366 | -34.98% vs. global |

Taxonomy

avid-effect:performance:P0204
🔍✨Examples
| | text | label | Predicted label |
|---|---|---|---|
| 19 | @user Guess they didn't get the memo reg non-nuclear Baltic sea #sarcasm | irony | non_irony (p = 0.51) |
| 25 | @user hmm... let me think about that #sarcasm | irony | non_irony (p = 0.91) |
| 47 | @user 180 dead on 26/11 n more than 10k our ppl killed in terror attacks till date but not 1 paki show sympathy 2 them #irony | irony | non_irony (p = 0.71) |
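
For readers who want to check a slice finding like the one above by hand, here is a minimal sketch of how a slice recall and its deviation from the global recall could be computed. It assumes that `irony` is treated as the positive class and that the deviation is relative to the global value; neither is stated explicitly in the report. The records are made-up toy data, not rows from tweet_eval, and the helper names are hypothetical.

```python
def recall(labels, preds, positive="irony"):
    """Fraction of truly positive records that were predicted positive."""
    pairs = [(y, p) for y, p in zip(labels, preds) if y == positive]
    if not pairs:
        return float("nan")
    return sum(p == positive for _, p in pairs) / len(pairs)


def relative_deviation(slice_value, global_value):
    """Deviation of the slice metric, as a fraction of the global metric."""
    return (slice_value - global_value) / global_value


# Toy records (hypothetical): (text, true label, predicted label)
records = [
    ("@user great, another Monday", "irony", "non_irony"),
    ("@user love waiting in line", "irony", "irony"),
    ("@user sure, that will work", "irony", "non_irony"),
    ("what a lovely traffic jam", "irony", "irony"),
    ("best day ever, my flight got cancelled", "irony", "irony"),
    ("the meeting starts at noon", "non_irony", "non_irony"),
]

# Slice: records whose text contains the substring "user"
in_slice = [r for r in records if "user" in r[0]]

global_recall = recall([r[1] for r in records], [r[2] for r in records])
slice_recall = recall([r[1] for r in in_slice], [r[2] for r in in_slice])
deviation = relative_deviation(slice_recall, global_recall)

print(f"global recall = {global_recall:.3f}")
print(f"slice recall  = {slice_recall:.3f}")
print(f"deviation     = {deviation:+.2%}")
```

Running the same comparison on the real test split would need the model's predictions for each record, which this sketch does not produce.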

For records in the dataset where text contains "irony", the Accuracy is 27.73% lower than the global Accuracy.

| Level | Data slice | Metric | Deviation |
|---|---|---|---|
| major 🔴 | text contains "irony" | Accuracy = 0.531 | -27.73% vs. global |

Taxonomy

avid-effect:performance:P0204
🔍✨Examples
| | text | label | Predicted label |
|---|---|---|---|
| 23 | Who told the #hipsters that #irony was a thing of the Clinton years? Do they not carry history books in used bookstores in #brooklyn ? | irony | non_irony (p = 0.65) |
| 47 | @user 180 dead on 26/11 n more than 10k our ppl killed in terror attacks till date but not 1 paki show sympathy 2 them #irony | irony | non_irony (p = 0.71) |
| 65 | #Irony RT @user If you're going to give someone a scathing, 1-Star review for poor grammar, FFS use proper grammar. | irony | non_irony (p = 0.71) |

For records in the dataset where text contains "sarcasm", the Accuracy is 12.15% lower than the global Accuracy.

| Level | Data slice | Metric | Deviation |
|---|---|---|---|
| major 🔴 | text contains "sarcasm" | Accuracy = 0.645 | -12.15% vs. global |

Taxonomy

avid-effect:performance:P0204
🔍✨Examples
| | text | label | Predicted label |
|---|---|---|---|
| 4 | So much #sarcasm at work mate 10/10 #boring 100% #dead mate full on #shit absolutely #sleeping mate can't handle the #sarcasm | irony | non_irony (p = 0.93) |
| 6 | People complain about my backround pic and all I feel is like "hey don't blame me, Albert E might have spoken those words" #sarcasm #life | irony | non_irony (p = 0.73) |
| 19 | @user Guess they didn't get the memo reg non-nuclear Baltic sea #sarcasm | irony | non_irony (p = 0.51) |
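
The report gives each slice metric and its deviation but not the global values themselves. As a rough sanity check, the sketch below back-computes the implied global metric for each of the three findings, assuming the deviation is relative (slice = global × (1 + deviation)). That reading is an assumption, but it is supported by the two accuracy findings, which both imply a global accuracy of about 0.73. The function name `implied_global` is hypothetical.

```python
def implied_global(slice_value, deviation_pct):
    """Invert slice = global * (1 + deviation) to recover the global metric."""
    return slice_value / (1 + deviation_pct / 100)


# (slice metric, deviation in percent) as stated in the report
findings = {
    'Recall, text contains "user"': (0.366, -34.98),
    'Accuracy, text contains "irony"': (0.531, -27.73),
    'Accuracy, text contains "sarcasm"': (0.645, -12.15),
}

implied = {
    name: implied_global(value, deviation)
    for name, (value, deviation) in findings.items()
}

for name, value in implied.items():
    print(f"{name}: implied global = {value:.3f}")
```

If the deviations were absolute differences instead, the two accuracy findings would imply global accuracies of roughly 0.81 and 0.77, which do not agree with each other, so the relative reading looks like the more plausible one.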

Check out the Giskard Space and test your model.

Disclaimer: it's important to note that automated scans may produce false positives or miss certain vulnerabilities. We encourage you to review the findings and assess the impact accordingly.
