Report for cardiffnlp/twitter-roberta-base-offensive

#49
by giskard-bot - opened
Giskard org

Hey Team!🤗✨
We’re thrilled to share some amazing evaluation results that’ll make your day!🎉📊

We have identified 2 potential vulnerabilities in your model based on an automated scan.

This automated analysis evaluated the model on the dataset tweet_eval (subset offensive, split test).

👉Overconfidence issues (2)
| Vulnerability | Level | Data slice | Metric | Transformation | Deviation |
|---|---|---|---|---|---|
| Overconfidence | major 🔴 | `avg_word_length(text)` < 4.815 | Overconfidence rate = 0.600 | | +29.73% than global |

🔍✨Examples: For records in the dataset where `avg_word_length(text)` < 4.815, we found a significantly higher number of overconfident wrong predictions (63 samples, corresponding to 60.0% of the wrong predictions in the data slice).

| | text | avg_word_length(text) | label | Predicted label |
|---|---|---|---|---|
| 441 | @user nigga are you stupid your trash dont play with him play with your bitch 😂 | 4 | offensive | non-offensive (p = 0.95); offensive (p = 0.05) |
| 756 | #ArianaAsesina? Is that serious?! Holy shit, please your fucking assholes, don't blame someone for the death of other one. She is sad enough for today, don't you see? It isn't fault of none, he had an overdose and died. End. Stop wanting someone to blame, fuckers. | 4.76087 | offensive | non-offensive (p = 0.94); offensive (p = 0.06) |
| 96 | #Liberals / #Democrats THIS is what you stand for. If not, then #WalkAway | 4.69231 | offensive | non-offensive (p = 0.93); offensive (p = 0.07) |
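
To make the slice and the metric above concrete, here is a minimal Python sketch. It is not Giskard's implementation. The `avg_word_length` definition (mean length of whitespace-separated tokens) is inferred from the table, where it reproduces the value reported for row 96. The `overconfidence_rate` helper, its record layout, and its `threshold` default are hypothetical. The only thing taken from the report is that the rate is a share of the wrong predictions in a slice.

```python
def avg_word_length(text: str) -> float:
    """Mean length of whitespace-separated tokens (inferred definition)."""
    tokens = text.split()
    return sum(len(t) for t in tokens) / len(tokens) if tokens else 0.0


def overconfidence_rate(records, threshold=0.5):
    """Share of wrong predictions whose probability gap exceeds `threshold`.

    Each record is (true_label, predicted_label, p_predicted, p_true).
    `threshold` is a hypothetical cutoff, not Giskard's actual setting.
    """
    wrong = [r for r in records if r[0] != r[1]]
    if not wrong:
        return 0.0
    overconfident = [r for r in wrong if r[2] - r[3] > threshold]
    return len(overconfident) / len(wrong)


# Row 96 from the examples table.
tweet = "#Liberals / #Democrats THIS is what you stand for. If not, then #WalkAway"
print(round(avg_word_length(tweet), 5))  # 4.69231, as reported for row 96

# Probabilities of the three reported rows, plus one made-up borderline
# record (last line) that is wrong but not confidently so.
records = [
    ("offensive", "non-offensive", 0.95, 0.05),
    ("offensive", "non-offensive", 0.94, 0.06),
    ("offensive", "non-offensive", 0.93, 0.07),
    ("offensive", "non-offensive", 0.55, 0.45),  # hypothetical record
]
print(overconfidence_rate(records))  # 0.75
```

Correct predictions are excluded from the denominator, which matches the report's wording ("of the wrong predictions in the data slice").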
| Vulnerability | Level | Data slice | Metric | Transformation | Deviation |
|---|---|---|---|---|---|
| Overconfidence | major 🔴 | `avg_whitespace(text)` >= 0.159 | Overconfidence rate = 0.556 | | +20.30% than global |

🔍✨Examples: For records in the dataset where `avg_whitespace(text)` >= 0.159, we found a significantly higher number of overconfident wrong predictions (74 samples, corresponding to 55.64% of the wrong predictions in the data slice).

| | text | avg_whitespace(text) | label | Predicted label |
|---|---|---|---|---|
| 441 | @user nigga are you stupid your trash dont play with him play with your bitch 😂 | 0.189873 | offensive | non-offensive (p = 0.95); offensive (p = 0.05) |
| 756 | #ArianaAsesina? Is that serious?! Holy shit, please your fucking assholes, don't blame someone for the death of other one. She is sad enough for today, don't you see? It isn't fault of none, he had an overdose and died. End. Stop wanting someone to blame, fuckers. | 0.170455 | offensive | non-offensive (p = 0.94); offensive (p = 0.06) |
| 96 | #Liberals / #Democrats THIS is what you stand for. If not, then #WalkAway | 0.164384 | offensive | non-offensive (p = 0.93); offensive (p = 0.07) |
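
As a sanity check on the numbers in this second finding, the sketch below makes two assumptions that the report does not state. First, that `avg_whitespace` is the fraction of characters that are whitespace (this reproduces the value reported for row 96). Second, that "Deviation" is a relative difference against the global overconfidence rate. The helper name `implied_global_rate` is hypothetical.

```python
def avg_whitespace(text: str) -> float:
    """Fraction of characters that are whitespace (inferred definition)."""
    return sum(ch.isspace() for ch in text) / len(text) if text else 0.0


def implied_global_rate(slice_rate: float, relative_deviation: float) -> float:
    """Back out the global rate, assuming deviation = slice / global - 1."""
    return slice_rate / (1.0 + relative_deviation)


# Row 96 from the examples table.
tweet = "#Liberals / #Democrats THIS is what you stand for. If not, then #WalkAway"
print(round(avg_whitespace(tweet), 6))  # 0.164384, as reported for row 96

# Slice rates and deviations as reported for the two findings.
print(round(implied_global_rate(0.600, 0.2973), 4))   # 0.4625
print(round(implied_global_rate(0.5564, 0.2030), 4))  # 0.4625
```

Both findings imply the same global rate of about 0.46, which is consistent with reading the deviation as relative rather than as an absolute difference in percentage points.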

Check out the Giskard Space and improve your model.

Disclaimer: it's important to note that automated scans may produce false positives or miss certain vulnerabilities. We encourage you to review the findings and assess the impact accordingly.

💡 What's Next?

  • The Giskard community is always buzzing with ideas. 🐢🤔 What do you want to see next? Your feedback is our favorite fuel, so drop your thoughts in the community forum! 🗣️💬 Together, we're building something extraordinary.

🙌 Big Thanks!

We're grateful to have you on this adventure with us. 🚀🌟 Here's to more breakthroughs, laughter, and code magic! 🥂✨ Keep hugging that code and spreading the love! 💻 #Giskard #Huggingface #AISafety 🌈👏 Your enthusiasm, feedback, and contributions are what we seek. 🌟 Keep being awesome!
