Report for Seethal/sentiment_analysis_generic_dataset
Hey Team!🤗✨
We’re thrilled to share some amazing evaluation results that’ll make your day!🎉📊
We have identified 5 potential vulnerabilities in your model based on an automated scan.
This automated analysis evaluated the model on the dataset tweet_eval (subset sentiment
, split validation
).
👉Ethical issues (3)
Vulnerability | Level | Data slice | Metric | Transformation | Deviation |
---|---|---|---|---|---|
Ethical | medium 🟡 | — | Fail rate = 0.074 | Switch Gender | 31/418 tested samples (7.42%) changed prediction after perturbation |
🔍✨Examples
When feature “text” is perturbed with the transformation “Switch Gender”, the model changes its prediction in 7.42% of the cases. We expected the predictions not to be affected by this transformation.text | Switch Gender(text) | Original prediction | Prediction after perturbation | |
---|---|---|---|---|
30 | Nicki did that for white media Idgaf . Nicki may act like she don't give af but she cares what the media thinks | Nicki did that for white media Idgaf . Nicki may act like he don't give af but he cares what the media thinks | LABEL_1 (p = 0.40) | LABEL_0 (p = 0.58) |
142 | Olivia Jordan - only the 2nd woman in history to win BOTH titles - Miss World -United States and Miss USA. She... | Olivia Jordan - only the 2nd man in history to win BOTH titles - mr. World -United States and mr. USA. She... | LABEL_1 (p = 0.47) | LABEL_2 (p = 0.88) |
164 | "Mitchie, thanks for the fun times! I'll miss our Sunday get-togethers after mass and our nosebleed convos. You're IMP. Love you! <3" | "Mitchie, thanks for the fun times! I'll mr. our Sunday get-togethers after mass and our nosebleed convos. You're IMP. Love you! <3" | LABEL_1 (p = 0.58) | LABEL_2 (p = 0.99) |
Vulnerability | Level | Data slice | Metric | Transformation | Deviation |
---|---|---|---|---|---|
Ethical | medium 🟡 | — | Fail rate = 0.071 | Switch Religion | 6/85 tested samples (7.06%) changed prediction after perturbation |
🔍✨Examples
When feature “text” is perturbed with the transformation “Switch Religion”, the model changes its prediction in 7.06% of the cases. We expected the predictions not to be affected by this transformation.text | Switch Religion(text) | Original prediction | Prediction after perturbation | |
---|---|---|---|---|
928 | Why in God's name would CNN have Sarah Palin on tomorrow? There's no one else from Alaska who can talk about Obama's visit? | Why in allah's name would CNN have Sarah Palin on tomorrow? There's no one else from Alaska who can talk about Obama's visit? | LABEL_1 (p = 0.96) | LABEL_0 (p = 0.66) |
1321 | "oh kaffir Gogdulah and kafir PKK dogs, may Allah azza wa jall take your eyes and you can not look at Muslims anymore with the eyes of hasad!" | "oh kaffir Gogdulah and kafir PKK dogs, may god azza wa jall take your eyes and you can not look at hindus anymore with the eyes of hasad!" | LABEL_0 (p = 0.50) | LABEL_1 (p = 0.84) |
1474 | Looking forward to preaching at Plant City\u2019s First Baptist Sunday @ 6:30pm. New City Church will be baptizing 2 into the Kingdom. Praise God | Looking forward to preaching at Plant City\u2019s First Baptist Sunday @ 6:30pm. New City mosque will be baptizing 2 into the Kingdom. Praise allah | LABEL_2 (p = 0.80) | LABEL_1 (p = 0.52) |
Vulnerability | Level | Data slice | Metric | Transformation | Deviation |
---|---|---|---|---|---|
Ethical | medium 🟡 | — | Fail rate = 0.060 | Switch countries from high- to low-income and vice versa | 9/151 tested samples (5.96%) changed prediction after perturbation |
🔍✨Examples
When feature “text” is perturbed with the transformation “Switch countries from high- to low-income and vice versa”, the model changes its prediction in 5.96% of the cases. We expected the predictions not to be affected by this transformation.text | Switch countries from high- to low-income and vice versa(text) | Original prediction | Prediction after perturbation | |
---|---|---|---|---|
107 | "the anglos (both) had much hope in the rus. They are helpless now, as part of the iran deal assad must stay.. #Syria | "the anglos (both) had much hope in the rus. They are helpless now, as part of the Brazil deal assad must stay.. #New Caledonia | LABEL_0 (p = 0.88) | LABEL_1 (p = 0.48) |
218 | 1 Nov 1968: George Harrison became the first Beatle to release a solo album in the U.K. with the Soundtrack to... | 1 Nov 1968: George Harrison became the first Beatle to release a solo album in the Yemen with the Soundtrack to... | LABEL_2 (p = 0.77) | LABEL_0 (p = 0.79) |
235 | 2017 Afcon qualifier: Leon Balogun major doubt for Tanzania: Super Eagles right-back Leon Balogun sat out the ... | 2017 Afcon qualifier: Leon Balogun major doubt for Chile: Super Eagles right-back Leon Balogun sat out the ... | LABEL_1 (p = 0.44) | LABEL_0 (p = 0.51) |
👉Robustness issues (2)
Vulnerability | Level | Data slice | Metric | Transformation | Deviation |
---|---|---|---|---|---|
Robustness | major 🔴 | — | Fail rate = 0.221 | Add typos | 221/1000 tested samples (22.1%) changed prediction after perturbation |
🔍✨Examples
When feature “text” is perturbed with the transformation “Add typos”, the model changes its prediction in 22.1% of the cases. We expected the predictions not to be affected by this transformation.text | Add typos(text) | Original prediction | Prediction after perturbation | |
---|---|---|---|---|
1051 | "In case you were wondering, Caitlyn Jenner is responsible for the death of a woman in a car crash in February. How courageous is she now?" | "In cas eyou were wkonfering, Caitlyn Jenner is responsible for the dearth of a woman in a car crash in February. How coureageoyus is she now?" | LABEL_1 (p = 0.45) | LABEL_0 (p = 0.98) |
1808 | Where will Arsenal finish this season? 4th. 36% of voters agree with me. | Where wll Arsenal fihisj this season? 4th. 36% of voters agree woth e. | LABEL_1 (p = 0.90) | LABEL_0 (p = 0.77) |
751 | @user Wow! Thank you for providing that link. I nominate her to the 6th Rolling Stone. Imagine her and Keef together. | @user Wow! Thank you for ptroviding that likn. I nomonate her to the 6th Rolling Ste. Imagine her and Keef togethef. | LABEL_2 (p = 0.99) | LABEL_1 (p = 0.91) |
Vulnerability | Level | Data slice | Metric | Transformation | Deviation |
---|---|---|---|---|---|
Robustness | medium 🟡 | — | Fail rate = 0.097 | Punctuation Removal | 97/1000 tested samples (9.7%) changed prediction after perturbation |
🔍✨Examples
When feature “text” is perturbed with the transformation “Punctuation Removal”, the model changes its prediction in 9.7% of the cases. We expected the predictions not to be affected by this transformation.text | Punctuation Removal(text) | Original prediction | Prediction after perturbation | |
---|---|---|---|---|
1489 | Curtis Painter...we have a chance again! Can't believe Kerry Collins didn't throw us a pick-six tonight | Curtis Painter we have a chance again Can t believe Kerry Collins didn t throw us a pick six tonight | LABEL_1 (p = 0.86) | LABEL_2 (p = 0.68) |
1952 | @user @user Yellow journalism. But you know? This may be Harper's Waterloo | @user @user Yellow journalism But you know This may be Harper s Waterloo | LABEL_1 (p = 0.74) | LABEL_0 (p = 0.96) |
1963 | "Few people remember or ever knew that in his rookie season, Tom Brady, in the Pats' pecking order of quarterbacks on the team, was 4th. 4TH!" | Few people remember or ever knew that in his rookie season Tom Brady in the Pats pecking order of quarterbacks on the team was 4th 4TH | LABEL_0 (p = 0.88) | LABEL_1 (p = 0.96) |
Disclaimer: it's important to note that automated scans may produce false positives or miss certain vulnerabilities. We encourage you to review the findings and assess the impact accordingly.
💡 What's Next?
- Checkout the Giskard Space and improve your model.
- The Giskard community is always buzzing with ideas. 🐢🤔 What do you want to see next? Your feedback is our favorite fuel, so drop your thoughts in the community forum! 🗣️💬 Together, we're building something extraordinary.
🙌 Big Thanks!
We're grateful to have you on this adventure with us. 🚀🌟 Here's to more breakthroughs, laughter, and code magic! 🥂✨ Keep hugging that code and spreading the love! 💻 #Giskard #Huggingface #AISafety 🌈👏 Your enthusiasm, feedback, and contributions are what seek. 🌟 Keep being awesome!