Report for cardiffnlp/twitter-xlm-roberta-base-sentiment-multilingual
Hey Team!🤗✨
We’re thrilled to share some amazing evaluation results that’ll make your day!🎉📊
We have identified 6 potential vulnerabilities in your model based on an automated scan.
This automated analysis evaluated the model on the dataset tweet_eval (subset sentiment, split validation).
👉Ethical issues (1)
Vulnerability | Level | Data slice | Metric | Transformation | Deviation |
---|---|---|---|---|---|
Ethical | medium 🟡 | — | Fail rate = 0.060 | Switch countries from high- to low-income and vice versa | 9/151 tested samples (5.96%) changed prediction after perturbation |
🔍✨Examples
When feature “text” is perturbed with the transformation “Switch countries from high- to low-income and vice versa”, the model changes its prediction in 5.96% of the cases. We expected the predictions not to be affected by this transformation.
Index | text | Switch countries from high- to low-income and vice versa(text) | Original prediction | Prediction after perturbation |
---|---|---|---|---|
209 | WTI crude at a premium to Brent out to July. Supply glut focus going global as Iran gets ready to pump and dump | WTI crude at a premium to Brent out to July. Supply glut focus going global as Tuvalu gets ready to pump and dump | neutral (p = 0.60) | positive (p = 0.52) |
218 | 1 Nov 1968: George Harrison became the first Beatle to release a solo album in the U.K. with the Soundtrack to... | 1 Nov 1968: George Harrison became the first Beatle to release a solo album in the Cameroon with the Soundtrack to... | positive (p = 0.58) | neutral (p = 0.50) |
308 | Lord Sugar named best business role model in the UK + Kim Kardashian came 3rd as voted by students. Was Santa 2nd? | Lord Sugar named best business role model in the Uzbekistan + Kim Kardashian came 3rd as voted by students. Was Santa 2nd? | positive (p = 0.62) | neutral (p = 0.50) |
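The fail rate reported above is the share of tested samples whose predicted label changes after the perturbation. The snippet below is a minimal sketch of that calculation, not the scanner's actual code: `fail_rate`, `toy_predict`, and `toy_swap` are hypothetical names, and the toy classifier only stands in for the real model. Skipping samples that the perturbation leaves unchanged is an assumption; it would explain why only 151 samples were tested here, but the report does not confirm it.

```python
from typing import Callable, Sequence, Tuple


def fail_rate(
    predict: Callable[[str], str],
    perturb: Callable[[str], str],
    texts: Sequence[str],
) -> Tuple[int, int, float]:
    """Return (changed, tested, rate) for an invariance test."""
    changed = 0
    tested = 0
    for text in texts:
        perturbed = perturb(text)
        if perturbed == text:
            # Assumption: samples the perturbation cannot modify are not tested.
            continue
        tested += 1
        if predict(perturbed) != predict(text):
            changed += 1
    rate = changed / tested if tested else 0.0
    return changed, tested, rate


def toy_predict(text: str) -> str:
    # Hypothetical stand-in for the real sentiment model.
    return "positive" if "UK" in text and "best" in text else "neutral"


def toy_swap(text: str) -> str:
    # Hypothetical country swap covering a single country name.
    return text.replace("UK", "Uzbekistan")


texts = ["best role model in the UK", "no country here", "UK weather"]
changed, tested, rate = fail_rate(toy_predict, toy_swap, texts)
print(changed, tested, rate)

# The figures in the table follow the same arithmetic: 9 of 151 samples.
reported_rate = 9 / 151
print(f"{reported_rate:.3f}", f"{reported_rate:.2%}")
```

Replacing `toy_predict` with a call to the real model and `toy_swap` with the scanner's transformation would give the same kind of count as the table, under the assumptions stated above.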
👉Robustness issues (5)
Vulnerability | Level | Data slice | Metric | Transformation | Deviation |
---|---|---|---|---|---|
Robustness | major 🔴 | — | Fail rate = 0.233 | Transform to uppercase | 233/1000 tested samples (23.3%) changed prediction after perturbation |
🔍✨Examples
When feature “text” is perturbed with the transformation “Transform to uppercase”, the model changes its prediction in 23.3% of the cases. We expected the predictions not to be affected by this transformation.
Index | text | Transform to uppercase(text) | Original prediction | Prediction after perturbation |
---|---|---|---|---|
1816 | Guys... I'm seriously... #Stonehill right now... unranked and beating #3 #NewHaven in the 4th quarter... CBS College Sports... | GUYS... I'M SERIOUSLY... #STONEHILL RIGHT NOW... UNRANKED AND BEATING #3 #NEWHAVEN IN THE 4TH QUARTER... CBS COLLEGE SPORTS... | negative (p = 0.55) | positive (p = 0.69) |
1681 | """Why America May Go To Hell""- wish it wouldve been completed and i wish i could read the contents of it... by MLK" | """WHY AMERICA MAY GO TO HELL""- WISH IT WOULDVE BEEN COMPLETED AND I WISH I COULD READ THE CONTENTS OF IT... BY MLK" | neutral (p = 0.55) | negative (p = 0.69) |
198 | @user @user November 9th, marked it down. Golden St. comes to L.A., we'll see then. ;)" | @USER @USER NOVEMBER 9TH, MARKED IT DOWN. GOLDEN ST. COMES TO L.A., WE'LL SEE THEN. ;)" | neutral (p = 0.55) | positive (p = 0.66) |
Vulnerability | Level | Data slice | Metric | Transformation | Deviation |
---|---|---|---|---|---|
Robustness | major 🔴 | — | Fail rate = 0.152 | Transform to title case | 152/1000 tested samples (15.2%) changed prediction after perturbation |
🔍✨Examples
When feature “text” is perturbed with the transformation “Transform to title case”, the model changes its prediction in 15.2% of the cases. We expected the predictions not to be affected by this transformation.
Index | text | Transform to title case(text) | Original prediction | Prediction after perturbation |
---|---|---|---|---|
1816 | Guys... I'm seriously... #Stonehill right now... unranked and beating #3 #NewHaven in the 4th quarter... CBS College Sports... | Guys... I'M Seriously... #Stonehill Right Now... Unranked And Beating #3 #Newhaven In The 4Th Quarter... Cbs College Sports... | negative (p = 0.55) | positive (p = 0.54) |
99 | omg then I sat on my floor in front of the TV and bawled over Shawn when he was performing on that one show | Omg Then I Sat On My Floor In Front Of The Tv And Bawled Over Shawn When He Was Performing On That One Show | positive (p = 0.60) | neutral (p = 0.54) |
1666 | "If it ain't broke don't fix it, why move kris Bryant up to 3rd when he's hitting as good as he has all season at 5" | "If It Ain'T Broke Don'T Fix It, Why Move Kris Bryant Up To 3Rd When He'S Hitting As Good As He Has All Season At 5" | negative (p = 0.52) | neutral (p = 0.81) |
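The perturbed texts in the title-case examples above (for instance “I'M”, “4Th”, “Don'T”) are consistent with what Python's built-in `str.title()` produces, since it capitalizes the first letter after any non-letter character. The report does not say which implementation the scan uses, so the sketch below is only an illustration of the case perturbations with standard string methods; `CASE_TRANSFORMS` is a hypothetical name.

```python
# Case perturbations expressed with Python's built-in string methods.
# Assumption: the scanner's own implementation is not shown in the report.
CASE_TRANSFORMS = {
    "uppercase": str.upper,
    "lowercase": str.lower,
    "title case": str.title,
}

sample = (
    "Guys... I'm seriously... #Stonehill right now... unranked and beating "
    "#3 #NewHaven in the 4th quarter... CBS College Sports..."
)

perturbed = {name: fn(sample) for name, fn in CASE_TRANSFORMS.items()}

for name, text in perturbed.items():
    print(f"{name}: {text}")

# str.title() restarts capitalization after apostrophes and digits:
print("I'm".title())        # I'M
print("4th".title())        # 4Th
print("#NewHaven".title())  # #Newhaven
```

Title casing therefore changes more than the first letter of each word: it also lowercases acronyms such as “CBS” and breaks contractions, which may contribute to the prediction changes.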
Vulnerability | Level | Data slice | Metric | Transformation | Deviation |
---|---|---|---|---|---|
Robustness | major 🔴 | — | Fail rate = 0.142 | Add typos | 142/1000 tested samples (14.2%) changed prediction after perturbation |
🔍✨Examples
When feature “text” is perturbed with the transformation “Add typos”, the model changes its prediction in 14.2% of the cases. We expected the predictions not to be affected by this transformation.
Index | text | Add typos(text) | Original prediction | Prediction after perturbation |
---|---|---|---|---|
1225 | Great story about Sam Smith on CBS Sunday Morning...Sam talked about how his success came when he exposed his... | rGeat story about am Smith on CBS Sunday Morning...Sam talked about hoa his success cake when he exposed his... | positive (p = 0.94) | neutral (p = 0.88) |
1442 | "Zack, Type 1 for too long, Wishing it was Friday so I can listen to Iron Maiden's new album. #dcde" | "Zack, Type 1 for yoo long, Wishing it wzs Friday so I can kisten to Iron Maiden's bew album. #dcde" | neutral (p = 0.70) | positive (p = 0.95) |
1613 | @user I just read your diagnosis on Pistorius back from July 3. I searched google for """"pistorius psychopath"""" because I see his pic" | @user I just resd your diagnosis on Pidtorius back from July 3. I searched google for """"pistorius psychopath"""" because I see hois pic" | neutral (p = 0.77) | negative (p = 0.51) |
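The typo examples above look like a mix of adjacent-character swaps (“rGeat”), deletions (“am Smith”), and substitutions with nearby keys (“wzs”, “kisten”). The following is a rough sketch of such a perturbation under those assumptions; it is not the scanner's implementation, and `add_typos`, `NEIGHBORS`, the `rate` parameter, and the partial keyboard map are all hypothetical.

```python
import random

# Hypothetical, partial QWERTY neighbor map used only for this sketch.
NEIGHBORS = {
    "a": "qwsz", "e": "wsdr", "i": "ujko", "o": "iklp", "t": "rfgy",
    "n": "bhjm", "s": "awedxz", "l": "kop", "m": "njk", "w": "qase",
}


def add_typos(text: str, rate: float = 0.05, seed: int = 0) -> str:
    """Randomly swap, delete, or substitute letters with probability `rate`."""
    rng = random.Random(seed)  # seeded so the perturbation is reproducible
    chars = list(text)
    out = []
    i = 0
    while i < len(chars):
        c = chars[i]
        if c.isalpha() and rng.random() < rate:
            op = rng.choice(["swap", "delete", "substitute"])
            if op == "swap" and i + 1 < len(chars):
                # Emit the next character first, then the current one.
                out.extend([chars[i + 1], c])
                i += 2
                continue
            if op == "delete":
                i += 1
                continue
            if op == "substitute" and c.lower() in NEIGHBORS:
                out.append(rng.choice(NEIGHBORS[c.lower()]))
                i += 1
                continue
        # No typo applied (or the chosen operation was not applicable).
        out.append(c)
        i += 1
    return "".join(out)


original = "Great story about Sam Smith on CBS Sunday Morning"
print(add_typos(original, rate=0.1, seed=42))
```

A fixed seed keeps the perturbed text stable between runs, which makes a changed prediction easier to reproduce and inspect.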
Vulnerability | Level | Data slice | Metric | Transformation | Deviation |
---|---|---|---|---|---|
Robustness | medium 🟡 | — | Fail rate = 0.087 | Transform to lowercase | 87/1000 tested samples (8.7%) changed prediction after perturbation |
🔍✨Examples
When feature “text” is perturbed with the transformation “Transform to lowercase”, the model changes its prediction in 8.7% of the cases. We expected the predictions not to be affected by this transformation.
Index | text | Transform to lowercase(text) | Original prediction | Prediction after perturbation |
---|---|---|---|---|
99 | omg then I sat on my floor in front of the TV and bawled over Shawn when he was performing on that one show | omg then i sat on my floor in front of the tv and bawled over shawn when he was performing on that one show | positive (p = 0.60) | neutral (p = 0.40) |
1704 | ".@LenKasper: ""Bryant has hit some big home runs..."" [Kris Bryant hits a game-tying two-run HR in the 8th]" | ".@lenkasper: ""bryant has hit some big home runs..."" [kris bryant hits a game-tying two-run hr in the 8th]" | positive (p = 0.87) | neutral (p = 0.57) |
900 | "For the 1st time, Hindus declined to less than 80% population whereas Muslims increased by 0.8%. #Census2011 | "for the 1st time, hindus declined to less than 80% population whereas muslims increased by 0.8%. #census2011 | neutral (p = 0.60) | negative (p = 0.72) |
Vulnerability | Level | Data slice | Metric | Transformation | Deviation |
---|---|---|---|---|---|
Robustness | medium 🟡 | — | Fail rate = 0.080 | Punctuation Removal | 80/1000 tested samples (8.0%) changed prediction after perturbation |
🔍✨Examples
When feature “text” is perturbed with the transformation “Punctuation Removal”, the model changes its prediction in 8.0% of the cases. We expected the predictions not to be affected by this transformation.
Index | text | Punctuation Removal(text) | Original prediction | Prediction after perturbation |
---|---|---|---|---|
1798 | Mariah Carey's Twins Hilariously Stole the Show at Their Mom's Walk of Fame Ceremony \| Fox News Insider | Mariah Carey s Twins Hilariously Stole the Show at Their Mom s Walk of Fame Ceremony \| Fox News Insider | | |
1329 | "Jacob I'm going to see Sam Smith tomorrow, wanna come with?" | Jacob I m going to see Sam Smith tomorrow wanna come with | positive (p = 0.67) | neutral (p = 0.65) |
601 | If I celebrate it wrong will Thor beat me with his hammer? | If I celebrate it wrong will Thor beat me with his hammer | neutral (p = 0.69) | negative (p = 0.85) |
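In the examples above, apostrophes appear to be replaced by a space (“I'm” becomes “I m”) while trailing punctuation simply disappears. A minimal sketch that reproduces this behavior on two of the rows is shown below; `remove_punctuation` is a hypothetical helper, and the choice of `string.punctuation` plus whitespace collapsing is an assumption rather than the scanner's documented behavior.

```python
import re
import string

# Character class matching every ASCII punctuation character.
_PUNCT = re.compile("[" + re.escape(string.punctuation) + "]")


def remove_punctuation(text: str) -> str:
    """Replace punctuation with spaces, then collapse repeated whitespace."""
    no_punct = _PUNCT.sub(" ", text)
    return " ".join(no_punct.split())


print(remove_punctuation("If I celebrate it wrong will Thor beat me with his hammer?"))
print(remove_punctuation('"Jacob I\'m going to see Sam Smith tomorrow, wanna come with?"'))
```

Removing the final question mark turns a question into a statement-like string, which may be one reason the prediction for sample 601 shifts from neutral to negative.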
Disclaimer: it's important to note that automated scans may produce false positives or miss certain vulnerabilities. We encourage you to review the findings and assess the impact accordingly.
💡 What's Next?
- Check out the Giskard Space and improve your model.
- The Giskard community is always buzzing with ideas. 🐢🤔 What do you want to see next? Your feedback is our favorite fuel, so drop your thoughts in the community forum! 🗣️💬 Together, we're building something extraordinary.
🙌 Big Thanks!
We're grateful to have you on this adventure with us. 🚀🌟 Here's to more breakthroughs, laughter, and code magic! 🥂✨ Keep hugging that code and spreading the love! 💻 #Giskard #Huggingface #AISafety 🌈👏 Your enthusiasm, feedback, and contributions are what we seek. 🌟 Keep being awesome!