Report for lxyuan/distilbert-base-multilingual-cased-sentiments-student

#89
by giskard-bot - opened

Hi Team,

This is a report from Giskard Bot Scan 🐢.

We have identified 8 potential vulnerabilities in your model based on an automated scan.

This automated analysis evaluated the model on the dataset tweet_eval (subset sentiment, split validation).

👉Performance issues (1)

| Vulnerability | Level | Data slice | Metric | Transformation | Deviation |
|---|---|---|---|---|---|
| Performance | medium 🟡 | `text` contains "friday" | Precision = 0.432 | | -7.05% than global |

🔍✨Examples

For records in the dataset where `text` contains "friday", the Precision is 7.05% lower than the global Precision.

| # | text | label | Predicted label |
|---|---|---|---|
| 27 | every time I hear alright by Kendrick I think it's j Cole's Black Friday | neutral | positive (p = 0.49) |
| 38 | ##$$## Black Friday Deals Olympus OM-D E-M5 Digital Camera - Black - with Olympus 12-50mm f/3.5-5.6 EZ Zoom Lens - B... | neutral | positive (p = 0.58) |
| 144 | When niggas in the bus are playing Kendrick and Cole's Black Friday out loud >>>>>> | neutral | negative (p = 0.45) |
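
The slice comparison above can be sketched in a few lines. This is only an illustration: the report does not say how Precision is averaged across the three classes or whether the deviation is relative or absolute, so the macro averaging, the relative deviation, the toy records, and every function name below are assumptions.

```python
from typing import List, Tuple

def macro_precision(y_true: List[str], y_pred: List[str]) -> float:
    """Unweighted mean of per-label precision (assumed averaging scheme).

    A label that is never predicted contributes a precision of 0.0.
    """
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        # True labels of the records that were predicted as `label`.
        hits = [t for t, p in zip(y_true, y_pred) if p == label]
        scores.append(sum(t == label for t in hits) / len(hits) if hits else 0.0)
    return sum(scores) / len(scores)

def relative_deviation(slice_value: float, global_value: float) -> float:
    """Deviation of the slice metric relative to the global metric."""
    return (slice_value - global_value) / global_value

# Toy records (text, label, predicted label); not taken from tweet_eval.
records: List[Tuple[str, str, str]] = [
    ("Black Friday deals are here", "neutral", "positive"),
    ("Friday night was great", "positive", "positive"),
    ("I hate Mondays", "negative", "negative"),
    ("The meeting is at noon", "neutral", "neutral"),
    ("Lovely weather today", "positive", "positive"),
    ("worst friday ever", "negative", "negative"),
]

# Slice: records whose text contains "friday" (case-insensitive).
friday = [r for r in records if "friday" in r[0].lower()]

global_p = macro_precision([r[1] for r in records], [r[2] for r in records])
slice_p = macro_precision([r[1] for r in friday], [r[2] for r in friday])
deviation = relative_deviation(slice_p, global_p)

# If the reported -7.05% is a relative deviation, the reported slice
# Precision of 0.432 would imply this global Precision.
implied_global = 0.432 / (1 - 0.0705)
```

Under the relative-deviation assumption, the reported numbers would correspond to a global Precision of roughly 0.465.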
👉Ethical issues (1)

| Vulnerability | Level | Data slice | Metric | Transformation | Deviation |
|---|---|---|---|---|---|
| Ethical | medium 🟡 | | Fail rate = 0.071 | Switch Religion | 6/85 tested samples (7.06%) changed prediction after perturbation |

🔍✨Examples

When feature “text” is perturbed with the transformation “Switch Religion”, the model changes its prediction in 7.06% of the cases. We expected the predictions not to be affected by this transformation.

| # | text | Switch Religion(text) | Original prediction | Prediction after perturbation |
|---|---|---|---|---|
| 171 | I really want to thank Angela Merkel for letting all those refugees in. She shows what humanity is about. May God bless her #refugeeswelcome | I really want to thank Angela Merkel for letting all those refugees in. She shows what humanity is about. May allah bless her #refugeeswelcome | positive (p = 0.60) | negative (p = 0.51) |
| 808 | @user may you be blessed by guns, god and hungry wet holes before Scott Walker builds his border wall and Donald Trump sends you home!" | @user may you be blessed by guns, allah and hungry wet holes before Scott Walker builds his border wall and Donald Trump sends you home!" | positive (p = 0.73) | negative (p = 0.55) |
| 1256 | Sitting on my luggage and smelling Jerusalem \u002cclick like if you hope to visit Israel soon . December tour... | Sitting on my luggage and smelling kumbh mela \u002cclick like if you hope to visit Israel soon . December tour... | negative (p = 0.48) | positive (p = 0.41) |
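
A fail rate of this kind can be sketched as follows. The swap table, the toy predictor, and the function names are hypothetical, and the idea that only samples whose text is actually altered count as "tested" is an assumption suggested by the 6/85 figure, not something the report states.

```python
import re
from typing import Callable, List, Tuple

# Hypothetical swap table, loosely modelled on the substitutions visible
# in the examples; the scan's real term list is not shown in the report.
SWAPS = {"god": "allah", "jerusalem": "kumbh mela"}
_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(term) for term in SWAPS) + r")\b",
    re.IGNORECASE,
)

def switch_terms(text: str) -> str:
    """Replace each whole-word match with its counterpart from SWAPS."""
    return _PATTERN.sub(lambda m: SWAPS[m.group(0).lower()], text)

def count_failures(
    texts: List[str],
    predict: Callable[[str], str],
    transform: Callable[[str], str],
) -> Tuple[int, int]:
    """Return (changed predictions, tested samples).

    Only texts that the transformation actually alters are counted as tested.
    """
    pairs = [(t, transform(t)) for t in texts]
    tested = [(orig, pert) for orig, pert in pairs if orig != pert]
    changed = sum(predict(orig) != predict(pert) for orig, pert in tested)
    return changed, len(tested)

def toy_predict(text: str) -> str:
    """Stand-in classifier that is deliberately sensitive to one term."""
    return "positive" if "god" in text.lower() else "negative"

texts = [
    "May God bless her",
    "Thank god it is over",
    "I love Fridays",                   # unaffected, so not tested
    "Smelling Jerusalem this morning",
]
changed, tested = count_failures(texts, toy_predict, switch_terms)

# The percentage in the report follows directly from its counts.
reported_pct = round(6 / 85 * 100, 2)
```

Here `changed / tested` plays the role of the reported fail rate; swapping `toy_predict` for a call to the real model would be the natural next step.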
👉Robustness issues (5)

| Vulnerability | Level | Data slice | Metric | Transformation | Deviation |
|---|---|---|---|---|---|
| Robustness | major 🔴 | | Fail rate = 0.393 | Transform to uppercase | 393/1000 tested samples (39.3%) changed prediction after perturbation |

🔍✨Examples

When feature “text” is perturbed with the transformation “Transform to uppercase”, the model changes its prediction in 39.3% of the cases. We expected the predictions not to be affected by this transformation.

| # | text | Transform to uppercase(text) | Original prediction | Prediction after perturbation |
|---|---|---|---|---|
| 1074 | I'm so frustrated with Game of Thrones and I'm only on the 10th episode | I'M SO FRUSTRATED WITH GAME OF THRONES AND I'M ONLY ON THE 10TH EPISODE | negative (p = 0.88) | positive (p = 0.54) |
| 1816 | Guys... I'm seriously... #Stonehill right now... unranked and beating #3 #NewHaven in the 4th quarter... CBS College Sports... | GUYS... I'M SERIOUSLY... #STONEHILL RIGHT NOW... UNRANKED AND BEATING #3 #NEWHAVEN IN THE 4TH QUARTER... CBS COLLEGE SPORTS... | negative (p = 0.70) | positive (p = 0.43) |
| 1681 | """Why America May Go To Hell""- wish it wouldve been completed and i wish i could read the contents of it... by MLK" | """WHY AMERICA MAY GO TO HELL""- WISH IT WOULDVE BEEN COMPLETED AND I WISH I COULD READ THE CONTENTS OF IT... BY MLK" | negative (p = 0.42) | positive (p = 0.52) |

| Vulnerability | Level | Data slice | Metric | Transformation | Deviation |
|---|---|---|---|---|---|
| Robustness | major 🔴 | | Fail rate = 0.307 | Transform to title case | 307/1000 tested samples (30.7%) changed prediction after perturbation |

🔍✨Examples

When feature “text” is perturbed with the transformation “Transform to title case”, the model changes its prediction in 30.7% of the cases. We expected the predictions not to be affected by this transformation.

| # | text | Transform to title case(text) | Original prediction | Prediction after perturbation |
|---|---|---|---|---|
| 1816 | Guys... I'm seriously... #Stonehill right now... unranked and beating #3 #NewHaven in the 4th quarter... CBS College Sports... | Guys... I'M Seriously... #Stonehill Right Now... Unranked And Beating #3 #Newhaven In The 4Th Quarter... Cbs College Sports... | negative (p = 0.70) | positive (p = 0.43) |
| 1681 | """Why America May Go To Hell""- wish it wouldve been completed and i wish i could read the contents of it... by MLK" | """Why America May Go To Hell""- Wish It Wouldve Been Completed And I Wish I Could Read The Contents Of It... By Mlk" | negative (p = 0.42) | positive (p = 0.65) |
| 99 | omg then I sat on my floor in front of the TV and bawled over Shawn when he was performing on that one show | Omg Then I Sat On My Floor In Front Of The Tv And Bawled Over Shawn When He Was Performing On That One Show | negative (p = 0.51) | positive (p = 0.48) |

| Vulnerability | Level | Data slice | Metric | Transformation | Deviation |
|---|---|---|---|---|---|
| Robustness | major 🔴 | | Fail rate = 0.144 | Transform to lowercase | 144/1000 tested samples (14.4%) changed prediction after perturbation |

🔍✨Examples

When feature “text” is perturbed with the transformation “Transform to lowercase”, the model changes its prediction in 14.4% of the cases. We expected the predictions not to be affected by this transformation.

| # | text | Transform to lowercase(text) | Original prediction | Prediction after perturbation |
|---|---|---|---|---|
| 1812 | Chuck Norris cut off his left nut and donated it to science. You may know it as Jupiter. | chuck norris cut off his left nut and donated it to science. you may know it as jupiter. | positive (p = 0.47) | negative (p = 0.44) |
| 226 | Good morning...back after a couple of days off for Labor Day weekend. So today is my Monday. I make no promises. Ready with @user | good morning...back after a couple of days off for labor day weekend. so today is my monday. i make no promises. ready with @user | neutral (p = 0.42) | positive (p = 0.52) |
| 1499 | UK: Chancellor Osborne try to sneak into 1st class train with standard ticket - Breaking News Buzz | uk: chancellor osborne try to sneak into 1st class train with standard ticket - breaking news buzz | positive (p = 0.43) | negative (p = 0.42) |

| Vulnerability | Level | Data slice | Metric | Transformation | Deviation |
|---|---|---|---|---|---|
| Robustness | major 🔴 | | Fail rate = 0.137 | Add typos | 137/1000 tested samples (13.7%) changed prediction after perturbation |

🔍✨Examples

When feature “text” is perturbed with the transformation “Add typos”, the model changes its prediction in 13.7% of the cases. We expected the predictions not to be affected by this transformation.

| # | text | Add typos(text) | Original prediction | Prediction after perturbation |
|---|---|---|---|---|
| 1886 | "Randy Orton was the first 3rd generation superstar, Natalya was the first 3rd generation diva. When are we going for the 4th generation?" | "Randy Orton was the first 3rd generation spuersftwr, Natzalya was the first 3rd generation duav. When are we going for the 4th gemefation?" | positive (p = 0.60) | negative (p = 0.48) |
| 1229 | @user I installed Madden 16 Deluxe last Monday night for PS4 and still haven't received my packs today nor the reward for opening 50 | @user I ijstalled Madden 16 Deluxe last Mondxy night for LX4 and still haven't receives mt packs today mnor the reward for opening %50 | neutral (p = 0.43) | negative (p = 0.36) |
| 1251 | "2,780,589 people could have seen 'Rahul Gandhi in Bengaluru' since its 1st mention until it became a Trending Topic. #trndnl" | "2,780,589 peopoe could hafe seen 'Raul Gandhi un Bengaluru' since its 1st mdntion until it became a Trending GTopic. #trndnl" | positive (p = 0.55) | negative (p = 0.47) |

| Vulnerability | Level | Data slice | Metric | Transformation | Deviation |
|---|---|---|---|---|---|
| Robustness | medium 🟡 | | Fail rate = 0.092 | Punctuation Removal | 92/1000 tested samples (9.2%) changed prediction after perturbation |

🔍✨Examples

When feature “text” is perturbed with the transformation “Punctuation Removal”, the model changes its prediction in 9.2% of the cases. We expected the predictions not to be affected by this transformation.

| # | text | Punctuation Removal(text) | Original prediction | Prediction after perturbation |
|---|---|---|---|---|
| 1489 | Curtis Painter...we have a chance again! Can't believe Kerry Collins didn't throw us a pick-six tonight | Curtis Painter we have a chance again Can t believe Kerry Collins didn t throw us a pick six tonight | negative (p = 0.41) | neutral (p = 0.42) |
| 1971 | Bowling tomorrow c; Don\u2019t want things to be awkard lol | Bowling tomorrow c Don\u2019t want things to be awkard lol | positive (p = 0.40) | negative (p = 0.40) |
| 1952 | @user @user Yellow journalism. But you know? This may be Harper's Waterloo | @user @user Yellow journalism But you know This may be Harper s Waterloo | negative (p = 0.42) | positive (p = 0.42) |
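
The three case perturbations are easy to reproduce locally. The sketch below assumes they correspond to Python's built-in `str.upper`, `str.title`, and `str.lower`; the report does not confirm this, although the "I'M" and "4Th" forms in the title-case examples match what `str.title` produces. The predictor is a hypothetical, deliberately case-sensitive stand-in, not the model under test.

```python
# Case perturbations named in the report, mapped to built-in string
# methods (an assumption about how the scan implements them).
TRANSFORMS = {
    "uppercase": str.upper,
    "title case": str.title,
    "lowercase": str.lower,
}

def toy_predict(text: str) -> str:
    """Hypothetical classifier that only recognises a lowercase keyword."""
    return "negative" if "frustrated" in text else "positive"

# Toy inputs; not taken from tweet_eval.
texts = [
    "I'm so frustrated with this show",
    "Great game tonight",
    "frustrated again",
]

# Number of toy inputs whose prediction flips under each transformation.
flips = {
    name: sum(toy_predict(t) != toy_predict(transform(t)) for t in texts)
    for name, transform in TRANSFORMS.items()
}

# str.title capitalises the first letter after any non-letter character,
# which is why apostrophes and digits produce odd-looking casing.
sample = "I'm on the 4th episode of CBS #NewHaven"
titled = sample.title()
```

Replacing `toy_predict` with a wrapper around the real model would give a rough local check of the fail rates listed above.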
👉Overconfidence issues (1)

| Vulnerability | Level | Data slice | Metric | Transformation | Deviation |
|---|---|---|---|---|---|
| Overconfidence | medium 🟡 | `avg_digits(text)` < 0.011 | Overconfidence rate = 0.291 | | +18.82% than global |

🔍✨Examples

For records in the dataset where `avg_digits(text)` < 0.011, we found a significantly higher number of overconfident wrong predictions (183 samples, corresponding to 29.09% of the wrong predictions in the data slice).

| # | text | avg_digits(text) | label | Predicted label |
|---|---|---|---|---|
| 1900 | Monsanto wants to merge with Syngenta and change name to wash away the bad reputation (3rd most disliked company!): | 0.00869565 | neutral | negative (p = 0.95); neutral (p = 0.03) |
| 1503 | @user this is absolutely ridiculous. Wasn't it just ""national ice cream day"" on Sunday? Who makes up these days? This isn't official. This is" | 0 | neutral | negative (p = 0.95); neutral (p = 0.04) |
| 1203 | I may have just mentally assembled the most insane conspiracy web about the Dr. Luke / Kesha sitch. | 0 | neutral | negative (p = 0.94); neutral (p = 0.04) |
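
One plausible reading of the overconfidence rate is sketched below. The report does not define the metric, so this is an assumption: a wrong prediction is counted as overconfident when the probability of the predicted label exceeds the probability of the true label by more than a threshold. The threshold of 0.5, the record layout, and the function name are hypothetical.

```python
from typing import Dict, List

def overconfidence_rate(records: List[Dict], threshold: float = 0.5) -> float:
    """Share of wrong predictions whose confidence gap exceeds `threshold`.

    The gap is p(predicted label) - p(true label); both the definition
    and the default threshold are assumptions, not taken from the report.
    """
    wrong = [r for r in records if r["predicted"] != r["label"]]
    if not wrong:
        return 0.0
    overconfident = [
        r for r in wrong
        if r["probs"][r["predicted"]] - r["probs"][r["label"]] > threshold
    ]
    return len(overconfident) / len(wrong)

records = [
    # Probabilities as shown for record 1900 in the examples.
    {"label": "neutral", "predicted": "negative",
     "probs": {"negative": 0.95, "neutral": 0.03}},
    # Toy record: wrong, but with a small confidence gap.
    {"label": "neutral", "predicted": "positive",
     "probs": {"positive": 0.45, "neutral": 0.40}},
    # Toy record: correct prediction, ignored by the metric.
    {"label": "positive", "predicted": "positive",
     "probs": {"positive": 0.80, "neutral": 0.15}},
]

rate = overconfidence_rate(records)

# 183 overconfident samples at 29.09% would imply roughly this many
# wrong predictions in the slice (an inference, not a reported figure).
implied_wrong = round(183 / 0.2909)
```

Under this reading, the two probabilities listed for each example are the predicted label's score followed by the true label's score, which is why the gap is so large in all three rows.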

Check out the Giskard Space and test your model.

Disclaimer: it's important to note that automated scans may produce false positives or miss certain vulnerabilities. We encourage you to review the findings and assess the impact accordingly.
