Report for mrm8488/bert-tiny-finetuned-sms-spam-detection

#104
by giskard-bot - opened
Giskard org

Hi Team,

This is a report from Giskard Bot Scan 🐢.

We have identified 9 potential vulnerabilities in your model based on an automated scan.

This automated analysis evaluated the model on the dataset sms_spam (subset plain_text, split train).

👉Performance issues (7)

For records in the dataset where avg_digits(text) < 0.005, the Recall is 83.05% lower than the global Recall.

| Level | Data slice | Metric | Deviation |
| --- | --- | --- | --- |
| major 🔴 | avg_digits(text) < 0.005 | Recall = 0.154 | -83.05% than global |

Taxonomy

avid-effect:performance:P0204
🔍✨Examples
| | text | avg_digits(text) | label | Predicted label |
| --- | --- | --- | --- | --- |
| 54 | SMS. ac Sptv: The New Jersey Devils and the Detroit Red Wings play Ice Hockey. Correct or Incorrect? End? Reply END SPTV | 0 | LABEL_1 | LABEL_0 (p = 0.62) |
| 68 | Did you hear about the new "Divorce Barbie"? It comes with all of Ken's stuff! | 0 | LABEL_1 | LABEL_0 (p = 0.94) |
| 270 | Ringtone Club: Get the UK singles chart on your mobile each week and choose any top quality ringtone! This message is free of charge. | 0 | LABEL_1 | LABEL_0 (p = 0.57) |
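The deviation figures in this report can be sanity-checked with a little arithmetic. The sketch below assumes the deviation is relative, i.e. (slice metric - global metric) / global metric; that formula is an inference from the numbers, not something the report states, and `implied_global` is a hypothetical helper name. The inputs are the rounded Recall values and deviations of the three Recall slices in this report.

```python
def implied_global(slice_value: float, deviation_pct: float) -> float:
    """Back out the global metric, ASSUMING deviation = (slice - global) / global."""
    return slice_value / (1.0 + deviation_pct / 100.0)


# (slice Recall, reported deviation in percent) for the three Recall slices.
recall_slices = [(0.154, -83.05), (0.784, -13.59), (0.857, -5.56)]

implied = [implied_global(value, dev) for value, dev in recall_slices]
for (value, dev), estimate in zip(recall_slices, implied):
    print(f"slice Recall {value:.3f} at {dev:+.2f}% implies global Recall of about {estimate:.3f}")
```

Under that assumption all three slices point to a global Recall of roughly 0.91, which suggests the relative reading is the consistent one. The small differences between the three estimates come from rounding in the reported values.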

For records in the dataset where avg_whitespace(text) >= 0.225, the Balanced Accuracy is 21.32% lower than the global Balanced Accuracy.

| Level | Data slice | Metric | Deviation |
| --- | --- | --- | --- |
| major 🔴 | avg_whitespace(text) >= 0.225 | Balanced Accuracy = 0.749 | -21.32% than global |

Taxonomy

avid-effect:performance:P0204
🔍✨Examples
| | text | avg_whitespace(text) | label | Predicted label |
| --- | --- | --- | --- | --- |
| 323 | cud u tell ppl im gona b a bit l8 cos 2 buses hav gon past cos they were full & im still waitin 4 1. Pete x | 0.259259 | LABEL_0 | LABEL_1 (p = 0.66) |
| 4514 | Money i have won wining number 946 wot do i do next | 0.230769 | LABEL_1 | LABEL_0 (p = 0.92) |
| 4873 | Hi dis is yijue i would be happy to work wif ü all for gek1510... | 0.227273 | LABEL_0 | LABEL_1 (p = 0.71) |
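The slices in this report are built on simple per-message text features. The following is a minimal sketch of how such features could be computed; the definitions are inferred from the example values shown in the report rather than taken from Giskard's source code, so treat them as assumptions. In particular, the reported numbers only line up if each message keeps a trailing newline, which is also an assumption about how the dataset stores its text.

```python
def text_length(text: str) -> int:
    """Number of characters, including any trailing newline."""
    return len(text)


def avg_whitespace(text: str) -> float:
    """Fraction of characters that are whitespace."""
    return sum(ch.isspace() for ch in text) / len(text) if text else 0.0


def avg_digits(text: str) -> float:
    """Fraction of characters that are digits."""
    return sum(ch.isdigit() for ch in text) / len(text) if text else 0.0


def avg_word_length(text: str) -> float:
    """Mean length of whitespace-separated tokens."""
    words = text.split()
    return sum(len(w) for w in words) / len(words) if words else 0.0


# Example 4514 from the report, with an ASSUMED trailing newline.
sample = "Money i have won wining number 946 wot do i do next\n"
print(round(avg_whitespace(sample), 6), text_length(sample))
```

With the trailing newline included, `avg_whitespace(sample)` is 12/52, which matches the 0.230769 listed for that record.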

For records in the dataset where text contains "ok", the Balanced Accuracy is 15.95% lower than the global Balanced Accuracy.

| Level | Data slice | Metric | Deviation |
| --- | --- | --- | --- |
| major 🔴 | text contains "ok" | Balanced Accuracy = 0.800 | -15.95% than global |

Taxonomy

avid-effect:performance:P0204
🔍✨Examples
| | text | label | Predicted label |
| --- | --- | --- | --- |
| 5 | FreeMsg Hey there darling it's been 3 week's now and no word back! I'd like some fun you up for it still? Tb ok! XxX std chgs to send, £1.50 to rcv | LABEL_1 | LABEL_0 (p = 0.78) |
| 4249 | accordingly. I repeat, just text the word ok on your mobile phone and send | LABEL_1 | LABEL_0 (p = 0.93) |

For records in the dataset where avg_word_length(text) < 3.891 AND avg_word_length(text) >= 3.306, the Recall is 13.59% lower than the global Recall.

| Level | Data slice | Metric | Deviation |
| --- | --- | --- | --- |
| major 🔴 | avg_word_length(text) < 3.891 AND avg_word_length(text) >= 3.306 | Recall = 0.784 | -13.59% than global |

Taxonomy

avid-effect:performance:P0204
🔍✨Examples
| | text | avg_word_length(text) | label | Predicted label |
| --- | --- | --- | --- | --- |
| 5 | FreeMsg Hey there darling it's been 3 week's now and no word back! I'd like some fun you up for it still? Tb ok! XxX std chgs to send, £1.50 to rcv | 3.625 | LABEL_1 | LABEL_0 (p = 0.78) |
| 227 | Will u meet ur dream partner soon? Is ur career off 2 a flyng start? 2 find out free, txt HORO followed by ur star sign, e. g. HORO ARIES | 3.6 | LABEL_1 | LABEL_0 (p = 0.91) |
| 263 | MY NO. IN LUTON 0125698789 RING ME IF UR AROUND! H* | 3.72727 | LABEL_0 | LABEL_1 (p = 0.87) |

For records in the dataset where text_length(text) < 51.500 AND text_length(text) >= 40.500, the Balanced Accuracy is 11.19% lower than the global Balanced Accuracy.

| Level | Data slice | Metric | Deviation |
| --- | --- | --- | --- |
| major 🔴 | text_length(text) < 51.500 AND text_length(text) >= 40.500 | Balanced Accuracy = 0.845 | -11.19% than global |

Taxonomy

avid-effect:performance:P0204
🔍✨Examples
| | text | text_length(text) | label | Predicted label |
| --- | --- | --- | --- | --- |
| 955 | Filthy stories and GIRLS waiting for your | 42 | LABEL_1 | LABEL_0 (p = 0.94) |
| 3094 | staff.science.nus.edu.sg/~phyhcmk/teaching/pc1323 | 50 | LABEL_0 | LABEL_1 (p = 0.88) |
| 3302 | RCT' THNQ Adrian for U text. Rgds Vatian | 41 | LABEL_1 | LABEL_0 (p = 0.92) |

For records in the dataset where avg_whitespace(text) >= 0.212 AND avg_whitespace(text) < 0.223, the Balanced Accuracy is 9.62% lower than the global Balanced Accuracy.

| Level | Data slice | Metric | Deviation |
| --- | --- | --- | --- |
| medium 🟡 | avg_whitespace(text) >= 0.212 AND avg_whitespace(text) < 0.223 | Balanced Accuracy = 0.860 | -9.62% than global |

Taxonomy

avid-effect:performance:P0204
🔍✨Examples
| | text | avg_whitespace(text) | label | Predicted label |
| --- | --- | --- | --- | --- |
| 5 | FreeMsg Hey there darling it's been 3 week's now and no word back! I'd like some fun you up for it still? Tb ok! XxX std chgs to send, £1.50 to rcv | 0.216216 | LABEL_1 | LABEL_0 (p = 0.78) |
| 227 | Will u meet ur dream partner soon? Is ur career off 2 a flyng start? 2 find out free, txt HORO followed by ur star sign, e. g. HORO ARIES | 0.217391 | LABEL_1 | LABEL_0 (p = 0.91) |
| 2402 | Babe: U want me dont u baby! Im nasty and have a thing 4 filthyguys. Fancy a rude time with a sexy bitch. How about we go slo n hard! Txt XXX SLO(4msgs) | 0.215686 | LABEL_1 | LABEL_0 (p = 0.81) |

For records in the dataset where avg_word_length(text) < 4.258 AND avg_word_length(text) >= 4.102, the Recall is 5.56% lower than the global Recall.

| Level | Data slice | Metric | Deviation |
| --- | --- | --- | --- |
| medium 🟡 | avg_word_length(text) < 4.258 AND avg_word_length(text) >= 4.102 | Recall = 0.857 | -5.56% than global |

Taxonomy

avid-effect:performance:P0204
🔍✨Examples
| | text | avg_word_length(text) | label | Predicted label |
| --- | --- | --- | --- | --- |
| 2003 | TheMob>Yo yo yo-Here comes a new selection of hot downloads for our members to get for FREE! Just click & open the next link sent to ur fone... | 4.14286 | LABEL_1 | LABEL_0 (p = 0.92) |
| 3302 | RCT' THNQ Adrian for U text. Rgds Vatian | 4.125 | LABEL_1 | LABEL_0 (p = 0.92) |
| 4676 | Hi babe its Chloe, how r u? I was smashed on saturday night, it was great! How was your weekend? U been missing me? SP visionsms.com Text stop to stop 150p/text | 4.19355 | LABEL_1 | LABEL_0 (p = 0.84) |
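To make the performance findings above easier to reproduce, here is a rough sketch of computing Recall and Balanced Accuracy on a data slice and on the full set. It is not Giskard's implementation: the helper names are hypothetical, the toy records are invented for illustration, and it assumes LABEL_1 is the positive (spam) class, which the report does not state explicitly.

```python
from typing import Callable, Sequence


def recall(y_true: Sequence[str], y_pred: Sequence[str], positive: str = "LABEL_1") -> float:
    """True positives / (true positives + false negatives) for one class."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    return tp / (tp + fn) if tp + fn else 0.0


def balanced_accuracy(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Mean of per-class recall over the classes present in y_true."""
    per_class = [recall(y_true, y_pred, positive=c) for c in sorted(set(y_true))]
    return sum(per_class) / len(per_class) if per_class else 0.0


def slice_metric(texts, y_true, y_pred, predicate: Callable[[str], bool], metric) -> float:
    """Evaluate `metric` only on the records whose text satisfies `predicate`."""
    keep = [i for i, t in enumerate(texts) if predicate(t)]
    return metric([y_true[i] for i in keep], [y_pred[i] for i in keep])


# Invented toy records, NOT taken from the dataset.
texts = ["ok see you", "text ok to win", "free prize call now", "ok lar"]
y_true = ["LABEL_0", "LABEL_1", "LABEL_1", "LABEL_0"]
y_pred = ["LABEL_0", "LABEL_0", "LABEL_1", "LABEL_0"]

contains_ok = lambda t: "ok" in t.lower()
global_ba = balanced_accuracy(y_true, y_pred)
slice_ba = slice_metric(texts, y_true, y_pred, contains_ok, balanced_accuracy)
print(f"global={global_ba:.2f} slice={slice_ba:.2f}")
```

Swapping in one of the feature thresholds from the report as the predicate gives the same kind of slice-versus-global comparison for the other findings.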
👉Spurious Correlation issues (2)

The data slice avg_digits(text) < 0.032 appears to be highly associated with the prediction label LABEL_0 (99.09% of predictions in the data slice).

| Level | Data slice | Metric | Deviation |
| --- | --- | --- | --- |
| minor 🟡 | avg_digits(text) < 0.032 | Nominal association (Theil's U) = 0.684 | Prediction label = LABEL_0 for 99.09% of samples in the slice |

Taxonomy

avid-effect:performance:P0103
🔍✨Examples
| | text | avg_digits(text) | label | Predicted label |
| --- | --- | --- | --- | --- |
| 0 | Go until jurong point, crazy.. Available only in bugis n great world la e buffet... Cine there got amore wat... | 0 | LABEL_0 | LABEL_0 (p = 0.94) |
| 1 | Ok lar... Joking wif u oni... | 0 | LABEL_0 | LABEL_0 (p = 0.94) |
| 3 | U dun say so early hor... U c already then say... | 0 | LABEL_0 | LABEL_0 (p = 0.94) |

The data slice avg_digits(text) >= 0.084 appears to be highly associated with the prediction label LABEL_1 (96.11% of predictions in the data slice).

| Level | Data slice | Metric | Deviation |
| --- | --- | --- | --- |
| minor 🟡 | avg_digits(text) >= 0.084 | Nominal association (Theil's U) = 0.630 | Prediction label = LABEL_1 for 96.11% of samples in the slice |

Taxonomy

avid-effect:performance:P0103
🔍✨Examples
| | text | avg_digits(text) | label | Predicted label |
| --- | --- | --- | --- | --- |
| 2 | Free entry in 2 a wkly comp to win FA Cup final tkts 21st May 2005. Text FA to 87121 to receive entry question(std txt rate)T&C's apply 08452810075over18's | 0.160256 | LABEL_1 | LABEL_1 (p = 0.91) |
| 8 | WINNER!! As a valued network customer you have been selected to receivea £900 prize reward! To claim call 09061701461. Claim code KL341. Valid 12 hours only. | 0.120253 | LABEL_1 | LABEL_1 (p = 0.91) |
| 9 | Had your mobile 11 months or more? U R entitled to Update to the latest colour mobiles with camera for Free! Call The Mobile Update Co FREE on 08002986030 | 0.083871 | LABEL_1 | LABEL_1 (p = 0.90) |
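The spurious correlation findings are scored with Theil's U, also known as the uncertainty coefficient. As a minimal sketch, the textbook definition U(X|Y) = (H(X) - H(X|Y)) / H(X) can be computed as below, with X as the predicted label and Y as slice membership. Which direction Giskard conditions on, and how it handles edge cases, is not stated in this report, so the choices here are assumptions and the toy inputs are invented.

```python
import math
from collections import Counter
from typing import Hashable, Sequence


def entropy(values: Sequence[Hashable]) -> float:
    """Shannon entropy H(X) in nats."""
    n = len(values)
    return -sum((c / n) * math.log(c / n) for c in Counter(values).values())


def conditional_entropy(x: Sequence[Hashable], y: Sequence[Hashable]) -> float:
    """H(X | Y) = -sum over (x, y) of p(x, y) * log(p(x, y) / p(y))."""
    n = len(x)
    y_counts = Counter(y)
    xy_counts = Counter(zip(x, y))
    return -sum((c / n) * math.log(c / y_counts[yv]) for (_, yv), c in xy_counts.items())


def theils_u(x: Sequence[Hashable], y: Sequence[Hashable]) -> float:
    """Fraction of the uncertainty in x that is explained by knowing y."""
    hx = entropy(x)
    if hx == 0.0:
        return 1.0  # x is constant, so there is nothing left to explain (a convention)
    return (hx - conditional_entropy(x, y)) / hx


# Invented toy data: slice membership fully determines the prediction.
preds = ["LABEL_0", "LABEL_0", "LABEL_1", "LABEL_1"]
in_slice = [True, True, False, False]
u_perfect = theils_u(preds, in_slice)

# Invented toy data: slice membership says nothing about the prediction.
u_independent = theils_u(["LABEL_0", "LABEL_1", "LABEL_0", "LABEL_1"], in_slice)
print(u_perfect, round(u_independent, 6))
```

U ranges from 0 (no association) to 1 (the prediction is fully determined by slice membership), so the reported 0.684 and 0.630 sit well toward the strongly associated end of that scale.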

Check out the Giskard Space and test your model.

Disclaimer: it's important to note that automated scans may produce false positives or miss certain vulnerabilities. We encourage you to review the findings and assess the impact accordingly.
