Report for Sigma/financial-sentiment-analysis

#64
by inoki-giskard - opened

Hey Team!🤗✨
We’re thrilled to share some amazing evaluation results that’ll make your day!🎉📊

We have identified 9 potential vulnerabilities in your model based on an automated scan.

This automated analysis evaluated the model on the dataset financial_phrasebank (subset sentences_66agree, split train).

👉Robustness issues (3)

| Vulnerability | Level | Data slice | Metric | Transformation | Deviation |
|---|---|---|---|---|---|
| Robustness | major 🔴 | | Fail rate = 0.411 | Transform to uppercase | 411/1000 tested samples (41.1%) changed prediction after perturbation |

🔍✨Examples

When feature “text” is perturbed with the transformation “Transform to uppercase”, the model changes its prediction in 41.1% of the cases. We expected the predictions not to be affected by this transformation.

| | text | Transform to uppercase(text) | Original prediction | Prediction after perturbation |
|---|---|---|---|---|
| 853 | Strong growth has continued also in China . | STRONG GROWTH HAS CONTINUED ALSO IN CHINA . | LABEL_2 (p = 1.00) | LABEL_1 (p = 1.00) |
| 1573 | `` P&O Ferries now has a very efficient and powerful vessel for its Dover to Calais route , '' head of the shipbuilder 's Rauma yard , Timo Suistio , said . | `` P&O FERRIES NOW HAS A VERY EFFICIENT AND POWERFUL VESSEL FOR ITS DOVER TO CALAIS ROUTE , '' HEAD OF THE SHIPBUILDER 'S RAUMA YARD , TIMO SUISTIO , SAID . | LABEL_2 (p = 1.00) | LABEL_1 (p = 1.00) |
| 256 | Revenue grew 12 percent to ( x20ac ) 3.6 billion ( US$ 4.5 billion ) . | REVENUE GREW 12 PERCENT TO ( X20AC ) 3.6 BILLION ( US$ 4.5 BILLION ) . | LABEL_2 (p = 1.00) | LABEL_1 (p = 1.00) |
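
For readers who want to reproduce a fail rate like the one reported here, the following is a minimal sketch of the idea, not Giskard's actual implementation. It counts how often a prediction changes once a transformation is applied. `fail_rate` and `toy_predict` are hypothetical names introduced for illustration, and `toy_predict` is a deliberately case-sensitive stand-in, not the scanned model.

```python
from typing import Callable, Sequence


def fail_rate(
    predict: Callable[[str], str],
    texts: Sequence[str],
    transform: Callable[[str], str],
) -> float:
    """Share of texts whose predicted label changes after `transform`."""
    if not texts:
        raise ValueError("texts must not be empty")
    changed = sum(predict(t) != predict(transform(t)) for t in texts)
    return changed / len(texts)


def toy_predict(text: str) -> str:
    # Hypothetical, case-sensitive stand-in for a real classifier.
    return "LABEL_2" if "growth" in text else "LABEL_1"


samples = [
    "Strong growth has continued also in China .",
    "Sales of clothing developed best .",
]

rate = fail_rate(toy_predict, samples, str.upper)
print(rate)  # 0.5
```

Swapping `str.upper` for `str.title` gives the title-case variant of the same check. To run it against a real model, replace `toy_predict` with a function that returns the model's top label for a single string.
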

| Vulnerability | Level | Data slice | Metric | Transformation | Deviation |
|---|---|---|---|---|---|
| Robustness | major 🔴 | | Fail rate = 0.405 | Transform to title case | 405/1000 tested samples (40.5%) changed prediction after perturbation |

🔍✨Examples

When feature “text” is perturbed with the transformation “Transform to title case”, the model changes its prediction in 40.5% of the cases. We expected the predictions not to be affected by this transformation.

| | text | Transform to title case(text) | Original prediction | Prediction after perturbation |
|---|---|---|---|---|
| 852 | Sales VAT inclusive expanded by 19 percent , to 351 million euros . | Sales Vat Inclusive Expanded By 19 Percent , To 351 Million Euros . | LABEL_2 (p = 1.00) | LABEL_1 (p = 1.00) |
| 1573 | `` P&O Ferries now has a very efficient and powerful vessel for its Dover to Calais route , '' head of the shipbuilder 's Rauma yard , Timo Suistio , said . | `` P&O Ferries Now Has A Very Efficient And Powerful Vessel For Its Dover To Calais Route , '' Head Of The Shipbuilder 'S Rauma Yard , Timo Suistio , Said . | LABEL_2 (p = 1.00) | LABEL_1 (p = 1.00) |
| 3622 | Finnish Suominen Flexible Packaging is cutting 48 jobs in its unit in Tampere and two in Nastola , in Finland . | Finnish Suominen Flexible Packaging Is Cutting 48 Jobs In Its Unit In Tampere And Two In Nastola , In Finland . | LABEL_0 (p = 0.85) | LABEL_1 (p = 1.00) |

| Vulnerability | Level | Data slice | Metric | Transformation | Deviation |
|---|---|---|---|---|---|
| Robustness | medium 🟡 | | Fail rate = 0.082 | Add typos | 82/1000 tested samples (8.2%) changed prediction after perturbation |

🔍✨Examples

When feature “text” is perturbed with the transformation “Add typos”, the model changes its prediction in 8.2% of the cases. We expected the predictions not to be affected by this transformation.

| | text | Add typos(text) | Original prediction | Prediction after perturbation |
|---|---|---|---|---|
| 254 | Profit for the period was EUR 5.9 mn , up from EUR 1.3 mn . | Profot for the period was EUR 5.9 mn , ulp from EUR 1. mj . | LABEL_2 (p = 1.00) | LABEL_1 (p = 1.00) |
| 3952 | The Baltimore Police and Fire Pension , which has about $ 1.5 billion , lost about $ 3.5 million in Madoff Ponzi scheme . | The Baltimore Police and Firr Pension , wich has about $ 1.5 billion , lkst about $ 3.5 million in Madoff Pnzi scheme . | LABEL_0 (p = 1.00) | LABEL_1 (p = 1.00) |
| 1803 | Net profit was 35.5 mln compared with 29.8 mln . | Net profit as 35.5 mln comlred with 29.8 mln . | LABEL_2 (p = 0.70) | LABEL_1 (p = 1.00) |
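
The typo examples above look like a mix of character deletions, substitutions and insertions. The sketch below shows one simple way to generate comparable perturbations for your own tests. It is an assumption-laden illustration, not the transformation used by the scan: `add_typos`, its `rate` and `seed` parameters, and the choice to edit letters only are all hypothetical.

```python
import random
import string


def add_typos(text: str, rate: float = 0.05, seed: int = 0) -> str:
    """Randomly delete, substitute, or follow letters with an extra letter.

    Hypothetical helper: only alphabetic characters are edited, so
    digits, spaces and punctuation are always left untouched.
    """
    rng = random.Random(seed)  # seeded so results are reproducible
    out = []
    for ch in text:
        if ch.isalpha() and rng.random() < rate:
            op = rng.choice(("delete", "substitute", "insert"))
            if op == "delete":
                continue  # drop the character
            if op == "substitute":
                out.append(rng.choice(string.ascii_lowercase))
                continue
            # insert: keep the character and add a random letter after it
            out.append(ch)
            out.append(rng.choice(string.ascii_lowercase))
        else:
            out.append(ch)
    return "".join(out)


original = "Net profit was 35.5 mln compared with 29.8 mln ."
perturbed = add_typos(original, rate=0.1, seed=42)
```

Because the generator is seeded, the same `seed` always yields the same perturbed text, which makes a failing example easy to re-run.
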
👉Performance issues (6)

| Vulnerability | Level | Data slice | Metric | Transformation | Deviation |
|---|---|---|---|---|---|
| Performance | major 🔴 | `avg_whitespace(text)` < 0.150 AND `avg_whitespace(text)` >= 0.144 | Balanced Accuracy = 0.738 | | -15.16% than global |

🔍✨Examples

For records in the dataset where `avg_whitespace(text)` < 0.150 AND `avg_whitespace(text)` >= 0.144, the Balanced Accuracy is 15.16% lower than the global Balanced Accuracy.

| | text | avg_whitespace(text) | label | Predicted label |
|---|---|---|---|---|
| 71 | Under this agreement Biohit becomes a focus supplier of pipettors and disposable pipettor tips to VWR customers throughout Europe . | 0.145038 | LABEL_2 | LABEL_1 (p = 1.00) |
| 134 | South African Sappi will become the largest foreign forest industry company operating in Finland as a result of the acquisition Finnish M-real Corporation 's Graphic Papers Business unit . | 0.148936 | LABEL_2 | LABEL_1 (p = 0.90) |
| 261 | Sales of clothing developed best . | 0.147059 | LABEL_2 | LABEL_1 (p = 1.00) |
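
The report does not define `avg_whitespace(text)`. Assuming it is the number of whitespace characters divided by the total number of characters, the sketch below reproduces the value listed for sample 261. The function body and the `in_slice` helper are illustrative assumptions, not code taken from the scan.

```python
def avg_whitespace(text: str) -> float:
    """Fraction of characters in `text` that are whitespace (assumed definition)."""
    if not text:
        return 0.0
    return sum(ch.isspace() for ch in text) / len(text)


def in_slice(value: float, low: float, high: float) -> bool:
    """True when low <= value < high, matching the slice bounds in the report."""
    return low <= value < high


text = "Sales of clothing developed best ."
ratio = avg_whitespace(text)
print(round(ratio, 6))                # 0.147059
print(in_slice(ratio, 0.144, 0.150))  # True
```

Here the sentence has 5 spaces in 34 characters, which gives 5/34 ≈ 0.147059.
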

| Vulnerability | Level | Data slice | Metric | Transformation | Deviation |
|---|---|---|---|---|---|
| Performance | major 🔴 | `avg_word_length(text)` >= 5.351 AND `avg_word_length(text)` < 5.659 | Balanced Accuracy = 0.754 | | -13.28% than global |

🔍✨Examples

For records in the dataset where `avg_word_length(text)` >= 5.351 AND `avg_word_length(text)` < 5.659, the Balanced Accuracy is 13.28% lower than the global Balanced Accuracy.

| | text | avg_word_length(text) | label | Predicted label |
|---|---|---|---|---|
| 43 | The agreement was signed with Biohit Healthcare Ltd , the UK-based subsidiary of Biohit Oyj , a Finnish public company which develops , manufactures and markets liquid handling products and diagnostic test systems . | 5.35294 | LABEL_2 | LABEL_1 (p = 0.77) |
| 71 | Under this agreement Biohit becomes a focus supplier of pipettors and disposable pipettor tips to VWR customers throughout Europe . | 5.6 | LABEL_2 | LABEL_1 (p = 1.00) |
| 134 | South African Sappi will become the largest foreign forest industry company operating in Finland as a result of the acquisition Finnish M-real Corporation 's Graphic Papers Business unit . | 5.51724 | LABEL_2 | LABEL_1 (p = 0.90) |
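
`avg_word_length(text)` is not defined in the report either. Assuming it is the mean length of whitespace-separated tokens, with punctuation tokens such as `.` counted as words, the sketch below matches the value shown for sample 71. Treat the definition as an inference from the numbers rather than the scan's documented behaviour.

```python
def avg_word_length(text: str) -> float:
    """Mean length of whitespace-separated tokens (assumed definition)."""
    tokens = text.split()
    if not tokens:
        return 0.0
    return sum(len(token) for token in tokens) / len(tokens)


text = (
    "Under this agreement Biohit becomes a focus supplier of pipettors "
    "and disposable pipettor tips to VWR customers throughout Europe ."
)
length = avg_word_length(text)
print(length)                    # 5.6
print(5.351 <= length < 5.659)   # True, so the sample falls in the slice
```

The sentence splits into 20 tokens totalling 112 characters, hence 112/20 = 5.6.
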

| Vulnerability | Level | Data slice | Metric | Transformation | Deviation |
|---|---|---|---|---|---|
| Performance | major 🔴 | `avg_whitespace(text)` < 0.140 | Balanced Accuracy = 0.770 | | -11.42% than global |

🔍✨Examples

For records in the dataset where `avg_whitespace(text)` < 0.140, the Balanced Accuracy is 11.42% lower than the global Balanced Accuracy.

| | text | avg_whitespace(text) | label | Predicted label |
|---|---|---|---|---|
| 66 | Other measures included increasing synergies and economies of scale within the Grimaldi Group and personnel adjustments , divestments and redelivery of excess tonnage . | 0.136905 | LABEL_1 | LABEL_2 (p = 1.00) |
| 498 | The new agreement , which expands a long-established cooperation between the companies , involves the transfer of certain engineering and documentation functions from Larox to Etteplan . | 0.139785 | LABEL_2 | LABEL_1 (p = 0.98) |
| 734 | DMASIA-16 August 2006-Benefon extends manufacturing capability with ASMobile -® 2006 Digitalmediaasia.com & DMA Ltd. . | 0.118644 | LABEL_2 | LABEL_1 (p = 0.95) |

| Vulnerability | Level | Data slice | Metric | Transformation | Deviation |
|---|---|---|---|---|---|
| Performance | medium 🟡 | `avg_word_length(text)` >= 4.957 AND `avg_word_length(text)` < 5.090 | Balanced Accuracy = 0.784 | | -9.92% than global |

🔍✨Examples

For records in the dataset where `avg_word_length(text)` >= 4.957 AND `avg_word_length(text)` < 5.090, the Balanced Accuracy is 9.92% lower than the global Balanced Accuracy.

| | text | avg_word_length(text) | label | Predicted label |
|---|---|---|---|---|
| 418 | A corresponding increase of 85,432.50 euros in Ahlstrom 's share capital has been entered in the Trade Register today . | 5 | LABEL_1 | LABEL_2 (p = 0.99) |
| 451 | The company plans to spend the proceeds from the rights offering for strengthening its balance sheet . | 5.05882 | LABEL_1 | LABEL_2 (p = 0.99) |
| 654 | `` After the share purchase is completed , financing will also be provided to expand Latvia 's broadband infrastructure and to develop new areas of business , including acquisitions of other companies . '' | 5.05882 | LABEL_2 | LABEL_1 (p = 1.00) |

| Vulnerability | Level | Data slice | Metric | Transformation | Deviation |
|---|---|---|---|---|---|
| Performance | medium 🟡 | `avg_digits(text)` < 0.005 | Balanced Accuracy = 0.802 | | -7.74% than global |

🔍✨Examples

For records in the dataset where `avg_digits(text)` < 0.005, the Balanced Accuracy is 7.74% lower than the global Balanced Accuracy.

| | text | avg_digits(text) | label | Predicted label |
|---|---|---|---|---|
| 43 | The agreement was signed with Biohit Healthcare Ltd , the UK-based subsidiary of Biohit Oyj , a Finnish public company which develops , manufactures and markets liquid handling products and diagnostic test systems . | 0 | LABEL_2 | LABEL_1 (p = 0.77) |
| 56 | The company supports its global customers in developing new technologies and offers a fast route from product development to applications and volume production . | 0 | LABEL_1 | LABEL_2 (p = 0.99) |
| 66 | Other measures included increasing synergies and economies of scale within the Grimaldi Group and personnel adjustments , divestments and redelivery of excess tonnage . | 0 | LABEL_1 | LABEL_2 (p = 1.00) |

| Vulnerability | Level | Data slice | Metric | Transformation | Deviation |
|---|---|---|---|---|---|
| Performance | medium 🟡 | `avg_whitespace(text)` < 0.162 AND `avg_whitespace(text)` >= 0.159 | Balanced Accuracy = 0.806 | | -7.32% than global |

🔍✨Examples

For records in the dataset where `avg_whitespace(text)` < 0.162 AND `avg_whitespace(text)` >= 0.159, the Balanced Accuracy is 7.32% lower than the global Balanced Accuracy.

| | text | avg_whitespace(text) | label | Predicted label |
|---|---|---|---|---|
| 138 | After the takeover , Cramo will become the second largest rental services provider in the Latvian market . | 0.160377 | LABEL_2 | LABEL_1 (p = 0.89) |
| 418 | A corresponding increase of 85,432.50 euros in Ahlstrom 's share capital has been entered in the Trade Register today . | 0.159664 | LABEL_1 | LABEL_2 (p = 0.99) |
| 654 | `` After the share purchase is completed , financing will also be provided to expand Latvia 's broadband infrastructure and to develop new areas of business , including acquisitions of other companies . '' | 0.160976 | LABEL_2 | LABEL_1 (p = 1.00) |
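
To make the performance numbers easier to interpret, here is a minimal sketch of balanced accuracy (the mean of per-class recall) and of a relative deviation from a global score. Two things are assumptions here: that the "% than global" figures are relative rather than absolute differences, and the global value of roughly 0.87, which the report never states but which is what the six slices imply under that reading. The function names are hypothetical.

```python
from collections import defaultdict
from typing import Hashable, Sequence


def balanced_accuracy(y_true: Sequence[Hashable], y_pred: Sequence[Hashable]) -> float:
    """Mean of per-class recall over the classes present in `y_true`."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    total = defaultdict(int)
    correct = defaultdict(int)
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        correct[truth] += int(truth == pred)
    if not total:
        raise ValueError("y_true must not be empty")
    return sum(correct[c] / total[c] for c in total) / len(total)


def relative_deviation(slice_score: float, global_score: float) -> float:
    """Relative change of a slice score with respect to the global score."""
    return (slice_score - global_score) / global_score


# Tiny made-up labels, only to show the calculation.
y_true = ["LABEL_0", "LABEL_0", "LABEL_1", "LABEL_1", "LABEL_2", "LABEL_2"]
y_pred = ["LABEL_0", "LABEL_1", "LABEL_1", "LABEL_1", "LABEL_2", "LABEL_0"]
score = balanced_accuracy(y_true, y_pred)  # recalls 0.5, 1.0, 0.5

# 0.87 is an inferred, approximate global score, not a number from the report.
deviation = relative_deviation(0.738, 0.87)
```

With these inputs `score` is 2/3 and `deviation` is about -15.2%, close to the -15.16% reported for the first slice. The small gap comes from rounding the assumed global score.
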

Disclaimer: it's important to note that automated scans may produce false positives or miss certain vulnerabilities. We encourage you to review the findings and assess the impact accordingly.

💡 What's Next?

  • Check out the Giskard Space and improve your model.
  • The Giskard community is always buzzing with ideas. 🐢🤔 What do you want to see next? Your feedback is our favorite fuel, so drop your thoughts in the community forum! 🗣️💬 Together, we're building something extraordinary.

🙌 Big Thanks!

We're grateful to have you on this adventure with us. 🚀🌟 Here's to more breakthroughs, laughter, and code magic! 🥂✨ Keep hugging that code and spreading the love! 💻 #Giskard #Huggingface #AISafety 🌈👏 Your enthusiasm, feedback, and contributions are what we seek. 🌟 Keep being awesome!