Report for soleimanian/financial-roberta-large-sentiment

#34
by giskard-bot - opened
Giskard org

Hey Team!🤗✨
We’re thrilled to share some amazing evaluation results that’ll make your day!🎉📊

We have identified 4 potential vulnerabilities in your model based on an automated scan.

This automated analysis evaluated the model on the dataset financial_phrasebank (subset sentences_allagree, split train).

👉Robustness issues (2)

| Vulnerability | Level | Data slice | Metric | Transformation | Deviation |
|---|---|---|---|---|---|
| Robustness | medium 🟡 | | Fail rate = 0.086 | Add typos | 86/1000 tested samples (8.6%) changed prediction after perturbation |

🔍✨Examples

When feature “text” is perturbed with the transformation “Add typos”, the model changes its prediction in 8.6% of the cases. We expected the predictions not to be affected by this transformation.

| | text | Add typos(text) | Original prediction | Prediction after perturbation |
|---|---|---|---|---|
| 1919 | Cash flow from operations totalled EUR 7.4 mn , compared to a negative EUR 68.6 mn in the second quarter of 2008 . | Cash dlow from operations totalled WUR 7.4 mhn , comlared to a negative EUR 68.6 mn in the second wquarter of 2008 . | positive (p = 0.99) | negative (p = 1.00) |
| 1143 | A huge issue for us is the button placement . | A huge isue for us is the button placment . | negative (p = 0.98) | neutral (p = 1.00) |
| 2130 | Device volume in the area decreased by 21 % to 2.7 mn units . | Device volum in the area decreased by 21 % to 2.7 mn units . | negative (p = 1.00) | positive (p = 1.00) |

| Vulnerability | Level | Data slice | Metric | Transformation | Deviation |
|---|---|---|---|---|---|
| Robustness | medium 🟡 | | Fail rate = 0.075 | Transform to uppercase | 75/1000 tested samples (7.5%) changed prediction after perturbation |

🔍✨Examples

When feature “text” is perturbed with the transformation “Transform to uppercase”, the model changes its prediction in 7.5% of the cases. We expected the predictions not to be affected by this transformation.

| | text | Transform to uppercase(text) | Original prediction | Prediction after perturbation |
|---|---|---|---|---|
| 580 | Okmetic Board of Directors has also decided on a new share ownership program directed to the company 's top management . | OKMETIC BOARD OF DIRECTORS HAS ALSO DECIDED ON A NEW SHARE OWNERSHIP PROGRAM DIRECTED TO THE COMPANY 'S TOP MANAGEMENT . | neutral (p = 0.70) | positive (p = 0.79) |
| 823 | In the end of 2006 , the number of outlets will rise to 60-70 . | IN THE END OF 2006 , THE NUMBER OF OUTLETS WILL RISE TO 60-70 . | positive (p = 1.00) | negative (p = 1.00) |
| 1444 | The group reiterated its forecast that handset manufacturers will sell around 915 mln units this year globally . | THE GROUP REITERATED ITS FORECAST THAT HANDSET MANUFACTURERS WILL SELL AROUND 915 MLN UNITS THIS YEAR GLOBALLY . | neutral (p = 1.00) | positive (p = 1.00) |

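
For readers who want to reproduce the fail-rate metric reported for the robustness checks, here is a minimal sketch of how such a number can be computed: apply a transformation to each text and count how often the predicted label changes. The `fail_rate` helper and the `toy_predict` classifier are hypothetical illustrations (the toy classifier is a keyword matcher, not the scanned model), and this is not necessarily how the scan itself is implemented.

```python
from typing import Callable, Sequence


def fail_rate(
    texts: Sequence[str],
    predict: Callable[[str], str],
    transform: Callable[[str], str],
) -> float:
    """Fraction of samples whose predicted label changes after `transform`."""
    if not texts:
        raise ValueError("need at least one sample")
    changed = sum(predict(t) != predict(transform(t)) for t in texts)
    return changed / len(texts)


# Hypothetical stand-in classifier, NOT the real model: it is deliberately
# case-sensitive so that uppercasing flips some of its predictions.
def toy_predict(text: str) -> str:
    if "rise" in text or "increased" in text:
        return "positive"
    if "decreased" in text or "loss" in text:
        return "negative"
    return "neutral"


samples = [
    "In the end of 2006 , the number of outlets will rise to 60-70 .",
    "Device volume in the area decreased by 21 % to 2.7 mn units .",
    "A huge issue for us is the button placement .",
    "The group reiterated its forecast .",
]

# "Transform to uppercase" corresponds to str.upper here.
rate = fail_rate(samples, toy_predict, str.upper)
print(f"fail rate = {rate:.3f}")  # fail rate = 0.500 for this toy setup
```

To check the real model, `toy_predict` would be replaced by a function that returns the model's top label for a given text.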
👉Ethical issues (1)

| Vulnerability | Level | Data slice | Metric | Transformation | Deviation |
|---|---|---|---|---|---|
| Ethical | major 🔴 | | Fail rate = 0.025 | Switch countries from high- to low-income and vice versa | 13/516 tested samples (2.52%) changed prediction after perturbation |

🔍✨Examples

When feature “text” is perturbed with the transformation “Switch countries from high- to low-income and vice versa”, the model changes its prediction in 2.52% of the cases. We expected the predictions not to be affected by this transformation.

| | text | Switch countries from high- to low-income and vice versa(text) | Original prediction | Prediction after perturbation |
|---|---|---|---|---|
| 336 | ( ADP News ) - Feb 9 , 2009 - Finnish computer services company Proha Oyj ( HEL : ART1V ) said today its net loss narrowed to EUR 113,000 ( USD 146,000 ) for 2008 from EUR 1.2 million for 2007 . | ( ADP News ) - Feb 9 , 2009 - Mosotho computer services company Proha Oyj ( HEL : ART1V ) said today its net loss narrowed to EUR 113,000 ( USD 146,000 ) for 2008 from EUR 1.2 million for 2007 . | positive (p = 1.00) | negative (p = 0.72) |
| 441 | Savon koulutuskuntayhtyma , Finland based company has awarded contract for specialist agricultural or forestry machinery . | Savon koulutuskuntayhtyma , Haiti based company has awarded contract for specialist agricultural or forestry machinery . | neutral (p = 0.56) | positive (p = 0.60) |
| 994 | China Unicom , NYSE : CHU , HKSE : 0762 , and SHSE : 600050 , the second largest mobile carrier in the country . | Congo Unicom , NYSE : CHU , HKSE : 0762 , and SHSE : 600050 , the second largest mobile carrier in the country . | neutral (p = 0.96) | positive (p = 0.95) |

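
The country-switch perturbation can be approximated with a simple word-level substitution. The sketch below is an assumption-laden illustration: the `SWAPS` mapping is hypothetical and contains only the three substitutions visible in the examples above, whereas the actual scan presumably uses a much larger country and demonym list.

```python
import re

# Hypothetical, minimal mapping taken only from the three examples above.
SWAPS = {"Finnish": "Mosotho", "Finland": "Haiti", "China": "Congo"}

# Match whole words only, so that substrings of longer words are left alone.
_PATTERN = re.compile(r"\b(" + "|".join(map(re.escape, SWAPS)) + r")\b")


def switch_countries(text: str) -> str:
    """Replace each known country or demonym with its mapped counterpart."""
    return _PATTERN.sub(lambda m: SWAPS[m.group(1)], text)


original = "Savon koulutuskuntayhtyma , Finland based company has awarded contract ."
print(switch_countries(original))
```

Feeding both the original and the switched text to the model and comparing the predicted labels gives the per-sample check behind the reported 13/516 figure.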
👉Performance issues (1)

| Vulnerability | Level | Data slice | Metric | Transformation | Deviation |
|---|---|---|---|---|---|
| Performance | medium 🟡 | avg_word_length(text) < 3.860 AND avg_word_length(text) >= 3.699 | Balanced Accuracy = 0.892 | | 5.29% lower than global |

🔍✨Examples

For records in the dataset where `avg_word_length(text)` < 3.860 AND `avg_word_length(text)` >= 3.699, the Balanced Accuracy is 5.29% lower than the global Balanced Accuracy.

| | text | avg_word_length(text) | label | Predicted label |
|---|---|---|---|---|
| 567 | It will provide heating in the form of hot water for the sawmill 's needs . | 3.75 | neutral | positive (p = 0.64) |
| 1121 | Upon completion of the sale Proha would get some USD12 .7 m for its stake in Artemis . | 3.83333 | neutral | positive (p = 0.99) |
| 1140 | 3 January 2011 - Scandinavian lenders Sampo Bank ( HEL : SAMAS ) , Pohjola Bank ( HEL : POH1S ) and Svenska Handelsbanken ( STO : SHB A ) have provided a EUR160m ( USD213m ) line of credit to Lemminkainen Oyj ( HEL : LEM1S ) , the Finnish construction firm said on Friday . | 3.80702 | neutral | positive (p = 0.99) |
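
The slice and metric in this finding can be sketched in a few lines. The definition of `avg_word_length` below (mean length of whitespace-separated tokens) is an assumption, although it does reproduce the 3.75 reported for row 567. The `in_slice` and `balanced_accuracy` helpers are hypothetical names, and balanced accuracy is taken here in its usual sense of the mean of per-class recall.

```python
from collections import defaultdict
from typing import Sequence


def avg_word_length(text: str) -> float:
    """Mean length of whitespace-separated tokens (assumed definition)."""
    tokens = text.split()
    return sum(len(t) for t in tokens) / len(tokens) if tokens else 0.0


def in_slice(text: str, low: float = 3.699, high: float = 3.860) -> bool:
    """True if the text falls in the slice flagged by the report."""
    return low <= avg_word_length(text) < high


def balanced_accuracy(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Mean of per-class recall over the classes present in y_true."""
    total = defaultdict(int)
    correct = defaultdict(int)
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        correct[t] += int(t == p)
    return sum(correct[c] / total[c] for c in total) / len(total)


row_567 = "It will provide heating in the form of hot water for the sawmill 's needs ."
print(avg_word_length(row_567))  # 3.75
print(in_slice(row_567))         # True
```

Filtering a labelled dataset with `in_slice` and comparing `balanced_accuracy` on the filtered rows against the full set is one way to check the reported gap.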

Disclaimer: it's important to note that automated scans may produce false positives or miss certain vulnerabilities. We encourage you to review the findings and assess the impact accordingly.

💡 What's Next?

  • Check out the Giskard Space and improve your model.
  • The Giskard community is always buzzing with ideas. 🐢🤔 What do you want to see next? Your feedback is our favorite fuel, so drop your thoughts in the community forum! 🗣️💬 Together, we're building something extraordinary.

🙌 Big Thanks!

We're grateful to have you on this adventure with us. 🚀🌟 Here's to more breakthroughs, laughter, and code magic! 🥂✨ Keep hugging that code and spreading the love! 💻 #Giskard #Huggingface #AISafety 🌈👏 Your enthusiasm, feedback, and contributions are what we seek. 🌟 Keep being awesome!
