Report for soleimanian/financial-roberta-large-sentiment

#87
by giskard-bot - opened

Hi Team,

This is a report from Giskard Bot Scan 🐢.

We have identified 3 potential vulnerabilities in your model based on an automated scan.

This automated analysis evaluated the model on the dataset financial_phrasebank (subset sentences_allagree, split train).

👉Performance issues (1)
| Vulnerability | Level | Data slice | Metric | Transformation | Deviation |
|---|---|---|---|---|---|
| Performance | medium 🟡 | `avg_word_length(text)` < 3.860 AND `avg_word_length(text)` >= 3.699 | Balanced Accuracy = 0.892 | | -5.29% than global |

🔍✨Examples

For records in the dataset where `avg_word_length(text)` < 3.860 AND `avg_word_length(text)` >= 3.699, the Balanced Accuracy is 5.29% lower than the global Balanced Accuracy.

| | text | avg_word_length(text) | label | Predicted label |
|---|---|---|---|---|
| 567 | It will provide heating in the form of hot water for the sawmill 's needs . | 3.75 | neutral | positive (p = 0.64) |
| 1121 | Upon completion of the sale Proha would get some USD12 .7 m for its stake in Artemis . | 3.83333 | neutral | positive (p = 0.99) |
| 1140 | 3 January 2011 - Scandinavian lenders Sampo Bank ( HEL : SAMAS ) , Pohjola Bank ( HEL : POH1S ) and Svenska Handelsbanken ( STO : SHB A ) have provided a EUR160m ( USD213m ) line of credit to Lemminkainen Oyj ( HEL : LEM1S ) , the Finnish construction firm said on Friday . | 3.80702 | neutral | positive (p = 0.99) |
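To make the slice definition concrete, here is a minimal sketch of how the slice condition and the metric could be recomputed by hand. It assumes `avg_word_length` is the mean character length of whitespace-separated tokens, which reproduces the values reported for rows 567 and 1121 but is not confirmed as Giskard's exact implementation. The helper names `in_slice` and `balanced_accuracy` are illustrative, not part of any library.

```python
from collections import defaultdict


def avg_word_length(text: str) -> float:
    """Mean character length of whitespace-separated tokens (assumed definition)."""
    tokens = text.split()
    return sum(len(t) for t in tokens) / len(tokens) if tokens else 0.0


def in_slice(text: str, low: float = 3.699, high: float = 3.860) -> bool:
    """True when the record falls in the slice flagged by the report."""
    return low <= avg_word_length(text) < high


def balanced_accuracy(y_true, y_pred) -> float:
    """Unweighted mean of per-class recall."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for truth, pred in zip(y_true, y_pred):
        totals[truth] += 1
        hits[truth] += int(truth == pred)
    return sum(hits[c] / totals[c] for c in totals) / len(totals)


row_567 = "It will provide heating in the form of hot water for the sawmill 's needs ."
print(avg_word_length(row_567), in_slice(row_567))
```

Filtering the dataset with `in_slice` and comparing `balanced_accuracy` on that subset against the full dataset is one way to check whether the reported gap holds for your own predictions.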
👉Robustness issues (2)
| Vulnerability | Level | Data slice | Metric | Transformation | Deviation |
|---|---|---|---|---|---|
| Robustness | medium 🟡 | | Fail rate = 0.086 | Add typos | 86/1000 tested samples (8.6%) changed prediction after perturbation |

🔍✨Examples

When feature “text” is perturbed with the transformation “Add typos”, the model changes its prediction in 8.6% of the cases. We expected the predictions not to be affected by this transformation.

| | text | Add typos(text) | Original prediction | Prediction after perturbation |
|---|---|---|---|---|
| 1919 | Cash flow from operations totalled EUR 7.4 mn , compared to a negative EUR 68.6 mn in the second quarter of 2008 . | Cash dlow from operations totalled WUR 7.4 mhn , comlared to a negative EUR 68.6 mn in the second wquarter of 2008 . | positive (p = 0.99) | negative (p = 1.00) |
| 1143 | A huge issue for us is the button placement . | A huge isue for us is the button placment . | negative (p = 0.98) | neutral (p = 1.00) |
| 2130 | Device volume in the area decreased by 21 % to 2.7 mn units . | Device volum in the area decreased by 21 % to 2.7 mn units . | negative (p = 1.00) | positive (p = 1.00) |

| Vulnerability | Level | Data slice | Metric | Transformation | Deviation |
|---|---|---|---|---|---|
| Robustness | medium 🟡 | | Fail rate = 0.075 | Transform to uppercase | 75/1000 tested samples (7.5%) changed prediction after perturbation |

🔍✨Examples

When feature “text” is perturbed with the transformation “Transform to uppercase”, the model changes its prediction in 7.5% of the cases. We expected the predictions not to be affected by this transformation.

| | text | Transform to uppercase(text) | Original prediction | Prediction after perturbation |
|---|---|---|---|---|
| 580 | Okmetic Board of Directors has also decided on a new share ownership program directed to the company 's top management . | OKMETIC BOARD OF DIRECTORS HAS ALSO DECIDED ON A NEW SHARE OWNERSHIP PROGRAM DIRECTED TO THE COMPANY 'S TOP MANAGEMENT . | neutral (p = 0.70) | positive (p = 0.79) |
| 823 | In the end of 2006 , the number of outlets will rise to 60-70 . | IN THE END OF 2006 , THE NUMBER OF OUTLETS WILL RISE TO 60-70 . | positive (p = 1.00) | negative (p = 1.00) |
| 1444 | The group reiterated its forecast that handset manufacturers will sell around 915 mln units this year globally . | THE GROUP REITERATED ITS FORECAST THAT HANDSET MANUFACTURERS WILL SELL AROUND 915 MLN UNITS THIS YEAR GLOBALLY . | neutral (p = 1.00) | positive (p = 1.00) |
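The fail rate reported for the robustness issues is the share of samples whose predicted label changes after a perturbation. The sketch below shows that computation under stated assumptions: `toy_predict` is a hypothetical stand-in for the real model, and `add_typos` is a simplified adjacent-character swap, not Giskard's actual typo transformation.

```python
import random
from typing import Callable, Sequence


def to_uppercase(text: str) -> str:
    """Uppercase perturbation, as in the report's second robustness test."""
    return text.upper()


def add_typos(text: str, rate: float = 0.1, seed: int = 0) -> str:
    """Simplified typo perturbation (assumption): swap two adjacent
    characters in some words longer than three characters."""
    rng = random.Random(seed)
    words = []
    for word in text.split(" "):
        if len(word) > 3 and rng.random() < rate:
            i = rng.randrange(len(word) - 1)
            word = word[:i] + word[i + 1] + word[i] + word[i + 2:]
        words.append(word)
    return " ".join(words)


def fail_rate(
    predict: Callable[[str], str],
    perturb: Callable[[str], str],
    texts: Sequence[str],
) -> float:
    """Fraction of texts whose predicted label changes after perturbation."""
    if not texts:
        return 0.0
    changed = sum(predict(t) != predict(perturb(t)) for t in texts)
    return changed / len(texts)


def toy_predict(text: str) -> str:
    """Hypothetical case-sensitive classifier, used only to exercise fail_rate."""
    if "rise" in text:
        return "positive"
    if "decreased" in text:
        return "negative"
    return "neutral"


samples = [
    "the number of outlets will rise to 60-70 .",
    "Device volume in the area decreased by 21 % .",
    "It will provide heating in the form of hot water .",
]
print(fail_rate(toy_predict, to_uppercase, samples))
```

To check the model itself, `toy_predict` would be replaced by a function that returns the top label from the actual classifier; the rest of the sketch stays the same.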

Check out the Giskard Space and test your model.

Disclaimer: it's important to note that automated scans may produce false positives or miss certain vulnerabilities. We encourage you to review the findings and assess the impact accordingly.
