Report for ProsusAI/finbert

#85
by giskard-bot - opened

Hi Team,

This is a report from Giskard Bot Scan 🐢.

We have identified 3 potential vulnerabilities in your model based on an automated scan.

This automated analysis evaluated the model on the dataset financial_phrasebank (subset sentences_allagree, split train).

👉Performance issues (2)

| Vulnerability | Level | Data slice | Metric | Transformation | Deviation |
|---|---|---|---|---|---|
| Performance | medium 🟡 | `avg_whitespace(text)` < 0.160 AND `avg_whitespace(text)` >= 0.156 | Precision = 0.917 | | -5.67% than global |

🔍✨Examples

For records in the dataset where `avg_whitespace(text)` < 0.160 AND `avg_whitespace(text)` >= 0.156, the Precision is 5.67% lower than the global Precision.

| | text | avg_whitespace(text) | label | Predicted label |
|---|---|---|---|---|
| 533 | According to Finnair Technical Services , the measure is above all due to the employment situation . | 0.16 | neutral | positive (p = 0.50) |
| 841 | Previously , EB delivered a custom solution for LG Electronics and now is making it commercially available for other mobile terminal vendors as well as to wireless operators . | 0.16 | positive | neutral (p = 0.76) |
| 1062 | The contract covers turnkey deliveries to all five airports operated by the authority -- John F Kennedy , LaGuardia , Newark , Teterboro and Stewart International . | 0.158537 | neutral | positive (p = 0.70) |

| Vulnerability | Level | Data slice | Metric | Transformation | Deviation |
|---|---|---|---|---|---|
| Performance | medium 🟡 | `text_length(text)` >= 99.500 AND `text_length(text)` < 106.500 | Precision = 0.921 | | -5.22% than global |

🔍✨Examples

For records in the dataset where `text_length(text)` >= 99.500 AND `text_length(text)` < 106.500, the Precision is 5.22% lower than the global Precision.

| | text | text_length(text) | label | Predicted label |
|---|---|---|---|---|
| 533 | According to Finnair Technical Services , the measure is above all due to the employment situation . | 100 | neutral | positive (p = 0.50) |
| 1413 | The contract also includes installation work in a new multistorey carpark for close on 1,000 vehicles . | 103 | neutral | positive (p = 0.55) |
| 1496 | The repo rate will gradually reach 2 % at the end of 2010 , according to Nordea 's Economic Outlook . | 101 | neutral | positive (p = 0.51) |

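
To make the two slice features above easier to check by hand, here is a minimal sketch of how they could be computed. The definitions are assumptions inferred from the values in the report (character count for `text_length`, share of whitespace characters for `avg_whitespace`), not Giskard's actual implementation. The helper `implied_global` is hypothetical and assumes the reported deviation is relative to the global metric.

```python
# Sketch only: these definitions are assumptions inferred from the numbers in
# the report, not Giskard's actual implementation.


def text_length(text: str) -> int:
    """Number of characters in the text."""
    return len(text)


def avg_whitespace(text: str) -> float:
    """Share of characters in the text that are whitespace."""
    if not text:
        return 0.0
    return sum(ch.isspace() for ch in text) / len(text)


def in_length_slice(text: str) -> bool:
    """Membership test for the text_length slice bounds shown in the report."""
    return 99.5 <= text_length(text) < 106.5


def implied_global(slice_metric: float, deviation_pct: float) -> float:
    """Global metric implied by a slice metric and a relative deviation in percent.

    Hypothetical helper: assumes deviation = (slice - global) / global.
    """
    return slice_metric / (1 + deviation_pct / 100)


# Record 533 from the examples above.
ROW_533 = (
    "According to Finnair Technical Services , the measure is above all "
    "due to the employment situation ."
)

print(text_length(ROW_533))                      # 100
print(avg_whitespace(ROW_533))                   # 0.16
print(in_length_slice(ROW_533))                  # True
print(round(implied_global(0.917, -5.67), 3))    # 0.972
print(round(implied_global(0.921, -5.22), 3))    # 0.972
```

Under these assumptions, record 533 reproduces both reported feature values, and both slices point to the same global Precision of roughly 0.972. Note that the whitespace share of record 533 is exactly 0.16, which sits on the displayed upper bound of 0.160, so the slice bounds in the report are presumably rounded for display.
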
👉Robustness issues (1)

| Vulnerability | Level | Data slice | Metric | Transformation | Deviation |
|---|---|---|---|---|---|
| Robustness | medium 🟡 | | Fail rate = 0.062 | Add typos | 62/1000 tested samples (6.2%) changed prediction after perturbation |

🔍✨Examples

When feature “text” is perturbed with the transformation “Add typos”, the model changes its prediction in 6.2% of the cases. We expected the predictions not to be affected by this transformation.

| | text | Add typos(text) | Original prediction | Prediction after perturbation |
|---|---|---|---|---|
| 1919 | Cash flow from operations totalled EUR 7.4 mn , compared to a negative EUR 68.6 mn in the second quarter of 2008 . | Cash dlow from operations totalled WUR 7.4 mhn , comlared to a negative EUR 68.6 mn in the second wquarter of 2008 . | positive (p = 0.86) | negative (p = 0.95) |
| 1584 | `` These developments partly reflect the government 's higher activity in the field of dividend policy . '' | `` These developmsents partly reflect he government 's higher activity in the field of dividend policty . '' | positive (p = 0.75) | neutral (p = 0.58) |
| 628 | The share of the share capital of both above mentioned shareholders remains below 5 % . | The share of the share capiral of both above mentioned sharehooders remains below 5 % . | neutral (p = 0.72) | negative (p = 0.83) |
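
The fail rate reported above is the share of tested samples whose predicted label changes after perturbation (62/1000 = 0.062). A rough sketch of that calculation follows. The `add_typo` perturbation is a simplified, hypothetical stand-in (one adjacent-letter swap) and not Giskard's "Add typos" transformation, and `predict` is a placeholder for any function that maps a text to a label.

```python
import random
from typing import Callable, Sequence


def add_typo(text: str, rng: random.Random) -> str:
    """Swap one random pair of adjacent letters.

    Hypothetical, simplified typo model; not Giskard's "Add typos" transformation.
    """
    positions = [
        i for i in range(len(text) - 1)
        if text[i].isalpha() and text[i + 1].isalpha()
    ]
    if not positions:
        return text
    i = rng.choice(positions)
    return text[:i] + text[i + 1] + text[i] + text[i + 2:]


def fail_rate(
    predict: Callable[[str], str],
    texts: Sequence[str],
    perturb: Callable[[str], str],
) -> float:
    """Share of texts whose predicted label changes after perturbation."""
    if not texts:
        raise ValueError("texts must not be empty")
    changed = sum(predict(t) != predict(perturb(t)) for t in texts)
    return changed / len(texts)


# Toy usage with a stand-in classifier (an assumption for illustration only).
def toy_predict(text: str) -> str:
    return "positive" if "EUR" in text else "neutral"


rng = random.Random(0)
samples = ["Cash flow totalled EUR 7.4 mn .", "The share remains below 5 % ."]
rate = fail_rate(toy_predict, samples, lambda t: add_typo(t, rng))
print(f"fail rate: {rate:.3f}")
```

To reproduce a figure comparable to the one in the report, `toy_predict` would be replaced by a wrapper around the model's predicted label, and the perturbation by the one the scan actually used.
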

Check out the Giskard Space and test your model.

Disclaimer: it's important to note that automated scans may produce false positives or miss certain vulnerabilities. We encourage you to review the findings and assess the impact accordingly.
