Report for mrm8488/distilroberta-finetuned-financial-news-sentiment-analysis

#6
by giskard-bot - opened

Hi Team,

This is a report from Giskard Bot Scan 🐢.

We have identified 3 potential vulnerabilities in your model based on an automated scan.

This automated analysis evaluated the model on the dataset financial_phrasebank (subset sentences_50agree, split train).

You can find the full version of the scan report here.

👉Robustness issues (3)

When feature “text” is perturbed with the transformation “Transform to uppercase”, the model changes its prediction in 32.2% of the cases. We expected the predictions not to be affected by this transformation.

| Level | Metric | Transformation | Deviation |
| --- | --- | --- | --- |
| major 🔴 | Fail rate = 0.322 | Transform to uppercase | 322/1000 tested samples (32.2%) changed prediction after perturbation |

Taxonomy

avid-effect:performance:P0201
🔍✨Examples

| | text | Transform to uppercase(text) | Original prediction | Prediction after perturbation |
| --- | --- | --- | --- | --- |
| 996 | These moderate but significant changes resulted in a significant 24-32 % reduction in the estimated CVD risk . | THESE MODERATE BUT SIGNIFICANT CHANGES RESULTED IN A SIGNIFICANT 24-32 % REDUCTION IN THE ESTIMATED CVD RISK . | positive (p = 1.00) | neutral (p = 1.00) |
| 300 | The stock rose for a second day on Wednesday bringing its two-day rise to GBX12 .0 or 2.0 % . | THE STOCK ROSE FOR A SECOND DAY ON WEDNESDAY BRINGING ITS TWO-DAY RISE TO GBX12 .0 OR 2.0 % . | positive (p = 1.00) | neutral (p = 1.00) |
| 4737 | In food trade , sales amounted to EUR320 .1 m , a decline of 1.1 % . | IN FOOD TRADE , SALES AMOUNTED TO EUR320 .1 M , A DECLINE OF 1.1 % . | negative (p = 1.00) | neutral (p = 1.00) |
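
For readers who want to reproduce the metric, here is a minimal sketch of how a fail rate like the one above could be computed. The report does not describe Giskard's internals, so this assumes the fail rate is simply the number of samples whose predicted label changes divided by the number of samples tested (consistent with 322/1000 = 0.322). `fail_rate` and `toy_predict` are hypothetical names, and `toy_predict` is a stand-in classifier, not the real model.

```python
from typing import Callable, Sequence


def fail_rate(
    predict: Callable[[str], str],
    transform: Callable[[str], str],
    texts: Sequence[str],
) -> float:
    """Fraction of texts whose predicted label changes after `transform`."""
    if not texts:
        raise ValueError("texts must not be empty")
    changed = sum(predict(t) != predict(transform(t)) for t in texts)
    return changed / len(texts)


# Hypothetical stand-in classifier (NOT the scanned model): it looks for
# lowercase cue words, so it is case-sensitive by construction.
def toy_predict(text: str) -> str:
    if "rose" in text:
        return "positive"
    if "decline" in text:
        return "negative"
    return "neutral"


samples = [
    "The stock rose for a second day on Wednesday .",
    "In food trade , sales amounted to EUR320 .1 m , a decline of 1.1 % .",
    "The company is based in Helsinki .",
]

# Uppercasing hides the cue words from the toy classifier in two of three samples.
print(fail_rate(toy_predict, str.upper, samples))
```

To check the real model, `toy_predict` would be replaced by a function that returns the model's top label for a string.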

When feature “text” is perturbed with the transformation “Transform to title case”, the model changes its prediction in 9.5% of the cases. We expected the predictions not to be affected by this transformation.

| Level | Metric | Transformation | Deviation |
| --- | --- | --- | --- |
| medium 🟡 | Fail rate = 0.095 | Transform to title case | 95/1000 tested samples (9.5%) changed prediction after perturbation |

Taxonomy

avid-effect:performance:P0201
🔍✨Examples

| | text | Transform to title case(text) | Original prediction | Prediction after perturbation |
| --- | --- | --- | --- | --- |
| 4737 | In food trade , sales amounted to EUR320 .1 m , a decline of 1.1 % . | In Food Trade , Sales Amounted To Eur320 .1 M , A Decline Of 1.1 % . | negative (p = 1.00) | neutral (p = 0.99) |
| 4512 | Bioheapleaching makes extraction of metals from low grade ore economically viable . | Bioheapleaching Makes Extraction Of Metals From Low Grade Ore Economically Viable . | positive (p = 1.00) | neutral (p = 0.92) |
| 2222 | The earnings in the comparative period included a capital gain of EUR 8mn from the sale of OMX shares . | The Earnings In The Comparative Period Included A Capital Gain Of Eur 8Mn From The Sale Of Omx Shares . | positive (p = 0.98) | neutral (p = 0.97) |
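
The title-cased examples above look the way Python's built-in `str.title()` would render them: besides capitalizing each word, it lowercases the remaining letters, so currency codes and tickers such as `EUR` and `OMX` become `Eur` and `Omx`. Whether the scan uses `str.title()` internally is an assumption here. The sketch below only illustrates that side effect, and `lowered_tokens` is a hypothetical helper.

```python
def lowered_tokens(text: str) -> list[tuple[str, str]]:
    """Return (original, title-cased) pairs for tokens in which title-casing
    turned at least one uppercase letter into lowercase."""
    pairs = []
    for token in text.split():
        titled = token.title()
        if any(a.isupper() and b.islower() for a, b in zip(token, titled)):
            pairs.append((token, titled))
    return pairs


sentence = (
    "The earnings in the comparative period included a capital gain "
    "of EUR 8mn from the sale of OMX shares ."
)

print(sentence.title())
print(lowered_tokens(sentence))  # [('EUR', 'Eur'), ('OMX', 'Omx')]
```

Tokens flagged this way are candidates for explaining why a title-cased sentence is tokenized differently from the original, although the report itself does not attribute the prediction changes to any specific token.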

When feature “text” is perturbed with the transformation “Add typos”, the model changes its prediction in 9.4% of the cases. We expected the predictions not to be affected by this transformation.

| Level | Metric | Transformation | Deviation |
| --- | --- | --- | --- |
| medium 🟡 | Fail rate = 0.094 | Add typos | 94/1000 tested samples (9.4%) changed prediction after perturbation |

Taxonomy

avid-effect:performance:P0201
🔍✨Examples

| | text | Add typos(text) | Original prediction | Prediction after perturbation |
| --- | --- | --- | --- | --- |
| 296 | The German company has also signed a code share agreement with another Oneworld member -- American Airlines Inc , part of US-based AMR Corp ( NYSE : AMR ) . | The German xompany has aso signed a code dshare ageement wjtgh anothef Onewold member -- American Airines Inc , part of USbased AMF Corp ( NYSE : AMR ) . | positive (p = 1.00) | neutral (p = 0.99) |
| 678 | ADPnews - Feb 5 , 2010 - Finnish real estate investor Sponda Oyj HEL : SDA1V said today that it slipped to a net loss of EUR 81.5 million USD 11.8 m in 2009 from a profit of EUR 29.3 million in 2008 . | ADPnews - Fbe 5 , 2010 - innish real estate investor Sponda Oyk HEL : SDA1V said today that it slippes to a net lsos of EUR 81.5 million USD 11.8 m in 2009 from a profit of EUR 29.3 million in 2008 . | negative (p = 1.00) | neutral (p = 0.92) |
| 3533 | As a result , the number of personnel in Finland will be reduced by 158 . | As a result , rthe number of personnel in Finland wull be redhuced by 158 . | negative (p = 0.99) | neutral (p = 1.00) |
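
The examples above mix deleted, swapped, substituted, and inserted characters. The sketch below shows one simple way to generate comparable noise for a local robustness check. It is not Giskard's implementation: `add_typos`, its `rate` and `seed` parameters, and the choice of three edit operations are all assumptions made for illustration.

```python
import random
import string


def add_typos(text: str, rate: float = 0.2, seed: int = 0) -> str:
    """Apply one random character edit to a fraction of the alphabetic words."""
    rng = random.Random(seed)  # seeded so the perturbation is reproducible
    words = []
    for word in text.split(" "):
        # Leave numbers, punctuation, and short words untouched.
        if word.isalpha() and len(word) >= 4 and rng.random() < rate:
            i = rng.randrange(len(word) - 1)
            op = rng.choice(("delete", "swap", "insert"))
            if op == "delete":
                word = word[:i] + word[i + 1:]
            elif op == "swap":
                word = word[:i] + word[i + 1] + word[i] + word[i + 2:]
            else:
                word = word[:i] + rng.choice(string.ascii_lowercase) + word[i:]
        words.append(word)
    return " ".join(words)


original = "As a result , the number of personnel in Finland will be reduced by 158 ."
print(add_typos(original, rate=0.5, seed=42))
```

Feeding both `original` and the perturbed string to the model and comparing the predicted labels gives a per-sample version of the check reported above.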

Check out the Giskard Space and Giskard Documentation to learn more about how to test your model.

Disclaimer: it's important to note that automated scans may produce false positives or miss certain vulnerabilities. We encourage you to review the findings and assess the impact accordingly.
