Report for ProsusAI/finbert

#73
by inoki-giskard - opened

Hey Team!🤗✨
We’re thrilled to share some amazing evaluation results that’ll make your day!🎉📊

We have identified 3 potential vulnerabilities in your model based on an automated scan.

This automated analysis evaluated the model on the dataset financial_phrasebank (subset sentences_allagree, split train).

👉Robustness issues (1)

| Vulnerability | Level | Data slice | Metric | Transformation | Deviation |
|---|---|---|---|---|---|
| Robustness | medium 🟡 | | Fail rate = 0.062 | Add typos | 62/1000 tested samples (6.2%) changed prediction after perturbation |

🔍✨Examples

When feature “text” is perturbed with the transformation “Add typos”, the model changes its prediction in 6.2% of the cases. We expected the predictions not to be affected by this transformation.

| | text | Add typos(text) | Original prediction | Prediction after perturbation |
|---|---|---|---|---|
| 1919 | Cash flow from operations totalled EUR 7.4 mn , compared to a negative EUR 68.6 mn in the second quarter of 2008 . | Cash dlow from operations totalled WUR 7.4 mhn , comlared to a negative EUR 68.6 mn in the second wquarter of 2008 . | positive (p = 0.86) | negative (p = 0.95) |
| 1584 | `` These developments partly reflect the government 's higher activity in the field of dividend policy . '' | `` These developmsents partly reflect he government 's higher activity in the field of dividend policty . '' | positive (p = 0.75) | neutral (p = 0.58) |
| 628 | The share of the share capital of both above mentioned shareholders remains below 5 % . | The share of the share capiral of both above mentioned sharehooders remains below 5 % . | neutral (p = 0.72) | negative (p = 0.83) |

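
To make the fail-rate metric concrete, here is a minimal sketch of how such a number can be computed: perturb each text, compare the predicted label before and after, and divide the number of changed predictions by the number of tested samples. The helpers `add_typos`, `fail_rate` and `toy_predict` are hypothetical names written for this illustration; they are not Giskard's implementation, and `toy_predict` is only a stand-in for the real model.

```python
import random

def add_typos(text, rate=0.05, seed=0):
    """Replace each letter with a random lowercase letter with probability `rate`.

    Hypothetical perturbation for illustration; the scan's own "Add typos"
    transformation may work differently.
    """
    rng = random.Random(seed)
    out = []
    for ch in text:
        if ch.isalpha() and rng.random() < rate:
            # The replacement may coincide with the original letter.
            out.append(rng.choice("abcdefghijklmnopqrstuvwxyz"))
        else:
            out.append(ch)
    return "".join(out)

def fail_rate(predict, texts, perturb):
    """Share of texts whose predicted label changes after perturbation."""
    changed = sum(predict(t) != predict(perturb(t)) for t in texts)
    return changed / len(texts)

def toy_predict(text):
    """Hypothetical stand-in classifier, NOT the model under test."""
    return "negative" if "negative" in text.lower() else "neutral"

# The figure in the report is the same ratio: 62 changed out of 1000 tested.
reported_fail_rate = 62 / 1000
```

To apply the same idea to the real model, `toy_predict` would be swapped for a function that returns the model's top label for a text.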
👉Performance issues (2)

| Vulnerability | Level | Data slice | Metric | Transformation | Deviation |
|---|---|---|---|---|---|
| Performance | medium 🟡 | `avg_whitespace(text)` < 0.160 AND `avg_whitespace(text)` >= 0.156 | Precision = 0.917 | | 5.67% lower than global |

🔍✨Examples

For records in the dataset where `avg_whitespace(text)` < 0.160 AND `avg_whitespace(text)` >= 0.156, the Precision is 5.67% lower than the global Precision.

| | text | avg_whitespace(text) | label | Predicted label |
|---|---|---|---|---|
| 533 | According to Finnair Technical Services , the measure is above all due to the employment situation . | 0.16 | neutral | positive (p = 0.50) |
| 841 | Previously , EB delivered a custom solution for LG Electronics and now is making it commercially available for other mobile terminal vendors as well as to wireless operators . | 0.16 | positive | neutral (p = 0.76) |
| 1062 | The contract covers turnkey deliveries to all five airports operated by the authority -- John F Kennedy , LaGuardia , Newark , Teterboro and Stewart International . | 0.158537 | neutral | positive (p = 0.70) |

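
The report does not define `avg_whitespace(text)`. A minimal sketch, assuming it is the fraction of characters in the text that are whitespace, reproduces the values shown for rows 533 and 1062. The function below is an illustration under that assumption, not Giskard's code.

```python
def avg_whitespace(text):
    """Fraction of characters that are whitespace (assumed definition)."""
    if not text:
        return 0.0
    return sum(ch.isspace() for ch in text) / len(text)

row_533 = ("According to Finnair Technical Services , the measure is "
           "above all due to the employment situation .")
row_1062 = ("The contract covers turnkey deliveries to all five airports "
            "operated by the authority -- John F Kennedy , LaGuardia , "
            "Newark , Teterboro and Stewart International .")

# 16 spaces in 100 characters, and 26 spaces in 164 characters.
value_533 = round(avg_whitespace(row_533), 6)
value_1062 = round(avg_whitespace(row_1062), 6)
```

Under this definition row 533 comes out at exactly 0.16, which is not strictly below the printed upper bound of 0.160, so the slice bounds in the report are presumably rounded for display.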

| Vulnerability | Level | Data slice | Metric | Transformation | Deviation |
|---|---|---|---|---|---|
| Performance | medium 🟡 | `text_length(text)` >= 99.500 AND `text_length(text)` < 106.500 | Precision = 0.921 | | 5.22% lower than global |

🔍✨Examples

For records in the dataset where `text_length(text)` >= 99.500 AND `text_length(text)` < 106.500, the Precision is 5.22% lower than the global Precision.

| | text | text_length(text) | label | Predicted label |
|---|---|---|---|---|
| 533 | According to Finnair Technical Services , the measure is above all due to the employment situation . | 100 | neutral | positive (p = 0.50) |
| 1413 | The contract also includes installation work in a new multistorey carpark for close on 1,000 vehicles . | 103 | neutral | positive (p = 0.55) |
| 1496 | The repo rate will gradually reach 2 % at the end of 2010 , according to Nordea 's Economic Outlook . | 101 | neutral | positive (p = 0.51) |

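
The report gives each slice's Precision and its deviation, but not the global Precision itself. A rough sketch, assuming the deviation is relative, i.e. (slice - global) / global, lets us back out the global value from both performance findings. The two estimates agree at about 0.972, which is consistent with that assumption but does not prove it. All function names here are hypothetical.

```python
def relative_deviation(slice_metric, global_metric):
    """Relative difference of a slice metric from the global metric."""
    return (slice_metric - global_metric) / global_metric

def implied_global(slice_metric, deviation):
    """Invert relative_deviation to recover the global metric."""
    return slice_metric / (1 + deviation)

def in_length_slice(text, low=99.5, high=106.5):
    """True if the text length falls in the reported slice [low, high)."""
    return low <= len(text) < high

# Figures from the two performance findings in this report.
global_from_whitespace_slice = implied_global(0.917, -0.0567)
global_from_length_slice = implied_global(0.921, -0.0522)
```

Both values round to 0.972, so the two findings appear to be measured against the same global Precision.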

Disclaimer: it's important to note that automated scans may produce false positives or miss certain vulnerabilities. We encourage you to review the findings and assess the impact accordingly.

💡 What's Next?

  • Check out the Giskard Space and improve your model.
  • The Giskard community is always buzzing with ideas. 🐢🤔 What do you want to see next? Your feedback is our favorite fuel, so drop your thoughts in the community forum! 🗣️💬 Together, we're building something extraordinary.

🙌 Big Thanks!

We're grateful to have you on this adventure with us. 🚀🌟 Here's to more breakthroughs, laughter, and code magic! 🥂✨ Keep hugging that code and spreading the love! 💻 #Giskard #Huggingface #AISafety 🌈👏 Your enthusiasm, feedback, and contributions are what we seek. 🌟 Keep being awesome!
