Report for soleimanian/financial-roberta-large-sentiment on financial_phrasebank (sentences_allagree, train set)

#2
by giskard-bot (Giskard org)
Performance issues (1)
| Vulnerability | Level | Data slice | Metric | Transformation | Deviation | Description |
|---|---|---|---|---|---|---|
| Performance | medium | avg_word_length(text) < 3.860 AND avg_word_length(text) >= 3.699 | Balanced Accuracy = 0.892 | | -5.29% than global | For records in the dataset where avg_word_length(text) < 3.860 AND avg_word_length(text) >= 3.699, the Balanced Accuracy is 5.29% lower than the global Balanced Accuracy. |
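To reproduce this kind of slice check outside the scan, the sketch below computes balanced accuracy (the mean of per-class recall) on the reported slice and compares it with the global score. It is only an illustration under stated assumptions: `avg_word_length` here splits on whitespace, which may differ from how the scan derives the feature; `slice_report` is a hypothetical helper name; and the deviation is taken to be relative to the global score.

```python
from collections import defaultdict


def avg_word_length(text: str) -> float:
    """Mean characters per whitespace-separated word (assumed definition)."""
    words = text.split()
    return sum(len(w) for w in words) / len(words) if words else 0.0


def balanced_accuracy(y_true, y_pred) -> float:
    """Balanced accuracy: recall computed per class, then averaged."""
    hits, totals = defaultdict(int), defaultdict(int)
    for truth, pred in zip(y_true, y_pred):
        totals[truth] += 1
        hits[truth] += int(truth == pred)
    if not totals:
        raise ValueError("cannot score an empty set of records")
    return sum(hits[c] / totals[c] for c in totals) / len(totals)


def slice_report(texts, y_true, y_pred, low=3.699, high=3.860):
    """Score the slice low <= avg_word_length(text) < high against the full set.

    Hypothetical helper: the default bounds are the ones quoted in the report.
    """
    idx = [i for i, t in enumerate(texts) if low <= avg_word_length(t) < high]
    global_score = balanced_accuracy(y_true, y_pred)
    slice_score = balanced_accuracy(
        [y_true[i] for i in idx], [y_pred[i] for i in idx]
    )
    return {
        "n_records": len(idx),
        "slice": slice_score,
        "global": global_score,
        # Assumption: deviation is expressed relative to the global score.
        "relative_deviation": (slice_score - global_score) / global_score,
    }
```

Calling `slice_report(texts, labels, predictions)` with the dataset sentences, gold labels, and model predictions returns the slice size alongside both scores, which makes it easy to see whether the slice is large enough for the gap to be meaningful.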
Robustness issues (2)
| Vulnerability | Level | Data slice | Metric | Transformation | Deviation | Description |
|---|---|---|---|---|---|---|
| Robustness | medium | | Fail rate = 0.075 | Transform to uppercase | 75/1000 tested samples (7.5%) changed prediction after perturbation | When feature “text” is perturbed with the transformation “Transform to uppercase”, the model changes its prediction in 7.5% of the cases. We expected the predictions not to be affected by this transformation. |
| Robustness | medium | | Fail rate = 0.071 | Add typos | 71/1000 tested samples (7.1%) changed prediction after perturbation | When feature “text” is perturbed with the transformation “Add typos”, the model changes its prediction in 7.1% of the cases. We expected the predictions not to be affected by this transformation. |
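The fail rate reported above is the share of tested samples whose predicted label changes once the text is perturbed (for example, 75/1000 = 0.075). A minimal sketch of that measurement follows. It makes several assumptions: `predict` is a hypothetical callable that maps a list of strings to a list of labels; `add_typos` is a simplified stand-in that swaps adjacent letters and is not the scan's own typo transformation; and every sample is counted in the denominator, which may differ from how the scan selects samples.

```python
import random


def to_uppercase(text: str) -> str:
    """Uppercase perturbation."""
    return text.upper()


def add_typos(text: str, rate: float = 0.05, seed: int = 0) -> str:
    """Simplified typo perturbation: swap some adjacent letter pairs.

    Assumption: an illustrative stand-in, not the scan's transformation.
    """
    rng = random.Random(seed)
    chars = list(text)
    i = 0
    while i < len(chars) - 1:
        if chars[i].isalpha() and chars[i + 1].isalpha() and rng.random() < rate:
            chars[i], chars[i + 1] = chars[i + 1], chars[i]
            i += 2  # skip past the swapped pair so it is not swapped back
        else:
            i += 1
    return "".join(chars)


def fail_rate(predict, texts, transform) -> float:
    """Fraction of samples whose predicted label changes after `transform`."""
    if not texts:
        raise ValueError("need at least one sample")
    original = predict(texts)
    perturbed = predict([transform(t) for t in texts])
    changed = sum(o != p for o, p in zip(original, perturbed))
    return changed / len(texts)
```

With a real classifier, `predict` would wrap the model's inference call and return one label per input; running `fail_rate(predict, sentences, to_uppercase)` and `fail_rate(predict, sentences, add_typos)` on the same sample then gives numbers comparable in spirit to the two rows above.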
