Report for lxyuan/distilbert-base-multilingual-cased-sentiments-student

#3
by inoki-giskard - opened
Overconfidence issues (1)
| Vulnerability | Level | Data slice | Metric | Transformation | Deviation | Description |
|---|---|---|---|---|---|---|
| Overconfidence | medium | avg_digits(text) < 0.011 | Overconfidence rate = 0.291 | | +18.82% vs. global | For records in the dataset where avg_digits(text) < 0.011, we found a significantly higher number of overconfident wrong predictions (183 samples, corresponding to 29.09% of the wrong predictions in the data slice). |

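To make the overconfidence finding easier to reproduce, here is a minimal sketch of the two quantities it relies on. The report does not spell out its definitions, so both are assumptions: `avg_digits` is taken to be the fraction of characters that are digits, and a wrong prediction is counted as overconfident when the probability of the predicted label exceeds the probability of the true label by more than a hypothetical `threshold`.

```python
from typing import Sequence


def avg_digits(text: str) -> float:
    """Fraction of characters in `text` that are digits (assumed definition)."""
    if not text:
        return 0.0
    return sum(ch.isdigit() for ch in text) / len(text)


def overconfidence_rate(
    probs: Sequence[Sequence[float]],
    labels: Sequence[int],
    threshold: float = 0.5,  # hypothetical margin, not taken from the report
) -> float:
    """Share of wrong predictions whose confidence margin exceeds `threshold`."""
    wrong = 0
    overconfident = 0
    for p, y in zip(probs, labels):
        pred = max(range(len(p)), key=p.__getitem__)  # index of the top class
        if pred == y:
            continue  # only wrong predictions enter the rate
        wrong += 1
        if p[pred] - p[y] > threshold:
            overconfident += 1
    return overconfident / wrong if wrong else 0.0
```

Under these assumptions, the data slice in the table would be the rows where `avg_digits(text) < 0.011`, and the rate would be computed once on that slice and once on the full dataset for comparison.
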
Robustness issues (5)
| Vulnerability | Level | Data slice | Metric | Transformation | Deviation | Description |
|---|---|---|---|---|---|---|
| Robustness | major | | Fail rate = 0.393 | Transform to uppercase | 393/1000 tested samples (39.3%) changed prediction after perturbation | When feature “text” is perturbed with the transformation “Transform to uppercase”, the model changes its prediction in 39.3% of the cases. We expected the predictions not to be affected by this transformation. |
| Robustness | major | | Fail rate = 0.307 | Transform to title case | 307/1000 tested samples (30.7%) changed prediction after perturbation | When feature “text” is perturbed with the transformation “Transform to title case”, the model changes its prediction in 30.7% of the cases. We expected the predictions not to be affected by this transformation. |
| Robustness | major | | Fail rate = 0.153 | Add typos | 153/1000 tested samples (15.3%) changed prediction after perturbation | When feature “text” is perturbed with the transformation “Add typos”, the model changes its prediction in 15.3% of the cases. We expected the predictions not to be affected by this transformation. |
| Robustness | major | | Fail rate = 0.144 | Transform to lowercase | 144/1000 tested samples (14.4%) changed prediction after perturbation | When feature “text” is perturbed with the transformation “Transform to lowercase”, the model changes its prediction in 14.4% of the cases. We expected the predictions not to be affected by this transformation. |
| Robustness | medium | | Fail rate = 0.092 | Punctuation Removal | 92/1000 tested samples (9.2%) changed prediction after perturbation | When feature “text” is perturbed with the transformation “Punctuation Removal”, the model changes its prediction in 9.2% of the cases. We expected the predictions not to be affected by this transformation. |

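The fail rates above are the share of samples whose predicted label changes once the text is perturbed. A rough sketch of that measurement follows. The `fail_rate` helper and the `toy_predict` classifier are hypothetical stand-ins, not the scanner's own code, and the sketch counts every sample, whereas the scan may skip texts that the transformation leaves unchanged.

```python
from typing import Callable, List, Sequence


def fail_rate(
    predict: Callable[[List[str]], List[str]],
    texts: Sequence[str],
    transform: Callable[[str], str],
) -> float:
    """Fraction of texts whose predicted label changes after `transform`."""
    if not texts:
        raise ValueError("texts must not be empty")
    original = predict(list(texts))
    perturbed = predict([transform(t) for t in texts])
    changed = sum(o != p for o, p in zip(original, perturbed))
    return changed / len(texts)


def toy_predict(texts: List[str]) -> List[str]:
    """Hypothetical case-sensitive classifier, used only to exercise fail_rate."""
    return ["positive" if "good" in t else "negative" for t in texts]


samples = ["good film", "bad film"]
uppercase_fail_rate = fail_rate(toy_predict, samples, str.upper)
titlecase_fail_rate = fail_rate(toy_predict, samples, str.title)
lowercase_fail_rate = fail_rate(toy_predict, samples, str.lower)
```

To try this against the actual model, `predict` could wrap a `transformers` text-classification pipeline loaded with `lxyuan/distilbert-base-multilingual-cased-sentiments-student` and return the `label` field of each result.
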
Performance issues (1)
| Vulnerability | Level | Data slice | Metric | Transformation | Deviation | Description |
|---|---|---|---|---|---|---|
| Performance | medium | text contains "friday" | Precision = 0.432 | | -7.05% vs. global | For records in the dataset where text contains "friday", the Precision is 7.05% lower than the global Precision. |

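The performance finding compares precision on a keyword slice against precision on the whole dataset. The sketch below illustrates that comparison under two assumptions that the report does not confirm: precision is computed for a single label treated as positive, and the deviation is relative to the global value. The function names and the case-insensitive keyword match are illustrative choices.

```python
from typing import Sequence, Tuple


def precision(y_true: Sequence[str], y_pred: Sequence[str], positive: str) -> float:
    """Precision for one label: true positives / all predictions of that label."""
    predicted = [t for t, p in zip(y_true, y_pred) if p == positive]
    if not predicted:
        return 0.0
    return sum(t == positive for t in predicted) / len(predicted)


def slice_vs_global(
    texts: Sequence[str],
    y_true: Sequence[str],
    y_pred: Sequence[str],
    keyword: str,
    positive: str,
) -> Tuple[float, float, float]:
    """Return (slice precision, global precision, relative deviation)."""
    # Keep only the rows whose text contains the keyword (case-insensitive).
    idx = [i for i, t in enumerate(texts) if keyword in t.lower()]
    slice_p = precision([y_true[i] for i in idx], [y_pred[i] for i in idx], positive)
    global_p = precision(y_true, y_pred, positive)
    # Assumed: deviation is expressed relative to the global metric.
    deviation = (slice_p - global_p) / global_p if global_p else 0.0
    return slice_p, global_p, deviation
```

A negative third value means the slice performs worse than the dataset as a whole, which is how the "friday" row above reads.
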
inoki-giskard changed discussion status to closed
