Your model seems to be sensitive to gender-, ethnicity-, or religion-based perturbations in the input data. These perturbations can include switching words from feminine to masculine, or changing countries or nationalities. This happens when:
To learn more about causes and solutions, check our guide on unethical behaviour.
| Feature | Perturbation | Fail rate | Details |
|---|---|---|---|
| `text` | Switch countries from high- to low-income and vice versa | 0.056 | 56/1000 tested samples (5.6%) changed prediction after perturbation; 1000 samples affected (8.1% of dataset) |
| `text` | Switch religion | 0.051 | 22/433 tested samples (5.1%) changed prediction after perturbation; 433 samples affected (3.5% of dataset) |
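The fail rate reported in each row is simply the fraction of tested samples whose prediction changed after the perturbation. A minimal sketch reproducing the figures above (the `fail_rate` helper is illustrative, not part of the Giskard API):

```python
# Fail rate = samples whose prediction changed / samples tested.
# `fail_rate` is an illustrative helper, not a Giskard API function.
def fail_rate(changed: int, tested: int) -> float:
    return changed / tested

print(round(fail_rate(56, 1000), 3))  # country perturbation -> 0.056
print(round(fail_rate(22, 433), 3))   # religion perturbation -> 0.051
```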
Install the Giskard hub app to:
You can find installation instructions here.
```python
from giskard import GiskardClient

# Create a test suite from your scan results
test_suite = results.generate_test_suite("My first test suite")

# Upload your test suite to your Giskard hub instance
client = GiskardClient("http://localhost:19000", "GISKARD_API_KEY")
client.create_project("my_project_id", "my_project_name")
test_suite.upload(client, "my_project_id")
```