{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Autism" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Autism, also known as Autism Spectrum Disorder (ASD), is a neurodevelopmental condition characterized by challenges in social interaction and communication and by repetitive behaviors. Individuals with autism show a wide range of abilities and symptoms, which is why the condition is described as a spectrum. This diversity makes autism difficult to classify accurately. Machine learning (ML) can support the task by learning patterns from screening data and predicting whether a case is likely to be classified as ASD." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Data\n", "\n", "ID - ID of the patient\n", "\n", "A1_Score to A10_Score - Scores for the ten items of the Autism Spectrum Quotient (AQ) 10-item screening tool\n", "\n", "age - Age of the patient in years\n", "\n", "gender - Gender of the patient\n", "\n", "ethnicity - Ethnicity of the patient\n", "\n", "jaundice - Whether the patient had jaundice at the time of birth\n", "\n", "austim - Whether an immediate family member has been diagnosed with autism (the column name is spelled this way in the dataset)\n", "\n", "contry_of_res - Country of residence of the patient\n", "\n", "used_app_before - Whether the patient has undergone a screening test before\n", "\n", "result - Score for AQ1-10 screening test\n", "\n", "age_desc - Age category of the patient (for example, 18 and more)\n", "\n", "relation - Relation to the patient of the person who completed the test\n", "\n", "Class/ASD - Classified result as 0 or 1, where 0 represents No and 1 represents Yes. This is the target column; submit predictions as 0 or 1 only." 
] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "import pandas as pd\n", "from sklearn.model_selection import train_test_split\n", "from sklearn.preprocessing import StandardScaler\n", "from sklearn.metrics import confusion_matrix, accuracy_score, classification_report\n", "from sklearn.linear_model import LogisticRegression \n", "import joblib\n" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "data=pd.read_csv(\"data.csv\")" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
|  | ID | gender | A1_Score | A2_Score | A3_Score | A4_Score | A5_Score | A6_Score | A7_Score | A8_Score | ... | jaundice | austim | age | ethnicity | contry_of_res | used_app_before | result | relation | age_desc | Class/ASD |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | f | 1 | 0 | 1 | 1 | 1 | 1 | 0 | 1 | ... | no | no | 18.605397 | White-European | United States | no | 7.819715 | Self | 18 and more | 0 |
| 1 | 2 | f | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | no | no | 13.829369 | South Asian | Australia | no | 10.544296 | ? | 18 and more | 0 |
| 2 | 3 | f | 1 | 1 | 1 | 1 | 1 | 1 | 0 | 0 | ... | no | no | 14.679893 | White-European | United Kingdom | no | 13.167506 | Self | 18 and more | 1 |
| 3 | 4 | f | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | ... | no | no | 61.035288 | South Asian | New Zealand | no | 1.530098 | ? | 18 and more | 0 |
| 4 | 5 | m | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | ... | no | yes | 14.256686 | Black | Italy | no | 7.949723 | Self | 18 and more | 0 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 795 | 796 | f | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | ... | no | yes | 42.084907 | White-European | United States | no | 13.390868 | Self | 18 and more | 1 |
| 796 | 797 | f | 1 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | ... | no | no | 17.669291 | Asian | New Zealand | no | 9.454201 | Self | 18 and more | 0 |
| 797 | 798 | m | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | ... | yes | no | 18.242557 | White-European | Jordan | no | 6.805509 | Self | 18 and more | 1 |
| 798 | 799 | f | 1 | 1 | 1 | 1 | 1 | 1 | 0 | 1 | ... | no | yes | 19.241473 | Middle Eastern | United States | no | 3.682732 | Relative | 18 and more | 0 |
| 799 | 800 | f | 1 | 0 | 0 | 1 | 1 | 0 | 0 | 1 | ... | no | no | 32.170098 | Asian | New Zealand | no | 12.060168 | Self | 18 and more | 0 |

800 rows × 22 columns

|  | ID | A1_Score | A2_Score | A3_Score | A4_Score | A5_Score | A6_Score | A7_Score | A8_Score | A9_Score | A10_Score | age | result | Class/ASD |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| count | 800.0000 | 800.000000 | 800.00000 | 800.000000 | 800.00000 | 800.000000 | 800.00000 | 800.000000 | 800.000000 | 800.000000 | 800.000000 | 800.000000 | 800.000000 | 800.000000 |
| mean | 400.5000 | 0.582500 | 0.28625 | 0.321250 | 0.41500 | 0.457500 | 0.20875 | 0.273750 | 0.717500 | 0.316250 | 0.460000 | 28.612306 | 7.058530 | 0.231250 |
| std | 231.0844 | 0.493455 | 0.45229 | 0.467249 | 0.49303 | 0.498502 | 0.40667 | 0.446161 | 0.450497 | 0.465303 | 0.498709 | 12.872373 | 3.788969 | 0.421896 |
| min | 1.0000 | 0.000000 | 0.00000 | 0.000000 | 0.00000 | 0.000000 | 0.00000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 9.560505 | -2.594654 | 0.000000 |
| 25% | 200.7500 | 0.000000 | 0.00000 | 0.000000 | 0.00000 | 0.000000 | 0.00000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 19.282082 | 4.527556 | 0.000000 |
| 50% | 400.5000 | 1.000000 | 0.00000 | 0.000000 | 0.00000 | 0.000000 | 0.00000 | 0.000000 | 1.000000 | 0.000000 | 0.000000 | 25.479960 | 6.893472 | 0.000000 |
| 75% | 600.2500 | 1.000000 | 1.00000 | 1.000000 | 1.00000 | 1.000000 | 0.00000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 33.154755 | 9.892981 | 0.000000 |
| max | 800.0000 | 1.000000 | 1.00000 | 1.000000 | 1.00000 | 1.000000 | 1.00000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 72.402488 | 13.390868 | 1.000000 |

|  | ID | gender | A1_Score | A2_Score | A3_Score | A4_Score | A5_Score | A6_Score | A7_Score | A8_Score | ... | jaundice | austim | age | ethnicity | contry_of_res | used_app_before | result | relation | age_desc | Class/ASD |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | f | 1 | 0 | 1 | 1 | 1 | 1 | 0 | 1 | ... | no | no | 18.605397 | White-European | United States | no | 7.819715 | Self | 18 and more | 0 |
| 1 | 2 | f | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | no | no | 13.829369 | South Asian | Australia | no | 10.544296 | ? | 18 and more | 0 |
| 2 | 3 | f | 1 | 1 | 1 | 1 | 1 | 1 | 0 | 0 | ... | no | no | 14.679893 | White-European | United Kingdom | no | 13.167506 | Self | 18 and more | 1 |
| 3 | 4 | f | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | ... | no | no | 61.035288 | South Asian | New Zealand | no | 1.530098 | ? | 18 and more | 0 |
| 4 | 5 | m | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | ... | no | yes | 14.256686 | Black | Italy | no | 7.949723 | Self | 18 and more | 0 |

5 rows × 22 columns

|  | ID | gender | A1_Score | A2_Score | A3_Score | A4_Score | A5_Score | A6_Score | A7_Score | A8_Score | ... | jaundice | austim | age | ethnicity | contry_of_res | used_app_before | result | relation | age_desc | Class/ASD |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 0 | 1 | 0 | 1 | 1 | 1 | 1 | 0 | 1 | ... | 0 | 0 | 18.605397 | 10 | 58 | 0 | 7.819715 | 5 | 0 | 0 |
| 1 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 13.829369 | 8 | 6 | 0 | 10.544296 | 0 | 0 | 0 |
| 2 | 3 | 0 | 1 | 1 | 1 | 1 | 1 | 1 | 0 | 0 | ... | 0 | 0 | 14.679893 | 10 | 57 | 0 | 13.167506 | 5 | 0 | 1 |
| 3 | 4 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 61.035288 | 8 | 39 | 0 | 1.530098 | 0 | 0 | 0 |
| 4 | 5 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | ... | 0 | 1 | 14.256686 | 2 | 32 | 0 | 7.949723 | 5 | 0 | 0 |

5 rows × 22 columns
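
The integer codes in the preview above (for example `f` becoming 0 and `?` becoming 0 in `relation`) are consistent with label encoding, although the cell that produced them is not shown here. A minimal sketch of that step, assuming scikit-learn's `LabelEncoder` was applied column by column; the three-row `sample` frame and the `categorical_columns` list are illustrative stand-ins, not the notebook's actual code:

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Tiny stand-in frame built from values visible in the preview (not data.csv).
sample = pd.DataFrame({
    "gender": ["f", "f", "m"],
    "relation": ["Self", "?", "Self"],
    "age": [18.605397, 13.829369, 14.256686],
})

# Hypothetical list of text columns to encode; numeric columns are left alone.
categorical_columns = ["gender", "relation"]

encoded = sample.copy()
for column in categorical_columns:
    # LabelEncoder assigns integer codes in sorted order of the unique values.
    encoded[column] = LabelEncoder().fit_transform(sample[column])

print(encoded)
```

Because the codes follow sort order, they carry no real ordinal meaning; one-hot encoding is a common alternative for columns such as `ethnicity` when using a linear model.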
LogisticRegression()
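
The `LogisticRegression()` output above is what a fitted estimator prints. The training cell itself is not shown, so the following is only a sketch of how the imported pieces (`train_test_split`, `StandardScaler`, `LogisticRegression`, `accuracy_score`, `confusion_matrix`) typically fit together. The data is synthetic and the labelling rule is hypothetical; neither comes from `data.csv`:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the features: ten binary screening answers per row.
rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(200, 10))
# Hypothetical labelling rule, used only so the toy problem is learnable.
y = (X.sum(axis=1) >= 6).astype(int)

# Hold out 20% of the rows, keeping the class ratio similar in both splits.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Fit the scaler on the training split only, then reuse it on the test split.
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

model = LogisticRegression()
model.fit(X_train_scaled, y_train)

y_pred = model.predict(X_test_scaled)
accuracy = accuracy_score(y_test, y_pred)
matrix = confusion_matrix(y_test, y_pred)
print(accuracy)
print(matrix)
```

With a positive rate of about 23% in `Class/ASD`, accuracy alone can look better than the model is; the confusion matrix or `classification_report` gives a clearer picture of the minority class.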

|  | gender | A1_Score | A2_Score | A3_Score | A4_Score | A5_Score | A6_Score | A7_Score | A8_Score | A9_Score | A10_Score | jaundice | austim | age | ethnicity | Class/ASD |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 1 | 0 | 1 | 1 | 1 | 1 | 0 | 1 | 1 | 1 | 0 | 0 | 18.605397 | 10 | 0 |
| 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 13.829369 | 8 | 0 |
| 2 | 0 | 1 | 1 | 1 | 1 | 1 | 1 | 0 | 0 | 1 | 1 | 0 | 0 | 14.679893 | 10 | 1 |
| 3 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 61.035288 | 8 | 0 |
| 4 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 1 | 0 | 1 | 14.256686 | 2 | 0 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 795 | 0 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 0 | 1 | 42.084907 | 10 | 1 |
| 796 | 0 | 1 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 17.669291 | 1 | 0 |
| 797 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 1 | 1 | 0 | 18.242557 | 10 | 1 |
| 798 | 0 | 1 | 1 | 1 | 1 | 1 | 1 | 0 | 1 | 1 | 1 | 0 | 1 | 19.241473 | 5 | 0 |
| 799 | 0 | 1 | 0 | 0 | 1 | 1 | 0 | 0 | 1 | 1 | 1 | 0 | 0 | 32.170098 | 1 | 0 |

800 rows × 16 columns
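
Compared with the 22-column frame shown earlier, this 16-column frame no longer contains `ID`, `contry_of_res`, `used_app_before`, `result`, `relation`, or `age_desc`. One way to arrive at such a frame, and then separate features from the target, is sketched below. It assumes a plain `DataFrame.drop` was used; the two-row `sample` frame is a stand-in that includes only a few of the retained columns:

```python
import pandas as pd

# Two-row stand-in with already-encoded values taken from the preview.
sample = pd.DataFrame({
    "ID": [1, 2],
    "gender": [0, 0],
    "A1_Score": [1, 0],
    "age": [18.605397, 13.829369],
    "contry_of_res": [58, 6],
    "used_app_before": [0, 0],
    "result": [7.819715, 10.544296],
    "relation": [5, 0],
    "age_desc": [0, 0],
    "Class/ASD": [0, 0],
})

# Columns present in the 22-column frame but absent from the 16-column one.
dropped_columns = [
    "ID", "contry_of_res", "used_app_before", "result", "relation", "age_desc",
]
reduced = sample.drop(columns=dropped_columns)

# Separate the feature matrix from the target column.
X = reduced.drop(columns="Class/ASD")
y = reduced["Class/ASD"]
print(list(X.columns))
```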