# Autism

Autism, also known as Autism Spectrum Disorder (ASD), is a neurodevelopmental condition characterized by challenges in social interaction and communication and by repetitive behaviors. Individuals with autism show a wide range of abilities and symptoms, which is why the condition is described as a spectrum. That diversity makes autism hard to classify accurately. Machine Learning (ML) can help by fitting a model to screening data and predicting a likely classification for new cases. This notebook trains a logistic regression classifier on AQ-10 screening responses and basic demographic fields.

## Data

ID - ID of the patient

A1_Score to A10_Score - Scores (0 or 1) for the 10 items of the Autism Spectrum Quotient (AQ-10) screening tool

age - Age of the patient in years

gender - Gender of the patient

ethnicity - Ethnicity of the patient

jaundice - Whether the patient had jaundice at the time of birth

austim - Whether an immediate family member has been diagnosed with autism (the column name is misspelled in the dataset)

contry_of_res - Country of residence of the patient

used_app_before - Whether the patient has undergone a screening test before

result - Score for AQ1-10 screening test

age_desc - Age group of the patient (for example, `18 and more`)

relation - Relation to the patient of the person who completed the test

Class/ASD - Classification result, where 0 means No and 1 means Yes. This is the target column; submissions must contain only the values 0 or 1.

In [1]:
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import confusion_matrix, accuracy_score, classification_report
from sklearn.linear_model import LogisticRegression 
import joblib


In [2]:
data=pd.read_csv("data.csv")

In [3]:
data

Unnamed: 0,ID,gender,A1_Score,A2_Score,A3_Score,A4_Score,A5_Score,A6_Score,A7_Score,A8_Score,...,jaundice,austim,age,ethnicity,contry_of_res,used_app_before,result,relation,age_desc,Class/ASD
0,1,f,1,0,1,1,1,1,0,1,...,no,no,18.605397,White-European,United States,no,7.819715,Self,18 and more,0
1,2,f,0,0,0,0,0,0,0,0,...,no,no,13.829369,South Asian,Australia,no,10.544296,?,18 and more,0
2,3,f,1,1,1,1,1,1,0,0,...,no,no,14.679893,White-European,United Kingdom,no,13.167506,Self,18 and more,1
3,4,f,0,0,0,1,0,0,0,0,...,no,no,61.035288,South Asian,New Zealand,no,1.530098,?,18 and more,0
4,5,m,0,0,0,0,1,0,0,0,...,no,yes,14.256686,Black,Italy,no,7.949723,Self,18 and more,0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
795,796,f,1,1,1,1,1,1,1,1,...,no,yes,42.084907,White-European,United States,no,13.390868,Self,18 and more,1
796,797,f,1,1,0,0,1,0,0,0,...,no,no,17.669291,Asian,New Zealand,no,9.454201,Self,18 and more,0
797,798,m,0,0,0,0,0,0,1,0,...,yes,no,18.242557,White-European,Jordan,no,6.805509,Self,18 and more,1
798,799,f,1,1,1,1,1,1,0,1,...,no,yes,19.241473,Middle Eastern,United States,no,3.682732,Relative,18 and more,0


In [4]:
data.describe()

Unnamed: 0,ID,A1_Score,A2_Score,A3_Score,A4_Score,A5_Score,A6_Score,A7_Score,A8_Score,A9_Score,A10_Score,age,result,Class/ASD
count,800.0,800.0,800.0,800.0,800.0,800.0,800.0,800.0,800.0,800.0,800.0,800.0,800.0,800.0
mean,400.5,0.5825,0.28625,0.32125,0.415,0.4575,0.20875,0.27375,0.7175,0.31625,0.46,28.612306,7.05853,0.23125
std,231.0844,0.493455,0.45229,0.467249,0.49303,0.498502,0.40667,0.446161,0.450497,0.465303,0.498709,12.872373,3.788969,0.421896
min,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,9.560505,-2.594654,0.0
25%,200.75,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,19.282082,4.527556,0.0
50%,400.5,1.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,25.47996,6.893472,0.0
75%,600.25,1.0,1.0,1.0,1.0,1.0,0.0,1.0,1.0,1.0,1.0,33.154755,9.892981,0.0
max,800.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,72.402488,13.390868,1.0


In [5]:
data.isna().sum()

ID                 0
gender             0
A1_Score           0
A2_Score           0
A3_Score           0
A4_Score           0
A5_Score           0
A6_Score           0
A7_Score           0
A8_Score           0
A9_Score           0
A10_Score          0
jaundice           0
austim             0
age                0
ethnicity          0
contry_of_res      0
used_app_before    0
result             0
relation           0
age_desc           0
Class/ASD          0
dtype: int64

# Preprocessing

### 1.The data has no null values
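
The `isna()` check reports zero nulls, yet the preview shows `?` in the `relation` column. `isna()` treats that placeholder as an ordinary string, so it is not counted. A minimal sketch of how such placeholders could be counted and converted, using a small made-up frame (`toy` and its values are illustrative, not rows from `data.csv`):

```python
import numpy as np
import pandas as pd

# Toy frame standing in for the real data; only the '?' convention
# is taken from the preview of the notebook's DataFrame.
toy = pd.DataFrame({
    "relation": ["Self", "?", "Self", "?", "Relative"],
    "age": [18.6, 13.8, 14.7, 61.0, 14.3],
})

# isna() finds nothing because '?' is a normal string value.
nulls_before = toy.isna().sum()

# Count the placeholder per column, then turn it into a real missing value.
placeholder_counts = (toy == "?").sum()
cleaned = toy.replace("?", np.nan)
nulls_after = cleaned.isna().sum()

print(placeholder_counts)
print(nulls_after)
```

In this notebook the `relation` column is dropped later, so the placeholder does not reach the model, but the same check is worth running on any categorical column that is kept.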

### 2.Data Encoding

In [6]:
data.head()

Unnamed: 0,ID,gender,A1_Score,A2_Score,A3_Score,A4_Score,A5_Score,A6_Score,A7_Score,A8_Score,...,jaundice,austim,age,ethnicity,contry_of_res,used_app_before,result,relation,age_desc,Class/ASD
0,1,f,1,0,1,1,1,1,0,1,...,no,no,18.605397,White-European,United States,no,7.819715,Self,18 and more,0
1,2,f,0,0,0,0,0,0,0,0,...,no,no,13.829369,South Asian,Australia,no,10.544296,?,18 and more,0
2,3,f,1,1,1,1,1,1,0,0,...,no,no,14.679893,White-European,United Kingdom,no,13.167506,Self,18 and more,1
3,4,f,0,0,0,1,0,0,0,0,...,no,no,61.035288,South Asian,New Zealand,no,1.530098,?,18 and more,0
4,5,m,0,0,0,0,1,0,0,0,...,no,yes,14.256686,Black,Italy,no,7.949723,Self,18 and more,0


In [7]:
cat = {'ethnicity':'category',
       'gender':'category', 
       'jaundice':'category',
       'austim':'category',
       'contry_of_res':'category', 
       'used_app_before':'category',
        'age_desc':'category',
        'relation':'category'}
data = data.astype(cat)

In [8]:
cat_columns = ['ethnicity', 'gender', 'jaundice', 'austim', 'contry_of_res', 'used_app_before', 'age_desc', 'relation']

for col in cat_columns:
    data[col] = data[col].cat.codes
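
Replacing each column with `.cat.codes` discards the original labels, so there is no record of which integer stands for which category when new inputs have to be encoded later. A minimal sketch of keeping that mapping, assuming a small made-up frame (`toy` and `mappings` are illustrative names, not part of the notebook):

```python
import pandas as pd

# Toy frame with two of the notebook's categorical columns.
toy = pd.DataFrame({"gender": ["f", "m", "f"], "jaundice": ["no", "yes", "no"]})
cat_columns = ["gender", "jaundice"]

mappings = {}
for col in cat_columns:
    toy[col] = toy[col].astype("category")
    # Record label -> code before the labels are overwritten by the codes.
    mappings[col] = {label: code for code, label in enumerate(toy[col].cat.categories)}
    toy[col] = toy[col].cat.codes

print(mappings)
```

pandas orders string categories lexically by default, which matches the encoded preview in the notebook (`f` becomes 0, `m` becomes 1, `no` becomes 0, `yes` becomes 1).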


In [9]:
data.head()

Unnamed: 0,ID,gender,A1_Score,A2_Score,A3_Score,A4_Score,A5_Score,A6_Score,A7_Score,A8_Score,...,jaundice,austim,age,ethnicity,contry_of_res,used_app_before,result,relation,age_desc,Class/ASD
0,1,0,1,0,1,1,1,1,0,1,...,0,0,18.605397,10,58,0,7.819715,5,0,0
1,2,0,0,0,0,0,0,0,0,0,...,0,0,13.829369,8,6,0,10.544296,0,0,0
2,3,0,1,1,1,1,1,1,0,0,...,0,0,14.679893,10,57,0,13.167506,5,0,1
3,4,0,0,0,0,1,0,0,0,0,...,0,0,61.035288,8,39,0,1.530098,0,0,0
4,5,1,0,0,0,0,1,0,0,0,...,0,1,14.256686,2,32,0,7.949723,5,0,0


In [10]:
print(data.dtypes)

ID                   int64
gender                int8
A1_Score             int64
A2_Score             int64
A3_Score             int64
A4_Score             int64
A5_Score             int64
A6_Score             int64
A7_Score             int64
A8_Score             int64
A9_Score             int64
A10_Score            int64
jaundice              int8
austim                int8
age                float64
ethnicity             int8
contry_of_res         int8
used_app_before       int8
result             float64
relation              int8
age_desc              int8
Class/ASD            int64
dtype: object


### 3.Drop unused columns

In [11]:
# Drop the identifier and the columns that are not used as features
data = data.drop(columns=['ID', 'used_app_before', 'relation', 'result', 'age_desc', 'contry_of_res'])

### 4.Split Data

In [12]:
X = data.drop('Class/ASD', axis=1)
y = data['Class/ASD']
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.2, random_state=3)
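
The `describe()` output gives a mean of 0.23125 for `Class/ASD`, so roughly one row in four is positive. With an imbalance like that, a plain random split can leave the validation set with a noticeably different positive rate. One option, not used in the notebook, is the `stratify` argument of `train_test_split`. A minimal sketch on synthetic data (`X_toy` and `y_toy` are made up for illustration):

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-in: 200 rows, 23% positives, similar to the class
# balance reported by describe() for Class/ASD.
rng = np.random.default_rng(0)
X_toy = rng.normal(size=(200, 3))
y_toy = np.array([1] * 46 + [0] * 154)

# stratify=y_toy keeps the positive rate nearly equal in both parts.
X_tr, X_va, y_tr, y_va = train_test_split(
    X_toy, y_toy, test_size=0.2, random_state=3, stratify=y_toy
)

print("train positive rate:", y_tr.mean())
print("validation positive rate:", y_va.mean())
```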

### 5.Standard Scaler


In [13]:
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_val = scaler.transform(X_val)
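
The scaler is fitted on the training split only and then applied to the validation split, which avoids leaking validation statistics into training. The same discipline has to be repeated for every later prediction. An alternative to handling the scaler by hand, shown here as a sketch on synthetic data rather than as the notebook's method, is to wrap both steps in a scikit-learn pipeline (`pipe`, `X_toy`, and `y_toy` are illustrative names):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic features on an age-like scale; the label depends on the first column.
rng = np.random.default_rng(0)
X_toy = rng.normal(loc=30.0, scale=12.0, size=(100, 2))
y_toy = (X_toy[:, 0] > 30.0).astype(int)

# The pipeline fits the scaler and the classifier together and
# applies the same scaling automatically inside predict().
pipe = make_pipeline(StandardScaler(), LogisticRegression())
pipe.fit(X_toy, y_toy)

preds = pipe.predict(X_toy[:5])  # raw, unscaled rows go in
print(preds)
```

A pipeline can also be saved with a single `joblib.dump` call, so the scaler and the model cannot get separated.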

# Classification

In [14]:
model = LogisticRegression()

In [15]:
model.fit(X_train, y_train)

In [16]:
y_pred = model.predict(X_val)

In [17]:
accuracy = accuracy_score(y_val, y_pred)
print("Accuracy:", accuracy)

Accuracy: 0.9
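
An accuracy of 0.9 should be read against the class balance: `describe()` gives a mean of 0.23125 for `Class/ASD`, so always predicting 0 would already score about 0.77 on the full dataset. The notebook imports `confusion_matrix` and `classification_report` but never calls them. A minimal sketch with made-up labels (not the notebook's `y_val` and `y_pred`) shows what they add:

```python
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix

# Made-up labels: 8 negatives, 2 positives, one positive missed.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
y_hat = [0, 0, 0, 0, 0, 0, 0, 0, 0, 1]

acc = accuracy_score(y_true, y_hat)

# Rows are true classes, columns are predicted classes.
cm = confusion_matrix(y_true, y_hat)

# Recall for the positive class: true positives / all actual positives.
recall_pos = cm[1, 1] / cm[1].sum()

print("accuracy:", acc)
print(cm)
print("recall (class 1):", recall_pos)
print(classification_report(y_true, y_hat))
```

In this toy case the accuracy is high even though half of the positive cases are missed, which is the situation per-class metrics are meant to expose.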


In [18]:
data

Unnamed: 0,gender,A1_Score,A2_Score,A3_Score,A4_Score,A5_Score,A6_Score,A7_Score,A8_Score,A9_Score,A10_Score,jaundice,austim,age,ethnicity,Class/ASD
0,0,1,0,1,1,1,1,0,1,1,1,0,0,18.605397,10,0
1,0,0,0,0,0,0,0,0,0,0,1,0,0,13.829369,8,0
2,0,1,1,1,1,1,1,0,0,1,1,0,0,14.679893,10,1
3,0,0,0,0,1,0,0,0,0,0,0,0,0,61.035288,8,0
4,1,0,0,0,0,1,0,0,0,1,1,0,1,14.256686,2,0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
795,0,1,1,1,1,1,1,1,1,1,1,0,1,42.084907,10,1
796,0,1,1,0,0,1,0,0,0,1,1,0,0,17.669291,1,0
797,1,0,0,0,0,0,0,1,0,1,1,1,0,18.242557,10,1
798,0,1,1,1,1,1,1,0,1,1,1,0,1,19.241473,5,0


Test the model

In [19]:
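# Values follow the column order of X:
# gender, A1_Score..A10_Score, jaundice, austim, age, ethnicity
input_data = np.array([1, 0, 0, 0, 0, 0, 0, 1, 0, 1, 1, 1, 0, 18.242557, 10])
input_data_reshaped = input_data.reshape(1, -1)
# Caution: the model was fitted on standardized features, so raw values
# should go through scaler.transform(...) first; this call skips that step.
prediction = model.predict(input_data_reshaped)
print(prediction)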

[0]
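
The model was fitted on the output of `StandardScaler`, so a new sample needs the same transformation before `predict`. A raw age of about 18 is far outside the range of standardized values the model saw during training. A minimal sketch on synthetic data, assuming hypothetical stand-ins `scaler_toy` and `model_toy` for the notebook's `scaler` and `model`:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

# Synthetic data: a binary flag and an age-like column; the label is the flag.
rng = np.random.default_rng(0)
age = rng.uniform(10, 70, size=200)
flag = rng.integers(0, 2, size=200)
X_toy = np.column_stack([flag, age])
y_toy = flag

# Fit the scaler, then fit the model on the scaled features.
scaler_toy = StandardScaler().fit(X_toy)
model_toy = LogisticRegression().fit(scaler_toy.transform(X_toy), y_toy)

# A new raw sample must pass through the same fitted scaler before predict().
sample = np.array([[1, 18.242557]])
sample_scaled = scaler_toy.transform(sample)
pred = model_toy.predict(sample_scaled)

print("raw:", sample)
print("scaled:", sample_scaled)
print("prediction:", pred)
```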


Save the model and the scaler


In [20]:
joblib.dump(model, 'autism_model.pkl')
joblib.dump(scaler, 'scaler.pkl')

['scaler.pkl']

In [21]:
import joblib
import pandas as pd

In [22]:
model = joblib.load("autism_model.pkl")

In [27]:
sample_input = {
    "A1_Score": 1,
    "A2_Score": 0,
    "A3_Score": 0,
    "A4_Score": 0,
    "A5_Score": 0,
    "A6_Score": 0,
    "A7_Score": 0,
    "A8_Score": 1,
    "A9_Score": 0,
    "A10_Score": 1,
    "jaundice": 1,
    "autism": 1,
    "age": 18.242557,
    "gender": 1,
    "ethnicity": 1
}

In [29]:
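import numpy as np

# Given dictionary (note: this name replaces the DataFrame called `data`)
data = {
    'A1_Score': 1, 'A2_Score': 0, 'A3_Score': 0, 'A4_Score': 0, 'A5_Score': 0,
    'A6_Score': 0, 'A7_Score': 0, 'A8_Score': 1, 'A9_Score': 0, 'A10_Score': 1,
    'jaundice': 1, 'autism': 1, 'age': 18.242557, 'gender': 1,
    'ethnicity': 1
}

# Convert the dictionary values to a list.
# Caution: this keeps the dictionary's insertion order, which differs from the
# training column order (gender, A1_Score..A10_Score, jaundice, austim, age,
# ethnicity), and the values are not standardized with the saved scaler.
data_list = list(data.values())

# Convert the list to a 2D numpy array with one row
data_array = np.array(data_list).reshape(1, -1)

print(data_array)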


[[ 1.        0.        0.        0.        0.        0.        0.
   1.        0.        1.        1.        1.       18.242557  1.
   1.      ]]


In [34]:
model.predict(data_array)[0]

1
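
The array passed to `predict` here takes its order from the dictionary, with `gender` near the end, while the training features start with `gender` and end with `ethnicity`. The dictionary also uses the key `autism`, whereas the training column is `austim`. A minimal sketch of building the row in the training order, assuming a hypothetical `feature_order` list that mirrors the columns of `X` after the drops:

```python
import pandas as pd

# Column order of X in the notebook: gender first, ethnicity last.
feature_order = (
    ["gender"]
    + [f"A{i}_Score" for i in range(1, 11)]
    + ["jaundice", "austim", "age", "ethnicity"]
)

sample = {
    "A1_Score": 1, "A2_Score": 0, "A3_Score": 0, "A4_Score": 0, "A5_Score": 0,
    "A6_Score": 0, "A7_Score": 0, "A8_Score": 1, "A9_Score": 0, "A10_Score": 1,
    "jaundice": 1, "autism": 1, "age": 18.242557, "gender": 1, "ethnicity": 1,
}

# Rename the key to match the dataset's spelling, then select the columns
# by name so the order no longer depends on how the dictionary was written.
row = pd.DataFrame([sample]).rename(columns={"autism": "austim"})[feature_order]
values = row.to_numpy()

print(row.columns.tolist())
print(values)
```

The resulting `values` array would then go through the saved scaler's `transform` before being passed to `model.predict`, as in the training cells.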