# Import necessary libraries
import streamlit as st
import pandas as pd
import altair as alt
import matplotlib.pyplot as plt
import seaborn as sns
from duckduckgo_search import DDGS
# Function to load the dataset
@st.cache_data # Cache the function to enhance performance
def load_data():
    # Define the file path
    file_path = 'https://raw.githubusercontent.com/aaubs/ds-master/main/apps/M1-attrition-streamlit/HR-Employee-Attrition-synth.csv'
    # Load the CSV file into a pandas dataframe
    df = pd.read_csv(file_path)
    # Create age groups and add as a new column.
    # With right=False each bin is [lower, upper), so the last edge is 61
    # to keep 60-year-olds inside the '45-60' group.
    bin_edges = [18, 25, 35, 45, 61]
    bin_labels = ['18-24', '25-34', '35-44', '45-60']
    df['AgeGroup'] = pd.cut(df['Age'], bins=bin_edges, labels=bin_labels, right=False)
    return df
# Load the data using the defined function
df = load_data()
# Set the app title and sidebar header
st.title("Employee Attrition Dashboard")
st.sidebar.header("Filters")
# Introduction to the HR Attrition Dashboard
st.markdown("""
Welcome to the HR Attrition Dashboard. With employee turnover on the rise, HR departments increasingly need to understand and predict why employees leave. This dashboard uses data analytics to explore the drivers of attrition and to suggest strategies for improving retention.
""")
with st.expander("**Objective**"):
    st.markdown("""
    This dashboard visualizes the data to help HR professionals answer three questions:
    - Which groups within the company are most likely to lose employees?
    - What might be driving these employees to leave?
    - Given the observed trends, which incentives could reduce the attrition rate?
    """)
# Tutorial Expander
with st.expander("How to Use the Dashboard"):
    st.markdown("""
    1. **Filter Data** - Use the sidebar filters to narrow the data to a specific subset.
    2. **Visualize Data** - Select a visualization type from the dropdown to explore patterns.
    3. **Insights & Recommendations** - Scroll down to see insights derived from the visualizations and actionable recommendations.
    """)
# Sidebar filter: Age Group
age_groups = df['AgeGroup'].unique().tolist()
selected_age_group = st.sidebar.multiselect("Select Age Groups", age_groups, default=age_groups)
if not selected_age_group:
    st.warning("Please select an age group from the sidebar")
    st.stop()
filtered_df = df[df['AgeGroup'].isin(selected_age_group)]
# Sidebar filter: Department
departments = df['Department'].unique().tolist()
selected_department = st.sidebar.multiselect("Select Departments", departments, default=departments)
if not selected_department:
    st.warning("Please select a department from the sidebar")
    st.stop()
filtered_df = filtered_df[filtered_df['Department'].isin(selected_department)]
# Sidebar filter: Monthly Income Range
min_income = int(df['MonthlyIncome'].min())
max_income = int(df['MonthlyIncome'].max())
income_range = st.sidebar.slider("Select Monthly Income Range", min_income, max_income, (min_income, max_income))
# Keep rows whose income lies within the selected range (both ends inclusive)
filtered_df = filtered_df[filtered_df['MonthlyIncome'].between(income_range[0], income_range[1])]
# Sidebar filter: Job Satisfaction Level
satisfaction_levels = sorted(df['JobSatisfaction'].unique().tolist())
selected_satisfaction = st.sidebar.multiselect("Select Job Satisfaction Levels", satisfaction_levels, default=satisfaction_levels)
if not selected_satisfaction:
    st.warning("Please select a job satisfaction level from the sidebar")
    st.stop()
filtered_df = filtered_df[filtered_df['JobSatisfaction'].isin(selected_satisfaction)]
# Displaying the Attrition Analysis header
st.header("Attrition Analysis")
# Dropdown to select the type of visualization
visualization_option = st.selectbox(
    "Select Visualization",
    ["Attrition by Age Group",
     "KDE Plot: Distance from Home by Attrition",
     "Attrition by Job Role",
     "Attrition Distribution by Gender",
     "MonthlyRate and DailyRate by JobLevel"]
)
# Visualizations based on user selection
if visualization_option == "Attrition by Age Group":
    # Stacked bar chart: employee counts per age group, split by attrition
    chart = alt.Chart(filtered_df).mark_bar().encode(
        x='AgeGroup',
        y='count()',
        color='Attrition'
    ).properties(
        title='Attrition by Age Group'
    )
    st.altair_chart(chart, use_container_width=True)
elif visualization_option == "KDE Plot: Distance from Home by Attrition":
    # KDE plot for Distance from Home, one density curve per attrition value
    fig, ax = plt.subplots(figsize=(10, 6))
    sns.kdeplot(data=filtered_df, x='DistanceFromHome', hue='Attrition', fill=True, palette='Set2', ax=ax)
    ax.set_xlabel('Distance From Home')
    ax.set_ylabel('Density')
    ax.set_title('KDE Plot of Distance From Home by Attrition')
    # Pass the figure explicitly instead of the global pyplot module
    st.pyplot(fig)
elif visualization_option == "Attrition by Job Role":
    # Horizontal stacked bar chart: employee counts per job role, split by attrition
    chart = alt.Chart(filtered_df).mark_bar().encode(
        y='JobRole',
        x='count()',
        color='Attrition'
    ).properties(
        title='Attrition by Job Role'
    )
    st.altair_chart(chart, use_container_width=True)
elif visualization_option == "Attrition Distribution by Gender":
    # Pie chart: gender split among employees who left (Attrition == 'Yes')
    pie_chart_data = filtered_df[filtered_df['Attrition'] == 'Yes']['Gender'].value_counts().reset_index()
    pie_chart_data.columns = ['Gender', 'count']
    chart = alt.Chart(pie_chart_data).mark_arc().encode(
        theta='count:Q',
        color='Gender:N',
        tooltip=['Gender', 'count']
    ).properties(
        title='Attrition Distribution by Gender',
        width=300,
        height=300
    )
    st.altair_chart(chart, use_container_width=True)
elif visualization_option == "MonthlyRate and DailyRate by JobLevel":
    # Side-by-side boxplots for MonthlyRate and DailyRate by JobLevel
    fig, ax = plt.subplots(1, 2, figsize=(15, 7))
    # MonthlyRate by JobLevel
    sns.boxplot(x="JobLevel", y="MonthlyRate", data=filtered_df, ax=ax[0], hue="JobLevel", palette='Set2', legend=False)
    ax[0].set_title('MonthlyRate by JobLevel')
    ax[0].set_xlabel('Job Level')
    ax[0].set_ylabel('Monthly Rate')
    # DailyRate by JobLevel
    sns.boxplot(x="JobLevel", y="DailyRate", data=filtered_df, ax=ax[1], hue="JobLevel", palette='Set2', legend=False)
    ax[1].set_title('DailyRate by JobLevel')
    ax[1].set_xlabel('Job Level')
    ax[1].set_ylabel('Daily Rate')
    fig.tight_layout()
    st.pyplot(fig)
# Display dataset overview
st.header("Dataset Overview")
st.dataframe(df.describe())
# Insights from Visualization Section Expander
with st.expander("Insights from Visualization"):
    st.markdown("""
    1. **Age Groups & Attrition** - The 'Attrition by Age Group' plot shows which age brackets have higher attrition.
    2. **Home Distance's Impact** - The 'KDE Plot: Distance from Home by Attrition' shows whether employees who live farther from work are more likely to leave.
    3. **Roles & Attrition** - 'Attrition by Job Role' reveals which roles might be more attrition-prone.
    4. **Gender & Attrition** - The pie chart for 'Attrition Distribution by Gender' shows the gender split among employees who left.
    5. **Earnings Patterns** - The 'MonthlyRate and DailyRate by JobLevel' boxplots display the compensation distribution across job levels.
    """)
# Recommendations Expander
with st.expander("Recommendations for Action"):
    st.markdown("""
    - **Incentive Programs:** Introduce incentives tailored to groups showing higher attrition tendencies.
    - **Remote Work Options:** Offer flexibility, especially to those living farther from the workplace, to reduce attrition.
    - **Training & Growth:** Invest in employee development, especially in roles with higher attrition rates.
    - **Gender Equality:** Foster an environment that supports equal opportunities regardless of gender.
    - **Compensation Review:** Regularly review and adjust compensation structures to stay competitive and retain talent.
    """)
if st.button("AI Little Inayah"):
    with st.expander("AI Evaluation"):
        prompt = "You are a smart HR person. Provide a concise three-sentence evaluation of the HR situation based on these summary statistics: " + str(df.describe())
        st.markdown(DDGS().chat(prompt, model='claude-3-haiku'))