wanwanlin0521 committed on
Commit 30fc316 · verified · 1 Parent(s): 35111f7

Update src/streamlit_app.py

Files changed (1)
  1. src/streamlit_app.py +23 -18
src/streamlit_app.py CHANGED
@@ -1,13 +1,3 @@
- import os
- from pathlib import Path
-
- # point Streamlit at a writable folder
- os.environ["STREAMLIT_CONFIG_DIR"] = "/tmp/.streamlit"
- Path(os.environ["STREAMLIT_CONFIG_DIR"]).mkdir(parents=True, exist_ok=True)
-
- import streamlit as st
- # …the rest of your imports and code…
-
  # Imports.
  import streamlit as st
  import seaborn as sns
@@ -20,7 +10,12 @@ import geopandas as gpd
  import folium
  from streamlit_folium import st_folium
  import json


  # ── 0. Page configuration ──
  st.set_page_config(
@@ -160,9 +155,14 @@ fig.update_layout(
      title_x=0.5
  )

  st.plotly_chart(fig, use_container_width=True)
  st.markdown(""" The donut chart shows the share of the ten most frequent crime categories in the selected year. At the center, you can see that Vehicle - Stolen is the single largest slice, accounting for roughly 18.7% of all incidents. The remaining five categories each represent between 3% and 5% of total incidents; these include miscellaneous crimes, criminal threats, assault with a deadly weapon, burglary, and minor vandalism. By displaying both slice size and percentage labels, the chart makes it easy to compare how dominant property-related offenses are, versus violent or less common crimes, in that year's LAPD data. """)

  top_crimes = df['crm_cd_desc'].value_counts().nlargest(10).index
  df_top = df[df['crm_cd_desc'].isin(top_crimes)]
@@ -196,12 +196,13 @@ fig.tight_layout()
  # 5. Render in Streamlit
  st.pyplot(fig)

-
  st.markdown("""
  This heatmap shows the frequency of the top 10 crimes from 2020 to 2025. The x-axis is the year and the y-axis is the crime type. The 'YlOrRd' colormap creates a distinct visual difference in the number of incidents: dark red means that the incident frequency is high, while light yellow means that it is low. 'Vehicle Stolen' seems to be the most prevalent crime for all five years, given that its values are highlighted in deeper shades of red. 'Vehicle Stolen' also seems to fluctuate between 20,000 and 24,000 incidents throughout the five years. 'Theft of Identity' also saw a spike in incident frequency in 2022, recording 21,251 crimes. Limiting the heatmap to the top 10 crimes focuses it on the most prominent crimes in LA. Since 2025 is not over, data for that year is still incomplete. This visualization can help law enforcement easily detect trends in different crimes for a specific year. This data may allow them to predict future rates and allocate resources accordingly to mitigate these crimes.
  """)

- ### Use this one!!!
  # Count the crime types and list the top 10 with the most cases.
  top_crimes = df['crm_cd_desc'].value_counts().nlargest(10).index
  df = df[df['year'] != 2025]
@@ -223,8 +224,7 @@ alt.Chart(stacked_year_df).mark_bar().encode(
  )


- ### Use this one!!!
- # Plot 3: Line chart.
  df = df[df['year'] != 2025] # 2025 is not over, so its trend can't be seen

  # Group each crime type by year.
@@ -253,12 +253,14 @@ line_chart = alt.Chart(filtered_crimes).mark_line(point=True).encode(
  # Display the plot.
  line_chart

  st.markdown(""" This plot is a line chart visualizing the annual number of incidents for the top 5 most frequent crime types over a five-year period, from 2020 to 2024. Each line represents a distinct crime type, allowing for easy comparison of trends across different categories. The x-axis represents the year, the y-axis indicates the number of incidents, and a legend identifies the color corresponding to each specific crime type: Battery - Simple Assault, Burglary From Vehicle, Theft of Identity, Vandalism - Felony, and Vehicle - Stolen. The plot highlights the fluctuations and overall trajectories of these major crime categories across the years.""")

  # Load data
  with open(GEOJSON_PATH, "r", encoding="utf-8") as f:
      geojson_data = json.load(f)
-

  # Identify top 10 crime types
  top_10_crimes = df['crm_cd_desc'].value_counts().nlargest(10).index.tolist()
@@ -317,11 +319,12 @@ for _, row in df_filtered.iterrows():
  # Display the new map.
  st_folium(new_map, width=1000, height=800)

  st.markdown("""
  This visualization uses Folium to build an interactive map of crime distribution in Los Angeles, highlighting the geospatial clustering characteristics of different years and crime types, and emphasizing the user's experience of freely exploring the map. The base map uses real streets and geographic backgrounds to enhance the spatial visualization of the image. The map shows the administrative boundaries of Los Angeles County in blue polygons, which are loaded with GeoJSON data and overlaid on the map to specify the geographic boundaries of crime locations. The red dots on the map represent the location of individual crimes, and the system samples no more than 300 data items from this category for visualization, with each dot pinpointed by latitude and longitude coordinates. The map supports full Leaflet.js functionality, including zooming, dragging, layer control, and other operations, which greatly enhances the flexibility of data exploration. A drop-down menu in the upper left corner of the page allows users to customize filters for specific years and crime types, enabling instant updates to the map content.
  """)

- ### Use this one!!!
  # Count the crime types and list the top 10 with the most cases.
  top_crimes = df['crm_cd_desc'].value_counts().nlargest(10).index
  df = df[df['year'] != 2025]
@@ -342,15 +345,17 @@ bar_chart = alt.Chart(stacked_year_df).mark_bar().encode(
      title='Stacked Crime Composition by Year (Top 10 Crime Types)'
  )

  st.altair_chart(bar_chart, use_container_width=True)

  st.markdown("""
  Description: Our stacked bar chart shows the number of reported crimes for the top 10 most common crime types from 2020 to 2024. Each bar represents a year, and the different colors in the bars show different types of crimes, like stolen vehicles, burglary, vandalism, and assault. The taller the colored section, the more incidents of that crime there were in that year.

  By observing the plot, we can see that 2022 had the most crimes, followed by 2023, and so on. Besides that, we can also see that some crimes, like vehicle theft, petty theft, and burglary from vehicles, happened a lot every year and make up a big part of the total.
  """)

-
  top_crimes = df['crm_cd_desc'].value_counts().nlargest(10).index
  df_top = df[df['crm_cd_desc'].isin(top_crimes)]
@@ -380,7 +385,7 @@ barchart = alt.Chart(heatmap1_df).mark_bar().encode(
  # Display the plot.
  barchart

-
  st.markdown(""" This interactive bar chart allows users to explore the most frequently reported crime types in Los Angeles by year. By adjusting the slider below the chart, the visualization updates in real time to show the top ten crime categories for the selected year. Each bar represents the total number of incidents, with color coding used to distinguish different crime types and a legend on the right for reference.
  This visualization makes it easy to compare how the composition of major crime types evolves over time and to detect emerging issues that may require further investigation or policy response.
  """)
 
  # Imports.
  import streamlit as st
  import seaborn as sns
 
  import folium
  from streamlit_folium import st_folium
  import json
+ import os
+ from pathlib import Path

+ # point Streamlit at a writable folder
+ os.environ["STREAMLIT_CONFIG_DIR"] = "/tmp/.streamlit"
+ Path(os.environ["STREAMLIT_CONFIG_DIR"]).mkdir(parents=True, exist_ok=True)

  # ── 0. Page configuration ──
  st.set_page_config(
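
One thing worth checking in the hunk above: the config-directory setup now runs after `import streamlit as st`, whereas the removed block ran before it. If `STREAMLIT_CONFIG_DIR` is read while Streamlit is being imported (an assumption; nothing in this commit confirms it either way), the assignment would need to come first. A minimal sketch of the set-before-import ordering, using a hypothetical helper `ensure_writable_config_dir` that is not part of the app:

```python
import os
import tempfile
from pathlib import Path


def ensure_writable_config_dir(path: str) -> Path:
    """Point STREAMLIT_CONFIG_DIR at `path` and create the folder if needed.

    Hypothetical helper for illustration; the commit inlines these two steps.
    """
    os.environ["STREAMLIT_CONFIG_DIR"] = path
    config_dir = Path(path)
    config_dir.mkdir(parents=True, exist_ok=True)
    return config_dir


# Run the setup first: a library that reads the variable during import
# only sees it if the import happens after this call.
config_dir = ensure_writable_config_dir(
    os.path.join(tempfile.gettempdir(), ".streamlit")
)

# import streamlit as st  # would follow here in the real app
```

If the variable turns out to be read lazily, the order in the commit is harmless and this note can be ignored.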
 
      title_x=0.5
  )

+ # Display the plot.
  st.plotly_chart(fig, use_container_width=True)
+
+ # Description.
  st.markdown(""" The donut chart shows the share of the ten most frequent crime categories in the selected year. At the center, you can see that Vehicle - Stolen is the single largest slice, accounting for roughly 18.7% of all incidents. The remaining five categories each represent between 3% and 5% of total incidents; these include miscellaneous crimes, criminal threats, assault with a deadly weapon, burglary, and minor vandalism. By displaying both slice size and percentage labels, the chart makes it easy to compare how dominant property-related offenses are, versus violent or less common crimes, in that year's LAPD data. """)

+
+ # -------------------------------- Plot 2: Heat Map --------------------------------
  top_crimes = df['crm_cd_desc'].value_counts().nlargest(10).index
  df_top = df[df['crm_cd_desc'].isin(top_crimes)]
 
  # 5. Render in Streamlit
  st.pyplot(fig)

+ # Description.
  st.markdown("""
  This heatmap shows the frequency of the top 10 crimes from 2020 to 2025. The x-axis is the year and the y-axis is the crime type. The 'YlOrRd' colormap creates a distinct visual difference in the number of incidents: dark red means that the incident frequency is high, while light yellow means that it is low. 'Vehicle Stolen' seems to be the most prevalent crime for all five years, given that its values are highlighted in deeper shades of red. 'Vehicle Stolen' also seems to fluctuate between 20,000 and 24,000 incidents throughout the five years. 'Theft of Identity' also saw a spike in incident frequency in 2022, recording 21,251 crimes. Limiting the heatmap to the top 10 crimes focuses it on the most prominent crimes in LA. Since 2025 is not over, data for that year is still incomplete. This visualization can help law enforcement easily detect trends in different crimes for a specific year. This data may allow them to predict future rates and allocate resources accordingly to mitigate these crimes.
  """)

+
+ # -------------------------------- ???????????????? --------------------------------
  # Count the crime types and list the top 10 with the most cases.
  top_crimes = df['crm_cd_desc'].value_counts().nlargest(10).index
  df = df[df['year'] != 2025]
 
  )


+ # -------------------------------- Plot 3: Line Chart --------------------------------
  df = df[df['year'] != 2025] # 2025 is not over, so its trend can't be seen

  # Group each crime type by year.
 
  # Display the plot.
  line_chart

+ # Description.
  st.markdown(""" This plot is a line chart visualizing the annual number of incidents for the top 5 most frequent crime types over a five-year period, from 2020 to 2024. Each line represents a distinct crime type, allowing for easy comparison of trends across different categories. The x-axis represents the year, the y-axis indicates the number of incidents, and a legend identifies the color corresponding to each specific crime type: Battery - Simple Assault, Burglary From Vehicle, Theft of Identity, Vandalism - Felony, and Vehicle - Stolen. The plot highlights the fluctuations and overall trajectories of these major crime categories across the years.""")

+
+ # -------------------------------- Plot 4: Map --------------------------------
  # Load data
  with open(GEOJSON_PATH, "r", encoding="utf-8") as f:
      geojson_data = json.load(f)

  # Identify top 10 crime types
  top_10_crimes = df['crm_cd_desc'].value_counts().nlargest(10).index.tolist()
 
  # Display the new map.
  st_folium(new_map, width=1000, height=800)

+ # Description.
  st.markdown("""
  This visualization uses Folium to build an interactive map of crime distribution in Los Angeles, highlighting the geospatial clustering characteristics of different years and crime types, and emphasizing the user's experience of freely exploring the map. The base map uses real streets and geographic backgrounds to enhance the spatial visualization of the image. The map shows the administrative boundaries of Los Angeles County in blue polygons, which are loaded with GeoJSON data and overlaid on the map to specify the geographic boundaries of crime locations. The red dots on the map represent the location of individual crimes, and the system samples no more than 300 data items from this category for visualization, with each dot pinpointed by latitude and longitude coordinates. The map supports full Leaflet.js functionality, including zooming, dragging, layer control, and other operations, which greatly enhances the flexibility of data exploration. A drop-down menu in the upper left corner of the page allows users to customize filters for specific years and crime types, enabling instant updates to the map content.
  """)

+ # -------------------------------- Plot 4: Stacked Bar Chart --------------------------------
  # Count the crime types and list the top 10 with the most cases.
  top_crimes = df['crm_cd_desc'].value_counts().nlargest(10).index
  df = df[df['year'] != 2025]
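
The map description in the hunk above says that no more than 300 points are plotted per selection. The sampling code itself is outside the lines shown, so the following is only a sketch of that kind of cap. It uses the standard library's `random.sample` on a plain list as a stand-in for the app's DataFrame rows; `cap_sample`, `max_points`, and `seed` are illustrative names, not taken from the app.

```python
import random


def cap_sample(rows, max_points=300, seed=None):
    """Return at most `max_points` rows, sampled without replacement."""
    rows = list(rows)
    if len(rows) <= max_points:
        return rows  # nothing to thin out
    rng = random.Random(seed)
    return rng.sample(rows, max_points)


# Illustrative (lat, lon) pairs; real coordinates would come from the dataset.
points = [(34.0 + i * 1e-4, -118.2 - i * 1e-4) for i in range(1000)]
sampled = cap_sample(points, max_points=300, seed=0)
```

With pandas, the same cap is commonly written as `df.sample(n=min(len(df), 300))`, which may be closer to what the app does.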
 
      title='Stacked Crime Composition by Year (Top 10 Crime Types)'
  )

+ # Display the plot.
  st.altair_chart(bar_chart, use_container_width=True)

+ # Description.
  st.markdown("""
  Description: Our stacked bar chart shows the number of reported crimes for the top 10 most common crime types from 2020 to 2024. Each bar represents a year, and the different colors in the bars show different types of crimes, like stolen vehicles, burglary, vandalism, and assault. The taller the colored section, the more incidents of that crime there were in that year.

  By observing the plot, we can see that 2022 had the most crimes, followed by 2023, and so on. Besides that, we can also see that some crimes, like vehicle theft, petty theft, and burglary from vehicles, happened a lot every year and make up a big part of the total.
  """)

+ # -------------------------------- Plot 5: Bar Chart --------------------------------
  top_crimes = df['crm_cd_desc'].value_counts().nlargest(10).index
  df_top = df[df['crm_cd_desc'].isin(top_crimes)]
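
The `top_crimes` / `df_top` pair above recurs throughout the file: count incidents per crime type, keep the most frequent types, then filter the rows. A small sketch of that pattern on made-up data follows; only the column names `crm_cd_desc` and `year` come from the diff, and the final `groupby` step is an assumed follow-up, since the aggregation code is outside the hunks shown.

```python
import pandas as pd

# Tiny made-up sample; only the column names come from the diff.
df = pd.DataFrame(
    {
        "year": [2022, 2022, 2023, 2023, 2023, 2023, 2022],
        "crm_cd_desc": [
            "VEHICLE - STOLEN",
            "VEHICLE - STOLEN",
            "VEHICLE - STOLEN",
            "VEHICLE - STOLEN",
            "BURGLARY",
            "BURGLARY",
            "THEFT OF IDENTITY",
        ],
    }
)

# Same two steps as in the hunk, with n=2 instead of 10 to suit the small sample.
top_crimes = df["crm_cd_desc"].value_counts().nlargest(2).index
df_top = df[df["crm_cd_desc"].isin(top_crimes)]

# Assumed follow-up step: incidents per crime type and year, zero-filled.
counts = df_top.groupby(["crm_cd_desc", "year"]).size().unstack(fill_value=0)
```

A table shaped like `counts` (crime types as rows, years as columns) is the usual input for a heatmap or a stacked bar chart.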
 
  # Display the plot.
  barchart

+ # Description.
  st.markdown(""" This interactive bar chart allows users to explore the most frequently reported crime types in Los Angeles by year. By adjusting the slider below the chart, the visualization updates in real time to show the top ten crime categories for the selected year. Each bar represents the total number of incidents, with color coding used to distinguish different crime types and a legend on the right for reference.
  This visualization makes it easy to compare how the composition of major crime types evolves over time and to detect emerging issues that may require further investigation or policy response.
  """)