Commit 81a5ff5 by simonraj
1 parent: 2544dd9

Upload 4 files

Files changed (4)
  1. README.md +48 -12
  2. app.py +288 -0
  3. custom_theme.css +171 -0
  4. requirements.txt +5 -0
README.md CHANGED
@@ -1,12 +1,48 @@
- ---
- title: GradebookReport
- emoji: 🏆
- colorFrom: yellow
- colorTo: green
- sdk: gradio
- sdk_version: 4.38.1
- app_file: app.py
- pinned: false
- ---
-
- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
+ # Gradebook Data Processor
+
+ This application processes and visualizes educational data from Excel files. Users upload an Excel file containing course and user activity data; the application validates it, aggregates it per user, and presents the results as charts and tables.
+
+ ## Features
+
+ - **File Upload and Processing**: Users can upload Excel files (.xls, .xlsx) that contain the required columns. The application checks that these columns are present, processes the data, and writes one CSV file per user containing that user's rows.
+ - **Data Insights Dashboard**: After processing, the dashboard shows summary statistics, user activity levels, courses per user, and a scatter plot of courses vs. activity level.
+
+ ## How to Use
+
+ 1. **Start the Application**: Run `python app.py` to launch the web interface.
+ 2. **Upload Excel File**: Open the "File Upload and Processing" tab and upload an Excel file containing the required data.
+ 3. **Process Data**: Click the "Process Data" button. The application displays the processing result and the location of the generated CSV files.
+ 4. **View Insights**: Switch to the "Data Insights Dashboard" tab to view the generated insights and visualizations.
+
+ ## Required Excel File Format
+
+ The Excel file must contain the following columns:
+ - `user_id`
+ - `lastname`
+ - `course_id`
+
+ Generating the individual CSV files additionally requires the columns `course_pk1`, `data`, `event_type`, `internal_handle`, `session_id`, `timestamp`, and `system_role`.
+
+ ## Visualizations
+
+ - **Summary Statistics**: Displays total users, total courses, total activity, average courses per user, and average activity per user.
+ - **User Activity Levels**: A bar chart of each user's activity, measured as the number of rows recorded for that user.
+ - **Courses per User**: A bar chart of the number of distinct courses per user.
+ - **Courses vs. Activity Level**: A scatter plot of the number of courses against the activity level for each user.
+
+ ## Technologies Used
+
+ - Python
+ - Pandas for data processing
+ - Gradio for the web interface
+ - Plotly for data visualization
+
+ ## Installation
+
+ You need Python installed on your system. Clone the repository, install the dependencies, and run `app.py`:
+ ```sh
+ git clone <repository-url>
+ cd <repository-directory>
+ pip install -r requirements.txt
+ python app.py
+ ```
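
Note on the README hunk: the required columns and the dashboard figures amount to a small per-user aggregation. The sketch below restates that logic in plain Python rather than the pandas calls used in `app.py`, assuming "courses" means distinct `course_id` values per user and "activity" means the number of rows per user. The names `missing_columns`, `summarize`, and the sample rows are hypothetical and exist only for illustration.

```python
from collections import defaultdict

REQUIRED_COLUMNS = ("user_id", "lastname", "course_id")


def missing_columns(header, required=REQUIRED_COLUMNS):
    """Return the required column names that are absent from the header."""
    present = set(header)
    return [name for name in required if name not in present]


def summarize(rows):
    """Return {user_id: (distinct_courses, event_count)} for dict-like rows."""
    courses = defaultdict(set)
    events = defaultdict(int)
    for row in rows:
        user = row["user_id"]
        courses[user].add(row["course_id"])  # distinct courses per user
        events[user] += 1                    # one event per input row
    return {user: (len(courses[user]), events[user]) for user in events}


# Hypothetical sample rows standing in for the uploaded spreadsheet.
sample_rows = [
    {"user_id": "u1", "lastname": "Tan", "course_id": "C101"},
    {"user_id": "u1", "lastname": "Tan", "course_id": "C101"},
    {"user_id": "u1", "lastname": "Tan", "course_id": "C202"},
    {"user_id": "u2", "lastname": "Lim", "course_id": "C101"},
]

per_user = summarize(sample_rows)
total_users = len(per_user)
avg_courses = sum(c for c, _ in per_user.values()) / total_users
avg_activity = sum(n for _, n in per_user.values()) / total_users
```

Checking the header before aggregating mirrors the order in which the application reports problems: a missing column is reported before any per-user figure is computed.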
app.py ADDED
@@ -0,0 +1,288 @@
+ import pandas as pd
+ import os
+ import gradio as gr
+ import plotly.express as px
+ import plotly.graph_objects as go
+ from typing import Tuple, List, Optional, Union
+ import traceback
+
+ # NTU Singapore colors
+ NTU_BLUE = "#003D7C"
+ NTU_RED = "#C11E38"
+ NTU_GOLD = "#E7B820"
+
+ def process_data(file: gr.File, progress=gr.Progress()) -> Tuple[str, str, pd.DataFrame]:
+     try:
+         # Check if file is uploaded
+         if file is None:
+             raise ValueError("No file uploaded. Please upload an Excel file.")
+
+         # Check file extension
+         if not file.name.lower().endswith(('.xls', '.xlsx')):
+             raise ValueError("Invalid file format. Please upload an Excel file (.xls or .xlsx).")
+
+         # Load the raw Excel file
+         try:
+             raw_data = pd.read_excel(file.name)
+         except Exception as e:
+             raise ValueError(f"Error reading Excel file: {str(e)}") from e
+
+         # Check if required columns are present
+         required_columns = ['user_id', 'lastname', 'course_id']
+         missing_columns = [col for col in required_columns if col not in raw_data.columns]
+         if missing_columns:
+             raise ValueError(f"Missing required columns: {', '.join(missing_columns)}")
+
+         # Extract filename without extension
+         base_filename = os.path.splitext(os.path.basename(file.name))[0]
+
+         # Define output paths
+         final_file_path = f'mailmerge {base_filename}.xlsx'
+         base_path = 'mailmerge'
+
+         # Step 1: Extract User Information
+         user_info = raw_data[['user_id', 'lastname']].drop_duplicates().copy()
+         user_info['Username'] = user_info['user_id']
+         user_info['Name'] = user_info['lastname']
+         # Cast to str so numeric user IDs do not break the concatenation
+         user_info['Email'] = user_info['user_id'].astype(str) + '@ntu.edu.sg'
+
+         progress(0.2, desc="Extracting user information")
+
+         # Step 2: Calculate Course Count (distinct courses per user)
+         course_counts = raw_data.groupby('user_id')['course_id'].nunique().reset_index()
+         course_counts.columns = ['Username', 'Courses']
+         user_info = user_info.merge(course_counts, on='Username', how='left')
+
+         progress(0.4, desc="Calculating course counts")
+
+         # Step 3: Calculate Grand Total (number of rows per user)
+         event_counts = raw_data.groupby('user_id').size().reset_index(name='Grand Total')
+         event_counts.columns = ['Username', 'Grand Total']
+         user_info = user_info.merge(event_counts, on='Username', how='left')
+
+         progress(0.6, desc="Calculating grand totals")
+
+         # Step 4: Generate Filenames and Paths
+         user_info['File'] = 'User_' + user_info['Username'].astype(str) + '_data.csv'
+         user_info['Path'] = user_info['File'].apply(lambda x: os.path.join(base_path, x))
+
+         # Remove extra columns and summary rows
+         user_info = user_info[['Username', 'Name', 'Courses', 'Grand Total', 'Email', 'File', 'Path']]
+         user_info = user_info[user_info['Username'].notna()]
+         user_info = user_info.drop_duplicates(subset=['Username']).sort_values(by='Username')
+
+         progress(0.8, desc="Generating individual CSV files")
+
+         # Generate individual CSV files for each user
+         csv_columns = ['course_id', 'course_pk1', 'data', 'event_type', 'internal_handle', 'lastname', 'session_id', 'timestamp', 'user_id', 'system_role']
+         missing_columns = [col for col in csv_columns if col not in raw_data.columns]
+         if missing_columns:
+             raise ValueError(f"Missing columns for individual CSV files: {', '.join(missing_columns)}")
+
+         if not os.path.exists(base_path):
+             try:
+                 os.makedirs(base_path)
+             except PermissionError:
+                 raise PermissionError(f"Unable to create directory {base_path}. Please check your permissions.")
+
+         for user_id in user_info['Username'].unique():
+             user_data = raw_data[raw_data['user_id'] == user_id][csv_columns]
+             user_file_path = os.path.join(base_path, f'User_{user_id}_data.csv')
+             try:
+                 user_data.to_csv(user_file_path, index=False)
+             except PermissionError:
+                 raise PermissionError(f"Unable to save file {user_file_path}. Please check your permissions.")
+
+         progress(0.9, desc="Saving final Excel file")
+
+         # Save the final dataframe to the output Excel file
+         try:
+             with pd.ExcelWriter(final_file_path, engine='xlsxwriter') as writer:
+                 user_info.to_excel(writer, index=False, sheet_name='Sheet1')
+                 workbook = writer.book
+                 worksheet = writer.sheets['Sheet1']
+
+                 # Find the last row number dynamically
+                 last_row = len(user_info) + 1  # Account for header row in Excel
+
+                 # Write the total values in columns B, C, and D of the first empty row after the user data
+                 worksheet.write(f'B{last_row + 1}', 'Total')
+                 worksheet.write(f'C{last_row + 1}', int(user_info['Courses'].sum()))
+                 worksheet.write(f'D{last_row + 1}', int(user_info['Grand Total'].sum()))
+
+             progress(1.0, desc="Processing complete")
+             return f"Processing complete. Output saved to {final_file_path}", f"Individual CSV files saved in {base_path} directory", user_info
+         except PermissionError:
+             raise PermissionError(f"Unable to save file {final_file_path}. Please check if the file is open or if you have the necessary permissions.")
+         except Exception as e:
+             raise Exception(f"An error occurred while saving the final Excel file: {str(e)}") from e
+
+     except Exception as e:
+         error_msg = f"Error: {str(e)}\n\nTraceback:\n{traceback.format_exc()}"
+         return error_msg, "Processing failed", pd.DataFrame()
+
+ def create_summary_stats(df: pd.DataFrame) -> dict:
+     try:
+         return {
+             "Total Users": len(df),
+             "Total Courses": df['Courses'].sum(),
+             "Total Activity": df['Grand Total'].sum(),
+             "Avg Courses per User": df['Courses'].mean(),
+             "Avg Activity per User": df['Grand Total'].mean()
+         }
+     except Exception as e:
+         return {"Error": f"Failed to create summary stats: {str(e)}"}
+
+ def create_bar_chart(df: pd.DataFrame, x: str, y: str, title: str) -> Optional[go.Figure]:
+     try:
+         if df.empty:
+             return None
+         fig = px.bar(df, x=x, y=y, title=title)
+         fig.update_layout(
+             plot_bgcolor='white',
+             paper_bgcolor='white',
+             font_color=NTU_BLUE
+         )
+         fig.update_traces(marker_color=NTU_BLUE)
+         return fig
+     except Exception as e:
+         print(f"Error creating bar chart: {str(e)}")
+         return None
+
+ def create_scatter_plot(df: pd.DataFrame) -> Optional[go.Figure]:
+     try:
+         if df.empty:
+             return None
+         fig = px.scatter(df, x='Courses', y='Grand Total', title='Courses vs. Activity Level',
+                          hover_data=['Username', 'Name'])
+         fig.update_layout(
+             plot_bgcolor='white',
+             paper_bgcolor='white',
+             font_color=NTU_BLUE
+         )
+         fig.update_traces(marker_color=NTU_RED)
+         return fig
+     except Exception as e:
+         print(f"Error creating scatter plot: {str(e)}")
+         return None
+
+ def update_insights(df: pd.DataFrame) -> List[Union[gr.components.Component, None]]:
+     try:
+         if df.empty:
+             return [gr.Markdown("No data available. Please upload and process a file first.")] + [None] * 4
+
+         stats = create_summary_stats(df)
+         # Two decimals for float averages; counts and error text are shown as-is, one paragraph each
+         stats_md = gr.Markdown("\n\n".join([f"**{k}**: {v:.2f}" if isinstance(v, float) else f"**{k}**: {v}" for k, v in stats.items()]))
+
+         users_activity_chart = create_bar_chart(df, 'Username', 'Grand Total', 'User Activity Levels')
+         users_courses_chart = create_bar_chart(df, 'Username', 'Courses', 'Courses per User')
+         scatter_plot = create_scatter_plot(df)
+
+         user_table = gr.DataFrame(value=df)
+
+         return [stats_md, users_activity_chart, users_courses_chart, scatter_plot, user_table]
+     except Exception as e:
+         error_msg = f"Error updating insights: {str(e)}\n\nTraceback:\n{traceback.format_exc()}"
+         return [gr.Markdown(error_msg)] + [None] * 4
+
+ def process_and_update(file):
+     try:
+         result_msg, csv_loc, df = process_data(file)
+         insights = update_insights(df)
+         return [result_msg, csv_loc] + insights
+     except Exception as e:
+         error_msg = f"Error in process_and_update: {str(e)}\n\nTraceback:\n{traceback.format_exc()}"
+         return [error_msg, "Processing failed"] + [gr.Markdown(error_msg)] + [None] * 4  # placeholders for the 3 plots and the table
+
+
+ # Create a custom theme
+ custom_theme = gr.themes.Base().set(
+     body_background_fill="#E6F3FF",
+     body_text_color="#003D7C",
+     button_primary_background_fill="#C11E38",
+     button_primary_background_fill_hover="#A5192F",
+     button_primary_text_color="white",
+     block_title_text_color="#003D7C",
+     block_label_background_fill="#E6F3FF",
+     input_background_fill="white",
+     input_border_color="#003D7C",
+     input_border_color_focus="#C11E38",
+ )
+
+ # Inline custom CSS (custom_theme.css is not read by this script)
+ custom_css = """
+ .gr-button-secondary {
+     background-color: #F0F0F0;
+     color: #003D7C;
+     border: 1px solid #003D7C;
+     border-radius: 12px;
+     padding: 8px 16px;
+     font-size: 16px;
+     font-weight: bold;
+     cursor: pointer;
+     transition: background-color 0.3s, color 0.3s, border-color 0.3s;
+ }
+
+ .gr-button-secondary:hover {
+     background-color: #003D7C;
+     color: white;
+     border-color: #003D7C;
+ }
+
+ .gr-button-secondary:active {
+     transform: translateY(1px);
+ }
+
+ .app-title {
+     color: #003D7C;
+     font-size: 24px;
+     font-weight: bold;
+     text-align: center;
+     margin-bottom: 20px;
+ }
+ """
+
+ def clear_outputs():
+     return [""] * 2 + [None] * 5  # 2 textboxes, then the summary, 3 plots, and the table
+
+ with gr.Blocks(theme=custom_theme, css=custom_css) as iface:
+     gr.Markdown("# Gradebook Data Processor", elem_classes=["app-title"])
+
+     with gr.Tabs():
+         with gr.TabItem("File Upload and Processing"):
+             file_input = gr.File(label="Upload Excel File")
+             with gr.Row():
+                 process_btn = gr.Button("Process Data", variant="primary")
+                 clear_btn = gr.Button("Clear", variant="secondary")
+             output_msg = gr.Textbox(label="Processing Result")
+             csv_location = gr.Textbox(label="CSV Files Location")
+
+         with gr.TabItem("Data Insights Dashboard"):
+             with gr.Row():
+                 summary_stats = gr.Markdown("Upload and process a file to see summary statistics.")
+
+             with gr.Row():
+                 users_activity_chart = gr.Plot()
+                 users_courses_chart = gr.Plot()
+
+             with gr.Row():
+                 scatter_plot = gr.Plot()
+
+             with gr.Row():
+                 user_table = gr.DataFrame()
+
+     process_btn.click(
+         process_and_update,
+         inputs=[file_input],
+         outputs=[output_msg, csv_location, summary_stats, users_activity_chart, users_courses_chart, scatter_plot, user_table]
+     )
+
+     clear_btn.click(
+         clear_outputs,
+         inputs=[],
+         outputs=[output_msg, csv_location, summary_stats, users_activity_chart, users_courses_chart, scatter_plot, user_table]
+     )
+
+ if __name__ == "__main__":
+     iface.launch()
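
Note on the app.py hunk: the summary statistics mix whole-number counts, float averages, and, on failure, a single error string, so one numeric format spec does not suit every value. The sketch below shows one possible way to render such a mapping as Markdown. `format_stats` is a hypothetical helper, not part of the commit, and it assumes that only float values should get two decimals.

```python
def format_stats(stats):
    """Render a stats mapping as Markdown, one paragraph per entry."""
    lines = []
    for key, value in stats.items():
        if isinstance(value, float):
            text = f"{value:.2f}"   # averages get two decimals
        else:
            text = str(value)       # counts and error messages are shown as-is
        lines.append(f"**{key}**: {text}")
    # A blank line between entries keeps each one in its own Markdown paragraph.
    return "\n\n".join(lines)


rendered = format_stats({"Total Users": 3, "Avg Courses per User": 1.5})
```

Joining with a blank line rather than a single newline matters because Markdown treats a lone newline inside a paragraph as a soft break.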
custom_theme.css ADDED
@@ -0,0 +1,171 @@
+ :root {
+     --ntu-blue: #003D7C;
+     --ntu-red: #C11E38;
+     --ntu-gold: #E7B820;
+     --light-blue: #E6F3FF;
+     --white: #FFFFFF;
+     --light-gray: #F0F0F0;
+
+     /* Core Sizing */
+     --spacing-sm: 4px;
+     --spacing-md: 8px;
+     --spacing-lg: 16px;
+     --radius-sm: 4px;
+     --radius-md: 8px;
+     --radius-lg: 12px;
+ }
+
+ body {
+     font-family: 'Arial', sans-serif;
+     background-color: var(--light-blue);
+     color: var(--ntu-blue);
+ }
+
+ .gradio-container {
+     max-width: 1200px;
+     margin: 0 auto;
+ }
+
+ .app-title {
+     color: var(--ntu-blue);
+     font-size: 24px;
+     font-weight: bold;
+     text-align: center;
+     margin-bottom: 20px;
+ }
+
+ .gr-button-primary {
+     background-color: var(--ntu-red);
+     color: var(--white);
+     border: none;
+     border-radius: var(--radius-lg);
+     padding: var(--spacing-md) var(--spacing-lg);
+     font-size: 16px;
+     font-weight: bold;
+     cursor: pointer;
+     transition: background-color 0.3s, transform 0.1s;
+ }
+
+ .gr-button-primary:hover {
+     background-color: #A5192F; /* Darker red on hover */
+     color: var(--white);
+ }
+
+ .gr-button-primary:active {
+     transform: translateY(1px);
+ }
+
+ .gr-form {
+     background-color: var(--white);
+     border: 1px solid var(--ntu-blue);
+     border-radius: var(--radius-lg);
+     padding: var(--spacing-lg);
+ }
+
+ .gr-box, .gr-input, .gr-file-drop {
+     background-color: var(--white);
+     border: 1px solid var(--ntu-blue);
+     border-radius: var(--radius-md);
+     padding: var(--spacing-md);
+ }
+
+ .gr-input:focus, .gr-file-drop:hover {
+     border-color: var(--ntu-red);
+     outline: none;
+ }
+
+ .gr-panel {
+     background-color: var(--light-gray);
+     border: 1px solid var(--ntu-blue);
+     border-radius: var(--radius-lg);
+     padding: var(--spacing-lg);
+ }
+
+ .gr-box {
+     background-color: var(--white);
+     border: 1px solid var(--ntu-blue);
+     border-radius: var(--radius-md);
+     padding: var(--spacing-md);
+ }
+
+ .gr-form {
+     background-color: var(--white);
+     border-radius: var(--radius-lg);
+     padding: var(--spacing-lg);
+ }
+
+ .gr-block-label {
+     color: var(--ntu-blue);
+     font-weight: bold;
+ }
+
+ .tabs {
+     border-bottom: 2px solid var(--ntu-blue);
+     margin-bottom: 20px;
+ }
+
+ .tab-nav {
+     background-color: var(--white);
+     border: 1px solid var(--ntu-blue);
+     border-bottom: none;
+     padding: var(--spacing-md) var(--spacing-lg);
+     margin-right: 5px;
+     border-radius: var(--radius-md) var(--radius-md) 0 0;
+     color: var(--ntu-blue);
+     font-weight: bold;
+     transition: background-color 0.3s, color 0.3s;
+ }
+
+ .tab-nav:hover, .tab-nav.selected {
+     background-color: var(--ntu-blue);
+     color: var(--white);
+ }
+
+ .gr-file-drop {
+     border: 2px dashed var(--ntu-blue);
+     background-color: var(--light-blue);
+     transition: border-color 0.3s, background-color 0.3s;
+ }
+
+ .gr-file-drop:hover {
+     border-color: var(--ntu-red);
+     background-color: var(--white);
+ }
+
+ .gr-file-drop .icon {
+     color: var(--ntu-gold);
+ }
+
+ .gr-file-drop:hover .icon {
+     color: var(--ntu-red);
+ }
+
+ /* Ensure text stays legible on hover. .gr-file-drop:hover is excluded */
+ /* because its hover background is white. */
+ .gr-button-primary:hover,
+ .tab-nav:hover {
+     color: var(--white);
+     text-shadow: 1px 1px 2px rgba(0,0,0,0.2);
+ }
+
+ .gr-button-secondary {
+     background-color: var(--light-gray);
+     color: var(--ntu-blue);
+     border: 1px solid var(--ntu-blue);
+     border-radius: var(--radius-lg);
+     padding: var(--spacing-md) var(--spacing-lg);
+     font-size: 16px;
+     font-weight: bold;
+     cursor: pointer;
+     transition: background-color 0.3s, color 0.3s, border-color 0.3s;
+ }
+
+ .gr-button-secondary:hover {
+     background-color: var(--ntu-blue);
+     color: var(--white);
+     border-color: var(--ntu-blue);
+ }
+
+ .gr-button-secondary:active {
+     transform: translateY(1px);
+ }
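
Note on the custom_theme.css hunk: `app.py` in this commit passes an inline CSS string to `gr.Blocks` and never opens `custom_theme.css`, so the stylesheet above has no effect unless something reads it. A minimal sketch of one way to do that follows. `load_css` is a hypothetical helper, and the commented usage assumes the `css` argument of `gr.Blocks` that `app.py` already relies on.

```python
from pathlib import Path


def load_css(path, fallback=""):
    """Return the stylesheet text at `path`, or `fallback` if the file is missing."""
    css_file = Path(path)
    if not css_file.is_file():
        return fallback
    return css_file.read_text(encoding="utf-8")


# Hypothetical usage inside app.py (not executed here, since it needs Gradio):
#   with gr.Blocks(theme=custom_theme, css=load_css("custom_theme.css")) as iface:
#       ...
```

Returning a fallback instead of raising keeps the interface usable, only unstyled, when the stylesheet is not deployed alongside the script.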
requirements.txt ADDED
@@ -0,0 +1,5 @@
+ pandas
+ gradio
+ plotly
+ xlsxwriter
+ openpyxl
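
Note on the requirements.txt hunk: `app.py` accepts both `.xls` and `.xlsx` uploads, while the list above names `openpyxl` but not `xlrd`. As far as the pandas documentation describes, legacy `.xls` workbooks are read through the `xlrd` engine and `.xlsx` workbooks through `openpyxl`, so `.xls` uploads may fail unless `xlrd` is also installed. The sketch below makes that mapping explicit. `excel_engine` and `ENGINE_BY_EXTENSION` are hypothetical names, not part of the commit.

```python
import os

# Engine names as accepted by the `engine` argument of pandas.read_excel.
ENGINE_BY_EXTENSION = {
    ".xls": "xlrd",        # legacy binary workbooks
    ".xlsx": "openpyxl",   # Office Open XML workbooks
}


def excel_engine(filename):
    """Return the pandas engine name for an Excel filename, or raise ValueError."""
    extension = os.path.splitext(filename)[1].lower()
    try:
        return ENGINE_BY_EXTENSION[extension]
    except KeyError:
        raise ValueError(f"Unsupported Excel extension: {extension!r}") from None
```

A caller could pass the result as `pd.read_excel(path, engine=excel_engine(path))`, which makes a missing optional dependency easier to recognize from the error.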