|
# Dataset Plan for AI/ML Models |
|
|
|
This document outlines the datasets required for implementing the AI/ML functionalities described in the ai_plan.md file. For each model, we'll provide the CSV structure and example records. |
|
|
|
## 1. Intelligent Routing and Workflow Automation |
|
|
|
### 1.1 Grievance Data (grievances.csv) |
|
|
|
Fields: |
|
- grievance_id (string) |
|
- category (string) |
|
- submission_timestamp (datetime) |
|
- student_room_no (string) |
|
- hostel_name (string) |
|
- floor_number (integer) |
|
|
|
Examples: |
|
```csv |
|
grievance_id,category,submission_timestamp,student_room_no,hostel_name,floor_number |
|
G12345,plumber,2023-10-01T10:30:00Z,204,bh1,2 |
|
G12346,electricity,2023-10-02T08:15:00Z,101,bh2,1 |
|
G12347,internet,2023-10-03T14:20:00Z,305,bh3,3 |
|
G12348,water cooler,2023-10-04T09:45:00Z,102,ivh,1 |
|
G12349,sweeper,2023-10-05T11:00:00Z,201,gh,2 |
|
G12350,carpenter,2023-10-06T13:30:00Z,303,bh1,3 |
|
``` |
|
|
|
### 1.2 Staff Data (staff.csv) |
|
|
|
Fields: |
|
- staff_id (string) |
|
- department (string) |
|
- current_workload (integer) |
|
- availability_status (string) |
|
- past_resolution_rate (float) |
|
|
|
Examples: |
|
```csv |
|
staff_id,department,current_workload,availability_status,past_resolution_rate |
|
S67890,plumber,5,Available,0.95 |
|
S67891,electricity,3,Available,0.92 |
|
S67892,internet,4,Unavailable,0.88 |
|
S67893,water cooler,2,Available,0.97 |
|
S67894,sweeper,6,Available,0.91 |
|
S67895,carpenter,1,Available,0.94 |
|
``` |
|
|
|
### 1.3 Historical Assignments (historical_assignments.csv) |
|
|
|
Fields: |
|
- grievance_id (string) |
|
- staff_id (string) |
|
- assignment_timestamp (datetime) |
|
- resolution_time (string) |
|
- outcome (string) |
|
- floor_number (integer) |
|
|
|
Examples: |
|
```csv |
|
grievance_id,staff_id,assignment_timestamp,resolution_time,outcome,floor_number |
|
G12340,S67890,2023-09-25T09:00:00Z,2 hours,Resolved Successfully,1 |
|
G12341,S67891,2023-09-26T10:30:00Z,1 hour,Resolved Successfully,2 |
|
G12342,S67892,2023-09-27T14:15:00Z,3 hours,Unresolved,3 |
|
G12343,S67893,2023-09-28T11:45:00Z,30 minutes,Resolved Successfully,1 |
|
G12344,S67894,2023-09-29T08:30:00Z,1 hour,Resolved Successfully,2 |
|
G12345,S67895,2023-09-30T13:00:00Z,2 hours,Pending,3 |
|
``` |
|
|
|
### 1.4 Floor Metrics (floor_metrics.csv) |
|
|
|
Fields: |
|
- hostel_name (string) |
|
- floor_number (integer) |
|
- date (date) |
|
- number_of_requests (integer) |
|
- total_delays (integer) |
|
|
|
Examples: |
|
```csv |
|
hostel_name,floor_number,date,number_of_requests,total_delays |
|
bh1,2,2023-10-01,20,2 |
|
bh2,1,2023-10-02,15,1 |
|
bh3,3,2023-10-03,18,3 |
|
ivh,1,2023-10-04,12,0 |
|
gh,2,2023-10-05,22,4 |
|
``` |
|
|
|
### 1.5 Availability Data (availability.csv) |
|
|
|
Fields: |
|
- id (string) |
|
- type (string) # 'staff' or 'student' |
|
- date (date) |
|
- time_slot (string) |
|
- availability_status (string) |
|
|
|
Examples: |
|
```csv |
|
id,type,date,time_slot,availability_status |
|
S67890,staff,2023-10-02,09:00-12:00,Available |
|
S67891,staff,2023-10-02,14:00-17:00,Unavailable |
|
STU123,student,2023-10-02,10:00-12:00,Available |
|
STU124,student,2023-10-02,14:00-16:00,Unavailable |
|
S67892,staff,2023-10-03,08:00-11:00,Available |
|
STU125,student,2023-10-03,09:00-11:00,Available |
|
``` |
|
|
|
## 2. Advanced Sentiment and Emotional Intelligence Analysis |
|
|
|
### 2.1 Grievance Sentiment Data (grievance_sentiment.csv) |
|
|
|
Fields: |
|
- grievance_id (string) |
|
- text (string) |
|
- emotional_label (string) |
|
|
|
Example: |
|
```csv |
|
grievance_id,text,emotional_label |
|
G12347,"I am extremely frustrated with the constant power outages in my room.",Frustration |
|
G12348,"Thank you for resolving my issue so quickly.",Satisfaction |
|
``` |
|
|
|
## 3. Context-Aware Chatbots for Initial Grievance Handling |
|
|
|
### 3.1 Conversation Dataset (conversations.csv) |
|
|
|
Fields: |
|
- conversation_id (string) |
|
- user_message (string) |
|
- intent (string) |
|
- entities (string, JSON format) |
|
- bot_response (string) |
|
|
|
Example: |
|
```csv |
|
conversation_id,user_message,intent,entities,bot_response |
|
C12345,"I want to report a broken heater in my room.",submit_grievance,"{'grievance_type': 'Broken Heater', 'room_number': '305'}","I'm sorry to hear that. Could you please provide your room number?" |
|
``` |
|
|
|
### 3.2 FAQs Dataset (faqs.csv) |
|
|
|
Fields: |
|
- question_id (string) |
|
- question (string) |
|
- answer (string) |
|
|
|
Example: |
|
```csv |
|
question_id,question,answer |
|
Q12345,"How can I check the status of my grievance?","You can check the status by logging into your dashboard and navigating to the 'My Grievances' section." |
|
``` |
|
|
|
## 4. Anomaly Detection in Grievance Patterns |
|
|
|
### 4.1 Grievance Logs (grievance_logs.csv) |
|
|
|
Fields: |
|
- grievance_id (string) |
|
- category (string) |
|
- timestamp (datetime) |
|
- description (string) |
|
|
|
Example: |
|
```csv |
|
grievance_id,category,timestamp,description |
|
G12350,Plumbing,2023-10-03T14:20:00Z,"Clogged drain in the kitchen area." |
|
``` |
|
|
|
### 4.2 Anomaly Indicators (anomaly_indicators.csv) |
|
|
|
Fields: |
|
- date (date) |
|
- category (string) |
|
- grievance_count (integer) |
|
- is_anomaly (boolean) |
|
|
|
Example: |
|
```csv |
|
date,category,grievance_count,is_anomaly |
|
2023-10-01,Electrical,50,false |
|
2023-10-02,Electrical,150,true |
|
``` |
|
|
|
## 5. Multilingual Translation in Chatroom |
|
|
|
### 5.1 Translation Pairs (translation_pairs.csv) |
|
|
|
Fields: |
|
- pair_id (string) |
|
- source_language (string) |
|
- target_language (string) |
|
- source_sentence (string) |
|
- target_sentence (string) |
|
|
|
Example: |
|
```csv |
|
pair_id,source_language,target_language,source_sentence,target_sentence |
|
TP12345,Hindi,English,"toilet me paani nahi aa rha hain","There is no water coming in the toilet." |
|
``` |
|
|
|
### 5.2 Language Preferences (language_preferences.csv) |
|
|
|
Fields: |
|
- user_id (string) |
|
- preferred_language (string) |
|
|
|
Example: |
|
```csv |
|
user_id,preferred_language |
|
U12345,Hindi |
|
W67890,English |
|
``` |
|
|
|
## 6. Worker Job Recommendation |
|
|
|
### 6.1 Job Requests (job_requests.csv) |
|
|
|
Fields: |
|
- job_id (string) |
|
- type (string) |
|
- description (string) |
|
- urgency_level (string) |
|
- submission_timestamp (datetime) |
|
- hostel_name (string) |
|
- floor_number (integer) |
|
- room_number (string) |
|
|
|
Example: |
|
```csv |
|
job_id,type,description,urgency_level,submission_timestamp,hostel_name,floor_number,room_number |
|
J12345,Electrical,"Fan not working in room 204.",High,2023-10-02T08:15:00Z,Hostel A,2,204 |
|
``` |
|
|
|
### 6.2 Worker Profiles (worker_profiles.csv) |
|
|
|
Fields: |
|
- worker_id (string) |
|
- department (string) |
|
- current_workload (integer) |
|
- availability_status (string) |
|
- job_success_rate (float) |
|
|
|
Example: |
|
```csv |
|
worker_id,department,current_workload,availability_status,job_success_rate |
|
W67890,Electrical,3,Available,0.95 |
|
``` |
|
|
|
### 6.3 Historical Job Assignments (historical_job_assignments.csv) |
|
|
|
Fields: |
|
- job_id (string) |
|
- worker_id (string) |
|
- assignment_timestamp (datetime) |
|
- resolution_time (string) |
|
- outcome (string) |
|
- hostel_name (string) |
|
- floor_number (integer) |
|
- room_number (string) |
|
|
|
Example: |
|
```csv |
|
job_id,worker_id,assignment_timestamp,resolution_time,outcome,hostel_name,floor_number,room_number |
|
J12340,W67885,2023-09-25T09:00:00Z,2 hours,Resolved Successfully,Hostel A,1,101 |
|
``` |
|
|
|
### 6.4 Worker Locations (worker_locations.csv) |
|
|
|
Fields: |
|
- worker_id (string) |
|
- timestamp (datetime) |
|
- hostel_name (string) |
|
- floor_number (integer) |
|
- room_number (string) |
|
|
|
Example: |
|
```csv |
|
worker_id,timestamp,hostel_name,floor_number,room_number |
|
W67890,2023-10-02T08:00:00Z,Hostel A,2,210 |
|
``` |
|
``` |
|
data/ |
|
β |
|
βββ intelligent_routing/ |
|
β βββ grievances.csv |
|
β βββ staff.csv |
|
β βββ historical_assignments.csv |
|
β βββ floor_metrics.csv |
|
β βββ availability.csv |
|
β |
|
βββ sentiment_analysis/ |
|
β βββ grievance_sentiment.csv |
|
β |
|
βββ chatbot/ |
|
β βββ conversations.csv |
|
β βββ faqs.csv |
|
β |
|
βββ anomaly_detection/ |
|
β βββ grievance_logs.csv |
|
β βββ anomaly_indicators.csv |
|
β |
|
βββ multilingual_translation/ |
|
β βββ translation_pairs.csv |
|
β βββ language_preferences.csv |
|
β |
|
βββ job_recommendation/ |
|
βββ job_requests.csv |
|
βββ worker_profiles.csv |
|
βββ historical_job_assignments.csv |
|
βββ worker_locations.csv |
|
|
|
``` |