---
title: IoT Sensor Data RAG for Smart Buildings
emoji: 🏢
colorFrom: blue
colorTo: indigo
sdk: streamlit
sdk_version: "1.42.1"
app_file: app.py
pinned: false
---
Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
# IoT Sensor Data RAG for Smart Buildings
## 🏢 Problem Statement
Create a RAG system that processes IoT sensor data, maintenance manuals, and building specifications to provide predictive maintenance insights and operational optimization.
## 🎯 Key Requirements
- ✅ **IoT sensor data ingestion and real-time processing**
- ✅ **Maintenance manual and building specification integration**
- ✅ **Predictive maintenance algorithm implementation**
- ✅ **Operational efficiency optimization recommendations**
- ✅ **Anomaly detection and alert systems**
## Technical Challenges Solved
- ✅ **Real-time sensor data streaming and processing**
- ✅ **Multi-sensor data fusion and correlation**
- ✅ **Predictive modeling for equipment failure**
- ✅ **Building system integration and compatibility**
- ✅ **Energy efficiency optimization algorithms**
## System Architecture
### Core Components
- **RAG Engine**: Vector database (ChromaDB) with Sentence-Transformers embeddings
- **IoT Data Processor**: Real-time sensor data streaming and anomaly detection
- **Predictive Analytics**: Equipment failure prediction and maintenance recommendations
- **Document Intelligence**: PDF/TXT processing with smart chunking strategies
- **Web Interface**: Modern Streamlit dashboard with Material design theme
### Technology Stack
- **Backend**: Python, Streamlit, ChromaDB
- **Embeddings**: Sentence-Transformers (all-MiniLM-L6-v2)
- **Vector Database**: ChromaDB with cosine similarity
- **LLM Integration**: Local Transformers + OpenAI API (optional)
- **Data Processing**: Pandas, NumPy, Scikit-learn
- **Visualization**: Plotly for real-time sensor monitoring
## Features
### 1. Real-Time IoT Monitoring
- Live sensor data streaming simulation
- Multi-sensor data fusion (temperature, humidity, power consumption)
- Real-time anomaly detection using rolling z-score analysis
- Interactive time-series visualizations
### 2. Intelligent Document RAG
- PDF and TXT document ingestion
- Smart text chunking (500 tokens with 50 token overlap)
- Context-aware retrieval using vector similarity
- Source attribution and relevance scoring
### 3. Predictive Maintenance
- Equipment failure prediction algorithms
- Maintenance schedule optimization
- Energy efficiency recommendations
- Anomaly-based alert systems
### 4. Evaluation & Analytics
- Retrieval accuracy metrics
- Response latency measurement
- Document relevance scoring
- System performance monitoring
## Quick Start
### Prerequisites
- Python 3.9+ (the pinned Streamlit 1.42.1 does not support Python 3.8)
- 8GB+ RAM (for local LLM models)
- Internet connection (for initial model downloads)
### Installation
```bash
# Clone the repository
git clone https://github.com/itsnewcoder/iot-smart-building-rag.git
cd iot-smart-building-rag
# Create virtual environment
python -m venv .venv
.venv\Scripts\activate # Windows
# source .venv/bin/activate # Linux/Mac
# Install dependencies
pip install -r requirements.txt
```
### Configuration
Create a `.env` file in the root directory (optional):
```env
OPENAI_API_KEY=your_openai_api_key_here
```
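Because the key is optional, the app needs some way to fall back to the local model when it is absent. A minimal sketch of that decision, assuming the `.env` file has already been loaded into the process environment (for example with python-dotenv); `choose_backend` is a hypothetical helper, not a function from this repository:

```python
import os


def choose_backend(env=None):
    """Return "openai" when an API key is configured, otherwise "local".

    `env` defaults to os.environ; passing a plain dict makes this easy to test.
    """
    env = os.environ if env is None else env
    # Treat a missing, empty, or whitespace-only key as "not configured".
    key = (env.get("OPENAI_API_KEY") or "").strip()
    return "openai" if key else "local"


if __name__ == "__main__":
    print(f"LLM backend: {choose_backend()}")
```

Note that the placeholder value from the example above would still count as a configured key, so replace it or leave the variable unset.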
### Run Locally
```bash
streamlit run app.py
```
**Access your app at:** `http://localhost:8501`
## Project Structure
```
iot-smart-building-rag/
├── app.py                 # Main Streamlit application
├── requirements.txt       # Python dependencies
├── README.md              # This file
├── .streamlit/
│   └── config.toml        # Streamlit theme configuration
├── rag/                   # RAG system core
│   ├── __init__.py
│   ├── ingest.py          # Document ingestion & vector store
│   ├── retrieval.py       # Context retrieval engine
│   ├── generate.py        # LLM response generation
│   └── evaluate.py        # System evaluation metrics
├── models/                # Predictive models
│   ├── __init__.py
│   └── predictive.py      # Anomaly detection & maintenance
├── data/                  # Sample data
│   ├── manuals/           # Maintenance manuals (PDF/TXT)
│   ├── specs/             # Building specifications
│   └── sensors/           # IoT sensor data (CSV)
└── .chroma/               # Vector database storage
```
## 🔧 Usage Guide
### 1. Dashboard Tab
- **Start Stream**: Begin real-time sensor data simulation
- **Live Monitoring**: View real-time sensor readings and trends
- **Anomaly Detection**: See detected anomalies with z-score analysis
- **Maintenance Tips**: Get AI-powered maintenance recommendations
### 2. RAG QA Tab
- **Ask Questions**: Query maintenance procedures and building specs
- **Context Retrieval**: View relevant document chunks and sources
- **AI Responses**: Get context-aware answers from local or OpenAI models
### 3. Evaluation Tab
- **Retrieval Testing**: Test system with custom queries
- **Performance Metrics**: View latency and relevance scores
- **Quality Assessment**: Evaluate RAG system effectiveness
### 4. Data Manager Tab
- **Document Index**: View indexed documents and sources
- **File Upload**: Add new PDFs/TXTs to the knowledge base
- **Vector Store**: Manage document embeddings and storage
## Sample Queries
Try these example questions in the RAG QA tab:
- "How to reset chiller pump?"
- "What are the fault codes for HVAC systems?"
- "How to maintain building temperature sensors?"
- "What are the power consumption optimization tips?"
- "How to troubleshoot humidity sensor issues?"
## 🎯 Evaluation Metrics
### Retrieval Quality
- **Relevance Scoring**: Cosine similarity-based ranking
- **Source Attribution**: Document source tracking
- **Context Retrieval**: Top-k document retrieval
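To make the ranking step concrete, here is a minimal pure-Python sketch of cosine-similarity scoring with top-k selection. In the app this work is delegated to ChromaDB; the function names and the `(source, vector)` pair layout below are assumptions made for illustration only.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec, chunks, k=3):
    """Rank (source, vector) pairs against the query and keep the k best.

    Returns a list of (source, score) tuples, highest score first, so each
    result carries its source for attribution.
    """
    scored = [(source, cosine_similarity(query_vec, vec)) for source, vec in chunks]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:k]
```

Returning the source alongside the score is what makes source attribution possible in the UI.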
### Performance Metrics
- **Response Latency**: End-to-end query processing time
- **Throughput**: Queries processed per second
- **Memory Usage**: Vector database storage efficiency
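Latency and throughput can be measured with nothing more than a monotonic clock. The sketch below is one way to do it, not necessarily how `rag/evaluate.py` does; `measure` and the `query_fn` callable are hypothetical names standing in for whatever function answers a query.

```python
import time


def measure(query_fn, queries):
    """Run query_fn on each query and report latency and throughput.

    Returns a dict with mean and max per-query latency in seconds and
    overall throughput in queries per second.
    """
    latencies = []
    start = time.perf_counter()
    for query in queries:
        t0 = time.perf_counter()
        query_fn(query)
        latencies.append(time.perf_counter() - t0)
    elapsed = time.perf_counter() - start
    return {
        "mean_latency_s": sum(latencies) / len(latencies) if latencies else 0.0,
        "max_latency_s": max(latencies, default=0.0),
        "throughput_qps": len(latencies) / elapsed if elapsed > 0 else 0.0,
    }
```

`time.perf_counter` is used rather than `time.time` because it is monotonic and has the highest available resolution for short intervals.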
### RAG Effectiveness
- **Context Relevance**: Retrieved document quality
- **Answer Accuracy**: Response relevance to queries
- **Source Diversity**: Multiple document source utilization
## Deployment
### HuggingFace Spaces (Recommended)
1. Create new Space at [huggingface.co/spaces](https://huggingface.co/spaces)
2. Choose **Streamlit** as SDK
3. Upload project files
4. Set environment variables in Space settings
### Streamlit Cloud
1. Push code to GitHub
2. Connect repository at [share.streamlit.io](https://share.streamlit.io)
3. Deploy automatically
### Local Deployment
```bash
# Production server
streamlit run app.py --server.port 8501 --server.address 0.0.0.0
```
## Technical Implementation Details
### Embedding Strategy
- **Model**: `sentence-transformers/all-MiniLM-L6-v2`
- **Dimensions**: 384
- **Normalization**: L2 normalization for cosine similarity
- **Chunking**: 500 tokens with 50 token overlap
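The chunking rule (500 tokens per chunk, 50 shared with the previous chunk) can be sketched as a sliding window. This README does not say which tokenizer is used, so the example assumes the text has already been split into a list of tokens; `chunk_tokens` is a hypothetical helper.

```python
def chunk_tokens(tokens, chunk_size=500, overlap=50):
    """Split a token list into windows of chunk_size that overlap by `overlap` tokens."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    step = chunk_size - overlap  # each window starts this far after the previous one
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        # Stop once a window reaches the end, to avoid a trailing chunk
        # that is entirely contained in the previous one.
        if start + chunk_size >= len(tokens):
            break
    return chunks


# Example with whitespace tokens (an assumption, not the project's tokenizer):
words = "check the chiller pump pressure before resetting the controller".split()
for chunk in chunk_tokens(words, chunk_size=4, overlap=1):
    print(" ".join(chunk))
```

The overlap means a sentence that straddles a chunk boundary still appears intact in at least one chunk, at the cost of storing some tokens twice.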
### Vector Database
- **Database**: ChromaDB
- **Similarity**: Cosine (stored by ChromaDB as cosine distance)
- **Persistence**: Local file storage (.chroma directory)
- **Indexing**: HNSW algorithm for fast retrieval
### Anomaly Detection
- **Method**: Rolling z-score analysis
- **Window Size**: 50 data points
- **Threshold**: Z-score > 3.0
- **Metrics**: Temperature, humidity, power consumption
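The parameters above can be illustrated with a short sketch. It makes several assumptions that this README does not confirm: the window is the 50 points before the current reading, the population standard deviation is used, the absolute z-score is compared against the threshold, and nothing is flagged until the window is full. `rolling_zscore_anomalies` is a hypothetical name, not the signature of `detect_anomalies`.

```python
import math
from collections import deque


def rolling_zscore_anomalies(values, window=50, threshold=3.0):
    """Return indices whose value lies more than `threshold` standard
    deviations from the mean of the preceding `window` readings."""
    history = deque(maxlen=window)  # trailing window, oldest reading dropped first
    anomalies = []
    for i, x in enumerate(values):
        if len(history) == window:
            mean = sum(history) / window
            variance = sum((h - mean) ** 2 for h in history) / window
            std = math.sqrt(variance)
            # A flat window has std == 0; skip it rather than divide by zero.
            if std > 0 and abs(x - mean) / std > threshold:
                anomalies.append(i)
        history.append(x)
    return anomalies
```

Each metric (temperature, humidity, power consumption) would be passed through this independently. One side effect of a trailing window is that a large spike inflates the standard deviation for the next 50 readings, which makes follow-up anomalies harder to detect.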
### Predictive Maintenance
- **Algorithm**: Rule-based heuristics + statistical analysis
- **Input**: Sensor data + anomaly patterns
- **Output**: Maintenance recommendations + efficiency tips
- **Real-time**: Continuous monitoring and updates
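A rule-based layer of this kind can be as simple as comparing the latest readings and recent anomaly counts against fixed limits. The sketch below is illustrative only: the limits, units, and the `recommend` function are hypothetical and are not taken from `models/predictive.py` or from any equipment manual.

```python
# Hypothetical limits for illustration; real values belong in the building specs.
EXAMPLE_LIMITS = {"temperature": 27.0, "humidity": 60.0, "power": 5.0}


def recommend(latest, anomaly_counts, limits=None, max_anomalies=3):
    """Turn the latest readings and per-metric anomaly counts into text tips.

    latest:         dict of metric -> most recent reading
    anomaly_counts: dict of metric -> number of recent anomalies
    """
    limits = EXAMPLE_LIMITS if limits is None else limits
    tips = []
    for metric, limit in limits.items():
        value = latest.get(metric)
        if value is not None and value > limit:
            tips.append(f"{metric}: reading {value} exceeds limit {limit}; inspect equipment")
        count = anomaly_counts.get(metric, 0)
        if count >= max_anomalies:
            tips.append(f"{metric}: {count} recent anomalies; schedule a maintenance check")
    return tips


print(recommend({"temperature": 30.0, "humidity": 40.0}, {"humidity": 4}))
```

Keeping the rules in a plain dictionary makes them easy to review against the manuals and to adjust without touching the logic.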
## 🧪 Testing
### Local Testing
```bash
# Test RAG modules
python -c "from rag.ingest import ensure_vector_store; print('✅ RAG Ready')"
# Test predictive models
python -c "from models.predictive import detect_anomalies; print('✅ Models Ready')"
# Test full application
streamlit run app.py
```
### Sample Data
The system includes sample data for testing:
- **HVAC Sensor Data**: Temperature, humidity, power readings
- **Chiller Manual**: Maintenance procedures and fault codes
- **Building Specs**: System specifications and requirements
## 🤝 Contributing
1. Fork the repository
2. Create a feature branch
3. Make your changes
4. Test thoroughly
5. Submit a pull request
## License
This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
## Academic Use
This project was developed as part of an academic RAG system implementation course. It demonstrates:
- **RAG Architecture**: Complete retrieval-augmented generation system
- **IoT Integration**: Real-time sensor data processing
- **Predictive Analytics**: Machine learning for maintenance
- **Vector Databases**: ChromaDB implementation
- **Modern Web UI**: Streamlit-based dashboard
## Support
For questions or issues:
- **GitHub Issues**: [Create an issue](https://github.com/itsnewcoder/iot-smart-building-rag/issues)
- **Documentation**: Check this README and code comments
- **Community**: Streamlit and HuggingFace communities
## Future Enhancements
- [ ] Real-time IoT device integration
- [ ] Advanced ML models for failure prediction
- [ ] Multi-modal document support (images, audio)
- [ ] API endpoints for external systems
- [ ] Mobile-responsive interface
- [ ] Advanced analytics dashboard
- [ ] Integration with building management systems
---
**Built with ❤️ for Smart Building Intelligence**
*Last updated: January 2025*