Update README.md
## 📁 Complete File Structure

```
intelligent-content-organizer/
├── app.py                         # Main Gradio app and MCP server
├── config.py                      # Configuration management
├── mcp_server.py                  # MCP server tools (registration, serving logic)
├── requirements.txt               # Dependencies
├── README.md                      # Documentation
├── .gitignore                     # Git ignore rules
├── core/                          # Core processing logic
│   ├── __init__.py
│   ├── models.py                  # Data models (e.g., Document, Chunk)
│   ├── document_parser.py         # Document processing (PDF, TXT, DOCX, etc.)
│   ├── text_preprocessor.py       # Text cleaning and processing
│   └── chunker.py                 # Text chunking strategies
├── services/                      # Backend services
│   ├── __init__.py
│   ├── embedding_service.py       # Sentence transformers integration
│   ├── llm_service.py             # Anthropic + Mistral LLM integration
│   ├── ocr_service.py             # Mistral OCR integration
│   ├── vector_store_service.py    # FAISS vector storage
│   └── document_store_service.py  # Document metadata storage (e.g., SQLite, JSON files)
└── mcp_tools/                     # MCP tool definitions
    ├── __init__.py
    ├── ingestion_tool.py          # Document ingestion tool for MCP
    ├── search_tool.py             # Semantic search tool for MCP
    ├── generative_tool.py         # AI generation tool for MCP
    └── utils.py                   # Utility functions for MCP tools
```
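`core/chunker.py` above is described as holding the text chunking strategies. As a rough illustration only (hypothetical names, not this project's actual code), a sliding-window chunker with overlap, the common strategy for embedding pipelines like this one, might look like:

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into overlapping chunks of roughly chunk_size characters.

    Overlapping windows preserve context that would otherwise be cut
    at chunk boundaries before embedding.
    """
    if chunk_size <= overlap:
        raise ValueError("chunk_size must exceed overlap")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap  # step forward, keeping `overlap` chars of context
    return chunks
```

Real implementations typically chunk on sentence or token boundaries rather than raw character counts; this sketch only shows the windowing idea.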
|
| 71 |
## 🎯 Key Features Implemented
## 🎥 Demo Video

[📹 Watch the demo video](https://youtu.be/uBYIj_ntFRk)

*The demo shows the MCP server in action, demonstrating document ingestion, semantic search, and Q&A capabilities, utilizing the configured LLM providers.*

## 🛠️ Installation

### Prerequisites

- Python 3.9+
- `confidence` (string, optional): Confidence level in the answer (LLM-dependent, might not always be present).

## 🚀 Performance

- Embedding Generation: ~100-500ms per document chunk
- Search: <50ms for most queries
- Summarization: 1-5s depending on content length
- Memory Usage: ~200-500MB base + ~1MB per 1000 document chunks
- Supported File Types: PDF, TXT, DOCX, PNG, JPG, JPEG
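The sub-50ms search figure above comes from `vector_store_service.py`, which the file structure describes as FAISS-backed. As a hedged sketch of what that kind of lookup does (NumPy brute force standing in for FAISS; function and parameter names are illustrative, not the project's API), cosine-similarity top-k search reduces to a normalized matrix-vector product:

```python
import numpy as np

def search(index_vectors: np.ndarray, query: np.ndarray, k: int = 5) -> list[int]:
    """Return indices of the k index vectors most similar to `query`.

    Normalizing both sides turns the dot product into cosine similarity,
    which is what FAISS's inner-product indexes compute on unit vectors.
    """
    index_norm = index_vectors / np.linalg.norm(index_vectors, axis=1, keepdims=True)
    q = query / np.linalg.norm(query)
    scores = index_norm @ q                # one similarity score per stored chunk
    return list(np.argsort(-scores)[:k])  # highest-scoring indices first
```

For the corpus sizes implied by the memory figures above (thousands of chunks), even this brute-force version stays well under 50ms; FAISS matters once the index grows much larger.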